Cypher Queries ============== PyGraphDB includes an initial read-only Cypher API through ``GraphDB.query(cypher)``. The current implementation is intentionally small and maps directly to features that already have efficient database APIs: indexed label scans, anchored typed traversal, and typed path sampling. Relationship types are read from ``edge.properties["type"]``. Node labels are stored natively through ``Node(labels=[...])`` and maintained in a sorted label index. Supported Feature Matrix ------------------------ The table below distinguishes features available through the Python database API from features exposed through the Cypher API. Legend: ✅ supported, 🟡 partially supported, ❌ not supported. .. list-table:: Current DB API and Cypher API support :header-rows: 1 :widths: 32 18 18 32 * - Feature - DB API - Cypher API - Notes * - Node and edge property storage - ✅ - ✅ - Cypher can return bound ``Node`` and ``Edge`` objects and project properties such as ``RETURN n.name`` or ``RETURN r.score``. * - Native node labels - ✅ - ✅ - DB API supports ``Node(labels=[...])`` and ``nodes_by_label``. Cypher supports ``MATCH (n:Label) RETURN n``. * - Exact-match node property indexes - ✅ - 🟡 - DB API supports explicit indexes via ``create_node_property_index``. Cypher uses them for ``MATCH (n:Label {name: "..."}) RETURN n`` when registered. * - Exact-match edge property indexes - ✅ - ❌ - DB API supports explicit indexes via ``create_edge_property_index``. Cypher edge property predicates are not implemented yet. * - Dedicated relationship type field - 🟡 - 🟡 - Typed traversal uses ``edge.properties["type"]`` instead of a dedicated ``Edge.type`` field. * - Relationship type catalog - ✅ - ❌ - DB API supports ``edges_by_type``. Cypher does not yet support unanchored ``MATCH ()-[:TYPE]->()`` scans. * - Anchored one-hop typed traversal - ✅ - ✅ - DB API uses ``iter_typed_adjacency`` or ``neighbors_by_edge_type``. Cypher supports ``MATCH (a {id: "..."})-[:TYPE]->(b)``. * - Anchored multi-hop typed traversal - ✅ - ✅ - Cypher supports repeated outgoing typed hops from an anchored start node. * - Reverse typed traversal - ✅ - ✅ - DB API supports ``direction="in"``. Cypher supports ``<-[:TYPE]-`` from an anchored node. * - Undirected typed traversal - ✅ - ✅ - DB API supports ``direction="any"``. Cypher supports ``-[:TYPE]-`` from an anchored node. * - Untyped BFS traversal - ✅ - ❌ - Available as ``GraphDB.bfs`` over legacy adjacency lists. * - Single-hop typed neighbor sampling - ✅ - ❌ - Available as ``GraphDB.sample_neighbors``. * - Multi-hop typed path sampling - ✅ - ✅ - Cypher exposes this through ``CALL pg.sample_typed_paths(...) YIELD path RETURN path``. * - Materialized sampled subgraph - ✅ - ❌ - Available as ``GraphDB.sample_typed_subgraph``. * - Property filtering with ``WHERE`` - 🟡 - ❌ - DB API has exact-match index lookup helpers, but Cypher ``WHERE`` parsing is future work. * - Result limiting - ✅ - ✅ - Cypher supports ``LIMIT`` on label scans, anchored typed traversals, and ``pg.sample_typed_paths`` calls. * - Mutating Cypher queries - ✅ - ❌ - Use ``put_node``, ``put_edge``, ``put_edges_bulk``, and ingestion APIs directly. Indexed Label Scans ------------------- Create nodes with native labels, then query by label without scanning every node. .. code-block:: python graph_db.put_node(Node(node_id="drug-1", labels=["Drug"], properties={"name": "Aspirin"})) result = graph_db.query('MATCH (d:Drug) RETURN d') for record in result: print(record["d"].get_id) Indexed Label and Property Lookup --------------------------------- Exact-match property indexes are explicit. Register an index before relying on it for performance-sensitive lookup. .. code-block:: python graph_db.create_node_property_index("name") result = graph_db.query('MATCH (d:Drug {name: "Aspirin"}) RETURN d') for record in result: print(record["d"].properties["name"]) If a property index is not registered, Cypher still restricts the search to the label index and then filters decoded nodes in Python. Property Projections and Limits ------------------------------- Use dot notation in ``RETURN`` to project values from bound nodes and relationships. Missing properties return ``None``. The special fields ``id`` and ``labels`` are available on nodes; ``id``, ``source``, and ``target`` are available on relationships. .. code-block:: python result = graph_db.query('MATCH (d:Drug) RETURN d.id, d.name LIMIT 10') for record in result: print(record["d.id"], record["d.name"]) Anchored One-Hop Traversal -------------------------- Use ``GraphDB.query`` for an anchored outgoing typed traversal. The start node must be constrained by ``id``. .. code-block:: python result = graph_db.query( 'MATCH (d {id: "drug-1"})-[:drug-to-protein]->(p) RETURN d, p' ) for record in result: print(record["d"].get_id, record["p"].get_id) The result object exposes ``columns`` and ``records``: .. code-block:: python print(result.columns) # ("d", "p") print(len(result)) Relationship Variables ---------------------- Relationship variables can be bound and returned. .. code-block:: python result = graph_db.query( 'MATCH (d {id: "drug-1"})-[r:drug-to-disease]->(x) RETURN d, r, x' ) for record in result: print(record["r"].get_id, record["r"].properties) Relationship properties can be projected directly. .. code-block:: python result = graph_db.query( 'MATCH (d {id: "drug-1"})-[r:drug-to-disease]->(x) RETURN r.id, r.type LIMIT 1' ) Anchored Multi-Hop Traversal ---------------------------- Cypher supports repeated outgoing typed hops from the anchored start node. .. code-block:: python result = graph_db.query( 'MATCH (d {id: "drug-1"})-[:drug-to-protein]->(p)-[:protein-to-disease]->(x) RETURN d, p, x' ) for record in result: print(record["d"].get_id, record["p"].get_id, record["x"].get_id) Relationship variables can be used across multiple hops as well. .. code-block:: python result = graph_db.query( 'MATCH (d {id: "drug-1"})-[r1:drug-to-protein]->(p)-[r2:protein-to-disease]->(x) RETURN r1, r2, x' ) Reverse and Undirected Traversal -------------------------------- Anchored typed traversals can follow outgoing, incoming, or either-direction relationships. .. code-block:: python incoming = graph_db.query( 'MATCH (p {id: "protein-1"})<-[:drug-to-protein]-(d) RETURN p, d' ) undirected = graph_db.query( 'MATCH (p {id: "protein-1"})-[:drug-to-protein]-(n) RETURN n' ) Direction can vary by hop: .. code-block:: python result = graph_db.query( 'MATCH (x {id: "disease-1"})<-[:protein-to-disease]-(p)<-[:drug-to-protein]-(d) RETURN x, p, d' ) Sampling Procedure ------------------ Typed path sampling is exposed as a PyGraphDB-specific procedure call. This is not standard openCypher syntax; it delegates to ``GraphDB.sample_typed_paths``. .. code-block:: python result = graph_db.query( 'CALL pg.sample_typed_paths(["drug-1"], ' '[{"edge_type": "drug-to-protein", "direction": "out", "sample_size": 2}, ' '{"edge_type": "protein-to-disease", "direction": "out", "sample_size": 1}]) ' 'YIELD path RETURN path' ) for record in result: print(record["path"]) Current Cypher Limitations -------------------------- Unsupported Cypher features raise ``ValueError`` with a message describing the supported subset. The current Cypher API does not yet support: - Multiple labels in one node pattern, such as ``(n:Drug:Approved)``. - Unanchored all-node scans such as ``MATCH (n) RETURN n``. - ``WHERE`` predicates. - ``ORDER BY``, aggregation, joins across separate patterns, or mutation clauses.