Quickstart
==========

Create a Graph
--------------

.. code-block:: python

   from pygraphdb.graphdb import Edge, GraphDB, Node
   from pygraphdb.kvstores import LMDBStore
   from pygraphdb.serializers import PickleSerializer

   graph_db = GraphDB(LMDBStore(path="quickstart_lmdb"), PickleSerializer())

   alice = Node(node_id="alice", labels=["Person"], properties={"name": "Alice", "age": 30})
   bob = Node(node_id="bob", labels=["Person"], properties={"name": "Bob", "age": 25})

   graph_db.put_node(alice)
   graph_db.put_node(bob)

   edge = Edge(
       edge_id="alice-bob",
       source=alice.get_id,
       target=bob.get_id,
       properties={"type": "friend", "weight": 0.9},
   )
   graph_db.put_edge(edge)

   print(graph_db.get_node(b"alice").to_dict())
   print(graph_db.get_edge(b"alice-bob").to_dict())

   graph_db.close()

Bulk Insert Edges
-----------------

``put_edges_bulk`` stores many edge records and updates adjacency indexes in one
operation.

.. code-block:: python

   nodes = [Node(node_id=f"user-{idx}") for idx in range(4)]
   for node in nodes:
       graph_db.put_node(node)

   edges = [
       Edge(edge_id="u0-u1", source="user-0", target="user-1", properties={"type": "follows"}),
       Edge(edge_id="u0-u2", source="user-0", target="user-2", properties={"type": "follows"}),
       Edge(edge_id="u2-u3", source="user-2", target="user-3", properties={"type": "follows"}),
   ]
   graph_db.put_edges_bulk(edges)

For append-only ingestion where edge IDs are known to be new, skip replacement
checks to avoid one existing-edge read per edge:

.. code-block:: python

   graph_db.put_edges_bulk(edges, check_existing=False)

Fetch Nodes in Bulk
-------------------

.. code-block:: python

   fetched = graph_db.get_nodes([b"user-0", b"user-1", b"missing"])
   for node in fetched:
       print(None if node is None else node.get_id)

Labels and Exact-Match Indexes
------------------------------

Labels are stored natively on ``Node`` objects and maintained in a sorted label
index. Label lookups avoid full node scans.

.. code-block:: python

   graph_db.put_node(Node(node_id="drug-1", labels=["Drug"], properties={"name": "Aspirin", "kind": "drug"}))
   graph_db.put_node(Node(node_id="protein-1", labels=["Protein"], properties={"name": "PTGS1", "kind": "protein"}))

   drugs = graph_db.nodes_by_label("Drug")
   print([node.get_id for node in drugs])

Property indexes are explicit so ingestion does not pay for indexes you do not
need. Register an exact-match node or edge property index before using it for
performance-sensitive lookup.

.. code-block:: python

   graph_db.create_node_property_index("name")
   aspirin = graph_db.nodes_by_property("name", "Aspirin")

   graph_db.create_edge_property_index("weight")
   strong_edges = graph_db.edges_by_property("weight", 0.9)

Relationship types are indexed through the existing ``edge.properties["type"]``
convention.

.. code-block:: python

   friend_edges = graph_db.edges_by_type("friend")

Columnar Ingestion
------------------

``ingest_nodes_arrow`` and ``ingest_edges_arrow`` accept Arrow-like columns or
plain Python sequences. The first implementation requires caller-provided
serialized ``node_value`` and ``edge_value`` payloads so existing serializer
behavior remains unchanged.

.. code-block:: python

   nodes = [
       Node(node_id="drug-1", properties={"kind": "drug"}),
       Node(node_id="protein-1", properties={"kind": "protein"}),
   ]
   graph_db.ingest_nodes_arrow(
       [node.get_id for node in nodes],
       [graph_db.serialize_node_value(node) for node in nodes],
   )

   edge = Edge(
       edge_id="d1-p1",
       source="drug-1",
       target="protein-1",
       properties={"type": "drug-to-protein", "score": 0.9},
   )
   graph_db.ingest_edges_arrow(
       [edge.get_id],
       [edge.source],
       [edge.target],
       [edge.get_type],
       [graph_db.serialize_edge_value(edge)],
       append_only=True,
   )

Polars users can use ``ingest_nodes_polars`` and ``ingest_edges_polars`` with
``node_value`` and ``edge_value`` binary columns. With ``PyRexStore`` and
``pyrex-rocksdb>=0.3.0a0``, these methods use native RocksDB columnar batch
writes when available. Other stores use the Python bulk fallback.

Traverse With BFS
-----------------

.. code-block:: python

   visited = graph_db.bfs(b"user-0", direction="any")
   print(visited)

Query With Cypher
-----------------

``GraphDB.query`` supports an initial read-only Cypher subset for indexed label
scans, anchored typed traversal, and typed path sampling.

.. code-block:: python

   result = graph_db.query(
       'MATCH (drug {id: "drug-1"})-[:drug-to-protein]->(protein) RETURN drug, protein'
   )

   for record in result:
       print(record["drug"].get_id, record["protein"].get_id)

Indexed labels are available through Cypher as well:

.. code-block:: python

   result = graph_db.query('MATCH (drug:Drug {name: "Aspirin"}) RETURN drug')

Multi-hop typed traversal is supported when each hop is outgoing and has an edge
type:

.. code-block:: python

   result = graph_db.query(
       'MATCH (drug {id: "drug-1"})-[:drug-to-protein]->(protein)-[:protein-to-disease]->(disease) RETURN drug, protein, disease'
   )

See :doc:`cypher` for the full supported subset and current limitations.

Use Stable IDs
--------------

Stable IDs make notebooks, tests, and serialized records easier to inspect.

.. code-block:: python

   drug = Node(node_id="drug-1", properties={"kind": "drug", "name": "Aspirin"})
   protein = Node(node_id="protein-1", properties={"kind": "protein"})
   edge = Edge(
       edge_id="drug-1-protein-1",
       source=drug.get_id,
       target=protein.get_id,
       properties={"type": "drug-to-protein"},
   )