Typed Traversal and Sampling
============================

Typed traversal uses ``edge.properties["type"]``. When an edge has a type,
PyGraphDB stores typed adjacency records for efficient directional scans.

Create a Typed Graph
--------------------

.. code-block:: python

   from pygraphdb.graphdb import Edge, GraphDB, Node
   from pygraphdb.kvstores import LMDBStore
   from pygraphdb.serializers import PickleSerializer

   graph_db = GraphDB(LMDBStore(path="typed_graph_lmdb"), PickleSerializer())

   for node_id, kind in [
       ("drug-1", "drug"),
       ("protein-1", "protein"),
       ("protein-2", "protein"),
       ("disease-1", "disease"),
   ]:
       graph_db.put_node(Node(node_id=node_id, properties={"kind": kind}))

   graph_db.put_edges_bulk([
       Edge(edge_id="d1-p1", source="drug-1", target="protein-1",
            properties={"type": "drug-to-protein"}),
       Edge(edge_id="d1-p2", source="drug-1", target="protein-2",
            properties={"type": "drug-to-protein"}),
       Edge(edge_id="p1-dis1", source="protein-1", target="disease-1",
            properties={"type": "protein-to-disease"}),
   ])

Query Typed Neighbors
---------------------

.. code-block:: python

   proteins = graph_db.neighbors_by_edge_type(
       "drug-1",
       "drug-to-protein",
       direction="out",
   )
   print(proteins)

Query Typed Edges
-----------------

.. code-block:: python

   edge_ids = graph_db.edges_by_edge_type(
       "drug-1",
       "drug-to-protein",
       direction="out",
   )

Sample Neighbors
----------------

``sample_neighbors`` uses reservoir sampling, so memory is bounded by
``sample_size`` instead of by node degree.

.. code-block:: python

   import random

   sample = graph_db.sample_neighbors(
       "drug-1",
       "drug-to-protein",
       direction="out",
       sample_size=1,
       rng=random.Random(7),
   )

Object-Based Sampling Patterns
------------------------------

Use ``SamplingHop`` and ``SamplingPattern`` for validated, documented sampling
configuration objects.

.. code-block:: python

   import random

   from pygraphdb.sampling import SamplingHop, SamplingPattern

   pattern = SamplingPattern([
       SamplingHop("drug-to-protein", direction="out", sample_size=2),
       SamplingHop("protein-to-disease", direction="out", sample_size=1),
   ])

   paths = graph_db.sample_typed_paths(
       seed_ids=["drug-1"],
       pattern=pattern,
       rng=random.Random(3),
   )

Dictionary-Based Sampling Patterns
----------------------------------

Existing dictionary configurations are still supported.

.. code-block:: python

   pattern = [
       {"edge_type": "drug-to-protein", "direction": "out", "sample_size": 2},
       {"edge_type": "protein-to-disease", "direction": "out", "sample_size": 1},
   ]

   paths = graph_db.sample_typed_paths(["drug-1"], pattern)

Cypher Sampling Procedure
-------------------------

The Cypher API exposes multi-hop typed path sampling through a
PyGraphDB-specific procedure call. This delegates to
``GraphDB.sample_typed_paths`` and returns one ``path`` value per sampled path.

.. code-block:: python

   result = graph_db.query(
       'CALL pg.sample_typed_paths(["drug-1"], '
       '[{"edge_type": "drug-to-protein", "direction": "out", "sample_size": 2}, '
       '{"edge_type": "protein-to-disease", "direction": "out", "sample_size": 1}]) '
       'YIELD path RETURN path'
   )

   for record in result:
       print(record["path"])

Sample a Materialized Subgraph
------------------------------

.. code-block:: python

   subgraph = graph_db.sample_typed_subgraph(
       seed_ids=["drug-1"],
       pattern=pattern,
   )

   print(subgraph["nodes"].keys())
   print(subgraph["edges"].keys())
   print(subgraph["paths"])

Rebuild Typed Adjacency
-----------------------

If edge records already exist but typed adjacency indexes are missing, rebuild
them from stored edges.

.. code-block:: python

   rebuilt = graph_db.rebuild_typed_adjacency()
   print(f"rebuilt {rebuilt} typed adjacency records")
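
Reservoir Sampling Sketch
-------------------------

The memory bound described for ``sample_neighbors`` follows from how reservoir
sampling works: the sampler keeps at most ``sample_size`` items while it scans
a stream of unknown length. The sketch below illustrates the general technique
(Algorithm R) in plain Python. It is not PyGraphDB's implementation, and
``reservoir_sample`` is a hypothetical helper name used only for illustration.

.. code-block:: python

   import random
   from typing import Iterable, List, Optional, TypeVar

   T = TypeVar("T")


   def reservoir_sample(
       items: Iterable[T],
       sample_size: int,
       rng: Optional[random.Random] = None,
   ) -> List[T]:
       """Return up to ``sample_size`` items drawn uniformly from ``items``.

       Hypothetical helper illustrating Algorithm R; only the reservoir is
       held in memory, never the full stream.
       """
       if sample_size < 0:
           raise ValueError("sample_size must be non-negative")
       rng = rng or random.Random()
       reservoir: List[T] = []
       for index, item in enumerate(items):
           if index < sample_size:
               # Fill the reservoir with the first ``sample_size`` items.
               reservoir.append(item)
           else:
               # Keep the new item with probability sample_size / (index + 1)
               # by overwriting a uniformly chosen slot.
               slot = rng.randint(0, index)
               if slot < sample_size:
                   reservoir[slot] = item
       return reservoir


   # A generator stands in for a scan over a high-degree node's neighbors.
   neighbor_stream = (f"protein-{i}" for i in range(1, 1001))
   sample = reservoir_sample(neighbor_stream, sample_size=3, rng=random.Random(7))
   print(sample)

Because the input is consumed one item at a time, the same function works on a
lazy adjacency scan without materializing the neighbor list. Passing a seeded
``random.Random`` makes the result reproducible, which mirrors the ``rng``
argument used in the examples above.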