API Reference¶
Graph Models and Database¶
- class pygraphdb.graphdb.Edge(edge_id=None, source=None, target=None, properties=None)[source]¶
Bases: object
Directed graph edge with source, target, and properties.
- Parameters:
edge_id – Optional stable edge identifier. A UUID is generated when omitted.
source – Source node ID or Node instance.
target – Target node ID or Node instance.
properties – Optional edge attributes. Typed traversal reads properties["type"].
Examples
>>> Edge(edge_id="d1-p1", source="drug-1", target="protein-1").source
'drug-1'
- __init__(edge_id=None, source=None, target=None, properties=None)[source]¶
If no edge_id is provided, generate a UUID.
- property get_id¶
Unique identifier for this edge.
- property get_id_bytes¶
Return the edge ID encoded as UTF-8 bytes.
Examples
>>> Edge(edge_id="d1-p1").get_id_bytes
b'd1-p1'
- property get_type¶
Return the typed traversal edge type.
Examples
>>> Edge(properties={"type": "drug-to-protein"}).get_type
'drug-to-protein'
- class pygraphdb.graphdb.GraphDB(store, serializer, indexed_node_properties=None, indexed_edge_properties=None)[source]¶
Bases: object
High-level interface to manage Node/Edge storage, retrieval, and indexing.
- __init__(store, serializer, indexed_node_properties=None, indexed_edge_properties=None)[source]¶
Initialize a graph database wrapper.
- Parameters:
store (KVStore) – KVStore instance such as LMDBStore, LevelDBStore, or PyRexStore.
serializer (Serializer) – Serializer for node, edge, and adjacency payloads.
indexed_node_properties (Optional[list[str]]) – Optional exact-match node property indexes to maintain for future writes.
indexed_edge_properties (Optional[list[str]]) – Optional exact-match edge property indexes to maintain for future writes.
Examples
>>> from pygraphdb.kvstores import LMDBStore
>>> from pygraphdb.serializers import PickleSerializer
>>> graph = GraphDB(LMDBStore(path="/tmp/example"), PickleSerializer(), indexed_node_properties=["name"])
- bfs(start_node_id, direction='any', edge_key_serializer=<function GraphDB.<lambda>>, node_key_serializer=<function GraphDB.<lambda>>)[source]¶
Return a list of node IDs in BFS order, starting from start_node_id. The traversal follows each node's stored adjacency.
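The breadth-first order can be pictured with a minimal, self-contained sketch. It does not call PyGraphDB: the in-memory `adjacency` mapping and the `bfs_order` helper are hypothetical stand-ins for the stored adjacency lists, and direction handling and key serializers are left out.

```python
from collections import deque


def bfs_order(adjacency, start_node_id):
    """Return node IDs in breadth-first order from start_node_id.

    `adjacency` is a hypothetical dict mapping node ID -> list of neighbor IDs.
    """
    visited = {start_node_id}
    order = []
    queue = deque([start_node_id])
    while queue:
        node_id = queue.popleft()
        order.append(node_id)
        for neighbor in adjacency.get(node_id, []):
            if neighbor not in visited:
                # Mark on enqueue so a node is never queued twice.
                visited.add(neighbor)
                queue.append(neighbor)
    return order


adjacency = {
    "drug-1": ["protein-1", "protein-2"],
    "protein-1": ["disease-1"],
    "protein-2": ["disease-1"],
}
print(bfs_order(adjacency, "drug-1"))
# ['drug-1', 'protein-1', 'protein-2', 'disease-1']
```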
- create_edge_property_index(property_name)[source]¶
Register and rebuild an exact-match edge property index.
- Parameters:
property_name (str) – Edge property to index for exact-match lookup.
- Returns:
Number of existing edges added to the index.
Examples
>>> graph_db.create_edge_property_index("score")
7
- create_node_property_index(property_name)[source]¶
Register and rebuild an exact-match node property index.
- Parameters:
property_name (str) – Node property to index for exact-match lookup.
- Returns:
Number of existing nodes added to the index.
Examples
>>> graph_db.create_node_property_index("kind")
10
- delete_edge(edge_id, edge_key_serializer=<function GraphDB.<lambda>>)[source]¶
Remove the edge from the edge store and from the adjacency of both its source and target nodes. If either node does not exist, it is skipped gracefully.
- Parameters:
edge_id (str)
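A rough sketch of the deletion steps, using plain dictionaries in place of the backing store. The `edges` and `adjacency` layouts and the helper name are assumptions for illustration. The 'source'/'target' split mirrors the raw adjacency form described under get_adjacency_list, but which list an edge lands in is assumed here.

```python
def delete_edge_sketch(edges, adjacency, edge_id):
    """Remove an edge record and its entries in both endpoints' adjacency.

    `edges` maps edge ID -> (source, target). `adjacency` maps node ID ->
    {"source": [...], "target": [...]} lists of edge IDs. Both are
    hypothetical in-memory stand-ins, not PyGraphDB structures.
    """
    record = edges.pop(edge_id, None)
    if record is None:
        return False  # nothing stored under this edge ID
    source, target = record
    for node_id, role in ((source, "source"), (target, "target")):
        entry = adjacency.get(node_id)
        if entry is None:
            continue  # endpoint node is missing: skip gracefully
        if edge_id in entry[role]:
            entry[role].remove(edge_id)
    return True


edges = {"e1": ("drug-1", "protein-1")}
# "protein-1" has no adjacency entry, so it is skipped.
adjacency = {"drug-1": {"source": ["e1"], "target": []}}
delete_edge_sketch(edges, adjacency, "e1")
```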
- delete_node(node_id)[source]¶
Delete a node by byte key.
- Parameters:
node_id – Node ID bytes.
Examples
>>> graph_db.delete_node(b"drug-1")
- edge_key_to_bytes(edge_key)[source]¶
Normalize an edge key to bytes.
- Parameters:
edge_key – String or bytes edge key.
- Returns:
UTF-8 encoded bytes.
Examples
>>> GraphDB.edge_key_to_bytes(None, "d1-p1")
b'd1-p1'
- edge_type(edge)[source]¶
Return the type used by typed traversal for an edge.
- Parameters:
edge (Edge) – Edge to inspect.
- Returns:
Edge type string, or None.
Examples
>>> GraphDB.edge_type(None, Edge(properties={"type": "drug-to-protein"}))
'drug-to-protein'
- edges_by_edge_type(node_id, edge_type, direction='out')[source]¶
Return edge IDs connected by a specific edge type.
- Returns:
List of edge ID bytes.
Examples
>>> graph_db.edges_by_edge_type("drug-1", "drug-to-protein")
- edges_by_property(property_name, value)[source]¶
Return edges using an exact-match property index.
- Parameters:
property_name (str) – Indexed edge property name.
value – Exact property value to match.
- Returns:
List of decoded Edge objects.
Examples
>>> graph_db.edges_by_property("score", 1)
- edges_by_type(edge_type)[source]¶
Return edges using the relationship type catalog.
- Parameters:
edge_type (str) – Relationship type stored in edge.properties["type"].
- Returns:
List of decoded Edge objects.
Examples
>>> graph_db.edges_by_type("drug-to-protein")
- get_adjacency_list(node_id, direction='forward', return_raw=False)[source]¶
Return the list of edge IDs connected to node_id, or an empty list if none are found.
- Parameters:
node_id (bytes) – ID of the node whose adjacency is requested.
direction – 'forward', 'backward', or 'any'. Controls whether the source, target, or undirected adjacency of the node is returned.
return_raw – If true, return the data as stored (e.g., a dictionary of 'source' and 'target' lists).
- get_edge(edge_id)[source]¶
Return an edge by byte key.
- Parameters:
edge_id – Edge ID bytes as stored in the backend.
- Returns:
The decoded edge, or None when absent.
Examples
>>> graph_db.get_edge(b"d1-p1")
- get_node(node_id)[source]¶
Return a node by byte key.
- Parameters:
node_id – Node ID bytes as stored in the backend.
- Returns:
The decoded node, or None when absent.
Examples
>>> graph_db.get_node(b"drug-1")
- get_node_keys_generator(num_nodes=None, key_offset=None)[source]¶
Yield node keys from the backing store.
- Parameters:
num_nodes – Optional maximum number of keys to yield.
key_offset – Optional starting key.
- Returns:
Generator of node key bytes.
Examples
>>> list(graph_db.get_node_keys_generator(num_nodes=10))
- get_nodes(node_ids)[source]¶
Fetch nodes with store.get_nodes_bulk(…) and deserialize each one. Returns a list of Node objects, either in the same order as node_ids or containing only the nodes that were found.
- get_typed_adjacency(node_id, edge_type, direction='out')[source]¶
Return typed adjacency records with clean direction semantics.
out means source -> target, in means target -> source, and any returns the union of both directions.
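The direction semantics can be illustrated with a small sketch over an in-memory edge list. The tuple layout and the `typed_adjacency` helper are hypothetical; the real method reads typed adjacency records from the backing store.

```python
def typed_adjacency(edges, node_id, edge_type, direction="out"):
    """Return (edge_id, neighbor_id) pairs for one node and edge type.

    `edges` is a hypothetical list of (edge_id, source, target, edge_type)
    tuples standing in for stored typed adjacency records.
    """
    if direction not in ("out", "in", "any"):
        raise ValueError(f"unsupported direction: {direction!r}")
    pairs = []
    for edge_id, source, target, etype in edges:
        if etype != edge_type:
            continue
        # "out": node_id is the source, so the neighbor is the target.
        if direction in ("out", "any") and source == node_id:
            pairs.append((edge_id, target))
        # "in": node_id is the target, so the neighbor is the source.
        if direction in ("in", "any") and target == node_id:
            pairs.append((edge_id, source))
    return pairs


edges = [
    ("e1", "drug-1", "protein-1", "drug-to-protein"),
    ("e2", "drug-2", "protein-1", "drug-to-protein"),
    ("e3", "drug-1", "drug-2", "similar-to"),
]
print(typed_adjacency(edges, "protein-1", "drug-to-protein", direction="in"))
# [('e1', 'drug-1'), ('e2', 'drug-2')]
```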
- ingest_edges_arrow(edge_ids, sources, targets, edge_types, edge_values, *, append_only=True, native=True, chunk_size=100000)[source]¶
Ingest typed edges from Arrow-like columns.
edge_values is required and must contain serialized edge payloads compatible with the current GraphDB serializer. This ingestion path writes edge records and typed adjacency records only; it intentionally skips legacy adjacency blobs for append-friendly bulk loading.
- Parameters:
edge_ids – Arrow-like or Python column of edge IDs.
sources – Arrow-like or Python column of source node IDs.
targets – Arrow-like or Python column of target node IDs.
edge_types – Arrow-like or Python column of typed traversal labels.
edge_values – Arrow-like or Python column of serialized edge bytes.
append_only (bool) – Columnar ingestion currently requires True.
native (bool) – Use native backend columnar ingestion when available.
chunk_size (int) – Maximum rows per backend write.
- Returns:
Number of ingested edges.
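As an illustration of how chunk_size might bound each backend write, the sketch below slices equal-length columns into row chunks. The `iter_chunks` helper is hypothetical and works on plain Python sequences; it is not the library's ingestion code.

```python
def iter_chunks(columns, chunk_size=100_000):
    """Yield tuples of column slices with at most chunk_size rows each.

    `columns` is a sequence of equal-length, sliceable columns
    (plain lists here; Arrow arrays are assumed to slice similarly).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    lengths = {len(column) for column in columns}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    num_rows = lengths.pop() if lengths else 0
    for start in range(0, num_rows, chunk_size):
        yield tuple(column[start:start + chunk_size] for column in columns)


edge_ids = ["e1", "e2", "e3", "e4", "e5"]
sources = ["a", "a", "b", "b", "c"]
for ids_chunk, sources_chunk in iter_chunks([edge_ids, sources], chunk_size=2):
    print(ids_chunk, sources_chunk)
```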
- ingest_edges_polars(df, *, edge_id='edge_id', source='source', target='target', edge_type='edge_type', edge_value='edge_value', append_only=True, native=True, chunk_size=100000)[source]¶
Ingest typed edges from a Polars DataFrame.
The edge_value column is required and must contain serialized edge payload bytes compatible with the current GraphDB serializer.
- ingest_nodes_arrow(node_ids, node_values, *, native=True, chunk_size=100000)[source]¶
Ingest attributed nodes from Arrow-like columns.
node_values is required and must contain serialized node payloads compatible with the current GraphDB serializer.
- ingest_nodes_polars(df, *, node_id='node_id', node_value='node_value', native=True, chunk_size=100000)[source]¶
Ingest attributed nodes from a Polars DataFrame.
The node_value column is required and must contain serialized node payload bytes compatible with the current GraphDB serializer.
- iter_edge_ids_by_property(property_name, value)[source]¶
Yield edge IDs from an exact-match edge property index.
- Parameters:
property_name (str) – Indexed edge property name.
value – Exact property value to match.
- Yields:
Edge ID bytes matching the property value.
Examples
>>> list(graph_db.iter_edge_ids_by_property("score", 1))
[b'e1']
- iter_edge_ids_by_type(edge_type)[source]¶
Yield edge IDs from the relationship type catalog.
- Parameters:
edge_type (str) – Relationship type stored in edge.properties["type"].
- Yields:
Edge ID bytes with the requested relationship type.
Examples
>>> list(graph_db.iter_edge_ids_by_type("drug-to-protein"))
[b'd1-p1']
- iter_node_ids_by_label(label)[source]¶
Yield node IDs from the label index.
- Parameters:
label (str) – Node label to scan.
- Yields:
Node ID bytes with the requested label.
Examples
>>> list(graph_db.iter_node_ids_by_label("Drug"))
[b'drug-1']
- iter_node_ids_by_property(property_name, value)[source]¶
Yield node IDs from an exact-match property index.
- Parameters:
property_name (str) – Indexed node property name.
value – Exact property value to match.
- Yields:
Node ID bytes matching the property value.
Examples
>>> list(graph_db.iter_node_ids_by_property("kind", "drug"))
[b'drug-1']
- iter_typed_adjacency(node_id, edge_type, direction='out')[source]¶
Yield typed adjacency records with clean direction semantics.
- Yields:
Typed adjacency records containing edge, neighbor, source, target, edge type, and concrete direction fields.
Examples
>>> graph_db.iter_typed_adjacency("drug-1", "drug-to-protein")
- key_to_string(key)[source]¶
Normalize a key to a string.
- Parameters:
key – String or UTF-8 bytes key.
- Returns:
String key.
Examples
>>> GraphDB.key_to_string(None, b"drug-1")
'drug-1'
- neighbors_by_edge_type(node_id, edge_type, direction='out')[source]¶
Return neighbor IDs connected by a specific edge type.
- Returns:
List of neighbor ID bytes.
Examples
>>> graph_db.neighbors_by_edge_type("drug-1", "drug-to-protein")
- node_key_to_bytes(node_key)[source]¶
Normalize a node key to bytes.
- Parameters:
node_key – String or bytes node key.
- Returns:
UTF-8 encoded bytes.
Examples
>>> GraphDB.node_key_to_bytes(None, "drug-1")
b'drug-1'
- nodes_by_label(label)[source]¶
Return nodes with a label using the label index.
- Parameters:
label (str) – Node label to scan.
- Returns:
List of decoded Node objects.
Examples
>>> graph_db.nodes_by_label("Drug")
- nodes_by_property(property_name, value)[source]¶
Return nodes using an exact-match property index.
- Parameters:
property_name (str) – Indexed node property name.
value – Exact property value to match.
- Returns:
List of decoded Node objects.
Examples
>>> graph_db.nodes_by_property("kind", "drug")
- put_edge(edge, update_adjacency=True)[source]¶
Store an edge and update adjacency indexes.
- Parameters:
edge (Edge) – Edge to serialize and write.
update_adjacency – Whether to update the legacy untyped adjacency list.
Examples
>>> graph_db.put_edge(Edge(source="drug-1", target="protein-1"))
- put_edges_bulk(edges, check_existing=True)[source]¶
Store multiple edges and update adjacency indexes in bulk.
Examples
>>> graph_db.put_edges_bulk([Edge(source="drug-1", target="protein-1")], check_existing=False)
- put_node(node)[source]¶
Store a node.
- Parameters:
node (Node) – Node to serialize and write.
Examples
>>> graph_db.put_node(Node(node_id="drug-1"))
- put_nodes(nodes)[source]¶
Store multiple nodes and maintain label/property indexes.
Examples
>>> graph_db.put_nodes([Node(node_id="drug-1", labels=["Drug"])])
- query(cypher, parameters=None)[source]¶
Execute a supported read-only Cypher query.
- Returns:
pygraphdb.cypher.QueryResult containing projected records.
Examples
>>> graph_db.query('MATCH (n:Drug) RETURN n')
>>> graph_db.query('MATCH (a {id: "drug-1"})-[:drug-to-protein]->(b) RETURN a, b')
- range_query_nodes(property_name, start_val, end_val)[source]¶
Example stub. An implementation might rely on the underlying store to handle node indexing for range queries.
- Parameters:
property_name (str)
- rebuild_edge_property_index(property_name)[source]¶
Rebuild an exact-match edge property index from stored edges.
- Parameters:
property_name (str) – Edge property to index.
- Returns:
Number of indexed edge records.
Examples
>>> graph_db.rebuild_edge_property_index("score")
7
- rebuild_label_index()[source]¶
Rebuild the node label index from stored nodes.
- Returns:
Number of label index entries written.
Examples
>>> graph_db.rebuild_label_index()
12
- rebuild_node_property_index(property_name)[source]¶
Rebuild an exact-match node property index from stored nodes.
- Parameters:
property_name (str) – Node property to index.
- Returns:
Number of indexed node records.
Examples
>>> graph_db.rebuild_node_property_index("name")
3
- rebuild_relationship_type_index()[source]¶
Rebuild the relationship type catalog from stored edges.
- Returns:
Number of typed edge records indexed.
Examples
>>> graph_db.rebuild_relationship_type_index()
20
- rebuild_typed_adjacency()[source]¶
Rebuild typed adjacency indexes from stored edge records.
- Returns:
Number of typed edges indexed.
Examples
>>> graph_db.rebuild_typed_adjacency()
- sample_neighbors(node_id, edge_type, direction='out', sample_size=10, rng=None)[source]¶
Sample typed neighbors using reservoir sampling.
- Returns:
List of typed adjacency records.
Examples
>>> graph_db.sample_neighbors("drug-1", "drug-to-protein", sample_size=2)
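For readers unfamiliar with reservoir sampling, here is a minimal sketch of the classic single-pass variant (Algorithm R). It needs only an `rng` object with `randrange`, matching the rng parameter described for the sampling methods. The helper name and the exact variant are assumptions and may differ from the library's implementation.

```python
import random


def reservoir_sample(items, sample_size, rng=None):
    """Return up to sample_size items chosen uniformly from an iterable.

    Only one pass is made over `items`, so the total count does not need
    to be known in advance (useful when streaming adjacency records).
    """
    if rng is None:
        rng = random  # the random module exposes randrange as well
    reservoir = []
    for index, item in enumerate(items):
        if index < sample_size:
            reservoir.append(item)  # fill the reservoir first
        else:
            # Keep the new item with probability sample_size / (index + 1).
            slot = rng.randrange(index + 1)
            if slot < sample_size:
                reservoir[slot] = item
    return reservoir


neighbors = (f"protein-{i}" for i in range(1000))
print(reservoir_sample(neighbors, sample_size=2, rng=random.Random(7)))
```

Passing a seeded `random.Random` instance makes a run reproducible, which is one reason an injectable rng is convenient.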
- sample_typed_paths(seed_ids, pattern, rng=None)[source]¶
Sample paths that follow an ordered typed edge pattern.
- Parameters:
seed_ids – Starting node IDs as strings or bytes.
pattern (SamplingPattern | list[dict]) – SamplingPattern or list of dictionaries such as {"edge_type": "drug-to-protein", "direction": "out", "sample_size": 2}.
rng – Optional random number generator with randrange.
- Returns:
List of dictionaries with seed and sampled path records.
Examples
>>> from pygraphdb.sampling import SamplingHop, SamplingPattern
>>> pattern = SamplingPattern([SamplingHop("drug-to-protein", sample_size=2)])
>>> graph_db.sample_typed_paths(["drug-1"], pattern)
- sample_typed_subgraph(seed_ids, pattern, rng=None)[source]¶
Sample and materialize a typed subgraph around seed nodes.
- Parameters:
seed_ids – Starting node IDs as strings or bytes.
pattern (SamplingPattern | list[dict]) – SamplingPattern or list of dictionary hop configs.
rng – Optional random number generator with randrange.
- Returns:
Dictionary with nodes, edges, and paths entries.
Examples
>>> pattern = [{"edge_type": "drug-to-protein", "direction": "out", "sample_size": 2}]
>>> graph_db.sample_typed_subgraph(["drug-1"], pattern)
- class pygraphdb.graphdb.GraphEntityDictSerializer(serializer)[source]¶
Bases: object
Serialize graph entities through a dictionary-compatible serializer.
- Parameters:
serializer (Serializer) – Serializer used for the final bytes conversion.
Examples
>>> from pygraphdb.serializers import JSONSerializer
>>> s = GraphEntityDictSerializer(JSONSerializer())
>>> s.deserialize(s.serialize(Node(node_id="n1"), "Node"), "Node").get_id
'n1'
- __init__(serializer)[source]¶
Initialize the entity serializer wrapper.
- Parameters:
serializer (Serializer) – Serializer used to encode dictionaries as bytes.
- deserialize(val, entity_type)[source]¶
Deserialize bytes into a graph entity by entity type.
- Parameters:
val – Bytes containing the serialized data.
entity_type (str) – One of "Node", "Edge", or "AdjacencyList".
- serialize(entity, entity_type)[source]¶
Serialize a graph entity by entity type.
- Parameters:
entity – Node, Edge, or adjacency-list object.
entity_type (str) – One of "Node", "Edge", or "AdjacencyList".
- Returns:
Serialized bytes.
Examples
>>> from pygraphdb.serializers import PickleSerializer
>>> GraphEntityDictSerializer(PickleSerializer()).serialize(Node("n1"), "Node")[:1]
b'\x80'
- class pygraphdb.graphdb.Node(node_id=None, properties=None, labels=None)[source]¶
Bases: object
Graph node with an ID, native labels, and arbitrary properties.
- Parameters:
node_id – Optional stable node identifier. A UUID is generated when omitted.
properties – Optional dictionary of node attributes.
labels – Optional iterable of node labels. Labels are stored natively and maintained in the label index by GraphDB.
Examples
>>> Node(node_id="drug-1", labels=["Drug"], properties={"kind": "drug"}).get_id
'drug-1'
>>> Node(node_id="drug-1", labels=["Drug", "Drug"]).labels
('Drug',)
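The second example shows duplicate labels collapsing into a tuple. One way such normalization could be written is sketched below. The `normalize_labels` helper is hypothetical, and keeping first-seen order is an assumption rather than documented behavior.

```python
def normalize_labels(labels):
    """Return labels as a tuple with duplicates removed.

    dict.fromkeys keeps the first occurrence of each label in order;
    whether Node preserves order the same way is an assumption.
    """
    if labels is None:
        return ()
    return tuple(dict.fromkeys(labels))


print(normalize_labels(["Drug", "Drug"]))
# ('Drug',)
```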
- __init__(node_id=None, properties=None, labels=None)[source]¶
Initialize a node, generating a UUID when node_id is omitted.
- classmethod from_dict(data)[source]¶
Create a node from serialized dictionary data.
- Parameters:
data (dict) – Dictionary produced by to_dict. Older dictionaries without labels deserialize with an empty label tuple.
- Returns:
Node instance.
Examples
>>> Node.from_dict({"id": "n1", "properties": {}}).labels
()
- property get_id¶
Unique identifier for this node.
- property get_id_bytes¶
Return the node ID encoded as UTF-8 bytes.
Examples
>>> Node(node_id="drug-1").get_id_bytes
b'drug-1'
- class pygraphdb.graphdb.TimeIndexedEdge(timestamp_dat, *args, **kwargs)[source]¶
Bases: Edge
Edge whose byte key is prefixed by a timestamp.
- Parameters:
timestamp_dat – Datetime used as the sortable key prefix.
*args – Positional arguments passed to Edge.
**kwargs – Keyword arguments passed to Edge.
Examples
>>> edge = TimeIndexedEdge(datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc), edge_id="e1")
>>> edge.get_id_bytes.endswith(b':e1')
True
- __init__(timestamp_dat, *args, **kwargs)[source]¶
Initialize a timestamp-prefixed edge.
- Parameters:
timestamp_dat – Datetime used as the sortable key prefix.
*args – Positional arguments passed to Edge.
**kwargs – Keyword arguments passed to Edge.
- property get_id_bytes¶
Return timestamp-prefixed edge ID bytes.
Examples
>>> edge = TimeIndexedEdge(datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc), edge_id="e1")
>>> edge.get_id_bytes.endswith(b':e1')
True
- pygraphdb.graphdb.bytes_to_datetime(b, tzinfo=datetime.timezone.utc)[source]¶
Convert bytes produced by datetime_to_bytes back to a datetime.
- Parameters:
b (bytes) – Eight-byte timestamp generated by datetime_to_bytes.
tzinfo – Time zone used for the epoch reference.
- Returns:
Decoded datetime.
Examples
>>> bytes_to_datetime(b'\x00' * 8)
datetime.datetime(1970, 1, 1, 0, 0, tzinfo=datetime.timezone.utc)
- pygraphdb.graphdb.datetime_to_bytes(dt, tzinfo=datetime.timezone.utc)[source]¶
Convert a datetime to big-endian microseconds since the Unix epoch.
- Parameters:
dt (datetime) – Datetime at or after 1970-01-01.
tzinfo – Time zone used for the epoch reference.
- Returns:
Eight bytes containing the timestamp as an unsigned integer.
Examples
>>> datetime_to_bytes(datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc))
b'\x00\x00\x00\x00\x00\x00\x00\x00'
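The encoding described above (eight bytes, big-endian, unsigned microseconds since the Unix epoch) can be reproduced with the standard library. This is a sketch under those stated assumptions, not the library source: `to_bytes`, `from_bytes`, and the `prefix + b":" + edge_id` key layout are illustrative, the last inferred only from the `endswith(b':e1')` doctest.

```python
import datetime
import struct

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def to_bytes(dt):
    """Encode a datetime as 8-byte big-endian microseconds since EPOCH."""
    micros = (dt - EPOCH) // datetime.timedelta(microseconds=1)
    # ">Q" is an unsigned 64-bit integer, so datetimes before 1970
    # raise struct.error instead of producing a key.
    return struct.pack(">Q", micros)


def from_bytes(b):
    """Decode bytes produced by to_bytes back to an aware datetime."""
    (micros,) = struct.unpack(">Q", b)
    return EPOCH + datetime.timedelta(microseconds=micros)


earlier = datetime.datetime(2020, 1, 1, tzinfo=datetime.timezone.utc)
later = datetime.datetime(2021, 1, 1, tzinfo=datetime.timezone.utc)

# Big-endian prefixes compare bytewise in chronological order, which is
# what makes a timestamp-prefixed key sortable in an ordered KV store.
key_earlier = to_bytes(earlier) + b":" + b"e1"  # hypothetical key layout
key_later = to_bytes(later) + b":" + b"e0"
print(key_earlier < key_later)
# True
```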
Sampling Configuration¶
Typed sampling configuration objects for PyGraphDB.
The graph sampling APIs accept these objects as a structured alternative to plain dictionaries while preserving dict compatibility.
- class pygraphdb.sampling.SamplingHop(edge_type, direction='out', sample_size=10)[source]¶
Bases: object
Configuration for one typed sampling hop.
- Parameters:
edge_type (str) – Edge type to traverse, read from edge.properties["type"].
direction (str) – Traversal direction. Use "out" for source to target, "in" for target to source, or "any" for both directions.
sample_size (int) – Maximum number of neighbors to sample at this hop for each node in the current frontier.
Examples
>>> hop = SamplingHop("drug-to-protein", direction="out", sample_size=2)
>>> hop.to_dict()
{'edge_type': 'drug-to-protein', 'direction': 'out', 'sample_size': 2}
- classmethod from_dict(data)[source]¶
Create a hop from a dictionary-style sampling configuration.
- Parameters:
data (Mapping[str, object]) – Mapping with edge_type and optional direction and sample_size keys.
- Returns:
A validated SamplingHop instance.
Examples
>>> SamplingHop.from_dict({'edge_type': 'drug-to-protein', 'sample_size': 2})
SamplingHop(edge_type='drug-to-protein', direction='out', sample_size=2)
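To show how a dictionary-compatible hop configuration might be modeled, here is a small dataclass sketch. `HopSketch` is a hypothetical stand-in for SamplingHop. The defaults follow the signature above, while the validation rules are assumptions about what "validated" could mean.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class HopSketch:
    """Hypothetical stand-in for SamplingHop (not the library class)."""

    edge_type: str
    direction: str = "out"
    sample_size: int = 10

    def __post_init__(self):
        # Assumed validation rules; the real checks may differ.
        if self.direction not in ("out", "in", "any"):
            raise ValueError(f"unsupported direction: {self.direction!r}")
        if self.sample_size < 0:
            raise ValueError("sample_size must be non-negative")

    @classmethod
    def from_dict(cls, data):
        """Build a hop from a mapping, applying defaults for missing keys."""
        return cls(
            edge_type=data["edge_type"],
            direction=data.get("direction", "out"),
            sample_size=data.get("sample_size", 10),
        )

    def to_dict(self):
        return asdict(self)


hop = HopSketch.from_dict({"edge_type": "drug-to-protein", "sample_size": 2})
print(hop.to_dict())
# {'edge_type': 'drug-to-protein', 'direction': 'out', 'sample_size': 2}
```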
- class pygraphdb.sampling.SamplingPattern(hops)[source]¶
Bases: object
Ordered typed sampling pattern.
- Parameters:
hops (Sequence[SamplingHop | Mapping[str, object]]) – Sequence of SamplingHop objects or dictionary-style hop configurations.
Examples
>>> pattern = SamplingPattern([
...     SamplingHop("drug-to-protein", sample_size=2),
...     {"edge_type": "protein-to-disease", "direction": "out"},
... ])
>>> len(pattern)
2
- classmethod from_dicts(hops)[source]¶
Create a pattern from dictionary-style hop configurations.
- Parameters:
hops (Iterable[Mapping[str, object]]) – Iterable of mappings accepted by SamplingHop.from_dict.
- Returns:
A normalized sampling pattern.
Examples
>>> SamplingPattern.from_dicts([{'edge_type': 'drug-to-protein'}]).to_dicts()[0]['edge_type']
'drug-to-protein'
- pygraphdb.sampling.as_sampling_hop(hop)[source]¶
Normalize a hop configuration to SamplingHop.
- Parameters:
hop (SamplingHop | Mapping[str, object]) – Either a SamplingHop or dictionary-style hop configuration.
- Returns:
A SamplingHop instance.
Examples
>>> as_sampling_hop({'edge_type': 'drug-to-protein'}).edge_type
'drug-to-protein'
- pygraphdb.sampling.as_sampling_pattern(pattern)[source]¶
Normalize a sampling pattern to SamplingPattern.
- Parameters:
pattern (SamplingPattern | Iterable[SamplingHop | Mapping[str, object]]) – A SamplingPattern or iterable of hop configurations.
- Returns:
A SamplingPattern instance.
Examples
>>> as_sampling_pattern([{'edge_type': 'drug-to-protein'}]).hops[0].sample_size
10
Columnar Ingestion¶
Columnar ingestion containers for PyGraphDB.
- class pygraphdb.ingestion.EdgeList(edge_ids, sources, targets, edge_types, edge_values)[source]¶
Bases: object
Columnar typed edges with caller-provided serialized edge values.
- classmethod from_arrow(edge_ids, sources, targets, edge_types, edge_values)[source]¶
Create an edge list from Arrow-like or Python columns.
- class pygraphdb.ingestion.NodeList(node_ids, node_values)[source]¶
Bases: object
Columnar nodes with caller-provided serialized node values.
- classmethod from_arrow(node_ids, node_values)[source]¶
Create a node list from Arrow-like or Python columns.
Cypher Queries¶
Minimal read-only Cypher support for PyGraphDB.
The supported subset maps directly to existing typed adjacency and sampling APIs:
MATCH (a {id: "node-id"})-[:TYPE1]->(b)<-[:TYPE2]-(c) RETURN a.name, b LIMIT 10
CALL pg.sample_typed_paths(["node-id"], [{"edge_type": "TYPE", "sample_size": 2}]) YIELD path RETURN path
- class pygraphdb.cypher.QueryResult(columns, records)[source]¶
Bases: object
Tabular query result returned by GraphDB.query.
columns contains projected column names in return order. records is a list of dictionaries keyed by column name.
Examples
>>> result = QueryResult(columns=("n",), records=[{"n": "node"}])
>>> len(result)
1
>>> list(result)[0]["n"]
'node'
- pygraphdb.cypher.execute(graph, query, parameters=None)[source]¶
Execute a supported Cypher query against a GraphDB instance.
- Returns:
QueryResult with projected records.
Examples
>>> execute(graph_db, 'MATCH (n:Drug) RETURN n')
- pygraphdb.cypher.parse(query)[source]¶
Parse the supported Cypher subset.
- Parameters:
query (str) – Cypher query text.
- Returns:
Parsed query object.
- Raises:
ValueError – If the query is outside the supported subset.
- Return type:
MatchQuery | SampleTypedPathsCall | NodeScanQuery | RelationshipScanQuery
Examples
>>> parse('MATCH (n:Drug) RETURN n').label
'Drug'
Key-Value Stores¶
- class pygraphdb.kvstores.KVStore[source]¶
Bases: object
Abstract interface for a simple key-value store.
- delete_index_entry(index_name, key_parts, value)[source]¶
Delete one sorted index entry.
Examples
>>> store.delete_index_entry("node_label", [b"Drug"], b"drug-1")
- delete_typed_adjacency(source_id, target_id, edge_type, edge_id)[source]¶
Delete typed adjacency records for an edge.
- ingest_edges_columnar(edge_list, *, append_only=True, native=True)[source]¶
Store columnar typed edges with caller-provided serialized values.
- ingest_nodes_columnar(node_list, *, native=True)[source]¶
Store columnar nodes with caller-provided serialized values.
- Parameters:
native (bool)
- iter_index_prefix(index_name, key_parts)[source]¶
Yield values whose index key starts with key_parts.
- Yields:
Values associated with matching index entries.
Examples
>>> list(store.iter_index_prefix("node_label", [b"Drug"]))
[b'drug-1']
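A prefix scan over a sorted index can be sketched with a sorted list and `bisect`. The composite key layout (index name, key parts, and value joined by a NUL separator) and the `SortedIndexSketch` class are assumptions for illustration. The actual backends may encode keys differently.

```python
import bisect

SEP = b"\x00"  # hypothetical separator between key components


class SortedIndexSketch:
    """In-memory stand-in for a sorted index inside an ordered KV store."""

    def __init__(self):
        self._keys = []  # composite keys, kept sorted

    @staticmethod
    def _prefix(index_name, key_parts):
        # The trailing SEP stops b"Drug" from matching b"DrugClass".
        return SEP.join([index_name.encode("utf-8"), *key_parts]) + SEP

    def put_index_entry(self, index_name, key_parts, value):
        key = self._prefix(index_name, key_parts) + value
        pos = bisect.bisect_left(self._keys, key)
        if pos == len(self._keys) or self._keys[pos] != key:
            self._keys.insert(pos, key)

    def iter_index_prefix(self, index_name, key_parts):
        prefix = self._prefix(index_name, key_parts)
        # Seek to the first key >= prefix, then walk while it still matches.
        pos = bisect.bisect_left(self._keys, prefix)
        while pos < len(self._keys) and self._keys[pos].startswith(prefix):
            yield self._keys[pos][len(prefix):]
            pos += 1


index = SortedIndexSketch()
index.put_index_entry("node_label", [b"Drug"], b"drug-2")
index.put_index_entry("node_label", [b"Drug"], b"drug-1")
index.put_index_entry("node_label", [b"Disease"], b"disease-1")
print(list(index.iter_index_prefix("node_label", [b"Drug"])))
# [b'drug-1', b'drug-2']
```

Key parts that themselves contain the separator byte would be ambiguous in this layout. A production encoding would need escaping or length prefixes.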
- iter_typed_adjacency(node_id, edge_type, direction='out')[source]¶
Yield typed adjacency records for a node and edge type.
- put_index_entries_bulk(entries)[source]¶
Store many sorted index entries.
- Parameters:
entries (list[tuple[str, list[bytes], bytes]]) – Tuples of (index_name, key_parts, value).
Examples
>>> store.put_index_entries_bulk([("node_label", [b"Drug"], b"drug-1")])
- put_index_entry(index_name, key_parts, value)[source]¶
Store one sorted index entry.
Examples
>>> store.put_index_entry("node_label", [b"Drug"], b"drug-1")
- put_nodes_bulk(keys_and_values)[source]¶
Store multiple serialized node values in a single batch or transaction when possible.
- put_typed_adjacency(source_id, target_id, edge_type, edge_id)[source]¶
Store typed adjacency records for an edge.
- class pygraphdb.kvstores.LMDBStore(path='graph_lmdb', map_size=10485760, map_id=True, map_keys=False)[source]¶
Bases: KVStore
LMDB implementation of the PyGraphDB key-value store.
Examples
>>> store = LMDBStore(path="/tmp/example_graph_lmdb")
- __init__(path='graph_lmdb', map_size=10485760, map_id=True, map_keys=False)[source]¶
Create or open an LMDB environment with three named sub-databases:
b'nodes' for node data
b'edges' for edge data
b'adj' for adjacency lists
- delete(key)[source]¶
Placeholder generic delete; graph code uses specialized methods.
- Parameters:
key (bytes)
- delete_typed_adjacency(source_id, target_id, edge_type, edge_id)[source]¶
Delete forward and reverse typed adjacency records.
- get_adjacency_bulk(node_ids)[source]¶
Retrieve multiple adjacency lists in a single read transaction. Returns a dict { node_id: serialized adjacency } for all found items.
- get_edge_keys_generator(num_edges=None, key_offset=None)[source]¶
Yield edge keys from the edge database.
- get_node_keys_generator(num_nodes=None, key_offset=None)[source]¶
Yield node keys from the node database.
- iter_index_prefix(index_name, key_parts)[source]¶
Yield values whose index key starts with key_parts.
- iter_typed_adjacency(node_id, edge_type, direction='out')[source]¶
Yield typed adjacency (edge_id, neighbor_id) pairs.
- put_adjacency_bulk(adj_dict)[source]¶
Insert or update multiple adjacency lists in one transaction.
- Parameters:
adj_dict – Dictionary mapping node_id to serialized adjacency (list of edges).
- put_typed_adjacency(source_id, target_id, edge_type, edge_id)[source]¶
Store forward and reverse typed adjacency records.
- class pygraphdb.kvstores.LevelDBStore(path='graph_leveldb')[source]¶
Bases: KVStore
LevelDB implementation backed by plyvel.
- Parameters:
path – Directory that will contain the LevelDB sub-databases.
Examples
>>> store = LevelDBStore(path="/tmp/example_graph_leveldb")
- __init__(path='graph_leveldb')[source]¶
Create or open a LevelDB store. Nodes and edges are stored by prefix.
- delete_typed_adjacency(source_id, target_id, edge_type, edge_id)[source]¶
Delete forward and reverse typed adjacency records.
- get_db_path(db_string='nodes')[source]¶
Return the relative path for a named LevelDB database.
Examples
>>> LevelDBStore.get_db_path.__name__
'get_db_path'
- get_edge_keys_generator(num_edges=None, key_offset=None)[source]¶
Yield edge keys from the edge database.
- get_node_keys_generator(num_nodes=None, key_offset=None)[source]¶
Yield node keys from the node database.
- iter_index_prefix(index_name, key_parts)[source]¶
Yield values whose index key starts with key_parts.
- iter_typed_adjacency(node_id, edge_type, direction='out')[source]¶
Yield typed adjacency (edge_id, neighbor_id) pairs.
- put_adjacency_bulk(adj_dict)[source]¶
Insert or update multiple adjacency lists in one write batch.
- Parameters:
adj_dict – Dictionary mapping node_id to serialized adjacency.
- put_typed_adjacency(source_id, target_id, edge_type, edge_id)[source]¶
Store forward and reverse typed adjacency records.
- class pygraphdb.kvstores.PyRexStore(path='graph_rocksdb', parallelism=None, max_background_jobs=None, write_buffer_size=None, bloom_bits_per_key=None, disable_wal=False)[source]¶
Bases: KVStore
RocksDB implementation backed by pyrex-rocksdb.
PyRexStore uses one physical RocksDB database with prefixed keys instead of separate databases. This lets node, edge, adjacency, and typed adjacency records share RocksDB’s write path and makes it possible to benchmark RocksDB tuning options against the existing LevelDB backend.
- Parameters:
path – Directory for the RocksDB database.
parallelism – Optional number of RocksDB background threads.
max_background_jobs – Optional RocksDB background job limit.
write_buffer_size – Optional write buffer size in bytes.
bloom_bits_per_key – Optional block-based Bloom filter bits per key.
disable_wal – Disable RocksDB’s write-ahead log for faster but less durable ingestion benchmarks.
Examples
>>> store = PyRexStore(path="/tmp/example_graph_rocksdb")
- __init__(path='graph_rocksdb', parallelism=None, max_background_jobs=None, write_buffer_size=None, bloom_bits_per_key=None, disable_wal=False)[source]¶
Open a PyRex/RocksDB store with optional tuning settings.
- delete_typed_adjacency(source_id, target_id, edge_type, edge_id)[source]¶
Delete forward and reverse typed adjacency records.
- get_edge_keys_generator(num_edges=None, key_offset=None)[source]¶
Yield edge keys from the shared RocksDB keyspace.
- get_node_keys_generator(num_nodes=None, key_offset=None)[source]¶
Yield node keys from the shared RocksDB keyspace.
- has_native_columnar_ingestion()[source]¶
Return whether this PyRex runtime exposes native columnar writes.
- ingest_edges_columnar(edge_list, *, append_only=True, native=True)[source]¶
Store columnar typed edges, using native PyRex ingestion when available.
- ingest_nodes_columnar(node_list, *, native=True)[source]¶
Store columnar nodes, using native PyRex ingestion when available.
- Parameters:
native (bool)
- iter_index_prefix(index_name, key_parts)[source]¶
Yield values whose index key starts with key_parts.
- iter_typed_adjacency(node_id, edge_type, direction='out')[source]¶
Yield typed adjacency (edge_id, neighbor_id) pairs.
- put_adjacency_bulk(adj_dict)[source]¶
Store many serialized adjacency lists in one RocksDB write batch.
- put_typed_adjacency(source_id, target_id, edge_type, edge_id)[source]¶
Store forward and reverse typed adjacency records.
- class pygraphdb.kvstores.SimpleIndexCounterKVStore(dbenv=None, db_path=b'nodes')[source]¶
Bases: object
Helper that lowers storage requirements for edge and node keys by casting them to long ints.
It uses struct.pack and struct.unpack together with a simple counter (also stored in the metadata) that tracks the number of keys, and hence the next index, already entered.
- __init__(dbenv=None, db_path=b'nodes')[source]¶
Initialize an index counter helper.
- Parameters:
dbenv – LMDB environment.
db_path – Named LMDB database for the counter mapping.
- encode_db_key(key)[source]¶
If the key exists, return its existing encoded key. Otherwise, add it to the KV store under the next counter value and return that.
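The counter-based encoding can be sketched as follows. A plain dictionary stands in for the LMDB database and metadata. The 8-byte big-endian format and the zero-based counter are assumptions, since only the use of struct.pack and a counter is documented.

```python
import struct


class IndexCounterSketch:
    """Map arbitrary keys to compact fixed-width integer keys."""

    def __init__(self):
        self._mapping = {}  # original key -> packed integer key
        self._counter = 0   # number of keys assigned so far

    def encode_db_key(self, key):
        existing = self._mapping.get(key)
        if existing is not None:
            return existing  # key seen before: reuse its encoding
        # Assumed format: unsigned 64-bit big-endian integer.
        encoded = struct.pack(">Q", self._counter)
        self._mapping[key] = encoded
        self._counter += 1
        return encoded


counter = IndexCounterSketch()
first = counter.encode_db_key(b"a-very-long-node-identifier")
second = counter.encode_db_key(b"another-long-node-identifier")
print(len(first), struct.unpack(">Q", second)[0])
# 8 1
```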
- class pygraphdb.kvstores.SimpleKV(db_path)[source]¶
Bases: object
Small LMDB-backed helper for metadata key/value access.
- Parameters:
db_path – LMDB database handle or name used by transactions.
Serializers¶
- class pygraphdb.serializers.JSONSerializer[source]¶
Bases: Serializer
Uses JSON for serialization.
- class pygraphdb.serializers.MessagePackSerializer[source]¶
Bases: Serializer
Uses MessagePack for serialization.
- deserialize(data)[source]¶
Deserialize MessagePack bytes.
- Raises:
ImportError – If the optional msgpack package is missing.
- Parameters:
data (bytes)
Examples
>>> MessagePackSerializer().deserialize(MessagePackSerializer().serialize({"a": 1}))
{'a': 1}
- serialize(obj)[source]¶
Serialize an object with MessagePack.
- Raises:
ImportError – If the optional msgpack package is missing.
- Parameters:
obj (dict)
Examples
>>> MessagePackSerializer().deserialize(MessagePackSerializer().serialize({"a": 1}))
{'a': 1}
- class pygraphdb.serializers.PickleSerializer[source]¶
Bases: Serializer
Uses Python’s pickle for serialization.
- class pygraphdb.serializers.ProtobufSerializer[source]¶
Bases: Serializer
Uses google.protobuf Struct for JSON-like dictionaries.
Struct does not have native integer or bytes types. This serializer tags those values before encoding so Python dictionaries round-trip without losing them.
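The tagging idea can be shown without protobuf installed. In the sketch below, integers and bytes are wrapped in single-key marker dictionaries before encoding and unwrapped afterwards. The tag names `__int__` and `__bytes__`, the base64 encoding, and both helpers are hypothetical, as the serializer's actual tag format is not documented here.

```python
import base64

INT_TAG = "__int__"      # hypothetical tag names, chosen for illustration
BYTES_TAG = "__bytes__"


def tag(value):
    """Wrap ints and bytes so they survive a float/string-only encoding."""
    if isinstance(value, bool):
        return value  # bool is a subclass of int; leave it untouched
    if isinstance(value, int):
        return {INT_TAG: str(value)}
    if isinstance(value, bytes):
        return {BYTES_TAG: base64.b64encode(value).decode("ascii")}
    if isinstance(value, dict):
        return {key: tag(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [tag(item) for item in value]
    return value


def untag(value):
    """Reverse tag(), restoring ints and bytes."""
    if isinstance(value, dict):
        if set(value) == {INT_TAG}:
            return int(value[INT_TAG])
        if set(value) == {BYTES_TAG}:
            return base64.b64decode(value[BYTES_TAG])
        return {key: untag(item) for key, item in value.items()}
    if isinstance(value, list):
        return [untag(item) for item in value]
    return value


payload = {"count": 3, "blob": b"\x00\x01", "score": 0.5, "nested": {"ids": [1, 2]}}
restored = untag(tag(payload))
print(restored == payload, type(restored["count"]).__name__)
# True int
```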
- deserialize(data)[source]¶
Deserialize protobuf Struct bytes.
- Parameters:
data (bytes) – Protobuf binary payload.
- Returns:
Decoded dictionary.
- Raises:
ImportError – If the optional protobuf package is missing.
- serialize(obj)[source]¶
Serialize a JSON-like dictionary with protobuf Struct.
- Parameters:
obj (dict) – Dictionary containing JSON-like values plus tagged ints/bytes.
- Returns:
Protobuf binary payload.
- Raises:
ImportError – If the optional protobuf package is missing.