Storage Backends
PyGraphDB separates graph logic from storage. GraphDB receives a key-value
store instance and a serializer instance.
LMDB Backend
Use LMDBStore for a mature embedded backend with named sub-databases.
from pygraphdb.graphdb import GraphDB
from pygraphdb.kvstores import LMDBStore
from pygraphdb.serializers import PickleSerializer
store = LMDBStore(path="graph_lmdb", map_size=2**30)
graph_db = GraphDB(store, PickleSerializer())
LMDB keeps separate databases for nodes, edges, adjacency, typed adjacency, and
sorted indexes. Increase map_size when loading large graphs.
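One way to choose a larger map_size is to start from an estimate of the data volume and add headroom. The helper below is a minimal sketch, not part of PyGraphDB: the name suggest_map_size and the default headroom factor of 2.0 are assumptions made for illustration. It rounds the result up to a multiple of the OS page size, which LMDB's documentation recommends for the map size.

```python
import mmap


def suggest_map_size(expected_bytes: int, headroom: float = 2.0) -> int:
    """Return a map size in bytes with headroom, rounded up to a page multiple.

    Hypothetical helper for illustration; the headroom factor is a guess,
    not a PyGraphDB or LMDB recommendation.
    """
    if expected_bytes <= 0:
        raise ValueError("expected_bytes must be positive")
    page = mmap.PAGESIZE
    target = int(expected_bytes * headroom)
    # Ceiling division, then scale back up to a whole number of pages.
    return -(-target // page) * page


# Example: size the map for roughly 1 GiB of expected data.
map_size = suggest_map_size(2**30)
```

The resulting value could then be passed as LMDBStore(path=..., map_size=map_size).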
LevelDB Backend
Use LevelDBStore when you want LevelDB through plyvel.
from pygraphdb.graphdb import GraphDB
from pygraphdb.kvstores import LevelDBStore
from pygraphdb.serializers import PickleSerializer
store = LevelDBStore(path="graph_leveldb")
graph_db = GraphDB(store, PickleSerializer())
plyvel requires compatible CPython wheels or local LevelDB build tooling. If
installation fails on Python 3.14 or a free-threaded interpreter, create a Python
3.12 environment and install pygraphdb[leveldb] there.
RocksDB Backend
Use PyRexStore for RocksDB through the optional pyrex-rocksdb package.
This backend uses one physical RocksDB database with prefixed keys and exposes
several RocksDB tuning knobs.
from pygraphdb.graphdb import GraphDB
from pygraphdb.kvstores import PyRexStore
from pygraphdb.serializers import PickleSerializer
store = PyRexStore(
    path="graph_rocksdb",
    parallelism=4,
    max_background_jobs=4,
    write_buffer_size=64 * 1024 * 1024,
    bloom_bits_per_key=10,
)
graph_db = GraphDB(store, PickleSerializer())
disable_wal=True can be useful for bulk-loading experiments, but it weakens
durability and should not be used as a safe default.
When installed with pyrex-rocksdb>=0.3.0a0, PyRexStore can use PyRex’s
native write_columnar_batch API through GraphDB.ingest_nodes_arrow and
GraphDB.ingest_edges_arrow. The columnar methods currently require
caller-provided serialized node_value and edge_value payloads, and edge
ingestion is append-only.
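The "one physical database with prefixed keys" layout mentioned at the top of this section can be pictured with a toy in-memory store. This is only a sketch of the general technique: the PrefixedStore class and the prefix bytes below are hypothetical, and PyRexStore's actual key layout may differ.

```python
from typing import Dict, Iterator, Optional, Tuple

# Hypothetical prefixes chosen for illustration only.
PREFIXES = {"nodes": b"n:", "edges": b"e:", "adjacency": b"a:"}


class PrefixedStore:
    """Toy single-keyspace store that emulates named sub-databases."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, namespace: str, key: bytes, value: bytes) -> None:
        # The namespace prefix keeps logical tables apart in one keyspace.
        self._data[PREFIXES[namespace] + key] = value

    def get(self, namespace: str, key: bytes) -> Optional[bytes]:
        return self._data.get(PREFIXES[namespace] + key)

    def scan(self, namespace: str) -> Iterator[Tuple[bytes, bytes]]:
        # Yield (key, value) pairs of one namespace in key order,
        # with the prefix stripped from each key.
        prefix = PREFIXES[namespace]
        for full_key in sorted(self._data):
            if full_key.startswith(prefix):
                yield full_key[len(prefix):], self._data[full_key]


store = PrefixedStore()
store.put("nodes", b"1", b"node-payload")
store.put("edges", b"1", b"edge-payload")
```

The same key b"1" can live under both namespaces without colliding, which is the role separate named databases play in the LMDB backend.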
Sorted Indexes
All backends implement a small sorted index interface used by labels, relationship type catalogs, and explicit exact-match property indexes:
- put_index_entry(index_name, key_parts, value)
- put_index_entries_bulk(entries)
- delete_index_entry(index_name, key_parts, value)
- iter_index_prefix(index_name, key_parts)
The backend answers lookups by prefix-scanning these indexes instead of deserializing every node or edge. Current high-level indexes include:
- node_label for Node.labels and GraphDB.nodes_by_label.
- node_property for explicitly registered node properties.
- edge_type for edge.properties["type"] and GraphDB.edges_by_type.
- edge_property for explicitly registered edge properties.
Property indexes are intentionally explicit. Register them only for predicates you expect to use frequently:
graph_db.create_node_property_index("name")
graph_db.create_edge_property_index("score")
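To show why a prefix scan avoids touching unrelated records, here is a minimal in-memory sketch of a sorted index. It borrows three method names from the interface listed above, but the class InMemorySortedIndex, the tuple-based entry layout, and the string-only key parts are assumptions for illustration, not how any PyGraphDB backend stores its indexes.

```python
import bisect
from typing import Iterator, List, Sequence, Tuple


class InMemorySortedIndex:
    """Toy sorted index: entries are tuples (index_name, *key_parts, value)."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, ...]] = []  # kept sorted at all times

    def put_index_entry(self, index_name: str, key_parts: Sequence[str], value: str) -> None:
        entry = (index_name, *key_parts, value)
        pos = bisect.bisect_left(self._entries, entry)
        # Insert only if the exact entry is not already present.
        if pos == len(self._entries) or self._entries[pos] != entry:
            self._entries.insert(pos, entry)

    def delete_index_entry(self, index_name: str, key_parts: Sequence[str], value: str) -> None:
        entry = (index_name, *key_parts, value)
        pos = bisect.bisect_left(self._entries, entry)
        if pos < len(self._entries) and self._entries[pos] == entry:
            del self._entries[pos]

    def iter_index_prefix(self, index_name: str, key_parts: Sequence[str]) -> Iterator[str]:
        prefix = (index_name, *key_parts)
        # Binary search to the first entry at or after the prefix,
        # then walk forward only while entries still share that prefix.
        pos = bisect.bisect_left(self._entries, prefix)
        while pos < len(self._entries) and self._entries[pos][: len(prefix)] == prefix:
            yield self._entries[pos][-1]
            pos += 1


index = InMemorySortedIndex()
index.put_index_entry("node_label", ["Person"], "n1")
index.put_index_entry("node_label", ["Person"], "n2")
index.put_index_entry("node_label", ["City"], "n3")
people = list(index.iter_index_prefix("node_label", ["Person"]))  # ["n1", "n2"]
```

Because entries are kept in sorted order, the scan starts at the first matching entry and stops at the first non-matching one, so the cost depends on the number of matches rather than on the size of the graph.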
Backend Selection Pattern
from pathlib import Path
from pygraphdb.graphdb import GraphDB
from pygraphdb.kvstores import LMDBStore, LevelDBStore, PyRexStore
from pygraphdb.serializers import PickleSerializer
def open_graph(path: str, backend: str = "lmdb") -> GraphDB:
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    if backend == "lmdb":
        store = LMDBStore(path=path, map_size=2**30)
    elif backend == "leveldb":
        store = LevelDBStore(path=path)
    elif backend == "rocksdb":
        store = PyRexStore(path=path)
    else:
        raise ValueError(f"unknown backend: {backend}")
    return GraphDB(store, PickleSerializer())
Cleanup
Always close stores when a script or notebook cell is finished with them.
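from pygraphdb.graphdb import GraphDB
from pygraphdb.kvstores import LMDBStore
from pygraphdb.serializers import PickleSerializer
# Node must also be imported from pygraphdb; its module is not shown in this section.
graph_db = GraphDB(LMDBStore(path="example_lmdb"), PickleSerializer())
try:
    graph_db.put_node(Node(node_id="n1"))
finally:
    # Runs even if put_node raises, so the store is always released.
    graph_db.close()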
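The same guarantee can be written with contextlib.closing from the standard library, which calls close() on any object when the with block exits. The sketch below uses a stand-in class, FakeGraph, so it runs without PyGraphDB installed. It assumes only that the real object exposes close(), as the example above does. Whether GraphDB supports the with statement directly is not stated here.

```python
from contextlib import closing


class FakeGraph:
    """Stand-in for GraphDB, used only to demonstrate the cleanup pattern."""

    def __init__(self) -> None:
        self.closed = False
        self.nodes = []

    def put_node(self, node) -> None:
        self.nodes.append(node)

    def close(self) -> None:
        self.closed = True


graph = FakeGraph()
# closing() calls graph.close() on exit, including when an exception is raised.
with closing(graph) as g:
    g.put_node("n1")
```

With the real classes, the object passed to closing() would be the GraphDB instance.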