Storage Backends

PyGraphDB separates graph logic from storage. GraphDB receives a key-value store instance and a serializer instance.

LMDB Backend

Use LMDBStore for a mature embedded backend with named sub-databases.

from pygraphdb.graphdb import GraphDB
from pygraphdb.kvstores import LMDBStore
from pygraphdb.serializers import PickleSerializer

store = LMDBStore(path="graph_lmdb", map_size=2**30)
graph_db = GraphDB(store, PickleSerializer())

LMDB keeps separate databases for nodes, edges, adjacency, typed adjacency, and sorted indexes. Increase map_size when loading large graphs.

LevelDB Backend

Use LevelDBStore when you want LevelDB through plyvel.

from pygraphdb.graphdb import GraphDB
from pygraphdb.kvstores import LevelDBStore
from pygraphdb.serializers import PickleSerializer

store = LevelDBStore(path="graph_leveldb")
graph_db = GraphDB(store, PickleSerializer())

plyvel requires compatible CPython wheels or local LevelDB build tooling. If installation fails on Python 3.14 or a free-threaded interpreter, create a Python 3.12 environment and install pygraphdb[leveldb] there.

RocksDB Backend

Use PyRexStore for RocksDB through the optional pyrex-rocksdb package. This backend uses one physical RocksDB database with prefixed keys and exposes several RocksDB tuning knobs.

from pygraphdb.graphdb import GraphDB
from pygraphdb.kvstores import PyRexStore
from pygraphdb.serializers import PickleSerializer

store = PyRexStore(
    path="graph_rocksdb",
    parallelism=4,
    max_background_jobs=4,
    write_buffer_size=64 * 1024 * 1024,
    bloom_bits_per_key=10,
)
graph_db = GraphDB(store, PickleSerializer())

disable_wal=True can be useful for bulk-loading experiments, but it weakens durability and should not be used as a safe default.

When installed with pyrex-rocksdb>=0.3.0a0, PyRexStore can use PyRex’s native write_columnar_batch API through GraphDB.ingest_nodes_arrow and GraphDB.ingest_edges_arrow. The columnar methods currently require caller-provided serialized node_value and edge_value payloads and edge ingestion is append-only.

Sorted Indexes

All backends implement a small sorted index interface used by labels, relationship type catalogs, and explicit exact-match property indexes:

  • put_index_entry(index_name, key_parts, value)

  • put_index_entries_bulk(entries)

  • delete_index_entry(index_name, key_parts, value)

  • iter_index_prefix(index_name, key_parts)

These indexes are prefix-scanned by the backend rather than by deserializing all nodes or edges. Current high-level indexes include:

  • node_label for Node.labels and GraphDB.nodes_by_label.

  • node_property for explicitly registered node properties.

  • edge_type for edge.properties["type"] and GraphDB.edges_by_type.

  • edge_property for explicitly registered edge properties.

Property indexes are intentionally explicit. Register them only for predicates you expect to use frequently:

graph_db.create_node_property_index("name")
graph_db.create_edge_property_index("score")

Backend Selection Pattern

from pathlib import Path

from pygraphdb.graphdb import GraphDB
from pygraphdb.kvstores import LMDBStore, LevelDBStore, PyRexStore
from pygraphdb.serializers import PickleSerializer

def open_graph(path: str, backend: str = "lmdb") -> GraphDB:
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    if backend == "lmdb":
        store = LMDBStore(path=path, map_size=2**30)
    elif backend == "leveldb":
        store = LevelDBStore(path=path)
    elif backend == "rocksdb":
        store = PyRexStore(path=path)
    else:
        raise ValueError(f"unknown backend: {backend}")
    return GraphDB(store, PickleSerializer())

Cleanup

Always close stores when a script or notebook cell is finished with them.

graph_db = GraphDB(LMDBStore(path="example_lmdb"), PickleSerializer())
try:
    graph_db.put_node(Node(node_id="n1"))
finally:
    graph_db.close()