Storage Backends
================

PyGraphDB separates graph logic from storage. ``GraphDB`` receives a key-value
store instance and a serializer instance.

LMDB Backend
------------

Use ``LMDBStore`` for a mature embedded backend with named sub-databases.

.. code-block:: python

    from pygraphdb.graphdb import GraphDB
    from pygraphdb.kvstores import LMDBStore
    from pygraphdb.serializers import PickleSerializer

    store = LMDBStore(path="graph_lmdb", map_size=2**30)
    graph_db = GraphDB(store, PickleSerializer())

LMDB keeps separate databases for nodes, edges, adjacency, typed adjacency, and
sorted indexes. Increase ``map_size`` when loading large graphs.

LevelDB Backend
---------------

Use ``LevelDBStore`` when you want LevelDB through ``plyvel``.

.. code-block:: python

    from pygraphdb.graphdb import GraphDB
    from pygraphdb.kvstores import LevelDBStore
    from pygraphdb.serializers import PickleSerializer

    store = LevelDBStore(path="graph_leveldb")
    graph_db = GraphDB(store, PickleSerializer())

``plyvel`` requires compatible CPython wheels or local LevelDB build tooling. If
installation fails on Python 3.14 or a free-threaded interpreter, create a
Python 3.12 environment and install ``pygraphdb[leveldb]`` there.

RocksDB Backend
---------------

Use ``PyRexStore`` for RocksDB through the optional ``pyrex-rocksdb`` package.
This backend uses one physical RocksDB database with prefixed keys and exposes
several RocksDB tuning knobs.

.. code-block:: python

    from pygraphdb.graphdb import GraphDB
    from pygraphdb.kvstores import PyRexStore
    from pygraphdb.serializers import PickleSerializer

    store = PyRexStore(
        path="graph_rocksdb",
        parallelism=4,
        max_background_jobs=4,
        write_buffer_size=64 * 1024 * 1024,
        bloom_bits_per_key=10,
    )
    graph_db = GraphDB(store, PickleSerializer())

``disable_wal=True`` can be useful for bulk-loading experiments, but it weakens
durability and should not be used as a safe default.

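To illustrate what "one physical database with prefixed keys" means for the
RocksDB backend, here is a minimal, library-free sketch. It is not
``PyRexStore``'s implementation: the ``PrefixedKeyspace`` class and the ``n:``
and ``e:`` prefixes are hypothetical, and the real key layout is not described
on this page. The sketch only shows how a single ordered keyspace can hold
several logical tables and still support per-table scans.

.. code-block:: python

    from bisect import bisect_left

    # Hypothetical prefixes chosen for illustration only.
    NODE_PREFIX = b"n:"
    EDGE_PREFIX = b"e:"


    class PrefixedKeyspace:
        """Toy ordered key-value space standing in for one physical database."""

        def __init__(self):
            self._keys = []    # full keys, kept in sorted order
            self._values = {}  # full key -> value

        def put(self, prefix: bytes, key: bytes, value: bytes) -> None:
            full = prefix + key
            if full not in self._values:
                # Insert at the position that keeps the key list sorted.
                self._keys.insert(bisect_left(self._keys, full), full)
            self._values[full] = value

        def scan(self, prefix: bytes):
            """Yield (key, value) pairs under ``prefix`` in key order."""
            i = bisect_left(self._keys, prefix)
            while i < len(self._keys) and self._keys[i].startswith(prefix):
                full = self._keys[i]
                yield full[len(prefix):], self._values[full]
                i += 1


    space = PrefixedKeyspace()
    space.put(NODE_PREFIX, b"n1", b"node-1")
    space.put(EDGE_PREFIX, b"n1->n2", b"edge-1")
    space.put(NODE_PREFIX, b"n2", b"node-2")

    node_entries = list(space.scan(NODE_PREFIX))
    edge_entries = list(space.scan(EDGE_PREFIX))

Because keys sharing a prefix are contiguous in sorted order, a scan can seek to
the first matching key and stop at the first non-matching one, so it never
touches entries that belong to another logical table.
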
When installed with ``pyrex-rocksdb>=0.3.0a0``, ``PyRexStore`` can use PyRex's
native ``write_columnar_batch`` API through ``GraphDB.ingest_nodes_arrow`` and
``GraphDB.ingest_edges_arrow``. The columnar methods currently require
caller-provided serialized ``node_value`` and ``edge_value`` payloads and edge
ingestion is append-only.

Sorted Indexes
--------------

All backends implement a small sorted index interface used by labels,
relationship type catalogs, and explicit exact-match property indexes:

- ``put_index_entry(index_name, key_parts, value)``
- ``put_index_entries_bulk(entries)``
- ``delete_index_entry(index_name, key_parts, value)``
- ``iter_index_prefix(index_name, key_parts)``

These indexes are prefix-scanned by the backend rather than by deserializing all
nodes or edges. Current high-level indexes include:

- ``node_label`` for ``Node.labels`` and ``GraphDB.nodes_by_label``.
- ``node_property`` for explicitly registered node properties.
- ``edge_type`` for ``edge.properties["type"]`` and ``GraphDB.edges_by_type``.
- ``edge_property`` for explicitly registered edge properties.

Property indexes are intentionally explicit. Register them only for predicates
you expect to use frequently:

.. code-block:: python

    graph_db.create_node_property_index("name")
    graph_db.create_edge_property_index("score")

Backend Selection Pattern
-------------------------

.. code-block:: python

    from pathlib import Path

    from pygraphdb.graphdb import GraphDB
    from pygraphdb.kvstores import LMDBStore, LevelDBStore, PyRexStore
    from pygraphdb.serializers import PickleSerializer


    def open_graph(path: str, backend: str = "lmdb") -> GraphDB:
        Path(path).parent.mkdir(parents=True, exist_ok=True)
        if backend == "lmdb":
            store = LMDBStore(path=path, map_size=2**30)
        elif backend == "leveldb":
            store = LevelDBStore(path=path)
        elif backend == "rocksdb":
            store = PyRexStore(path=path)
        else:
            raise ValueError(f"unknown backend: {backend}")
        return GraphDB(store, PickleSerializer())

Cleanup
-------

Always close stores when a script or notebook cell is finished with them.

.. code-block:: python

    graph_db = GraphDB(LMDBStore(path="example_lmdb"), PickleSerializer())
    try:
        graph_db.put_node(Node(node_id="n1"))
    finally:
        graph_db.close()
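
The same close-on-exit guarantee can be written with the standard library's
``contextlib.closing``, which calls ``close()`` on the wrapped object when the
``with`` block ends, including when an exception is raised. The sketch below
uses a hypothetical ``FakeGraph`` stand-in so that it runs without any backend
installed. It assumes only that the real object exposes ``close()``, as
``GraphDB`` does in the Cleanup example; whether ``GraphDB`` supports the
``with`` statement directly is not stated here.

.. code-block:: python

    from contextlib import closing


    class FakeGraph:
        """Hypothetical stand-in for a GraphDB-like object with close()."""

        def __init__(self):
            self.closed = False
            self.nodes = []

        def put_node(self, node_id):
            self.nodes.append(node_id)

        def close(self):
            self.closed = True


    graph = FakeGraph()
    # closing() guarantees graph.close() runs when the block exits.
    with closing(graph) as db:
        db.put_node("n1")

In real code, a ``GraphDB`` instance would take the place of ``FakeGraph()`` in
the ``closing(...)`` call.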