ArcadeDB vs PyGraphDB Benchmarks
================================

This page records a local embedded ArcadeDB comparison against pygraphdb using
the RocksDB/PyRex backend. The benchmark script is
``scripts/benchmark_arcadedb_vs_pygraphdb.py``.

Benchmark Setup
---------------

Command:

.. code-block:: sh

   uv run --with arcadedb-embedded python scripts/benchmark_arcadedb_vs_pygraphdb.py \
     --engines pygraphdb arcadedb \
     --workloads columnar_ingest star_traversal bfs_depth typed_path rocksdb_compaction \
     --nodes 1000 \
     --edges 3000 \
     --batch-size 1000 \
     --iterations 5 \
     --repetitions 10 \
     --compaction-keys 1000 \
     --compaction-passes 2 \
     --arcadedb-heap-size 1g \
     --output-dir benchmark_results/arcadedb_embedded_10x_20260625

Outputs:

``benchmark_results/arcadedb_embedded_10x_20260625/arcadedb_vs_pygraphdb_results.csv``
   Per-run raw rows.

``benchmark_results/arcadedb_embedded_10x_20260625/arcadedb_vs_pygraphdb_summary.csv``
   Mean and sample standard deviation by engine and workload.

The run used Python ``3.11.14`` on Linux ``6.17.0-35-generic-x86_64``.
ArcadeDB used the ``arcadedb-embedded`` package with a ``1g`` JVM heap.

The timings below are mean +/- sample standard deviation over 10 repetitions.
They include first-run Python/JVM warm-up costs, which is why the standard
deviation is high for some ingest-heavy workloads.

Overall Results
---------------

Lower total time is better.

.. list-table::
   :header-rows: 1

   * - Workload
     - PyGraphDB/RocksDB
     - ArcadeDB embedded
     - Relative result
   * - ``columnar_ingest``
     - 0.0358 +/- 0.0507 s
     - 0.0506 +/- 0.0620 s
     - PyGraphDB 1.41x faster
   * - ``star_traversal``
     - 0.0383 +/- 0.0014 s
     - 0.0333 +/- 0.0154 s
     - ArcadeDB 1.15x faster
   * - ``bfs_depth``
     - 0.0303 +/- 0.0022 s
     - 0.0366 +/- 0.0141 s
     - PyGraphDB 1.21x faster
   * - ``typed_path``
     - 0.0293 +/- 0.0023 s
     - 0.0404 +/- 0.0052 s
     - PyGraphDB 1.38x faster
   * - ``rocksdb_compaction``
     - 0.0022 +/- 0.0002 s
     - Not applicable
     - PyGraphDB only

Ingestion Results
-----------------

These timings include graph creation and loading. For ArcadeDB this uses
embedded ``GraphBatch``. For pygraphdb, ``columnar_ingest`` uses Arrow/RocksDB
columnar ingestion; the traversal workloads use object ingestion so the graph
can be queried immediately afterward.

.. list-table::
   :header-rows: 1

   * - Workload
     - PyGraphDB/RocksDB
     - ArcadeDB embedded
     - Relative result
   * - ``columnar_ingest``
     - 0.0358 +/- 0.0507 s
     - 0.0506 +/- 0.0620 s
     - PyGraphDB 1.41x faster
   * - ``star_traversal``
     - 0.0286 +/- 0.0012 s
     - 0.0273 +/- 0.0046 s
     - ArcadeDB 1.05x faster
   * - ``bfs_depth``
     - 0.0297 +/- 0.0022 s
     - 0.0322 +/- 0.0083 s
     - PyGraphDB 1.08x faster
   * - ``typed_path``
     - 0.0292 +/- 0.0023 s
     - 0.0359 +/- 0.0053 s
     - PyGraphDB 1.23x faster

Query Results
-------------

These timings exclude ingestion and measure only the repeated query/traversal
portion. ``columnar_ingest`` has no query phase.

.. list-table::
   :header-rows: 1

   * - Workload
     - PyGraphDB/RocksDB
     - ArcadeDB embedded
     - Relative result
   * - ``star_traversal``
     - 0.0097 +/- 0.0003 s
     - 0.0060 +/- 0.0130 s
     - ArcadeDB 1.61x faster
   * - ``bfs_depth``
     - 0.0006 +/- 0.0000 s
     - 0.0044 +/- 0.0061 s
     - PyGraphDB 7.71x faster
   * - ``typed_path``
     - 0.0001 +/- 0.0000 s
     - 0.0045 +/- 0.0029 s
     - PyGraphDB 39.08x faster
   * - ``rocksdb_compaction``
     - 0.0022 +/- 0.0002 s
     - Not applicable
     - PyGraphDB only

Interpretation
--------------

``columnar_ingest``
   PyGraphDB/RocksDB was 1.41x faster on total time. This workload exercises
   the serialized Arrow ingestion path and RocksDB's native
   ``write_columnar_batch`` support when available. ArcadeDB used embedded
   ``GraphBatch`` and still performed in the same order of magnitude for this
   small graph.

``star_traversal``
   ArcadeDB was 1.15x faster overall and 1.61x faster in the query phase. This
   is the workload that most directly benefits from ArcadeDB's native
   vertex-local adjacency representation. The total-time advantage is smaller
   than the query advantage because both systems still pay graph-loading costs.

``bfs_depth``
   PyGraphDB/RocksDB was 1.21x faster overall and 7.71x faster in the query
   phase. In this synthetic shape, pygraphdb's typed-adjacency prefix scans
   were faster than ArcadeDB's SQL ``MATCH`` query execution for the bounded
   traversal.

``typed_path``
   PyGraphDB/RocksDB was 1.38x faster overall and 39.08x faster in the query
   phase. The result favors pygraphdb's direct typed-adjacency iteration for
   this tiny two-hop pattern. It should not be generalized to complex graph
   patterns where ArcadeDB's query optimizer has more room to help.

``rocksdb_compaction``
   This workload is intentionally pygraphdb/RocksDB-only. It directly writes a
   repeated permuted overwrite pattern into RocksDB to exercise LSM compaction
   behavior, so there is no equivalent ArcadeDB property-graph result.

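
The "Relative result" figures quoted above are ratios of mean times, with the
lower mean counted as faster. The following is a minimal sketch of that
arithmetic, not the benchmark script's actual code: the helper names
``summarize`` and ``relative_result`` are hypothetical and exist only for
illustration.

```python
from statistics import mean, stdev


def summarize(samples):
    """Return (mean, sample standard deviation) for timings in seconds."""
    # statistics.stdev uses the n - 1 denominator, i.e. the sample
    # standard deviation that the summary CSV is described as reporting.
    return mean(samples), stdev(samples)


def relative_result(name_a, mean_a, name_b, mean_b):
    """Describe which engine is faster and by what factor (lower is better)."""
    if mean_a <= mean_b:
        return f"{name_a} {mean_b / mean_a:.2f}x faster"
    return f"{name_b} {mean_a / mean_b:.2f}x faster"


# Total-time means for columnar_ingest, taken from the Overall Results table.
print(relative_result("PyGraphDB", 0.0358, "ArcadeDB", 0.0506))
# prints "PyGraphDB 1.41x faster"
```

Recomputing from the rounded table values does not always reproduce the printed
ratio when a mean is very small. For example, ``0.0044 / 0.0006`` gives roughly
7.3 rather than 7.71, presumably because the published ratios were derived from
unrounded means.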

Important Caveats
-----------------

These are small local smoke benchmarks, not a full database benchmark campaign.
They are useful for catching regressions and showing workload-specific
behavior, but larger graphs, more repetitions, warm-up exclusion, pinned CPU
frequency, and isolated disks are needed before drawing broad conclusions.

The first repetition includes one-time costs such as JVM startup for ArcadeDB
and Python module initialization for pygraphdb's optional ingestion stack. The
script reports standard deviation so this warm-up effect is visible rather than
hidden.
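
One way to apply the warm-up exclusion mentioned above is to drop the first
repetition before summarizing the per-run rows. The sketch below is an
assumption about how that could be done, not something the benchmark script is
known to support: ``summarize_without_warmup`` is a hypothetical helper, and
the timings are made-up values chosen to show the effect, not measured results.

```python
from statistics import mean, stdev


def summarize_without_warmup(samples, warmup=1):
    """Return (mean, sample stdev) after discarding the first `warmup` runs."""
    steady = samples[warmup:]
    if len(steady) < 2:
        # The sample standard deviation needs at least two data points.
        raise ValueError("need at least two repetitions after warm-up")
    return mean(steady), stdev(steady)


# Hypothetical per-repetition timings in seconds: the first run carries
# one-time start-up cost, the remaining nine are steady-state.
timings = [0.180, 0.020, 0.021, 0.019, 0.020, 0.022, 0.020, 0.019, 0.021, 0.020]

raw_mean, raw_std = mean(timings), stdev(timings)
steady_mean, steady_std = summarize_without_warmup(timings)

print(f"all runs:         {raw_mean:.4f} +/- {raw_std:.4f} s")
print(f"warm-up excluded: {steady_mean:.4f} +/- {steady_std:.4f} s")
```

Reporting both figures side by side keeps the warm-up cost visible while making
the steady-state comparison less sensitive to a single slow first run.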