ArcadeDB vs PyGraphDB Benchmarks
This page records a local comparison of embedded ArcadeDB against pygraphdb
using the RocksDB/PyRex backend. The benchmark script is
scripts/benchmark_arcadedb_vs_pygraphdb.py.
Benchmark Setup
Command:
uv run --with arcadedb-embedded python scripts/benchmark_arcadedb_vs_pygraphdb.py \
--engines pygraphdb arcadedb \
--workloads columnar_ingest star_traversal bfs_depth typed_path rocksdb_compaction \
--nodes 1000 \
--edges 3000 \
--batch-size 1000 \
--iterations 5 \
--repetitions 10 \
--compaction-keys 1000 \
--compaction-passes 2 \
--arcadedb-heap-size 1g \
--output-dir benchmark_results/arcadedb_embedded_10x_20260625
Outputs:
benchmark_results/arcadedb_embedded_10x_20260625/arcadedb_vs_pygraphdb_results.csv: Per-run raw rows.
benchmark_results/arcadedb_embedded_10x_20260625/arcadedb_vs_pygraphdb_summary.csv: Mean and sample standard deviation by engine and workload.
The run used Python 3.11.14 on Linux
6.17.0-35-generic-x86_64. ArcadeDB used the arcadedb-embedded package
with a 1g JVM heap. The timings below are mean +/- sample standard deviation
over 10 repetitions. They include first-run Python/JVM warm-up costs, which is
why the standard deviation is high for some ingest-heavy workloads.
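As a minimal sketch of how a "mean +/- sample standard deviation" figure can be derived from per-repetition timings, the following uses Python's standard statistics module. The timing values and the helper names summarize and format_summary are illustrative assumptions; they are not taken from the benchmark script.

```python
import statistics


def summarize(timings_s):
    """Return (mean, sample standard deviation) for per-repetition timings."""
    mean = statistics.mean(timings_s)
    # statistics.stdev uses the n-1 (sample) denominator and needs >= 2 values.
    stdev = statistics.stdev(timings_s) if len(timings_s) > 1 else 0.0
    return mean, stdev


def format_summary(timings_s):
    """Format timings in the same 'mean +/- stdev s' style used on this page."""
    mean, stdev = summarize(timings_s)
    return f"{mean:.4f} +/- {stdev:.4f} s"


# Hypothetical timings in seconds for one engine/workload pair.
example = [0.030, 0.028, 0.031, 0.029, 0.032]
print(format_summary(example))
```

The sample (n-1) form is slightly larger than the population form (statistics.pstdev), and the gap is most noticeable at small repetition counts such as the 10 used here.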
Overall Results
Lower total time is better.
| Workload | PyGraphDB/RocksDB | ArcadeDB embedded | Relative result |
|---|---|---|---|
| columnar_ingest | 0.0358 +/- 0.0507 s | 0.0506 +/- 0.0620 s | PyGraphDB 1.41x faster |
| star_traversal | 0.0383 +/- 0.0014 s | 0.0333 +/- 0.0154 s | ArcadeDB 1.15x faster |
| bfs_depth | 0.0303 +/- 0.0022 s | 0.0366 +/- 0.0141 s | PyGraphDB 1.21x faster |
| typed_path | 0.0293 +/- 0.0023 s | 0.0404 +/- 0.0052 s | PyGraphDB 1.38x faster |
| rocksdb_compaction | 0.0022 +/- 0.0002 s | Not applicable | PyGraphDB only |
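The "Relative result" column can be read as the ratio of the slower mean to the faster mean. The sketch below reproduces that reading; the function name relative_result is a hypothetical helper, not part of the benchmark script. The published ratios were presumably computed from unrounded means, so recomputing from the four-decimal values shown on this page can differ in the last digits, especially for very small timings.

```python
def relative_result(name_a, mean_a, name_b, mean_b):
    """Describe which engine had the lower mean time and by what factor.

    mean_b may be None when the workload does not apply to the second engine.
    """
    if mean_b is None:
        return f"{name_a} only"
    if mean_a <= mean_b:
        return f"{name_a} {mean_b / mean_a:.2f}x faster"
    return f"{name_b} {mean_a / mean_b:.2f}x faster"


# Rounded total-time means (seconds) copied from the table above.
print(relative_result("PyGraphDB", 0.0358, "ArcadeDB", 0.0506))
print(relative_result("PyGraphDB", 0.0383, "ArcadeDB", 0.0333))
print(relative_result("PyGraphDB", 0.0022, "ArcadeDB", None))
```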
Ingestion Results
These timings include graph creation and loading. For ArcadeDB this uses embedded
GraphBatch. For pygraphdb, columnar_ingest uses Arrow/RocksDB columnar
ingestion; the traversal workloads use object ingestion so the graph can be
queried immediately afterward.
| Workload | PyGraphDB/RocksDB | ArcadeDB embedded | Relative result |
|---|---|---|---|
| columnar_ingest | 0.0358 +/- 0.0507 s | 0.0506 +/- 0.0620 s | PyGraphDB 1.41x faster |
| star_traversal | 0.0286 +/- 0.0012 s | 0.0273 +/- 0.0046 s | ArcadeDB 1.05x faster |
| bfs_depth | 0.0297 +/- 0.0022 s | 0.0322 +/- 0.0083 s | PyGraphDB 1.08x faster |
| typed_path | 0.0292 +/- 0.0023 s | 0.0359 +/- 0.0053 s | PyGraphDB 1.23x faster |
Query Results
These timings exclude ingestion and measure only the repeated query/traversal
portion. columnar_ingest has no query phase.
| Workload | PyGraphDB/RocksDB | ArcadeDB embedded | Relative result |
|---|---|---|---|
| star_traversal | 0.0097 +/- 0.0003 s | 0.0060 +/- 0.0130 s | ArcadeDB 1.61x faster |
| bfs_depth | 0.0006 +/- 0.0000 s | 0.0044 +/- 0.0061 s | PyGraphDB 7.71x faster |
| typed_path | 0.0001 +/- 0.0000 s | 0.0045 +/- 0.0029 s | PyGraphDB 39.08x faster |
| rocksdb_compaction | 0.0022 +/- 0.0002 s | Not applicable | PyGraphDB only |
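For the traversal workloads, the reported total means appear to equal the ingestion mean plus the query mean. The sketch below checks that reading against the rounded values on this page and prints the fraction of total time spent in the query phase. The dictionary layout and the helper name query_share are illustrative assumptions.

```python
# Rounded means in seconds copied from this page: (ingestion, query, total).
reported = {
    ("star_traversal", "pygraphdb"): (0.0286, 0.0097, 0.0383),
    ("star_traversal", "arcadedb"): (0.0273, 0.0060, 0.0333),
    ("bfs_depth", "pygraphdb"): (0.0297, 0.0006, 0.0303),
    ("bfs_depth", "arcadedb"): (0.0322, 0.0044, 0.0366),
    ("typed_path", "pygraphdb"): (0.0292, 0.0001, 0.0293),
    ("typed_path", "arcadedb"): (0.0359, 0.0045, 0.0404),
}


def query_share(ingest_s, query_s):
    """Fraction of total time spent in the query phase."""
    return query_s / (ingest_s + query_s)


for (workload, engine), (ingest_s, query_s, total_s) in reported.items():
    # Allow roughly one unit in the last printed digit for rounding.
    assert abs((ingest_s + query_s) - total_s) <= 1.5e-4
    share = query_share(ingest_s, query_s)
    print(f"{workload:15s} {engine:10s} query share {share:.1%}")
```

Because ingestion dominates the total at this graph size, a large query-phase ratio translates into a much smaller total-time ratio.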
Interpretation
columnar_ingest: PyGraphDB/RocksDB was 1.41x faster on total time. This workload exercises the serialized Arrow ingestion path and RocksDB’s native write_columnar_batch support when available. ArcadeDB used embedded GraphBatch and still performed in the same order of magnitude for this small graph.
star_traversal: ArcadeDB was 1.15x faster overall and 1.61x faster in the query phase. This is the workload that most directly benefits from ArcadeDB’s native vertex-local adjacency representation. The total-time advantage is smaller than the query advantage because both systems still pay graph-loading costs.
bfs_depth: PyGraphDB/RocksDB was 1.21x faster overall and 7.71x faster in the query phase. In this synthetic shape, pygraphdb’s typed-adjacency prefix scans were faster than ArcadeDB’s SQL MATCH query execution for the bounded traversal.
typed_path: PyGraphDB/RocksDB was 1.38x faster overall and 39.08x faster in the query phase. The result favors pygraphdb’s direct typed-adjacency iteration for this tiny two-hop pattern. It should not be generalized to complex graph patterns where ArcadeDB’s query optimizer has more room to help.
rocksdb_compaction: This workload is intentionally pygraphdb/RocksDB-only. It directly writes a repeated permuted overwrite pattern into RocksDB to exercise LSM compaction behavior, so there is no equivalent ArcadeDB property-graph result.
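To make the "repeated permuted overwrite pattern" concrete, here is a rough sketch of one way such a write sequence could be generated. It is not the benchmark script's implementation: the key format, the seed, the generator name permuted_overwrite_ops, and the plain dict standing in for the key-value store are all assumptions made for illustration. The key and pass counts mirror the --compaction-keys and --compaction-passes values from the command above.

```python
import random


def permuted_overwrite_ops(num_keys, passes, seed=0):
    """Yield (key, value) writes: every pass rewrites all keys in a new random order."""
    rng = random.Random(seed)
    keys = [f"key:{i:08d}".encode() for i in range(num_keys)]
    for pass_index in range(passes):
        order = keys[:]      # copy so the base key list stays sorted
        rng.shuffle(order)   # a different permutation for each pass
        for key in order:
            yield key, f"pass{pass_index}".encode()


# Hypothetical stand-in for the store; a real run would write to RocksDB instead.
store = {}
writes = 0
for key, value in permuted_overwrite_ops(num_keys=1000, passes=2):
    store[key] = value
    writes += 1

print(writes, len(store))
```

Each key is written once per pass, so an LSM store accumulates stale versions of the same keys across levels, which is the situation compaction has to clean up.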
Important Caveats
These are small local smoke benchmarks, not a full database benchmark campaign. They are useful for catching regressions and showing workload-specific behavior, but larger graphs, more repetitions, warm-up exclusion, pinned CPU frequency, and isolated disks are needed before drawing broad conclusions.
The first repetition includes one-time costs such as JVM startup for ArcadeDB and Python module initialization for pygraphdb’s optional ingestion stack. The script reports standard deviation so this warm-up effect is visible rather than hidden.
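The effect of a slow first repetition on the reported spread can be illustrated by summarizing the same timings with and without the warm-up run. This is a sketch under stated assumptions: the timings are hypothetical, and with_and_without_warmup is an illustrative helper, not an option the benchmark script is known to provide.

```python
import statistics


def with_and_without_warmup(timings_s, warmup_runs=1):
    """Return (mean, sample stdev) for all runs and for runs after the warm-up."""
    steady = timings_s[warmup_runs:]
    if len(steady) < 2:
        raise ValueError("need at least two post-warm-up repetitions")
    return {
        "all": (statistics.mean(timings_s), statistics.stdev(timings_s)),
        "steady": (statistics.mean(steady), statistics.stdev(steady)),
    }


# Hypothetical timings: one slow first repetition, then nine steady ones.
timings = [0.180] + [0.020] * 9
for label, (mean, stdev) in with_and_without_warmup(timings).items():
    print(f"{label:6s} {mean:.4f} +/- {stdev:.4f} s")
```

In this made-up example a single outlier pushes the standard deviation above the mean, which is the same shape seen in some of the ingest-heavy rows on this page.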