Benchmarks

Reproducible performance comparisons at 1K, 100K, 1M, and 10M row scales.

Test environment

CPU Intel Core i7-13700KF (8 threads in Docker)
Memory 15.6 GB
OS Ubuntu 24.04.4 LTS (WSL2, kernel 6.6.87.2)
SIMD AVX2
Methodology 20 iterations per operation, p50 reported (p95/p99 available in raw output)

Engines compared

Engine Category In-process
DoDB Columnar HTAP (FASM x86-64 + AVX2) Yes
DuckDB Columnar analytical (embedded) Yes
SQLite Row-store (embedded) Yes
PostgreSQL Row-store (client/server) No
Dragonfly Redis-compatible KV store No
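
The methodology above (20 iterations per operation, p50 reported, p95/p99 kept in the raw output) can be sketched roughly as follows. This is not the actual dodb-bench harness: the function names `percentile` and `bench` are hypothetical, and the nearest-rank percentile definition is an assumption, since the page does not say which definition the real tool uses.

```python
import time


def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an ascending list (pct is an int in 1..100).

    Assumption: the real harness may use a different percentile definition.
    """
    if not sorted_samples:
        raise ValueError("no samples")
    # Ceiling of n * pct / 100, done in integers to avoid float rounding.
    rank = max(1, -(-len(sorted_samples) * pct // 100))
    return sorted_samples[rank - 1]


def bench(operation, iterations=20):
    """Time `operation` repeatedly and return p50/p95/p99 in milliseconds."""
    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    samples_ms.sort()
    return {p: percentile(samples_ms, p) for p in (50, 95, 99)}


if __name__ == "__main__":
    # Stand-in workload; a real run would call into the engine under test.
    print(bench(lambda: sum(range(100_000))))
```

With only 20 samples, nearest-rank p95 is the 19th-smallest measurement and p99 is the slowest one, so the tail figures are sensitive to a single outlier.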

Throughput overview

Rows per second at 1M rows across all engines (log scale).

DoDB vs DuckDB

DuckDB is the closest competitor: both engines are columnar, in-memory, and SIMD-aware. Results below are at 1M rows.

Operation DoDB DuckDB Winner Speedup DoDB rows/s
Bulk insert 12.82ms 56.85ms DoDB 4.4x 78M/s
Full scan 0.823ms 1.61ms DoDB 2.0x 1.22B/s
Filter COUNT (50%) 1.12ms 1.01ms DuckDB 1.1x 893M/s
Filter COUNT (10%) 0.146ms 1.02ms DoDB 7.0x 6.85B/s
SUM(value) 0.132ms 0.792ms DoDB 6.0x 7.58B/s
COUNT(*) 0.0004ms 0.722ms DoDB 1,914.2x 2,500B/s
MIN(value) 0.125ms 0.891ms DoDB 7.1x 8.0B/s
MAX(value) 0.127ms 1.01ms DoDB 8.0x 7.87B/s
SUM WHERE cat<50 1.62ms 0.946ms DuckDB 1.7x 617M/s
GROUP BY SUM 1.58ms 1.74ms DoDB 1.1x 633M/s
WHERE AND 0.849ms 1.05ms DoDB 1.2x 1.18B/s
WHERE OR 0.484ms 3.19ms DoDB 6.6x 2.07B/s

Note: At 1M rows DoDB wins 10 of the 12 operations above; DuckDB is ahead on Filter COUNT (50%) and SUM WHERE cat<50. At 10M rows, DuckDB's vectorized execution catches up on filtered aggregates (SUM WHERE, GROUP BY, WHERE AND), where DoDB's scalar WHERE evaluation path becomes the bottleneck.
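
The Speedup and rows/s columns in the DoDB vs DuckDB table follow directly from the two p50 latencies and the 1M row count. A minimal sketch of that arithmetic, using hypothetical helper names (`rows_per_second`, `speedup`) that are not part of the benchmark tool:

```python
def rows_per_second(rows, latency_ms):
    """Throughput implied by processing `rows` rows in `latency_ms` milliseconds."""
    return rows / (latency_ms / 1000.0)


def speedup(dodb_ms, duckdb_ms):
    """Return (winner, factor), where factor >= 1 is slower time / faster time."""
    if dodb_ms <= duckdb_ms:
        return "DoDB", duckdb_ms / dodb_ms
    return "DuckDB", dodb_ms / duckdb_ms


if __name__ == "__main__":
    # Bulk insert row from the table: 12.82ms (DoDB) vs 56.85ms (DuckDB).
    winner, factor = speedup(12.82, 56.85)
    print(winner, f"{factor:.1f}x")                          # DoDB 4.4x
    print(f"{rows_per_second(1_000_000, 12.82) / 1e6:.0f}M/s")  # 78M/s
```

Recomputing from the rounded latencies shown on this page will not always match the published factors exactly (COUNT(*) is the clearest case), presumably because the published figures come from unrounded timings.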

Multi-type performance

Full-scan and SUM latency for every integer type at 1M rows. Narrower types fit more values per SIMD register.

Type DoDB Scan DuckDB Scan Scan Speedup DoDB SUM DuckDB SUM SUM Speedup
INT8 0.012ms 1.98ms 166x 0.051ms 1.05ms 21x
INT16 0.013ms 1.98ms 156x 0.056ms 1.05ms 19x
INT32 0.009ms 1.96ms 221x 0.135ms 0.88ms 6.6x
INT64 0.009ms 1.75ms 188x 0.281ms 1.1ms 3.9x
UINT8 0.013ms 1.78ms 141x 0.051ms 0.98ms 19x
UINT16 0.012ms 1.86ms 155x 0.051ms 1.11ms 22x
UINT32 0.012ms 1.73ms 142x 0.151ms 1.12ms 7.4x
UINT64 0.012ms 1.75ms 141x 0.332ms 1.8ms 5.4x
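
The claim that narrower types fit more values per SIMD register is simple arithmetic on register width. The sketch below assumes the standard 256-bit AVX2 and 128-bit NEON register sizes; the `lanes` helper is hypothetical and says nothing about how DoDB's kernels are actually written.

```python
AVX2_BITS = 256  # width of one AVX2 (ymm) register
NEON_BITS = 128  # width of one NEON (q) register


def lanes(register_bits, element_bits):
    """Number of elements of the given width that fit in one register."""
    return register_bits // element_bits


if __name__ == "__main__":
    for name, bits in [("INT8", 8), ("INT16", 16), ("INT32", 32), ("INT64", 64)]:
        print(f"{name}: AVX2 {lanes(AVX2_BITS, bits)} lanes, "
              f"NEON {lanes(NEON_BITS, bits)} lanes")
    # INT8: AVX2 32 lanes, NEON 16 lanes
    # INT16: AVX2 16 lanes, NEON 8 lanes
    # INT32: AVX2 8 lanes, NEON 4 lanes
    # INT64: AVX2 4 lanes, NEON 2 lanes
```

Lane count is an upper bound on the benefit rather than a prediction. A SUM over a narrow type generally has to accumulate into wider lanes to avoid overflow, which may be why INT8 and INT16 SUM times in the table are nearly identical rather than 2x apart.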

DoDB vs SQLite

Columnar SIMD engine vs the world's most deployed embedded database. DoDB is roughly 200-275x faster on the analytical queries below and 25x faster on bulk insert.

Operation DoDB SQLite Speedup DoDB rows/s
Bulk insert 14.5ms 361ms 25x 69M/s
SUM(value) 0.15ms 31.3ms 206x 6.6B/s
COUNT(*) 0.001ms 0.14ms 257x 1,799B/s
MIN(value) 0.13ms 34.4ms 276x 8.0B/s
MAX(value) 0.12ms 33.3ms 268x 8.1B/s
Filter COUNT (50%) 0.13ms 29.5ms 221x 7.5B/s
Filter COUNT (10%) 0.14ms 29.2ms 217x 7.4B/s
SUM WHERE cat=1 0.14ms 28.5ms 210x 7.4B/s

DoDB vs PostgreSQL

In-process columnar engine vs an industry-standard server database reached over TCP. PostgreSQL uses UNLOGGED tables for the fairest comparison.

Operation DoDB PostgreSQL Speedup DoDB rows/s
Bulk insert 13.1ms 628ms 48x 76M/s
SUM(value) 0.14ms 26.4ms 186x 7.0B/s
COUNT(*) 0.001ms 22ms 37,122x 1,689B/s
MIN(value) 0.13ms 27.4ms 219x 8.0B/s
MAX(value) 0.15ms 26.9ms 183x 6.8B/s
Filter COUNT (50%) 0.15ms 27.8ms 190x 6.8B/s
Filter COUNT (10%) 0.17ms 25.3ms 191x 6.1B/s
SUM WHERE cat=1 0.18ms 24.6ms 137x 5.6B/s

DoDB vs Dragonfly

Different paradigm: SQL columnar vs key-value over TCP.

Operation DoDB Dragonfly Speedup
Write (bulk) 13.1ms 1724ms 131x
Point lookup (1000x) 122ms 182ms 1.5x
Bulk read (100K) 0.45ms 720ms 1,611x
Full scan (count) 0.001ms 1707ms 2,812,243x

Architecture comparison

How DoDB's design differs from existing engines.

Feature DoDB DuckDB SQLite PostgreSQL
Storage layout Columnar (SoA) Columnar Row (B-tree) Row (heap)
SIMD Hand-written AVX2 Auto-vectorized None None
Query execution Vectorized batch Vectorized pipeline Row-at-a-time Row-at-a-time
Memory model mmap + arena Buffer manager malloc + pages Shared buffers
Parallelism Single-threaded Multi-threaded Single-writer Multi-process
COUNT(*) O(1) stored O(n) scan O(n) B-tree count O(n) seq scan
Index support None (full scan) ART + zonemap B-tree B-tree, hash, GIN
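
Two rows of the table above, the columnar (SoA) storage layout and the O(1) stored COUNT(*), can be illustrated with a toy model. This is a sketch under stated assumptions, not DoDB's implementation: `ColumnarTable` and its methods are hypothetical, and Python lists stand in for the engine's mmap-backed column buffers.

```python
class ColumnarTable:
    """Toy struct-of-arrays table: one list per column plus a stored row count."""

    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}
        self.row_count = 0  # maintained on insert, so COUNT(*) never scans

    def insert_many(self, rows):
        """Append rows given as tuples in column order."""
        for row in rows:
            for name, value in zip(self.columns, row):
                self.columns[name].append(value)
            self.row_count += 1

    def count(self):
        """COUNT(*): O(1), reads the stored counter."""
        return self.row_count

    def sum_column(self, name):
        """SUM(col): O(n), but touches only the one column it needs."""
        return sum(self.columns[name])


if __name__ == "__main__":
    table = ColumnarTable(["id", "value"])
    table.insert_many([(1, 10), (2, 20), (3, 30)])
    print(table.count())              # 3
    print(table.sum_column("value"))  # 60
```

A stored counter explains why COUNT(*) in the tables above takes well under a microsecond regardless of row count, while engines that count by scanning pay a cost proportional to table size.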

Apple Silicon (NEON)

Native AArch64 assembly with 128-bit NEON SIMD on Apple M1 Pro (10 cores).

CPU Apple M1 Pro (10 cores)
SIMD NEON (128-bit), CRC32, LSE atomics
Runtime Native (no Docker, no Rosetta)

DoDB (NEON) vs DuckDB — 1M rows

Operation DoDB DuckDB Winner Speedup DoDB rows/s
Bulk insert 9.17ms 19.27ms DoDB 2.1x 109M/s
Full scan 0.97ms 0.71ms DuckDB 1.4x 1.04B/s
Filter COUNT (50%) 0.41ms 0.51ms DoDB 1.3x 2.47B/s
SUM(value) 0.21ms 0.34ms DoDB 1.6x 4.70B/s
COUNT(*) 0.0003ms 0.25ms DoDB 838x 3,425B/s
MIN(value) 0.21ms 0.4ms DoDB 1.9x 4.76B/s
MAX(value) 0.21ms 0.33ms DoDB 1.6x 4.76B/s
GROUP BY SUM 13.82ms 1.29ms DuckDB 10.7x 72.3M/s
WHERE AND 0.0006ms 0.52ms DoDB 833x 1,600B/s
WHERE OR 0.0006ms 1.06ms DoDB 1,815x 1,715B/s

AVX2 vs NEON (DoDB vs DoDB)

Cross-architecture comparison of the same hand-written assembly engine.

Operation x86-64 AVX2 AArch64 NEON Faster Ratio
Bulk insert 12.82ms 9.17ms NEON 1.4x
SUM(value) 0.132ms 0.21ms AVX2 1.6x
Filter COUNT (50%) 1.12ms 0.41ms NEON 2.7x
MIN(value) 0.125ms 0.21ms AVX2 1.7x
COUNT(*) 0.0004ms 0.0003ms NEON 1.3x
WHERE AND 0.849ms 0.0006ms NEON 1,415x
GROUP BY SUM 1.58ms 13.82ms AVX2 8.7x

Reproducibility

All benchmarks are reproducible with a single command.

# Run all benchmarks (requires Docker):
$ make docker-bench

# Or manually:
$ docker compose build
$ docker compose up -d dragonfly postgres
$ sleep 3
$ docker compose run --rm dodb dodb-bench
$ docker compose stop dragonfly postgres