Benchmarks
Reproducible performance comparisons at 1K, 100K, 1M, and 10M row scales.
Test environment
| Component | Details |
|---|---|
| CPU | Intel Core i7-13700KF (8 threads in Docker) |
| Memory | 15.6 GB |
| OS | Ubuntu 24.04.4 LTS (WSL2, kernel 6.6.87.2) |
| SIMD | AVX2 |
| Methodology | 20 iterations per operation, p50 reported (p95/p99 available in raw output) |

| Engine | Category | In-process |
|---|---|---|
| DoDB | Columnar HTAP (FASM x86-64 + AVX2) | Yes |
| DuckDB | Columnar analytical (embedded) | Yes |
| SQLite | Row-store (embedded) | Yes |
| PostgreSQL | Row-store (client/server) | No |
| Dragonfly | Redis-compatible KV store | No |
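The methodology row above reports p50 over 20 iterations. The source does not say how `dodb-bench` computes its percentiles, so the following is only a minimal Python sketch, assuming a nearest-rank percentile and a wall-clock timer. The names `bench` and `percentile` are hypothetical and not part of DoDB.

```python
import time

def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an ascending list; pct is an integer 1..100."""
    if not sorted_samples:
        raise ValueError("need at least one sample")
    # ceil(n * pct / 100) in integer arithmetic, clamped to at least rank 1.
    rank = max(1, -(-len(sorted_samples) * pct // 100))
    return sorted_samples[rank - 1]

def bench(op, iterations=20):
    """Time op() `iterations` times and return p50/p95/p99 in milliseconds."""
    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        op()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    samples_ms.sort()
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}

if __name__ == "__main__":
    data = list(range(1_000_000))
    # Stand-in workload for illustration only; this is not a DoDB call.
    print(bench(lambda: sum(data)))
```

With only 20 samples, a nearest-rank p99 is simply the slowest run, which is worth keeping in mind when reading the tail numbers in the raw output.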
Throughput overview
Chart: rows per second at 1M rows across all engines (log scale).
DoDB vs DuckDB
DuckDB is the closest competitor: both engines are columnar, in-memory, and SIMD-aware. All timings in this table are at 1M rows.
| Operation | DoDB | DuckDB | Winner | Speedup | DoDB rows/s |
|---|---|---|---|---|---|
| Bulk insert | 12.82ms | 56.85ms | DoDB | 4.4x | 78M/s |
| Full scan | 0.823ms | 1.61ms | DoDB | 2.0x | 1.22B/s |
| Filter COUNT (50%) | 1.12ms | 1.01ms | DuckDB | 1.1x | 893M/s |
| Filter COUNT (10%) | 0.146ms | 1.02ms | DoDB | 7.0x | 6.85B/s |
| SUM(value) | 0.132ms | 0.792ms | DoDB | 6.0x | 7.58B/s |
| COUNT(*) | 0.0004ms | 0.722ms | DoDB | 1,914x | 2,500B/s |
| MIN(value) | 0.125ms | 0.891ms | DoDB | 7.1x | 8.0B/s |
| MAX(value) | 0.127ms | 1.01ms | DoDB | 8.0x | 7.87B/s |
| SUM WHERE cat<50 | 1.62ms | 0.946ms | DuckDB | 1.7x | 617M/s |
| GROUP BY SUM | 1.58ms | 1.74ms | DoDB | 1.1x | 633M/s |
| WHERE AND | 0.849ms | 1.05ms | DoDB | 1.2x | 1.18B/s |
| WHERE OR | 0.484ms | 3.19ms | DoDB | 6.6x | 2.07B/s |
Note: At 1M rows DoDB wins 10 of the 12 operations above and stays within 1.7x of DuckDB on the other two (Filter COUNT at 50% selectivity and SUM WHERE). At 10M rows, DuckDB's vectorized execution catches up on filtered aggregates (SUM WHERE, GROUP BY, WHERE AND), where DoDB's scalar WHERE evaluation path becomes the bottleneck.
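The Speedup and rows/s columns in the DoDB vs DuckDB table follow directly from the two latency columns. As a worked check, here is a small Python sketch; the helper names are illustrative and the only inputs are values copied from the table.

```python
def rows_per_second(rows, latency_ms):
    """Throughput implied by processing `rows` rows in `latency_ms` milliseconds."""
    return rows / (latency_ms / 1000.0)

def speedup(slower_ms, faster_ms):
    """How many times faster the second latency is than the first."""
    return slower_ms / faster_ms

ROWS = 1_000_000

# Bulk insert row: DoDB 12.82 ms, DuckDB 56.85 ms.
print(f"{rows_per_second(ROWS, 12.82) / 1e6:.0f}M rows/s")  # 78M rows/s
print(f"{speedup(56.85, 12.82):.1f}x")                      # 4.4x

# COUNT(*) row, using the latencies exactly as displayed in the table.
print(f"{speedup(0.722, 0.0004):.0f}x")                     # 1805x
```

The recomputed COUNT(*) ratio (1805x) differs from the published one, presumably because the table shows latencies rounded to one significant digit while the speedup was computed from unrounded timings. Sub-microsecond rows are best read as order-of-magnitude figures.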
Multi-type performance
Scan and SUM latency for every integer type at 1M rows, DoDB vs DuckDB. Narrower types fit more values per SIMD register, which is consistent with the larger SUM speedups for the 8- and 16-bit types.
| Type | DoDB Scan | DuckDB Scan | Scan Speedup | DoDB SUM | DuckDB SUM | SUM Speedup |
|---|---|---|---|---|---|---|
| INT8 | 0.012ms | 1.98ms | 166x | 0.051ms | 1.05ms | 21x |
| INT16 | 0.013ms | 1.98ms | 156x | 0.056ms | 1.05ms | 19x |
| INT32 | 0.009ms | 1.96ms | 221x | 0.135ms | 0.88ms | 6.6x |
| INT64 | 0.009ms | 1.75ms | 188x | 0.281ms | 1.1ms | 3.9x |
| UINT8 | 0.013ms | 1.78ms | 141x | 0.051ms | 0.98ms | 19x |
| UINT16 | 0.012ms | 1.86ms | 155x | 0.051ms | 1.11ms | 22x |
| UINT32 | 0.012ms | 1.73ms | 142x | 0.151ms | 1.12ms | 7.4x |
| UINT64 | 0.012ms | 1.75ms | 141x | 0.332ms | 1.8ms | 5.4x |
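To make the "more values per SIMD register" point concrete, the lane count is the register width divided by the element width. The sketch below is plain arithmetic in Python, not DoDB code; the register widths are the architectural ones (256-bit AVX2, 128-bit NEON).

```python
REGISTER_BITS = {"AVX2": 256, "NEON": 128}
TYPE_BITS = {"INT8": 8, "INT16": 16, "INT32": 32, "INT64": 64}

def lanes(isa, type_name):
    """Number of elements of the given type that fit in one SIMD register."""
    return REGISTER_BITS[isa] // TYPE_BITS[type_name]

for type_name in TYPE_BITS:
    print(f"{type_name:>5}: AVX2 {lanes('AVX2', type_name):2d} lanes, "
          f"NEON {lanes('NEON', type_name):2d} lanes")
```

Lane count is only an upper bound on work per instruction. A SUM over 8-bit values generally has to widen into larger accumulators to avoid overflow, which may be why the measured INT8 and INT16 SUM times in the table are nearly identical rather than 2x apart.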
DoDB vs SQLite
Columnar SIMD engine vs the world's most deployed embedded database. DoDB is 206-276x faster on the analytical queries below and 25x faster on bulk insert.
| Operation | DoDB | SQLite | Speedup | DoDB rows/s |
|---|---|---|---|---|
| Bulk insert | 14.5ms | 361ms | 25x | 69M/s |
| SUM(value) | 0.15ms | 31.3ms | 206x | 6.6B/s |
| COUNT(*) | 0.001ms | 0.14ms | 257x | 1,799B/s |
| MIN(value) | 0.13ms | 34.4ms | 276x | 8.0B/s |
| MAX(value) | 0.12ms | 33.3ms | 268x | 8.1B/s |
| Filter COUNT (50%) | 0.13ms | 29.5ms | 221x | 7.5B/s |
| Filter COUNT (10%) | 0.14ms | 29.2ms | 217x | 7.4B/s |
| SUM WHERE cat=1 | 0.14ms | 28.5ms | 210x | 7.4B/s |
DoDB vs PostgreSQL
In-process columnar engine vs an industry-standard server database accessed over TCP. PostgreSQL uses UNLOGGED tables, which skip write-ahead logging, for the fairest comparison.
| Operation | DoDB | PostgreSQL | Speedup | DoDB rows/s |
|---|---|---|---|---|
| Bulk insert | 13.1ms | 628ms | 48x | 76M/s |
| SUM(value) | 0.14ms | 26.4ms | 186x | 7.0B/s |
| COUNT(*) | 0.001ms | 22ms | 37,122x | 1,689B/s |
| MIN(value) | 0.13ms | 27.4ms | 219x | 8.0B/s |
| MAX(value) | 0.15ms | 26.9ms | 183x | 6.8B/s |
| Filter COUNT (50%) | 0.15ms | 27.8ms | 190x | 6.8B/s |
| Filter COUNT (10%) | 0.17ms | 25.3ms | 191x | 6.1B/s |
| SUM WHERE cat=1 | 0.18ms | 24.6ms | 137x | 5.6B/s |
DoDB vs Dragonfly
Different paradigm: SQL columnar vs key-value over TCP.
| Operation | DoDB | Dragonfly | Speedup |
|---|---|---|---|
| Write (bulk) | 13.1ms | 1724ms | 131x |
| Point lookup (1000x) | 122ms | 182ms | 1.5x |
| Bulk read (100K) | 0.45ms | 720ms | 1,611x |
| Full scan (count) | 0.001ms | 1707ms | 2,812,243x |
Architecture comparison
How DoDB's design differs from existing engines.
| Feature | DoDB | DuckDB | SQLite | PostgreSQL |
|---|---|---|---|---|
| Storage layout | Columnar (SoA) | Columnar | Row (B-tree) | Row (heap) |
| SIMD | Hand-written AVX2 | Auto-vectorized | None | None |
| Query execution | Vectorized batch | Vectorized pipeline | Row-at-a-time | Row-at-a-time |
| Memory model | mmap + arena | Buffer manager | malloc + pages | Shared buffers |
| Parallelism | Single-threaded | Multi-threaded | Single-writer | Multi-process |
| COUNT(*) | O(1) stored | O(n) scan | O(n) B-tree count | O(n) seq scan |
| Index support | None (full scan) | ART + zonemap | B-tree | B-tree, hash, GIN |
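Two rows of the table above, columnar (SoA) storage and O(1) stored COUNT(*), can be illustrated with a toy model. This is a hedged Python sketch of the general idea only: `ColumnTable` and `count_by_scan` are hypothetical names, and nothing here reflects DoDB's actual assembly implementation or memory layout.

```python
from array import array

class ColumnTable:
    """Toy struct-of-arrays table: one contiguous array per column."""

    def __init__(self):
        self.category = array("i")  # 32-bit signed ints on common platforms
        self.value = array("q")     # 64-bit signed ints
        self.row_count = 0          # maintained on every append

    def append(self, category, value):
        self.category.append(category)
        self.value.append(value)
        self.row_count += 1

    def count(self):
        # O(1): returns the stored counter without touching any column data.
        return self.row_count

    def sum_value(self):
        # Reads only the `value` column; `category` is never touched.
        return sum(self.value)

def count_by_scan(rows):
    """O(n) alternative: count rows by visiting every one of them."""
    n = 0
    for _ in rows:
        n += 1
    return n

table = ColumnTable()
row_store = []  # list of (category, value) tuples, i.e. a row layout
for i in range(1000):
    table.append(i % 100, i)
    row_store.append((i % 100, i))

print(table.count(), count_by_scan(row_store))  # both report 1000 rows
print(table.sum_value())                        # 499500
```

A stored counter would explain why the COUNT(*) rows in the benchmark tables are orders of magnitude faster than every scan-based engine: the cost does not grow with table size.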
Apple Silicon (NEON)
Native AArch64 assembly with 128-bit NEON SIMD on Apple M1 Pro (10 cores).
| Component | Details |
|---|---|
| CPU | Apple M1 Pro (10 cores) |
| SIMD | NEON (128-bit), CRC32, LSE atomics |
| Runtime | Native (no Docker, no Rosetta) |
DoDB (NEON) vs DuckDB — 1M rows
| Operation | DoDB | DuckDB | Winner | Speedup | DoDB rows/s |
|---|---|---|---|---|---|
| Bulk insert | 9.17ms | 19.27ms | DoDB | 2.1x | 109M/s |
| Full scan | 0.97ms | 0.71ms | DuckDB | 1.4x | 1.04B/s |
| Filter COUNT (50%) | 0.41ms | 0.51ms | DoDB | 1.3x | 2.47B/s |
| SUM(value) | 0.21ms | 0.34ms | DoDB | 1.6x | 4.70B/s |
| COUNT(*) | 0.0003ms | 0.25ms | DoDB | 838x | 3,425B/s |
| MIN(value) | 0.21ms | 0.4ms | DoDB | 1.9x | 4.76B/s |
| MAX(value) | 0.21ms | 0.33ms | DoDB | 1.6x | 4.76B/s |
| GROUP BY SUM | 13.82ms | 1.29ms | DuckDB | 10.7x | 72.3M/s |
| WHERE AND | 0.0006ms | 0.52ms | DoDB | 833x | 1,600B/s |
| WHERE OR | 0.0006ms | 1.06ms | DoDB | 1,815x | 1,715B/s |
AVX2 vs NEON (DoDB vs DoDB)
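Cross-architecture comparison of the same hand-written assembly engine at 1M rows. The AVX2 column repeats the Docker/WSL2 results from the i7-13700KF and the NEON column repeats the native M1 Pro results, so the ratios reflect hardware and runtime differences as well as the SIMD instruction set.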
| Operation | x86-64 AVX2 | AArch64 NEON | Faster | Ratio |
|---|---|---|---|---|
| Bulk insert | 12.82ms | 9.17ms | NEON | 1.4x |
| SUM(value) | 0.132ms | 0.21ms | AVX2 | 1.6x |
| Filter COUNT (50%) | 1.12ms | 0.41ms | NEON | 2.7x |
| MIN(value) | 0.125ms | 0.21ms | AVX2 | 1.7x |
| COUNT(*) | 0.0004ms | 0.0003ms | NEON | 1.3x |
| WHERE AND | 0.849ms | 0.0006ms | NEON | 1,415x |
| GROUP BY SUM | 1.58ms | 13.82ms | AVX2 | 8.7x |
Reproducibility
All benchmarks are reproducible with a single command.
# Run all benchmarks (requires Docker):
$ make docker-bench
# Or manually:
$ docker compose build
$ docker compose up -d dragonfly postgres
$ sleep 3
$ docker compose run --rm dodb dodb-bench
$ docker compose stop dragonfly postgres