
# Benchmarks

Sairo ships with a benchmark suite you can run against your own deployment. All numbers on the landing page are derived from these reproducible tests.

| Parameter | Value |
| --- | --- |
| Runtime | Docker container, single Uvicorn process |
| Host | macOS, Apple Silicon (Docker Desktop) |
| Local S3 | MinIO (localhost:9000) |
| Production S3 | Remote S3-compatible object storage |
| Production dataset | 134,707 objects, 38.25 TB |
| Methodology | 30 iterations per measurement, percentile reporting |
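The p50/p95 figures throughout this page come from 30-iteration runs with percentile reporting. A minimal timing harness in that style looks like the sketch below; the `measure` helper is illustrative, not part of the benchmark suite:

```python
import time

def measure(fn, iterations: int = 30) -> dict:
    """Call fn() `iterations` times; report p50/p95 wall-clock latency in ms."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank percentiles over the sorted samples.
    p50 = samples[int(0.50 * (iterations - 1))]
    p95 = samples[int(0.95 * (iterations - 1))]
    return {"p50_ms": p50, "p95_ms": p95}

# Example: time a no-op; in practice fn would hit a Sairo endpoint.
stats = measure(lambda: None)
```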

FTS5 trigram search against the production bucket (134,707 objects, 38.25 TB):

| Query | Results | p50 | p95 |
| --- | --- | --- | --- |
| parquet (limit=100) | 100 | 3.1ms | 4.8ms |
| events | 200 | 2.4ms | 16.2ms |
| tracking | 200 | 2.3ms | 3.0ms |
| analytics | 200 | 2.5ms | 3.4ms |
| ingest | 200 | 2.4ms | 5.5ms |
| metadata | 200 | 2.4ms | 4.3ms |
| 2026 | 200 | 2.2ms | 22.0ms |

Fastest observed: 1.7ms. Typical queries return in 2-3ms p50 against 134K objects.
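These latencies come from SQLite FTS5 with the trigram tokenizer, which turns substring search into an index lookup. A minimal sketch of the pattern (requires SQLite 3.34+; the table and column names here are assumptions, not Sairo's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The trigram tokenizer enables fast substring matching on object keys.
conn.execute(
    "CREATE VIRTUAL TABLE objects_fts USING fts5(key, tokenize='trigram')"
)
conn.executemany(
    "INSERT INTO objects_fts (key) VALUES (?)",
    [
        ("data/events/2026/01/part-000.parquet",),
        ("logs/ingest/tracking-2025.csv",),
        ("reports/analytics/summary.json",),
    ],
)
# 'parquet' matches anywhere in the key, not just at token boundaries.
rows = conn.execute(
    "SELECT key FROM objects_fts WHERE objects_fts MATCH ? LIMIT 100",
    ("parquet",),
).fetchall()
```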

Crawl and indexing performance:

| Bucket | Objects | Duration | Throughput |
| --- | --- | --- | --- |
| bench-small | 1,000 | 1.08s | 926 obj/s |
| bench-mixed | 2,416 | 1.08s | 1,348 obj/s |
| production-bucket | 134,707 | completed | Production-scale crawl |

Indexing rate: 1,000–1,350 objects/second on local MinIO.

Upload latency and throughput by file size:

| File Size | p50 | Throughput |
| --- | --- | --- |
| 1 KB | 44.9ms | — |
| 1 MB | 51.9ms | 19.3 MB/s |
| 10 MB | 130.7ms | 76.5 MB/s |
| 50 MB | 436.2ms | 114.6 MB/s |
API endpoint latency:

| Endpoint | p50 | p95 |
| --- | --- | --- |
| /healthz | 2.1ms | 3.6ms |
| /api/buckets | 4.3ms | 5.8ms |
| Object listing (production) | 2.4ms | 4.6ms |
| Presigned URL generation | 3.1ms | 5.6ms |

Production S3, concurrent search queries:

| Concurrent Users | Requests/sec |
| --- | --- |
| 5 | 236 |
| 10 | 333 |
| 25 | 528 |
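A throughput run in this shape can be reproduced with a simple thread pool. The `load_test` helper below is a sketch; in practice `fn` would issue GET requests against the Sairo search endpoint rather than sleep:

```python
import concurrent.futures
import time

def load_test(fn, concurrency: int, total_requests: int) -> float:
    """Fire total_requests calls across `concurrency` workers; return req/s."""
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() drains the iterator so all requests complete before timing stops.
        list(pool.map(lambda _: fn(), range(total_requests)))
    return total_requests / (time.perf_counter() - start)

# Stand-in workload: a 1ms sleep instead of a real HTTP request.
rps = load_test(lambda: time.sleep(0.001), concurrency=25, total_requests=100)
```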

Benchmarked on production data running in Docker against Leaseweb StorageGRID:

| Bucket | Objects | Before (v2.0) | After (v3.0) | Speedup |
| --- | --- | --- | --- | --- |
| ssp-production-reports | 557K | 114ms | 0.048ms | 2,378x |
| ds-mletl-data | 139K | 27ms | 0.056ms | 486x |
| druid-lw-prod | 2M (was >1M, disabled) | 312ms (DISTINCT fallback) | 0.002ms | 191,231x |

Folder listing now reads pre-computed prefix_children rows aggregated entirely in SQL. It was previously disabled for buckets over 1M objects because the in-memory aggregation could OOM; it now works at any scale.
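A sketch of the pre-computed approach under an assumed schema (Sairo's actual prefix_children layout may differ): one row per (parent prefix, immediate child), populated at crawl time, so listing a folder never scans the object table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed shape: counts and sizes are rolled up per immediate child prefix
# during the crawl, so reads are a single indexed lookup.
conn.execute(
    """CREATE TABLE prefix_children (
           parent TEXT NOT NULL,
           child  TEXT NOT NULL,
           object_count INTEGER NOT NULL,
           total_size   INTEGER NOT NULL,
           PRIMARY KEY (parent, child)
       )"""
)
conn.executemany(
    "INSERT INTO prefix_children VALUES (?, ?, ?, ?)",
    [
        ("data/", "data/events/", 120_000, 9_500_000_000),
        ("data/", "data/tracking/", 14_000, 800_000_000),
    ],
)
# Constant-time folder listing: index lookup on the parent prefix.
children = conn.execute(
    "SELECT child, object_count, total_size FROM prefix_children "
    "WHERE parent = ? ORDER BY child",
    ("data/",),
).fetchall()
```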

| Query | Before | After | Dataset |
| --- | --- | --- | --- |
| COUNT(*) | 2.0ms | 1.5ms | 557K objects |
| COUNT(*) | 11.3ms | 7.3ms | 2M objects |
| SUM(size) | 40ms | 62ms | 557K objects |
| Folder stats rebuild | 63ms | 56ms | 139K objects |

PRAGMAs applied: cache_size=-64000 (64MB), mmap_size=268435456 (256MB), temp_store=MEMORY.
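These PRAGMAs are per-connection settings, so they are applied each time a connection is opened. A minimal sqlite3 sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Negative cache_size is in KiB: -64000 ≈ 64 MB of page cache.
conn.execute("PRAGMA cache_size = -64000")
# Memory-map up to 256 MB of the database file for read paths.
conn.execute("PRAGMA mmap_size = 268435456")
# Keep temp B-trees (sorts, GROUP BY spills) in RAM instead of temp files.
conn.execute("PRAGMA temp_store = MEMORY")

cache_kib = conn.execute("PRAGMA cache_size").fetchone()[0]
temp_store = conn.execute("PRAGMA temp_store").fetchone()[0]  # 2 == MEMORY
```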

Crawler tuning, before and after:

| Setting | Before | After |
| --- | --- | --- |
| Crawl workers | 6 | 12 |
| Prefix workers | 4 | 16 |
| Batch size | 2,000 | 10,000 |
| Update chunks | 500 | 2,000 |
| FTS rebuild | Blocks 15+ min | Background thread |
| Sub-prefix splitting | None | Auto for buckets with few prefixes |

Expected performance at scale:

| Operation | Expected Performance |
| --- | --- |
| Folder listing | < 1ms (constant time, index lookup) |
| Search | ~1ms (FTS5, always available) |
| Storage breakdown | ~500ms (full scan with PRAGMA tuning) |
| Crawl (50M objects) | ~25 min (with sub-prefix splitting) |
| FTS rebuild (50M objects) | ~83 min (background, non-blocking) |

The benchmark suite lives in the benchmark/ directory of the repository.

```sh
# Requires MinIO CLI (mc) configured with alias "local"
cd benchmark
./seed-data.sh          # Seeds bench-small (1K) + bench-mixed (2.4K)
./seed-data.sh medium   # Seeds bench-medium (10K objects)
./seed-data.sh large    # Seeds bench-large (50K objects)
```

The seeder creates four buckets with realistic file structures:

| Bucket | Objects | Pattern |
| --- | --- | --- |
| bench-small | 1,000 | 5 dirs × 10 months × 20 files |
| bench-mixed | ~2,400 | Parquet data lake, logs, configs, CSV reports |
| bench-medium | 10,000 | 10 dirs × 10 months × 10 sub-dirs × 10 files |
| bench-large | 50,000 | 10 dirs × 50 partitions × 100 records |
```sh
# Run all benchmarks
./run-benchmarks.sh

# Run specific benchmark categories
./run-benchmarks.sh search          # Search latency only
./run-benchmarks.sh crawl           # Crawl/indexing only
./run-benchmarks.sh crawl listing   # Crawl + listing
```
The benchmark runner expects:

- Sairo running on localhost:8000 (or set SAIRO_URL)
- MinIO running on localhost:9000
- Test buckets seeded via seed-data.sh
- Default admin credentials (admin/password) or set ADMIN_USER/ADMIN_PASS

Results are saved to benchmark/results/:

- JSON — machine-readable per-run results (`benchmark-YYYYMMDD-HHMMSS.json`)
- Markdown — human-readable summary (`LATEST-RESULTS.md`)

Every number on the landing page maps to a specific benchmark result:

| Landing Page Claim | Benchmark Evidence |
| --- | --- |
| "Single-digit millisecond search" | Production p50 = 2.2–3.1ms on 134K objects |
| "1,300+ obj/sec indexing" | 1,348 obj/s measured on bench-mixed |
| "Sub-5ms API responses" | healthz p50 = 2.1ms, most endpoints < 5ms |
| "500+ requests/second" | 528 req/s at 25 concurrent users (production) |
| "114 MB/s upload" | 50 MB file upload sustained throughput |