This page explains the core ideas behind Sairo so you can operate it effectively and troubleshoot when needed.
Sairo is a single container that sits between your browser and any S3-compatible storage backend. There are no external databases, no message queues, and no additional services to manage.
Frontend
A React 18 single-page application served as static files by the same container. All UI assets are bundled at build time.
Backend
A FastAPI application that handles authentication, serves the API, manages the SQLite index, and proxies S3 operations.
Storage
Any S3-compatible endpoint (AWS S3, MinIO, Ceph, R2, B2, etc.) accessed via boto3 with Signature Version 4 authentication.
Index
Per-bucket SQLite databases with FTS5 full-text search. WAL mode enables concurrent reads during indexing.
Sairo maintains a lightweight local index of every object in your S3 buckets so that browsing and searching are instant, even across millions of objects.
Each bucket gets its own SQLite database file in the /data directory inside the container (configurable via DB_DIR).
This isolation means a large or slow bucket does not affect the responsiveness of others.
All databases use SQLite’s Write-Ahead Logging (WAL) mode. WAL allows concurrent reads while the crawler writes new entries, so the UI stays responsive during indexing. This is critical: without WAL, browsing would block during recrawl cycles.
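The per-bucket-file and WAL behavior can be sketched with Python's built-in sqlite3 module (the directory and filename below are illustrative, not Sairo's actual naming scheme):

```python
import os
import sqlite3
import tempfile

# One SQLite file per bucket. The directory and filename here are
# illustrative; Sairo keeps its databases under DB_DIR (/data by default).
db_dir = tempfile.mkdtemp()
con = sqlite3.connect(os.path.join(db_dir, "my-bucket.db"))

# WAL is a property of the database file, enabled once via PRAGMA.
# The pragma returns the resulting mode: "wal" on success.
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
```

Because WAL is recorded in the database file itself, every later connection to the same file inherits it without re-running the pragma.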
Each database includes an FTS5 virtual table that indexes object keys. This is what powers the / search feature in the UI:
| What’s indexed | Example match |
|---|---|
| Object key segments | invoices/2024/march/report.pdf matches march report |
| File extensions | .parquet matches all Parquet files |
| Prefix paths | backups/daily/ matches all daily backups |
Results return in milliseconds, even across buckets with 100K+ objects.
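A minimal sketch of how an FTS5 table over object keys answers such queries, assuming the default unicode61 tokenizer (which splits keys on "/", ".", and "-"); the table layout is illustrative, not Sairo's actual schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Illustrative FTS5 virtual table indexing object keys.
con.execute("CREATE VIRTUAL TABLE keys USING fts5(key)")
con.executemany("INSERT INTO keys VALUES (?)", [
    ("invoices/2024/march/report.pdf",),
    ("backups/daily/2024-03-01.tar.gz",),
    ("data/events.parquet",),
])

# The tokenizer splits each key into segments, so a two-term query
# matches any key containing both terms (implicit AND).
rows = con.execute(
    "SELECT key FROM keys WHERE keys MATCH ? ORDER BY rank",
    ("march report",),
).fetchall()
# rows == [("invoices/2024/march/report.pdf",)]
```

The same mechanism makes extension queries like "parquet" work: ".parquet" is tokenized into the bare term, which matches the key segments produced at insert time.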
Sairo’s background crawler keeps the local index in sync with your S3 storage.
Startup crawl
When the container starts, Sairo immediately begins a full crawl of every accessible bucket. It uses ListObjectsV2 to enumerate all objects and inserts their metadata into the corresponding SQLite databases.
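The enumerate-and-insert step can be sketched as below. The schema and function names are illustrative, not Sairo's actual internals; `pages` stands in for the page dicts that boto3's `list_objects_v2` paginator yields:

```python
import sqlite3

def index_pages(db: sqlite3.Connection, pages) -> int:
    """Insert object metadata from ListObjectsV2-style pages.

    `pages` is any iterable of page dicts shaped like the output of
    boto3's s3.get_paginator("list_objects_v2").paginate(Bucket=...).
    The column set here is a simplified illustration.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS objects "
        "(key TEXT PRIMARY KEY, size INTEGER, last_modified TEXT)"
    )
    n = 0
    for page in pages:
        rows = [
            (o["Key"], o["Size"], str(o["LastModified"]))
            for o in page.get("Contents", [])  # empty pages have no Contents
        ]
        db.executemany("INSERT OR REPLACE INTO objects VALUES (?, ?, ?)", rows)
        n += len(rows)
    db.commit()
    return n
```

Using INSERT OR REPLACE keyed on the object key means a recrawl updates existing entries in place rather than duplicating them.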
Parallel indexing
The crawler runs 4 worker threads per bucket, each crawling a different prefix in parallel. For buckets with deep prefix hierarchies, this dramatically reduces indexing time compared to a single-threaded crawl.
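The fan-out pattern can be sketched with a small thread pool; `crawl_prefix` is a hypothetical stand-in for whatever function lists and indexes one prefix:

```python
from concurrent.futures import ThreadPoolExecutor

def crawl_bucket_prefixes(prefixes, crawl_prefix, workers=4):
    """Crawl a bucket's prefixes concurrently with a bounded thread pool.

    `crawl_prefix` is any callable that indexes one prefix and returns
    the number of objects it saw. Names and the per-bucket worker count
    of 4 mirror the description above; the rest is illustrative.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(crawl_prefix, prefixes))
    return sum(counts)
```

Listing S3 prefixes is I/O-bound, so threads (rather than processes) are a natural fit: each worker spends most of its time waiting on network responses.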
Metadata stored
For each object, the index stores:
Auto-recrawl loop
After the initial crawl completes, Sairo waits RECRAWL_INTERVAL seconds (default: 120) and then starts another crawl. This loop repeats indefinitely, keeping the index reasonably fresh.
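The startup-crawl-then-recrawl cycle reduces to a simple loop. This sketch parameterizes the sleep function and adds a `max_cycles` cutoff purely so it can be exercised in tests; the real loop runs forever:

```python
import os
import time

def recrawl_loop(crawl_all, max_cycles=None, sleep=time.sleep):
    """Run an initial full crawl, then re-crawl every RECRAWL_INTERVAL
    seconds (default 120). `crawl_all` is a hypothetical stand-in for
    the function that crawls every accessible bucket.
    """
    interval = int(os.environ.get("RECRAWL_INTERVAL", "120"))
    crawl_all()                  # startup crawl
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        sleep(interval)          # wait, then refresh the index
        crawl_all()
        cycles += 1
```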
Sairo uses a simple but secure authentication system based on JSON Web Tokens.
Login — User submits credentials (local, LDAP, or OAuth). The server validates them and issues a signed JWT.
Cookie storage — The JWT is stored in an httpOnly, Secure cookie. The httpOnly flag prevents JavaScript from reading the token, blocking XSS token theft; the Secure flag ensures it is transmitted only over HTTPS.
Request auth — Every subsequent API request includes the cookie automatically. The server verifies the JWT signature and checks expiration.
Session expiry — Tokens expire after SESSION_HOURS hours (default: 24). The user must re-authenticate.
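The issue/verify cycle can be sketched as a minimal HS256 JWT implementation using only the standard library. The claim names, secret handling, and error messages are illustrative (Sairo's actual token layout is not documented here, and a real deployment would typically use a maintained library such as PyJWT):

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _unb64url(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def issue_token(username: str, secret: bytes, session_hours: int = 24) -> str:
    """Sign a JWT with an exp claim SESSION_HOURS from now (default 24)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({
        "sub": username,
        "exp": int(time.time()) + session_hours * 3600,
    }).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_token(token: str, secret: bytes) -> dict:
    """Check the signature and expiration; return the claims if valid."""
    header, payload, sig = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("invalid signature")
    claims = json.loads(_unb64url(payload))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
```

Note the constant-time comparison (`hmac.compare_digest`) when checking the signature, which avoids leaking information through timing differences.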
Sairo supports two roles with granular bucket-level permissions:
Sairo supports multiple authentication backends. They can be used independently or combined:
| Provider | Use case | Config |
|---|---|---|
| Local | Default admin account, standalone users | ADMIN_USER, ADMIN_PASS |
| LDAP | Corporate directory integration | LDAP_* env vars |
| Google OAuth | Google Workspace SSO | OAUTH_GOOGLE_* env vars |
| GitHub OAuth | GitHub organization SSO | OAUTH_GITHUB_* env vars |
| API Tokens | CI/CD pipelines, automation | Generated in the UI |
See Authentication and OAuth & LDAP for setup guides.