# AI Storage Intelligence (MCP)
Sairo includes an optional MCP server (Model Context Protocol) that lets AI assistants like Claude, Cursor, and Windsurf analyze your entire storage infrastructure using natural language.
Instead of clicking through dashboards, just ask:
- “What’s eating all the space in production-reports?”
- “How much is my storage costing on Leaseweb?”
- “Find all parquet files modified this week”
- “Are there any duplicate files wasting storage?”
- “Is my data pipeline still running?”
The AI picks the right tools automatically. No tool names to memorize, no queries to write.
## Quick Setup (5 minutes)

**1. Create an API token in Sairo**

Log into your Sairo instance, go to API Tokens in the top nav, and click Create Token. Give it a name like `mcp-server` and select the admin role. Copy the token — it starts with `sairo_`.
**2. Start the MCP server alongside Sairo**
If you’re using Docker Compose, add the MCP service to your `docker-compose.yml`:

```yaml
sairo-mcp:
  build: ./mcp
  ports:
    - "8100:8100"
  environment:
    SAIRO_API_URL: http://sairo:8000
    SAIRO_API_TOKEN: sairo_your_token_here
    DB_DIR: /data
    MCP_BIND_HOST: "0.0.0.0"
    MCP_PORT: "8100"
  volumes:
    - sairo-data:/data:ro
  depends_on:
    sairo:
      condition: service_healthy
```

Then run:

```shell
docker compose up -d
```
**3. Connect your AI client**
See the client-specific guides below.
## Connect Your AI Client

### Claude Desktop (macOS)

The Claude Desktop Mac app connects via stdio — it launches the MCP server as a local process.
Prerequisites: Python 3.11+ with the `mcp` package installed:

```shell
pip install "mcp[cli]" httpx
```

Configure Claude Desktop:
Open (or create) `~/Library/Application Support/Claude/claude_desktop_config.json` and add:
```json
{
  "mcpServers": {
    "sairo-storage": {
      "command": "python3.11",
      "args": [
        "-c",
        "import sys; sys.path.insert(0, '/path/to/sairo/mcp'); from server import mcp; mcp.run(transport='stdio')"
      ],
      "env": {
        "DB_DIR": "/path/to/sairo/data",
        "SAIRO_API_URL": "http://localhost:8000",
        "SAIRO_API_TOKEN": "sairo_your_token_here",
        "MCP_LOG_LEVEL": "WARNING"
      }
    }
  }
}
```

Replace the paths:

- `/path/to/sairo/mcp` — where you cloned the Sairo repo’s `mcp/` directory
- `/path/to/sairo/data` — the Docker volume or directory where Sairo stores its SQLite databases
- `sairo_your_token_here` — the API token you created in step 1
Restart Claude Desktop (Cmd+Q, then reopen). You should see the tools icon in the chat input bar. Ask “What buckets do I have?” to test.
### Claude Desktop (Windows)

Same as Mac, but the config file is at `%APPDATA%\Claude\claude_desktop_config.json`. Use forward slashes in paths or escape backslashes:

```json
{
  "mcpServers": {
    "sairo-storage": {
      "command": "python",
      "args": [
        "-c",
        "import sys; sys.path.insert(0, 'C:/path/to/sairo/mcp'); from server import mcp; mcp.run(transport='stdio')"
      ],
      "env": {
        "DB_DIR": "C:/path/to/sairo/data",
        "SAIRO_API_URL": "http://localhost:8000",
        "SAIRO_API_TOKEN": "sairo_your_token_here",
        "MCP_LOG_LEVEL": "WARNING"
      }
    }
  }
}
```

### Cursor and Windsurf

Cursor and Windsurf connect via Streamable HTTP — they talk to the MCP server over the network.
Make sure the MCP server is running (via Docker Compose or standalone):
```shell
# Standalone:
cd sairo/mcp
SAIRO_API_URL=http://localhost:8000 \
SAIRO_API_TOKEN=sairo_your_token_here \
DB_DIR=/path/to/sairo/data \
python server.py
```

Then in your IDE settings, add the MCP server URL: `http://localhost:8100/mcp`

For Cursor: Go to Settings > MCP Servers > Add Server > URL: `http://localhost:8100/mcp`

For Windsurf: Go to Settings > AI > MCP > Add: `http://localhost:8100/mcp`
### Claude Code

Claude Code connects via stdio. Add to your project’s `.mcp.json` or use the CLI:

```shell
claude mcp add sairo-storage \
  --command "python3.11" \
  --args "-c" "import sys; sys.path.insert(0, '/path/to/sairo/mcp'); from server import mcp; mcp.run(transport='stdio')" \
  --env DB_DIR=/path/to/sairo/data \
  --env SAIRO_API_URL=http://localhost:8000 \
  --env SAIRO_API_TOKEN=sairo_your_token_here
```

### Other MCP Clients

For any MCP-compatible client, the server exposes Streamable HTTP at:
`http://your-server:8100/mcp`

The server follows the MCP 2025-03-26 specification:

- `POST /mcp` — JSON-RPC requests (initialize, tools/call, etc.)
- `GET /mcp` — Server-sent events for notifications
- Session management via the `Mcp-Session-Id` header
- Health check: `GET /healthz`
- Readiness: `GET /readyz`
- Metrics: `GET /metrics`
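To illustrate what happens under the hood, here is a minimal sketch of the JSON-RPC `initialize` request a client sends to `POST /mcp`. The payload shape follows the MCP 2025-03-26 specification; the `clientInfo` values and the exact headers your server accepts are assumptions, and the `httpx` call is shown commented out because it requires a running server.

```python
import json

def initialize_payload(request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 initialize request per the MCP 2025-03-26 spec."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # clientInfo values here are illustrative placeholders
            "clientInfo": {"name": "example-client", "version": "0.1"},
        },
    }

# Sending it (assumes the MCP server from this guide is running on :8100):
# import httpx
# resp = httpx.post(
#     "http://localhost:8100/mcp",
#     json=initialize_payload(),
#     # Streamable HTTP servers typically expect both accept types:
#     headers={"Accept": "application/json, text/event-stream"},
# )
# session_id = resp.headers.get("Mcp-Session-Id")  # reuse on later requests
```

Your AI client does all of this for you; the sketch is only useful for debugging the transport with curl or a script.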
## What You Can Ask

You don’t need to know tool names. Just ask naturally:
| What you want to know | What to ask |
|---|---|
| Overview of your storage | “What buckets do I have?” |
| Where space is going | “What’s taking up all the space in my-bucket?” |
| Find specific files | “Find all parquet files in the data pipeline bucket” |
| Check costs | “How much is this costing me on AWS?” or “Compare costs between R2 and Wasabi” |
| Find waste | “Are there any duplicate files?” or “What can I archive?” |
| Monitor pipelines | “Is the bidstream pipeline still running?” or “When was data/ last updated?” |
| Investigate issues | “Why did storage spike last week?” |
| Read file contents | “Show me the first 100 lines of that log file” |
| Inspect data schemas | “What columns are in that parquet file?” |
| Full audit | “Run a complete storage audit on production-reports” |
## What’s Included

### 26 Tools

| Category | Tools | What They Do |
|---|---|---|
| Discovery | `list_buckets`, `list_objects`, `list_folders`, `search_objects` | Navigate and search your storage |
| Inspection | `get_object_metadata`, `read_object_content`, `read_object_tail`, `get_file_schema`, `sample_csv_data`, `sample_json_data` | Look inside files without downloading |
| Analytics | `get_storage_breakdown`, `get_storage_trends`, `get_file_type_distribution`, `get_size_distribution`, `get_age_distribution`, `get_top_objects`, `find_duplicates` | Understand your data at a glance |
| Cost | `estimate_storage_cost`, `suggest_lifecycle_rules`, `find_cold_data` | Reduce your storage bill |
| Pipeline | `analyze_prefix_structure`, `detect_data_freshness`, `compare_snapshots` | Monitor data pipeline health |
| Operations | `get_crawl_status`, `trigger_crawl`, `get_audit_log` | Manage the Sairo system |
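When the AI decides to use one of these tools, it issues a JSON-RPC `tools/call` request. A minimal sketch of that request shape, using `find_duplicates` from the table above — note that the argument name (`bucket`) is illustrative, not Sairo’s documented tool schema:

```python
def tools_call_payload(tool: str, arguments: dict, request_id: int = 2) -> dict:
    """Build a JSON-RPC request invoking one MCP tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

# The AI client constructs this for you; "bucket" is a hypothetical argument:
payload = tools_call_payload("find_duplicates", {"bucket": "production-reports"})
```

This is why you never memorize tool names: the client reads the tool list from the server during initialization and picks the right `name` and `arguments` itself.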
### 4 Guided Workflows

Use these as prompts to trigger multi-step analysis:
- Storage Audit — Calls 7 tools in sequence: breakdown, file types, size distribution, age analysis, duplicates, lifecycle suggestions. Produces a structured report.
- Cost Optimization — Estimates current costs, finds cold data, identifies duplicates, recommends lifecycle rules with dollar savings.
- Data Quality Check — Checks pipeline freshness, analyzes naming patterns, compares recent snapshots, flags anomalies.
- Incident Investigation — Builds a timeline of storage changes, identifies which folders grew, checks audit logs for who did what.
## Supported Providers for Cost Estimation

| Provider | Storage Classes |
|---|---|
| AWS S3 | Standard, Standard-IA, One Zone-IA, Glacier Instant, Glacier Flexible, Glacier Deep Archive |
| Cloudflare R2 | Standard (no egress fees) |
| Backblaze B2 | Standard |
| Wasabi | Standard (no egress/API fees) |
| Leaseweb | Standard |
| MinIO / Ceph | Self-hosted ($0 software cost) |
## Security

Every tool call is gated by authentication and per-bucket RBAC:
- Validates API tokens against Sairo’s auth system
- Enforces bucket-level read/write permissions per user
- Sanitizes all inputs against SQL injection, path traversal, and null bytes
- Sanitizes all outputs to strip prompt injection patterns
- Runs read-only against SQLite databases (no mutations possible)
- Logs every tool invocation with user, bucket, and latency
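To make the input-sanitization bullet concrete, here is a sketch of the kind of allow-list validation described — this is illustrative, not Sairo’s actual implementation, and the function name and character set are assumptions:

```python
import re

# Illustrative allow-list: word chars, dash, dot, slash, and space only.
_SAFE_KEY = re.compile(r"^[\w\-./ ]+$")

def validate_object_key(key: str) -> str:
    """Reject null bytes, path traversal, and unexpected characters."""
    if "\x00" in key:
        raise ValueError("null byte in key")
    if ".." in key.split("/"):
        raise ValueError("path traversal in key")
    if not _SAFE_KEY.match(key):
        raise ValueError("disallowed characters in key")
    return key
```

Rejecting anything outside a narrow allow-list (rather than blocklisting known-bad strings) is what makes SQL injection and traversal payloads fail by default.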
The MCP server has been tested against 30+ published MCP attack vectors from 2025–2026 CVEs, including tool poisoning, SSRF, DNS rebinding, and confused deputy attacks.
## How It Works

```
AI Client (Claude, Cursor, Windsurf)
        │
        ▼
MCP Server (:8100)
 ├── SQLite databases (read-only, shared /data volume)
 │     → Folder listings in 0.05ms
 │     → Search, analytics, distributions
 │     → Cost estimation from indexed data
 │
 └── Sairo API (:8000)
       → File previews and schema extraction
       → Audit log queries
       → Crawl trigger (admin only)
```

The MCP server reads Sairo’s per-bucket SQLite databases directly for instant analytics. For operations that need S3 access (file previews, downloads), it calls the Sairo API.
## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `SAIRO_API_URL` | `http://localhost:8000` | URL of the Sairo API |
| `SAIRO_API_TOKEN` | (required) | API token created in Sairo’s admin panel |
| `DB_DIR` | `/data` | Path to Sairo’s SQLite databases (shared volume) |
| `MCP_PORT` | `8100` | HTTP port for Streamable HTTP transport |
| `MCP_BIND_HOST` | `127.0.0.1` | Bind address (`0.0.0.0` for Docker) |
| `MCP_LOG_LEVEL` | `INFO` | Log level: DEBUG, INFO, WARNING, ERROR |
| `MCP_LOG_FORMAT` | `json` | Log format: `json` (production) or `text` (development) |
## Troubleshooting

“No module named ‘mcp’” — Install the MCP SDK: `pip install "mcp[cli]" httpx`
Claude Desktop doesn’t show the tools icon — Make sure you restarted Claude Desktop completely (Cmd+Q). Check logs at `~/Library/Logs/Claude/mcp-server-sairo-storage.log`.

“No buckets found” — The `DB_DIR` path must point to the directory containing Sairo’s `.db` files. If Sairo runs in Docker, mount the same volume or copy the files locally.

“Authentication required” — Create an API token in Sairo’s admin panel (top nav > API Tokens > Create) and set it as `SAIRO_API_TOKEN`.
MCP server crashes on startup — If your project directory is named mcp/, Python may confuse it with the mcp pip package. Run the server from outside the mcp/ directory or use the sys.path.insert approach shown in the config examples above.