18 KiB
Production Deployment Guide
This guide covers deploying Flynn in a production environment.
Table of Contents
- Prerequisites
- Docker Deployment
- Nix Deployment
- Systemd Service
- Security
- Configuration
- Monitoring
- Backup & Recovery
- Performance Tuning
- Scaling Considerations
Prerequisites
System Requirements
- OS: Linux (Ubuntu 22.04+ recommended) or macOS
- Node.js: >= 22.0.0
- Memory: Minimum 2GB, 4GB+ recommended
- Disk: 10GB+ for sessions, memory, and vectors
- Docker: Required for sandbox features (optional)
Network Requirements
- Public IP or VPN (Tailscale recommended) for remote access
- Open ports: 18800 (gateway), optional 443 (Tailscale Serve)
- Outbound HTTPS access for model providers and web tools
External Services (Optional)
- Model Providers: Anthropic, OpenAI, GitHub Models, etc. (API keys required)
- Email: SMTP server for email notifications
- Object Storage: MinIO or S3 for backups (optional)
Docker Deployment
Quick Start
Using the provided docker-compose.yml:
# Clone repository
git clone <repo-url>
cd flynn
# Create config
cp config/default.yaml config/production.yaml
# Edit config/production.yaml with your settings
# Start services
docker compose up -d
# View logs
docker compose logs -f
Dockerfile
Use the repo Dockerfile: Dockerfile.
Notes:
- Multi-stage build (builder + runtime).
- Uses
corepack+pnpmwithpnpm-lock.yamlfor reproducible installs. - Exposes port
18800and runsdist/cli/index.js start.
Docker Compose Configuration
Use the repo compose file: docker-compose.yml.
The important parts to customize:
- Mount your config:
./config/production.yaml:/config/config.yaml:ro - Set provider keys (
ANTHROPIC_API_KEY, etc.) - Optionally set gateway token auth (
FLYNN_SERVER_TOKEN)
Environment Variables
# Node environment
export NODE_ENV=production
# Config path
export FLYNN_CONFIG=/path/to/config.yaml
# Data directory (default: ~/.local/share/flynn)
export FLYNN_DATA_DIR=/var/lib/flynn
# Optional: Override model provider credentials
export ANTHROPIC_API_KEY=sk-...
export OPENAI_API_KEY=sk-...
Nix Deployment
If you use Nix, this repo ships a flake (package + dev shell + optional NixOS
module). See docs/deployment/NIX.md.
Systemd Service
Service File
Create /etc/systemd/system/flynn.service:
[Unit]
Description=Flynn AI Assistant Daemon
After=network.target
Wants=network-online.target
[Service]
Type=simple
User=flynn
Group=flynn
WorkingDirectory=/opt/flynn
Environment="NODE_ENV=production"
Environment="FLYNN_CONFIG=/etc/flynn/config.yaml"
Environment="FLYNN_DATA_DIR=/var/lib/flynn"
ExecStart=/usr/local/bin/node /opt/flynn/dist/cli/index.js start
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=flynn
# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/flynn /var/log/flynn /var/run
# Resource limits
MemoryLimit=2G
MemorySwap=0
CPUQuota=200%
[Install]
WantedBy=multi-user.target
Create Flynn User
# Create user and group
sudo useradd --system --home /var/lib/flynn --shell /usr/sbin/nologin flynn
sudo groupadd flynn
# Create directories
sudo mkdir -p /opt/flynn /etc/flynn /var/lib/flynn /var/log/flynn
sudo chown -R flynn:flynn /opt/flynn /var/lib/flynn /var/log/flynn
# Copy binaries and config
sudo cp -r dist/* /opt/flynn/
sudo cp config/production.yaml /etc/flynn/config.yaml
sudo chown -R root:root /opt/flynn /etc/flynn
sudo chmod 644 /etc/flynn/config.yaml
Enable and Start Service
# Reload systemd
sudo systemctl daemon-reload
# Enable service (start on boot)
sudo systemctl enable flynn
# Start service
sudo systemctl start flynn
# Check status
sudo systemctl status flynn
# View logs
sudo journalctl -u flynn -f
# Restart service
sudo systemctl restart flynn
Service Management
# Stop service
sudo systemctl stop flynn
# Reload config (requires restart)
sudo systemctl restart flynn
# Check if running
sudo systemctl is-active flynn
# View recent logs
sudo journalctl -u flynn -n 100 --no-pager
Security
Secrets Management
Never commit secrets to version control. Use one of these approaches:
Environment Variables
# config/production.yaml
models:
default:
provider: anthropic
model: claude-sonnet-4-20250514
api_key: '${ANTHROPIC_API_KEY}'
Set in /etc/flynn/.env or systemd service file:
Environment="ANTHROPIC_API_KEY=sk-..."
HashiCorp Vault (Advanced)
Use a secrets manager and inject at runtime:
vault kv get -field=api_key secret/anthropic > /tmp/anthropic_key.txt
export ANTHROPIC_API_KEY=$(cat /tmp/anthropic_key.txt)
rm /tmp/anthropic_key.txt
Authentication
Gateway Auth
# config/production.yaml
server:
token: 'your-random-token-here' # Generate with: openssl rand -hex 32
tailscale_identity: true
auth_http: true
lock: false
Generate a secure token:
openssl rand -hex 32
Safe Defaults (Recommended)
These defaults align with docs/security/SAFE_PERSONAL_AGENT.md:
pairing:
enabled: true
tools:
profile: messaging
sandbox:
enabled: true
Channel Whitelists
Restrict who can interact with Flynn:
channels:
telegram:
allowedChatIds: ['123456789'] # Your Telegram chat ID
discord:
allowedGuildIds: ['987654321098765432']
allowedChannelIds: ['123456789012345678']
slack:
allowedChannelIds: ['C12345678']
signingSecret: '${SLACK_SIGNING_SECRET}'
Network Security
Firewall
# Ubuntu/Debian (ufw)
sudo ufw allow 22/tcp # SSH
sudo ufw allow 18800/tcp # Flynn gateway
sudo ufw enable
# CentOS/RHEL (firewalld)
sudo firewall-cmd --permanent --add-port=18800/tcp
sudo firewall-cmd --reload
Reverse Proxy (Nginx)
Place Flynn behind Nginx for TLS:
server {
listen 443 ssl http2;
server_name flynn.example.com;
ssl_certificate /etc/letsencrypt/live/flynn.example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/flynn.example.com/privkey.pem;
# WebSocket upgrade
location / {
proxy_pass http://localhost:18800;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Timeouts
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
}
# Health check endpoint (no auth required)
location /health {
proxy_pass http://localhost:18800/health;
access_log off;
}
}
Obtain TLS certificate with Let's Encrypt:
sudo certbot --nginx -d flynn.example.com
File Permissions
# Data directory
sudo chmod 750 /var/lib/flynn
sudo chown flynn:flynn /var/lib/flynn
# Config file
sudo chmod 640 /etc/flynn/config.yaml
sudo chown root:flynn /etc/flynn/config.yaml
# Logs
sudo chmod 750 /var/log/flynn
sudo chown flynn:flynn /var/log/flynn
Sandbox Security
Docker sandbox adds isolation but requires careful configuration:
# config/production.yaml
sandbox:
enabled: true
image: 'node:22-alpine'
dockerSocket: '/var/run/docker.sock'
resourceLimits:
memory: '512m'
cpus: '0.5'
timeoutSec: 60
networkMode: 'none' # No network access
Ensure Docker is secured:
# Run Docker as Flynn user
sudo usermod -aG docker flynn
# Configure Docker daemon security
sudo vim /etc/docker/daemon.json
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
},
"live-restore": true,
"userland-proxy": false
}
Configuration
Production Config Template
# config/production.yaml
# Base config for production deployment
# ── Gateway ───────────────────────────────────────────────────────────────
gateway:
enabled: true
port: 18800
auth:
token: '${GATEWAY_TOKEN}'
trustTailscaleIdentity: true
applyToHttp: true
lock:
enabled: true
tailscaleServe:
enabled: false # Set to true to expose via Tailscale
hostname: 'flynn'
port: 443
# ── Models ─────────────────────────────────────────────────────────────────
models:
default:
anthropic:
apiKey: '${ANTHROPIC_API_KEY}'
model: 'claude-sonnet-4-20250514'
maxTokens: 4096
router:
tiers:
default: 'anthropic:claude-sonnet-4-20250514'
fast: 'anthropic:claude-haiku-4-20250514'
complex: 'anthropic:claude-opus-4-20250514'
local: 'ollama:llama3'
fallbackChain:
- 'github:claude-sonnet-4-5'
- 'local:ollama:llama3'
retry:
maxAttempts: 3
initialDelayMs: 1000
multiplier: 2
maxDelayMs: 30000
# ── Channels ───────────────────────────────────────────────────────────────
channels:
telegram:
enabled: true
token: '${TELEGRAM_BOT_TOKEN}'
allowedChatIds: ['123456789']
discord:
enabled: false
slack:
enabled: false
whatsapp:
enabled: false
# ── Sessions ───────────────────────────────────────────────────────────────
sessions:
ttl: '7d'
maxSessions: 100
# ── Memory ────────────────────────────────────────────────────────────────
memory:
enabled: true
embeddings:
provider: 'openai'
openai:
apiKey: '${OPENAI_API_KEY}'
model: 'text-embedding-3-small'
# ── Tools ─────────────────────────────────────────────────────────────────
tools:
policy: 'coding' # Restrict tool access
executor:
defaultTimeoutMs: 30000
maxOutputBytes: 51200
sandbox:
enabled: false # Enable if using Docker
# ── Agents ────────────────────────────────────────────────────────────────
agents:
default:
modelTier: 'default'
toolPolicy: 'coding'
compaction:
thresholdPct: 80
keepTurns: 4
summaryMaxTokens: 1024
# ── Automation ────────────────────────────────────────────────────────────
automation:
cron:
enabled: false
webhooks:
enabled: false
heartbeat:
enabled: true
interval: '5m'
checks:
- 'gateway'
- 'model'
- 'channels'
- 'memory'
- 'disk'
notifications:
- type: 'telegram'
chatId: '123456789'
# ── Logging ───────────────────────────────────────────────────────────────
logging:
level: 'info' # debug, info, warn, error
Config Validation
Validate config before starting:
flynn doctor --config /etc/flynn/config.yaml
Monitoring
Health Checks
Flynn provides a health check endpoint:
# HTTP health check
curl http://localhost:18800/health
# Response
{
"status": "ok",
"version": "0.1.0",
"uptime": 12345
}
Logs
Journalctl (systemd)
# Follow logs
sudo journalctl -u flynn -f
# View last 100 lines
sudo journalctl -u flynn -n 100 --no-pager
# View logs since yesterday
sudo journalctl -u flynn --since yesterday
# Search for errors
sudo journalctl -u flynn | grep -i error
Log Rotation
Configure logrotate for systemd journal:
sudo vim /etc/systemd/journald.conf
[Journal]
SystemMaxUse=100M
MaxRetentionSec=7day
Restart systemd:
sudo systemctl restart systemd-journald
Heartbeat Monitor
Enable built-in heartbeat monitoring:
automation:
heartbeat:
enabled: true
interval: '5m'
checks:
- 'gateway'
- 'model'
- 'channels'
- 'memory'
- 'disk'
notifications:
- type: 'telegram'
chatId: '123456789'
- type: 'webhook'
url: 'https://hooks.slack.com/services/...'
External Monitoring
Prometheus (Optional)
Use Node.js prom-client for metrics (not currently implemented):
# Future feature
monitoring:
prometheus:
enabled: true
port: 9090
Uptime Monitoring
Use external services:
- UptimeRobot
- Pingdom
- Better Uptime
Monitor:
- Gateway HTTP health endpoint
- WebSocket connection
- Response time
Backup & Recovery
What to Backup
- Configuration:
/etc/flynn/config.yaml - Sessions: SQLite database at
~/.local/share/flynn/sessions.db - Memory Files:
~/.local/share/flynn/memory/ - Vectors: SQLite database at
~/.local/share/flynn/vectors.db - Pairing Codes: SQLite table within sessions.db
Backup Script
Create /usr/local/bin/flynn-backup.sh:
#!/bin/bash
set -e
BACKUP_DIR="/var/backups/flynn"
DATA_DIR="/var/lib/flynn"
CONFIG_DIR="/etc/flynn"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="$BACKUP_DIR/flynn_$DATE.tar.gz"
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Stop Flynn
sudo systemctl stop flynn
# Create backup
tar -czf "$BACKUP_FILE" \
"$CONFIG_DIR/config.yaml" \
"$DATA_DIR/sessions.db" \
"$DATA_DIR/vectors.db" \
"$DATA_DIR/memory/"
# Compress old backups (keep last 7 daily, 4 weekly, 12 monthly)
find "$BACKUP_DIR" -name "flynn_*.tar.gz" -mtime +90 -delete
# Restart Flynn
sudo systemctl start flynn
echo "Backup created: $BACKUP_FILE"
Make executable:
sudo chmod +x /usr/local/bin/flynn-backup.sh
Cron Job
Add to root crontab:
sudo crontab -e
# Daily backup at 2 AM
0 2 * * * /usr/local/bin/flynn-backup.sh >> /var/log/flynn-backup.log 2>&1
Restore
# Stop Flynn
sudo systemctl stop flynn
# Extract backup
sudo tar -xzf /var/backups/flynn/flynn_20250213_020000.tar.gz -C /
# Start Flynn
sudo systemctl start flynn
Database Maintenance
Run SQLite vacuum periodically:
sqlite3 /var/lib/flynn/sessions.db "VACUUM;"
sqlite3 /var/lib/flynn/vectors.db "VACUUM;"
Add to crontab (monthly):
0 0 1 * * sqlite3 /var/lib/flynn/sessions.db "VACUUM;" >> /var/log/flynn-maintenance.log 2>&1
Performance Tuning
Node.js Tuning
Set Node.js options for production:
# In systemd service
Environment="NODE_OPTIONS=--max-old-space-size=2048"
# Or via environment variable
export NODE_OPTIONS="--max-old-space-size=2048"
Context Management
Optimize compaction settings:
agents:
default:
compaction:
thresholdPct: 75 # Trigger earlier
keepTurns: 6 # Keep more context
summaryMaxTokens: 2048 # Better summaries
SQLite Performance
Enable WAL mode:
sqlite3 /var/lib/flynn/sessions.db "PRAGMA journal_mode=WAL;"
sqlite3 /var/lib/flynn/sessions.db "PRAGMA synchronous=NORMAL;"
sqlite3 /var/lib/flynn/sessions.db "PRAGMA cache_size=-64000;" # 64MB
Model Routing
Configure tiers for optimal cost/latency:
models:
router:
tiers:
fast: 'anthropic:claude-haiku-4-20250514' # Quick tasks
default: 'anthropic:claude-sonnet-4-20250514' # General use
complex: 'anthropic:claude-opus-4-20250514' # Complex reasoning
local: 'ollama:llama3' # Fallback
Caching (Future)
Consider adding caching for:
- Repeated tool calls
- Memory search results
- Model responses for common queries
Scaling Considerations
Single-Operator Scope
Flynn is designed for a single operator with multiple concurrent users. Limitations:
- Max Concurrent Sessions: ~100 (depends on model rate limits)
- Throughput: ~10-20 messages/second (varies by model)
- Memory Usage: 2-4GB for moderate usage
When to Scale Up
Consider scaling if:
- Consistent CPU usage > 80%
- Memory usage > 4GB
- Frequent rate limiting from model providers
- Slow response times > 30 seconds
Scaling Strategies
-
Horizontal Scaling: Deploy multiple Flynn instances behind a load balancer (not currently supported - sessions are stateful)
-
Vertical Scaling: Increase server resources (CPU, memory)
-
Multi-Instance Architecture (future):
- Shared session storage (PostgreSQL/Redis)
- Message queue for request distribution
- Session affinity for stateful connections
Cost Optimization
- Use local models for non-critical tasks
- Cache embeddings
- Optimize compaction to reduce token usage
- Use efficient models for delegated tasks
Troubleshooting Production Issues
Service Won't Start
# Check status
sudo systemctl status flynn
# View logs
sudo journalctl -u flynn -n 50 --no-pager
# Validate config
flynn doctor --config /etc/flynn/config.yaml
High Memory Usage
# Check memory
free -h
# Check process memory
ps aux | grep flynn
# Restart service
sudo systemctl restart flynn
Gateway Connection Issues
# Check if port is listening
sudo ss -tlnp | grep 18800
# Check firewall
sudo ufw status
# Test connectivity
curl http://localhost:18800/health
Slow Response Times
# Check CPU usage
top
# Check model provider status
# Verify API keys are valid
# Check network latency
# Enable debug logging
DEBUG='*' sudo systemctl restart flynn
For additional help, see:
- TROUBLESHOOTING.md
- README.md
- GitHub Issues