# Production Deployment Guide

This guide covers deploying Flynn in a production environment.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Docker Deployment](#docker-deployment)
- [Systemd Service](#systemd-service)
- [Security](#security)
- [Configuration](#configuration)
- [Monitoring](#monitoring)
- [Backup & Recovery](#backup--recovery)
- [Performance Tuning](#performance-tuning)
- [Scaling Considerations](#scaling-considerations)
- [Troubleshooting Production Issues](#troubleshooting-production-issues)

## Prerequisites

### System Requirements

- **OS**: Linux (Ubuntu 22.04+ recommended) or macOS
- **Node.js**: >= 22.0.0
- **Memory**: Minimum 2GB, 4GB+ recommended
- **Disk**: 10GB+ for sessions, memory, and vectors
- **Docker**: Optional; required only for sandbox features

### Network Requirements

- Public IP or VPN (Tailscale recommended) for remote access
- Open ports: 18800 (gateway) and, optionally, 443 (Tailscale Serve)
- Outbound HTTPS access for model providers and web tools

### External Services (Optional)

- **Model Providers**: Anthropic, OpenAI, GitHub Models, etc. (API keys required)
- **Email**: SMTP server for email notifications
- **Object Storage**: MinIO or S3 for backups

## Docker Deployment

### Quick Start

Using the provided `docker-compose.yml`:

```bash
# Clone repository
git clone <repo-url>
cd flynn

# Create config
cp config/default.yaml config/production.yaml
# Edit config/production.yaml with your settings

# Start services
docker-compose up -d

# View logs
docker-compose logs -f
```

### Dockerfile

The multi-stage Dockerfile:

```dockerfile
# Stage 1: Build
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies; the build step typically needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only runtime packages reach the next stage
RUN npm prune --omit=dev

# Stage 2: Runtime
FROM node:22-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY config ./config
COPY src/gateway/ui ./dist/gateway/ui

# Create data directory
RUN mkdir -p /data

# Health check (exits non-zero on a non-200 status or a connection error)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD node -e "require('http').get('http://localhost:18800/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)}).on('error', () => process.exit(1))"

# Expose gateway port
EXPOSE 18800

# Run
CMD ["node", "dist/cli/index.js", "start"]
```

### Docker Compose Configuration

```yaml
version: '3.8'

services:
  flynn:
    build: .
    container_name: flynn
    restart: unless-stopped
    ports:
      - "18800:18800"
    volumes:
      - ./config/production.yaml:/flynn/config.yaml:ro
      - flynn_data:/data
      - /var/run/docker.sock:/var/run/docker.sock # For sandbox
    environment:
      - NODE_ENV=production
      - FLYNN_CONFIG=/flynn/config.yaml
      - FLYNN_DATA_DIR=/data # Store data on the flynn_data volume
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:18800/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 5s

  whisper:
    image: openai/whisper-server:latest
    container_name: whisper-server
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - whisper_cache:/cache
    environment:
      - WHISPER_MODEL=base
      - WHISPER_HTTP_PORT=8080

volumes:
  flynn_data:
  whisper_cache:
```

### Environment Variables

```bash
# Node environment
export NODE_ENV=production

# Config path
export FLYNN_CONFIG=/path/to/config.yaml

# Data directory (default: ~/.local/share/flynn)
export FLYNN_DATA_DIR=/var/lib/flynn

# Optional: Override model provider credentials
export ANTHROPIC_API_KEY=sk-...
export OPENAI_API_KEY=sk-...
```

## Systemd Service

### Service File

Create `/etc/systemd/system/flynn.service`:

```ini
[Unit]
Description=Flynn AI Assistant Daemon
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=flynn
Group=flynn
WorkingDirectory=/opt/flynn
Environment="NODE_ENV=production"
Environment="FLYNN_CONFIG=/etc/flynn/config.yaml"
Environment="FLYNN_DATA_DIR=/var/lib/flynn"
ExecStart=/usr/local/bin/node /opt/flynn/dist/cli/index.js start
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=flynn

# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/flynn /var/log/flynn /var/run

# Resource limits (MemoryMax and MemorySwapMax require cgroup v2)
MemoryMax=2G
MemorySwapMax=0
CPUQuota=200%

[Install]
WantedBy=multi-user.target
```

### Create Flynn User

```bash
# Create a system user with a matching group
sudo useradd --system --user-group --home-dir /var/lib/flynn --shell /usr/sbin/nologin flynn

# Create directories
sudo mkdir -p /opt/flynn /etc/flynn /var/lib/flynn /var/log/flynn

# Copy build output, runtime dependencies, and config
# (keeps dist/ as a subdirectory so the ExecStart path resolves)
sudo cp -r dist node_modules /opt/flynn/
sudo cp config/production.yaml /etc/flynn/config.yaml

# Application and config are owned by root; data and logs by flynn
sudo chown -R root:root /opt/flynn /etc/flynn
sudo chown -R flynn:flynn /var/lib/flynn /var/log/flynn

# The config may hold secrets: readable by root and the flynn group only
sudo chown root:flynn /etc/flynn/config.yaml
sudo chmod 640 /etc/flynn/config.yaml
```

### Enable and Start Service

```bash
# Reload systemd
sudo systemctl daemon-reload

# Enable service (start on boot)
sudo systemctl enable flynn

# Start service
sudo systemctl start flynn

# Check status
sudo systemctl status flynn

# View logs
sudo journalctl -u flynn -f

# Restart service
sudo systemctl restart flynn
```

### Service Management

```bash
# Stop service
sudo systemctl stop flynn

# Reload config (requires restart)
sudo systemctl restart flynn

# Check if running
sudo systemctl is-active flynn

# View recent logs
sudo journalctl -u flynn -n 100 --no-pager
```

## Security

### Secrets Management

Never commit secrets to version control. Use one of these approaches:

#### Environment Variables

```yaml
# config/production.yaml
models:
  default:
    anthropic:
      apiKey: '${ANTHROPIC_API_KEY}'
```

Set the variable in the systemd service file, or keep it in `/etc/flynn/.env` and load that file with `EnvironmentFile=`. Unit files are usually world-readable, so a separate file restricted to root (mode 600) is the safer choice:
```ini
Environment="ANTHROPIC_API_KEY=sk-..."
# Or load KEY=value pairs from a root-only file instead
EnvironmentFile=/etc/flynn/.env
```
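
A placeholder whose variable was never set is easy to miss until a provider call fails. As a minimal pre-flight sketch, the function below lists every `${NAME}` placeholder in a config file that has no value in the current environment. The name `missing_config_vars` is hypothetical, and the check assumes the plain `${NAME}` form shown above; Flynn's own substitution rules may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the name of every ${NAME} placeholder in a config
# file whose environment variable is unset or empty.
missing_config_vars() {
  local file="$1" name value
  grep -oE '\$\{[A-Za-z_][A-Za-z0-9_]*\}' "$file" | tr -d '${}' | sort -u |
    while read -r name; do
      # Names are limited to [A-Za-z0-9_] by the grep above, so eval is safe here
      eval "value=\${$name:-}"
      if [ -z "$value" ]; then
        echo "$name"
      fi
    done
}
```

Running `missing_config_vars /etc/flynn/config.yaml` in the same environment as the service prints nothing when every placeholder has a value.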

#### HashiCorp Vault (Advanced)

Use a secrets manager and inject at runtime:

```bash
# Read the secret straight into the environment; nothing is written to disk
export ANTHROPIC_API_KEY="$(vault kv get -field=api_key secret/anthropic)"
```

### Authentication

#### Gateway Auth

```yaml
# config/production.yaml
gateway:
  enabled: true
  auth:
    token: 'your-random-token-here' # Generate with: openssl rand -hex 32
    trustTailscaleIdentity: true
    applyToHttp: true
```

Generate a secure token:
```bash
openssl rand -hex 32
```

#### Channel Whitelists

Restrict who can interact with Flynn:

```yaml
channels:
  telegram:
    allowedChatIds: ['123456789'] # Your Telegram chat ID
  discord:
    allowedGuildIds: ['987654321098765432']
    allowedChannelIds: ['123456789012345678']
  slack:
    allowedChannelIds: ['C12345678']
    signingSecret: '${SLACK_SIGNING_SECRET}'
```

### Network Security

#### Firewall

```bash
# Ubuntu/Debian (ufw)
sudo ufw allow 22/tcp    # SSH
sudo ufw allow 18800/tcp # Flynn gateway
sudo ufw enable

# CentOS/RHEL (firewalld)
sudo firewall-cmd --permanent --add-port=18800/tcp
sudo firewall-cmd --reload
```

#### Reverse Proxy (Nginx)

Place Flynn behind Nginx for TLS:

```nginx
server {
    listen 443 ssl http2;
    server_name flynn.example.com;

    ssl_certificate /etc/letsencrypt/live/flynn.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/flynn.example.com/privkey.pem;

    # WebSocket upgrade
    location / {
        proxy_pass http://localhost:18800;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts (long read/send values keep idle WebSocket connections open)
        proxy_connect_timeout 60s;
        proxy_send_timeout 3600s;
        proxy_read_timeout 3600s;
    }

    # Health check endpoint (no auth required)
    location /health {
        proxy_pass http://localhost:18800/health;
        access_log off;
    }
}
```

Obtain a TLS certificate with Let's Encrypt:
```bash
sudo certbot --nginx -d flynn.example.com
```

### File Permissions

```bash
# Data directory
sudo chmod 750 /var/lib/flynn
sudo chown flynn:flynn /var/lib/flynn

# Config file
sudo chmod 640 /etc/flynn/config.yaml
sudo chown root:flynn /etc/flynn/config.yaml

# Logs
sudo chmod 750 /var/log/flynn
sudo chown flynn:flynn /var/log/flynn
```

### Sandbox Security

The Docker sandbox adds isolation but requires careful configuration:

```yaml
# config/production.yaml
sandbox:
  enabled: true
  image: 'node:22-alpine'
  dockerSocket: '/var/run/docker.sock'
  resourceLimits:
    memory: '512m'
    cpus: '0.5'
  timeoutSec: 60
  networkMode: 'none' # No network access
```

Ensure Docker is secured:
```bash
# Let the flynn user talk to the Docker daemon
# (docker group membership is effectively root access on the host)
sudo usermod -aG docker flynn

# Configure Docker daemon security
sudo vim /etc/docker/daemon.json
```

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "live-restore": true,
  "userland-proxy": false
}
```

## Configuration

### Production Config Template

```yaml
# config/production.yaml
# Base config for production deployment

# ── Gateway ───────────────────────────────────────────────────────────────
gateway:
  enabled: true
  port: 18800
  auth:
    token: '${GATEWAY_TOKEN}'
    trustTailscaleIdentity: true
    applyToHttp: true
  lock:
    enabled: true
  tailscaleServe:
    enabled: false # Set to true to expose via Tailscale
    hostname: 'flynn'
    port: 443

# ── Models ─────────────────────────────────────────────────────────────────
models:
  default:
    anthropic:
      apiKey: '${ANTHROPIC_API_KEY}'
      model: 'claude-sonnet-4-20250514'
      maxTokens: 4096

  router:
    tiers:
      default: 'anthropic:claude-sonnet-4-20250514'
      fast: 'anthropic:claude-3-5-haiku-20241022'
      complex: 'anthropic:claude-opus-4-20250514'
      local: 'ollama:llama3'

  fallbackChain:
    - 'github:claude-sonnet-4-5'
    - 'local:ollama:llama3'

  retry:
    maxAttempts: 3
    initialDelayMs: 1000
    multiplier: 2
    maxDelayMs: 30000

# ── Channels ───────────────────────────────────────────────────────────────
channels:
  telegram:
    enabled: true
    token: '${TELEGRAM_BOT_TOKEN}'
    allowedChatIds: ['123456789']

  discord:
    enabled: false

  slack:
    enabled: false

  whatsapp:
    enabled: false

# ── Sessions ───────────────────────────────────────────────────────────────
sessions:
  ttl: '7d'
  maxSessions: 100

# ── Memory ────────────────────────────────────────────────────────────────
memory:
  enabled: true
  embeddings:
    provider: 'openai'
    openai:
      apiKey: '${OPENAI_API_KEY}'
      model: 'text-embedding-3-small'

# ── Tools ─────────────────────────────────────────────────────────────────
tools:
  policy: 'coding' # Restrict tool access

  executor:
    defaultTimeoutMs: 30000
    maxOutputBytes: 51200

sandbox:
  enabled: false # Enable if using Docker

# ── Agents ────────────────────────────────────────────────────────────────
agents:
  default:
    modelTier: 'default'
    toolPolicy: 'coding'
    compaction:
      thresholdPct: 80
      keepTurns: 4
      summaryMaxTokens: 1024

# ── Automation ────────────────────────────────────────────────────────────
automation:
  cron:
    enabled: false

  webhooks:
    enabled: false

  heartbeat:
    enabled: true
    interval: '5m'
    checks:
      - 'gateway'
      - 'model'
      - 'channels'
      - 'memory'
      - 'disk'
    notifications:
      - type: 'telegram'
        chatId: '123456789'

# ── Logging ───────────────────────────────────────────────────────────────
logging:
  level: 'info' # debug, info, warn, error
```
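
The `retry` settings in the template read as an exponential backoff. As a rough sketch, assuming the wait before retry `n` is `initialDelayMs * multiplier^(n-1)` capped at `maxDelayMs`, the delays can be computed as below. Flynn's actual schedule (including any jitter) is not documented here, and `backoff_delay_ms` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: delay in ms before retry number N (1-based), assuming
# delay = initialDelayMs * multiplier^(N-1), capped at maxDelayMs.
# Arguments: N [initialDelayMs] [multiplier] [maxDelayMs]
backoff_delay_ms() {
  local attempt="$1" initial="${2:-1000}" multiplier="${3:-2}" max="${4:-30000}"
  local delay="$initial" i=1
  # Multiply step by step; stop early once the cap is reached
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$max" ]; do
    delay=$((delay * multiplier))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$max" ]; then
    delay="$max"
  fi
  echo "$delay"
}
```

With the template values, `backoff_delay_ms 1` gives 1000 and `backoff_delay_ms 3` gives 4000; under this assumption the 30000 cap only matters from the sixth retry on.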

### Config Validation

Validate config before starting:

```bash
flynn doctor --config /etc/flynn/config.yaml
```

## Monitoring

### Health Checks

Flynn provides a health check endpoint:

```bash
# HTTP health check
curl http://localhost:18800/health

# Response
{
  "status": "ok",
  "version": "0.1.0",
  "uptime": 12345
}
```
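
After a restart or deploy it can help to wait until the endpoint answers before sending traffic. The sketch below polls `/health` a bounded number of times; `probe_health` and `wait_for_health` are hypothetical helper names, and the attempt count and delay are arbitrary defaults rather than values Flynn prescribes.

```bash
#!/usr/bin/env bash
# Probe the endpoint once; fails on connection errors and HTTP status >= 400.
probe_health() {
  curl -fsS --max-time 5 "$1" >/dev/null 2>&1
}

# Hypothetical helper: poll until the probe succeeds or the attempts run out.
# Arguments: URL [attempts] [delay-seconds]
wait_for_health() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    if probe_health "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}
```

For example, `sudo systemctl restart flynn && wait_for_health http://localhost:18800/health` returns non-zero if the gateway never comes up, which makes it usable in deploy scripts.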

### Logs

#### Journalctl (systemd)

```bash
# Follow logs
sudo journalctl -u flynn -f

# View last 100 lines
sudo journalctl -u flynn -n 100 --no-pager

# View logs since yesterday
sudo journalctl -u flynn --since yesterday

# Search for errors
sudo journalctl -u flynn | grep -i error
```

#### Log Rotation

The systemd journal is rotated by journald itself, not by logrotate. Limit its size and retention in journald's configuration:

```bash
sudo vim /etc/systemd/journald.conf
```

```ini
[Journal]
SystemMaxUse=100M
MaxRetentionSec=7day
```

Restart journald:
```bash
sudo systemctl restart systemd-journald
```

### Heartbeat Monitor

Enable built-in heartbeat monitoring:

```yaml
automation:
  heartbeat:
    enabled: true
    interval: '5m'
    checks:
      - 'gateway'
      - 'model'
      - 'channels'
      - 'memory'
      - 'disk'
    notifications:
      - type: 'telegram'
        chatId: '123456789'
      - type: 'webhook'
        url: 'https://hooks.slack.com/services/...'
```

### External Monitoring

#### Prometheus (Optional)

Prometheus metrics are not currently implemented. A future version could expose them with the Node.js `prom-client` library, configured along these lines:

```yaml
# Future feature
monitoring:
  prometheus:
    enabled: true
    port: 9090
```

#### Uptime Monitoring

Use external services:
- UptimeRobot
- Pingdom
- Better Uptime

Monitor:
- Gateway HTTP health endpoint
- WebSocket connection
- Response time

## Backup & Recovery

### What to Backup

The paths below use the default data directory. When `FLYNN_DATA_DIR` is set (for example to `/var/lib/flynn` in the systemd unit), the data files live there instead.

1. **Configuration**: `/etc/flynn/config.yaml`
2. **Sessions**: SQLite database at `~/.local/share/flynn/sessions.db`
3. **Memory Files**: `~/.local/share/flynn/memory/`
4. **Vectors**: SQLite database at `~/.local/share/flynn/vectors.db`
5. **Pairing Codes**: SQLite table within `sessions.db`

### Backup Script

Create `/usr/local/bin/flynn-backup.sh`:

```bash
#!/bin/bash
set -e

BACKUP_DIR="/var/backups/flynn"
DATA_DIR="/var/lib/flynn"
CONFIG_DIR="/etc/flynn"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="$BACKUP_DIR/flynn_$DATE.tar.gz"

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Stop Flynn so the SQLite files are consistent,
# and restart it on exit even if the backup fails
sudo systemctl stop flynn
trap 'sudo systemctl start flynn' EXIT

# Create backup
tar -czf "$BACKUP_FILE" \
  "$CONFIG_DIR/config.yaml" \
  "$DATA_DIR/sessions.db" \
  "$DATA_DIR/vectors.db" \
  "$DATA_DIR/memory/"

# Delete backups older than 90 days
find "$BACKUP_DIR" -name "flynn_*.tar.gz" -mtime +90 -delete

echo "Backup created: $BACKUP_FILE"
```

Make executable:
```bash
sudo chmod +x /usr/local/bin/flynn-backup.sh
```
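
A backup is only useful if it can be read back. As a minimal sketch, the function below checks that an archive is readable and contains the expected members; `verify_backup` is a hypothetical helper, not part of Flynn. Member names are given without the leading `/`, because `tar` strips it when creating the archive.

```bash
#!/usr/bin/env bash
# Hypothetical helper: confirm an archive is readable and contains every
# expected member. Arguments: ARCHIVE MEMBER...
verify_backup() {
  local archive="$1" listing member
  shift
  # tar -tzf fails if the archive is missing, truncated, or corrupt
  listing=$(tar -tzf "$archive") || return 1
  for member in "$@"; do
    if ! printf '%s\n' "$listing" | grep -qxF -- "$member"; then
      echo "missing from backup: $member" >&2
      return 1
    fi
  done
  echo "backup ok: $archive"
}
```

For the script above, a check could look like `verify_backup "$BACKUP_FILE" etc/flynn/config.yaml var/lib/flynn/sessions.db var/lib/flynn/vectors.db`.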

### Cron Job

Add to root crontab:

```bash
sudo crontab -e
```

```
# Daily backup at 2 AM
0 2 * * * /usr/local/bin/flynn-backup.sh >> /var/log/flynn-backup.log 2>&1
```

### Restore

```bash
# Stop Flynn
sudo systemctl stop flynn

# Extract backup
sudo tar -xzf /var/backups/flynn/flynn_20250213_020000.tar.gz -C /

# Start Flynn
sudo systemctl start flynn
```

### Database Maintenance

Run SQLite `VACUUM` periodically to reclaim free space. It rewrites the whole database file, so schedule it for a quiet period:

```bash
sqlite3 /var/lib/flynn/sessions.db "VACUUM;"
sqlite3 /var/lib/flynn/vectors.db "VACUUM;"
```

Add to crontab (monthly):
```
0 0 1 * * sqlite3 /var/lib/flynn/sessions.db "VACUUM;" >> /var/log/flynn-maintenance.log 2>&1
5 0 1 * * sqlite3 /var/lib/flynn/vectors.db "VACUUM;" >> /var/log/flynn-maintenance.log 2>&1
```

## Performance Tuning

### Node.js Tuning

Set Node.js options for production. `--max-old-space-size` caps only the V8 heap, so keep it below any systemd memory limit (for example 1536 under a 2G limit) to leave room for buffers and native memory:

```bash
# In systemd service
Environment="NODE_OPTIONS=--max-old-space-size=1536"

# Or via environment variable
export NODE_OPTIONS="--max-old-space-size=1536"
```

### Context Management

Optimize compaction settings:

```yaml
agents:
  default:
    compaction:
      thresholdPct: 75       # Trigger earlier
      keepTurns: 6           # Keep more context
      summaryMaxTokens: 2048 # Better summaries
```
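
To see what "trigger earlier" means in tokens, here is a small arithmetic sketch. It assumes `thresholdPct` is a percentage of the model's context window, which this guide does not state explicitly; `compaction_trigger_tokens` is a hypothetical helper, and the 200000-token window is only an example value.

```bash
#!/usr/bin/env bash
# Hypothetical helper: token count at which compaction would start, assuming
# thresholdPct is a percentage of the context window.
# Arguments: CONTEXT_WINDOW_TOKENS THRESHOLD_PCT
compaction_trigger_tokens() {
  local context_window="$1" threshold_pct="$2"
  echo $((context_window * threshold_pct / 100))
}
```

Under that assumption, lowering the threshold from 80 to 75 on a 200000-token window moves the trigger point from 160000 to 150000 tokens.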
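
### SQLite Performance

Enable WAL mode. `journal_mode=WAL` is recorded in the database file and persists across connections. `synchronous` and `cache_size` apply only to the connection that sets them, so running them from the `sqlite3` CLI has no effect on Flynn; they must be set by the application on its own connection:

```bash
# Persistent: stored in the database file
sqlite3 /var/lib/flynn/sessions.db "PRAGMA journal_mode=WAL;"

# Per-connection settings, listed for reference only:
#   PRAGMA synchronous=NORMAL;
#   PRAGMA cache_size=-64000;   (about 64MB)
```
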
### Model Routing

Configure tiers for optimal cost/latency:

```yaml
models:
  router:
    tiers:
      fast: 'anthropic:claude-3-5-haiku-20241022'   # Quick tasks
      default: 'anthropic:claude-sonnet-4-20250514' # General use
      complex: 'anthropic:claude-opus-4-20250514'   # Complex reasoning
      local: 'ollama:llama3'                        # Fallback
```
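
The tier values above follow a `provider:model` pattern. A typo in either half is easy to make, so a quick sanity check before deploying can help. The sketch below splits a value at the first colon; `split_tier_spec` is a hypothetical helper, and the assumption that everything after the first colon is the model name is inferred from the examples, not from Flynn's parser.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split a "provider:model" tier value at the first colon.
split_tier_spec() {
  local spec="$1"
  case "$spec" in
    ?*:?*) ;; # needs a non-empty provider and a non-empty model
    *)
      echo "invalid tier spec: $spec" >&2
      return 1
      ;;
  esac
  # %%:* drops everything from the first colon; #*: drops up to the first colon
  echo "provider=${spec%%:*} model=${spec#*:}"
}
```

For example, `split_tier_spec 'ollama:llama3'` prints `provider=ollama model=llama3`, while a value without a colon is rejected.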

### Caching (Future)

Consider adding caching for:
- Repeated tool calls
- Memory search results
- Model responses for common queries

## Scaling Considerations

### Single-Operator Scope

Flynn is designed for a single operator with multiple concurrent users. Limitations:

- **Max Concurrent Sessions**: ~100 (depends on model rate limits)
- **Throughput**: ~10-20 messages/second (varies by model)
- **Memory Usage**: 2-4GB for moderate usage

### When to Scale Up

Consider scaling if:
- Consistent CPU usage > 80%
- Memory usage > 4GB
- Frequent rate limiting from model providers
- Slow response times > 30 seconds
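
The numeric thresholds above can be turned into a simple check. The sketch below takes measurements you have already collected and reports which limits are exceeded; `scale_up_reasons` is a hypothetical helper, the inputs are whole numbers, and rate limiting is left out because it has to be read from provider responses or logs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which scale-up thresholds are exceeded.
# Arguments: CPU_PERCENT MEMORY_MB RESPONSE_SECONDS (integers)
scale_up_reasons() {
  local cpu_pct="$1" mem_mb="$2" response_sec="$3"
  if [ "$cpu_pct" -gt 80 ]; then
    echo "cpu above 80%"
  fi
  if [ "$mem_mb" -gt 4096 ]; then
    echo "memory above 4GB"
  fi
  if [ "$response_sec" -gt 30 ]; then
    echo "response time above 30s"
  fi
}
```

An empty result means none of the three thresholds is crossed; a single sample is not the same as "consistent" usage, so feed it averages over a meaningful window.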

### Scaling Strategies

1. **Horizontal Scaling**: Deploy multiple Flynn instances behind a load balancer (not currently supported; sessions are stateful)

2. **Vertical Scaling**: Increase server resources (CPU, memory)

3. **Multi-Instance Architecture** (future):
   - Shared session storage (PostgreSQL/Redis)
   - Message queue for request distribution
   - Session affinity for stateful connections

### Cost Optimization

- Use local models for non-critical tasks
- Cache embeddings
- Optimize compaction to reduce token usage
- Use efficient models for delegated tasks

## Troubleshooting Production Issues

### Service Won't Start

```bash
# Check status
sudo systemctl status flynn

# View logs
sudo journalctl -u flynn -n 50 --no-pager

# Validate config
flynn doctor --config /etc/flynn/config.yaml
```

### High Memory Usage

```bash
# Check memory
free -h

# Check process memory
ps aux | grep flynn

# Restart service
sudo systemctl restart flynn
```

### Gateway Connection Issues

```bash
# Check if port is listening
sudo ss -tlnp | grep 18800

# Check firewall
sudo ufw status

# Test connectivity
curl http://localhost:18800/health
```

### Slow Response Times

```bash
# Check CPU usage
top

# Check model provider status
# Verify API keys are valid
# Check network latency

# Enable debug logging: set logging.level to 'debug' in
# /etc/flynn/config.yaml, then restart. A variable set in front of
# `sudo systemctl` never reaches the service process.
sudo systemctl restart flynn
```

---

For additional help, see:
- [TROUBLESHOOTING.md](../../TROUBLESHOOTING.md)
- [README.md](../../README.md)
- GitHub Issues