docs: sync voice reliability updates and phase state

2026-02-26 17:29:29 -08:00
parent 163b1a0139
commit 03926a81eb
6 changed files with 83 additions and 11 deletions
@@ -572,7 +572,24 @@ Flynn can attach synthesized voice replies (OpenAI-compatible `/v1/audio/speech`
 tts:
  enabled: true
  enabled_channels: [telegram, whatsapp, discord]  # Empty = all channels
-  provider:
+  providers:
+    - name: primary
+      type: custom                                  # openai | custom
+      endpoint: "https://tts-primary.example.com/v1/audio/speech"
+      api_key: "${PRIMARY_TTS_API_KEY}"
+      model: "gpt-4o-mini-tts"
+      voice: "alloy"
+      format: "mp3"                                 # mp3 | wav | opus
+    - name: backup
+      type: openai
+      api_key: "${OPENAI_API_KEY}"
+      model: "gpt-4o-mini-tts"
+      voice: "nova"
+      format: "opus"
+  fallback:
+    max_attempts: 2
+    failure_cooldown_ms: 60000
+  provider:                                         # Legacy single-provider shape (still supported)
    type: openai                                    # openai | custom
    endpoint: "https://api.openai.com/v1/audio/speech"
    api_key: "${OPENAI_API_KEY}"                   # Optional Bearer token
@@ -585,12 +602,18 @@ tts:
 |-------|----------|-------------|
 | `tts.enabled` | no | Enable voice reply synthesis (default: `false`) |
 | `tts.enabled_channels` | no | Channels allowed to receive voice replies (`[]` means all channels) |
+| `tts.providers[]` | no | Ordered provider chain for synthesis fallback |
+| `tts.providers[].name` | no | Provider label used for health tracking/debug logs |
 | `tts.provider.type` | no | `openai` or `custom` (default: `openai`) |
 | `tts.provider.endpoint` | no | OpenAI-compatible `/v1/audio/speech` endpoint (`openai` defaults to OpenAI API URL) |
 | `tts.provider.api_key` | no | Bearer token for authentication |
 | `tts.provider.model` | no | TTS model (default: `gpt-4o-mini-tts`) |
 | `tts.provider.voice` | no | Voice identifier (default: `alloy`) |
 | `tts.provider.format` | no | Output format: `mp3`, `wav`, `opus` (default: `mp3`) |
+| `tts.fallback.max_attempts` | no | Max providers attempted per reply before text fallback (default: `3`) |
+| `tts.fallback.failure_cooldown_ms` | no | Cooldown for providers after synthesis failures (default: `60000`) |
+
+If every attempted provider fails, Flynn sends the reply as text only, so the reply is never dropped. Providers marked unhealthy are retried once their cooldown window (`tts.fallback.failure_cooldown_ms`) has elapsed.
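
The fallback behaviour described above can be illustrated with a minimal Python sketch. It is not Flynn's implementation: `Provider`, `FallbackChain`, and the `synthesize` callables are hypothetical names, and the sketch assumes that a provider in its cooldown window is skipped without counting toward `max_attempts`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Provider:
    """Hypothetical stand-in for one entry in `tts.providers[]`."""
    name: str
    synthesize: Callable[[str], bytes]  # assumed to raise on failure
    unhealthy_until: float = 0.0        # clock value (seconds) when cooldown ends


@dataclass
class FallbackChain:
    providers: list[Provider]            # ordered: first entry is tried first
    max_attempts: int = 3                # mirrors tts.fallback.max_attempts
    failure_cooldown_ms: int = 60000     # mirrors tts.fallback.failure_cooldown_ms
    clock: Callable[[], float] = time.monotonic

    def synthesize(self, text: str) -> Optional[bytes]:
        """Return audio bytes, or None to signal a text-only reply."""
        attempts = 0
        for provider in self.providers:
            if attempts >= self.max_attempts:
                break
            if self.clock() < provider.unhealthy_until:
                # Assumption: cooling-down providers are skipped, not attempted.
                continue
            attempts += 1
            try:
                return provider.synthesize(text)
            except Exception:
                # Mark the provider unhealthy until its cooldown has elapsed.
                provider.unhealthy_until = (
                    self.clock() + self.failure_cooldown_ms / 1000
                )
        return None  # caller sends the reply as text only
```

Returning `None` instead of raising keeps the text-only path explicit for the caller, and injecting `clock` makes the cooldown behaviour easy to exercise without waiting.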

 ### Capture Tools