feat: implement tier-a4 tts voice output replies

2026-02-18 10:22:28 -08:00
parent 3eb07875f1
commit a71aa5992d
11 changed files with 482 additions and 4 deletions
@@ -419,6 +419,34 @@ docker run -d \
 # docker compose up -d
 ```

+### Text-to-Speech (TTS) Reply Audio
+
+Flynn can attach synthesized voice replies to its text responses. Audio is generated by calling an OpenAI-compatible `/v1/audio/speech` endpoint.
+
+```yaml
+tts:
+  enabled: true
+  enabled_channels: [telegram, whatsapp, discord]  # Empty = all channels
+  provider:
+    type: openai                                    # openai | custom
+    endpoint: "https://api.openai.com/v1/audio/speech"
+    api_key: "${OPENAI_API_KEY}"                   # Optional Bearer token
+    model: "gpt-4o-mini-tts"
+    voice: "alloy"
+    format: "mp3"                                  # mp3 | wav | opus
+```
+
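As a rough illustration of how the provider settings above might map onto a request, the sketch below builds (but does not send) a POST to an OpenAI-compatible `/v1/audio/speech` endpoint. The helper name `build_speech_request` is hypothetical and not part of Flynn. The JSON field names (`model`, `voice`, `response_format`, `input`) follow the OpenAI speech API, and mapping `format` to `response_format` is an assumption about how Flynn translates its config.

```python
import json
import urllib.request

# Hypothetical helper, not Flynn's actual code: turns the `tts.provider`
# mapping into an HTTP request for an OpenAI-compatible speech endpoint.
def build_speech_request(provider: dict, text: str) -> urllib.request.Request:
    headers = {"Content-Type": "application/json"}
    api_key = provider.get("api_key")
    if api_key:  # api_key is optional; add a Bearer token only when it is set
        headers["Authorization"] = f"Bearer {api_key}"
    body = {
        # Fallback values mirror the defaults listed in the README table.
        "model": provider.get("model", "gpt-4o-mini-tts"),
        "voice": provider.get("voice", "alloy"),
        "response_format": provider.get("format", "mp3"),
        "input": text,
    }
    return urllib.request.Request(
        provider.get("endpoint", "https://api.openai.com/v1/audio/speech"),
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would return the audio bytes in the requested format. Error handling and timeouts are left out of this sketch.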
+| Field | Required | Description |
+|-------|----------|-------------|
+| `tts.enabled` | no | Enable voice reply synthesis (default: `false`) |
+| `tts.enabled_channels` | no | Channels allowed to receive voice replies (`[]` means all channels) |
+| `tts.provider.type` | no | `openai` or `custom` (default: `openai`) |
+| `tts.provider.endpoint` | no | URL of an OpenAI-compatible `/v1/audio/speech` endpoint (defaults to the OpenAI API URL when `type` is `openai`) |
+| `tts.provider.api_key` | no | Bearer token sent for authentication; omit if the endpoint needs none |
+| `tts.provider.model` | no | TTS model (default: `gpt-4o-mini-tts`) |
+| `tts.provider.voice` | no | Voice identifier (default: `alloy`) |
+| `tts.provider.format` | no | Output format: `mp3`, `wav`, `opus` (default: `mp3`) |
+
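The channel rule in the table (`tts.enabled` must be true, and an empty `tts.enabled_channels` list means every channel) can be expressed as a small check. This is a minimal sketch: `voice_reply_allowed` is a hypothetical name, and treating a missing `enabled_channels` key the same as an empty list is an assumption.

```python
# Hypothetical helper, not Flynn's actual code: decides whether a channel
# may receive a voice reply, given the parsed `tts` config mapping.
def voice_reply_allowed(tts: dict, channel: str) -> bool:
    if not tts.get("enabled", False):  # `tts.enabled` defaults to false
        return False
    # Assumption: a missing or null list behaves like an empty list.
    channels = tts.get("enabled_channels") or []
    # An empty list means all channels are allowed.
    return not channels or channel in channels
```

For example, with `enabled_channels: [telegram, whatsapp, discord]`, the check passes for `telegram` and fails for any channel not in the list.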
 ### Capture Tools

 Flynn includes host capture tools: