docs: document native audio support across README, CHANGELOG, config, and planning docs

- README: add audio.transcribe to tool list, update media pipeline description,
  add Native Audio Support and Audio Transcription config sections, add
  supports_audio per-tier override example
- SOUL.md: add audio.transcribe to available tools list
- CHANGELOG: add native audio support and audio.transcribe tool entries
- config/default.yaml: add commented audio config section, supports_audio hint
- INTEGRATIONS.md: expand audio section with native passthrough, capabilities,
  smart routing, AudioSource type, token estimation, audio.transcribe tool
- STRUCTURE.md: add capabilities.ts and audio-transcribe.ts to key file listings
- ARCHITECTURE.md: update data flow step 5 to describe smart audio routing
William Valentin
2026-02-11 18:41:53 -08:00
parent 819ac26b3b
commit 5c531a760d
7 changed files with 87 additions and 8 deletions
+12
@@ -39,6 +39,7 @@ models:
  default:
    provider: anthropic
    model: claude-sonnet-4-20250514
    # supports_audio: false  # Override native audio detection per tier
  local:
    provider: ollama
    model: glm-4.7-flash
@@ -117,3 +118,14 @@ hooks:
# peer: "123456789"
# failure_threshold: 2
# disk_threshold_mb: 100
# ── Audio ────────────────────────────────────────────────────────────
# Configure a Whisper-compatible endpoint for audio transcription.
# Models that support native audio input (Gemini, OpenAI, GitHub) will
# receive raw audio directly; others fall back to this endpoint.
# audio:
#   transcription_endpoint: "http://localhost:8080/v1/audio/transcriptions"
#   transcription_api_key: "${WHISPER_API_KEY}"
#   transcription_model: "whisper-1"
#   transcription_provider: "openai"
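
The routing rule described in the config comments above (native passthrough when the model supports audio, otherwise fall back to the Whisper-compatible endpoint) could be sketched roughly as follows. This is not the code from capabilities.ts or audio-transcribe.ts. The type names, the `routeAudio` and `supportsNativeAudio` functions, the provider identifier strings, and the `whisper-1` default are all assumptions made for illustration. Only `supports_audio` and the `transcription_*` keys come from the config.

```typescript
// Hypothetical shape of one model tier; only `supports_audio` is taken from the config.
type TierConfig = {
  provider: string;
  model: string;
  supports_audio?: boolean; // per-tier override of native audio detection
};

// Mirrors the commented `audio:` section of the config.
type AudioConfig = {
  transcription_endpoint?: string;
  transcription_api_key?: string;
  transcription_model?: string;
  transcription_provider?: string;
};

type AudioRoute =
  | { kind: "native" }
  | { kind: "transcribe"; endpoint: string; model: string }
  | { kind: "unsupported" };

// Assumption: provider ids for the providers the config comment names
// as accepting raw audio (Gemini, OpenAI, GitHub).
const NATIVE_AUDIO_PROVIDERS = new Set(["gemini", "openai", "github"]);

function supportsNativeAudio(tier: TierConfig): boolean {
  // An explicit per-tier override wins over provider-based detection.
  if (tier.supports_audio !== undefined) return tier.supports_audio;
  return NATIVE_AUDIO_PROVIDERS.has(tier.provider);
}

function routeAudio(tier: TierConfig, audio?: AudioConfig): AudioRoute {
  if (supportsNativeAudio(tier)) return { kind: "native" };
  if (audio?.transcription_endpoint) {
    return {
      kind: "transcribe",
      endpoint: audio.transcription_endpoint,
      // Assumed default when no transcription model is configured.
      model: audio.transcription_model ?? "whisper-1",
    };
  }
  // No native support and no endpoint configured: audio cannot be handled.
  return { kind: "unsupported" };
}

const route = routeAudio(
  { provider: "anthropic", model: "claude-sonnet-4-20250514" },
  { transcription_endpoint: "http://localhost:8080/v1/audio/transcriptions" },
);
console.log(route.kind);
```

Under these assumptions, the `anthropic` tier from the diff falls back to transcription, and setting `supports_audio: false` on an otherwise capable tier forces the same fallback.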