- ollama.ts: add normalizeMessagesForOllama() converting Anthropic-style
tool_use/tool_result blocks to Ollama's native tool_calls + role:tool format
- llamacpp.ts: add normalizeMessagesForLlamaCpp() with hybrid approach —
assistant tool_calls in native format, but tool results as structured user
messages (many GGUF templates silently drop role:tool messages)
- llamacpp.ts: add configurable requestTimeout with AbortController (default 3min)
- Both use fast-path when no tool blocks are present (zero overhead)
- Full test coverage for both normalizers: plain text passthrough, tool_use
conversion, tool_result mapping, multi-tool round trips, error results
- Ollama: pass tools to API, parse tool_calls responses, handle thinking field from reasoning models (deepseek-r1, glm-4.7-flash)
- llama.cpp: pass tools via OpenAI-compatible endpoint, parse tool_calls, accumulate streaming tool call deltas
- Both clients now set stopReason to 'tool_use' when tool calls are present
- Tests: 12 new tests (8 Ollama + 5 llama.cpp, total 983→995)