docs(references): add Anthropic + OpenAI official best practices

- anthropic-prompt-caching.md: KV cache mechanics, TTLs, pricing, auto vs explicit - openai-prompt-caching.md: automatic caching, in-memory vs 24h retention, prompt_cache_key - anthropic-prompting-best-practices.md: clear instructions, XML tags, few-shot, model-specific notes - openai-prompting-best-practices.md: message roles, optimization framework, structured outputs, model selection Key findings: - Anthropic caching: only for Claude models, 5m default TTL, 1h optional, 10% cost for reads - OpenAI caching: automatic/free, 5-10min default, 24h extended for GPT-5+ - GLM/ZAI models: neither caching mechanism applies - Subagent model routing table added to openai-prompting-best-practices.md
2026-03-05 20:34:38 +00:00
parent c2fe8155e3
commit 79e61f4528
4 changed files with 408 additions and 0 deletions
@@ -0,0 +1,124 @@
+# Anthropic — Prompting Best Practices (Claude-specific)
+
+**Source**: https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/claude-prompting-best-practices
+**Fetched**: 2026-03-05
+**Applies to**: Claude Opus 4.6, Sonnet 4.6, Haiku 4.5
+
+---
+
+## General Principles
+
+### Be Clear and Direct
+- Claude responds well to clear, explicit instructions.
+- Think of Claude as a brilliant but new employee — explain your norms explicitly.
+- **Golden rule**: If a colleague with minimal context would be confused by the prompt, Claude will be too.
+- Be specific about output format and constraints.
+- Use numbered lists or bullets when order/completeness matters.
+
+### Add Context and Motivation
+- Explaining *why* an instruction exists helps Claude generalize correctly.
+  - Bad: `NEVER use ellipses`
+  - Better: `Your response will be read aloud by TTS, so never use ellipses since TTS won't know how to pronounce them.`
+
+### Use Examples (Few-shot / Multishot)
+- Examples are one of the most reliable ways to steer format, tone, and structure.
+- 3-5 examples for best results.
+- Wrap examples in `<example>` / `<examples>` tags.
+- Make examples relevant, diverse (cover edge cases), and clearly structured.
+
+### Structure Prompts with XML Tags
+- Use XML tags to separate instructions, context, examples, and variable inputs.
+- Reduces misinterpretation in complex prompts.
+- Consistent, descriptive tag names (e.g., `<instructions>`, `<context>`, `<input>`).
+- Nest tags for hierarchy (e.g., `<documents><document index="1">...</document></documents>`).
+
+### Give Claude a Role
+- Setting a role in the system prompt focuses behavior and tone.
+- Even one sentence: `You are a helpful coding assistant specializing in Python.`
+
+### Long Context Prompting (20K+ tokens)
+- **Put longform data at the top**: Documents and inputs above queries/instructions. Can improve quality up to 30%.
+- **Use XML for document metadata**: Wrap each doc in `<document>` tags with `<source>` and `<document_content>`.
+- **Ground responses in quotes**: Ask Claude to quote relevant sections before analyzing — cuts through noise.
+
+---
+
+## Output and Formatting
+
+### Communication Style (Latest Models — Opus 4.6, Sonnet 4.6)
+- More direct and concise than older models.
+- May skip detailed summaries after tool calls (jumps directly to next action).
+- If you want visibility: `After completing a task with tool use, provide a quick summary of the work.`
+
+### Control Response Format
+1. **Tell Claude what to do, not what not to do**
+   - Not: `Do not use markdown` → Better: `Your response should be flowing prose paragraphs.`
+2. **Use XML format indicators**
+   - `Write the prose sections in <smoothly_flowing_prose_paragraphs> tags.`
+3. **Match prompt style to desired output style**
+   - If you want no markdown, use no markdown in your prompt.
+4. **For code generation output**: Ask for specific structure, include "Go beyond the basics to create a fully-featured implementation."
+
+### Minimize Markdown (when needed)
+Put in system prompt:
+```xml
+<avoid_excessive_markdown>
+Write in clear, flowing prose using complete paragraphs. Reserve markdown for inline code, code blocks, and simple headings (##, ###). Avoid **bold** and *italics*. Do NOT use ordered/unordered lists unless presenting truly discrete items or the user explicitly asks. Incorporate items naturally into sentences instead.
+</avoid_excessive_markdown>
+```
+
+### LaTeX
+Claude Opus 4.6 defaults to LaTeX for math. To disable:
+```
+Format in plain text only. Do not use LaTeX, MathJax, or any markup. Write math with standard text (/, *, ^).
+```
+
+---
+
+## Tool Use and Agentic Systems
+
+### Tool Definition Best Practices
+- Provide clear, detailed descriptions for each tool and parameter.
+- Specify exactly when the tool should and should not be used.
+- Clarify which parameters are required vs. optional.
+- Give examples of correct tool call patterns.
+
+### Agentic System Prompts
+- Include explicit instructions for error handling, ambiguity, and when to ask for clarification.
+- Specify how to handle tool call failures.
+- Define scope boundaries — what the agent should and should not attempt.
+
+---
+
+## Model-Specific Notes
+
+### Claude Opus 4.6
+- Best for: Complex multi-step reasoning, nuanced analysis, sophisticated code generation.
+- Defaults to LaTeX for math (disable if needed).
+- Most verbose by default — may need to prompt for conciseness.
+
+### Claude Sonnet 4.6
+- Best for: General-purpose tasks, coding, analysis with good speed/quality balance.
+- More concise than Opus by default.
+- Strong instruction following.
+
+### Claude Haiku 4.5
+- Best for: Simple tasks, quick responses, high-volume low-stakes workloads.
+- Fastest and cheapest Claude model.
+- May need tighter prompts for complex formatting.
+- Use `<example>` tags more liberally to guide behavior.
+
+---
+
+## Relevance for Our Subagent Prompts
+
+### For Claude models (Haiku/Sonnet as advisors)
+- Always include a role statement.
+- Use XML tags to separate role, instructions, context, and topic.
+- Use examples wrapped in `<example>` tags for structured output formats.
+- Static instructions → early in prompt. Variable topic → at the end.
+
+### For cheaper models (GLM-4.7, see separate reference)
+- Need even tighter prompting — be more explicit.
+- Structured output schemas > open-ended generation.
+- Constrained output length.