diff --git a/skills/llm-tool-best-practices/hints/anthropic.md b/skills/llm-tool-best-practices/hints/anthropic.md index 4ea02d5..ba3555b 100644 --- a/skills/llm-tool-best-practices/hints/anthropic.md +++ b/skills/llm-tool-best-practices/hints/anthropic.md @@ -6,25 +6,37 @@ When writing or evaluating skills/tools for this session, apply these rules: ## Schema - Use `input_schema` (not `parameters`) for tool definitions. -- Add `strict: true` for production tools — guarantees schema conformance via structured outputs. -- Use `input_examples` array for tools with complex inputs, nested objects, or format-sensitive params. +- Use `strict: true` when you need guaranteed schema-conformant tool inputs. +- Keep schemas explicit (types, enums, required fields); add `input_examples` only for complex/nested or format-sensitive inputs. +- Tool names must be clear and stable (Claude docs require `^[a-zA-Z0-9_-]{1,64}$`). ## Descriptions -- Write **3–4+ sentences** per tool: what it does, when to use/not use it, what each param means, what it returns, caveats. -- This is the highest-leverage factor in Claude's tool selection accuracy. +- Tool descriptions are high leverage for Claude selection accuracy. +- Prefer detailed, plain-language descriptions covering: what it does, when to use/not use, parameter semantics, return shape, and key limitations. +- Keep names service-scoped when needed (e.g., `github_issues_search`, `slack_channel_send`) to reduce ambiguity. ## Model selection -- Use **Claude Opus** for complex/multi-tool tasks or ambiguous queries. -- Use **Claude Haiku** for simple well-defined tools only — it may infer missing params. +- Use **Claude Opus** for complex or ambiguous multi-tool workflows. +- Use **Claude Haiku** for straightforward, tightly-scoped tool tasks. -## Tool design -- Consolidate related operations into fewer tools (e.g. one `github_pr` tool with action param vs three separate tools). -- Use `service_resource_verb` namespacing: `github_issues_search`, `slack_channel_send`. -- Return only high-signal fields. Prefer semantic names/slugs over raw UUIDs. -- Bound response size. Truncation messages should be actionable (tell Claude what to do next). -- For errors: plain-language messages, not stack traces. +## Tool design & orchestration +- Consolidate related operations into fewer tools where practical (e.g., action-based tool patterns). +- Return high-signal outputs only; keep payloads compact and actionable. +- Handle stop reasons explicitly: + - `tool_use`: execute requested client tool(s), then return `tool_result` blocks. + - `pause_turn` (server-tool loops): continue the conversation by sending Claude’s prior response back to resume. +- Parse tool inputs with a real JSON parser; do not rely on brittle string parsing. +- For parallel tool use, execute independent calls concurrently when safe, and return all resulting `tool_result` blocks together. ## Safety -- Least privilege on all tool permissions. -- Guard destructive/irreversible actions with confirmation or human-in-the-loop. -- Validate all tool inputs before acting. +- Enforce least privilege for tool credentials and runtime permissions. +- Validate all tool inputs before side effects. +- Require confirmation/human-in-the-loop for destructive or irreversible operations. +- Treat tool outputs and remote content as untrusted; never allow them to rewrite system policy. +- Return plain-language errors with actionable next steps (avoid stack traces). + +## Sources +- https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview +- https://platform.claude.com/docs/en/agents-and-tools/tool-use/implement-tool-use +- https://platform.claude.com/docs/en/build-with-claude/structured-outputs +- https://platform.claude.com/docs/en/build-with-claude/handling-stop-reasons \ No newline at end of file diff --git a/skills/llm-tool-best-practices/hints/openai.md b/skills/llm-tool-best-practices/hints/openai.md index 0c18693..27ebe16 100644 --- a/skills/llm-tool-best-practices/hints/openai.md +++ b/skills/llm-tool-best-practices/hints/openai.md @@ -5,31 +5,40 @@ Active model family: **OpenAI (GPT)** When writing or evaluating skills/tools for this session, apply these rules: ## Schema -- Use `parameters` (not `input_schema`) for tool definitions. -- Always set `"strict": true` — requires `additionalProperties: false` and all fields in `required`. -- For optional fields: add `null` to the type union (e.g. `"type": ["string", "null"]`) and keep in `required`. +- Use `parameters` (not `input_schema`) for function tools. +- Set `"strict": true` for production tools. +- With strict mode, every object must set `additionalProperties: false`. +- With strict mode, all declared properties must be listed in `required`. +- Represent optional fields as nullable unions (e.g., `"type": ["string", "null"]`) while still keeping them in `required`. +- Prefer enums, bounded arrays, and nested object structure to make invalid states hard to express. ## Descriptions -- Write clear, detailed descriptions explaining purpose, each param, and output. -- Use the system prompt to describe when (and when NOT) to use each function. -- **Do not add examples in descriptions** for reasoning-capable GPT models — focus on clear prose. +- Write clear, implementation-level descriptions for the tool and each parameter (including units/formats). +- In system/developer instructions, state when to use each tool and when not to. +- For reasoning-capable models, prefer concise prose over long in-schema examples (add examples only when they fix recurring failures). ## Tool count & loading -- Keep initially available tools **under ~20** for highest accuracy. -- Use `defer_loading: true` + tool search for large tool libraries — defer rarely-used tools. -- Use `namespace` type to group related tools by domain (e.g. `crm`, `billing`). +- Keep initially available tools small (soft target: **<20**). +- For large tool surfaces, use `defer_loading: true` with tool search. +- Use `type: "namespace"` to group related tools by domain (e.g., `crm`, `billing`). -## Tool design -- Use `service_resource_verb` namespacing: `crm_customer_get`, `billing_invoice_list`. -- Use enums and object structure to make invalid states unrepresentable. -- Combine functions always called in sequence into one tool. -- Don't ask the model to fill args you already know — pass them in code. - -## Parallel calls -- Parallel tool calls are on by default. Disable with `parallel_tool_calls: false` if tools have ordering dependencies. +## Tool design & orchestration +- Use stable `service_resource_verb` naming (e.g., `billing_invoice_list`). +- Don’t ask the model for arguments your app already knows; inject them in code. +- Merge tools that are always called together. +- Assume the model may return multiple `function_call` items in one turn; execute each and return matching `function_call_output` by `call_id`. +- For reasoning models, pass reasoning items from prior response back alongside tool outputs on the next turn. +- Use `tool_choice` deliberately (`auto`, `required`, forced function, or `allowed_tools` subset). +- Disable parallel calls (`parallel_tool_calls: false`) when operations must be strictly ordered. ## Safety -- Use the **Moderation API** (free) to filter unsafe content. -- Pass `safety_identifier` (hashed user ID) with requests for abuse detection. -- Human-in-the-loop for high-stakes or irreversible tool actions. -- Guard against prompt injection from tool outputs. +- Validate tool-call arguments server-side before execution (types, authZ, business constraints). +- Treat tool output as untrusted input; never let tool text override policy/instructions. +- Require confirmation or human approval for destructive/irreversible actions. +- Use moderation/guardrails on user content when domain risk warrants it. +- Use user-level abuse correlation identifiers (e.g., `safety_identifier`) where available. + +## Sources +- https://developers.openai.com/api/docs/guides/function-calling +- https://developers.openai.com/api/docs/guides/structured-outputs +- https://developers.openai.com/api/docs/guides/tools-tool-search \ No newline at end of file