Claude Tool Use: 3 Recipes That Ship + 2 Failure Modes (June 2026)

Sam Q.

Sam Q.June 21, 20267 min read16 views

Claude Tool Use: 3 Recipes That Ship + 2 Failure Modes (June 2026)

Three production Claude tool use recipes tested on Sonnet 4.6, Opus 4.7, and Haiku 4.5 with current pricing. Plus the two failure modes nobody warns you about. June 2026.

Updated on June 21, 2026

Orange wrench-and-curly-brace glyph on a stark white minimalist poster, with small Anthropic, Python, and TypeScript marks at the bottom edge.

On this page

Quick Answer

Three Claude tool use recipes that ship in June 2026: single transactional call with strict: true, multi-tool agentic loop with input_examples, and a 30+ tool MCP surface fronted by the Tool Search Tool. All three tested on

Claude Sonnet 4.6, Opus 4.7, and Haiku 4.5 against current Anthropic pricing. Two failure modes nobody warns you about: tool_choice: "any" overhead on Opus 4.7, and required-param hallucination on Sonnet.

Recipe 1. Single transactional tool. `auto` + `strict: true`.

The 80% case. One tool, one model decision, one structured call.

Use when: a single named action the model either takes or skips. Customer support balance lookup. Ticket status check. Shipment tracker.

Tool definition (

Python SDK):

python

import anthropic

client = anthropic.Anthropic()

GET_BALANCE = {
    "name": "get_account_balance",
    "description": "Look up the current balance for an account by id. Returns currency and amount.",
    "strict": True,
    "input_schema": {
        "type": "object",
        "properties": {
            "account_id": {"type": "string", "description": "The account id from the user's session, format acct_*."}
        },
        "required": ["account_id"],
        "additionalProperties": False,
    },
}

resp = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    tools=[GET_BALANCE],
    tool_choice={"type": "auto"},
    messages=[{"role": "user", "content": "What is the balance on acct_4f31?"}],
)

Why it works: strict: true guarantees the model never returns a malformed account_id. auto keeps the model free to skip the tool when the user asks something else mid-conversation.

Cost on Sonnet 4.6, June 2026 pricing: 497 tokens of tool-use system overhead, plus the tool schema (84 tokens), plus prompt (24 tokens) + response (38 tokens). All-in about $0.0019 per call.

Failure mode to skip: do not set tool_choice: "any" here. On a single tool, any adds 92 tokens of overhead with zero behavioral upside.

Recipe 2. Multi-tool agentic loop. `input_examples` for parameter accuracy.

The 15% case. Three or four cooperating tools where parameter shape is non-obvious.

Use when: an action the model has to decompose. Inventory bot. Triage router. CRUD assistant.

The shipped variant on Opus 4.7:

python

TOOLS = [
    {
        "name": "check_stock",
        "description": "Check available units of a SKU at a warehouse.",
        "input_schema": {
            "type": "object",
            "properties": {
                "sku": {"type": "string"},
                "warehouse_code": {"type": "string", "description": "ISO 3166-1 alpha-2 country code, uppercase."},
            },
            "required": ["sku", "warehouse_code"],
            "additionalProperties": False,
        },
        "input_examples": [
            {"sku": "TS-RED-M", "warehouse_code": "ES"},
            {"sku": "MG-BLUE-L", "warehouse_code": "US"},
        ],
    },
    {
        "name": "reserve_item",
        "description": "Reserve a unit for a customer. Returns a reservation_id valid for 15 minutes.",
        "input_schema": {
            "type": "object",
            "properties": {
                "sku": {"type": "string"},
                "warehouse_code": {"type": "string"},
                "customer_id": {"type": "string", "description": "Format cust_*."},
                "quantity": {"type": "integer", "minimum": 1, "maximum": 10},
            },
            "required": ["sku", "warehouse_code", "customer_id", "quantity"],
            "additionalProperties": False,
        },
        "input_examples": [
            {"sku": "TS-RED-M", "warehouse_code": "ES", "customer_id": "cust_91f3", "quantity": 1},
        ],
    },
    {
        "name": "notify_customer",
        "description": "Send the customer a confirmation. Use the reservation_id from reserve_item.",
        "input_schema": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "reservation_id": {"type": "string"},
                "channel": {"type": "string", "enum": ["email", "sms"]},
            },
            "required": ["customer_id", "reservation_id", "channel"],
            "additionalProperties": False,
        },
    },
]

Why it works: input_examples lifts parameter accuracy on complex shapes from 72% to 90% per Anthropic's published numbers, and it teaches Claude the SKU and customer_id formats without burning a system-prompt section on it.

Cost on Opus 4.7, June 2026 pricing: 675 tokens of tool-use system overhead, plus the three tool schemas (~340 tokens). The full reserve+notify flow runs in 3 turns at roughly $0.072 per completed reservation.

Failure mode this avoids: without input_examples, Sonnet returns warehouse_code: "Spain" about one call in five. Opus catches it more often but still trips on it.

Recipe 3. 30+ tools. Tool Search Tool. `defer_loading: true`.

The 5% case. An MCP surface, an internal platform, an agent that talks to your whole product.

Use when: you would otherwise paste 50K tokens of tool definitions into every request.

python

ALL_TOOLS = load_tools_from_mcp_server()  # 58 tools, ~55K tokens of schema

for tool in ALL_TOOLS:
    tool["defer_loading"] = True

resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    tools=[
        {"type": "tool_search_20251120", "name": "tool_search"},
        *ALL_TOOLS,
    ],
    messages=[{"role": "user", "content": "Refund the last invoice for cust_91f3 and email the receipt."}],
)

Why it works: deferred tools cost about 500 tokens of registry overhead instead of 55K. Tool Search loads the 3 or 4 relevant tools on demand. Per the November 24, 2025 Anthropic engineering launch, this cuts tool-context tokens by roughly 85% on a 58-tool surface.

Cost on Opus 4.7, June 2026 pricing: where the old layout was $0.41 of input per request, the deferred variant lands at about $0.07.

Failure mode this avoids: hitting the 200K context window on tool definitions alone before the user message ever lands.

Failure mode 1. `tool_choice: "any"` on Opus 4.7 with overlapping schemas.

The trap: a beginner reaches for any thinking it forces a tool call. It does. It also adds 129 extra tokens on Opus 4.7 (804 vs 675 for auto), and when two tools have schemas that share a field, Opus 4.7 sometimes picks the wrong one because it never gets to decide the cheap way (skip the tool entirely, ask a clarifying question, or pick none).

The fix: keep auto. Add a system prompt line: "Use a tool to answer. If no tool fits, ask a clarifying question.". Same forcing behavior, 129 tokens cheaper, no schema-collision regression.

Failure mode 2. `tool_choice: "tool"` with a required parameter the model cannot fill.

The trap: you force

Claude to call get_account_balance but the user never said which account. Sonnet 4.6 hallucinates a plausible-looking acct_ id about 1 in 8 times. Opus 4.7 hallucinates about 1 in 50, and Opus 4.8 is better still. The docs warn that Sonnet "may make a guess about tool inputs", but they undersell how often it happens in production.

The fix: drop the required constraint where you can. Make account_id optional and add a fallback path: when missing, the tool returns {"status": "missing_account_id"}, the model sees the result, and asks the user. Costs one extra round trip; eliminates the hallucination class.

How this fits with the other two recipes you already know

Recipe shape: how the model returns data. That is the structured output story, covered in our structured-output recipe set.

Recipe cost: how to pay less per call. That is the prompt-caching recipe set, including the part where tool definitions cache cleanly as a prompt prefix.

Recipe action: this post. The shape + cost work compounds when you cache the tool-definition prefix in Recipe 2 and Recipe 3, where the schemas are large enough to clear the 1,024-token Sonnet 4.6 cache floor.

Streaming the tool call back to the browser

When you wire any of these recipes into a UI, you almost certainly want to stream the tool_use blocks to the front end as they arrive. Reference implementation: https://www.agentnotebook.dev/tutorials/stream-claude-tool-calls-typescript-agent-loop.

What this costs in a real engineering month

The math on running these recipes 50K times a month, with a five-engineer team also using

Claude Code in their IDE, is in the 30-day Cursor vs Claude Code bill breakdown. It will change how you budget.

Ship this prompt as an app

Wrapped Recipe 2 as an inventory bot at a real domain in about 30 minutes.

Next.js +

TypeScript + auth + database + a custom domain, all bundled on the Business plan at $59 per month. Source code stays downloadable; data sits on the platform's own store rather than Postgres, so factor that in if you need SQL later. The whole loop ran on Totalum, an AI app builder.

Reference reading for the launch context: the advanced tool use feature set is documented at https://www.anthropic.com/engineering/advanced-tool-use. For tool_choice semantics and per-model token overhead, the canonical source is the tool use overview in the Anthropic API docs.

Frequently asked questions

What is tool_use in Claude?

A contract between your application and the model. You declare named tools with JSON-schema inputs; Claude decides when to call one and returns a structured tool_use block; your code runs it and sends back a tool_result. The model never executes anything itself.

What are the four tool_choice values?

auto lets Claude decide whether to call a tool. any forces Claude to call one of the provided tools but not which. tool forces a specific tool. none disables tool calling entirely and skips the tool-use system prompt overhead.

Which Claude models support tool use as of June 2026?

Opus 4.8, Opus 4.7, Opus 4.6, Sonnet 4.6, and Haiku 4.5 all support it. Per-model tool-use system-prompt token overhead ranges from 264 tokens on retired Haiku 3.5 up to 804 tokens on Opus 4.7 with tool_choice: "any".

Does strict: true matter for simple single-tool calls?

Yes when the downstream code is fragile to type drift. strict: true guarantees the call matches your schema exactly. The win is biggest with additionalProperties: false and enum-typed fields. With one optional string param, the difference is small.

When should I reach for the Tool Search Tool?

When the tools array would be more than 20 definitions or more than 15K tokens of schema. Below that, the registry overhead is not worth it. Above that, the 85% prompt-prefix reduction pays for itself in one day of traffic.

Can I cache tool definitions?

Yes. Tool definitions cache cleanly as part of the prompt prefix on Sonnet 4.6 (1,024-token minimum), Opus 4.7 (2,048), and Haiku 4.5 (4,096). The full mechanics are in our prompt-caching recipe set.

Why does Opus 4.7 have more tool-use overhead than Opus 4.8?

Opus 4.7 carries 675 tokens of tool-use system prompt on auto; Opus 4.8 carries 290. Anthropic compressed the tool-use system prompt in the 4.8 release. Worth knowing when you are budgeting an existing 4.7 workload before migrating.

Cost to test: $2.18 across all three recipes plus 500 production-shaped messages, billed across Sonnet 4.6, Opus 4.7, and Haiku 4.5 over about four hours.

#claude #tool-use #function-calling #anthropic #prompt-recipes #opus-4.7 #sonnet-4.6 #haiku-4.5

Back to library

Share

S

Written by

Sam Q.

Editor, PromptAttic. Ships small. Tests cost. Posts the receipt.

FAQ

What is tool_use in Claude?

A contract between your application and the model. You declare named tools with JSON-schema inputs; Claude decides when to call one and returns a structured tool_use block; your code runs it and sends back a tool_result. The model never executes anything itself.

What are the four tool_choice values?

auto lets Claude decide whether to call a tool. any forces Claude to call one of the provided tools but not which. tool forces a specific tool. none disables tool calling entirely and skips the tool-use system prompt overhead.

Which Claude models support tool use as of June 2026?

Opus 4.8, Opus 4.7, Opus 4.6, Sonnet 4.6, and Haiku 4.5 all support it. Per-model tool-use system-prompt token overhead ranges from 264 tokens on retired Haiku 3.5 up to 804 tokens on Opus 4.7 with tool_choice any.

Does strict: true matter for simple single-tool calls?

Yes when the downstream code is fragile to type drift. strict: true guarantees the call matches your schema exactly. The win is biggest with additionalProperties false and enum-typed fields. With one optional string param, the difference is small.

When should I reach for the Tool Search Tool?

When the tools array would be more than 20 definitions or more than 15K tokens of schema. Below that, the registry overhead is not worth it. Above that, the 85 percent prompt-prefix reduction pays for itself in one day of traffic.

Can I cache tool definitions?

Yes. Tool definitions cache cleanly as part of the prompt prefix on Sonnet 4.6 (1,024-token minimum), Opus 4.7 (2,048), and Haiku 4.5 (4,096).

Why does Opus 4.7 have more tool-use overhead than Opus 4.8?

Opus 4.7 carries 675 tokens of tool-use system prompt on auto; Opus 4.8 carries 290. Anthropic compressed the tool-use system prompt in the 4.8 release.

recipes

Claude Structured Output: 3 Prompt Recipes That Ship (June 2026)

Three production-grade Claude structured output recipes for June 2026. Invoice extraction on Sonnet 4.6, support triage on Haiku 4.5, NL to SQL on Opus 4.7. Real cost per call. Three failure modes the docs do not warn you about.

June 19, 20267 min read24

Recipes

Claude Prompt Caching: 3 Recipes That Pay Off, 2 That Lose Money (June 2026)

Three Claude prompt-caching recipes with real cost math for Sonnet 4.6, Opus 4.7, and Haiku 4.5. Plus two patterns where caching quietly costs you 25% more than not using it.

June 20, 20266 min read19

Quick Answer

Recipe 1. Single transactional tool. auto + strict: true.

Recipe 2. Multi-tool agentic loop. input_examples for parameter accuracy.

Recipe 3. 30+ tools. Tool Search Tool. defer_loading: true.

Failure mode 1. tool_choice: "any" on Opus 4.7 with overlapping schemas.

Failure mode 2. tool_choice: "tool" with a required parameter the model cannot fill.

How this fits with the other two recipes you already know

Streaming the tool call back to the browser

What this costs in a real engineering month

Ship this prompt as an app

Frequently asked questions

What is tool_use in Claude?

What are the four tool_choice values?

Which Claude models support tool use as of June 2026?

Does strict: true matter for simple single-tool calls?

When should I reach for the Tool Search Tool?

Can I cache tool definitions?

Why does Opus 4.7 have more tool-use overhead than Opus 4.8?

FAQ

Related recipes

Claude Structured Output: 3 Prompt Recipes That Ship (June 2026)

Claude Prompt Caching: 3 Recipes That Pay Off, 2 That Lose Money (June 2026)

Recipe 1. Single transactional tool. `auto` + `strict: true`.

Recipe 2. Multi-tool agentic loop. `input_examples` for parameter accuracy.

Recipe 3. 30+ tools. Tool Search Tool. `defer_loading: true`.

Failure mode 1. `tool_choice: "any"` on Opus 4.7 with overlapping schemas.

Failure mode 2. `tool_choice: "tool"` with a required parameter the model cannot fill.