Prompt Chaining: 5 Recipes That Ship (2026)
Five paste-ready prompt chaining recipes for 2026: extract-then-reason, draft-critique-rewrite, route, map-reduce, and generate-then-gate, each with its failure mode.

Quick answer
Prompt chaining is splitting one job across two or more prompts, where each prompt's output feeds the next input. In 2026 it buys you accuracy and control on multi-step tasks, at the cost of latency, tokens, and more places to fail. Below are five paste-ready chains, each with the failure mode nobody warns you about, plus a section on when a single prompt wins.
Both Anthropic's chain-prompts guidance and OpenAI's prompting guide say the same thing: when a task has distinct steps, give each step its own prompt. The community reference promptingguide.ai frames it as trading one opaque call for a pipeline you can inspect. The recipes below are that pipeline. Terse, paste-ready.
Recipe 1: Extract, then reason
Never make one prompt extract and decide at once.
Prompt A pulls structured facts. Prompt B reasons over only those facts.
[Prompt A] Extract every date, amount, and party from the contract below as JSON. No commentary.
[Prompt B] Given this JSON, list every payment due in the next 30 days.
Why it works: a model that reasons while reading tends to invent facts. Split it. Prompt B never sees the raw text, so it cannot misquote the contract; everything it reasons over is in JSON you can inspect.
Fails when: Prompt A drops a field B needs. Garbage in, garbage out, and B sounds confident about it. Test A alone first.
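A minimal Python sketch of this split, assuming a hypothetical `llm` callable that takes a prompt string and returns the model's text (swap in whatever client you use). The key names `dates`, `amounts`, and `parties` are an assumed schema, not something the recipe prescribes. The check between the hops makes a dropped field fail loudly instead of letting Prompt B reason over a hole.

```python
import json
from typing import Callable

# Hypothetical interface: any function that sends one prompt and returns text.
LLM = Callable[[str], str]

# Assumed schema for Prompt A's output; adjust to the fields Prompt B needs.
REQUIRED_KEYS = {"dates", "amounts", "parties"}


def extract_then_reason(contract: str, llm: LLM) -> str:
    # Hop A: extraction only, with the raw contract attached.
    raw = llm(
        "Extract every date, amount, and party from the contract below as JSON "
        "with the keys dates, amounts, and parties. No commentary.\n\n" + contract
    )
    facts = json.loads(raw)  # raises ValueError if A returned non-JSON
    if not isinstance(facts, dict):
        raise ValueError("Prompt A did not return a JSON object")
    missing = REQUIRED_KEYS - set(facts)
    if missing:
        raise ValueError(f"Prompt A dropped fields: {sorted(missing)}")

    # Hop B: receives the JSON and nothing else, never the raw contract.
    return llm(
        "Given this JSON, list every payment due in the next 30 days.\n\n"
        + json.dumps(facts)
    )
```

Because `llm` is just a function, Prompt A can be tested alone by calling it with a fixed contract and asserting on the parsed JSON before B is ever wired in.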
Recipe 2: Draft, critique, rewrite
Three short prompts beat one long instruction.
[Prompt A] Write the release note. 120 words.
[Prompt B] You are a critic. List 3 concrete problems with this note. No praise.
[Prompt C] Rewrite the note fixing exactly these 3 problems.
Why it works: a fresh context critiques harder than the writer. The model will not defend a draft it does not remember writing. This is the meta-prompting move aimed at your own output; there are more of these in the meta prompting recipes.
Fails when: the critic invents nitpicks to look useful. Cap it at 3 and force the word "concrete."
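One way to wire the three hops together, sketched in Python. The `llm` callable is a hypothetical stand-in for your model client, and the "one problem per line" instruction plus the line-based parsing are assumptions added so the cap of 3 can be enforced in code rather than trusted to the critic.

```python
import re
from typing import Callable

# Hypothetical interface: any function that sends one prompt and returns text.
LLM = Callable[[str], str]


def draft_critique_rewrite(brief: str, llm: LLM, max_problems: int = 3) -> str:
    # Hop A: write the draft from the brief.
    draft = llm(f"Write the release note. 120 words.\n\n{brief}")

    # Hop B: a fresh call that sees only the draft, not the brief.
    critique = llm(
        f"You are a critic. List {max_problems} concrete problems with this note. "
        f"No praise. One problem per line.\n\n{draft}"
    )

    # Strip list markers like "1." or "-" and enforce the cap in code.
    problems = [
        re.sub(r"^\s*(?:[-*]|\d+[.)])\s*", "", line).strip()
        for line in critique.splitlines()
        if line.strip()
    ][:max_problems]
    if not problems:
        return draft  # nothing to fix, keep the draft as-is

    # Hop C: rewrite against exactly the problems that survived the cap.
    numbered = "\n".join(f"{i}. {p}" for i, p in enumerate(problems, 1))
    return llm(
        f"Rewrite the note fixing exactly these {len(problems)} problems.\n\n"
        f"Problems:\n{numbered}\n\nNote:\n{draft}"
    )
```

Truncating in code means a critic that returns five nitpicks still only gets three of them acted on.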
Recipe 3: Route, then answer
Classify first. Dispatch to a specialized prompt second.
[Prompt A] Classify this ticket: BILLING, BUG, or HOWTO. One word only.
[then] send it to the prompt written for that class.
Why it works: one giant do-everything prompt is worse at all three jobs than three small prompts are at one each. Routing keeps every instruction short and testable. It also pairs cleanly with tool calls, covered in the tool-use recipes.
Fails when: the classifier returns "BILLING/BUG" or a whole sentence. Constrain output to the exact enum and reject anything else.
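A rough Python sketch of the router with the enum check in place. The `llm` callable and the wording of the three specialist prompts in `SPECIALISTS` are hypothetical placeholders; the single retry and the whitespace and case normalization are also assumptions, and you may prefer to reject on the first bad label.

```python
from typing import Callable

# Hypothetical interface: any function that sends one prompt and returns text.
LLM = Callable[[str], str]

LABELS = ("BILLING", "BUG", "HOWTO")

# Placeholder specialist prompts; write real ones per class.
SPECIALISTS = {
    "BILLING": "You handle billing tickets. Answer using the billing policy.",
    "BUG": "You handle bug reports. Ask for repro steps or give a workaround.",
    "HOWTO": "You handle how-to questions. Answer in numbered steps.",
}


def classify(ticket: str, llm: LLM, retries: int = 1) -> str:
    prompt = (
        "Classify this ticket: BILLING, BUG, or HOWTO. One word only.\n\n" + ticket
    )
    for _ in range(retries + 1):
        # Normalize whitespace and case only; "BILLING/BUG" or a sentence is rejected.
        label = llm(prompt).strip().upper()
        if label in LABELS:
            return label
    raise ValueError("classifier never returned one of " + ", ".join(LABELS))


def route(ticket: str, llm: LLM) -> str:
    label = classify(ticket, llm)
    # Dispatch to the prompt written for that class.
    return llm(SPECIALISTS[label] + "\n\n" + ticket)
```

Raising on an invalid label keeps a bad classification from silently reaching the wrong specialist; the caller decides whether to fall back to a human or a default prompt.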
Recipe 4: Map, then reduce
For long input, summarize the parts, then merge.
[Prompt A, per chunk] Summarize this section in 3 bullets.
[Prompt B] Merge these bullet sets into one 5-bullet summary. Drop duplicates.
Why it works: it beats stuffing 200 pages into one call. Each chunk gets full attention; the merge sees only signal.
Fails when: cross-chunk context matters, like a name defined in chunk 1 and used in chunk 9. Map-reduce loses it. Add a shared glossary pass before the map step.
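Sketched in Python below, with the glossary pass included. The `llm` callable is hypothetical, the input is assumed to be already split into chunks, and the glossary prompt wording (plus building the glossary per chunk and concatenating it) is one possible design, not something the recipe specifies.

```python
from typing import Callable, Sequence

# Hypothetical interface: any function that sends one prompt and returns text.
LLM = Callable[[str], str]


def map_reduce(chunks: Sequence[str], llm: LLM) -> str:
    # Glossary pass (assumed design): collect defined names and terms per chunk.
    term_lists = [
        llm(
            "List each name or term this section defines, one per line "
            "as 'term: meaning'. No commentary.\n\n" + chunk
        )
        for chunk in chunks
    ]
    glossary = "\n".join(t.strip() for t in term_lists if t.strip())

    # Map: every chunk is summarized alone, but with the shared glossary attached.
    bullet_sets = [
        llm(
            f"Glossary for reference:\n{glossary}\n\n"
            f"Summarize this section in 3 bullets.\n\n{chunk}"
        )
        for chunk in chunks
    ]

    # Reduce: the merge sees only the bullets, never the source text.
    return llm(
        "Merge these bullet sets into one 5-bullet summary. Drop duplicates.\n\n"
        + "\n\n".join(bullet_sets)
    )
```

The glossary pass costs one extra call per chunk, so it is worth adding only when cross-chunk references actually show up in your input.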
Recipe 5: Generate, then gate
Add a validator hop that returns pass or fail with a reason.
[Prompt A] Generate the SQL for this request.
[Prompt B] Review the SQL below. Return PASS or FAIL and one reason. FAIL if it writes data, drops anything, or lacks a LIMIT.
[on FAIL] re-run A with the reason appended.
Why it works: cheaper and more honest than trusting the first output. The gate is a second opinion you can log and count.
Fails when: A and B share the same blind spot. Use a different instruction, and sometimes a cheaper model, for the gate so it fails differently than the generator.
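The retry loop can be sketched in Python as below. `generator` and `gate` are hypothetical callables, kept separate so the gate can run on a different instruction or model; the cap of 3 attempts and the rule that any reply not starting with PASS counts as a failure are assumptions.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function that sends one prompt and returns text.
LLM = Callable[[str], str]


def generate_with_gate(
    request: str, generator: LLM, gate: LLM, max_attempts: int = 3
) -> Tuple[str, List[str]]:
    verdicts: List[str] = []  # every gate reply, kept so you can log and count
    feedback = ""
    for _ in range(max_attempts):
        # Hop A: generate, with any earlier rejection reason appended.
        sql = generator("Generate the SQL for this request.\n\n" + request + feedback)

        # Hop B: the validator sees only the SQL, not the request.
        verdict = gate(
            "Return PASS or FAIL and one reason. "
            "FAIL if it writes, drops, or lacks a LIMIT.\n\n" + sql
        ).strip()
        verdicts.append(verdict)

        # Fail closed: anything that does not start with PASS is a failure.
        if verdict.upper().startswith("PASS"):
            return sql, verdicts
        feedback = "\n\nA previous attempt was rejected: " + verdict
    raise RuntimeError(f"gate rejected all {max_attempts} attempts: {verdicts}")
```

Returning the verdict list alongside the SQL gives you the per-request pass and fail counts without extra plumbing.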
When a single prompt wins
Chains are not free. Each hop adds latency, tokens, and a new failure surface. Every hop also drops context the last one had.
Skip the chain when:
- The task is one step. Classification, extraction, a short rewrite. Sections inside one prompt do the job.
- Latency matters. A 4-hop chain is 4 round trips. Users feel it.
- You are in a loop or a hot path. Chain cost multiplies on every call.
Chain when the task genuinely has stages, and the output between them needs to be verified or handed to a specialized prompt. Not before.
Cost to test: $0.03 to run all five chains once on a small 2026 model, most of it the map-reduce pass.
Written by
Sam Q. ships prompt recipes at PromptAttic. Terse by default. Tests everything before writing it down.
FAQ
What is the difference between prompt chaining and chain of thought prompting?
Chain of thought is one prompt that asks the model to reason step by step inside a single response. Prompt chaining is multiple prompts, where each output feeds the next input. CoT reasons within one call; chaining splits work across calls so you can inspect and gate each step.
What is the difference between routing and prompt chaining?
Routing is one hop that classifies the input and sends it to a specialized prompt. Prompt chaining is the broader pattern of feeding one prompt's output into the next. Routing is simply a chain whose first link is a classifier.
What are the benefits of chaining prompts?
Higher accuracy on multi-step tasks, shorter and more testable instructions per hop, and inspectable intermediate outputs you can log and gate. The trade-off is added latency and token cost.
What is tool chaining in AI?
Tool chaining is when a model calls a tool, reads the result, then calls another tool based on it. It is prompt chaining where some hops are tool calls instead of text generations.
How many steps should a prompt chain have?
As few as the task needs. Every hop adds latency, tokens, and a failure point. Start with one prompt and add a hop only when a step must be verified or specialized.
When should you not use prompt chaining?
Skip it for single-step tasks, latency-sensitive paths, and loops or hot paths where per-call cost multiplies. A single well-structured prompt often wins.
Related recipes
Meta Prompting: 5 Recipes That Ship (2026)
Five paste-ready meta prompting recipes: make the model write, critique, rubric, and template your prompts, each with its failure mode. Plus the honest part: when meta prompting just wastes tokens.
Claude Tool Use: 3 Recipes That Ship + 2 Failure Modes (June 2026)
Three production Claude tool use recipes tested on Sonnet 4.6, Opus 4.7, and Haiku 4.5 with current pricing. Plus the two failure modes nobody warns you about. June 2026.
Prompt Injection Defenses That Hold Up (2026)
Four prompt-layer defenses against prompt injection that measurably help, three that are theater, and the one architecture rule that actually keeps you safe. With paste-ready prompts and each failure mode.


