Don't Trust the First Answer: The Adversarial Sub-Agent Pattern for High-Stakes Decisions

START HEREWhy a single agent's first answer is the most biased one

A single agent inherits your framing. You hand it a hunch, phrased as a hunch — "I think X is true — check it for me." The model reads that framing, treats your belief as the thing to support, and goes looking for evidence that fits. It comes back agreeing with you, citations included, and you feel smart. That's not research — that's confirmation bias with a tool-use loop bolted on. The fix isn't a better single prompt. It's structural: stop asking one agent to evaluate a claim, and instead run several agents that gather evidence independently, then make them argue. The answer you keep is the one still standing after the fight — not the first thing the model said to make you happy.

A single agent inherits YOUR framing — phrase a hunch as a hunch and it confirms the hunch
Independence first: separate agents gather evidence without seeing each other's conclusions
Conflict second: force them to attack each other's findings before anything reaches you
You read the survivor, not the first draft

This is a research-rigor technique. The video that inspired it used a stock-trading demo — those numbers were illustrative, not a result, and nothing here is financial advice. The value is the PATTERN: it works for a pricing call, a tech-stack bet, or a market-validation question just as well as for anything else you can't afford to get wrong.

THE SHAPEThe pattern: independent evidence-gatherers → forced cross-examination → survivor

Modern frontier models can genuinely fan a job out to sub-agents that work in parallel — this isn't a metaphor for 'think harder.' Anthropic's own multi-agent research system uses an orchestrator that spins up several sub-agents at once, each in its own context window, each running its own searches independently before reporting back. The adversarial pattern adds two deliberate moves on top of that raw capability: you assign each sub-agent a different evidence source so they can't all drift toward the same answer, and then you add an explicit debate stage where they have to challenge each other before the orchestrator writes a single line of conclusion. Getting each agent to gather independently stops the whole answer from converging on one source. The debate step is what surfaces the counter-case you'd actually miss — the evidence you wouldn't have found if you'd asked one agent to confirm you.

Stage	What happens	Why it matters
1. Frame as a test	State the hunch, then explicitly ask the model to PROVE it or DESTROY it	Removes the 'agree with me' framing — the goal is now a verdict, not a confirmation
2. Fan out, independently	Spin up separate sub-agents, each on a DIFFERENT data source, working in parallel	No single source can quietly steer the whole answer; gathering happens before any conclusion
3. Cross-examine	Make the agents challenge each other's findings before any summary is written	Surfaces the weakest evidence and the strongest counter-case — the part you'd otherwise miss
4. Keep the survivor	Only the conclusion that withstands the debate reaches you, with the dissent attached	You see the bull case, the bear case, and which one actually held

One honest nuance: the parallel sub-agent capability is real and shipping in frontier models. The adversarial debate step, though, is something YOU instruct — it's a prompt pattern you impose on the orchestration, not an automatic behavior. That capability is real and ships in frontier models. This guide is just the prompt that points it at the right job.

COPY THISThe "prove it or destroy it" prompt scaffold

This is the reusable skeleton. Swap the bracketed parts for your decision and keep the structure — the structure is what does the work. Notice the order: the hunch is stated plainly, then immediately reframed as something to attack; the sources are named and split; the debate is mandatory and comes BEFORE the report; and the model is told in plain language that you don't want the first answer, you want the one that survives.

Frame it as a challenge, not a confirmation: "I have a hunch and I want you to either prove it or destroy it. My hunch is: [STATE YOUR ACTUAL BELIEF IN ONE SENTENCE]. I'm leaning toward acting on it — talk me out of it if it's wrong."
Demand independent gathering across named sources: "Run deep research and don't stop until it's done. Spin up separate sub-agents, each working independently. Agent 1: [SOURCE A — e.g. the primary historical/usage data]. Agent 2: [SOURCE B — e.g. what comparable players/competitors actually did]. Agent 3: [SOURCE C — e.g. independent analysis or third-party signals]. They must gather evidence before forming any conclusion."
Force the cross-examination: "Have the agents challenge each other's findings before you give me anything else. I don't want the first answer. I want the answer that survives the arguments."
Specify the deliverable: "Then produce one clean report I can act on: my hypothesis at the top, the case FOR, the case AGAINST, and a clear verdict on which survived — with the evidence behind each side."
Run it at high effort and walk away: give the model room and time (this is a long-horizon job, not a chat reply), then read the survivor — not the running commentary.

The single most important line is "I want the answer that survives the arguments." It re-points the whole job from 'support my belief' to 'find what's true,' and it's the difference between a flattering answer and a useful one.

WHY AGENTS COME BACK EMPTYWiring real data in so the agents aren't blocked

An adversarial debate is worthless if both sides are arguing from the model's stale memory. The agents need to reach live, real data — and the moment they try to scrape it, a lot of useful sources block bots. The fix is to give the agent a real data path via a connector (MCP), so 'go check the source' actually fetches the source instead of guessing. This is generic plumbing: a scraping/fetch connector, a database connector, whatever your decision needs. The point is that each independent sub-agent has a genuine, non-blocked way to gather its own evidence.

In your Claude client, go to Customize → Connectors, click +, then Add custom connector and paste the remote MCP server URL for your data source (a scraping/fetch service, a database, an internal API). The + menu inside a chat only enables connectors you've already added here — it's not where you add a new one.
If it needs a key, add it where the connector asks (API key / OAuth under Advanced), then click Add to finish.
Back in a conversation, open the + menu → Connectors and toggle your connector on for this chat; set its tools to always allow for the run so the agents aren't interrupted mid-research.
Confirm the connector shows up and is enabled for the conversation before you start.
Now write the scaffold so each sub-agent is pointed at a real source through that connector — not at the model's memory.

Why a connector and not 'just ask it to browse': many high-value sources block naive scraping, so an agent told to 'go look it up' quietly comes back empty or hallucinates. A proper fetch/scrape MCP gives the agents an unblocked path — which is the whole point of making them gather evidence independently. Heads-up: a custom connector runs from the cloud, not your laptop, so only wire data you're comfortable reaching that way, and prefer read-only access for a research run.

THE OUTPUTA clean report: hypothesis on top, bull vs bear, what survived

Don't let the model bury the verdict in prose. Ask for a fixed structure so you can act fast and audit the reasoning. The shape below is the deliverable to demand at the end of the scaffold — and the best part is that a frontier model is strong enough at building HTML to render this as an interactive one-pager you can actually click through, not just a wall of text.

Hypothesis — your original hunch, restated verbatim at the very top, so there's no moving the goalposts.
The case FOR (bull) — the strongest evidence the pro-side agent assembled, with where it came from.
The case AGAINST (bear) — the strongest counter-evidence, including anything that directly contradicts your hunch.
What survived the debate — the conclusion left standing after cross-examination, stated plainly.
Confidence + what would change it — how strong the verdict is, and the one piece of new evidence that would flip it.

The honesty test: a good run will sometimes tell you your hunch is wrong — bluntly. That's the feature, not a bug. If every report agrees with you, your scaffold is still confirming, not stress-testing — tighten the 'destroy it' framing.

MAKE IT YOURSWhere builders actually use this

Strip out the example and the skeleton fits any decision where being wrong is expensive and you're tempted to trust your gut. The three sub-agents just become three different evidence angles on whatever you're deciding.

Pricing a new offer — Agent 1: what comparable products actually charge. Agent 2: your own usage/cost data. Agent 3: independent signals (reviews, churn drivers, willingness-to-pay chatter). Hunch: 'we can charge $X.' Make them fight it.
A tech-stack or vendor bet — Agent 1: the docs and real limits. Agent 2: what teams who adopted it report months later. Agent 3: failure stories and migration pain. Hunch: 'we should standardize on Y.'
Market validation for a feature — Agent 1: demand signals. Agent 2: what competitors shipped and how it landed. Agent 3: the strongest reasons it would flop. Hunch: 'customers want Z.'
A hiring or partner decision — frame the 'this is a great fit' belief as something to disprove, and let the bear-case agent do its job before you commit.

The meta-skill: any time you catch yourself asking an agent to confirm something you already believe, stop and rewrite the prompt to make it attack the belief instead. That one reflex is most of the value here.

Get the next drop

New AI build guides + the occasional bonus template. No spam, unsubscribe anytime.

By submitting you agree to our Privacy Policy & Terms. Unsubscribe anytime.

Frequently asked questions

What does "adversarial sub-agents" actually mean?

It's a research pattern where, instead of one AI agent evaluating your claim, you run several agents that each gather evidence independently (ideally from different sources), then explicitly instruct them to challenge each other's findings before any conclusion is written. You keep the conclusion that survives the debate. The point is to break confirmation bias: a single agent tends to inherit your framing and confirm your hunch, while agents forced to argue surface the counter-case you'd otherwise miss.

Can frontier models really run sub-agents in parallel, or is that just 'think harder'?

It's real parallelism. Anthropic's own multi-agent research system uses an orchestrator (lead) agent that spins up multiple sub-agents — its engineering write-up describes the lead spinning up roughly 3–5 sub-agents in parallel, each in its own context window running its own searches independently, then returning findings to be synthesized. The capability is genuine; the adversarial debate step on top of it is something you instruct via the prompt, not an automatic behavior.

Is the 'prove it or destroy it' line just a gimmick?

No — the framing is the mechanism. When you state a hunch and ask the model to support it, you've made 'support my belief' the objective, and models are good at hitting the objective you give them. Reframing the job as 'prove it OR destroy it' and adding 'I want the answer that survives the arguments' re-points the whole run toward finding what's true rather than what's flattering. It's the single highest-leverage line in the scaffold.

Why do I need an MCP connector instead of just telling it to browse the web?

Because many high-value sources block naive scraping, so an agent told to 'go look it up' quietly returns empty or fills the gap with a guess. A proper fetch/scrape (or database) connector via MCP gives each sub-agent an unblocked, real path to live data, so the independent evidence-gathering actually happens. You add it in your Claude client under Customize → Connectors → + → Add custom connector, paste the MCP server URL, and click Add; then enable it for the chat via the + menu → Connectors and allow its tools. Note a custom connector runs from the cloud, so wire only data you're comfortable reaching that way, ideally read-only for a research run.

Does this only work for trading or finance decisions?

No — and you should treat any trading framing as an illustrative example only, never as financial advice. The pattern is decision-agnostic: pricing a new offer, choosing a tech stack or vendor, validating a feature, even a hiring or partner call. The three sub-agents just become three different evidence angles on whatever you're deciding. Any high-stakes call where you're tempted to trust your gut is a candidate.

Sources · How we built our multi-agent research system — Anthropic (orchestrator spins up 3–5 parallel sub-agents, each independent) · Get started with custom connectors using remote MCP — Anthropic Help Center (the Add custom connector flow) · Use connectors to extend Claude's capabilities — Anthropic Help Center (enable/manage connectors per conversation) · Source video — Claude Fable 5 walkthrough incl. the adversarial research demo (credited to Samin Yasar; trading numbers shown are an illustrative demo, not a result)

Don't Trust the First Answer: The Adversarial Sub-Agent Pattern for High-Stakes Decisions

START HEREWhy a single agent's first answer is the most biased one

THE SHAPEThe pattern: independent evidence-gatherers → forced cross-examination → survivor

COPY THISThe "prove it or destroy it" prompt scaffold

WHY AGENTS COME BACK EMPTYWiring real data in so the agents aren't blocked

THE OUTPUTA clean report: hypothesis on top, bull vs bear, what survived

MAKE IT YOURSWhere builders actually use this

Get the next drop

Frequently asked questions

The same rigor, but for the agents you ship to customers

Grab the AI Reseller Starter Kit

Don't Trust the First Answer: The Adversarial Sub-Agent Pattern for High-Stakes Decisions

START HEREWhy a single agent's first answer is the most biased one

THE SHAPEThe pattern: independent evidence-gatherers → forced cross-examination → survivor

COPY THISThe "prove it or destroy it" prompt scaffold

WHY AGENTS COME BACK EMPTYWiring real data in so the agents aren't blocked

THE OUTPUTA clean report: hypothesis on top, bull vs bear, what survived

MAKE IT YOURSWhere builders actually use this

Get the next drop

Frequently asked questions

More free guides

The same rigor, but for the agents you ship to customers

Grab the AI Reseller Starter Kit