Model Watch

Does Claude Code need an MCP server to know your codebase? We measured it.

If you are spinning up an MCP server just so a coding agent can read your docs, you are probably overbuilding. For knowledge that fits in the context window, a plain text file beat the MCP server on accuracy, cost, and speed in our benchmark.

PCTX Editorial · Jun 9, 2026 · 4 min

The short answer

If you want Claude Code to know your codebase, you do not need to build an MCP server for it. Putting the docs straight into its context window worked better. In our benchmark, an AGENTS.md or llms.txt file in the prompt scored 100 percent in a single turn, at about $0.002 a question.

The same questions through an MCP knowledge server scored 50 percent and cost far more, from roughly 15 times as much per question to several hundred times in the worst case. MCP servers are built for live tools and actions. For documentation that already fits in the context, fetching it is slower, pricier, and less accurate than simply showing it.

What an MCP server is actually for

An MCP (Model Context Protocol) server is a standard interface that lets a coding agent reach outside itself while it runs: to drive a browser, hit an API, query a database, or take an action in another system. That is the job the popular ones do. The Playwright, GitHub, and database MCP servers all exist to give the agent live reach it cannot get from its prompt.

Knowledge is a different job. When you point an MCP server at a folder of docs, you are not giving the agent a new power. You are making it fetch, one tool call at a time, something it could have read directly. That distinction is what the benchmark below measures.

We tested both, six ways

The data comes from our open benchmark for AI agents, the Agent Voyager Project (AVP). In the run we published as Captain's Log #3, we took two coding agents, Claude Code and Goose, both on a Haiku-tier model, and asked each six operational questions about a real system. The only variable was how the agent reached the documentation.

We tried six channels. Three put the docs straight in the prompt, as an llms.txt file, an AGENTS.md file, and a raw unstructured dump. The other three made the agent go fetch them, through an Explore CLI, an MCP knowledge server, and a packaged Agent Skill. A seventh run, with no docs, set the floor.

Channel	Accuracy	Cost/question	Turns
llms.txt in the prompt	100%	~$0.002	1
AGENTS.md in the prompt	100%	~$0.002	1
Raw docs dump in the prompt	83%	~$0.002	1
Explore CLI	100%	$0.02 to $0.17	7 to 10
MCP knowledge server	50%	$0.03 to $0.52	8 to 19
Agent Skill	17 to 33%	$0.08 to $0.65	14 to 23
Nothing (control)	0%	tiny	1

Both agents ran on the same Haiku-tier model and landed in nearly the same place on the channels that won, so this is a result about the channel, not the agent.
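The accuracy figures in the table line up with whole questions out of six. The sketch below shows that arithmetic. It assumes each question was scored simply right or wrong with equal weight, which is our reading of the table rather than something the benchmark write-up spells out.

```python
# Six operational questions per run, as described above.
QUESTIONS = 6


def accuracy_pct(correct: int, total: int = QUESTIONS) -> int:
    """Return accuracy as a whole-number percentage, rounded to the nearest point."""
    return round(100 * correct / total)


# Print the percentage for every possible score out of six.
for correct in range(QUESTIONS + 1):
    print(f"{correct}/{QUESTIONS} correct -> {accuracy_pct(correct)}%")
```

Under that assumption, 83 percent is five of six, 50 percent is three of six, and the 17 to 33 percent range is one or two of six.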

The doc in the prompt beat the server

The two channels that did nothing but place the docs in the prompt, AGENTS.md and llms.txt, both scored 100 percent in one turn for a fraction of a cent. Every channel that made the agent go and fetch the knowledge did worse.

The MCP knowledge server scored 50 percent. The Agent Skill scored between 17 and 33. Both cost many times more and took far more turns, with the ranges in the table topping out at 19 turns for the server and 23 for the skill. The worst single question on the MCP server ran well past that range: it pulled the agent through 33 turns and 32 tool calls and cost $1.22, when the same answer sat in a file it could have read for a fraction of a cent.
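To put the cost gap in proportion, here is the division worked out from the figures quoted above. The baseline is the approximate $0.002 per question for the in-prompt channels, so the multiples are approximate too.

```python
# Approximate per-question cost of the in-prompt channels, from the table.
IN_PROMPT_COST = 0.002


def cost_multiple(cost: float, baseline: float = IN_PROMPT_COST) -> int:
    """Return how many times more expensive `cost` is than the baseline, rounded."""
    return round(cost / baseline)


# MCP knowledge server: low and high ends of the table range, then the worst case.
for label, cost in [("low", 0.03), ("high", 0.52), ("worst case", 1.22)]:
    print(f"MCP {label}: ${cost:.2f} is about {cost_multiple(cost)}x the in-prompt cost")
```

That works out to roughly 15 times at the low end of the range, 260 times at the high end, and 610 times for the worst single question.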

Why fetching loses to reading

When the docs are in the context window, the answer is in front of the model and it reads it. There is nothing to decide and nothing to call.

When the same knowledge sits behind a tool, the agent has to choose to call it, issue the call, wait, read the result, and often call again. Every step is a chance to go wrong and another batch of tokens, and none of it buys accuracy when the answer would have fit in the window to begin with.
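A toy model makes the "another batch of tokens" point concrete. This is a sketch under loud assumptions, not how the benchmark measured cost: every token count below is hypothetical, output tokens and prompt caching are ignored, and each turn is assumed to resend the whole conversation so far with one tool result appended afterwards.

```python
def in_context_tokens(doc_tokens: int, question_tokens: int) -> int:
    """One turn: the model reads the docs and the question once."""
    return doc_tokens + question_tokens


def tool_loop_tokens(question_tokens: int, result_tokens: int, turns: int) -> int:
    """Total input tokens when each turn resends the growing history."""
    total = 0
    context = question_tokens
    for _ in range(turns):
        total += context          # input tokens read on this turn
        context += result_tokens  # the tool result joins the history
    return total


# Hypothetical sizes, chosen only for illustration.
DOC, QUESTION, RESULT = 2000, 50, 500

print("docs in prompt:", in_context_tokens(DOC, QUESTION))
for turns in (8, 19):
    print(f"tool loop, {turns} turns:", tool_loop_tokens(QUESTION, RESULT, turns))
```

In this model the in-prompt path reads 2,050 tokens once, while the tool loop reads 14,400 input tokens over 8 turns and 86,450 over 19, because the history is reread every turn and grows with each result.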

When Claude Code does need an MCP server

None of this counts against MCP. The protocol pays off when the agent needs something live it cannot be handed in advance, like a browser to drive, an API to call, a database to query, or data too large or too fresh to preload. That is why the Playwright and GitHub servers earn their keep.

Static product docs are the opposite case. They are small, fixed, and already yours to paste. A tool round trip there adds cost and turns without adding answers.
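The rule of thumb in this section can be written as a small decision function. It is a sketch of the article's argument, not a tool from the benchmark; the function name, its parameters, and the idea of comparing an estimated token count against a window size are all our own illustration.

```python
def choose_channel(needs_live_access: bool, doc_tokens: int, window_tokens: int) -> str:
    """Pick a channel using the rule of thumb above.

    needs_live_access: the agent must act on or query a running system.
    doc_tokens: estimated size of the documentation (hypothetical input).
    window_tokens: context space you are willing to spend on docs.
    """
    if needs_live_access:
        # Browsers, APIs, databases, or fresh data cannot be pasted in advance.
        return "mcp"
    if doc_tokens <= window_tokens:
        # Small, fixed docs: show them instead of fetching them.
        return "in-prompt file"
    # Static but too large to preload: fetching is the remaining option.
    return "mcp"


print(choose_channel(needs_live_access=False, doc_tokens=2_000, window_tokens=200_000))
```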

How to give a coding agent your codebase

If your agent answers questions from docs that fit in the context window, put them there. An AGENTS.md or llms.txt file in the prompt was the cheapest and most accurate channel we tested, and it is the same file Cursor, Claude Code, and Codex already look for.
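If you assemble prompts yourself rather than relying on the agent to pick the file up, the step looks roughly like this. It is a minimal sketch: the four-characters-per-token estimate is a rough rule of thumb rather than a real tokenizer, and the function names, the prompt wording, and the budget parameter are our own assumptions.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Crude token estimate; swap in your model's tokenizer for real budgets."""
    return len(text) // CHARS_PER_TOKEN + 1


def load_docs(path: str = "AGENTS.md") -> str:
    """Read the docs file from disk."""
    return Path(path).read_text(encoding="utf-8")


def build_prompt(docs: str, question: str, budget_tokens: int) -> str:
    """Place the docs ahead of the question, or fail loudly if they will not fit."""
    needed = estimate_tokens(docs) + estimate_tokens(question)
    if needed > budget_tokens:
        raise ValueError(f"docs need about {needed} tokens, budget is {budget_tokens}")
    return f"Project documentation:\n\n{docs}\n\nQuestion: {question}"
```

Calling build_prompt(load_docs(), "How do I roll back a deploy?", budget_tokens=50_000) would return a single prompt string. The failure on overflow is deliberate, because silently truncating the docs could drop the one line that answers the question.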

One caveat showed up in the numbers. The raw, unstructured dump scored 83 percent instead of 100, because a single acronym was never defined. A little curation of what you load still pays off. Keep the MCP server for the live work it was built for.
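The acronym miss is the kind of gap a quick check can catch before the file goes into the prompt. The sketch below is a heuristic of our own, not something the benchmark used: it flags all-caps tokens that never appear next to a parenthesis, as in "Long Form (ABC)" or "ABC (long form)", and it will produce false positives on names that need no definition.

```python
import re
from typing import Iterable, List

# Two to six capital letters standing alone as a word.
ACRONYM = re.compile(r"\b[A-Z]{2,6}\b")


def undefined_acronyms(text: str, known: Iterable[str] = ()) -> List[str]:
    """Return acronyms that are used but never defined with parentheses."""
    found = set(ACRONYM.findall(text))
    defined = set()
    for acronym in found:
        # Matches "(ABC)" after a long form, or "ABC (" before one.
        if re.search(rf"\({acronym}\)", text) or re.search(rf"\b{acronym} \(", text):
            defined.add(acronym)
    return sorted(found - defined - set(known))


sample = "The Model Context Protocol (MCP) server pages the SRE team over HTTP."
print(undefined_acronyms(sample, known={"HTTP"}))
```

On the sample sentence this reports SRE, which is used without a definition, while MCP passes because it is spelled out and HTTP is on the known list.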

Common questions

Does Claude Code need an MCP server to read my codebase?

Not for static docs. In our open agent benchmark, putting the docs straight into the context window through an AGENTS.md or llms.txt file scored 100 percent at about $0.002 per question. The MCP knowledge server scored 50 percent and cost far more. Use an MCP server for live tools and actions, not for documentation that already fits in the context.

What is an MCP server?

An MCP (Model Context Protocol) server is a standard interface that lets a coding agent like Claude Code call tools or pull data from an external system while it runs. It is the right tool for live actions, like driving a browser or querying a database. For a fixed set of docs, our benchmark found that loading them into context directly was both more accurate and far cheaper.

MCP server or AGENTS.md and llms.txt for giving an agent context?

For a fixed body of documentation, the file in context wins. In our test an AGENTS.md or llms.txt file scored 100 percent in one turn for a fraction of a cent, while the MCP knowledge server scored 50 percent over as many as 33 turns. Reach for an MCP server when the agent needs live data or actions it cannot be handed up front.

When does a coding agent actually need an MCP server?

When it needs to do something live: run a browser, hit an API, query a database, or read data too large or too fresh to load in advance. That is what the Playwright, GitHub, and database MCP servers are for. Static product docs are the opposite case, and there the round trip costs accuracy and money.