Does Claude Code need an MCP server to know your codebase? We measured it.
If you are spinning up an MCP server just so a coding agent can read your docs, you are probably overbuilding. For knowledge that fits in the context window, a plain text file beat the MCP server on accuracy, cost, and speed in our benchmark.
The short answer
If you want Claude Code to know your codebase, you do not need to build an MCP server for it. Putting the docs straight into its context window worked better. In our benchmark, an AGENTS.md or llms.txt file in the prompt scored 100 percent in a single turn, at about $0.002 a question.
The same questions through an MCP knowledge server scored 50 percent and cost roughly 15 to 260 times as much per question, with one worst-case question costing about 600 times as much. MCP servers are built for live tools and actions. For documentation that already fits in the context, fetching it was slower, pricier, and in this case less accurate than simply showing it.
What an MCP server is actually for
An MCP server is a standard interface that lets a coding agent reach outside itself while it runs, to drive a browser, hit an API, query a database, or take an action in another system. That is the job the popular ones do. The Playwright, GitHub, and database MCP servers all exist to give the agent live reach it cannot get from its prompt.
Knowledge is a different job. When you point an MCP server at a folder of docs, you are not giving the agent a new power. You are making it fetch, one tool call at a time, something it could have read directly. That distinction is what the benchmark below measures.
We tested both, six ways
The data comes from our open benchmark for AI agents, the Agent Voyager Project (AVP). In the run we published as Captain's Log #3, we took two coding agents, Claude Code and Goose, both on a Haiku-tier model, and asked each six operational questions about a real system. The only variable was how the agent reached the documentation.
We tried six channels. Three put the docs straight in the prompt, as an llms.txt file, an AGENTS.md file, and a raw unstructured dump. The other three made the agent go fetch them, through an Explore CLI, an MCP knowledge server, and a packaged Agent Skill. A seventh run, with no docs, set the floor.
| Channel | Accuracy | Cost/question | Turns |
|---|---|---|---|
| llms.txt in the prompt | 100% | ~$0.002 | 1 |
| AGENTS.md in the prompt | 100% | ~$0.002 | 1 |
| Raw docs dump in the prompt | 83% | ~$0.002 | 1 |
| Explore CLI | 100% | $0.02 to $0.17 | 7 to 10 |
| MCP knowledge server | 50% | $0.03 to $0.52 | 8 to 19 |
| Agent Skill | 17 to 33% | $0.08 to $0.65 | 14 to 23 |
| Nothing (control) | 0% | tiny | 1 |
Both agents landed in nearly the same place on the channels that won. Since they also ran on the same model tier, this is a result about the channel, not the agent.
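To put the cost column in relative terms, the short Python sketch below divides each fetched channel's reported cost range by the roughly $0.002 in-prompt baseline. The figures are copied from the table. The names `BASELINE`, `fetched_channels`, and `cost_multiples` are illustrative only, and because the baseline is approximate, so are the ratios.

```python
# Per-question cost figures copied from the table above (USD).
BASELINE = 0.002  # approximate cost of llms.txt or AGENTS.md in the prompt

fetched_channels = {
    "Explore CLI": (0.02, 0.17),
    "MCP knowledge server": (0.03, 0.52),
    "Agent Skill": (0.08, 0.65),
}


def cost_multiples(low, high, baseline=BASELINE):
    """Return how many times the baseline the low and high costs are."""
    return round(low / baseline), round(high / baseline)


multiples = {name: cost_multiples(*rng) for name, rng in fetched_channels.items()}

for name, (low_x, high_x) in multiples.items():
    print(f"{name}: {low_x}x to {high_x}x the in-prompt cost")
```

On these figures the MCP knowledge server comes out at 15 to 260 times the in-prompt cost per question.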
The doc in the prompt beat the server
The two channels that did nothing but place the docs in the prompt, AGENTS.md and llms.txt, both scored 100 percent in one turn for a fraction of a cent. Every channel that made the agent go and fetch the knowledge cost more and took more turns. Only the Explore CLI matched the 100 percent accuracy, and it needed 7 to 10 turns to get there.
The MCP knowledge server scored 50 percent. The Agent Skill scored between 17 and 33. Both cost many times more and took between 8 and 23 turns. In the MCP server's worst case, a single question pulled the agent through 33 turns and 32 tool calls and cost $1.22, when the same answer sat in a file it could have read for a fraction of a cent.
Why fetching loses to reading
When the docs are in the context window, the answer is in front of the model and it reads it. There is nothing to decide and nothing to call.
When the same knowledge sits behind a tool, the agent has to choose to call it, issue the call, wait, read the result, and often call again. Every step is a chance to go wrong and another batch of tokens, and none of it buys accuracy when the answer would have fit in the window to begin with.
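The difference in step count can be made concrete with a toy model. The Python sketch below is not an MCP client and is not how the benchmark measured anything. It only counts turns and tool calls for a pretend agent, under the assumption that each lookup is one turn that issues one call and that a final turn writes the answer. The `Trace` class, both functions, the two-page `docs` dictionary, and the guess order are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    turns: int = 0       # model turns spent on the question
    tool_calls: int = 0  # tool invocations issued along the way


def answer_from_context(docs: dict, topic: str):
    """Docs are already in the prompt: the model reads and answers in one turn."""
    return docs.get(topic), Trace(turns=1, tool_calls=0)


def answer_via_tool(docs: dict, topic: str, guesses: list):
    """Docs sit behind a lookup tool: every guess is a turn that issues one call."""
    trace = Trace()
    for guess in guesses:
        trace.turns += 1
        trace.tool_calls += 1
        result = docs.get(guess)  # stand-in for the tool round trip
        if guess == topic and result is not None:
            trace.turns += 1  # final turn: write the answer
            return result, trace
    trace.turns += 1  # final turn: answer without having found the doc
    return None, trace


# Hypothetical two-page knowledge base and a hypothetical search order.
docs = {"restart": "Use the restart script.", "deploy": "Use the deploy script."}

direct_answer, direct = answer_from_context(docs, "restart")
fetched_answer, fetched = answer_via_tool(docs, "restart", ["logs", "deploy", "restart"])

print(direct)   # Trace(turns=1, tool_calls=0)
print(fetched)  # Trace(turns=4, tool_calls=3)
```

Both paths return the same answer, but the fetched path spends four turns and three calls to get it. In this toy, turns always equal tool calls plus one, the same shape as the 33-turn, 32-call worst case. That likeness comes from how the sketch is built and says nothing about why the real agent behaved as it did.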
When Claude Code does need an MCP server
None of this counts against MCP. The protocol pays off when the agent needs something live it cannot be handed in advance, like a browser to drive, an API to call, a database to query, or data too large or too fresh to preload. That is why the Playwright and GitHub servers earn their keep.
Static product docs are the opposite case. They are small, fixed, and already yours to paste. A tool round trip there adds cost and turns without adding answers.
How to give a coding agent your codebase
If your agent answers questions from docs that fit in the context window, put them there. An AGENTS.md or llms.txt file in the prompt was the cheapest and most accurate channel we tested, and it is the same file Cursor, Claude Code, and Codex already look for.
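Coding agents that look for these files load them on their own. If you run your own harness, a minimal sketch of the same idea might look like the Python below. Everything in it is an assumption made for illustration: the four-characters-per-token estimate is a rough rule of thumb rather than a real tokenizer, the `token_budget` default is a placeholder rather than any model's actual limit, and the prompt layout is just one plausible choice.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Crude token estimate: character count divided by 4, rounded up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def build_prompt(doc_path: str, question: str, token_budget: int = 100_000) -> str:
    """Read a docs file and place it in the prompt ahead of the question.

    Raises ValueError when the estimate exceeds the (placeholder) budget,
    which is the point at which fetching starts to make sense.
    """
    docs = Path(doc_path).read_text(encoding="utf-8")
    needed = estimate_tokens(docs) + estimate_tokens(question)
    if needed > token_budget:
        raise ValueError(
            f"docs need about {needed} tokens, budget is {token_budget}"
        )
    return f"Project documentation:\n\n{docs}\n\nQuestion: {question}\n"
```

Calling `build_prompt("AGENTS.md", "How do I restart the service?")` would return a single string ready to send as one turn. The question text here is made up.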
One caveat showed up in the numbers. The raw, unstructured dump scored 83 percent instead of 100, because a single acronym was never defined. A little curation of what you load still pays off. Keep the MCP server for the live work it was built for.
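One cheap piece of curation is to scan the file for acronyms that are never spelled out. The Python sketch below is a crude heuristic and not something the benchmark used. It treats any run of two to six capital letters as an acronym and counts it as defined only if it appears in parentheses or before a phrase such as "stands for". The function name, the patterns, and the sample text are all illustrative assumptions.

```python
import re

# Any standalone run of 2 to 6 capital letters is treated as an acronym.
ACRONYM = re.compile(r"\b[A-Z]{2,6}\b")


def undefined_acronyms(text: str, known=frozenset()) -> list:
    """Return acronyms with no visible definition, sorted alphabetically.

    `known` is an optional allowlist of acronyms readers can be assumed to know.
    """
    missing = []
    for acro in sorted(set(ACRONYM.findall(text)) - set(known)):
        pattern = rf"\({acro}\)|\b{acro}\s+(?:stands for|is short for|means)\b"
        if not re.search(pattern, text):
            missing.append(acro)
    return missing


# Hypothetical docs snippet: AVP is defined in parentheses, SRV never is.
sample = "The Agent Voyager Project (AVP) runs nightly. Restart the SRV node if AVP stalls."
print(undefined_acronyms(sample))  # ['SRV']
```

A check like this will flag familiar terms too, which is what the `known` allowlist is for, and it will miss anything defined in a less regular way.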