- AI Development
- Agent Design
tags:
- MCP
- Code Execution
- Model Context Protocol
- AI Agent
- Context Optimization
- Claude
- ChatGPT
- Gemini
MCP Code Execution Deep Dive: Agent Design That Cuts Token Costs by Up to 98.7% with Code Execution with MCP
Target Audience
- Engineers developing AI agents using MCP
- Those facing agent cost and latency challenges
- Anyone wanting to catch up with 2025 agent design trends
Key Points
- Understand the limitations and bottlenecks of traditional MCP tool-calling approaches
- Grasp the architecture and mechanisms of Code execution with MCP
- Learn the implementation patterns behind the up to 98.7% token reduction reported in Anthropic's validation
- Discover concrete steps for migrating existing MCP agents
What Is MCP Code Execution? (Overview of Code Execution with MCP)
MCP (Model Context Protocol) is an open standard protocol for connecting AI agents to external tools. "Code execution with MCP," proposed by Anthropic in 2025, is a design pattern in which the agent writes and runs TypeScript/JavaScript code that interacts with the tools, instead of calling each tool directly.
In Anthropic's official validation, a representative workflow dropped from 150,000 tokens to 2,000 tokens (a 98.7% reduction), a substantial saving in both time and cost.
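The headline percentage follows directly from the two token counts quoted above. As a quick arithmetic check (no figures beyond those two are assumed):

```latex
\[
\text{reduction}
  = \frac{150{,}000 - 2{,}000}{150{,}000}
  = \frac{148{,}000}{150{,}000}
  \approx 0.987 = 98.7\%
\]
```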
Bottlenecks Traditional MCP Tool-Calling Faces
Traditional tool-calling-centric designs face two critical problems:
Problem 1: Context Window Waste
Pre-loading all tool definitions into context means the model must process every definition before it reads the request. With thousands of tools connected, that can amount to hundreds of thousands of tokens, so tool definitions alone drive up cost and delay responses.
Problem 2: Duplicate Transfer of Intermediate Results
When data moves from one tool to another, every intermediate result passes through the model. For example, a 2-hour meeting transcript can consume an additional 50,000 tokens each time it is transferred between operations.
Real Token Consumption Example
When chaining multiple tools, the same data passes through context repeatedly, not only reducing processing efficiency but also increasing opportunities for errors.
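To make the repeated-transfer cost concrete, here is a deliberately simple model of it. The function name and the assumption that each pass re-processes the full payload are illustrative, not taken from Anthropic's write-up.

```typescript
// Rough model (an assumption for illustration): every time a payload is
// handed from one operation to the next through the model, all of its
// tokens are processed again.
function extraContextTokens(payloadTokens: number, passesThroughModel: number): number {
  if (payloadTokens < 0 || passesThroughModel < 0) {
    throw new RangeError("token and pass counts must be non-negative");
  }
  return payloadTokens * passesThroughModel;
}
```

Under this model, a 50,000-token transcript that passes through the model twice (for example, once when read and once when written to another tool) accounts for `extraContextTokens(50000, 2)`, that is 100,000 tokens, before any real work is done.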
Architecture of Code Execution with MCP
Anthropic's proposed approach combines filesystem-based tool discovery with a code execution environment.
Basic Architecture Design
Traditional approaches load all tool definitions into context as structured metadata. Code execution with MCP instead lays out tool definitions as a file tree:
servers/
├── google-drive/getDocument.ts
├── salesforce/updateRecord.ts
└── slack/sendMessage.ts
Agents explore the tree to find the tools they need and load only those definitions on demand. There's no need to pre-load all tool specifications.
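The discovery step can be sketched with Node's standard `fs` API. This is a minimal sketch assuming the `servers/<server>/<tool>.ts` layout shown above; the helper names (`listServers`, `listTools`, `readToolDefinition`) are hypothetical and not part of any MCP SDK.

```typescript
import { promises as fs } from "node:fs";
import * as path from "node:path";

// List the available MCP servers: one directory per server under `root`.
async function listServers(root: string): Promise<string[]> {
  const entries = await fs.readdir(root, { withFileTypes: true });
  return entries
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name)
    .sort();
}

// List the tool files of one server without reading their contents.
async function listTools(root: string, server: string): Promise<string[]> {
  const files = await fs.readdir(path.join(root, server));
  return files.filter((file) => file.endsWith(".ts")).sort();
}

// Read a single tool definition only once the agent decides it needs it.
async function readToolDefinition(root: string, server: string, toolFile: string): Promise<string> {
  return fs.readFile(path.join(root, server, toolFile), "utf8");
}
```

Listing names is cheap in tokens, since only directory and file names reach the model. The full definition is read for one tool at a time, which is the on-demand loading the pattern relies on.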
Execution Environment Isolation and Security
TypeScript/JavaScript code executes within sandboxes where network and resources can be restricted. Interaction with MCP servers is intended to occur through TypeScript APIs provided by the sandbox.
Security Design
Depending on the implementation, combining Internet isolation with strict ingress/egress controls can further strengthen security. Sandbox designs should incorporate requirements such as resource limits, monitoring, and network isolation.
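As one way to picture the egress side of such a design, the sketch below checks outgoing URLs against an allowlist. The `SandboxPolicy` shape and its field names are hypothetical and do not correspond to any specific sandbox product. A real sandbox would enforce this at the network layer rather than in application code.

```typescript
// Hypothetical policy shape; field names are illustrative only.
interface SandboxPolicy {
  allowedHosts: string[]; // hosts of the MCP servers the sandbox may reach
  maxMemoryMb: number;    // resource cap, enforced by the sandbox runtime
  timeoutMs: number;      // upper bound on a single execution
}

// Return true only when the URL parses and its host is on the allowlist.
function isEgressAllowed(policy: SandboxPolicy, url: string): boolean {
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch {
    return false; // unparseable URLs are rejected outright
  }
  return policy.allowedHosts.includes(host);
}
```

Defaulting to "deny" for anything that fails to parse keeps the check conservative, which matches the isolation-first stance described above.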
Privacy Protection Mechanism
Intermediate results stay inside the sandbox and are not shared with the model unless the code explicitly exposes them through logs or return statements. This reduces the risk of sensitive data being pulled into the model's context automatically.
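A small sketch of what "only explicitly exposed results reach the model" can look like in practice. The `OrderRow` shape and `summarizePending` are hypothetical. The point is that the full rows, including the sensitive field, never leave the function.

```typescript
// Hypothetical row shape; a real tool wrapper would define its own types.
interface OrderRow {
  id: string;
  email: string; // sensitive: should stay inside the sandbox
  status: "pending" | "shipped";
}

// Reduce a full result set to a small summary that omits sensitive fields.
function summarizePending(rows: OrderRow[]): { total: number; pending: number; pendingIds: string[] } {
  const pending = rows.filter((row) => row.status === "pending");
  return {
    total: rows.length,
    pending: pending.length,
    pendingIds: pending.slice(0, 5).map((row) => row.id), // cap what is surfaced
  };
}

// Inside the sandbox, only a line like this would reach the model:
// console.log(JSON.stringify(summarizePending(rows)));
```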
Breakdown of 98.7% Token Cost Reduction
The reduction in Anthropic's official validation comes from the following mechanisms:
How Reduction Works
- Lazy Loading of Tool Definitions: Exploring and loading only needed tools avoids consuming hundreds of thousands of tokens from pre-loading
- Data Processing Within Execution Environment: Completing filtering and transformation within the execution environment excludes unnecessary information from context
- Control Flow Optimization: Executing loops and conditionals within the environment reduces model round-trips
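The control-flow point can be illustrated with a polling loop that runs entirely inside the execution environment. This is a sketch under assumptions: `FetchMessages` stands in for a hypothetical tool wrapper (for example, one that reads a chat channel), and `waitForMessage` is an illustrative name.

```typescript
// Hypothetical signature for a tool wrapper that returns recent messages.
type FetchMessages = () => Promise<string[]>;

// Poll until a message containing `needle` appears or attempts run out.
// The model is not consulted between iterations.
async function waitForMessage(
  fetchMessages: FetchMessages,
  needle: string,
  maxAttempts: number,
  delayMs: number,
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const messages = await fetchMessages();
    if (messages.some((message) => message.includes(needle))) {
      return true; // found: stop polling
    }
    if (attempt < maxAttempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

With direct tool calling, each iteration would be a separate model round-trip. Here the model only sees the final boolean.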
Secondary Benefits
- Agents can write results to files, enabling resumable workflows and reusable skill libraries.
- Running loops and conditionals inside the environment also shortens "time to first token," because the model does not have to evaluate each branch itself.
- Since intermediate data doesn't pass through the model, the risk of sensitive information entering the model's context is reduced.
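The resumable-workflow benefit can be sketched as a checkpoint file that records finished items. The `Checkpoint` shape, the function names, and the one-write-per-item policy are assumptions for illustration. Only Node's standard `fs` API is used.

```typescript
import { promises as fs } from "node:fs";

// Hypothetical checkpoint shape: ids that have already been processed.
interface Checkpoint {
  done: string[];
}

async function loadCheckpoint(file: string): Promise<Checkpoint> {
  try {
    return JSON.parse(await fs.readFile(file, "utf8")) as Checkpoint;
  } catch {
    return { done: [] }; // no readable checkpoint yet: start from scratch
  }
}

// Process each id once, persisting progress so a later run can resume.
// Returns the ids handled during this run.
async function processAll(
  ids: string[],
  file: string,
  handle: (id: string) => Promise<void>,
): Promise<string[]> {
  const checkpoint = await loadCheckpoint(file);
  const processedNow: string[] = [];
  for (const id of ids) {
    if (checkpoint.done.includes(id)) continue; // finished in an earlier run
    await handle(id);
    checkpoint.done.push(id);
    processedNow.push(id);
    await fs.writeFile(file, JSON.stringify(checkpoint)); // persist after each item
  }
  return processedNow;
}
```

If a run fails partway through, calling `processAll` again with the same file skips the items already recorded and continues with the rest.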
Concrete Implementation Patterns
Comparing with traditional tool-calling approaches clarifies implementation differences.
Traditional Approach (Not Recommended)
# Load all tool definitions into context
# (load_all_tools and agent are placeholders for your framework's equivalents)
tools = load_all_tools()  # Consumes hundreds of thousands of tokens
response = agent.invoke(prompt, tools=tools)
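Code Execution Approach (Recommended)
// Runs inside the sandbox, which is assumed to wrap this snippet in an async
// function so that top-level `await` and `return` are valid.
// Explore and load only needed tools
const doc = await import('./servers/google-drive/getDocument');
const result = await doc.getDocument('doc-id');
// Filter and return only necessary information (`summary` is an assumed field)
return result.summary;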
This approach minimizes information included in context.
Usage with Claude / ChatGPT / Gemini
MCP Code Execution is being adopted across multiple LLM clients.
Support Status
- Claude Code: Anthropic's Claude Code environment recommends the Code execution with MCP pattern
- ChatGPT Developer Mode: OpenAI's ChatGPT Developer Mode officially provides MCP client functionality
- Gemini CLI: Google's Gemini CLI supports FastMCP integration, enabling MCP + Code Execution patterns
Leading Standard Candidate for 2025
Major LLM clients such as OpenAI's ChatGPT Developer Mode and Google's Gemini CLI have begun officially providing MCP client functionality. As of 2025, the MCP + Code Execution pattern is spreading rapidly and is a leading candidate to become a standard.
Implementation Steps and Design Considerations
The following steps outline how to migrate an existing MCP agent to Code Execution.
Step 1: Inventory Existing MCP Tool Definitions
Organize the tools currently used by your agent:
- Number of tools (fewer than 10 / 10-100 / over 100)
- MCP servers in use (Google Drive / Salesforce / Slack, etc.)
- Tool definition granularity (tool names only / detailed schemas)
Step 2: Select MCP + Code Execution Compatible Client
Design based on one of the following clients:
- Claude Code: Anthropic's official environment, standard support for Code execution with MCP pattern
- Serena: Open-source coding agent toolkit; note that it runs as an MCP server, so it is used alongside an MCP client rather than as one
- Gemini CLI: Google's official CLI tool, FastMCP integration support
- ChatGPT Developer Mode: OpenAI official, equipped with MCP client functionality
Step 3: Design Generator for MCP Tool Definitions → TypeScript API Files
Prepare scripts to generate TypeScript API files from existing MCP server tool definitions.
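// Generation example: servers/google-drive/getDocument.ts
// `mcp` is assumed to be a client object provided by the sandbox;
// the import path below is a placeholder.
import { mcp } from '../../client';

export async function getDocument(docId: string) {
  // Access Google Drive via MCP
  const response = await mcp.call('google-drive', 'getDocument', { docId });
  return response;
}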
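The generator script itself might look roughly like the following. This is a sketch under assumptions: `ToolInfo` is a simplified, hypothetical view of a tool definition (real MCP tool listings also carry an input schema that a fuller generator would map to TypeScript types), and the emitted `mcp.call(...)` line mirrors the generation example above rather than any particular SDK.

```typescript
// Hypothetical minimal view of a tool definition.
interface ToolInfo {
  server: string;      // e.g. "google-drive"
  name: string;        // e.g. "getDocument"
  description: string; // free text shown to the agent when it reads the file
}

// Path of the generated file, following the servers/<server>/<tool>.ts layout.
function toolFilePath(tool: ToolInfo): string {
  return `servers/${tool.server}/${tool.name}.ts`;
}

// Source text of the generated wrapper. The description is collapsed to a
// single line so that it cannot break out of the leading comment.
function renderToolFile(tool: ToolInfo): string {
  const oneLineDescription = tool.description.replace(/\s+/g, " ").trim();
  return [
    `// ${oneLineDescription}`,
    `export async function ${tool.name}(input: Record<string, unknown>) {`,
    `  return mcp.call(${JSON.stringify(tool.server)}, ${JSON.stringify(tool.name)}, input);`,
    `}`,
    ``,
  ].join("\n");
}
```

Writing each rendered string to `toolFilePath(tool)` produces the file tree that the agent later explores. Keeping the description as a leading comment lets the agent understand a tool by reading just that one file.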
Step 4: Compare Token Consumption of Direct Tool Calling vs. Code Execution in One Workflow
Measure token consumption for both approaches in a real workflow (e.g., a log aggregation batch or a CRM update flow).
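One way to turn the two measurements into a single number is sketched below. The `RunUsage` shape is an assumption; in practice the counts would come from whatever usage metadata your model provider reports for each run.

```typescript
// Token counts for one run of the workflow (hypothetical shape).
interface RunUsage {
  inputTokens: number;
  outputTokens: number;
}

function totalTokens(usage: RunUsage): number {
  return usage.inputTokens + usage.outputTokens;
}

// Fractional reduction of the code-execution run relative to the baseline.
function tokenReduction(baseline: RunUsage, codeExecution: RunUsage): number {
  const before = totalTokens(baseline);
  if (before <= 0) {
    throw new RangeError("baseline must consume at least one token");
  }
  return (before - totalTokens(codeExecution)) / before;
}
```

For runs totalling 150,000 and 2,000 tokens, `tokenReduction` returns roughly 0.987, which corresponds to the 98.7% figure quoted earlier. Measuring the same workflow several times and comparing averages gives a more reliable picture than a single run.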
Gradual Migration
Rather than migrating all tools at once, start with the workflows that consume the most tokens. This lets you verify the effect while controlling risk.
Frequently Asked Questions (FAQ)
How does MCP Code Execution differ from traditional Function Calling?
Traditional Function Calling pre-loads all tool definitions into context. MCP Code Execution loads only needed tools on-demand and completes data processing within the execution environment, dramatically reducing context consumption.
What are the prerequisites for Anthropic's demonstrated 98.7% reduction?
Anthropic's official blog presents the drop from 150,000 to 2,000 tokens as the result for a "representative workflow." While specific workflow details aren't disclosed, the pattern is particularly effective when there are many tools and frequent exchanges of intermediate data.
Can MCP Code Execution be used with ChatGPT and Gemini besides Claude?
Yes. ChatGPT Developer Mode and Gemini CLI each officially support MCP client functionality, enabling the Code execution with MCP pattern. However, implementation details and sandbox specifications may differ across products.
What security points should be considered?
In sandbox design, the following elements need consideration:
- Network Restrictions: Internet isolation or permitting access only to specific MCP servers
- Resource Limits: Setting CPU, memory, and disk I/O caps
- Monitoring and Logging: Recording code execution logs and anomaly detection
- Timeout Settings: Preventing long-running executions
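The timeout item can be sketched at the application level with `Promise.race`. This is a minimal sketch, and `withTimeout` is an illustrative helper name. Note that it only stops waiting for the work: it does not terminate it, so a real sandbox would still need process-level or runtime-level enforcement.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Wrapping each tool call or each agent-written script in such a guard gives a uniform place to log timeouts, which also feeds the monitoring requirement listed above.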
Next Steps
MCP Code Execution is a new paradigm in agent design. Shifting from traditional tool-calling-centric design to code execution + on-demand tool loading can dramatically improve both cost and latency.
If you're already using MCP, consider redesigning tool definitions as TypeScript APIs and migrating to Code Execution. For new projects, designing with Code Execution from the start enables building scalable agent systems.