Envoy AI Gateway MCP Performance
A closer look at the performance misconceptions of the MCP proxy implementation in Envoy AI Gateway
Envoy AI Gateway MCP proxy implementation
In the past few weeks, there has been growing discussion, and some misconceptions have arisen, around the performance of Envoy AI Gateway, in particular its implementation of the Model Context Protocol (MCP).
There have been claims stating that the performance of the MCP proxy implementation is significantly worse than other proxies in the ecosystem. However, those claims were based on measurements made with Envoy AI Gateway running with defaults that are known to trade performance for security. The Envoy community has published a blog post that addresses the misconceptions around the performance of the MCP implementation.
As the Envoy AI Gateway blog post explains, there is a fundamental design choice in the MCP routing implementation:
- When brokering access to multiple upstream MCP servers, Envoy AI Gateway encodes the aggregated session information for upstreams into a single MCP Session ID.
- This Session ID is encrypted (to prevent leaking internal details) and returned to the clients.
- When clients send back this Session ID, Envoy AI Gateway decodes it and uses the session information to route the request to the right upstream MCP server.
While encryption introduces some overhead, this design allows Envoy AI Gateway to remain stateless even though MCP is a stateful protocol. This means the Envoy AI Gateway MCP proxy can be easily scaled horizontally without the need for access to a central, persistent session store.
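To illustrate the design, here is a minimal Go sketch of encoding aggregated upstream session info into a single opaque, encrypted session ID. This is my own simplification, not the actual Envoy AI Gateway code; it assumes an AEAD cipher such as AES-GCM and JSON serialization for the session map.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// sessionInfo is a hypothetical aggregate of per-upstream MCP session IDs.
type sessionInfo map[string]string // upstream name -> upstream session ID

// encodeSessionID serializes and encrypts the aggregated session info into a
// single opaque token that the proxy can hand back to clients.
func encodeSessionID(key []byte, info sessionInfo) (string, error) {
	plain, err := json.Marshal(info)
	if err != nil {
		return "", err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// Prepend the nonce so decodeSessionID can recover it.
	sealed := gcm.Seal(nonce, nonce, plain, nil)
	return base64.RawURLEncoding.EncodeToString(sealed), nil
}

// decodeSessionID reverses encodeSessionID on the proxy side.
func decodeSessionID(key []byte, token string) (sessionInfo, error) {
	sealed, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil {
		return nil, err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, fmt.Errorf("token too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ct, nil)
	if err != nil {
		return nil, err
	}
	var info sessionInfo
	if err := json.Unmarshal(plain, &info); err != nil {
		return nil, err
	}
	return info, nil
}

func main() {
	key := make([]byte, 32) // in practice derived from a configured secret
	info := sessionInfo{"echo": "sess-123", "weather": "sess-456"}
	token, _ := encodeSessionID(key, info)
	decoded, _ := decodeSessionID(key, token)
	fmt.Println(decoded["echo"]) // prints "sess-123": the token round-trips with no server-side store
}
```

Because all routing state lives inside the token itself, any proxy replica holding the key can decode it, which is what makes horizontal scaling possible without a shared session store.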
MCP proxy benchmark comparison
In this post, we’ll focus on how the benchmarks compare with other proxies in the ecosystem.
The Envoy AI Gateway blog post shows the performance of the MCP proxy for different configurations of the encryption algorithm strength. I recommend reading that blog post first, to fully understand the design and the benchmark comparison in the following sections.
Benchmark comparison
I’ve run the benchmarks on my MacBook Pro 17,1 (M1) 8-core laptop, comparing three scenarios:
- No proxy - Used as a baseline to understand the overhead added by the MCP proxies.
- Envoy AI Gateway - Running in standalone mode as a single process on my laptop, with 100 key derivation iterations used as the encryption setup.
- Agent Gateway - Running as a process on my laptop.
The benchmarks consist of a simple MCP interaction that calls an “echo” tool, run several times for each scenario (the raw numbers are listed in the “Running the benchmarks” section below).
Both implementations add roughly 0.25ms–0.3ms of latency over direct calls to the upstream MCP server, and the average difference between Envoy AI Gateway and Agent Gateway is ~0.03ms, which can be considered negligible in real-world scenarios.
MCP tool calls are part of a wider conversation with an LLM (or SLM). Large Language Model reasoning usually takes several seconds, so sub-millisecond latency in MCP tool calls becomes negligible in real world use cases.
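To make these numbers concrete, here is a small Go calculation over the per-scenario averages I computed from the raw ns/op figures later in this post; the 3-second LLM turn is an illustrative assumption, not a measured value.

```go
package main

import "fmt"

// Per-scenario averages (ns/op) computed from the ten benchmark runs in this post.
const (
	noProxyNs = 109380.0
	eaigwNs   = 393233.0
	agentGWNs = 362996.0
)

// overheadMS converts a ns/op difference into milliseconds.
func overheadMS(a, b float64) float64 { return (a - b) / 1e6 }

func main() {
	fmt.Printf("EAIGW overhead over no proxy:     %.2f ms\n", overheadMS(eaigwNs, noProxyNs))  // 0.28 ms
	fmt.Printf("Agent Gateway overhead:           %.2f ms\n", overheadMS(agentGWNs, noProxyNs)) // 0.25 ms
	fmt.Printf("EAIGW vs Agent Gateway:           %.2f ms\n", overheadMS(eaigwNs, agentGWNs))   // 0.03 ms
	// Against an assumed 3-second LLM turn, the 0.03 ms gap is ~0.001% of
	// the end-to-end latency of the conversation.
	fmt.Printf("share of a 3s LLM call: %.4f%%\n", overheadMS(eaigwNs, agentGWNs)/3000*100)
}
```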
The Envoy AI Gateway default encryption settings trade some speed for security, but performance remains well within a reasonable range for real use cases, making them a good default tradeoff. Users who need more speed can tune the settings accordingly.
Running the benchmarks
The numbers above have been taken on my laptop, and I encourage readers to run the benchmarks on their own. Here is how to do that:
# Install Envoy AI Gateway
git clone https://github.com/envoyproxy/ai-gateway.git
cd ai-gateway
go install ./cmd/aigw
# Install Agent Gateway
curl https://raw.githubusercontent.com/agentgateway/agentgateway/refs/heads/main/common/scripts/get-agentgateway | bash

# Create the Agent Gateway config file
cat >agentgateway.yaml <<EOF
config:
  adminAddr: 0.0.0.0:15000
binds:
- port: 4000
  listeners:
  - routes:
    - backends:
      - mcp:
          targets:
          - name: echo
            mcp:
              # Upstream MCP started by the bench tests
              host: http://localhost:8080/mcp
      policies:
        cors:
          allowOrigins:
          - "*"
          allowHeaders:
          - mcp-protocol-version
          - content-type
          - cache-control
EOF

Update and run the benchmark tests
To include Agent Gateway in the Envoy AI Gateway benchmarks, you can apply the following
patch, taking care to update the path to the agentgateway.yaml file to the right location:
diff --git a/tests/bench/bench_test.go b/tests/bench/bench_test.go
index 61b45dfd..77dc1394 100644
--- a/tests/bench/bench_test.go
+++ b/tests/bench/bench_test.go
@@ -87,6 +87,12 @@ func setupBenchmark(b *testing.B) []MCPBenchCase {
`--mcp-json={"mcpServers":{"aigw":{"type":"http","url":"http://localhost:8080/mcp"}}}`,
},
},
+ {
+ Name: "Agent_Gateway",
+ TestAddr: "http://localhost:4000/mcp",
+ ProxyBinary: "agentgateway",
+ ProxyArgs: []string{"-f", "./agentgateway.yaml"},
+ },
}
}
Once updated, you can run the benchmarks as follows:
go test -timeout=15m -run='^$' -bench=. -count=10 ./tests/bench/...

On my MacBook Pro 17,1 (M1) 8-core laptop, I got the following results:
BenchmarkMCP/No_Proxy-8 10651 110963 ns/op
BenchmarkMCP/No_Proxy-8 9868 108272 ns/op
BenchmarkMCP/No_Proxy-8 10000 108742 ns/op
BenchmarkMCP/No_Proxy-8 10000 112245 ns/op
BenchmarkMCP/No_Proxy-8 10000 110956 ns/op
BenchmarkMCP/No_Proxy-8 9236 110825 ns/op
BenchmarkMCP/No_Proxy-8 10000 107458 ns/op
BenchmarkMCP/No_Proxy-8 10000 109067 ns/op
BenchmarkMCP/No_Proxy-8 9290 107687 ns/op
BenchmarkMCP/No_Proxy-8 9674 107584 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3007 394069 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3015 399953 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 2911 393956 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 2989 394526 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3003 390494 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 2998 394805 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 2968 390968 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3027 390652 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3013 391191 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3001 391717 ns/op
BenchmarkMCP/Agent_Gateway-8 3548 336899 ns/op
BenchmarkMCP/Agent_Gateway-8 3644 331865 ns/op
BenchmarkMCP/Agent_Gateway-8 3579 335559 ns/op
BenchmarkMCP/Agent_Gateway-8 3402 332653 ns/op
BenchmarkMCP/Agent_Gateway-8 3456 340476 ns/op
BenchmarkMCP/Agent_Gateway-8 3477 335467 ns/op
BenchmarkMCP/Agent_Gateway-8 3614 347263 ns/op
BenchmarkMCP/Agent_Gateway-8 3564 443132 ns/op
BenchmarkMCP/Agent_Gateway-8 3180 449002 ns/op
BenchmarkMCP/Agent_Gateway-8 3312 377641 ns/op
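If you want to aggregate the raw runs yourself, a small helper like the following (my own sketch, not part of the repository) averages the ns/op column per benchmark name:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// summarize averages ns/op per benchmark name from `go test -bench` output lines.
func summarize(lines []string) map[string]float64 {
	sums := map[string]float64{}
	counts := map[string]float64{}
	for _, line := range lines {
		f := strings.Fields(line)
		// Expected shape: BenchmarkMCP/<name>-8  <iterations>  <ns> ns/op
		if len(f) < 4 || !strings.HasPrefix(f[0], "Benchmark") || f[3] != "ns/op" {
			continue
		}
		ns, err := strconv.ParseFloat(f[2], 64)
		if err != nil {
			continue
		}
		sums[f[0]] += ns
		counts[f[0]]++
	}
	avgs := map[string]float64{}
	for name, sum := range sums {
		avgs[name] = sum / counts[name]
	}
	return avgs
}

func main() {
	// A few sample lines from the output above; pipe in the full output instead.
	out := []string{
		"BenchmarkMCP/No_Proxy-8         10651  110963 ns/op",
		"BenchmarkMCP/No_Proxy-8          9868  108272 ns/op",
		"BenchmarkMCP/EAIGW_Inline_100-8  3007  394069 ns/op",
		"BenchmarkMCP/EAIGW_Inline_100-8  3015  399953 ns/op",
	}
	for name, avg := range summarize(out) {
		fmt.Printf("%s: %.0f ns/op avg\n", name, avg)
	}
}
```

Over the full ten runs above, this yields roughly 109,380 ns/op for No_Proxy, 393,233 ns/op for EAIGW_Inline_100, and 362,996 ns/op for Agent_Gateway, the averages discussed earlier.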
Conclusion
As we have seen, Envoy AI Gateway offers almost identical performance to Agent Gateway. Most of the overhead comes from session encryption, not from MCP traffic handling, and the default settings can easily be tuned to meet users’ needs.
Here are the key takeaways:
- MCP interactions happen in the context of conversations with Language Models, which usually take seconds to complete. In these real-world scenarios, sub-millisecond latency for MCP access is not a common concern.
- Envoy AI Gateway offers almost identical performance to Agent Gateway, while providing the entire Envoy proxy feature set.
- The design of the MCP proxy in Envoy AI Gateway does not require a central, persistent session store, allowing it to easily scale horizontally without external dependencies.
- Envoy has been running successfully in production for more than 10 years; the fact that it now offers AI features with performance similar to other proxies in the ecosystem makes it a proven, reliable choice for handling AI traffic at scale in production.