Envoy AI Gateway MCP Performance
A closer look at the performance misconceptions of the MCP proxy implementation in Envoy AI Gateway
Envoy AI Gateway MCP proxy implementation
In the past few weeks, there has been increasing discussion, and some misconceptions have arisen, around the performance of Envoy AI Gateway, in particular its implementation of the Model Context Protocol (MCP).
There have been claims that the performance of the MCP proxy implementation is significantly worse than that of other proxies in the ecosystem. However, those claims were based on measurements taken with Envoy AI Gateway running with defaults that deliberately trade performance for security. The Envoy community has published a blog post that addresses these misconceptions around the performance of the MCP implementation.
As the Envoy AI Gateway blog post explains, there is a fundamental design choice in the MCP routing implementation:
- When brokering access to multiple upstream MCP servers, Envoy AI Gateway encodes the aggregated session information for upstreams into a single MCP Session ID.
- This Session ID is encrypted (to prevent leaking internal details) and returned to the clients.
- When clients send back this Session ID, Envoy AI Gateway decodes it and uses the session information to route the request to the right upstream MCP server.
While encryption introduces some overhead, this design allows Envoy AI Gateway to remain stateless even though MCP is a stateful protocol. This means the Envoy AI Gateway MCP proxy can be easily scaled horizontally without the need for access to a central, persistent session store.
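The stateless design described above can be illustrated with a minimal sketch: the per-upstream session map is serialized, sealed with an authenticated cipher, and returned to the client as one opaque token; on the next request the proxy unseals it to recover the routing information. This is only a conceptual illustration, not the actual Envoy AI Gateway code: the map shape, the AES-GCM choice, and the function names are assumptions.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// encodeSessionID seals the per-upstream session map into a single opaque
// token, so the proxy itself needs no session store (hypothetical sketch).
func encodeSessionID(key []byte, upstreamSessions map[string]string) (string, error) {
	plaintext, err := json.Marshal(upstreamSessions)
	if err != nil {
		return "", err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// Prepend the nonce so decoding is self-contained.
	sealed := gcm.Seal(nonce, nonce, plaintext, nil)
	return base64.RawURLEncoding.EncodeToString(sealed), nil
}

// decodeSessionID reverses encodeSessionID, recovering the routing map.
func decodeSessionID(key []byte, token string) (map[string]string, error) {
	sealed, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil {
		return nil, err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, fmt.Errorf("token too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	plaintext, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return nil, err
	}
	var m map[string]string
	if err := json.Unmarshal(plaintext, &m); err != nil {
		return nil, err
	}
	return m, nil
}

func main() {
	key := make([]byte, 32) // in practice derived from a secret via key-derivation iterations
	sessions := map[string]string{"echo": "upstream-session-abc"}
	token, _ := encodeSessionID(key, sessions)
	recovered, _ := decodeSessionID(key, token)
	fmt.Println(recovered["echo"]) // upstream-session-abc
}
```

Because everything needed for routing travels inside the token, any replica holding the key can serve any request, which is what makes horizontal scaling trivial.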
MCP proxy benchmark comparison
In this post, we’ll focus on how Envoy AI Gateway’s benchmark results compare with those of other proxies in the ecosystem.
The Envoy AI Gateway blog post shows the performance of the MCP proxy for different configurations of the encryption algorithm strength. I recommend reading that blog post first, to fully understand the design and the benchmark comparison in the following sections.
Benchmark comparison
I’ve run the benchmarks on my MacBook Pro 17,1 (M1) 8-core laptop, comparing three scenarios:
- No proxy - Used as a baseline to understand the overhead added by the MCP proxies.
- Envoy AI Gateway - Running in standalone mode as a single process on my laptop, with the encryption configured to use 100 key-derivation iterations.
- Agent Gateway - Running as a process on my laptop.
The benchmarks consist of a simple MCP interaction that calls an “echo” tool. After running the benchmarks several times (the raw numbers are listed in the Running the benchmarks section below), both implementations land in the ~0.16ms-0.39ms per-call range, compared to ~0.08ms for direct calls to the upstream MCP server. We can also see that the average difference between Envoy AI Gateway and Agent Gateway is ~0.2ms, which could be considered negligible in real-world scenarios.
MCP tool calls are part of a wider conversation with an LLM (or SLM). Large Language Model reasoning usually takes several seconds, so sub-millisecond latency in MCP tool calls becomes negligible in real world use cases.
The Envoy AI Gateway default encryption settings are slower, but they remain well within a reasonable range for real use cases and strike a good tradeoff between security and speed. Users can still tune the settings to meet their latency needs.
Running the benchmarks
The numbers above have been taken on my laptop, and I encourage readers to run the benchmarks on their own. Here is how to do that:
# Install Envoy AI Gateway
git clone https://github.com/envoyproxy/ai-gateway.git
cd ai-gateway
make build.aigw
# Install Agent Gateway
curl https://raw.githubusercontent.com/agentgateway/agentgateway/refs/heads/main/common/scripts/get-agentgateway | bash
# Create the Agent Gateway config file
cat >agentgateway.yaml <<EOF
config:
  adminAddr: 0.0.0.0:15000
binds:
- port: 4000
  listeners:
  - routes:
    - backends:
      - mcp:
          targets:
          - name: echo
            mcp:
              # Upstream MCP started by the bench tests
              host: http://localhost:8080/mcp
      policies:
        cors:
          allowOrigins:
          - "*"
          allowHeaders:
          - mcp-protocol-version
          - content-type
          - cache-control
EOF
Update and run the benchmark tests
To include Agent Gateway in the Envoy AI Gateway benchmarks, you can apply the following
patch, taking care to update the path to the agentgateway.yaml file to the right location:
diff --git a/tests/bench/bench_test.go b/tests/bench/bench_test.go
index 61b45dfd..77dc1394 100644
--- a/tests/bench/bench_test.go
+++ b/tests/bench/bench_test.go
@@ -87,6 +87,12 @@ func setupBenchmark(b *testing.B) []MCPBenchCase {
`--mcp-json={"mcpServers":{"aigw":{"type":"http","url":"http://localhost:8080/mcp"}}}`,
},
},
+ {
+ Name: "Agent_Gateway",
+ TestAddr: "http://localhost:4000/mcp",
+ ProxyBinary: "agentgateway",
+ ProxyArgs: []string{"-f", "./agentgateway.yaml"},
+ },
}
}
Once updated, you can run the benchmarks as follows:
go test -timeout=15m -run='^$' -bench=. -count=10 ./tests/bench/...
On my MacBook Pro 17,1 (M1) 8-core laptop, I got the following results:
BenchmarkMCP/Baseline_NoProxy-8 14703 79143 ns/op
BenchmarkMCP/Baseline_NoProxy-8 15224 80604 ns/op
BenchmarkMCP/Baseline_NoProxy-8 15004 78925 ns/op
BenchmarkMCP/Baseline_NoProxy-8 14907 126648 ns/op
BenchmarkMCP/Baseline_NoProxy-8 14720 81119 ns/op
BenchmarkMCP/Baseline_NoProxy-8 14830 81568 ns/op
BenchmarkMCP/Baseline_NoProxy-8 14926 80838 ns/op
BenchmarkMCP/Baseline_NoProxy-8 14893 80829 ns/op
BenchmarkMCP/Baseline_NoProxy-8 15056 80135 ns/op
BenchmarkMCP/Baseline_NoProxy-8 15096 80058 ns/op
BenchmarkMCP/EAIGW_Config_100-8 2964 390887 ns/op
BenchmarkMCP/EAIGW_Config_100-8 2930 399122 ns/op
BenchmarkMCP/EAIGW_Config_100-8 3200 380820 ns/op
BenchmarkMCP/EAIGW_Config_100-8 3098 380291 ns/op
BenchmarkMCP/EAIGW_Config_100-8 3074 383210 ns/op
BenchmarkMCP/EAIGW_Config_100-8 3114 381255 ns/op
BenchmarkMCP/EAIGW_Config_100-8 2810 375857 ns/op
BenchmarkMCP/EAIGW_Config_100-8 3135 384219 ns/op
BenchmarkMCP/EAIGW_Config_100-8 3103 464622 ns/op
BenchmarkMCP/EAIGW_Config_100-8 3135 378941 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 2977 389766 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3019 401704 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3072 390002 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3098 390813 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3050 386559 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3050 383955 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3080 385445 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3110 382815 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3040 381472 ns/op
BenchmarkMCP/EAIGW_Inline_100-8 3066 384949 ns/op
BenchmarkMCP/Agent_Gateway-8 7155 162183 ns/op
BenchmarkMCP/Agent_Gateway-8 6952 161342 ns/op
BenchmarkMCP/Agent_Gateway-8 7411 161223 ns/op
BenchmarkMCP/Agent_Gateway-8 7428 156689 ns/op
BenchmarkMCP/Agent_Gateway-8 7713 157361 ns/op
BenchmarkMCP/Agent_Gateway-8 7794 162505 ns/op
BenchmarkMCP/Agent_Gateway-8 7606 206050 ns/op
BenchmarkMCP/Agent_Gateway-8 7509 158908 ns/op
BenchmarkMCP/Agent_Gateway-8 7226 157775 ns/op
BenchmarkMCP/Agent_Gateway-8 7248 160533 ns/op
Conclusion
As we have seen, Envoy AI Gateway offers almost identical performance to Agent Gateway. Most of the overhead comes from session encryption (not from MCP traffic handling), and the default settings can easily be tuned to meet users’ needs.
Here are the key takeaways:
- MCP interactions happen in the context of conversations with Language Models, which usually take seconds to complete. In these real-world scenarios, sub-millisecond latency for MCP access is not a common concern.
- Envoy AI Gateway offers almost identical performance to Agent Gateway, while providing the entire Envoy proxy feature set.
- The design of the MCP proxy in Envoy AI Gateway does not require a central, persistent session store, allowing it to easily scale horizontally without external dependencies.
- The fact that Envoy has been successfully running in production for more than 10 years, and can leverage AI features with similar performance to other proxies in the ecosystem, makes it a proven and reliable proxy to handle AI traffic at scale in production.