Envoy AI Gateway MCP Performance

A closer look at the performance misconceptions of the MCP proxy implementation in Envoy AI Gateway

Envoy AI Gateway MCP proxy implementation

Over the past few weeks, discussion of Envoy AI Gateway's performance has increased, and some misconceptions have arisen, particularly about its implementation of the Model Context Protocol (MCP).

Some claims state that the MCP proxy implementation performs significantly worse than other proxies in the ecosystem. However, those claims were based on measurements made with Envoy AI Gateway running with defaults that are known to trade performance for security. The Envoy community has published a blog post that addresses these misconceptions.

As the Envoy AI Gateway blog post explains, there is a fundamental design choice in the MCP routing implementation:

  • When brokering access to multiple upstream MCP servers, Envoy AI Gateway encodes the aggregated session information for upstreams into a single MCP Session ID.
  • This Session ID is encrypted (to prevent leaking internal details) and returned to the clients.
  • When clients send back this Session ID, Envoy AI Gateway decodes it and uses the session information to route the request to the right upstream MCP server.

Note

While encryption introduces some overhead, this design allows Envoy AI Gateway to remain stateless even though MCP is a stateful protocol. This means the Envoy AI Gateway MCP proxy can be easily scaled horizontally without the need for access to a central, persistent session store.
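
The encode, encrypt, decode, and route flow described above can be sketched roughly as follows. This is a minimal illustration, not the Envoy AI Gateway code: it assumes PBKDF2-HMAC-SHA256 for key derivation and AES-GCM for encryption, and the names `upstreamSessions`, `encodeSessionID`, `decodeSessionID`, the backend name, and the secret are all hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// upstreamSessions maps a (hypothetical) backend name to the session ID
// issued by that upstream MCP server.
type upstreamSessions map[string]string

const saltSize = 16

// deriveKey is PBKDF2-HMAC-SHA256 limited to one 32-byte output block.
// The iteration count is the knob that trades speed for brute-force cost.
func deriveKey(secret, salt []byte, iterations int) []byte {
	mac := hmac.New(sha256.New, secret)
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1}) // block index 1, big-endian
	u := mac.Sum(nil)
	key := append([]byte(nil), u...)
	for i := 1; i < iterations; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(nil)
		for j := range key {
			key[j] ^= u[j]
		}
	}
	return key
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encodeSessionID packs all upstream sessions into one opaque, encrypted ID
// laid out as salt || nonce || ciphertext, base64url-encoded.
func encodeSessionID(secret []byte, iterations int, s upstreamSessions) (string, error) {
	plain, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	salt := make([]byte, saltSize)
	if _, err := rand.Read(salt); err != nil {
		return "", err
	}
	gcm, err := newGCM(deriveKey(secret, salt, iterations))
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	out := gcm.Seal(append(salt, nonce...), nonce, plain, nil)
	return base64.RawURLEncoding.EncodeToString(out), nil
}

// decodeSessionID reverses encodeSessionID so the proxy can look up the
// upstream session for the backend it is routing to, without a session store.
func decodeSessionID(secret []byte, iterations int, id string) (upstreamSessions, error) {
	raw, err := base64.RawURLEncoding.DecodeString(id)
	if err != nil {
		return nil, err
	}
	if len(raw) < saltSize {
		return nil, errors.New("session ID too short")
	}
	gcm, err := newGCM(deriveKey(secret, raw[:saltSize], iterations))
	if err != nil {
		return nil, err
	}
	rest := raw[saltSize:]
	if len(rest) < gcm.NonceSize() {
		return nil, errors.New("session ID too short")
	}
	plain, err := gcm.Open(nil, rest[:gcm.NonceSize()], rest[gcm.NonceSize():], nil)
	if err != nil {
		return nil, err
	}
	var s upstreamSessions
	if err := json.Unmarshal(plain, &s); err != nil {
		return nil, err
	}
	return s, nil
}

func main() {
	secret := []byte("demo-secret") // hypothetical shared secret
	id, err := encodeSessionID(secret, 100, upstreamSessions{"echo": "upstream-session-1"})
	if err != nil {
		panic(err)
	}
	sessions, err := decodeSessionID(secret, 100, id)
	if err != nil {
		panic(err)
	}
	fmt.Println(sessions["echo"]) // prints "upstream-session-1"
}
```

Because every request carries everything needed to route it, any replica holding the same secret can serve it. In a scheme like this, the per-request cost is dominated by the key derivation loop, which is why the iteration count matters for latency.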

MCP proxy benchmark comparison

In this post, we’ll focus on how the benchmarks compare with other proxies in the ecosystem.

The Envoy AI Gateway blog post shows the performance of the MCP proxy for different configurations of the encryption algorithm strength. I recommend reading that blog post first, to fully understand the design and the benchmark comparison in the following sections.

Benchmark comparison

I’ve run the benchmarks on my MacBook Pro 17,1 (M1) 8-core laptop, comparing three scenarios:

  • No proxy - Used as a baseline to understand the overhead added by the MCP proxies.
  • Envoy AI Gateway - Running in standalone mode as a single process on my laptop, with 100 key derivation iterations used as the encryption setup.
  • Agent Gateway - Running as a process on my laptop.

The benchmarks consist of a simple MCP interaction that calls an “echo” tool. After running the benchmarks several times, the following results were obtained:

Figure: Envoy AI Gateway Comparison

As we can see, calls through either proxy complete in roughly 0.16 ms to 0.39 ms, compared with about 0.08 ms for direct calls to the upstream MCP server. The average difference between Envoy AI Gateway and Agent Gateway is ~0.2 ms, which is negligible in real-world scenarios.

Info

MCP tool calls are part of a wider conversation with an LLM (or SLM). Large Language Model reasoning usually takes several seconds, so sub-millisecond latency in MCP tool calls becomes negligible in real-world use cases.

The default Envoy AI Gateway encryption settings are slower, but they remain within a reasonable range for real use cases and offer a good tradeoff between security and speed. Users can still tune the settings to meet their performance needs.

Running the benchmarks

The numbers above were measured on my laptop, and I encourage readers to run the benchmarks themselves. Here is how to do that:

Get the latest Envoy AI Gateway and Agent Gateway
# Install Envoy AI Gateway
git clone https://github.com/envoyproxy/ai-gateway.git
cd ai-gateway
make build.aigw

# Install Agent Gateway
curl https://raw.githubusercontent.com/agentgateway/agentgateway/refs/heads/main/common/scripts/get-agentgateway | bash
Create the Agent Gateway config file
# Create the Agent Gateway config file
cat >agentgateway.yaml <<EOF
config:
  adminAddr: 0.0.0.0:15000
binds:
  - port: 4000
    listeners:
      - routes:
          - backends:
              - mcp:
                  targets:
                    - name: echo
                      mcp:
                        # Upstream MCP started by the bench tests
                        host: http://localhost:8080/mcp
            policies:
              cors:
                allowOrigins:
                  - "*"
                allowHeaders:
                  - mcp-protocol-version
                  - content-type
                  - cache-control
EOF

Update and run the benchmark tests

To include Agent Gateway in the Envoy AI Gateway benchmarks, you can apply the following patch, taking care to update the path to the agentgateway.yaml file to the right location:

diff --git a/tests/bench/bench_test.go b/tests/bench/bench_test.go
index 61b45dfd..77dc1394 100644
--- a/tests/bench/bench_test.go
+++ b/tests/bench/bench_test.go
@@ -87,6 +87,12 @@ func setupBenchmark(b *testing.B) []MCPBenchCase {
 				`--mcp-json={"mcpServers":{"aigw":{"type":"http","url":"http://localhost:8080/mcp"}}}`,
 			},
 		},
+		{
+			Name:        "Agent_Gateway",
+			TestAddr:    "http://localhost:4000/mcp",
+			ProxyBinary: "agentgateway",
+			ProxyArgs:   []string{"-f", "./agentgateway.yaml"},
+		},
 	}
 }

Once updated, you can run the benchmarks as follows:

go test -timeout=15m -run='^$' -bench=. -count=10 ./tests/bench/...

On my MacBook Pro 17,1 (M1) 8-core laptop, I got the following results:

BenchmarkMCP/Baseline_NoProxy-8         	   14703	     79143 ns/op
BenchmarkMCP/Baseline_NoProxy-8         	   15224	     80604 ns/op
BenchmarkMCP/Baseline_NoProxy-8         	   15004	     78925 ns/op
BenchmarkMCP/Baseline_NoProxy-8         	   14907	    126648 ns/op
BenchmarkMCP/Baseline_NoProxy-8         	   14720	     81119 ns/op
BenchmarkMCP/Baseline_NoProxy-8         	   14830	     81568 ns/op
BenchmarkMCP/Baseline_NoProxy-8         	   14926	     80838 ns/op
BenchmarkMCP/Baseline_NoProxy-8         	   14893	     80829 ns/op
BenchmarkMCP/Baseline_NoProxy-8         	   15056	     80135 ns/op
BenchmarkMCP/Baseline_NoProxy-8         	   15096	     80058 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    2964	    390887 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    2930	    399122 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    3200	    380820 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    3098	    380291 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    3074	    383210 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    3114	    381255 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    2810	    375857 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    3135	    384219 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    3103	    464622 ns/op
BenchmarkMCP/EAIGW_Config_100-8         	    3135	    378941 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    2977	    389766 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    3019	    401704 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    3072	    390002 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    3098	    390813 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    3050	    386559 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    3050	    383955 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    3080	    385445 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    3110	    382815 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    3040	    381472 ns/op
BenchmarkMCP/EAIGW_Inline_100-8         	    3066	    384949 ns/op
BenchmarkMCP/Agent_Gateway-8            	    7155	    162183 ns/op
BenchmarkMCP/Agent_Gateway-8            	    6952	    161342 ns/op
BenchmarkMCP/Agent_Gateway-8            	    7411	    161223 ns/op
BenchmarkMCP/Agent_Gateway-8            	    7428	    156689 ns/op
BenchmarkMCP/Agent_Gateway-8            	    7713	    157361 ns/op
BenchmarkMCP/Agent_Gateway-8            	    7794	    162505 ns/op
BenchmarkMCP/Agent_Gateway-8            	    7606	    206050 ns/op
BenchmarkMCP/Agent_Gateway-8            	    7509	    158908 ns/op
BenchmarkMCP/Agent_Gateway-8            	    7226	    157775 ns/op
BenchmarkMCP/Agent_Gateway-8            	    7248	    160533 ns/op
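
To turn the raw runs above into per-scenario figures, a short Go sketch like the following can be used. It hard-codes the ns/op values listed above and reports the median per scenario. The median is my choice here rather than part of the benchmark suite, because each scenario contains one visibly slower outlier run.

```go
package main

import (
	"fmt"
	"sort"
)

// results holds the ns/op values from the ten runs of each scenario above.
var results = map[string][]float64{
	"Baseline_NoProxy": {79143, 80604, 78925, 126648, 81119, 81568, 80838, 80829, 80135, 80058},
	"EAIGW_Config_100": {390887, 399122, 380820, 380291, 383210, 381255, 375857, 384219, 464622, 378941},
	"EAIGW_Inline_100": {389766, 401704, 390002, 390813, 386559, 383955, 385445, 382815, 381472, 384949},
	"Agent_Gateway":    {162183, 161342, 161223, 156689, 157361, 162505, 206050, 158908, 157775, 160533},
}

// median returns the middle value of xs (or the mean of the two middle
// values for an even count) without modifying the input slice.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n == 0 {
		return 0
	}
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

func main() {
	base := median(results["Baseline_NoProxy"])
	order := []string{"Baseline_NoProxy", "EAIGW_Config_100", "EAIGW_Inline_100", "Agent_Gateway"}
	for _, name := range order {
		m := median(results[name])
		// Convert nanoseconds to milliseconds for readability.
		fmt.Printf("%-18s median %.3f ms, overhead vs baseline %.3f ms\n",
			name, m/1e6, (m-base)/1e6)
	}
}
```

With these numbers, the medians come out at about 0.08 ms for the baseline, 0.38 ms to 0.39 ms for Envoy AI Gateway, and 0.16 ms for Agent Gateway, so the gap between the two proxies is about 0.22 ms per call.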

Conclusion

As we have seen, Envoy AI Gateway offers almost identical performance to Agent Gateway. Most of the overhead comes from session encryption (not from MCP traffic handling), and the default settings can be easily tuned to meet users’ needs.

Here are the key takeaways:

  • MCP interactions happen in the context of conversations with Language Models, which usually take seconds to complete. In these real-world scenarios, sub-millisecond latency for MCP access is not a common concern.
  • Envoy AI Gateway offers almost identical performance to Agent Gateway, while providing the entire Envoy proxy feature set.
  • The design of the MCP proxy in Envoy AI Gateway does not require a central, persistent session store, allowing it to easily scale horizontally without external dependencies.
  • The fact that Envoy has been successfully running in production for more than 10 years, and can leverage AI features with similar performance to other proxies in the ecosystem, makes it a proven and reliable proxy to handle AI traffic at scale in production.