Envoy AI Gateway from Concept to Reality: Tetrate and Bloomberg's Journey to Standardizing LLM Routing
The explosion of GenAI has introduced a new kind of infrastructure challenge: managing LLM traffic across multiple providers, environments, and teams.
To meet this challenge, Bloomberg and Tetrate partnered, not behind closed doors, but openly, on a shared goal: to build a scalable, secure, and flexible way to route GenAI traffic.
The result is Envoy AI Gateway, an extension of Envoy Proxy and Envoy Gateway, developed as a solution within the Envoy Project.
Continue reading to learn more about the journey of taking Envoy AI Gateway from concept to reality.
Solving a Shared Problem, Together
Bloomberg engineers proposed expanding Envoy Gateway's functionality to tackle the challenges of LLM traffic. Because they needed to expose LLMs hosted both internally and externally to their applications, engaging the open source community was a natural choice.
Tetrate, a heavy contributor to Envoy Gateway and a member of its maintainer team, welcomed the idea.
Together, we co-developed a solution in the open that could:
- Unify traffic to external and internal model backends
- Handle per-provider auth securely at the edge
- Enforce policies and protect budgets via token-aware rate limiting
- Provide visibility into GenAI usage across teams and providers
This work was shared from the start with the Envoy community to benefit a broad set of adopters.
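To make the first of those goals concrete, here is a minimal client-side sketch. It assumes the gateway exposes its OpenAI-compatible chat completions endpoint at a placeholder hostname; the token and model names are illustrative, not values from this post.

```python
# Minimal sketch: calling models behind Envoy AI Gateway through its
# OpenAI-compatible endpoint. Hostname, token, and model names are
# placeholders for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="https://ai-gateway.example.com/v1",  # the gateway, not a provider
    api_key="GATEWAY_ISSUED_TOKEN",                # provider keys never leave the edge
)

# One client, two destinations: the gateway routes on the requested model
# name, so an external provider and a self-hosted model share one API.
for model in ["gpt-4o-mini", "in-house-llama"]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Hello from behind the gateway"}],
    )
    print(model, "->", resp.choices[0].message.content)
```

Because provider credentials are injected at the edge, the application only ever holds a gateway-issued token; routing to the right backend is the gateway's job, not the client's.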
Built on a Strong Foundation: Envoy + Envoy Gateway
Instead of reinventing the wheel, we extended the tools many teams already use.
- Envoy Proxy: A graduated CNCF project trusted by companies worldwide to handle high-scale traffic
- Envoy Gateway: A simplification layer that brings Kubernetes-native API Gateway features to Envoy, with a modern control plane and Gateway API support
- Envoy AI Gateway: An extension of Envoy Gateway, purpose-built to handle the unique traffic and governance needs of GenAI platforms
This composable foundation gave us:
- Proven scalability
- Extensibility through filters
- Native support for Kubernetes Gateway API
- A familiar operational model for platform engineers
From Community Talks to Architecture Patterns
Since launching the project, we’ve contributed code, given talks, and worked with early adopters to test and refine the approach.
- 🎤 Keynote at KubeCon introduced the mission and early design
- 🛠️ Releases added upstream auth, token-based rate limiting, OpenTelemetry support, and so much more!
- 📐 Now, we’ve published a Reference Architecture to show you how it all fits together
The architecture introduces a Two-Tier Gateway Model:
- Tier 1 Gateway: Unified frontend for external traffic, handling authentication, routing, and cost protection
- Tier 2 Gateway: Internal, cluster-level gateway deployed alongside tools like KServe, focused on self-hosted model traffic and fine-grained controls
This pattern provides platform teams with autonomy without compromising control or visibility.
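From a client's perspective, the two tiers are simply two entry points that speak the same OpenAI-compatible API. The hostnames, tokens, and model name in this sketch are assumptions for illustration:

```python
# Hypothetical endpoints for the two tiers; application code is identical
# against either one.
from openai import OpenAI

# Tier 1: the unified edge for external traffic. Authentication, provider
# routing, and cost protection happen here before anything goes upstream.
edge = OpenAI(base_url="https://ai.example.com/v1", api_key="EDGE_TOKEN")

# Tier 2: a cluster-internal gateway fronting self-hosted models served by
# tools like KServe, with its own fine-grained controls.
internal = OpenAI(
    base_url="http://ai-gateway.model-serving.svc.cluster.local/v1",
    api_key="INTERNAL_TOKEN",
)

reply = internal.chat.completions.create(
    model="self-hosted-llama",  # resolved by the Tier 2 gateway
    messages=[{"role": "user", "content": "ping"}],
)
print(reply.choices[0].message.content)
```

In this pattern, external applications enter through Tier 1, while in-cluster workloads can address Tier 2 directly for self-hosted models.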
Real Impact: An Enterprise-Ready OSS Foundation
Collaborating with an end user like Bloomberg ensured that the architecture addresses foundational enterprise needs.
A few examples include:
- Support both OpenAI and internal models behind a single API
- Manage provider credentials securely
- Add governance and insight without creating friction for developers
- Enable confident model usage across teams
- Integrate with enterprise OIDC providers
The journey demonstrates what’s possible when companies collaborate openly, bringing practical and scalable solutions to the entire ecosystem.
Learn more about how you can use Envoy AI Gateway through presentations, conversations, and podcasts from the team behind the project:
- The Explorer’s Guide To Cloud Native GenAI Platform Engineering - Max Körbächer & Alexa Griffith
- Access AI Models Anywhere: Scaling AI Traffic With Envoy AI Gateway - Dan Sun & Takeshi Yoneda
- Keynote: Platform Alchemy: Transforming Kubernetes Into Generative AI Gold - Alexa Griffith & Mauricio “Salaboy” Salatino
- Kubernetes, AI Gateways, and the Future of MLOps // Alexa Griffith // MLOps Podcast #294
- GenAI Traffic: Why API Infrastructure Must Evolve… Again // Erica Hughberg // MLOps Podcast #296
- Cloud Native Live: Enabling AI adoption at scale
Why It Matters for You
Most platform teams building for GenAI face these problems:
- A long list of LLM providers, each with its quirks
- Credential sprawl and secret rotation overhead
- Inconsistent usage tracking or cost overruns
- Custom integration glue that doesn’t scale
Envoy AI Gateway addresses these challenges.
- 🛡️ Security: Inject upstream credentials at the edge
- 📊 Visibility: Centralize logs and metrics across providers
- 🪪 Governance: Set policies per team, per route, and per user
- 🧩 Flexibility: Deploy in any cloud, use any provider, and plug in your own auth
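The governance point deserves a closer look. Token-aware rate limiting budgets what LLM traffic actually costs (tokens consumed) rather than how often it arrives (request count). The sketch below is a conceptual illustration only, with made-up names and numbers; Envoy AI Gateway enforces this inside the proxy, not in application code.

```python
# Conceptual sketch of token-aware rate limiting: budget tokens, not
# requests. Not Envoy AI Gateway's implementation.
import time

class TokenBudget:
    """Tracks LLM token spend per team over a sliding window."""

    def __init__(self, limit_tokens: int, window_seconds: int):
        self.limit = limit_tokens
        self.window = window_seconds
        self.spend: dict[str, list[tuple[float, int]]] = {}

    def record(self, team: str, tokens_used: int) -> None:
        # Called after each response, using provider-reported usage
        # (e.g. resp.usage.total_tokens from an OpenAI-style reply).
        self.spend.setdefault(team, []).append((time.time(), tokens_used))

    def allowed(self, team: str) -> bool:
        # Admit a request while the team's recent token spend, not its
        # request count, stays under the configured budget.
        cutoff = time.time() - self.window
        recent = [(t, n) for t, n in self.spend.get(team, []) if t >= cutoff]
        self.spend[team] = recent
        return sum(n for _, n in recent) < self.limit

budget = TokenBudget(limit_tokens=100_000, window_seconds=3600)
budget.record("search-team", tokens_used=2_500)
print(budget.allowed("search-team"))  # True until the hourly budget is spent
```

The point of the design: two requests can differ in cost by orders of magnitude, so counting requests alone cannot protect a budget.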
Start Building with the Reference Architecture
If you’re ready to explore how this can work in your environment, the Reference Architecture provides a complete walkthrough.
It includes guidance on:
- External vs. internal model routing
- Integration with KServe for model serving
- Token-aware policy enforcement
- Production-ready observability and telemetry
Whether you’re starting with an external provider or running your own hosted instances, this architecture grows with you.
Check out the Reference Architecture Blog Post on the Envoy AI Gateway Blog.
Built for the Community, with the Community
At Tetrate, we believe the best infrastructure is built in the open, together.
Envoy AI Gateway exists because of the collaboration between users like Bloomberg, maintainers such as the Envoy Gateway team, and contributors across the cloud-native community.
If you want to simplify your GenAI stack, reduce risk, and accelerate delivery, we invite you to join us.
Let’s build the next generation of AI platforms, together.