
Envoy AI Gateway from Concept to Reality: Tetrate and Bloomberg's Journey to Standardizing LLM Routing

The explosion of GenAI has introduced a new kind of infrastructure challenge: managing LLM traffic across multiple providers, environments, and teams.

To meet this challenge, Bloomberg and Tetrate partnered openly, not behind closed doors, on a shared goal: to build a scalable, secure, and flexible way to route GenAI traffic.

The result is Envoy AI Gateway, an extension of Envoy Proxy and Envoy Gateway built as a solution within the Envoy Project.

Check out the Reference Architecture Blog Post on the Envoy AI Gateway Blog.

Continue reading to learn more about the journey of taking Envoy AI Gateway from concept to reality.


Solving a Shared Problem, Together

Bloomberg engineers proposed expanding Envoy Gateway's functionality to tackle the challenges of LLM traffic. Because they needed to expose LLMs hosted both internally and externally to their applications, engaging with the open source community was a natural choice.

Tetrate, which had been contributing heavily to Envoy Gateway and is part of the maintainer team, welcomed the idea.

Together, we co-developed a solution in the open that could:

  • Unify traffic to external and internal model backends
  • Handle per-provider auth securely at the edge
  • Enforce policies and protect budgets via token-aware rate limiting
  • Provide visibility into GenAI usage across teams and providers

This work was shared from the start with the Envoy community to benefit a broad set of adopters.

📣 Read our joint announcement


Built on a Strong Foundation: Envoy + Envoy Gateway

Instead of reinventing the wheel, we extended the tools many teams already use.

  • Envoy Proxy: A graduated CNCF project trusted across the industry for handling high-scale production traffic
  • Envoy Gateway: A simplification layer that brings Kubernetes-native API Gateway features to Envoy, with a modern control plane and Gateway API support
  • Envoy AI Gateway: An extension of Envoy Gateway, purpose-built to handle the unique traffic and governance needs of GenAI platforms

This composable foundation gave us:

  • Proven scalability
  • Extensibility through filters
  • Native support for Kubernetes Gateway API
  • A familiar operational model for platform engineers

From Community Talks to Architecture Patterns

Since launching the project, we’ve contributed code, given talks, and worked with early adopters to test and refine the approach.

  • 🎤 Keynote at KubeCon introduced the mission and early design
  • 🛠️ Releases added upstream auth, token-based rate limiting, OpenTelemetry support, and so much more!
  • 📐 Now, we’ve published a Reference Architecture to show you how it all fits together

The architecture introduces a Two-Tier Gateway Model:

  • Tier 1 Gateway: Unified frontend for external traffic, handling authentication, routing, and cost protection
  • Tier 2 Gateway: Internal cluster-level gateway alongside tools like KServe, focused on self-hosted model traffic and fine-grained controls

This pattern provides platform teams with autonomy without compromising control or visibility.
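As a toy illustration of the routing decision a Tier 1 gateway makes in this model (model names and backend addresses here are hypothetical, not from the reference architecture itself):

```python
# Requests for externally hosted models go to the provider backend; anything
# self-hosted is forwarded to the Tier 2 (in-cluster) gateway, which applies
# its own fine-grained controls.
EXTERNAL_BACKENDS = {
    "gpt-4o": "https://api.openai.com/v1",
}

# Hypothetical in-cluster service address for the Tier 2 gateway.
TIER2_GATEWAY = "http://tier2-gateway.ml-serving.svc.cluster.local"

def route_model(model: str) -> str:
    """Return the upstream a Tier 1 gateway would select for a model."""
    if model in EXTERNAL_BACKENDS:
        return EXTERNAL_BACKENDS[model]
    # Everything else is assumed self-hosted (e.g. served via KServe)
    # and handled by the internal Tier 2 gateway.
    return TIER2_GATEWAY
```

In the real gateway this decision is expressed declaratively through routing configuration rather than application code, but the split of responsibilities is the same.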


Real Impact: An Enterprise-Ready OSS Foundation

Collaborating with an end user like Bloomberg ensured that the architecture caters to foundational enterprise needs.

A few examples include:

  • Support both OpenAI and internal models behind a single API
  • Manage provider credentials securely
  • Add governance and insight without creating friction for developers
  • Enable confident model usage across teams
  • Integrate with enterprise OIDC providers

The journey demonstrates what’s possible when companies collaborate openly, bringing practical and scalable solutions to the entire ecosystem.

Learn more about how you can use Envoy AI Gateway through the various presentations, conversations, and podcasts from the team behind the project.


Why It Matters for You

Most platform teams building for GenAI face these problems:

  • A long list of LLM providers, each with its own quirks
  • Credential sprawl and secret rotation overhead
  • Inconsistent usage tracking or cost overruns
  • Custom integration glue that doesn’t scale

Envoy AI Gateway addresses these challenges:

  • 🛡️ Security: Inject upstream credentials at the edge
  • 📊 Visibility: Centralize logs and metrics across providers
  • 🪪 Governance: Set policies per team, per route, per user
  • 🧩 Flexibility: Deploy in any cloud, use any provider, plug in your own auth
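The governance point builds on the token-aware rate limiting mentioned earlier. A minimal sketch of the idea, with hypothetical team names and budgets: each team gets a token budget, and the token usage reported in an OpenAI-style `usage` field is deducted from it.

```python
class TokenBudget:
    """Per-team token budget, drawn down by actual LLM token usage."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def allow(self, estimated_tokens: int) -> bool:
        """Admit a request only if the remaining budget can cover it."""
        return self.used + estimated_tokens <= self.limit

    def record(self, usage: dict) -> None:
        """Deduct the actual usage reported by the provider response."""
        self.used += usage.get("total_tokens", 0)

# Hypothetical per-team budgets.
budgets = {"team-a": TokenBudget(limit=1000)}

budget = budgets["team-a"]
if budget.allow(estimated_tokens=200):
    # ... forward the request, then record the real usage from the response
    budget.record({"prompt_tokens": 150, "completion_tokens": 80,
                   "total_tokens": 230})
```

This is what distinguishes token-aware limiting from plain request-rate limiting: a single expensive completion can consume far more budget than many cheap ones, so cost protection has to count tokens, not requests.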

Start Building with the Reference Architecture

If you’re ready to explore how this can work in your environment, the Reference Architecture provides a complete walkthrough.

It includes guidance on:

  • External vs. internal model routing
  • Integration with KServe for model serving
  • Token-aware policy enforcement
  • Production-ready observability and telemetry
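The observability item above amounts to folding per-request usage into centralized totals. A sketch of that accounting, assuming gateway access-log records with an OpenAI-style `usage` field (the record layout here is illustrative, not the gateway's actual log format):

```python
from collections import defaultdict

# Illustrative access-log records as a gateway might emit them.
records = [
    {"provider": "openai", "team": "team-a", "usage": {"total_tokens": 230}},
    {"provider": "self-hosted", "team": "team-a", "usage": {"total_tokens": 512}},
    {"provider": "openai", "team": "team-b", "usage": {"total_tokens": 90}},
]

def tokens_by_provider(records):
    """Aggregate total token usage per provider across all teams."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["provider"]] += rec["usage"].get("total_tokens", 0)
    return dict(totals)

totals = tokens_by_provider(records)
```

Because every provider's traffic flows through the same gateway, one aggregation like this covers external and self-hosted models alike.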

Whether you’re starting with an external provider or running your own hosted instances, this architecture grows with you.

Figure: Envoy AI Gateway Reference Architecture



Built for the Community, with the Community

At Tetrate, we believe the best infrastructure is built in the open, together.

Envoy AI Gateway exists because of the collaboration between users like Bloomberg, maintainers such as the Envoy Gateway team, and contributors across the cloud-native community.

If you want to simplify your GenAI stack, reduce risk, and accelerate delivery, we invite you to join us.

Let’s build the next generation of AI platforms, together.
