Envoy AI Gateway from Concept to Reality: Tetrate and Bloomberg's Journey to Standardizing LLM Routing

The explosion of GenAI has introduced a new kind of infrastructure challenge: managing LLM traffic across multiple providers, environments, and teams.

To meet this challenge, Bloomberg and Tetrate partnered, not behind closed doors, but openly, on a shared goal: to build a scalable, secure, and flexible way to route GenAI traffic.

The result is Envoy AI Gateway, an extension of Envoy Proxy and Envoy Gateway that lives within the Envoy project.

Check out the Reference Architecture Blog Post on the Envoy AI Gateway Blog.

Continue reading to learn more about the journey of taking Envoy AI Gateway from concept to reality.


Solving a Shared Problem, Together

Bloomberg engineers proposed expanding the functionality of Envoy Gateway to tackle the challenges of LLM traffic. Because they needed to expose LLMs, hosted both internally and externally, to applications, engaging with the open source community was a natural choice.

Tetrate, a heavy contributor to Envoy Gateway and part of its maintainer team, welcomed the idea.

Together, we co-developed a solution in the open that could:

  • Unify traffic to external and internal model backends
  • Handle per-provider auth securely at the edge
  • Enforce policies and protect budgets via token-aware rate limiting
  • Provide visibility into GenAI usage across teams and providers
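
To make token-aware rate limiting concrete, here is a minimal Python sketch. It is not Envoy AI Gateway's implementation or API: the `TokenBudget` class, its methods, and the team names are hypothetical, and window resets are left out. The idea it illustrates is that the limiter charges each caller for the LLM tokens a response actually consumed, rather than counting requests.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBudget:
    """Per-team budget counted in LLM tokens, not requests (illustrative only)."""

    limit: int  # tokens allowed per window (window reset is omitted here)
    used: dict = field(default_factory=dict)  # team -> tokens consumed so far

    def allow(self, team: str) -> bool:
        # Admit a request only while the team still has budget left.
        return self.used.get(team, 0) < self.limit

    def record(self, team: str, prompt_tokens: int, completion_tokens: int) -> int:
        # Charge the real cost once the response reports its token usage,
        # then return the remaining budget (negative if overshot).
        self.used[team] = self.used.get(team, 0) + prompt_tokens + completion_tokens
        return self.limit - self.used[team]


budget = TokenBudget(limit=1000)
first_allowed = budget.allow("search")  # budget untouched, so admitted
remaining = budget.record("search", prompt_tokens=400, completion_tokens=700)
```

Because the true cost is only known after the model responds, this sketch admits a request while any budget remains and lets the last request overshoot. A stricter design could reserve an estimated cost up front.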

This work was shared from the start with the Envoy community to benefit a broad set of adopters.

📣 Read our joint announcement


Built on a Strong Foundation: Envoy + Envoy Gateway

Instead of reinventing the wheel, we extended the tools many teams already use.

  • Envoy Proxy: A CNCF graduated project trusted by companies to handle high-scale traffic
  • Envoy Gateway: A simplification layer that brings Kubernetes-native API Gateway features to Envoy, with a modern control plane and Gateway API support
  • Envoy AI Gateway: An extension of Envoy Gateway, purpose-built to handle the unique traffic and governance needs of GenAI platforms

This composable foundation gave us:

  • Proven scalability
  • Extensibility through filters
  • Native support for Kubernetes Gateway API
  • A familiar operational model for platform engineers

From Community Talks to Architecture Patterns

Since launching the project, we’ve contributed code, given talks, and worked with early adopters to test and refine the approach.

  • 🎤 Keynote at KubeCon introduced the mission and early design
  • 🛠️ Releases added upstream auth, token-based rate limiting, OpenTelemetry support, and more
  • 📐 Now, we’ve published a Reference Architecture to show you how it all fits together

The architecture introduces a Two-Tier Gateway Model:

  • Tier 1 Gateway: Unified frontend for external traffic, handling authentication, routing, and cost protection
  • Tier 2 Gateway: Internal cluster-level gateway alongside tools like KServe, focused on self-hosted model traffic and fine-grained controls

This pattern provides platform teams with autonomy without compromising control or visibility.
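
As a rough illustration of the two-tier model, the Python sketch below plans the gateway hops for a request. Everything in it is an assumption made for the example: the model catalog, the model and cluster names, the hostname format, and the `plan_route` function are hypothetical and do not reflect how Envoy AI Gateway is configured.

```python
from dataclasses import dataclass

# Hypothetical catalog: models served by an external provider versus
# models that are self-hosted behind a cluster-level (Tier 2) gateway.
EXTERNAL_MODELS = {"gpt-4o": "openai"}
SELF_HOSTED_MODELS = {"llama-3-8b": "cluster-a"}


@dataclass(frozen=True)
class Hop:
    tier: int    # which gateway tier handles this hop
    target: str  # where that tier sends the request


def plan_route(model: str) -> list[Hop]:
    """Return the gateway hops a request for `model` would take."""
    if model in EXTERNAL_MODELS:
        # Tier 1 handles authentication and cost protection,
        # then calls the external provider directly.
        return [Hop(1, EXTERNAL_MODELS[model])]
    if model in SELF_HOSTED_MODELS:
        cluster = SELF_HOSTED_MODELS[model]
        # Tier 1 forwards to the cluster's Tier 2 gateway, which applies
        # fine-grained controls before reaching the model server.
        return [Hop(1, f"tier2-gateway.{cluster}"), Hop(2, f"model-server/{model}")]
    raise LookupError(f"unknown model: {model}")


external_route = plan_route("gpt-4o")
internal_route = plan_route("llama-3-8b")
```

The point of the split is that callers only ever address Tier 1, while each cluster team keeps control of what happens behind its own Tier 2 gateway.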


Real Impact: An Enterprise-Ready OSS Foundation

Collaborating with an end user like Bloomberg on this implementation ensures that the architecture meets foundational enterprise needs.

A few examples include:

  • Support both OpenAI and internal models behind a single API
  • Manage provider credentials securely
  • Add governance and insight without adding friction for developers
  • Enable model usage confidently across teams
  • Integrate with enterprise OIDC providers

The journey demonstrates what’s possible when companies collaborate openly, bringing practical and scalable solutions to the entire ecosystem.

To learn more about how you can use Envoy AI Gateway, look for the presentations, conversations, and podcasts from the team behind it.


Why It Matters for You

Most platform teams building for GenAI face these problems:

  • A long list of LLM providers, each with its own quirks
  • Credential sprawl and secret rotation overhead
  • Inconsistent usage tracking or cost overruns
  • Custom integration glue that doesn’t scale

Envoy AI Gateway addresses these challenges.

  • 🛡️ Security: Inject upstream credentials at the edge
  • 📊 Visibility: Centralize logs and metrics across providers
  • 🪪 Governance: Set policies per team, per route, per user
  • 🧩 Flexibility: Deploy in any cloud, use any provider, plug in your own auth
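
The security point above, injecting upstream credentials at the edge, can be sketched in a few lines of Python. This is a conceptual example under stated assumptions, not the gateway's actual filter: the `PROVIDER_KEYS` store, the placeholder key, and the `inject_upstream_credentials` function are hypothetical. It shows the gateway dropping whatever credential the client sent and attaching the provider's key, so applications never hold provider secrets.

```python
# Hypothetical secret store: provider name -> upstream API key.
PROVIDER_KEYS = {"openai": "sk-upstream-example"}


def inject_upstream_credentials(headers: dict[str, str], provider: str) -> dict[str, str]:
    """Return the headers to send upstream: client auth removed, provider key added."""
    # Drop the client's own credential (header names are case-insensitive).
    upstream = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    # Attach the provider credential held only by the gateway.
    upstream["Authorization"] = f"Bearer {PROVIDER_KEYS[provider]}"
    return upstream


client_headers = {
    "authorization": "Bearer client-session-token",
    "content-type": "application/json",
}
upstream_headers = inject_upstream_credentials(client_headers, "openai")
```

The function returns a new dictionary rather than mutating the incoming headers, which keeps the client's credential available for the gateway's own authentication and logging steps.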

Start Building with the Reference Architecture

If you’re ready to explore how this can work in your environment, the Reference Architecture provides a complete walkthrough.

It includes guidance on:

  • External vs. internal model routing
  • Integration with KServe for model serving
  • Token-aware policy enforcement
  • Production-ready observability and telemetry

Whether you’re starting with an external provider or running your own hosted instances, this architecture grows with you.

Check out the Reference Architecture Blog Post on the Envoy AI Gateway Blog.


Built for the Community, with the Community

At Tetrate, we believe the best infrastructure is built in the open, together.

Envoy AI Gateway exists because of the collaboration between users like Bloomberg, maintainers such as the Envoy Gateway team, and contributors across the cloud-native community.

If you want to simplify your GenAI stack, reduce risk, and accelerate delivery, we invite you to join us.

Let’s build the next generation of AI platforms, together.
