
Shifting Compliance Left... and Right

"Shift left" has been the DevOps mantra for a decade. For GenAI, you need to shift left AND right—because what you test before deployment isn't what runs after deployment.

Paul Merrison

November 27, 2025

“Shift left” has been the rallying cry of DevOps for a decade. Find bugs earlier, test during development, catch problems before production. Makes perfect sense for deterministic code.

For GenAI, you need to shift left AND right. Because the thing you’re testing before deployment isn’t the same thing that runs after deployment.

The Left Shift Still Matters

You should absolutely test your AI systems during development. Build your test harness. Check for bias in your training data. Validate that your RAG pipeline retrieves the right documents. Make sure your prompts don’t accidentally ask the model to ignore all safety instructions.
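To make the last of those checks concrete, here is a minimal sketch of a development-time lint for a prompt template. Everything named here (`PROMPT_TEMPLATE`, `FORBIDDEN_PHRASES`, `lint_template`) is hypothetical and stands in for whatever your own test harness uses; it shows one cheap check, not a complete suite.

```python
import string

# Hypothetical prompt template; replace with the one your application ships.
PROMPT_TEMPLATE = (
    "You are a customer service assistant. Follow all safety instructions.\n"
    "Customer question: {question}"
)

# Illustrative phrases that should never appear in a shipped template.
FORBIDDEN_PHRASES = (
    "ignore all safety instructions",
    "ignore previous instructions",
)


def template_fields(template: str) -> set[str]:
    """Return the names of the {placeholders} used in a format string."""
    return {name for _, name, _, _ in string.Formatter().parse(template) if name}


def lint_template(template: str, allowed_fields: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the template passes."""
    problems = []
    lowered = template.lower()
    for phrase in FORBIDDEN_PHRASES:
        if phrase in lowered:
            problems.append(f"forbidden phrase: {phrase!r}")
    unexpected = template_fields(template) - allowed_fields
    if unexpected:
        problems.append(f"unexpected placeholders: {sorted(unexpected)}")
    return problems
```

Run as part of the build, `lint_template(PROMPT_TEMPLATE, {"question"})` returning an empty list is the pass condition. It catches the mistakes you thought of, which is exactly the limitation the rest of this post is about.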

All of this is necessary. None of it is sufficient.

Consider a team that spends weeks building a comprehensive pre-deployment test suite for their customer service chatbot. They test thousands of scenarios. They find and fix dozens of issues. They get sign-off from Legal and launch.

Then a user discovers they can get refund approval for anything by phrasing their request as a hypothetical philosophy question. (“If a customer were experiencing Kantian categorical imperative issues with their purchase…”)

The test suite didn’t catch it because nobody thought to test that attack vector. And the set of attack vectors is effectively unbounded, because the attack surface is human language.

The Right Shift Is Required

“Shifting right” means validating in production, continuously, on every request. This is where people get nervous, because it sounds expensive and complicated.

It doesn’t have to be.

The trick is understanding which policies need runtime enforcement and which can be validated once. Your model architecture? Test that during development. The specific prompt template you’re using? Test that during development.

Whether the current request contains PII that needs to be stripped? That’s a runtime check. Whether the user is asking about topics your chatbot shouldn’t discuss? Runtime check. Whether the model’s response is about to leak internal information? Runtime check.

These aren’t “tests” in the traditional sense. They’re policy enforcement. And they need to happen on every request, not just during QA.

The Gateway Pattern

So how do you enforce policies on every request without rebuilding your entire application?

You do it at the gateway. The same infrastructure pattern that handles routing, load balancing, and authentication can also handle policy enforcement.

Think about it: every AI request in your system already flows through network infrastructure. Prompt goes out, response comes back. That’s your enforcement point.

Want to strip PII from prompts before they hit the model? Intercept at the gateway. Need to block requests about regulated topics? Check at the gateway. Want to scan responses for sensitive data before they reach users? Filter at the gateway.
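As a rough sketch of the shape this takes, the gateway runs a list of checks on the prompt, calls the model, then runs a list of checks on the response. The `Gateway` class, the `PolicyViolation` exception, and the two example checks are all hypothetical names invented for illustration; a production gateway would implement the same idea in its own filter or plugin mechanism.

```python
from dataclasses import dataclass, field
from typing import Callable

# A check takes text and returns (possibly modified) text,
# or raises PolicyViolation to have the request rejected.
Check = Callable[[str], str]


class PolicyViolation(Exception):
    """Raised by a check that wants the request rejected."""


@dataclass
class Gateway:
    call_model: Callable[[str], str]  # stand-in for the upstream LLM call
    request_checks: list[Check] = field(default_factory=list)
    response_checks: list[Check] = field(default_factory=list)

    def handle(self, prompt: str) -> str:
        # Every prompt passes through every request check, in order.
        for check in self.request_checks:
            prompt = check(prompt)
        response = self.call_model(prompt)
        # Every response passes through every response check, in order.
        for check in self.response_checks:
            response = check(response)
        return response


def block_regulated_topics(prompt: str) -> str:
    # Illustrative keyword match; a real topic check would be more robust.
    if "medical advice" in prompt.lower():
        raise PolicyViolation("regulated topic")
    return prompt


def redact_internal_hostnames(response: str) -> str:
    # Illustrative only: the hostname is a made-up example value.
    return response.replace("db01.internal.example", "[REDACTED]")
```

The application code never sees these checks. Adding a policy means appending a function to one of the two lists, not opening a pull request against every team's service.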

The beautiful thing about this pattern is that it’s centralized. You’re not asking 15 different development teams to each implement PII filtering correctly. You’re enforcing it once, at the infrastructure layer, for every AI system in your organization.

Practical Example: The PII Filter

Let’s get concrete. You need to ensure that no user prompt carries a credit card number to your hosted LLM provider (because that violates your DPA, your security policy, and possibly your sanity).

The “shift left” approach: Tell developers to sanitize inputs before calling the LLM API. Add it to your coding standards. Put it in the wiki. Hope everyone reads the wiki and implements it correctly and doesn’t forget when they’re rushing to fix a P1 bug at 11pm.

The “shift left AND right” approach: Developers can do whatever they want in their code. But at the gateway, every outbound request to an LLM API gets scanned for credit card patterns. If one is detected, you can strip it, reject the request, or log an alert and forward it anyway. Your choice.
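Here is one way the scan itself might look. This is a sketch under assumptions: the regex, the use of a Luhn checksum to cut false positives, and the names `Action` and `scan_prompt` are illustrative choices, not a description of any particular gateway product.

```python
import re
from enum import Enum

# Candidate card numbers: 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


class Action(Enum):
    STRIP = "strip"
    REJECT = "reject"
    ALERT = "alert"


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str) -> list[re.Match]:
    """Return regex matches whose digits also pass the Luhn check."""
    return [
        m for m in CARD_CANDIDATE.finditer(text)
        if luhn_valid(re.sub(r"[ -]", "", m.group()))
    ]


def scan_prompt(prompt: str, action: Action) -> tuple[bool, str]:
    """Return (forward, prompt_to_send) for an outbound prompt."""
    hits = find_cards(prompt)
    if not hits:
        return True, prompt
    if action is Action.REJECT:
        return False, ""
    if action is Action.STRIP:
        # Replace from the end so earlier match offsets stay valid.
        for m in reversed(hits):
            prompt = prompt[:m.start()] + "[CARD REDACTED]" + prompt[m.end():]
        return True, prompt
    # Action.ALERT: record the event, then forward the prompt unchanged.
    print(f"ALERT: {len(hits)} possible card number(s) in outbound prompt")
    return True, prompt
```

With `Action.STRIP`, a prompt containing the well-known test number `4111 1111 1111 1111` goes out with `[CARD REDACTED]` in its place. A 13-digit order number that fails the checksum passes through untouched.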

The key difference: compliance is enforced by infrastructure, not by hoping developers remember to import the right library.

When Runtime Checks Fail

The other reason you need runtime checks: when one trips, it can fail gracefully in production instead of taking down your entire system.

Your pre-deployment tests are binary. Either the build passes or it doesn’t. Either you deploy or you don’t.

Runtime policy checks can be more nuanced. If your PII filter detects a credit card number, you can strip it and continue. If your topic filter detects a prohibited subject, you can return a friendly error message. If your output filter sees potential sensitive data in a response, you can redact it.

You’re enforcing compliance without sacrificing availability. The system keeps running, users keep getting responses, and your audit log shows that policies were enforced on every request.
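A sketch of what “graceful” can mean in code: each check returns an outcome instead of raising, a blocked request gets a friendly message, and every decision lands in an audit log. The `Outcome` type, the filter functions, the trigger strings, and the list-based audit log are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    policy: str
    action: str  # "allow", "redact", or "block"
    text: str    # text to pass on (unused when action == "block")


FRIENDLY_ERROR = "Sorry, I can't help with that topic."


def topic_filter(prompt: str) -> Outcome:
    # Illustrative prohibited subject; a real list would come from policy.
    if "insider trading" in prompt.lower():
        return Outcome("topic_filter", "block", "")
    return Outcome("topic_filter", "allow", prompt)


def output_filter(response: str) -> Outcome:
    # Illustrative sensitive term; the codename is made up.
    secret = "PROJECT-ORION"
    if secret in response:
        redacted = response.replace(secret, "[REDACTED]")
        return Outcome("output_filter", "redact", redacted)
    return Outcome("output_filter", "allow", response)


def handle(prompt, call_model, audit_log):
    """Apply both filters, record each decision, and always return a reply."""
    pre = topic_filter(prompt)
    audit_log.append((pre.policy, pre.action))
    if pre.action == "block":
        # The user still gets an answer; the model is never called.
        return FRIENDLY_ERROR
    post = output_filter(call_model(pre.text))
    audit_log.append((post.policy, post.action))
    return post.text
```

Note that `handle` has no failure path that surfaces to the user as an outage. The worst case is a polite refusal plus an audit entry, which is the whole point.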

The Compliance Sandwich

What you want is a compliance sandwich:

Shift left: Test during development to catch obvious problems early
Shift right: Enforce policies at runtime to catch everything else
Infrastructure layer: Implement runtime enforcement at the gateway so it’s consistent and centralized

The development testing makes sure you’re building something reasonable. The runtime enforcement makes sure it stays reasonable in production, even as models change, users get creative, and edge cases emerge.

Most teams only do the left shift. They test thoroughly, document everything, get sign-off, and then hope nothing breaks in production.

Hope is not a compliance strategy.

Tetrate believes compliance should be continuous and infrastructure-driven, not periodic and application-dependent. Our Agent Router Service provides centralized policy enforcement at the gateway layer, ensuring that checks happen on every request without burdening development teams. Learn more about our approach to AI governance at tetrate.io/contact.
