Global Policy, Local Context
Your CISO says "no GPT-4 for customer data." Two weeks later, support asks why they can't use the better model for public FAQs. Turns out simple policies get complicated when they meet reality.
Your CISO says “no GPT-4 for internal customer data.” Reasonable mandate, clear security policy, sounds simple.
Two weeks later, your customer support team is asking why their chatbot can’t access the better model when they’re only using publicly available FAQ content. Your analytics team is asking why they can’t use GPT-4 on anonymized datasets. Your compliance team is asking what “internal customer data” even means.
Turns out “simple” policies become complicated when they meet reality.
The Centralization Trap
Every large organization has the same instinct when it comes to governance: centralize it. Create an AI Governance Board, write comprehensive policies, mandate that all teams follow them.
This makes sense! Consistency is good. Ensuring everyone follows the same security and compliance standards is literally the point of governance.
The problem is that a blanket policy broad enough to apply everywhere tends to be either so conservative that it blocks legitimate use cases or so vague that it provides no real guidance.
“No AI for sensitive data”—okay, but what counts as sensitive? “All AI outputs must be reviewed by a human”—even the internal code documentation bot? “Only use approved models”—great, but the approved list hasn’t been updated in six months and doesn’t include the model we need.
The Autonomy Trap
The opposite approach is letting teams make their own decisions. They know their use cases, they understand their risk profile, they can implement appropriate controls.
This also makes sense! Context matters. The risk profile of an internal developer tool is different from a customer-facing financial advisor.
The problem is that you end up with 15 different interpretations of “appropriate controls” and no way to ensure consistent security posture. When the auditor asks “how do you ensure PII isn’t leaked to third-party models?” the answer is “well, each team handles it differently.”
That’s not governance. That’s hoping for the best.
The “Global Policy, Local Context” Model
What you actually want is policies that are centrally mandated but contextually applied.
The organization sets non-negotiable rules (global policy):
- PII must not be sent to external model providers
- All AI decisions affecting customers must be logged
- Prompts containing confidential information must use approved models only
But the enforcement of those rules adapts to context (local context):
- “PII” includes customer names in the support chatbot, but not in the anonymized analytics dataset
- “Customer-affecting decisions” includes refund approvals, but not code completions
- “Confidential information” means different things in legal vs. marketing
Same rules, different parameters based on what the system is actually doing.
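To make that concrete before getting into enforcement, here is a minimal sketch of one global rule with per-context definitions of its terms. The schema, field names, and service names are hypothetical, purely for illustration:

# Hypothetical sketch: one global rule, context-specific definitions of its terms.
global_rule: "PII must not be sent to external model providers"
contexts:
  support-chatbot:
    pii_includes: [customer_name, email, order_id]
  analytics-pipeline:
    pii_includes: [raw_customer_records]
    pii_excludes: [hashed_customer_ids]   # anonymized IDs are not treated as PII here

The rule never changes; only the definitions that feed it do.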
The Technical Implementation
How do you actually build this?
You need a policy enforcement layer that:
- Understands the global rules (the “what”)
- Has access to request context (the “who, where, when”)
- Can make decisions based on both (the “should this be allowed”)
Most organizations try to implement this at the application layer. Each service is responsible for reading the global policy, understanding its own context, and making enforcement decisions.
This is fragile. It requires every service to correctly interpret policy, correctly assess context, and correctly enforce decisions. If one service gets it wrong, you have a compliance gap.
The better pattern: enforce at the infrastructure layer with context-aware policies.
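The exact configuration depends on your gateway, but conceptually each policy needs three things: the rule, the request context it inspects, and the action it takes. A rough sketch, with an entirely hypothetical schema rather than any real product's format:

# Hypothetical gateway policy sketch; names and fields are illustrative only.
policy:
  id: pii-to-controlled-models
  description: "Requests containing PII must go to model endpoints we control"
  context:                       # what the gateway inspects per request
    - request.body               # content classification / PII detection
    - request.service            # which workload is calling
    - request.destination        # which model provider the request targets
  when: 'pii_detected && destination.type == "shared-api"'
  action: reroute
  target: dedicated-model-pool
  on_violation: block_and_log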
Example: Model Access Control
Global policy: “Customer PII may only be processed by models we control (self-hosted or dedicated instances), not shared commercial APIs.”
This needs to be enforced differently depending on context:
Customer support chatbot:
- Request includes customer name and order ID → route to dedicated model instance
- Request is generic FAQ question with no PII → route to faster/cheaper shared API
Analytics pipeline:
- Request includes hashed customer IDs (no direct PII) → allow shared API
- Request includes raw customer data → route to self-hosted model
Internal code assistant:
- Request is about public library documentation → allow shared API
- Request includes proprietary code snippets → route to dedicated instance
Same global policy. Three different systems. Three different contextual interpretations.
If this logic lives in each application, you’re implementing it three times and hoping you got it right everywhere.
If it lives at the gateway, you implement it once with different configuration parameters per service:
services:
  customer-support:
    pii_detection: strict
    model_routing:
      contains_pii: dedicated-gpt4
      default: shared-gpt35
  analytics:
    pii_detection: hash_ids_ok
    model_routing:
      contains_pii: self-hosted-llama
      default: shared-gpt4
  code-assistant:
    pii_detection: proprietary_code
    model_routing:
      contains_proprietary: dedicated-claude
      default: shared-gpt35
Policy is consistent (PII goes to controlled models). Configuration adapts to context.
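The other half of the split is that the non-negotiable rule itself is declared once, outside any individual service's configuration. Again a hypothetical sketch, with illustrative names:

# Hypothetical sketch: the global rule lives in one place at the gateway;
# only the per-service parameters above vary.
global_policy:
  pii-to-controlled-models:
    description: "Customer PII may only be processed by models we control"
    controlled_pools: [dedicated-gpt4, dedicated-claude, self-hosted-llama]
    on_violation: block_and_log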
The Permission Gradient
Not all teams need the same degree of autonomy.
Your security-savvy ML team that’s been building AI systems for years? They probably can be trusted to configure their own policies within guardrails.
The marketing team that just discovered ChatGPT last month? Maybe their policies should be more tightly controlled.
A mature governance model has a permission gradient:
- Restricted: Central policy team defines everything, no local override
- Guided: Teams can configure parameters within allowed ranges
- Autonomous: Teams can define their own policies as long as they meet minimum requirements
Different teams operate at different levels based on their risk profile and governance maturity.
And critically, this is controlled centrally. You’re not asking teams to self-assess their governance maturity. You’re assigning policy levels and enforcing them at the infrastructure layer.
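Expressed as configuration, the gradient might look something like this. The tier names follow the list above; the team names, minimums, and tunable ranges are hypothetical:

# Hypothetical sketch of centrally assigned governance tiers.
governance_tiers:
  ml-platform:
    level: autonomous            # may define own policies above the org minimums
    required_minimums: [pii_filtering, audit_logging]
  customer-support:
    level: guided                # may tune parameters within allowed ranges
    tunable:
      pii_detection: [strict, moderate]
  marketing:
    level: restricted            # central policy only, no local override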
The Audit Advantage
When policies are globally defined but contextually enforced, auditing becomes tractable.
Instead of asking 15 teams how they each implemented PII protection, you ask one question: “show me the global PII policy and how it’s configured for each service.”
You get a single source of truth for what the rules are, plus a configuration layer showing how they’re applied in different contexts.
Your audit log shows policy decisions being made consistently by the infrastructure layer, not inconsistently by various application implementations.
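Because one enforcement point makes every decision, each one can be logged with the same shape regardless of which service triggered it. A hypothetical record, with illustrative field names:

# Hypothetical audit record emitted by the gateway for each policy decision.
- timestamp: "2025-06-12T14:03:11Z"
  service: customer-support
  policy: pii-to-controlled-models
  matched: contains_pii
  decision: reroute
  original_target: shared-gpt35
  routed_to: dedicated-gpt4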
The Evolution Path
Most organizations start with either total centralization (rigid, blocks innovation) or total decentralization (flexible, creates compliance gaps).
The evolution is toward selective decentralization:
- Start with highly centralized policies for immature/high-risk AI use
- As teams demonstrate governance capability, grant them more configuration autonomy
- Maintain central enforcement even as configuration becomes distributed
You’re not choosing between “central control” and “team autonomy.” You’re building a system where central policy and local context coexist.
When Local Context Should Override Global Policy
Sometimes teams have legitimate reasons to deviate from global policy. Maybe they’re in a regulated industry with stricter requirements. Maybe they’re handling data that requires special controls.
A mature governance model allows for exceptions, but makes them explicit and auditable:
- Exceptions are requested through a formal process
- They’re documented with justification
- They’re time-limited and reviewed periodically
- They’re visible in audit logs
“Team X is using a more restrictive policy than the global mandate” is fine. “Team Y quietly disabled PII filtering” is not fine.
The infrastructure layer can enforce both the global policy and the approved exceptions, making deviations visible rather than hidden.
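In this model, an exception is just more centrally held, centrally enforced configuration. A hypothetical sketch, with illustrative team and field names:

# Hypothetical record of an explicit, auditable policy exception.
exceptions:
  - team: payments
    policy: pii-to-controlled-models
    change: "stricter than global: no shared model APIs at all, even for non-PII requests"
    justification: "regulated data requires tighter controls than the org-wide default"
    approved_by: ai-governance-board
    expires: "2025-12-31"
    review: quarterly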
The Developer Experience
From a developer’s perspective, global policy with local context should feel like:
- Clear rules about what’s not allowed
- Flexibility in how to accomplish allowed things
- Configuration that makes sense for their use case
- Errors that explain why something was blocked and how to fix it
Not:
- Mysterious policy rejections with no explanation
- Impossible restrictions that don’t account for their context
- Having to read 47 pages of policy docs to figure out what they can do
When governance is infrastructure-enforced but contextually aware, developers get useful error messages:
“Request blocked: contains customer email addresses. Customer PII must use dedicated model instance. Either strip the email or route to [dedicated-endpoint].”
That’s actionable. That’s useful. That’s governance that helps rather than just blocking.
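If the gateway also returns that information in structured form, tooling can act on it as easily as a developer can. A hypothetical response shape, with illustrative fields:

# Hypothetical policy-violation response body returned by the gateway.
error:
  code: policy_violation
  policy: pii-to-controlled-models
  reason: "request contains customer email addresses"
  detected: [email_address]
  remediation:
    - "strip the email addresses from the prompt"
    - "or route the request to this service's dedicated endpoint"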
The Real Balance
You don’t want governance that’s so centralized it can’t adapt to reality. You don’t want governance that’s so flexible it’s just chaos with documentation.
You want global policy (what we believe is right) enforced locally (in the way that makes sense for this specific situation).
Infrastructure-layer enforcement gives you both. Centralized rules, contextual application, consistent enforcement, flexible configuration.
And developers who understand what they’re allowed to do without having to become policy experts.
Tetrate believes effective governance requires global policies with local context awareness. Our Agent Router Service enforces organization-wide rules while adapting to service-specific requirements through configurable policy parameters at the gateway layer. Learn more here ›