Tetrate Agent Router Service: Traffic Splitting, Enhanced Playground, and Next-Gen Model Support

This week's update introduces intelligent traffic splitting for cost optimization, a powerful code rendering feature in the Playground, support for GPT-5 and open source models, plus enhanced billing security and automation.

Welcome to our weekly update for Tetrate Agent Router Service! This week we’re rolling out several powerful features that give you more control over your AI workloads, enhance your development experience, and expand model options even further.

If you haven’t tried Tetrate Agent Router Service yet, you can sign up in a single step to try these new features. Sign up with your business email and get $5 of credit upfront!

Intelligent Traffic Splitting

One of the most requested features is now live: intelligent traffic splitting. This powerful capability allows you to distribute requests across multiple models or providers based on percentages you define, enabling sophisticated cost optimization and performance tuning strategies.

Traffic Splitting

How It Works

Traffic splitting lets you:

  • Distribute load across multiple models (e.g., 70% to GPT-4.1-mini, 30% to GPT-4.1)
  • A/B test different models in production without code changes
  • Gradually migrate to newer models by slowly increasing their traffic share
  • Optimize costs by routing a percentage of requests to more affordable models

Real-World Use Cases

Cost Optimization: Route 80% of simple queries to GPT-4.1-mini while reserving GPT-4.1 for the remaining 20% that require advanced reasoning.

Progressive Rollouts: Start with 10% traffic to GPT-5, monitor performance, then gradually increase as confidence grows.

Load Balancing: Distribute traffic across multiple providers to avoid rate limits and ensure high availability.
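Conceptually, percentage-based splitting boils down to a weighted random choice over models. The sketch below is a minimal illustration of that idea, not the service's actual implementation — the model names and weights are just the examples from above:

```python
import random

def pick_model(weights, rng=random):
    """Pick a model name according to percentage weights (must sum to 100)."""
    assert sum(weights.values()) == 100, "weights must sum to 100"
    r = rng.uniform(0, 100)
    cumulative = 0
    for model, share in weights.items():
        cumulative += share
        if r < cumulative:
            return model
    return model  # guard against floating-point edge cases at r ~= 100

# 70/30 split between a cheaper and a stronger model
split = {"gpt-4.1-mini": 70, "gpt-4.1": 30}
counts = {m: 0 for m in split}
for _ in range(10_000):
    counts[pick_model(split)] += 1
```

Because each request is routed independently, the observed split converges to the configured percentages over many requests — which is why gradual rollouts (10%, then 25%, then 50%) work without any client-side changes.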

Enhanced Playground with Code Rendering

The Playground now features intelligent code rendering with syntax highlighting and formatting, making it easier than ever to test and iterate on code-generation use cases.

Playground Code Rendering

New Capabilities

  • Automatic language detection for syntax highlighting
  • Copy-to-clipboard functionality for code blocks
  • Instant start: the Playground automatically selects your API key and model, so you can start testing immediately

This enhancement is particularly valuable for developers using the service for code generation, debugging assistance, or building AI-powered development tools.

Next-Generation Model Support

We’re excited to announce support for the latest advancements in AI models:

GPT-5

OpenAI’s most advanced model is now available through our platform. GPT-5 brings:

  • Significantly improved reasoning capabilities
  • Better code generation and debugging
  • Enhanced multimodal understanding
  • Reduced hallucination rates

GPT-OSS

We’ve expanded support for GPT-OSS through our existing providers Groq and DeepInfra, giving you access to this powerful open-source model with:

  • Lightning-fast inference via Groq’s LPU hardware
  • Cost-effective deployment through DeepInfra’s infrastructure
  • Complete transparency with open weights
  • No usage restrictions or licensing concerns

These additions complement our existing lineup of models from OpenAI, Anthropic, Google, xAI, Groq, and DeepInfra, giving you unparalleled flexibility in choosing the right model for each task.

Getting Started with the New Features

To explore these enhancements:

  1. Traffic Splitting: Navigate to API Key configurations and click “Traffic Splitting” to configure percentage-based distribution
  2. Enhanced Playground: Visit the Playground to experience the new code rendering capabilities
  3. New Models: Update your code to use GPT-5 or GPT-OSS
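For step 3, switching models is typically just a change to the `model` field in your request body, assuming you call the service with an OpenAI-style chat-completions payload. A hedged sketch — the exact model identifiers available to your account are listed in the dashboard:

```python
import json

def chat_request(model: str, prompt: str) -> str:
    """Build an OpenAI-style chat-completions request body as JSON."""
    return json.dumps({
        "model": model,  # e.g. "gpt-5"; check the dashboard for exact model IDs
        "messages": [{"role": "user", "content": prompt}],
    })

# Moving to a newer model is a one-line change in the request:
old_body = chat_request("gpt-4.1", "Summarize this diff.")
new_body = chat_request("gpt-5", "Summarize this diff.")
```

No other part of the request changes, which is what makes combining this with traffic splitting attractive: you can point the same client code at a split configuration and let the router decide which model serves each call.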

What’s Next

We’re excited about what’s coming next week:

  • Native Coding Assistant Support: TARS will be available as a native provider in popular coding assistants
  • Bring Your Own Provider Keys: Use your existing API keys from OpenAI, Anthropic, and other providers
  • Extended Model Support: Even more models joining our platform

Get Started Today

If you haven’t tried Tetrate Agent Router Service yet, sign up now and get $5 free credit when you use your business email. Existing users can access all new features immediately through their dashboard.

Have questions or feedback? Join our Slack community or reach out through the in-app support.

Follow us to stay updated on these exciting developments!
