Tetrate Agent Router Service Adds Embedding Support and xAI Grok Code Fast Model

This week's update introduces embedding model support for RAG applications and adds xAI's new Grok Code Fast model, expanding our AI development capabilities for both code generation and semantic search use cases.

We’re excited to announce two major enhancements to Tetrate Agent Router Service that significantly expand its capabilities for AI developers. This week’s update brings native embedding model support for building RAG (Retrieval-Augmented Generation) applications and adds xAI’s latest Grok Code Fast model for enhanced code generation capabilities.

If you haven’t tried Tetrate Agent Router Service yet, you can sign up in a single step to experience these new capabilities with $5 free credit when using your business email.

Embedding Model Support for RAG Applications

We’re thrilled to introduce native embedding model support, enabling developers to build sophisticated RAG (Retrieval-Augmented Generation) applications directly through Tetrate Agent Router Service. This feature provides seamless access to OpenAI’s state-of-the-art embedding models through our unified API.
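Since the unified API follows the widely used OpenAI embeddings request format, generating a vector is a single POST to the embeddings endpoint. The sketch below only builds the request body; the model name is one of OpenAI's embedding models, and your actual endpoint URL and API key come from your Tetrate Agent Router Service account:

```python
import json

# Illustrative model choice -- substitute whichever embedding model
# you have enabled in Tetrate Agent Router Service.
MODEL = "text-embedding-3-small"

def build_embedding_request(texts):
    """Build an OpenAI-style /v1/embeddings request body.

    `texts` may be a single string or a list of strings; each input
    comes back as one embedding vector in the response.
    """
    return {
        "model": MODEL,
        "input": texts,
        "encoding_format": "float",
    }

payload = build_embedding_request(["What is a service mesh?"])
print(json.dumps(payload, indent=2))
```

The same body shape works for batching: pass a list of strings and the response contains one vector per input, in order.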

Understanding Embeddings and RAG

Embeddings are numerical representations of text that capture semantic meaning, enabling AI applications to understand context and relationships between different pieces of information. In RAG applications, embeddings are crucial for:

Semantic Search

  • Find relevant documents based on meaning, not just keywords
  • Retrieve contextually similar information even with different wording
  • Improve search accuracy by understanding user intent

Knowledge Retrieval

  • Convert your knowledge base into searchable vectors
  • Quickly find the most relevant information from large datasets
  • Enhance LLM responses with accurate, up-to-date information

Context Enhancement

  • Provide LLMs with relevant background information
  • Reduce hallucinations by grounding responses in actual data
  • Enable domain-specific AI applications without fine-tuning
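Grounding in practice means retrieving the most relevant chunks and placing them ahead of the user's question in the prompt. A minimal sketch of that assembly step (the wording and chunk contents are illustrative):

```python
def build_grounded_prompt(question, retrieved_chunks):
    """Assemble a RAG prompt: retrieved context first, then the question."""
    context = "\n\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt(
    "What port does the gateway listen on?",
    [
        "The gateway listens on port 8080 by default.",
        "TLS termination can be enabled via the listener config.",
    ],
)
print(prompt)
```

Because the LLM is instructed to answer only from the supplied context, responses stay anchored to your data instead of the model's pretraining.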

Embedding Playground

We’ve enhanced our Playground to support embedding models, making it incredibly easy to experiment with text embeddings and understand how they work. You can now generate embeddings for any text, visualize their dimensions, and compare how different models represent the same content.

Embedding Playground Interface

The new Embedding Playground features:

Example Corpus

  • Pre-loaded sample texts for quick testing
  • Domain-specific examples (technical docs, product descriptions, FAQs)
  • Editable corpus: remove existing texts or add your own

Visualization

  • Real-time vector visualization
  • Clustering of similar documents
  • Interactive exploration of embedding space
  • Visual representation of semantic relationships

Cosine Similarity Calculations

  • Measure similarity between any two texts
  • Find most similar documents in your corpus
  • Understand semantic distance between concepts
  • Test query-document matching for RAG applications

This hands-on experience helps developers understand how their text is transformed into vectors, making it easier to build effective RAG applications.
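The similarity measure behind these comparisons is cosine similarity: the dot product of two vectors divided by the product of their magnitudes, giving a score near 1 for semantically close texts. A self-contained sketch with toy 3-dimensional vectors (real embedding models produce hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vec, corpus):
    """Return the corpus key whose vector is closest to the query."""
    return max(corpus, key=lambda k: cosine_similarity(query_vec, corpus[k]))

# Toy "embeddings" -- in a real RAG app these come from an embedding model.
corpus = {
    "networking-doc": [0.9, 0.1, 0.0],
    "billing-faq":    [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.1]  # pretend this embeds "how do I configure routing?"
print(most_similar(query, corpus))  # prints "networking-doc"
```

This is exactly the query-document matching step of a RAG retriever: embed the query, score it against every stored vector, and hand the top matches to the LLM.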

Introducing xAI Grok Code Fast Model

We’re excited to offer xAI’s new grok-code-fast-1 model, a speedy and economical reasoning model specifically designed for agentic coding workflows. Built from scratch with a brand-new architecture, this model excels at real-world programming tasks while maintaining blazing-fast response times.

Model Highlights

Performance Characteristics

  • 190 tokens per second generation speed
  • 70.8% score on SWE-Bench-Verified
  • Optimized for agentic coding workflows with loops of reasoning and tool calls
  • Prompt caching with 90%+ cache hit rates for improved performance

Language Expertise

  • Exceptionally versatile across the full software development stack
  • Particularly adept at TypeScript, Python, Java, Rust, C++, and Go
  • Proficient with common tools like grep, the terminal, and file editing
  • Seamless integration with IDE environments

Ideal Use Cases

  • Building zero-to-one projects from scratch
  • Answering complex codebase questions
  • Performing surgical bug fixes
  • Agentic coding workflows requiring rapid tool calling
  • Everyday development tasks with minimal oversight
  • Real-time code completion in supported IDEs
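Calling the model uses the familiar OpenAI-compatible chat completions format; the sketch below only constructs the request body (the system prompt and message layout are illustrative, and endpoint plus auth come from your account settings). Streaming is worth enabling to take advantage of the model's generation speed in agentic loops:

```python
import json

def build_code_request(task, code_context=""):
    """Build an OpenAI-style /v1/chat/completions request body
    targeting the grok-code-fast-1 model."""
    messages = [
        {"role": "system", "content": "You are a precise coding assistant."},
    ]
    if code_context:
        # Keeping a stable context prefix across turns helps reuse
        # the model's prompt cache on repeated calls.
        messages.append({"role": "user", "content": f"Context:\n{code_context}"})
    messages.append({"role": "user", "content": task})
    return {
        "model": "grok-code-fast-1",
        "messages": messages,
        "stream": True,  # stream tokens as they are generated
    }

payload = build_code_request(
    "Fix the off-by-one error in this loop.",
    "for i in range(len(items) + 1): print(items[i])",
)
print(json.dumps(payload, indent=2))
```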

What’s Next

We continue our rapid innovation cycle with exciting features in development:

  • Bring Your Own Provider Keys: Use your existing API keys from OpenAI, Anthropic, and other providers
  • Extended Providers Support: More providers joining our platform
  • MCP (Model Context Protocol): Support for Anthropic’s MCP standard to enable seamless integration with external tools and data sources

Get Started Today

Ready to enhance your AI applications with embedding support and lightning-fast code generation? Sign up now and receive $5 free credit with your business email.

Have questions or feedback? Reach out through our in-app support or join the conversation in our Slack community.

Stay tuned for next week’s update! Follow us on LinkedIn for the latest product announcements and AI insights.
