Tetrate Agent Router Service Adds Embedding Support and xAI Grok Code Fast Model

This week's update introduces embedding model support for RAG applications and adds xAI's new Grok Code Fast model, expanding our AI development capabilities for both code generation and semantic search use cases.

We’re excited to announce two major enhancements to Tetrate Agent Router Service that significantly expand its capabilities for AI developers. This week’s update brings native embedding model support for building RAG (Retrieval-Augmented Generation) applications and adds xAI’s latest Grok Code Fast model for enhanced code generation capabilities.

If you haven’t tried Tetrate Agent Router Service yet, you can sign up in a single step to experience these new capabilities with $5 free credit when using your business email.

Embedding Model Support for RAG Applications

We’re thrilled to introduce native embedding model support, enabling developers to build sophisticated RAG (Retrieval-Augmented Generation) applications directly through Tetrate Agent Router Service. This feature provides seamless access to OpenAI’s state-of-the-art embedding models through our unified API.
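Since the unified API follows the widely used OpenAI embeddings request format, generating a vector is a single POST to the embeddings endpoint. The sketch below only builds the request body; the model name is one of OpenAI's embedding models, and your actual endpoint URL and API key come from your Tetrate Agent Router Service account:

```python
import json

# Illustrative model choice -- substitute whichever embedding model
# you have enabled in Tetrate Agent Router Service.
MODEL = "text-embedding-3-small"

def build_embedding_request(texts):
    """Build an OpenAI-style /v1/embeddings request body.

    `texts` may be a single string or a list of strings; each input
    comes back as one embedding vector in the response.
    """
    return {
        "model": MODEL,
        "input": texts,
        "encoding_format": "float",
    }

payload = build_embedding_request(["What is a service mesh?"])
print(json.dumps(payload, indent=2))
```

The same body shape works for batching: pass a list of strings and the response contains one vector per input, in order.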

Understanding Embeddings and RAG

Embeddings are numerical representations of text that capture semantic meaning, enabling AI applications to understand context and relationships between different pieces of information. In RAG applications, embeddings are crucial for:

Semantic Search

  • Find relevant documents based on meaning, not just keywords
  • Retrieve contextually similar information even with different wording
  • Improve search accuracy by understanding user intent

Knowledge Retrieval

  • Convert your knowledge base into searchable vectors
  • Quickly find the most relevant information from large datasets
  • Enhance LLM responses with accurate, up-to-date information

Context Enhancement

  • Provide LLMs with relevant background information
  • Reduce hallucinations by grounding responses in actual data
  • Enable domain-specific AI applications without fine-tuning
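Grounding in practice means retrieving the most relevant chunks and placing them ahead of the user's question in the prompt. A minimal sketch of that assembly step (the wording and chunk contents are illustrative):

```python
def build_grounded_prompt(question, retrieved_chunks):
    """Assemble a RAG prompt: retrieved context first, then the question."""
    context = "\n\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt(
    "What port does the gateway listen on?",
    [
        "The gateway listens on port 8080 by default.",
        "TLS termination can be enabled via the listener config.",
    ],
)
print(prompt)
```

Because the LLM is instructed to answer only from the supplied context, responses stay anchored to your data instead of the model's pretraining.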

Embedding Playground

We’ve enhanced our Playground to support embedding models, making it incredibly easy to experiment with text embeddings and understand how they work. You can now generate embeddings for any text, visualize their dimensions, and compare how different models represent the same content.

Embedding Playground Interface

The new Embedding Playground features:

Example Corpus

  • Pre-loaded sample texts for quick testing
  • Domain-specific examples (technical docs, product descriptions, FAQs)
  • Editable corpus: remove existing texts or add your own

Visualization

  • Real-time vector visualization
  • Clustering of similar documents
  • Interactive exploration of embedding space
  • Visual representation of semantic relationships

Cosine Similarity Calculations

  • Measure similarity between any two texts
  • Find most similar documents in your corpus
  • Understand semantic distance between concepts
  • Test query-document matching for RAG applications

This hands-on experience helps developers understand how their text is transformed into vectors, making it easier to build effective RAG applications.
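The similarity measure behind these comparisons is cosine similarity: the dot product of two vectors divided by the product of their magnitudes, giving a score near 1 for semantically close texts. A self-contained sketch with toy 3-dimensional vectors (real embedding models produce hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vec, corpus):
    """Return the corpus key whose vector is closest to the query."""
    return max(corpus, key=lambda k: cosine_similarity(query_vec, corpus[k]))

# Toy "embeddings" -- in a real RAG app these come from an embedding model.
corpus = {
    "networking-doc": [0.9, 0.1, 0.0],
    "billing-faq":    [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.1]  # pretend this embeds "how do I configure routing?"
print(most_similar(query, corpus))  # prints "networking-doc"
```

This is exactly the query-document matching step of a RAG retriever: embed the query, score it against every stored vector, and hand the top matches to the LLM.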

Introducing xAI Grok Code Fast Model

We’re excited to offer xAI’s new grok-code-fast-1 model, a speedy and economical reasoning model specifically designed for agentic coding workflows. Built from scratch with a brand-new architecture, this model excels at real-world programming tasks while maintaining blazing-fast response times.

Model Highlights

Performance Characteristics

  • 190 tokens per second generation speed
  • 70.8% score on SWE-Bench-Verified
  • Optimized for agentic coding workflows with loops of reasoning and tool calls
  • Prompt caching with 90%+ cache hit rates for improved performance

Language Expertise

  • Exceptionally versatile across the full software development stack
  • Particularly adept at TypeScript, Python, Java, Rust, C++, and Go
  • Proficient with common tools like grep, the terminal, and file editing
  • Seamless integration with IDE environments

Ideal Use Cases

  • Building zero-to-one projects from scratch
  • Answering complex codebase questions
  • Performing surgical bug fixes
  • Agentic coding workflows requiring rapid tool calling
  • Everyday development tasks with minimal oversight
  • Real-time code completion in supported IDEs
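Calling the model uses the familiar OpenAI-compatible chat completions format; the sketch below only constructs the request body (the system prompt and message layout are illustrative, and endpoint plus auth come from your account settings). Streaming is worth enabling to take advantage of the model's generation speed in agentic loops:

```python
import json

def build_code_request(task, code_context=""):
    """Build an OpenAI-style /v1/chat/completions request body
    targeting the grok-code-fast-1 model."""
    messages = [
        {"role": "system", "content": "You are a precise coding assistant."},
    ]
    if code_context:
        # Keeping a stable context prefix across turns helps reuse
        # the model's prompt cache on repeated calls.
        messages.append({"role": "user", "content": f"Context:\n{code_context}"})
    messages.append({"role": "user", "content": task})
    return {
        "model": "grok-code-fast-1",
        "messages": messages,
        "stream": True,  # stream tokens as they are generated
    }

payload = build_code_request(
    "Fix the off-by-one error in this loop.",
    "for i in range(len(items) + 1): print(items[i])",
)
print(json.dumps(payload, indent=2))
```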

What’s Next

We continue our rapid innovation cycle with exciting features in development:

  • Bring Your Own Provider Keys: Use your existing API keys from OpenAI, Anthropic, and other providers
  • Extended Providers Support: More providers joining our platform
  • MCP (Model Context Protocol): Support for Anthropic’s MCP standard to enable seamless integration with external tools and data sources

Get Started Today

Ready to enhance your AI applications with embedding support and lightning-fast code generation? Sign up now and receive $5 free credit with your business email.

Have questions or feedback? Reach out through our in-app support or join the conversation in our Slack community.

Stay tuned for next week’s update! Follow us on LinkedIn for the latest product announcements and AI insights.
