Simplify Local AI Agents with Goose and Tetrate Agent Router Service

Learn more

Tetrate Agent Router Service Adds Embedding Support and xAI Grok Code Fast Model

This week's update introduces embedding model support for RAG applications and adds xAI's new Grok Code Fast model, expanding our AI development capabilities for both code generation and semantic search use cases.

Tetrate Agent Router Service Adds Embedding Support and xAI Grok Code Fast Model

Tetrate Agent Router Service Adds Embedding Support and xAI Grok Code Fast Model

We’re excited to announce two major enhancements to Tetrate Agent Router Service that significantly expand its capabilities for AI developers. This week’s update brings native embedding model support for building RAG (Retrieval-Augmented Generation) applications and adds xAI’s latest Grok Code Fast model for enhanced code generation capabilities.

If you haven’t tried Tetrate Agent Router Service yet, you can sign up in a single step to experience these new capabilities with $5 free credit when using your business email.

Embedding Model Support for RAG Applications

We’re thrilled to introduce native embedding model support, enabling developers to build sophisticated RAG (Retrieval-Augmented Generation) applications directly through Tetrate Agent Router Service. This feature provides seamless access to OpenAI’s state-of-the-art embedding models through our unified API.

Understanding Embeddings and RAG

Embeddings are numerical representations of text that capture semantic meaning, enabling AI applications to understand context and relationships between different pieces of information. In RAG applications, embeddings are crucial for:

Semantic Search

  • Find relevant documents based on meaning, not just keywords
  • Retrieve contextually similar information even with different wording
  • Improve search accuracy by understanding user intent

Knowledge Retrieval

  • Convert your knowledge base into searchable vectors
  • Quickly find the most relevant information from large datasets
  • Enhance LLM responses with accurate, up-to-date information

Context Enhancement

  • Provide LLMs with relevant background information
  • Reduce hallucinations by grounding responses in actual data
  • Enable domain-specific AI applications without fine-tuning

Embedding Playground

We’ve enhanced our Playground to support embedding models, making it incredibly easy to experiment with text embeddings and understand how they work. You can now generate embeddings for any text, visualize their dimensions, and compare how different models represent the same content.

Embedding Playground Interface
Embedding Playground Interface

The new Embedding Playground features:

Example Corpus

  • Pre-loaded sample texts for quick testing
  • Domain-specific examples (technical docs, product descriptions, FAQs)
  • Remove existing texts or add new texts to the sample

Visualization

  • Real-time vector visualization
  • Clustering of similar documents
  • Interactive exploration of embedding space
  • Visual representation of semantic relationships

Cosine Similarity Calculations

  • Measure similarity between any two texts
  • Find most similar documents in your corpus
  • Understand semantic distance between concepts
  • Test query-document matching for RAG applications

This hands-on experience helps developers understand how their text is transformed into vectors, making it easier to build effective RAG applications.

Introducing xAI Grok Code Fast Model

We’re excited to offer xAI’s new grok-code-fast-1 model, a speedy and economical reasoning model specifically designed for agentic coding workflows. Built from scratch with a brand-new architecture, this model excels at real-world programming tasks while maintaining blazing-fast response times.

Model Highlights

Performance Characteristics

  • 190 tokens per second generation speed
  • 70.8% score on SWE-Bench-Verified
  • Optimized for agentic coding workflows with loops of reasoning and tool calls
  • Prompt caching with 90%+ cache hit rates for improved performance

Language Expertise

  • Exceptionally versatile across the full software development stack
  • Particularly adept at TypeScript, Python, Java, Rust, C++, and Go
  • Mastered common tools like grep, terminal, and file editing
  • Seamless integration with IDE environments

Ideal Use Cases

  • Building zero-to-one projects from scratch
  • Answering complex codebase questions
  • Performing surgical bug fixes
  • Agentic coding workflows requiring rapid tool calling
  • Everyday development tasks with minimal oversight
  • Real-time code completion in supported IDEs

What’s Next

We continue our rapid innovation cycle with exciting features in development:

  • Bring Your Own Provider Keys: Use your existing API keys from OpenAI, Anthropic, and other providers
  • Extended Providers Support: More providers joining our platform
  • MCP (Model Context Protocol): Support for Anthropic’s MCP standard to enable seamless integration with external tools and data sources

Get Started Today

Ready to enhance your AI applications with embedding support and lightning-fast code generation? Sign up now and receive $5 free credit with your business email.

Have questions or feedback? Reach out through our in-app support or join the conversation in our Slack community.

Stay tuned for next week’s update! Follow us on LinkedIn for the latest product announcements and AI insights.

Product background Product background for tablets
New to service mesh?

Get up to speed with free online courses at Tetrate Academy and quickly learn Istio and Envoy.

Learn more
Using Kubernetes?

Tetrate Enterprise Gateway for Envoy (TEG) is the easiest way to get started with Envoy Gateway for production use cases. Get the power of Envoy Proxy in an easy-to-consume package managed via the Kubernetes Gateway API.

Learn more
Getting started with Istio?

Tetrate Istio Subscription (TIS) is the most reliable path to production, providing a complete solution for running Istio and Envoy securely in mission-critical environments. It includes:

  • Tetrate Istio Distro – A 100% upstream distribution of Istio and Envoy.
  • Compliance-ready – FIPS-verified and FedRAMP-ready for high-security needs.
  • Enterprise-grade support – The ONLY enterprise support for 100% upstream Istio, ensuring no vendor lock-in.
  • Learn more
    Need global visibility for Istio?

    TIS+ is a hosted Day 2 operations solution for Istio designed to streamline workflows for platform and support teams. It offers:

  • A global service dashboard
  • Multi-cluster visibility
  • Service topology visualization
  • Workspace-based access control
  • Learn more
    Decorative CTA background pattern background background
    Tetrate logo in the CTA section Tetrate logo in the CTA section for mobile

    Ready to enhance your
    network

    with more
    intelligence?