Tetrate is excited to announce the general availability of Tetrate Istio Subscription Plus (TIS+), the best Kubernetes network troubleshooting for Istio users.
TIS+ provides production visibility across environments with self-service application troubleshooting for developers and rapid root cause analysis for admins. The global status and troubleshooting console includes a service inventory, visualization of upstream/downstream dependencies, and aggregated Istio metrics across instances and environments.
TIS+ now supports bring-your-own Istio (BYOI), extending its enterprise-grade support to upstream Istio deployments. Already fully compatible with Tetrate Istio Distro (TID), this enhanced flexibility allows organizations to benefit from Tetrate’s expert support, security updates, and compliance tools, regardless of their Istio distribution.
Taken together, TIS and TIS+ provide a comprehensive solution to support every stage of your Istio journey. From Istio break/fix support to troubleshooting, the full TIS offering empowers organizations to confidently deploy, manage, and optimize Istio in any environment.
TIS+ is available standalone or as an add-on to your TIS package. To see TIS+ in action, visit our website, get a demo, or read onwards!
Addressing Market Gaps in Istio Troubleshooting
In distributed environments, service troubleshooting has become increasingly complex. Developers need access to service metrics for troubleshooting and optimization, but admins need each developer’s view limited to only the services that developer owns. Configuring traditional APMs to manage Istio metrics and segregate them by ownership is both cumbersome to achieve and difficult to maintain. Alternative OSS Istio monitoring solutions are difficult to configure for global, multi-tenant views of your entire service network.
TIS+ solves these problems and improves mean time to detection (MTTD) and mean time to resolution (MTTR) with a few key features offered through its multi-tenant platform, which is governed by role-based access control (RBAC):
- Bring-Your-Own Istio: Ideal for deployments already running Istio
- Service Topology: Top-level multi-cluster graphical view of all service workloads monitored by TIS+ with real-time state, health, and dependencies
- Aggregated Metrics: Pre-aggregated metrics across all running instances that provide the true health of a service and its application
- Tracing & Logs: Troubleshooting essentials to identify issues in a service’s footprint spread across clusters
And TIS+ will continue to improve in the coming months:
- Istio ambient mode support
- User experience enhancements
Bring-Your-Own Istio (BYOI)
Tetrate was founded by core members of the Istio founding team, and has since continued to be a top contributor to both Istio and Envoy. As part of Tetrate Istio Subscription (TIS), Tetrate provides a 100% upstream build of Istio called Tetrate Istio Distro (TID), with available FIPS-compliant builds and enterprise support.
However, we understand that customers may already have existing Istio deployments they wish to maintain. BYOI support has been added to TIS & TIS+ with that flexibility in mind.
With BYOI, customers may:
- Use their existing Istio, provided it is within the expanded TIS support window of four releases behind the latest GA version.
- Onboard BYOI clusters in a lightweight “observe” mode into TIS+, and take advantage of the unified TIS+ user interface that centralizes the consumption of all metrics, distributed tracing, and live streaming logs.
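As a rough illustration of the support window above, the following sketch checks whether a deployed Istio version qualifies. It rests on two assumptions that are not stated in this post: that "release" means a minor release (1.23, 1.24, and so on) and that the boundary is inclusive. The function names are hypothetical and are not part of any Tetrate tooling.

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '1.22.3'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def within_support_window(deployed: str, latest_ga: str, window: int = 4) -> bool:
    """True if `deployed` is at most `window` minor releases behind `latest_ga`.

    Assumptions (not from the source): releases are counted as minor
    versions, the boundary is inclusive, and the window never spans a
    major version.
    """
    d_major, d_minor = parse_minor(deployed)
    l_major, l_minor = parse_minor(latest_ga)
    if d_major != l_major:
        return False
    return 0 <= l_minor - d_minor <= window


if __name__ == "__main__":
    for candidate in ("1.24.1", "1.21.4", "1.19.0"):
        print(candidate, within_support_window(candidate, "1.24.0"))
```

Check the actual support policy before relying on a check like this; the exact counting rule may differ.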
Service Topology
Figure 1: TIS+ service topology view.
From the TIS+ service topology view example (Figure 1, above), it’s immediately obvious that some instances of the ratings and details microservices (ratings-v2 and details-v2, respectively, shown in red) are unhealthy. As we’ll see below, we can drill down to a more detailed view to start investigating the root cause.
Aggregated and Detailed Metrics
Figure 2: Metrics for the ratings-v1 service show no 5xx errors.
Figure 3: Metrics for the ratings-v2 service show regular 5xx errors.
We can see the ratings-v2 service regularly throws 5xx errors and is in an “Unacceptable” health state. As the web app’s ratings are served by both instances of the ratings service, whenever ratings-v2 is invoked, the resulting 5xx error may prevent the display of ratings on the app.
We can run a trace to see the end-to-end call flow. This trace will indicate the point of failure, and will work seamlessly even if the services along the call path run in different clusters.
Call Tracing
Figure 4: Call trace and the failure point.
When we run a trace, the point of failure is clearly highlighted as the cause of the 503 error. Next, let’s look at the ratings-v2 service logs.
Logging
Figure 5: ratings-v2 service logs reveal 503 errors.
The logs show the ratings-v2 service is returning 503 errors, resulting in a failure to display ratings whenever it is invoked. On the other hand, ratings-v1 returns a healthy 200 response code.
Figure 6: Logs from the ratings-v1 service that show a healthy 200 response code.
Istio Ambient Mode
Istio ambient mode has been stable since Istio 1.24.
Istio’s ambient mode eliminates the need for a dedicated proxy container within each pod while still providing service mesh capabilities like traffic management and security. Compared to the traditional sidecar deployment mode, this promises reduced resource overhead in some use cases, which may result in lower costs and improved performance for some users. Most customers will run a mesh that combines traditional, sidecar-enabled Istio with ambient mode.
As of January 2025, TIS+ will support service discovery and observability of clusters running in Istio ambient mode. In this mode, TIS+ will:
- Offer service discovery and service dependency visualization, and provide a full service topology, health, and inter-service communication view of the deployment
- Collect aggregated metrics from the data path services
- Support distributed call tracing to the extent allowed by the ambient mesh’s observability feature set
User Experience Enhancements
TIS+ provides an enterprise-grade management framework in which any organization can model its teams on the basis of ownership and permissions. Per-team (in TIS+ terms, “per-tenant”/“per-workspace”) deployment state visualizations are already available in TIS+.
This month, we are introducing the following new capabilities for enhanced user experience:
Error Analysis Dashboard
This dashboard is for administrators and application teams to immediately see their unhealthy services based on Istio’s Golden Signals.
This dashboard will provide the user with:
- Top unhealthy services by APDEX score
- Services with worst response times
- Services with worst throughput rates
- Services with worst overall SLAs
Figure 7: Error Analysis Dashboard
This top-level view enables administrators to identify and isolate services that are repeat offenders along with the types of issues that cause these services to show operational degradation.
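For readers unfamiliar with Apdex, the sketch below shows the standard Apdex calculation and a worst-first ranking of services. How TIS+ computes its own score is not described here, so treat this as an illustration of the general formula only; the threshold, the function names, and the sample latencies are all hypothetical.

```python
def apdex(response_times_s, threshold_s=0.5):
    """Standard Apdex: (satisfied + tolerating / 2) / total samples.

    satisfied:  response time <= T
    tolerating: T < response time <= 4T
    Anything slower counts as frustrated and contributes zero.
    """
    if not response_times_s:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for t in response_times_s if t <= threshold_s)
    tolerating = sum(
        1 for t in response_times_s if threshold_s < t <= 4 * threshold_s
    )
    return (satisfied + tolerating / 2) / len(response_times_s)


def rank_unhealthy(samples_by_service, threshold_s=0.5):
    """Return service names ordered from lowest (worst) Apdex to highest."""
    scores = {
        name: apdex(times, threshold_s)
        for name, times in samples_by_service.items()
    }
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical latency samples in seconds, for illustration only.
    samples = {
        "ratings-v1": [0.1, 0.2, 0.3],
        "ratings-v2": [0.1, 3.0, 3.0],
    }
    print(rank_unhealthy(samples))
```

A score of 1.0 means every sampled request met the threshold; scores closer to 0 indicate that most requests were too slow.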
Application Troubleshooting Workflows
We are also adding simplified workflows for application troubleshooting that walk users through the steps needed to identify underlying issues.
Figure 8: Application Troubleshooting Workflow
With this enhancement, the user can search for an error scenario, pick the best suggested option, and follow the steps to self-diagnose the issue. Capabilities of this type allow platform teams to truly shift left for Day 2 operations.
API Metrics Summarization
When endpoint metrics are collected and possibly retained historically, they can easily overwhelm storage capacity. We have added a machine-learning optimization that intelligently summarizes metrics from each running service instance or endpoint and provides an aggregated view for the actual endpoint being monitored.
For example, consider an API whose path takes multiple parameter values.
Before summarization, metrics would be collected against each item:
/api/version1
/api/version2
/api/version3
After summarization, metrics would be summarized and reported as:
/api/{var}
This greatly reduces storage volume and provides meaningful metrics that application owners can monitor and act upon.
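To make the before/after concrete, here is a minimal sketch of path summarization. TIS+ uses machine learning for this step; the code below swaps in a much simpler cardinality heuristic (a path segment that takes many distinct values under the same prefix is treated as a variable), so it only illustrates the idea. The function name and the `min_distinct` parameter are hypothetical.

```python
from collections import defaultdict


def summarize_paths(paths, min_distinct=3):
    """Replace high-cardinality path segments with '{var}'.

    Simple heuristic, not the TIS+ algorithm: working left to right,
    a segment position is collapsed when paths of the same length that
    share the same (already summarized) prefix show at least
    `min_distinct` different values at that position.
    """
    rows = [p.strip("/").split("/") for p in paths]
    if not rows:
        return []
    for i in range(max(len(r) for r in rows)):
        # Distinct values seen at position i, grouped by length and prefix.
        seen = defaultdict(set)
        for r in rows:
            if len(r) > i:
                seen[(len(r), tuple(r[:i]))].add(r[i])
        for r in rows:
            if len(r) > i and len(seen[(len(r), tuple(r[:i]))]) >= min_distinct:
                r[i] = "{var}"
    return ["/" + "/".join(r) for r in rows]


if __name__ == "__main__":
    raw = ["/api/version1", "/api/version2", "/api/version3"]
    print(sorted(set(summarize_paths(raw))))
```

With the three paths from the example, every entry maps to the same summarized endpoint, so their metrics can be stored as a single series instead of three.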
What’s Next
In this monthly blog series, we will regularly announce TIS+ features, capabilities, and news.
Stay tuned for the next blog that will cover:
- TIS+ support for virtual machines
- TIS+ support for sidecar-less services (eBPF mode)
- More user experience enhancements
Get Started
Here are some resources to help you get started:
- Free training. If you’re new to Istio, check out our free Istio training at Tetrate Academy, including Istio Fundamentals and our Istio 0-60 and Istio Wasm workshops
- Hardened Istio distribution. If you’re looking to take Istio for a test drive, install Tetrate Istio Distro, our open source, 100% upstream Istio builds with extended CVE patching and version support.
- CVE scanner. If you’re already running Istio, try our free CVE scanner
- Config verification. Learn more about our Istio config verification tooling, Tetrate Configuration Analyzer: Istio Configuration Security: How to Avoid Misconfigurations
- Let’s talk. Schedule a call with an Istio expert to find out how TIS and TIS+ can help speed delivery, reduce risk, and streamline Istio operations
- TIS+ documentation. Learn about TIS+ through our online documentation.