Diagnose Service Mesh Network Performance with eBPF


This article will show how to use Apache SkyWalking with eBPF to make network troubleshooting easier in a service mesh environment.

Apache SkyWalking is an application performance monitoring tool for distributed systems. It observes metrics, logs, traces, and events in the service mesh environment and uses that data to generate a dependency graph of your pods and services. This dependency graph can provide quick insights into your system, especially when there’s an issue.

However, when troubleshooting network issues in SkyWalking’s service topology, it is not always easy to pinpoint where the error actually is. There are two reasons for the difficulty:

  • Traffic through the Envoy sidecar is not easy to observe. Data from Envoy’s Access Log Service (ALS) shows traffic between services (sidecar-to-sidecar), but not metrics on communication between the Envoy sidecar and the service it proxies. Without that information, it is more difficult to understand the impact of the sidecar.
  • There is a lack of data from transport layer (OSI Layer 4) communication. Since services generally use application layer (OSI Layer 7) protocols such as HTTP, observability data is generally restricted to application layer communication. However, the root cause may actually be in the transport layer, which is typically opaque to observability tools.

Access to metrics from Envoy-to-service and transport layer communication can make it easier to diagnose service issues. To this end, SkyWalking needs to collect and analyze transport layer metrics between processes inside Kubernetes pods—a task well suited to eBPF. We investigated using eBPF for this purpose and present our results and a demo below.

Istio, Open Source, Service Mesh, Tetrate

Use Tetrate’s Open Source Istio Cost Analyzer to Optimize Your Cloud Egress Costs

Who Is This For?

You should read this if you run Kubernetes and/or Istio on a public cloud, and you care about your cloud bill. Cloud providers charge money for data egress, including data leaving one availability zone destined for another. If your Kubernetes deployments span availability zones, you are likely being charged for egress between internal components. Even if you don’t run Kubernetes/Istio, you’ll still run into cross-zone data egress costs, which this article will help you understand and minimize.
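As a rough illustration of how cross-zone egress adds up, here is a minimal sketch of the arithmetic. The function name, the per-GB rate, and the assumption that traffic is billed in both directions are hypothetical placeholders, not figures from any provider; substitute your own provider's published pricing.

```python
def cross_zone_cost(gb_transferred, rate_per_gb, charged_both_ways=True):
    """Estimate the cross-zone data transfer charge for a billing period.

    gb_transferred: data sent across zones, in GB
    rate_per_gb: hypothetical per-GB price (look up your provider's real rate)
    charged_both_ways: assumption that the transfer is billed on both the
        sending and the receiving side
    """
    multiplier = 2 if charged_both_ways else 1
    return gb_transferred * rate_per_gb * multiplier


# Example with made-up numbers: 1,000 GB at a hypothetical 0.01 per GB.
monthly_cost = cross_zone_cost(1000, 0.01)
print(f"Estimated cross-zone charge: {monthly_cost:.2f}")
```

Even a small per-GB rate becomes noticeable once chatty internal services exchange terabytes across zones each month.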


Minimizing Cross-Zone Traffic Charges with Istio

Deploying Kubernetes clusters across availability zones can offer significant reliability benefits, especially when you use Istio for application routing and load balancing. If you have built redundant failure domains in separate zones, the mesh can automatically shift traffic to another zone should one zone fail. Istio’s locality-aware load balancing can also help reduce latency and cross-zone traffic charges from your cloud provider by keeping traffic within the same zone as much as possible.
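The idea of keeping traffic in-zone and failing over only when necessary can be sketched in a few lines. This is a toy model, not Istio's implementation: the `pick_endpoint` function, the endpoint dictionaries, and the health map are all hypothetical names chosen for illustration.

```python
import random


def pick_endpoint(endpoints, client_zone, healthy, rng=random):
    """Prefer healthy endpoints in the caller's zone; otherwise fail over
    to any healthy endpoint in another zone."""
    live = [e for e in endpoints if healthy.get(e["addr"], False)]
    if not live:
        raise RuntimeError("no healthy endpoints")
    local = [e for e in live if e["zone"] == client_zone]
    # Fall back to the full healthy set only when the local zone has none.
    return rng.choice(local or live)


endpoints = [
    {"addr": "10.0.0.1", "zone": "zone-a"},
    {"addr": "10.0.0.2", "zone": "zone-a"},
    {"addr": "10.0.1.1", "zone": "zone-b"},
]
healthy = {e["addr"]: True for e in endpoints}

# While zone-a is healthy, a zone-a client stays in zone-a.
print(pick_endpoint(endpoints, "zone-a", healthy)["zone"])

# If every zone-a endpoint fails, traffic shifts to zone-b.
healthy["10.0.0.1"] = healthy["10.0.0.2"] = False
print(pick_endpoint(endpoints, "zone-a", healthy)["zone"])
```

Cross-zone traffic, and the charge that comes with it, occurs only on the failover path.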


Automate Istio CA rotation in production at scale

One of Istio’s core capabilities is to facilitate a zero trust network architecture by managing identity for services in the mesh. To retrieve valid certificates for mTLS communication in the mesh, individual workloads issue a certificate signing request (CSR) to istiod. Istiod, in turn, validates the request and uses a certificate authority (CA) to sign the CSR to generate the certificate. By default, Istio uses its own self-signed CA for this purpose, but best practice is to integrate Istio into your existing PKI by creating an intermediate CA for each Istio deployment.

Announcements, Tetrate

David Wang joins Tetrate as the Head of Marketing

Tetrate is excited to welcome David Wang to the team! David joins as Head of Marketing, where he will build and lead a world-class marketing team to develop a strategic narrative for Tetrate in the emerging service mesh market. David will spearhead an innovative, repeatable, and scalable go-to-market (GTM) strategy. He will also build brand awareness and credibility with analyst firms, enterprises, and the broader market while continuing to grow Tetrate’s unrivaled reputation within the developer community.


Brian Dussault joins Tetrate as the Head of Engineering

Tetrate is excited to welcome Brian Dussault to the team! Brian joins as Head of Engineering. He will lead and scale the Engineering organization, which owns TSB and open source initiatives: rich, highly performant solutions that support multiple personas across the enterprise on their service mesh journey.

Service Mesh, Tetrate

eBPF and Sidecars – Getting the Most Performance and Resiliency out of the Service Mesh

If you’ve been watching the service mesh space recently, you’ll have noticed a lot of talk about eBPF and “sidecar-less” meshes. In fact, there’s been so much talk about these things that I’m hoping for a lot of readers for this blog post, just because I’ve got all of it in the title!

But what actually are “sidecar-less” service meshes? How do they work? And do they solve the problems we’ve been told they do, namely improving performance and reducing resource usage? In this post I’ll explain what these two technologies are, what they can and can’t do for the mesh, and how they do — and do not — work together.


Pinpoint Service Mesh Critical Performance Impact by using eBPF

Introducing performance analysis in production with SkyWalking Rover.


Apache SkyWalking observes metrics, logs, traces, and events for services deployed into the service mesh. When troubleshooting, SkyWalking error analysis can be an invaluable tool for pinpointing where an error occurred. Performance problems are harder: it’s often impossible to locate their root cause with pre-existing observation data. To move beyond the status quo, dynamic debugging and troubleshooting are essential tools for service performance. In this article, we’ll discuss how to use eBPF technology to improve the profiling feature in SkyWalking and analyze the performance impact in the service mesh.


Istio vs. Linkerd vs. Consul

Introduction to Service Mesh

A service mesh is an infrastructure layer that sits between application components and the network, implemented with proxies. These app components are often microservices, but any workload from serverless containers to traditional n-tier applications in VMs or on bare metal can participate in a mesh. Rather than each component communicating directly with other components over the network, the proxies mediate that communication. These proxies form the data plane, providing many capabilities for implementing security and traffic policy and producing telemetry about the services they are deployed with.
