Tetrate

eBPF-Enhanced HTTP Observability: L7 Metrics and Tracing with SkyWalking

Background

Apache SkyWalking is an open-source Application Performance Management system that helps users collect and aggregate logs, traces, metrics, and events for display on a UI. In a previous article, we introduced how to use Apache SkyWalking Rover to analyze lower-level (Layer 4) network performance issues in a service mesh environment. Since modern applications often use mature Layer 7 protocols, such as HTTP, for interactions between systems, it’s important to be able to quickly troubleshoot issues at Layer 7, as well. In this article, we will discuss how to use eBPF techniques to analyze performance bottlenecks of Layer 7 protocols and how to enhance the tracing system using network sampling.

This article will show how to use Apache SkyWalking with eBPF to enhance metrics and traces in HTTP observability.

Read More
Tetrate

Boost Root Cause Analysis Quickly with SkyWalking’s New Trace-Metrics Association Feature

Observability for modern distributed applications is critical for understanding how they behave under a variety of conditions and for troubleshooting and resolving issues when they arise. Traces, metrics, and logs are regarded as fundamental parts of the observability stack. Traces are the footprints of distributed system executions, meanwhile, metrics measure system performance with numbers in the timeline. Essentially, they measure performance from two dimensions. Being able to quickly visualize the connection between traces and corresponding metrics makes it possible to quickly diagnose which process flows are correlated to potentially pathological behavior. This powerful new capability is now available in SkyWalking 9.3.0.

Read More
eBPF
Tetrate

Diagnose Service Mesh Network Performance with eBPF

Background

This article will show how to use Apache SkyWalking with eBPF to make network troubleshooting easier in a service mesh environment.

Apache SkyWalking is an application performance monitor tool for distributed systems. It observes metrics, logs, traces, and events in the service mesh environment and uses that data to generate a dependency graph of your pods and services. This dependency graph can provide quick insights into your system, especially when there’s an issue.

However, when troubleshooting network issues in SkyWalking’s service topology, it is not always easy to pinpoint where the error actually is. There are two reasons for the difficulty:

  • Traffic through the Envoy sidecar is not easy to observe. Data from Envoy’s Access Log Service (ALS) shows traffic between services (sidecar-to-sidecar), but not metrics on communication between the Envoy sidecar and the service it proxies. Without that information, it is more difficult to understand the impact of the sidecar.
  • There is a lack of data from transport layer (OSI Layer 4) communication. Since services generally use application layer (OSI Layer 7) protocols such as HTTP, observability data is generally restricted to application layer communication. However, the root cause may actually be in the transport layer, which is typically opaque to observability tools.

Access to metrics from Envoy-to-service and transport layer communication can make it easier to diagnose service issues. To this end, SkyWalking needs to collect and analyze transport layer metrics between processes inside Kubernetes pods—a task well suited to eBPF. We investigated using eBPF for this purpose and present our results and a demo below.

Read More
Apache SkyWalking

Pinpoint Service Mesh Critical Performance Impact by using eBPF

Introducing performance analysis in production with SkyWalking Rover.

Background

Apache SkyWalking observes metrics, logs, traces, and events for services deployed into the service mesh. When troubleshooting, SkyWalking error analysis can be an invaluable tool helping to pinpoint where an error occurred. However, performance problems are more difficult: It’s often impossible to locate the root cause of performance problems with pre-existing observation data. To move beyond the status quo, dynamic debugging and troubleshooting are essential service performance tools. In this article, we’ll discuss how to use eBPF technology to improve the profiling feature in SkyWalking and analyze the performance impact in the service mesh.

Read More
Apache SkyWalking application performance monitoring tool
Apache SkyWalking, Open Source, Tetrate

SkyWalking 8.4 provides infrastructure monitoring for VMs

Apache SkyWalking– the APM tool for distributed systems–  has historically focused on providing observability around tracing and metrics, but service performance is often affected by the host. The newest release, SkyWalking 8.4.0, introduces a new feature for monitoring virtual machines. Users can easily detect possible problems from the dashboard– for example, when CPU usage is overloaded, when there’s not enough memory or disk space, or when the network status is unhealthy, etc.

Read More
SkyWalking service topology
Apache SkyWalking, Observability, Open Source

Observability at scale: SkyWalking it is!

SkyWalking, a top-level Apache project, is the open source APM and observability analysis platform that is solving the problems of 21st-century systems that are increasingly large, distributed, and heterogenous. It’s built for the struggles system admins face today: To identify and locate needles in a haystack of interdependent services, to get apples-to-apples metrics across polyglot apps, and to get a complete and meaningful view of performance. 

SkyWalking is a holistic platform that can observe microservices on or off a mesh, and can provide consistent monitoring with a lightweight payload.

Let’s take a look at how SkyWalking evolved to address the problem of observability at scale, and grew from a pure tracing system to a feature-rich observability platform that is now used to analyze deployments that collect tens of billions of traces per day. 

Read More