What is Apache SkyWalking? Observing the Heterogenous Stack at Scale

The observability problem for modern DevOps is familiar: As enterprises move to microservices, containerization, multi-language RPC frameworks, and service meshes, there’s an increasing need for users to understand a highly complex, distributed architecture and the dependencies between applications. Apache SkyWalking, an application performance monitor (APM) and observability platform, is an open source project that addresses this need — with or without a service mesh.

Introduction to Apache SkyWalking:

Like other observability tools, Apache SkyWalking allows system administrators to track system health and understand what’s going on among abundant and interdependent services. The internal system of a large-scale enterprise will often have scores of subsystems running hundreds of services and thousands of instances. SkyWalking is built to help operation and maintenance teams identify why and where a request is slow, alert them to deviant system performance, provide apples-to-apples, language-agnostic metrics across apps, and efficiently monitor overall system health. 

Support for heterogeneity is what distinguishes SkyWalking: it provides a single platform for collection, aggregation, and a domain-specific query system, with agents for different systems and the option to integrate with a service mesh. Organizations might opt to use SkyWalking so that they can use the same APM system consistently across traditional and cloud native architectures.

Apache SkyWalking was started by Sheng Wu as a personal project and has grown rapidly since then, with 375 contributors today. It was named after a literal “observability platform”: the Skywalk glass bridge at Grand Canyon West, which provides a bird’s-eye view of the natural landmark.

Functionalities and Use Cases of SkyWalking:

SkyWalking had a humble start as a training project to help colleagues understand the problems that arise in a distributed system. It has since evolved from a pure tracing system into a full-featured APM system and observability analysis platform, aimed at microservices and distributed services running at scale in large enterprises. SkyWalking is a top-level Apache project and monitors large-scale distributed systems at organizations that include Alibaba, Huawei, Tencent, Baidu, China Telecom, and various banks and insurance companies. In many cases, it collects and analyzes billions of traces with metrics per day.

“SkyWalking guarantees availability under high-load conditions in production,” says Sheng Wu. “Its users are looking for regular processing power at the level of tens of billions, lightweight process, pluggability, and easy customization.”

SkyWalking’s functionalities fall under the “3 pillars” of observability: metrics, logs, and tracing. Fundamentally, SkyWalking is an APM tool dedicated to application performance — allowing development, operations and maintenance teams to understand the relationships between their systems and their operations in practice. 

Metrics give you aggregated data on application performance — for example, the number of services, average response time, throughput, etc. You can also add a custom-defined metric to the SkyWalking UI, based on individual business requirements. Logs provide a record of events or error messages. Tracing shows you event behavior over time, so that you can track a request from start to finish and identify system defects and errors.  
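To make the idea of aggregated metrics concrete, here is a minimal sketch in Python of how average response time and throughput could be derived from raw request records. It does not use SkyWalking's own APIs or metric definitions; the `RequestRecord` shape, the `aggregate_metrics` function, and the sample services are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One observed request (hypothetical record shape, for illustration only)."""
    service: str
    latency_ms: float
    timestamp_s: float  # seconds since the start of the window


def aggregate_metrics(records, window_s):
    """Return per-service average response time and throughput for one window.

    Throughput is expressed as calls per minute over a window of `window_s` seconds.
    """
    totals = defaultdict(lambda: {"count": 0, "latency_sum": 0.0})
    for record in records:
        bucket = totals[record.service]
        bucket["count"] += 1
        bucket["latency_sum"] += record.latency_ms

    return {
        service: {
            "avg_response_time_ms": bucket["latency_sum"] / bucket["count"],
            "calls_per_minute": bucket["count"] * 60.0 / window_s,
        }
        for service, bucket in totals.items()
    }


# Hypothetical sample data: three requests seen within a 60-second window.
records = [
    RequestRecord("checkout", 120.0, 1.0),
    RequestRecord("checkout", 80.0, 20.0),
    RequestRecord("inventory", 30.0, 25.0),
]
metrics = aggregate_metrics(records, window_s=60)
for service, values in sorted(metrics.items()):
    print(service, values)
```

A real APM computes these values continuously over streams of data rather than over an in-memory list, but the arithmetic behind each figure is the same.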

SkyWalking’s distributed topology maps use STAM (Streaming Topology Analysis Method) to analyze topology from traces, displaying relationships that can’t be pulled from simple metrics SDKs. Used in a service mesh, SkyWalking can support observability with Envoy’s Access Log Service (ALS), the proxy extension that emits detailed access logs of all requests going through Envoy. SkyWalking gives you various means of making such data useful and actionable: a list view of latency bar graphs to quickly view slow points in the system, alarms triggered by a user-specified service-level objective (SLO) threshold, or a topology diagram to locate the boundaries of a performance issue, to name just a few examples.
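The following Python sketch illustrates the general idea of deriving a service topology from trace data, which plain per-service metrics cannot provide. It is not STAM and does not reproduce its streaming analysis: it is a simplified batch version that links each span to its parent. The `Span` shape, the `build_topology` function, and the service names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """A simplified trace span (hypothetical shape, not SkyWalking's data format)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str


def build_topology(spans):
    """Count caller -> callee edges between services found in the spans."""
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    edges = Counter()
    for span in spans:
        if span.parent_id is None:
            continue  # root span: nothing called it within this trace
        parent = by_id.get((span.trace_id, span.parent_id))
        # Only parent/child pairs that cross a service boundary become edges.
        if parent is not None and parent.service != span.service:
            edges[(parent.service, span.service)] += 1
    return edges


# Hypothetical sample data: two traces passing through three services.
spans = [
    Span("t1", "1", None, "gateway"),
    Span("t1", "2", "1", "checkout"),
    Span("t1", "3", "2", "inventory"),
    Span("t1", "4", "2", "checkout"),  # local span in the same service: no edge
    Span("t2", "1", None, "gateway"),
    Span("t2", "2", "1", "checkout"),
]
topology = build_topology(spans)
for (caller, callee), calls in sorted(topology.items()):
    print(f"{caller} -> {callee}: {calls}")
```

The resulting edge counts are the kind of raw material a topology diagram is drawn from: each key is a dependency between two services, and each value is how often that dependency was exercised.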

Key Components and Recent Updates:

SkyWalking’s architecture includes four key components:

  • The agent: a language agent, or the protocol of another project, that provides metrics and tracing.
  • The Observability Analysis Platform (OAP): a highly modularized and lightweight analysis program, consisting of a receiver and kernels for stream processing and queries.
  • Storage.
  • A UI module to query and display data through the standard GraphQL protocol.
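
As a rough mental model of how these four components hand data to one another, the toy Python sketch below wires up stand-ins for each of them. Every class and function name here is hypothetical, and the UI stand-in calls a plain method rather than issuing GraphQL queries, so this shows only the direction of data flow, not SkyWalking's actual interfaces.

```python
from collections import defaultdict


class InMemoryStorage:
    """Stand-in for the storage component (hypothetical; keeps data in a dict)."""

    def __init__(self):
        self.call_counts = defaultdict(int)


class AnalysisPlatform:
    """Toy stand-in for the OAP: a receiver plus processing and query steps."""

    def __init__(self, storage):
        self.storage = storage

    def receive(self, event):
        # Receiver: accept one event from an agent and aggregate it on arrival.
        self.storage.call_counts[event["service"]] += 1

    def query_call_count(self, service):
        # Query step: answer a question from the stored, aggregated data.
        return self.storage.call_counts.get(service, 0)


class Agent:
    """Stand-in for a language agent reporting on the service it instruments."""

    def __init__(self, service, platform):
        self.service = service
        self.platform = platform

    def report_call(self):
        self.platform.receive({"service": self.service})


def render_ui(platform, services):
    """Stand-in for the UI: query the platform and format rows for display."""
    return [f"{name}: {platform.query_call_count(name)} calls" for name in services]


storage = InMemoryStorage()
oap = AnalysisPlatform(storage)
checkout_agent = Agent("checkout", oap)
for _ in range(3):
    checkout_agent.report_call()

lines = render_ui(oap, ["checkout", "inventory"])
print("\n".join(lines))
```

Keeping the components behind narrow interfaces like this is what allows one of them to be replaced, for example the storage backend, without changing the others.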

Recent SkyWalking updates have focused on making the project increasingly lightweight, pluggable, and customizable, with robust visualizations and an expanding reach that now covers monitoring its own performance and, most recently, browser data.

For mesh adopters, Apache SkyWalking integrates with Istio and Envoy and comes built into the service mesh management platform Tetrate Service Bridge.

For more on SkyWalking, you can try out the interactive demo or download the free e-book of SkyWalking in Action: Your Guide to Observability at Scale.

Tevah Platt is a content writer for Tetrate and was an editor of SkyWalking in Action; Lizan Zhou is a Tetrate engineer and Envoy maintainer. This article originally appeared on The New Stack.
