Organizations often want to know how a service mesh can help provide better visibility into their deployments, so they can get a clearer understanding of their user experience.
Metrics and logs help, but neither can provide specifics on individual requests. That’s where tracing comes in.
A trace gives a developer the full context of a user experience by attaching a correlation ID to each user request. That correlation ID is the thread that links the trace together across multiple services.
Since all requests go through Envoy, it may seem that Envoy could provide the tracing information, but it’s not quite that simple. To Envoy, each application looks just like the network: Envoy doesn’t have any special insight into the application’s internals. This means Envoy can’t track causality. If 10 requests enter a service and 100 leave it, which of the 100 relate to which of the 10? Because Envoy cannot answer that question, it cannot automatically forward trace context, or any other kind of context, on behalf of your application. The presentation below shows this graphically:
Request Context in a Mesh
1. Tracing involves following a path through multiple services to understand the full context of the user experience. A trace begins with a user request, which is assigned a correlation ID.
2. Several headers are attached to the request as the trace gets started, including ordinary HTTP headers.
3. A trace header is also attached to the request; in this example, its value is 1234.
4. A custom header is also attached.
5. Envoy is sitting right beside the application, and the two of them talk to each other. Any request that comes in goes through Envoy.
6. The trace will show everything that happens to the user request. Since the request is going through Envoy, that will be part of the trace. After going through Envoy, the request goes to the application.
7. Envoy can also attach additional headers to gather information about what is happening inside the app.
8. As the request moves through the app, the app will likely contact another system to process the request.
9. Inside the application, we can see that the outgoing request is made on behalf of the incoming user request with the trace ID 1234.
10. Every request needs an identity, and we can see here that this request is correlated to the user.
11. This is where things get more complicated. The application has to copy the identity for that user. The ID can’t get from the incoming request to the outgoing one unless the application copies it.
12. The app sends a request to the other system on behalf of the user request, and in turn gets a response back.
13. The system will probably have to make multiple requests back and forth to get the full answer for the user request and return it.
14. If this one request was all that the app was receiving, the service mesh could propagate headers and do all the tracing. But there is never just one request coming in at a time; there are always multiple requests happening concurrently. It’s that concurrency, multiple things happening at one time, that causes a loss of visibility.
15. Because Envoy can only see the network, and not inside the app, all it sees are multiple requests and responses. There’s no way for Envoy to know which of those belong to which user request, because they all happened at the same time. As a result, Envoy can’t attach the right trace data to the individual requests.
16. If we add multiple user requests at the same time, we can see the back-and-forth starts to grow rapidly.
17. That’s why the application has to be involved, because it has to copy that data. That’s not necessarily easy to do, but there are tools built by the tracing community to make it easier. Tetrate recommends Zipkin.
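The steps above can be sketched in a few lines of Python. This is a toy model, not Envoy's behavior or API: `app_handler` and `proxy_view` are hypothetical names, the "inventory" and "pricing" targets are made up, and real concurrency is left out. It only shows that outbound calls can be grouped by user request when the application copies the trace ID, and cannot be grouped when it does not.

```python
from collections import defaultdict

def app_handler(inbound_headers, propagate):
    """Hypothetical app: makes two outbound calls per inbound request."""
    outbound = []
    for target in ("inventory", "pricing"):  # made-up downstream services
        headers = {}
        if propagate:
            # The application copies the trace ID onto the outgoing call.
            headers["x-b3-traceid"] = inbound_headers["x-b3-traceid"]
        outbound.append({"target": target, "headers": headers})
    return outbound

def proxy_view(inbound_requests, propagate):
    """Group outbound calls by the only thing visible on the wire: headers."""
    groups = defaultdict(list)
    for request in inbound_requests:
        for call in app_handler(request, propagate):
            trace_id = call["headers"].get("x-b3-traceid", "unknown")
            groups[trace_id].append(call["target"])
    return dict(groups)

inbound = [{"x-b3-traceid": "1234"}, {"x-b3-traceid": "5678"}]
print(proxy_view(inbound, propagate=False))  # every call lands under "unknown"
print(proxy_view(inbound, propagate=True))   # calls are grouped per trace ID
```

Without propagation, all four outbound calls fall into a single "unknown" bucket, which is the loss of visibility described above.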
This is why the business logic itself needs to forward headers from the incoming request onto outgoing requests. Libraries that do this exist for a variety of tracing systems and languages. Because TSB ships with Zipkin, any of Zipkin’s listed libraries or agents should work out of the box.
For custom, non-tracing headers that need to be forwarded, you need to implement the forwarding yourself, in a library your developers use.
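If you are not using a tracing library, the forwarding itself can be done by hand. The sketch below assumes the B3 header names used by Zipkin, plus `x-request-id`, which Envoy uses. The function name `forward_trace_headers` is hypothetical, and your deployment may need a different header list.

```python
# Header names assumed here: Zipkin's B3 headers plus Envoy's x-request-id.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "b3",
)

def forward_trace_headers(incoming):
    """Return only the trace headers found on the incoming request.

    HTTP header names are case-insensitive, so they are lowercased first.
    """
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}

incoming = {"X-B3-TraceId": "1234", "X-B3-SpanId": "abcd", "Accept": "text/html"}
outgoing = forward_trace_headers(incoming)
print(outgoing)  # {'x-b3-traceid': '1234', 'x-b3-spanid': 'abcd'}
```

The returned dictionary would be merged into the headers of every outgoing call made while handling that incoming request.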
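One possible shape for such a library, sketched with Python's standard `contextvars` module so that concurrent requests do not see each other's headers. Everything named here is an assumption for illustration: the `x-tenant-id` header, and the `capture`, `outgoing_headers`, and `handle` helpers.

```python
import asyncio
import contextvars

FORWARDED = ("x-tenant-id",)  # hypothetical custom header to forward

# One value per request context; asyncio gives each task its own copy.
_current = contextvars.ContextVar("forwarded_headers", default={})

def capture(incoming):
    """Remember the custom headers of the request being handled."""
    lowered = {name.lower(): value for name, value in incoming.items()}
    _current.set({name: lowered[name] for name in FORWARDED if name in lowered})

def outgoing_headers(extra=None):
    """Headers for an outbound call made on behalf of the current request."""
    headers = dict(_current.get())
    headers.update(extra or {})
    return headers

async def handle(incoming):
    capture(incoming)
    await asyncio.sleep(0)  # yield so the two handlers interleave
    return outgoing_headers({"accept": "application/json"})

async def main():
    return await asyncio.gather(
        handle({"X-Tenant-Id": "acme"}),
        handle({"X-Tenant-Id": "globex"}),
    )

results = asyncio.run(main())
print(results)
```

Each handler gets back its own tenant header even though the two run interleaved, because `asyncio.gather` runs each coroutine in a task with its own copy of the context.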
Zack Butcher is a Tetrate engineer and an Istio contributor; Eileen AJ Connelly is a content writer for Tetrate. Tetrate writer Tevah Platt contributed.
Tetrate is a service mesh company that makes it easier for companies to adopt and use Istio and Envoy.