This is the second in a series of articles on the value of Envoy Gateway as it reaches the 1.0 release milestone and is ready for production use.
Envoy Gateway has reached 1.0 and we’re working with folks deploying it in production already! One of the biggest questions that comes up when talking with teams about using Envoy Gateway as a replacement for their primary ingress – and often as an API gateway replacement – is “how do I extend Envoy and Envoy Gateway with the custom functionality I need?”
Envoy offers extensibility points via four mechanisms, each with different trade-offs and built with slightly different use cases in mind. Briefly, they are:
- An in-binary C++ filter (i.e. natively extending Envoy)
- Wasm executed by Envoy; your program must target the proxy-wasm SDKs
- Envoy’s External Processing API, ext_proc; a gRPC or REST API for processing a request’s metadata and body
- Envoy’s External Authorization API, ext_authz; a gRPC or REST API for processing a request’s metadata
In this article, we’ll briefly cover the pros and cons of each mechanism and offer recommendations for when to use them.
Custom C++ Filter: High Performance… at a Cost
When you need the best possible performance and throughput, a native C++ filter in Envoy can be the way to go. You must take on the overhead of building and distributing your own Envoy binary that includes your custom filter, as Envoy does not support loading compiled filters “dynamically” (e.g. via a `.so` file). It’s a relatively big lift in terms of executing your own build and deployment pipeline for Envoy, and depending on your organization you may not be able to join the early security disclosure list, meaning you’re playing catchup on CVE announcements. This is likely only worth it in cases where you need to implement L7 processing for a custom protocol, or are doing some fairly heavyweight processing and transformation, or in some way have significant business logic embedded in the proxy (don’t do this, Envoy is not a webserver!).
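To make the deployment side of this concrete: once your custom filter is compiled into your Envoy build and statically registered, enabling it is just an entry in the HTTP filter chain. This is a hedged sketch; the filter name `example.custom_filter` and its config type are hypothetical placeholders for your own filter’s registered names:

```yaml
# Sketch: an HTTP filter chain that includes a statically-registered custom
# C++ filter. "example.custom_filter" and its typed_config are placeholders.
http_filters:
- name: example.custom_filter          # must match the name registered in your C++ code
  typed_config:
    "@type": type.googleapis.com/example.custom_filter.v1.Config
- name: envoy.filters.http.router      # the router filter terminates the chain
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```

The config is simple; the cost is everything upstream of it – the build, release, and patch pipeline for your custom Envoy binary.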
Wasm in Envoy: Fast Prototyping, Suitable for Production
Wasm is the future of in-process dynamic extensibility in Envoy. WebAssembly was originally developed for the web, to enable some pretty incredible applications (like Google Docs, where I’m drafting this post). It has since evolved beyond that original use case into a more general-purpose compilation target: something like bytecode for a new generation of Wasm VMs (V8’s Wasm sandbox being one example, and Wazero, which executes directly on the host system, being another).
Envoy exposes the Proxy-Wasm ABI to Wasm applications, and provides SDKs in a few languages for them. You write your program using one of the SDKs, compile it into a Wasm binary, and then you can configure Envoy to load that binary dynamically at runtime. There are two key limitations to this extensibility method:
- Additional Overhead: Because Wasm executes in a VM (the V8 VM, to be exact), and that VM does not share memory with Envoy, we have to pay the price of copying the request (including the payload) into and out of the VM for the Wasm program to process it. A C++ filter does not pay this cost, but it is still lower than that of the two gRPC APIs, which need to copy in and out and traverse at least part of the network stack (more on that soon).
- Bleeding Edge: The Wasm ecosystem is relatively immature, and this manifests in several ways:
- Debugging: Debugging a Wasm program can be challenging – though tools like Wazero can make it a lot easier
- Compiler Optimization: Wasm is a new target for compilers, and many compilers don’t produce code that’s as performant as when they compile for a native target. C++ and Rust do alright here as LLVM is relatively mature, but Go suffers quite a bit from this.
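The “load that binary dynamically at runtime” step from above looks roughly like this in Envoy configuration. This is a hedged sketch: the filter name, `root_id`, and module path are assumptions, and the `root_id` must match the root context registered in your proxy-wasm SDK program:

```yaml
# Sketch: loading a compiled Wasm module into Envoy's HTTP filter chain.
http_filters:
- name: envoy.filters.http.wasm
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
    config:
      name: my_wasm_filter             # hypothetical filter name
      root_id: my_root_id              # must match your SDK program's root context
      vm_config:
        runtime: envoy.wasm.runtime.v8 # the V8 VM discussed above
        code:
          local:
            filename: /etc/envoy/plugin.wasm   # assumed path to your compiled module
```

Swapping in a new version of the filter means shipping a new `.wasm` file, not a new Envoy binary.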
The rule of thumb I’d suggest when thinking about Wasm filters is: if you’re considering building it as a C++ filter, prototype it as Wasm first and measure. If the performance tradeoff is acceptable for your use case, Wasm will likely be easier to work with than building and deploying your own custom Envoy binary.
The gRPC APIs: Out-of-Process Execution
Both ext_proc and ext_authz are very similar: they’re gRPC APIs that Envoy invokes per-request. By default, both receive the request’s metadata: its headers, source IP and port, etc. Both are allowed to return additional headers or modify existing headers, and both can respond telling Envoy to drop the request. However, ext_authz is only called on requests, and cannot process responses. Ext_proc can process both requests and their responses, and can opt-in to receiving the request (and response) body, in addition to the metadata – in other words, it can receive the entire request and response from Envoy.
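That body opt-in is expressed through ext_proc’s processing mode. Here’s a hedged sketch of the filter configuration; the cluster name is an assumption, and the `BUFFERED` modes are what opt the filter in to receiving full bodies:

```yaml
# Sketch: ext_proc configured to receive headers and buffered bodies
# for both the request and the response.
http_filters:
- name: envoy.filters.http.ext_proc
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_proc.v3.ExternalProcessor
    grpc_service:
      envoy_grpc:
        cluster_name: ext-proc-sidecar   # assumed cluster pointing at your ext_proc server
    processing_mode:
      request_header_mode: SEND
      response_header_mode: SEND
      request_body_mode: BUFFERED        # opt in to the request body
      response_body_mode: BUFFERED       # ...and the response body
```

Leave the body modes at their defaults and ext_proc behaves much more like ext_authz, receiving only metadata.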
Because ext_proc can receive the entire request and response payload, it’s assumed that an ext_proc server is deployed as a sidecar to Envoy. Then Envoy can send the data over localhost or a domain socket, and we can avoid a ton of extra network bandwidth mirroring requests over to an external server for processing. Because ext_authz only processes a request’s metadata, it can safely be deployed remotely – and often is, for use cases like Oauth2 authentication.
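The “domain socket” option above is just an Envoy cluster whose endpoint is a pipe rather than a host and port. A hedged sketch, with the cluster name and socket path as assumptions:

```yaml
# Sketch: a cluster pointing at an ext_proc sidecar over a Unix domain socket,
# so request/response payloads never leave the host.
clusters:
- name: ext-proc-sidecar
  connect_timeout: 0.25s
  type: STATIC
  typed_extension_protocol_options:
    envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
      "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
      explicit_http_config:
        http2_protocol_options: {}       # gRPC requires HTTP/2
  load_assignment:
    cluster_name: ext-proc-sidecar
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            pipe:
              path: /var/run/ext-proc.sock   # assumed socket path shared with the sidecar
```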
In short, use ext_proc when you need to process the response, or you need to operate on the body of the request or response – and deploy an instance of the ext_proc server next to every Envoy instance. Otherwise, use ext_authz and run it centrally. Of course, nothing stops you from running an ext_proc server remotely, or an ext_authz server locally as a sidecar. And because the APIs are so similar, it’s straightforward to implement both using the same core logic/library. You can always implement one way, measure, and pivot.
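Wiring ext_authz up to a central service looks much the same as ext_proc, minus the processing mode. Another hedged sketch, with the cluster name and timeout as assumptions:

```yaml
# Sketch: ext_authz calling out to a remote authorization service over gRPC.
http_filters:
- name: envoy.filters.http.ext_authz
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
    transport_api_version: V3
    grpc_service:
      envoy_grpc:
        cluster_name: central-authz      # assumed cluster for the remote authz service
      timeout: 0.25s
    failure_mode_allow: false            # deny requests if the authz service is unreachable
```

Since only metadata crosses the network, the per-request cost of a remote hop here is far smaller than it would be for a remote ext_proc server receiving full bodies.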
Summary
Envoy offers four extensibility mechanisms. Use a native C++ filter when performance is paramount and other alternatives won’t work. Use Wasm when you want things local in Envoy and performant, but are willing to put up with Wasm idiosyncrasies. Use the ext_proc gRPC API when you want to be out of the Envoy process but need to operate on the request’s or response’s body. Finally, use ext_authz when you only need to operate on the request’s metadata.
Next Steps
Envoy Gateway (EG) is a project driven by the Envoy community to make Envoy easy to use and operate for ingress. It focuses on ease of use, making the common case easy, and leverages the Kubernetes Gateway API for managing Envoy and exposing applications. Tetrate helped start the EG project and continues to invest in it heavily.
Tetrate offers an enterprise-ready distribution of Envoy Gateway—Tetrate Enterprise Gateway for Envoy—that you can start using right away. Check out the docs to learn more and take it for a spin ›