Can Rate Limiting Help Control Compute Costs?

It’s an exhilarating feeling. Your application or platform is really popular, and the traffic is pouring in.
Then reality hits when the cloud computing bill arrives. Your services have been scaling up to absorb the demand, and the excitement fades fast.
How do you mitigate this problem? By setting boundaries with rate limiting.
With effective rate limiting, you control the traffic coming into a system, which helps control compute costs by preventing excessive usage and abuse.
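At its core, a rate limiter decides, per request, whether the caller is still within its allowance. A common way to implement that decision is a token bucket. Here is a minimal sketch in Python (the class name and parameters are illustrative, not from any particular library): a bucket refills at a steady rate and each request spends one token, so short bursts are absorbed while sustained excess is rejected.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: refills `rate` tokens per
    second and allows bursts of up to `capacity` requests."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        # Over the limit: the caller would typically respond with HTTP 429.
        return False

# A burst of 15 back-to-back requests against a capacity of 10:
bucket = TokenBucket(rate=5, capacity=10)
results = [bucket.allow() for _ in range(15)]
```

In this sketch the first 10 requests of the burst succeed and the rest are rejected until the bucket refills. Production systems (including Envoy's rate limit service, discussed below) keep these counters in shared storage rather than in process memory.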
However, when we talk about rate limiting, we often need to be more precise. It is one of those things that changes meaning depending on context.
What Do You Mean by “Rate Limiting”?
There are three main categories of rate limiting to consider:
- Upstream services protection: This form of rate limiting shields the underlying systems from being flooded with excessive requests.
- Reasonable usage limits: These limits are based on reasonable user activity and prevent abnormal usage patterns.
- Product-defined limits: These are limits based on a business agreement. If you have third-party clients accessing your services, you likely have a specific agreement regarding rate limits for their usage.
Enforce Rate Limits with a Scalable Gateway
Now, the question arises: How do you effectively enforce these limits?
The answer lies in using a gateway solution such as Envoy Gateway, which offers simple, configurable rate limiting on top of Envoy Proxy.
When you enable Envoy Gateway in your Kubernetes cluster, it automatically installs the control plane and rate-limiting server required to enforce rate limiting for your resources.
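As a concrete illustration, recent Envoy Gateway versions express rate limits as a BackendTrafficPolicy attached to a route. The sketch below assumes an existing HTTPRoute named backend-route (a hypothetical name) and caps all traffic through it at 100 requests per second:

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: upstream-protection
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: backend-route   # hypothetical route name
  rateLimit:
    type: Global            # enforced by the shared rate-limit service
    global:
      rules:
        - limit:
            requests: 100
            unit: Second
```

Because the limit is `Global`, the count is shared across all Envoy Proxy replicas rather than enforced per pod.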
Running Envoy Proxy under Envoy Gateway's control on Kubernetes gives you a scalable gateway solution: as traffic volume changes, the data plane handling the requests scales up and down as necessary.
Want global rate limiting across gateways and regions? Connecting the rate-limit service to a cloud-hosted, cross-region-replicated Redis lets you achieve truly cross-cluster, global rate limiting for your system.
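In Envoy Gateway, the Redis backend for the rate-limit service is set in the EnvoyGateway startup configuration. A sketch, assuming a Redis instance reachable at the (hypothetical) address below:

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyGateway
provider:
  type: Kubernetes
gateway:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
rateLimit:
  backend:
    type: Redis
    redis:
      # Hypothetical endpoint; point this at a cross-region-replicated
      # Redis to share counters across clusters and regions.
      url: redis.redis-system.svc.cluster.local:6379
```

With every gateway's rate-limit service reading and writing the same replicated Redis, the counters, and therefore the limits, become global.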
A Simple Approach to Defining Rate Limits
Here’s a simple way to approach defining your rate limits:
First, assess how much traffic your underlying service can handle. This forms the foundation for your rate limits, usually set for a short period targeting the underlying service. This should be the most comprehensive rate limit, as it impacts all incoming requests. This is your upstream services protection rate limiting.
Next, consider limiting the requests from end users. If you have an application, distinguishing between reasonable human activity and non-reasonable activity helps you establish limits across different routes. This is a more restrictive approach but ensures fair and regular usage. Figuring out the appropriate rate limits here allows you to set reasonable usage limits.
Finally, if third-party clients access your services programmatically, it’s essential to adhere to any rate limits set in your business agreement. These limits will be based on the agreement with the client and may vary for different APIs, but they allow you to track usage per client. These are your product-defined limits, which should never exceed your upstream services protection rate limiting.
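The layered approach above maps naturally onto Envoy Gateway's client selectors. The fragment below (a sketch; the header names are hypothetical and would come from your own authentication layer) shows a per-client product limit alongside a per-user limit inside a BackendTrafficPolicy's `rateLimit` section:

```yaml
rateLimit:
  type: Global
  global:
    rules:
      # Product-defined limit for one third-party client, identified
      # by a hypothetical x-client-id header.
      - clientSelectors:
          - headers:
              - name: x-client-id
                value: partner-a
        limit:
          requests: 50
          unit: Second
      # Reasonable usage limit per end user: "Distinct" counts each
      # header value separately, so every user gets their own bucket.
      - clientSelectors:
          - headers:
              - name: x-user-id
                type: Distinct
        limit:
          requests: 10
          unit: Second
```

A broad upstream-protection rule (as shown earlier) would sit alongside these, with the per-client and per-user values always kept below it.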
Parting Thoughts
Rate limiting controls incoming traffic and, in turn, helps control compute costs. Implementing a layered approach to rate limiting can help organizations strike the right balance between ensuring fair usage and protecting their underlying systems.