DESIGN A RATE LIMITER (day 4)

Designing a rate limiter is a common problem in the field of distributed systems and is used to control the rate at which requests are processed in order to prevent overloading of the system.

A rate limiter works by maintaining a count of the number of requests that have been processed in a given time interval and limiting the number of requests that can be processed in a subsequent time interval.

There are several approaches to designing a rate limiter, each with its own trade-offs. Some common approaches include:

Fixed Window: In this approach, the rate limiter keeps track of the number of requests processed in a fixed time window and denies any further requests if the limit has been reached. This approach is simple to implement but can result in bursty behavior, as a large number of requests may be processed in a short period of time and then no requests can be processed for a longer period of time.
Sliding Window: In this approach, the rate limiter keeps track of the number of requests processed in a sliding time window and denies any further requests if the limit has been reached. This approach provides more even processing of requests and reduces bursty behavior, but can be more complex to implement.
Token Bucket: In this approach, the rate limiter maintains a bucket of tokens that represents the number of requests that can be processed. When a request is processed, a token is removed from the bucket. If there are no tokens available, the request is denied. This approach is simple to implement and provides even processing of requests, but can result in a buildup of tokens if the rate of requests decreases.
Leaky Bucket: In this approach, the rate limiter maintains a leaky bucket that collects incoming requests. If the number of requests in the bucket exceeds a limit, any additional requests are denied. This approach is simple to implement and provides even processing of requests, but can result in a buildup of requests if the rate of requests exceeds the processing rate.

In designing a rate limiter, it's important to consider the requirements of the system, such as the desired processing rate, the number of requests that can be processed in a given time interval, and the desired behavior in the event that the limit is reached.

In conclusion, designing a rate limiter is an important aspect of distributed systems and can be accomplished using several different approaches, each with its own trade-offs. The choice of approach will depend on the specific requirements of the system.

Algorithms for rate limiting

Rate limiting can be implemented using different algorithms, and each of them has distinct pros and cons. Even though this chapter does not focus on algorithms, understanding them at high-level helps to choose the right algorithm or combination of algorithms to fit our use cases. Here is a list of popular algorithms: • Token bucket • Leaking bucket • Fixed window counter • Sliding window log • Sliding window counter

Here is a Python implementation of four different rate limiting algorithms, each with a brief explanation:

fixed window

This algorithm maintains a counter that keeps track of the number of requests processed in a fixed time window. If the current time minus the last request time is greater than the time window, the counter is reset to zero. If the counter is less than the limit, the request is allowed and the counter is incremented. If the counter is equal to or greater than the limit, the request is denied.

import time


class FixedWindowRateLimiter:
    def __init__(self, limit, time_window):
        self.limit = limit
        self.time_window = time_window
        self.counter = 0
        self.timestamp = time.time()


    def is_request_allowed(self):
        current_time = time.time()
        if current_time - self.timestamp > self.time_window:
            self.counter = 0
            self.timestamp = current_time
        if self.counter < self.limit:
            self.counter += 1
            return True
        else:
            return False

Sliding Window:

This algorithm uses a deque to keep track of the timestamps of the requests processed in a sliding time window. If the difference between the current time and the earliest request in the deque is greater than the time window, that request is popped from the deque. If the length of the deque is less than the limit, the current request is allowed and its timestamp is appended to the deque. If the length of the deque is equal to or greater than the limit, the request is denied.

import collections
import time


class SlidingWindowRateLimiter:
    def __init__(self, limit, time_window):
        self.limit = limit
        self.time_window = time_window
        self.counter = collections.deque()


    def is_request_allowed(self):
        current_time = time.time()
        while self.counter and current_time - self.counter[0] > self.time_window:
            self.counter.popleft()
        if len(self.counter) < self.limit:
            self.counter.append(current_time)
            return True
        else:
            return False

3. Token Bucket:

This algorithm uses a token bucket to keep track of the number of requests that can be processed. If the difference between the current time and the last request time

import time

class TokenBucketRateLimiter:
    def __init__(self, limit, fill_rate):
        self.limit = limit
        self.fill_rate = fill_rate
        self.tokens = limit
        self.timestamp = time.time()


    def is_request_allowed(self):
        current_time = time.time()
        time_passed = current_time - self.timestamp
        self.tokens += time_passed * self.fill_rate
        self.tokens = min(self.tokens, self.limit)
        self.timestamp = current_time
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        else:
            return False

we are going to take the token bucket to deep dive in

also there's a good explanation here

LinkedIn respects your privacy

DESIGN A RATE LIMITER (day 4)

Haithem Mihoubi

System Design and DevOps

466 followers

More articles by Haithem Mihoubi

Explore content categories

System Design and DevOps

466 followers

More articles by Haithem Mihoubi

DevOps, DataOps, and MLOps: introduction to Devops

Sharing My Journey to Learn and Share About DevOps, DataOps, and MLOps

جهل العقول و الات تتعلم🤔

كيفاش إستفدت من كورسيرا 😍👌

Consistent hashing

Design news feed system (Day 5)

System Design Scalability (day 3)

Content delivery network (CDN)- (day 2)

System design for beginners, day 1

Explore content categories