DESIGN A RATE LIMITER (day 4)

Designing a rate limiter is a common problem in distributed systems. A rate limiter controls the rate at which requests are processed so that the system is not overloaded.

A rate limiter works by counting the requests processed in a given time interval and denying further requests once that count reaches a configured limit, until capacity becomes available again.

There are several approaches to designing a rate limiter, each with its own trade-offs. Some common approaches include:

  1. Fixed Window: The rate limiter counts the requests processed in a fixed time window and denies further requests once the limit is reached; the counter resets when the next window starts. This approach is simple to implement but can result in bursty behavior: a client can use up the whole limit at the start of a window and then be denied for the rest of it, and requests clustered around a window boundary can briefly reach up to twice the limit.
  2. Sliding Window: The rate limiter counts the requests processed in a time window that slides with the current time (for example, the last 60 seconds) and denies further requests once the limit is reached. This smooths out the boundary bursts of the fixed window, but it is more complex to implement and, if every request timestamp is stored, uses more memory.
  3. Token Bucket: The rate limiter maintains a bucket of tokens that is refilled at a fixed rate up to a maximum capacity. Each processed request removes one token; if no tokens are available, the request is denied. This approach is simple to implement and enforces the refill rate on average, but tokens accumulate while traffic is low, so a burst of up to the bucket's capacity can follow a quiet period.
  4. Leaky Bucket: The rate limiter maintains a bucket that collects incoming requests and drains ("leaks") them at a constant rate. If the bucket is full, additional requests are denied. This approach is simple to implement and gives a steady processing rate, but when requests arrive faster than the leak rate the bucket fills up, queued requests wait longer, and new requests are denied.

When designing a rate limiter, it's important to consider the requirements of the system, such as the desired processing rate, the number of requests that can be processed in a given time interval, and what should happen when the limit is reached (for example, rejecting the request or queueing it).

In short, a rate limiter is an important part of a distributed system and can be built using several different approaches, each with its own trade-offs. The right choice depends on the specific requirements of the system.

Algorithms for rate limiting

Rate limiting can be implemented using different algorithms, each with distinct pros and cons. Even though this chapter does not focus on algorithms, understanding them at a high level helps in choosing the right algorithm, or combination of algorithms, for our use cases. Here is a list of popular algorithms:

  • Token bucket
  • Leaking bucket
  • Fixed window counter
  • Sliding window log
  • Sliding window counter

Here are Python implementations of several of these rate limiting algorithms, each with a brief explanation:

  • Fixed Window:

This algorithm maintains a counter of the requests processed in the current fixed time window, along with the time at which that window started. If the current time minus the window's start time is greater than the time window, the counter is reset to zero and a new window starts. If the counter is less than the limit, the request is allowed and the counter is incremented. If the counter is equal to or greater than the limit, the request is denied.

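import time


class FixedWindowRateLimiter:
    def __init__(self, limit, time_window):
        self.limit = limit              # max requests allowed per window
        self.time_window = time_window  # window length in seconds
        self.counter = 0                # requests counted in the current window
        self.timestamp = time.monotonic()  # start of the current window

    def is_request_allowed(self):
        # time.monotonic() never goes backwards, unlike the wall clock.
        current_time = time.monotonic()
        # Start a new window once the current one has expired.
        if current_time - self.timestamp > self.time_window:
            self.counter = 0
            self.timestamp = current_time
        if self.counter < self.limit:
            self.counter += 1
            return True
        return False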
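
The boundary burst that a fixed window permits can be made concrete with a small, deterministic sketch. The `FakeClock` and `FixedWindowDemo` names below are hypothetical and not part of the class above; `FixedWindowDemo` mirrors the same logic but reads time from an injected clock, so the example does not depend on real sleeping.

```python
class FakeClock:
    """Hypothetical manually advanced clock, used only to make the demo deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


class FixedWindowDemo:
    """Same logic as FixedWindowRateLimiter, with an injectable clock (an assumption)."""

    def __init__(self, limit, time_window, clock):
        self.limit = limit
        self.time_window = time_window
        self.clock = clock
        self.counter = 0
        self.timestamp = clock()

    def is_request_allowed(self):
        current_time = self.clock()
        if current_time - self.timestamp > self.time_window:
            self.counter = 0
            self.timestamp = current_time
        if self.counter < self.limit:
            self.counter += 1
            return True
        return False


def boundary_burst():
    """Count requests allowed in a 1.5 second span that straddles a window boundary."""
    clock = FakeClock()
    limiter = FixedWindowDemo(limit=5, time_window=10, clock=clock)
    clock.now = 9.0    # near the end of the first window
    first = sum(limiter.is_request_allowed() for _ in range(5))
    clock.now = 10.5   # just after the first window has expired
    second = sum(limiter.is_request_allowed() for _ in range(5))
    return first + second


print(boundary_burst())  # 10 requests allowed, although the limit is 5 per 10 seconds
```

All ten requests are allowed because the counter resets between the two groups, which is the trade-off the sliding window is meant to address.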

  • Sliding Window:

This algorithm uses a deque to keep track of the timestamps of the requests processed in a sliding time window. On each request, timestamps older than the time window are popped from the front of the deque. If the length of the deque is then less than the limit, the current request is allowed and its timestamp is appended to the deque. If the length of the deque is equal to or greater than the limit, the request is denied.

import collections
import time


class SlidingWindowRateLimiter:
    def __init__(self, limit, time_window):
        self.limit = limit              # max requests allowed per window
        self.time_window = time_window  # window length in seconds
        self.counter = collections.deque()  # timestamps of allowed requests, oldest first

    def is_request_allowed(self):
        current_time = time.monotonic()
        # Drop timestamps that have fallen out of the sliding window.
        while self.counter and current_time - self.counter[0] > self.time_window:
            self.counter.popleft()
        if len(self.counter) < self.limit:
            self.counter.append(current_time)
            return True
        return False
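
The deque-based limiter above corresponds roughly to the "sliding window log" in the algorithm list. The "sliding window counter" from the same list is not implemented in this article, so here is a minimal sketch of one common formulation, offered as an assumption rather than a definitive version: it keeps only two counters and estimates the load as the current window's count plus the previous window's count weighted by how much of that window still overlaps the sliding window. The class name and the `clock` parameter are hypothetical.

```python
import time


class SlidingWindowCounterRateLimiter:
    """Approximate sliding window using two fixed-window counters."""

    def __init__(self, limit, time_window, clock=time.monotonic):
        self.limit = limit
        self.time_window = time_window
        self.clock = clock              # injectable clock (assumption, for testing)
        self.window_start = clock()     # start of the current fixed window
        self.current_count = 0          # requests allowed in the current window
        self.previous_count = 0         # requests allowed in the previous window

    def _roll(self, now):
        # Advance to the fixed window that contains `now`.
        elapsed_windows = int((now - self.window_start) // self.time_window)
        if elapsed_windows == 1:
            self.previous_count = self.current_count
        elif elapsed_windows > 1:
            self.previous_count = 0     # the previous window saw no traffic
        if elapsed_windows >= 1:
            self.current_count = 0
            self.window_start += elapsed_windows * self.time_window

    def is_request_allowed(self):
        now = self.clock()
        self._roll(now)
        # Fraction of the current fixed window that has already elapsed.
        fraction = (now - self.window_start) / self.time_window
        # Weight the previous window by the part still inside the sliding window.
        estimated = self.current_count + self.previous_count * (1 - fraction)
        if estimated < self.limit:
            self.current_count += 1
            return True
        return False
```

This trades accuracy for memory: the estimate assumes requests in the previous window were evenly spread, but only two integers are stored instead of one timestamp per request.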

        

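  • Token Bucket:

This algorithm uses a token bucket to keep track of the number of requests that can be processed. On each request, tokens are added in proportion to the time elapsed since the last request (the elapsed time multiplied by the fill rate), and the total is capped at the bucket's limit. If at least one token is available, the request is allowed and one token is removed. Otherwise, the request is denied.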


import time


class TokenBucketRateLimiter:
    def __init__(self, limit, fill_rate):
        self.limit = limit          # bucket capacity (maximum burst size)
        self.fill_rate = fill_rate  # tokens added per second
        self.tokens = limit         # start with a full bucket
        self.timestamp = time.monotonic()  # time of the last refill

    def is_request_allowed(self):
        current_time = time.monotonic()
        # Refill in proportion to the elapsed time, capped at the capacity.
        time_passed = current_time - self.timestamp
        self.tokens = min(self.tokens + time_passed * self.fill_rate, self.limit)
        self.timestamp = current_time
        # Each allowed request spends one token.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
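
The leaky bucket described earlier has no implementation in this article. The sketch below is one possible version, written under an assumption: it treats the bucket as a meter (a fill level that drains at a constant rate) rather than as an actual queue of pending requests, so that it can expose the same `is_request_allowed` interface as the classes above. The class name, `leak_rate`, and the `clock` parameter are hypothetical.

```python
import time


class LeakyBucketRateLimiter:
    """Leaky bucket used as a meter: the level drains at a constant rate."""

    def __init__(self, capacity, leak_rate, clock=time.monotonic):
        self.capacity = capacity    # maximum number of requests the bucket holds
        self.leak_rate = leak_rate  # requests drained per second
        self.clock = clock          # injectable clock (assumption, for testing)
        self.level = 0.0            # current fill level of the bucket
        self.timestamp = clock()    # time of the last drain

    def is_request_allowed(self):
        now = self.clock()
        # Drain the bucket for the time elapsed since the last call.
        leaked = (now - self.timestamp) * self.leak_rate
        self.level = max(0.0, self.level - leaked)
        self.timestamp = now
        # Admit the request only if it still fits in the bucket.
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False
```

A queue-based variant would instead store the requests themselves and have a worker process them at the leak rate; the meter form above only decides whether to admit a request.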

We are going to take the token bucket as the algorithm for a deep dive.
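
As a starting point for that deep dive, the sketch below restates the token bucket with two additions that are assumptions, not part of the class above: an injectable `clock`, so the behavior can be checked without sleeping, and a hypothetical `seconds_until_next_token` helper that a caller could use to tell a client how long to wait before retrying.

```python
import time


class TokenBucket:
    """Token bucket with an injectable clock and a retry hint (illustrative sketch)."""

    def __init__(self, limit, fill_rate, clock=time.monotonic):
        self.limit = limit          # bucket capacity (maximum burst size)
        self.fill_rate = fill_rate  # tokens added per second
        self.clock = clock          # injectable clock (assumption, for testing)
        self.tokens = limit         # start with a full bucket
        self.timestamp = clock()    # time of the last refill

    def _refill(self):
        now = self.clock()
        elapsed = now - self.timestamp
        self.tokens = min(self.tokens + elapsed * self.fill_rate, self.limit)
        self.timestamp = now

    def is_request_allowed(self):
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def seconds_until_next_token(self):
        """Hypothetical helper: time until at least one whole token is available."""
        self._refill()
        if self.tokens >= 1:
            return 0.0
        return (1 - self.tokens) / self.fill_rate
```

With `limit=5` and `fill_rate=2`, a full bucket allows five back-to-back requests, after which `seconds_until_next_token()` reports 0.5 seconds. After a long idle period the bucket is full again, so another burst of five is allowed.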

There is also a good explanation here

