Talk about the implementation of token bucket algorithm.



This paper mainly studies the implementation of token bucket algorithm.

Overview of Current Limiting Algorithms

There are mainly the following:

  • Semaphore based on semaphore

Only quantity dimension, no time dimension

  • Based on fixed window

With the time dimension, however, it is easy to exceed the current limit at the critical point of the two windows. For example, the limit is 10 requests per minute, 10 requests at 00:59 and 10 requests at 01:01. Judging from the time window of 00:30-01:30, 20 requests per minute are not well controlled.

  • Based on rolling window

To solve the problem of window criticality that fixed window does not solve, there are mainly algorithms based on token bucket and leaky bucket.

Token bucket algorithm

  • Token is added to bucket at the specified rate
  • A bucket has its capacity limit, and excess token will be discarded if it exceeds its capacity.
  • When the request comes, first try to get the token, if the remaining tokens are enough, then release, if not, then release is not allowed (May Wait for token Enough to Continue)

Simple implementation

 * The minimalistic token-bucket implementation
public class MinimalisticTokenBucket {

    private final long capacity;
    private final double refillTokensPerOneMillis;

    private double availableTokens;
    private long lastRefillTimestamp;

     * Creates token-bucket with specified capacity and refill rate equals to refillTokens/refillPeriodMillis
    public MinimalisticTokenBucket(long capacity, long refillTokens, long refillPeriodMillis) {
        this.capacity = capacity;
        this.refillTokensPerOneMillis = (double) refillTokens / (double) refillPeriodMillis;

        this.availableTokens = capacity;
        this.lastRefillTimestamp = System.currentTimeMillis();

    synchronized public boolean tryConsume(int numberTokens) {
        if (availableTokens < numberTokens) {
            return false;
        } else {
            availableTokens -= numberTokens;
            return true;

    private void refill() {
        long currentTimeMillis = System.currentTimeMillis();
        if (currentTimeMillis > lastRefillTimestamp) {
            long millisSinceLastRefill = currentTimeMillis - lastRefillTimestamp;
            double refill = millisSinceLastRefill * refillTokensPerOneMillis;
            this.availableTokens = Math.min(capacity, availableTokens + refill);
            this.lastRefillTimestamp = currentTimeMillis;

    private static final class Selftest {

        public static void main(String[] args) {
            // 100 tokens per 1 second
            MinimalisticTokenBucket limiter = new MinimalisticTokenBucket(100, 100, 1000);

            long startMillis = System.currentTimeMillis();
            long consumed = 0;
            while (System.currentTimeMillis() - startMillis < 10000) {
                if (limiter.tryConsume(1)) {


  • The above is a simple implementation given by bucket4j for understanding token bucket algorithm.
  • This algorithm does not use threads to refresh token, because if there are too many bucket, there will be too many threads and cpu consumption.
  • This algorithm does not store the token used by each period. lastRefillTimestamp field is designed to calculate the token to be filled.
  • When trying tryConsume each time, the method first calls refill, calculates the tokens that need to be supplemented in this time period according to the set speed and time difference, and updates the availableTokens and lastRefillTimestamp.
  • After that, the current limit judgment is to judge the availableTokens and the requested numberTokens.


The token bucket algorithm is based on qps to limit the current. Its simple implementation is to calculate the rate of supplementing tokens per unit time, and then revise the availableTokens according to the rate each time a tryConsume.