# Simple Cooldown and Rate Control Algorithm

Software that accepts requests from external, potentially unfriendly sources often needs to implement some kind of rate restriction or cooldown to limit the amount of resources a single user or client can consume.

Typical uses for cooldowns are:

• Limit number of login attempts per user
• Limit rate of API calls
• Batch incoming messages or jobs

There are several approaches to calculating the limits, but it usually boils down to a simple TPS limit, i.e. transactions per second.

A well-behaving rate limiter must allow bursts: the client may not have control over network buffering, so even when it sends at a steady rate the server can receive requests chunked together. It therefore does not make sense to enforce a strict TPS limit that only counts within one-second windows; it is better to limit the average rate over time. Optimally, a rate limiter should also not require much data management or storage.

## The algorithm

This is a proposal for implementing the described rate limiter with minimal overhead, as a variant of the Leaky Bucket algorithm.

• Let `T` be a time span in seconds and `n` the number of operations we want to allow per `T`. For example, `n = 3` and `T = 60` would mean 3 operations per minute.
• Let `S = 0` be the initial score for the rate-limited resource.

Every time the resource is accessed:

1. Compute the trial score `Sn`, where `now` is seconds since some globally fixed reference time.
• `Sn = now + T/n` if `S = 0`
• `Sn = S + T/n` otherwise.
2. Compare the trial score `Sn` against the current time.
• If `Sn < now`, update `S = now + T/n` and allow the request.
• If `Sn > now + T`, reject the request (and leave `S` unchanged).
• Otherwise, update the score `S = Sn` and allow the request.
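The steps above can be sketched as a pure function. The function name and the explicit `now` parameter are mine, chosen so the clock can be injected for testing; the logic itself follows the steps exactly:

```
// A minimal sketch of the update rule above. The score `S` and the clock
// reading `now` are passed in explicitly, so the function has no state.
function step(S, now, n, T) {
  // 1. Compute the trial score.
  var Sn = (S === 0) ? now + T / n : S + T / n;
  // 2. Compare the trial score against the current time.
  if (Sn < now) {
    return { allowed: true, S: now + T / n }; // idle long enough: reset
  }
  if (Sn > now + T) {
    return { allowed: false, S: S }; // over budget: reject, score unchanged
  }
  return { allowed: true, S: Sn }; // within budget: accept
}
```

With `n = 3` and `T = 60`, four back-to-back calls at the same instant yield allow, allow, allow, reject.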

This algorithm has performance benefits: it only needs to store a single number per throttled resource, and the score does not need to be updated when a request is rejected, so its system load characteristics are well bounded. The algorithm also accepts bursts of at most `n` requests within `T` and averages out to `n/T` TPS under constant load.
The implementation also fits well in distributed environments, where Redis or memcached can be used to update the score value atomically.
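To make the distributed case concrete, the whole check-and-update has to execute atomically against the shared store; with Redis this logic would typically live in a Lua script run via `EVAL`. The sketch below uses a plain in-memory object as a stand-in for the shared store, purely to show the shape of the server-side operation (the `atomicThrottle` name and the `store` map are mine):

```
var n = 3, T = 60;
var store = {}; // key -> score; stand-in for the shared store

// This body is what would need to execute atomically on the server side.
function atomicThrottle(key, now) {
  var S = store[key] || 0;
  var Sn = (S === 0) ? now + T / n : S + T / n;
  if (Sn < now) Sn = now + T / n;      // drained: reset to initial value
  else if (Sn > now + T) return false; // reject without writing anything
  store[key] = Sn;                     // single write per accepted request
  return true;
}
```

In a real deployment `now` should come from the server's clock rather than from each client, so that clients with skewed clocks cannot gain extra budget.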

## Example

Here is a JavaScript example for reference; save it to a file and run it with Node.js.

```
var n = 3; // number of requests allowed
var T = 60; // time span in seconds

var S = 0; // current score
function throttle() {
  var Sn = S;
  var now = ~~(Date.now() / 1000); // seconds since the Unix epoch
  if (!Sn) {
    Sn = now + T / n;
  } else {
    Sn += T / n;
  }
  if (Sn < now) {
    Sn = now + T / n; // reset back to initial value
  } else if (Sn > now + T) {
    console.log('REJECT', S - now);
    return false; // throttled
  }
  S = Sn;

  console.log('ALLOW', S - now);
  return true; // ok
}

console.log('Allow ' + n + ' requests per ' + T + 's');
var rl = require('readline').createInterface({
  input: process.stdin,
  output: process.stdout
});
rl.setPrompt('hit>');
rl.prompt();
rl.on('line', function () {
  // each enter is a request
  throttle();
  rl.prompt();
});
```

When you run the example, hit enter to simulate a request. It tells you whether the request was accepted or rejected, along with the current score delta.

```
~/work $ node throttle.js
Allow 3 requests per 60s
hit>
ALLOW 20
hit>
ALLOW 40
hit>
ALLOW 60
hit>
REJECT 59
hit>
REJECT 55
hit>
REJECT 50
hit>
REJECT 45
hit>
ALLOW 59
hit>
REJECT 58
hit>
```

Adjust `n` and `T` to your liking. This implementation is also easy to change so that it returns the number of seconds the client has to wait before trying again.
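The rejection branch already has everything needed for that: a rejected request becomes acceptable again once the trial score `S + T/n` no longer exceeds `now + T`. A sketch under that assumption (the function name is mine; it returns `0` when allowed and otherwise the wait in seconds):

```
var n = 3, T = 60;
var S = 0; // current score

// Returns 0 if the request was allowed, otherwise the number of
// seconds the client has to wait before a retry can succeed.
function throttleWithRetryAfter(now) {
  var Sn = (S === 0) ? now + T / n : S + T / n;
  if (Sn < now) Sn = now + T / n; // drained: reset back to initial value
  else if (Sn > now + T) {
    // Acceptable again once S + T/n <= now + T.
    return Math.ceil(S + T / n - (now + T));
  }
  S = Sn;
  return 0; // allowed
}
```

The returned value maps directly onto an HTTP 429 response's `Retry-After` header.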

See also Simple popularity algorithm with decay for how to apply this method to popularity scoring.