Engineering Guide

Token Bucket Algorithm Explained

Understand the token bucket rate limiting algorithm. Learn how refill rate, burst capacity, and a Redis implementation work in production.

How it works

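A bucket holds tokens up to a maximum capacity, which sets the largest burst a client can send at once. Tokens refill at a constant rate, which sets the sustained request rate. Each request consumes one or more tokens. If the bucket holds too few tokens, the request is rejected with HTTP 429 (Too Many Requests).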

LimitYourAPI implements these concepts in production with atomic Redis Lua scripts, providing sub-15ms decisions globally.

Rate Limiting Glossary

Shared rate limiting terminology helps engineering, product, and security teams communicate requirements clearly.

Term	Definition
Rate limit	Maximum number of requests allowed in a time window
Quota	Total allowed usage over a longer period (daily, monthly)
Token bucket	Algorithm allowing bursts up to bucket capacity with steady refill
Sliding window	Counts requests in a rolling time window for precise enforcement
Fail-open	Allow requests when rate limiter is unreachable
Fail-closed	Reject requests when rate limiter is unreachable
HTTP 429	Standard HTTP status code (Too Many Requests) returned when a rate limit is exceeded
Retry-After	Response header telling the client how long to wait before retrying, as a number of seconds or an HTTP date
Identifier / Key	Unique string identifying the client for rate limiting
Goroutines	Lightweight concurrent functions scheduled by the Go runtime, used to execute rate limit checks
Channel Caching	Buffered Go channels that queue metrics for asynchronous flushing
Mutex Lock	Mutual exclusion lock that keeps in-memory cache access safe across goroutines

Next Steps

Ready to protect your API with production-grade rate limiting? Here is the recommended path:

1. Create a free account at [limityourapi.tech/login](/login); no credit card is required for the Hobby tier
2. Generate an API key in the dashboard under API Keys
3. Install the SDK: run go get github.com/trynayash/limityourapi-go and follow the [Go](/sdk/go) guide
4. Follow the quick start guide at [/quickstart](/quickstart) for a 2-minute integration
5. Configure rules in the dashboard for your highest-risk endpoints first
6. Monitor analytics to tune limits based on real traffic patterns

Questions? Read the [documentation](/docs) or explore the [rate limiting education hub](/learn) for deep technical guides on algorithms, architecture, and production patterns.

Frequently Asked Questions

What is API rate limiting?

API rate limiting controls how many requests a client can make in a given time window. It protects backends from abuse, ensures fair usage across tenants, and prevents cost overruns from traffic spikes or malicious bots.

Why use Redis for rate limiting?

Redis provides sub-millisecond latency, atomic operations via Lua scripts, and horizontal scalability. Centralized state ensures consistent limits across distributed application servers.

How fast is LimitYourAPI?

LimitYourAPI delivers rate limit decisions in under 15ms globally using atomic Redis Lua scripts. This is fast enough for inline middleware without adding perceptible latency to API responses.

Does LimitYourAPI support token bucket and sliding window?

Yes. LimitYourAPI supports token bucket, sliding window, fixed window, and cost-aware algorithms. You can configure per-route strategies without changing infrastructure.

Can I migrate from express-rate-limit or Cloudflare?

Yes. LimitYourAPI provides migration guides with before/after code examples for express-rate-limit, Cloudflare, Upstash, Arcjet, and other providers.

Protect your API in minutes

Join developers using LimitYourAPI for Redis-backed rate limiting with sub-15ms decisions.
