Rate Limiting — Sliding Window & Token Bucket
Granit.RateLimiting provides per-tenant rate limiting with four algorithms,
configurable policies, Redis-backed counters (with in-memory fallback), and Wolverine
message handler support. Integrates with Granit.Features for plan-based dynamic
quotas.
```csharp
[DependsOn(typeof(GranitRateLimitingModule))]
public class AppModule : GranitModule { }
```

```json
{
  "RateLimiting": {
    "Enabled": true,
    "KeyPrefix": "rl",
    "BypassRoles": ["admin"],
    "FallbackOnCounterStoreFailure": "Deny",
    "Policies": {
      "api-default": {
        "Algorithm": "SlidingWindow",
        "PermitLimit": 1000,
        "Window": "00:01:00",
        "SegmentsPerWindow": 6
      },
      "api-sensitive": {
        "Algorithm": "TokenBucket",
        "TokenLimit": 50,
        "TokensPerPeriod": 10,
        "ReplenishmentPeriod": "00:00:10"
      }
    }
  }
}
```

Applying to endpoints

```csharp
app.MapGet("/api/v1/appointments", GetAppointments)
    .RequireGranitRateLimiting("api-default");

app.MapPost("/api/v1/payments", ProcessPayment)
    .RequireGranitRateLimiting("api-sensitive");
```

Wolverine message handler support
Decorate message types with [RateLimited] and register the Wolverine middleware:

```csharp
[RateLimited("api-default")]
public record SyncPatientCommand(Guid PatientId);

// In Wolverine configuration
opts.Policies.AddMiddleware<RateLimitMiddleware>(
    chain => chain.MessageType.GetCustomAttributes(typeof(RateLimitedAttribute), true).Length > 0);
```

When the rate limit is exceeded, RateLimitExceededException is thrown and handled
by Wolverine’s retry policy.
Algorithms

| Algorithm | Use case | Key parameters |
|---|---|---|
| SlidingWindow | General API rate limiting (default) | PermitLimit, Window, SegmentsPerWindow |
| FixedWindow | Simple counter, lowest memory | PermitLimit, Window |
| TokenBucket | Controlled burst allowance | TokenLimit, TokensPerPeriod, ReplenishmentPeriod |
| Concurrency | Limit simultaneous in-flight requests | PermitLimit |
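
To make the role of SegmentsPerWindow concrete, here is a minimal in-memory sketch of a segmented sliding window. It is an illustration only, not the Granit.RateLimiting implementation: the class name `SlidingWindowSketch` and the explicit `nowMs` timestamp parameter are assumptions made for this example, and the sketch is neither thread-safe nor Redis-backed.

```csharp
using System;

// Hypothetical sketch: not part of Granit.RateLimiting.
public sealed class SlidingWindowSketch
{
    private readonly int _permitLimit;
    private readonly long _segmentMs;
    private readonly int[] _counts;       // one counter per segment (ring buffer)
    private readonly long[] _segmentIds;  // absolute segment index stored in each slot

    public SlidingWindowSketch(int permitLimit, TimeSpan window, int segmentsPerWindow)
    {
        _permitLimit = permitLimit;
        _segmentMs = (long)window.TotalMilliseconds / segmentsPerWindow;
        _counts = new int[segmentsPerWindow];
        _segmentIds = new long[segmentsPerWindow];
        Array.Fill(_segmentIds, -1L);
    }

    // nowMs is a non-negative, non-decreasing timestamp in milliseconds.
    // Returns true and records the request if the window still has a free permit.
    public bool TryAcquire(long nowMs)
    {
        long current = nowMs / _segmentMs;
        int slot = (int)(current % _counts.Length);

        // Recycle the slot if it still holds a segment that slid out of the window.
        if (_segmentIds[slot] != current)
        {
            _segmentIds[slot] = current;
            _counts[slot] = 0;
        }

        // Sum only the segments inside the window that ends at the current segment.
        int used = 0;
        for (int i = 0; i < _counts.Length; i++)
        {
            if (_segmentIds[i] > current - _counts.Length && _segmentIds[i] <= current)
                used += _counts[i];
        }

        if (used >= _permitLimit) return false;
        _counts[slot]++;
        return true;
    }
}
```

With Window set to 00:01:00 and SegmentsPerWindow set to 6, each segment covers 10 seconds, so permits are released in 10-second steps rather than all at once at the end of the minute. More segments give a smoother window at the cost of more counters per key.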
Key partitioning
Rate limit counters are partitioned according to PartitionBy on each policy.
Redis keys use hash tags (the {...} portion of each key) for Cluster slot co-location.

| PartitionBy | Key format | Use case |
|---|---|---|
| Tenant (default) | rl:{tenantId}:policy | Shared tenant quota |
| TenantAndIp | rl:{tenantId}:ip:policy | Per-IP within a tenant |
| Ip | rl:{ip}:policy | Unauthenticated endpoints (login, reset) |
| User | rl:{userId}:policy | Per-user quota |
| TenantAndUser | rl:{tenantId}:userId:policy | Per-user within a tenant |
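
The key formats above can be sketched as a small builder. This is an illustration under stated assumptions, not the library's code: the enum `PartitionSketch`, the class `RateLimitKeySketch`, and the reading that the braces are emitted literally as the Redis hash tag are all assumptions of this example.

```csharp
using System;

// Hypothetical names mirroring the PartitionBy values in the table above.
public enum PartitionSketch { Tenant, TenantAndIp, Ip, User, TenantAndUser }

public static class RateLimitKeySketch
{
    // Wraps a value in literal braces so Redis Cluster hashes only that part.
    private static string Tag(string value) => "{" + value + "}";

    public static string Build(
        PartitionSketch partitionBy, string prefix, string policy,
        string tenantId, string userId, string ip)
    {
        switch (partitionBy)
        {
            case PartitionSketch.Tenant:
                return $"{prefix}:{Tag(tenantId)}:{policy}";
            case PartitionSketch.TenantAndIp:
                return $"{prefix}:{Tag(tenantId)}:{ip}:{policy}";
            case PartitionSketch.Ip:
                return $"{prefix}:{Tag(ip)}:{policy}";
            case PartitionSketch.User:
                return $"{prefix}:{Tag(userId)}:{policy}";
            case PartitionSketch.TenantAndUser:
                return $"{prefix}:{Tag(tenantId)}:{userId}:{policy}";
            default:
                throw new ArgumentOutOfRangeException(nameof(partitionBy));
        }
    }
}
```

Because only the text inside the braces is hashed, every key for the same tenant (or the same IP or user, depending on the partition) maps to the same Cluster slot.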

```json
{
  "Policies": {
    "auth": {
      "Algorithm": "FixedWindow",
      "PermitLimit": 5,
      "Window": "00:15:00",
      "PartitionBy": "Ip"
    }
  }
}
```

Failure behavior

When Redis is unavailable, the FallbackOnCounterStoreFailure setting controls behavior:

| Value | Behavior | Use case |
|---|---|---|
| Allow | Let the request through, log warning | Prefer availability over quota enforcement |
| Deny | Reject with 429 | Conservative: prefer safety over availability |
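
The two fallback modes can be pictured as a wrapper around the counter store call. The sketch below is an assumption-laden illustration: `FallbackModeSketch`, `FallbackSketch.IsAllowed`, and the delegate-based store call are hypothetical names, and the real module may detect store failures differently.

```csharp
using System;

// Hypothetical mirror of the Allow / Deny values in the table above.
public enum FallbackModeSketch { Allow, Deny }

public static class FallbackSketch
{
    // Returns true when the request may proceed.
    // tryAcquireFromStore stands in for the Redis-backed counter check.
    public static bool IsAllowed(
        Func<bool> tryAcquireFromStore,
        FallbackModeSketch fallback,
        Action<string> logWarning)
    {
        try
        {
            return tryAcquireFromStore();
        }
        catch (Exception ex) // sketch only: a real store would surface a narrower exception type
        {
            if (fallback == FallbackModeSketch.Allow)
            {
                logWarning($"Counter store unavailable, allowing request: {ex.Message}");
                return true;
            }

            // Deny: the caller is expected to respond with 429.
            return false;
        }
    }
}
```

Note that the fallback only applies when the store call fails; a healthy store that reports an exhausted quota still rejects the request in both modes.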
Response headers
On allowed responses, the following headers are set:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Total permit limit for the policy |
| X-RateLimit-Remaining | Remaining permits in the current window |

On rejected responses (429), Retry-After is set in addition.
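
As a rough illustration of how these headers relate to a rate limit decision, the sketch below builds the header set from a limit, a remaining count, and an optional retry delay. The class `RateLimitHeadersSketch` and its parameters are hypothetical; only the header names come from this page. Retry-After is written as whole seconds, which is one of the two forms HTTP allows for that header.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: not part of Granit.RateLimiting.
public static class RateLimitHeadersSketch
{
    // Pass retryAfter = null for allowed responses, a delay for rejected (429) ones.
    public static Dictionary<string, string> Build(int limit, int remaining, TimeSpan? retryAfter)
    {
        var headers = new Dictionary<string, string>
        {
            ["X-RateLimit-Limit"] = limit.ToString(CultureInfo.InvariantCulture),
            ["X-RateLimit-Remaining"] = Math.Max(0, remaining).ToString(CultureInfo.InvariantCulture),
        };

        if (retryAfter is TimeSpan delay)
        {
            // Round up so clients never retry before the limiter is ready.
            int seconds = (int)Math.Ceiling(delay.TotalSeconds);
            headers["Retry-After"] = seconds.ToString(CultureInfo.InvariantCulture);
        }

        return headers;
    }
}
```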
429 response format

```json
{
  "status": 429,
  "title": "Too Many Requests",
  "detail": "Too many requests. Please retry later.",
  "limit": 1000,
  "remaining": 0,
  "retryAfter": 10
}
```

Dynamic quotas with Granit.Features

When UseFeatureBasedQuotas is enabled, the permit limit is resolved dynamically
from Granit.Features (e.g., per-plan quotas). The convention-based feature name
is RateLimit.{PolicyName}, overridable via RateLimitPolicyOptions.FeatureName.
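
The naming convention can be sketched as follows. Everything here is illustrative: `FeatureQuotaSketch`, the `lookupFeatureValue` delegate standing in for Granit.Features, and the fallback to the configured PermitLimit when the feature has no value are assumptions of this example, not documented behavior.

```csharp
using System;

// Hypothetical helper: not part of Granit.RateLimiting.
public static class FeatureQuotaSketch
{
    // Convention: RateLimit.{PolicyName}, unless the policy sets FeatureName.
    public static string ResolveFeatureName(string policyName, string featureNameOverride)
        => string.IsNullOrWhiteSpace(featureNameOverride)
            ? $"RateLimit.{policyName}"
            : featureNameOverride;

    // lookupFeatureValue returns null when the feature is not defined for the tenant's plan.
    public static int ResolvePermitLimit(
        string policyName,
        string featureNameOverride,
        int configuredPermitLimit,
        Func<string, int?> lookupFeatureValue)
    {
        string feature = ResolveFeatureName(policyName, featureNameOverride);

        // Assumed fallback: keep the statically configured limit if no feature value exists.
        return lookupFeatureValue(feature) ?? configuredPermitLimit;
    }
}
```

For the api-default policy, this sketch would look up a feature named RateLimit.api-default, so a plan that defines that feature with a value of 5000 would raise the limit from the configured 1000.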
Configuration reference

| Property | Default | Description |
|---|---|---|
| Enabled | true | Enable/disable rate limiting globally |
| KeyPrefix | "rl" | Redis key prefix |
| FallbackOnCounterStoreFailure | Deny | Behavior when Redis is down |
| BypassRoles | [] | Roles that skip rate limiting |
| UseFeatureBasedQuotas | false | Use Granit.Features for dynamic quotas |
| Policies.* | (none) | Named rate limiting policies (see below) |

Policy options:
| Property | Default | Description |
|---|---|---|
| Algorithm | SlidingWindow | Rate limiting algorithm |
| PartitionBy | Tenant | Key partitioning strategy (Tenant, TenantAndIp, Ip, User, TenantAndUser) |
| PermitLimit | 1000 | Max permits per window |
| Window | 00:01:00 | Window duration |
| SegmentsPerWindow | 6 | Sliding window segments (accuracy vs. memory) |
| TokenLimit | 50 | Max tokens (TokenBucket only) |
| TokensPerPeriod | 10 | Tokens added per replenishment (TokenBucket only) |
| ReplenishmentPeriod | 00:00:10 | Replenishment interval (TokenBucket only) |
| FeatureName | null | Override feature name for dynamic quotas |
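
The three TokenBucket options interact in a way that a short sketch makes easier to see. The class below is a hypothetical, single-threaded, in-memory illustration (the name `TokenBucketSketch` and the explicit `nowMs` timestamp are assumptions), not the module's Redis-backed implementation.

```csharp
using System;

// Hypothetical sketch: not part of Granit.RateLimiting.
public sealed class TokenBucketSketch
{
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private readonly long _periodMs;
    private long _tokens;
    private long _lastRefillMs;

    public TokenBucketSketch(int tokenLimit, int tokensPerPeriod, TimeSpan replenishmentPeriod)
    {
        _tokenLimit = tokenLimit;
        _tokensPerPeriod = tokensPerPeriod;
        _periodMs = (long)replenishmentPeriod.TotalMilliseconds;
        _tokens = tokenLimit; // the bucket starts full, which is what permits an initial burst
        _lastRefillMs = 0;
    }

    // nowMs is a non-negative, non-decreasing timestamp in milliseconds.
    public bool TryAcquire(long nowMs)
    {
        // Add TokensPerPeriod for every full ReplenishmentPeriod elapsed, capped at TokenLimit.
        long periods = (nowMs - _lastRefillMs) / _periodMs;
        if (periods > 0)
        {
            _tokens = Math.Min(_tokenLimit, _tokens + periods * _tokensPerPeriod);
            _lastRefillMs += periods * _periodMs;
        }

        if (_tokens == 0) return false;
        _tokens--;
        return true;
    }
}
```

With the defaults above (TokenLimit 50, TokensPerPeriod 10, ReplenishmentPeriod 00:00:10), a client can burst up to 50 requests at once, after which the sustained rate settles at 10 requests per 10 seconds.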
Public API summary

| Category | Key types | Package |
|---|---|---|
| Module | GranitRateLimitingModule | — |
| Store | IRateLimitCounterStore, IRateLimitQuotaProvider, RateLimitResult | Granit.RateLimiting |
| Attributes | RateLimitedAttribute | Granit.RateLimiting |
| Exceptions | RateLimitExceededException | Granit.RateLimiting |
| Options | GranitRateLimitingOptions, RateLimitPolicyOptions, RateLimitPartition | Granit.RateLimiting |
| Extensions | AddGranitRateLimiting(), .RequireGranitRateLimiting() | Granit.RateLimiting |
See also

- Exception Handling: Problem Details format
- Wolverine module: Message bus, retry policies
- Features module: Feature flags, dynamic quotas
- API & Http overview: All HTTP infrastructure packages