Rate Limiting — Sliding Window & Token Bucket
Granit.RateLimiting provides per-tenant rate limiting with four algorithms,
configurable policies, and Redis-backed counters (with in-memory fallback). The
core is framework-pure — no ASP.NET Core dependency — and the HTTP and
Wolverine enforcement points ship as separate transport bindings. Integrates with
Granit.Features for plan-based dynamic quotas.
Choosing the right package
The counters, algorithms, quota resolution, and TenantPartitionedRateLimiter
live in the transport-agnostic core. Add the binding that matches where you
enforce the limit — most apps reference both Granit.Http.RateLimiting (incoming
requests) and Granit.RateLimiting.Wolverine (message handlers).
| Package | Reference when | Brings in |
|---|---|---|
| Granit.RateLimiting | Always (the core). Counter stores, algorithms, quota providers, GranitRateLimitingModule. | Granit, Granit.Features |
| Granit.Http.RateLimiting | You throttle HTTP endpoints. Endpoint filter .RequireGranitRateLimiting("policy") → 429 + Retry-After + X-RateLimit-*. | core + Granit.Http.ExceptionHandling |
| Granit.RateLimiting.Wolverine | You throttle message handlers. [RateLimited("policy")] + pipeline middleware. No WolverineFx reference. | core |
This core + per-transport binding layout mirrors Granit.Features and is
captured in ADR-062.
Reference the binding(s) you need — each transport module pulls in the core
GranitRateLimitingModule automatically.
```csharp
[DependsOn(
    typeof(GranitHttpRateLimitingModule),        // HTTP endpoints
    typeof(GranitRateLimitingWolverineModule))]  // Wolverine handlers
public class AppModule : GranitModule { }
```

```bash
dotnet add package Granit.Http.RateLimiting
dotnet add package Granit.RateLimiting.Wolverine
```

```json
{
  "RateLimiting": {
    "Enabled": true,
    "KeyPrefix": "rl",
    "BypassRoles": ["admin"],
    "FallbackOnCounterStoreFailure": "Deny",
    "Policies": {
      "api-default": {
        "Algorithm": "SlidingWindow",
        "PermitLimit": 1000,
        "Window": "00:01:00",
        "SegmentsPerWindow": 6
      },
      "api-sensitive": {
        "Algorithm": "TokenBucket",
        "TokenLimit": 50,
        "TokensPerPeriod": 10,
        "ReplenishmentPeriod": "00:00:10"
      }
    }
  }
}
```

Applying to endpoints
The endpoint filter lives in Granit.Http.RateLimiting:

```csharp
using Granit.Http.RateLimiting.AspNetCore;

app.MapGet("/api/v1/appointments", GetAppointments)
    .RequireGranitRateLimiting("api-default");

app.MapPost("/api/v1/payments", ProcessPayment)
    .RequireGranitRateLimiting("api-sensitive");
```

GranitHttpRateLimitingModule registers the RFC 7807 mapper that turns
RateLimitExceededException into an HTTP 429 response.
Wolverine message handler support
The [RateLimited] attribute and middleware live in Granit.RateLimiting.Wolverine.
Decorate the message type and register the middleware:

```csharp
using Granit.RateLimiting.Wolverine.Attributes;

[RateLimited("api-default")]
public record SyncPatientCommand(Guid PatientId);

// In Wolverine configuration: apply the middleware only to message types
// that carry the [RateLimited] attribute.
opts.Policies.AddMiddleware<RateLimitMiddleware>(
    chain => chain.MessageType.GetCustomAttributes(typeof(RateLimitedAttribute), true).Length > 0);
```

When the rate limit is exceeded, RateLimitExceededException is thrown and handled
by Wolverine’s retry policy.
Algorithms
| Algorithm | Use case | Key parameters |
|---|---|---|
| SlidingWindow | General API rate limiting (default) | PermitLimit, Window, SegmentsPerWindow |
| FixedWindow | Simple counter, lowest memory | PermitLimit, Window |
| TokenBucket | Controlled burst allowance | TokenLimit, TokensPerPeriod, ReplenishmentPeriod |
| Concurrency | Limit simultaneous in-flight requests | PermitLimit |
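
To make the SlidingWindow and TokenBucket parameters concrete, the following is a minimal in-memory sketch of both algorithms. It is illustrative only and is not Granit's implementation: the class names SlidingWindowSketch and TokenBucketSketch are hypothetical, the caller passes the clock explicitly, and the sketch ignores thread safety and Redis storage.

```csharp
using System;

// Illustrative only: hypothetical in-memory sketches, not Granit's implementation.
// Not thread-safe; the caller supplies the current time.
public sealed class SlidingWindowSketch
{
    private readonly int _permitLimit;
    private readonly long _segmentTicks;
    private readonly int[] _segments;   // ring buffer of per-segment counts
    private long _newestSegment;        // absolute index of the newest segment seen
    private int _total;                 // sum of all live segment counts

    public SlidingWindowSketch(int permitLimit, TimeSpan window, int segmentsPerWindow)
    {
        _permitLimit = permitLimit;
        _segmentTicks = window.Ticks / segmentsPerWindow;
        _segments = new int[segmentsPerWindow];
    }

    public bool TryAcquire(DateTimeOffset now)
    {
        long segment = now.UtcTicks / _segmentTicks;

        // Drop the counts of segments that have slid out of the window.
        long expired = Math.Min(segment - _newestSegment, _segments.Length);
        for (long i = 1; i <= expired; i++)
        {
            int slot = (int)((_newestSegment + i) % _segments.Length);
            _total -= _segments[slot];
            _segments[slot] = 0;
        }
        if (segment > _newestSegment) _newestSegment = segment;

        if (_total >= _permitLimit) return false;
        _segments[(int)(_newestSegment % _segments.Length)]++;
        _total++;
        return true;
    }
}

public sealed class TokenBucketSketch
{
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private readonly TimeSpan _replenishmentPeriod;
    private int _tokens;
    private DateTimeOffset _lastRefill;

    public TokenBucketSketch(int tokenLimit, int tokensPerPeriod,
                             TimeSpan replenishmentPeriod, DateTimeOffset now)
    {
        _tokenLimit = tokenLimit;
        _tokensPerPeriod = tokensPerPeriod;
        _replenishmentPeriod = replenishmentPeriod;
        _tokens = tokenLimit;   // start full so an initial burst is allowed
        _lastRefill = now;
    }

    public bool TryAcquire(DateTimeOffset now)
    {
        // Credit whole elapsed replenishment periods, capped at the bucket size.
        long periods = (now - _lastRefill).Ticks / _replenishmentPeriod.Ticks;
        if (periods > 0)
        {
            _tokens = (int)Math.Min(_tokenLimit, _tokens + periods * _tokensPerPeriod);
            _lastRefill += TimeSpan.FromTicks(periods * _replenishmentPeriod.Ticks);
        }
        if (_tokens == 0) return false;
        _tokens--;
        return true;
    }
}
```

In this sketch a larger SegmentsPerWindow makes expiry finer-grained at the cost of one extra counter per segment, while the token bucket permits a burst of up to TokenLimit requests and then settles at TokensPerPeriod per ReplenishmentPeriod.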
Key partitioning
Rate limit counters are partitioned according to PartitionBy on each policy.
Redis keys use hash tags for Cluster slot co-location.

| PartitionBy | Key format | Use case |
|---|---|---|
| Tenant (default) | rl:{tenantId}:policy | Shared tenant quota |
| TenantAndIp | rl:{tenantId}:ip:policy | Per-IP within a tenant |
| Ip | rl:{ip}:policy | Unauthenticated endpoints (login, reset) |
| User | rl:{userId}:policy | Per-user quota |
| TenantAndUser | rl:{tenantId}:userId:policy | Per-user within a tenant |

```json
{
  "Policies": {
    "auth": {
      "Algorithm": "FixedWindow",
      "PermitLimit": 5,
      "Window": "00:15:00",
      "PartitionBy": "Ip"
    }
  }
}
```

Failure behavior
When Redis is unavailable, the FallbackOnCounterStoreFailure setting controls behavior:

| Value | Behavior | Use case |
|---|---|---|
| Allow | Let the request through, log warning | Prefer availability over quota enforcement |
| Deny | Reject with 429 | Conservative: prefer safety over availability |
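
The two values correspond to failing open (Allow) and failing closed (Deny). A minimal sketch of that decision follows, assuming a hypothetical CounterStoreFallback enum and a delegate standing in for the counter store call; neither is part of the Granit API.

```csharp
using System;

// Hypothetical names for illustration; not Granit types.
public enum CounterStoreFallback { Allow, Deny }

public static class FallbackSketch
{
    // Returns true when the request may proceed.
    public static bool IsAllowed(Func<bool> tryAcquirePermit, CounterStoreFallback fallback)
    {
        try
        {
            // Normal path: the counter store decides.
            return tryAcquirePermit();
        }
        catch (Exception)
        {
            // Store unavailable: fail open (Allow) or fail closed (Deny).
            // A real implementation would catch the store's specific exception
            // type and log a warning on the Allow path.
            return fallback == CounterStoreFallback.Allow;
        }
    }
}
```

Note that the fallback only applies when the store call itself fails; a successful call that reports an exhausted quota still rejects the request under either setting.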
Response headers
On allowed responses, the following headers are set:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Total permit limit for the policy |
| X-RateLimit-Remaining | Remaining permits in the current window |
On rejected responses (429), Retry-After is set in addition.
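
On the client side, a caller can use Retry-After to decide how long to back off. The sketch below uses the standard HttpClient header types; the class name RetryAfterSketch, the retry count, and the one-second fallback are assumptions for illustration. It handles both forms of the header (a delay in seconds and an HTTP date), since the form Granit emits is not stated here.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical client-side helper; not part of Granit.
public static class RetryAfterSketch
{
    // Computes how long to wait before retrying a rejected request.
    public static TimeSpan GetRetryDelay(HttpResponseMessage response, TimeSpan fallback)
    {
        var retryAfter = response.Headers.RetryAfter;
        if (retryAfter?.Delta is TimeSpan delta) return delta;      // "Retry-After: 10"
        if (retryAfter?.Date is DateTimeOffset date)                 // "Retry-After: <HTTP date>"
        {
            var wait = date - DateTimeOffset.UtcNow;
            return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
        }
        return fallback;                                             // header missing
    }

    // Retries a GET while the server answers 429, up to maxAttempts in total.
    public static async Task<HttpResponseMessage> GetWithRetryAsync(
        HttpClient client, string url, int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            var response = await client.GetAsync(url);
            if (response.StatusCode != HttpStatusCode.TooManyRequests || attempt >= maxAttempts)
                return response;

            var delay = GetRetryDelay(response, TimeSpan.FromSeconds(1));
            response.Dispose();
            await Task.Delay(delay);
        }
    }
}
```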
429 response format
```json
{
  "status": 429,
  "title": "Too Many Requests",
  "detail": "Too many requests. Please retry later.",
  "limit": 1000,
  "remaining": 0,
  "retryAfter": 10
}
```

Policies in the wild: OData feeds
The OData exposure module bridges QueryDefinition<TEntity>
to OData v4 EntitySets for BI tools (Power BI, Excel, Tableau). Every OData route is
gated by the granit-odata policy (per-tenant feed) or granit-odata-host (cross-tenant
host feed):
```json
{
  "RateLimiting": {
    "Policies": {
      "granit-odata": {
        "Algorithm": "SlidingWindow",
        "PartitionBy": "Tenant",
        "PermitLimit": 60,
        "Window": "00:01:00"
      },
      "granit-odata-host": {
        "Algorithm": "SlidingWindow",
        "PartitionBy": "User",
        "PermitLimit": 600,
        "Window": "00:01:00"
      }
    }
  }
}
```

Per-tenant partitioning is what makes the feed safe against a single tenant
saturating the backend with a misconfigured Power BI refresh job. Accepted requests
carry X-RateLimit-Limit + X-RateLimit-Remaining so the BI tool’s incremental
refresh logic can throttle itself before hitting 429.
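
As a rough illustration of that self-throttling, a client could read the two headers and slow down once the remaining share of the quota drops below a threshold of its choosing. The helper name QuotaHeaderSketch is hypothetical, and any threshold is the caller's own policy; only the header names come from this page.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Net.Http;

// Hypothetical client-side helper; not part of Granit.
public static class QuotaHeaderSketch
{
    // Returns the fraction of the quota still available,
    // or null when the headers are missing or malformed.
    public static double? RemainingFraction(HttpResponseMessage response)
    {
        if (TryReadInt(response, "X-RateLimit-Limit", out int limit) &&
            TryReadInt(response, "X-RateLimit-Remaining", out int remaining) &&
            limit > 0)
        {
            return (double)remaining / limit;
        }
        return null;
    }

    private static bool TryReadInt(HttpResponseMessage response, string name, out int value)
    {
        value = 0;
        return response.Headers.TryGetValues(name, out var values)
            && int.TryParse(values.FirstOrDefault(), NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out value);
    }
}
```

A refresh job could, for example, pause between requests whenever RemainingFraction returns a value under 0.1.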
Dynamic quotas with Granit.Features
When UseFeatureBasedQuotas is enabled, the permit limit is resolved dynamically
from Granit.Features (e.g., per-plan quotas). The convention-based feature name
is RateLimit.{PolicyName}, overridable via RateLimitPolicyOptions.FeatureName.
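
The naming convention can be sketched as a small helper. The function name ResolveFeatureName is hypothetical and only mirrors the rule described above; how Granit.Features then maps the feature to a numeric limit is not shown.

```csharp
using System;

// Hypothetical helper mirroring the documented naming convention; not a Granit API.
public static class FeatureNameSketch
{
    public static string ResolveFeatureName(string policyName, string featureNameOverride)
    {
        // An explicit FeatureName on the policy wins; otherwise use RateLimit.{PolicyName}.
        return string.IsNullOrWhiteSpace(featureNameOverride)
            ? $"RateLimit.{policyName}"
            : featureNameOverride;
    }
}
```

Under this rule the api-default policy resolves to the feature RateLimit.api-default unless its FeatureName is set.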
Configuration reference
| Property | Default | Description |
|---|---|---|
| Enabled | true | Enable/disable rate limiting globally |
| KeyPrefix | "rl" | Redis key prefix |
| FallbackOnCounterStoreFailure | Deny | Behavior when Redis is down |
| BypassRoles | [] | Roles that skip rate limiting |
| UseFeatureBasedQuotas | false | Use Granit.Features for dynamic quotas |
| Policies.* | n/a | Named rate limiting policies (see below) |
Policy options:
| Property | Default | Description |
|---|---|---|
| Algorithm | SlidingWindow | Rate limiting algorithm |
| PartitionBy | Tenant | Key partitioning strategy (Tenant, TenantAndIp, Ip, User, TenantAndUser) |
| PermitLimit | 1000 | Max permits per window |
| Window | 00:01:00 | Window duration |
| SegmentsPerWindow | 6 | Sliding window segments (accuracy vs. memory) |
| TokenLimit | 50 | Max tokens (TokenBucket only) |
| TokensPerPeriod | 10 | Tokens added per replenishment (TokenBucket only) |
| ReplenishmentPeriod | 00:00:10 | Replenishment interval (TokenBucket only) |
| FeatureName | null | Override feature name for dynamic quotas |
Public API summary
| Category | Key types | Package |
|---|---|---|
| Modules | GranitRateLimitingModule, GranitHttpRateLimitingModule, GranitRateLimitingWolverineModule | — |
| Store | IRateLimitCounterStore, IRateLimitQuotaProvider, RateLimitResult, TenantPartitionedRateLimiter | Granit.RateLimiting |
| Exceptions | RateLimitExceededException | Granit.RateLimiting |
| Options | GranitRateLimitingOptions, RateLimitPolicyOptions, RateLimitPartition | Granit.RateLimiting |
| Core extensions | AddGranitRateLimiting() | Granit.RateLimiting |
| HTTP binding | .RequireGranitRateLimiting(), AddGranitHttpRateLimiting() (Granit.Http.RateLimiting.AspNetCore) | Granit.Http.RateLimiting |
| Wolverine binding | RateLimitedAttribute, RateLimitMiddleware (Granit.RateLimiting.Wolverine.Attributes) | Granit.RateLimiting.Wolverine |
Migrating from the single package
Granit.RateLimiting was a single ASP.NET-Core-coupled package before the
layer split. These are breaking moves (pre-1.0):
| Was | Now |
|---|---|
| using Granit.RateLimiting.AspNetCore; (endpoint filter) | using Granit.Http.RateLimiting.AspNetCore; + reference Granit.Http.RateLimiting |
| RateLimitedAttribute in Granit.RateLimiting.Attributes | Granit.RateLimiting.Wolverine.Attributes + reference Granit.RateLimiting.Wolverine |
| AddGranitRateLimiting() wired the 429 mapper | AddGranitHttpRateLimiting() (auto via GranitHttpRateLimitingModule) |
| One GranitRateLimitingModule | Core module + GranitHttpRateLimitingModule / GranitRateLimitingWolverineModule |
See also
- Feature Flags: same core + transport-binding split, and the source of dynamic quotas
- ADR-062: framework-pure core + transport bindings (the layering convention)
- Exception Handling: Problem Details format
- Wolverine module: message bus, retry policies
- API & Http overview: all HTTP infrastructure packages