Rate Limiting — Sliding Window & Token Bucket

Granit.RateLimiting provides per-tenant rate limiting with four algorithms, configurable policies, and Redis-backed counters (with in-memory fallback). The core is framework-pure (no ASP.NET Core dependency), and the HTTP and Wolverine enforcement points ship as separate transport bindings. It integrates with Granit.Features for plan-based dynamic quotas.

The counters, algorithms, quota resolution, and TenantPartitionedRateLimiter live in the transport-agnostic core. Add the binding that matches where you enforce the limit — most apps reference both Granit.Http.RateLimiting (incoming requests) and Granit.RateLimiting.Wolverine (message handlers).

| Package | Reference when | Brings in |
| --- | --- | --- |
| Granit.RateLimiting | Always (the core). Counter stores, algorithms, quota providers, GranitRateLimitingModule. | Granit, Granit.Features |
| Granit.Http.RateLimiting | You throttle HTTP endpoints. Endpoint filter .RequireGranitRateLimiting("policy"); 429 + Retry-After + X-RateLimit-*. | core + Granit.Http.ExceptionHandling |
| Granit.RateLimiting.Wolverine | You throttle message handlers. [RateLimited("policy")] + pipeline middleware. No WolverineFx reference. | core |

This core + per-transport binding layout mirrors Granit.Features and is captured in ADR-062.

Reference the binding(s) you need — each transport module pulls in the core GranitRateLimitingModule automatically.

[DependsOn(
    typeof(GranitHttpRateLimitingModule),       // HTTP endpoints
    typeof(GranitRateLimitingWolverineModule))] // Wolverine handlers
public class AppModule : GranitModule { }

dotnet add package Granit.Http.RateLimiting
dotnet add package Granit.RateLimiting.Wolverine

{
  "RateLimiting": {
    "Enabled": true,
    "KeyPrefix": "rl",
    "BypassRoles": ["admin"],
    "FallbackOnCounterStoreFailure": "Deny",
    "Policies": {
      "api-default": {
        "Algorithm": "SlidingWindow",
        "PermitLimit": 1000,
        "Window": "00:01:00",
        "SegmentsPerWindow": 6
      },
      "api-sensitive": {
        "Algorithm": "TokenBucket",
        "TokenLimit": 50,
        "TokensPerPeriod": 10,
        "ReplenishmentPeriod": "00:00:10"
      }
    }
  }
}

The endpoint filter lives in Granit.Http.RateLimiting:

using Granit.Http.RateLimiting.AspNetCore;

app.MapGet("/api/v1/appointments", GetAppointments)
    .RequireGranitRateLimiting("api-default");

app.MapPost("/api/v1/payments", ProcessPayment)
    .RequireGranitRateLimiting("api-sensitive");

GranitHttpRateLimitingModule registers the RFC 7807 mapper that turns RateLimitExceededException into an HTTP 429 response.

The [RateLimited] attribute and middleware live in Granit.RateLimiting.Wolverine. Decorate the message type and register the middleware:

using Granit.RateLimiting.Wolverine.Attributes;

[RateLimited("api-default")]
public record SyncPatientCommand(Guid PatientId);

// In Wolverine configuration
opts.Policies.AddMiddleware<RateLimitMiddleware>(
    chain => chain.MessageType.GetCustomAttributes(typeof(RateLimitedAttribute), true).Length > 0);

When the rate limit is exceeded, RateLimitExceededException is thrown and handled by Wolverine’s retry policy.

| Algorithm | Use case | Key parameters |
| --- | --- | --- |
| SlidingWindow | General API rate limiting (default) | PermitLimit, Window, SegmentsPerWindow |
| FixedWindow | Simple counter, lowest memory | PermitLimit, Window |
| TokenBucket | Controlled burst allowance | TokenLimit, TokensPerPeriod, ReplenishmentPeriod |
| Concurrency | Limit simultaneous in-flight requests | PermitLimit |
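
To make the SlidingWindow and TokenBucket parameters concrete, here is a minimal in-memory sketch of both techniques. It is not Granit's implementation: the class names SegmentedSlidingWindow and TokenBucket are hypothetical, the current time is passed in explicitly so the behavior is deterministic, and neither class is thread-safe or Redis-backed.

```csharp
using System;

// Illustrative sketches only; NOT Granit's implementations.
// Class names are hypothetical and neither class is thread-safe.

// Sliding window split into fixed segments (compare SegmentsPerWindow).
public sealed class SegmentedSlidingWindow
{
    private readonly int _permitLimit;
    private readonly long _segmentTicks;
    private readonly int[] _counts;      // permits granted per segment (ring buffer)
    private readonly long[] _segmentIds; // absolute segment index held by each slot

    public SegmentedSlidingWindow(int permitLimit, TimeSpan window, int segmentsPerWindow)
    {
        _permitLimit = permitLimit;
        _segmentTicks = window.Ticks / segmentsPerWindow;
        _counts = new int[segmentsPerWindow];
        _segmentIds = new long[segmentsPerWindow];
        Array.Fill(_segmentIds, -1L);
    }

    public bool TryAcquire(DateTimeOffset now)
    {
        long current = now.UtcTicks / _segmentTicks;
        int slot = (int)(current % _counts.Length);
        if (_segmentIds[slot] != current)
        {
            // The slot still holds an expired segment: recycle it.
            _segmentIds[slot] = current;
            _counts[slot] = 0;
        }

        int used = 0;
        for (int i = 0; i < _counts.Length; i++)
        {
            // Only segments inside the trailing window count against the limit.
            if (_segmentIds[i] > current - _counts.Length) used += _counts[i];
        }

        if (used >= _permitLimit) return false;
        _counts[slot]++;
        return true;
    }
}

// Token bucket: starts full, then gains tokensPerPeriod tokens per elapsed period.
public sealed class TokenBucket
{
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private readonly TimeSpan _period;
    private int _tokens;
    private DateTimeOffset _lastRefill;

    public TokenBucket(int tokenLimit, int tokensPerPeriod, TimeSpan period, DateTimeOffset start)
    {
        _tokenLimit = tokenLimit;
        _tokensPerPeriod = tokensPerPeriod;
        _period = period;
        _tokens = tokenLimit;
        _lastRefill = start;
    }

    public bool TryAcquire(DateTimeOffset now)
    {
        long periods = (now - _lastRefill).Ticks / _period.Ticks;
        if (periods > 0)
        {
            // Refill in whole periods and never exceed the bucket capacity.
            _tokens = (int)Math.Min(_tokenLimit, _tokens + periods * _tokensPerPeriod);
            _lastRefill += TimeSpan.FromTicks(periods * _period.Ticks);
        }

        if (_tokens == 0) return false;
        _tokens--;
        return true;
    }
}
```

Read against the api-sensitive policy shown earlier (TokenLimit 50, TokensPerPeriod 10, ReplenishmentPeriod 10 seconds), a bucket like this would absorb a burst of 50 requests and then sustain 10 requests per 10 seconds, which is 1 request per second on average. In the sliding window sketch, more segments give a finer-grained window edge at the cost of one extra counter per segment.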

Rate limit counters are partitioned according to PartitionBy on each policy. Redis keys use hash tags for Cluster slot co-location.

| PartitionBy | Key format | Use case |
| --- | --- | --- |
| Tenant (default) | rl:{tenantId}:policy | Shared tenant quota |
| TenantAndIp | rl:{tenantId}:ip:policy | Per-IP within a tenant |
| Ip | rl:{ip}:policy | Unauthenticated endpoints (login, reset) |
| User | rl:{userId}:policy | Per-user quota |
| TenantAndUser | rl:{tenantId}:userId:policy | Per-user within a tenant |

{
  "Policies": {
    "auth": {
      "Algorithm": "FixedWindow",
      "PermitLimit": 5,
      "Window": "00:15:00",
      "PartitionBy": "Ip"
    }
  }
}
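
The key formats in the table can be sketched as a small builder. This is one reading of the table, not the library's code: it assumes the braces are literal Redis hash tags wrapped around the first identifier, and the names PartitionKind and RateLimitKeySketch are hypothetical stand-ins.

```csharp
#nullable enable
using System;

// Local stand-in for the PartitionBy values; not the library's own type.
public enum PartitionKind { Tenant, TenantAndIp, Ip, User, TenantAndUser }

public static class RateLimitKeySketch
{
    // Braces are emitted literally. Redis Cluster hashes only the text inside
    // the first {...} pair, so keys sharing that tag land in the same slot.
    public static string Build(
        PartitionKind partitionBy,
        string policy,
        string? tenantId = null,
        string? ip = null,
        string? userId = null,
        string prefix = "rl") => partitionBy switch
    {
        PartitionKind.Tenant =>
            $"{prefix}:{{{Require(tenantId, nameof(tenantId))}}}:{policy}",
        PartitionKind.TenantAndIp =>
            $"{prefix}:{{{Require(tenantId, nameof(tenantId))}}}:{Require(ip, nameof(ip))}:{policy}",
        PartitionKind.Ip =>
            $"{prefix}:{{{Require(ip, nameof(ip))}}}:{policy}",
        PartitionKind.User =>
            $"{prefix}:{{{Require(userId, nameof(userId))}}}:{policy}",
        PartitionKind.TenantAndUser =>
            $"{prefix}:{{{Require(tenantId, nameof(tenantId))}}}:{Require(userId, nameof(userId))}:{policy}",
        _ => throw new ArgumentOutOfRangeException(nameof(partitionBy))
    };

    // Fails fast when the identifier a partition needs is missing.
    private static string Require(string? value, string name) =>
        !string.IsNullOrEmpty(value)
            ? value
            : throw new ArgumentException($"{name} is required", name);
}
```

Under these assumptions, every key for one tenant shares the same hash tag, so a tenant's counters for different policies would live in the same Cluster slot.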

When Redis is unavailable, the FallbackOnCounterStoreFailure setting controls behavior:

| Value | Behavior | Use case |
| --- | --- | --- |
| Allow | Let the request through, log warning | Prefer availability over quota enforcement |
| Deny | Reject with 429 | Conservative: prefer safety over availability |
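
The two fallback modes amount to a fail-open versus fail-closed decision around the counter-store call. The sketch below shows that pattern under stated assumptions: CounterStoreFallback, CounterStoreGuard, and the checkAsync delegate are hypothetical names, and the real library may detect store failures differently.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;

// Hypothetical mirror of the two FallbackOnCounterStoreFailure values.
public enum CounterStoreFallback { Allow, Deny }

public static class CounterStoreGuard
{
    // checkAsync stands in for a counter-store call that returns true when a
    // permit is granted and throws when the store (for example Redis) is down.
    public static async Task<bool> IsAllowedAsync(
        Func<Task<bool>> checkAsync,
        CounterStoreFallback fallback,
        Action<Exception>? logWarning = null)
    {
        try
        {
            return await checkAsync();
        }
        catch (Exception ex)
        {
            // This sketch reports the failure in both modes (an assumption).
            logWarning?.Invoke(ex);

            // Allow = fail open (request proceeds); Deny = fail closed (429).
            return fallback == CounterStoreFallback.Allow;
        }
    }
}
```

Deny is the safer choice for endpoints where the limit is a security control, such as login attempts. Allow suits endpoints where the limit only protects capacity.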

On allowed responses, the following headers are set:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Total permit limit for the policy |
| X-RateLimit-Remaining | Remaining permits in the current window |

On rejected responses (429), Retry-After is set in addition.

{
  "status": 429,
  "title": "Too Many Requests",
  "detail": "Too many requests. Please retry later.",
  "limit": 1000,
  "remaining": 0,
  "retryAfter": 10
}
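
On the caller's side, these headers can drive a simple back-off. The following is a client-side sketch using only standard System.Net.Http types. RateLimitAwareClient and its method names are hypothetical, and the one-second fallback delay and three-attempt cap are arbitrary choices for illustration.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class RateLimitAwareClient
{
    // Reads Retry-After as either delta-seconds or an HTTP date.
    public static TimeSpan GetRetryDelay(HttpResponseMessage response, TimeSpan fallback)
    {
        var retryAfter = response.Headers.RetryAfter;
        if (retryAfter?.Delta is TimeSpan delta) return delta;
        if (retryAfter?.Date is DateTimeOffset date)
        {
            var wait = date - DateTimeOffset.UtcNow;
            return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
        }
        return fallback;
    }

    // Returns X-RateLimit-Remaining, or null when the header is absent or malformed.
    public static int? GetRemaining(HttpResponseMessage response) =>
        response.Headers.TryGetValues("X-RateLimit-Remaining", out var values)
        && int.TryParse(values.FirstOrDefault(), out var remaining)
            ? remaining
            : (int?)null;

    // Retries a GET while the server answers 429, waiting as instructed.
    public static async Task<HttpResponseMessage> GetWithRetryAsync(
        HttpClient client, string url, int maxAttempts = 3, CancellationToken ct = default)
    {
        for (int attempt = 1; ; attempt++)
        {
            var response = await client.GetAsync(url, ct);
            if (response.StatusCode != HttpStatusCode.TooManyRequests || attempt >= maxAttempts)
                return response;

            var delay = GetRetryDelay(response, TimeSpan.FromSeconds(1));
            response.Dispose();
            await Task.Delay(delay, ct);
        }
    }
}
```

A caller could also check GetRemaining on successful responses and slow down before the quota reaches zero, rather than waiting for the first 429.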

The OData exposure module bridges QueryDefinition<TEntity> to OData v4 EntitySets for BI tools (Power BI, Excel, Tableau). Every OData route is gated by the granit-odata policy (per-tenant feed) or granit-odata-host (cross-tenant host feed):

{
  "RateLimiting": {
    "Policies": {
      "granit-odata": {
        "Algorithm": "SlidingWindow",
        "PartitionBy": "Tenant",
        "PermitLimit": 60,
        "Window": "00:01:00"
      },
      "granit-odata-host": {
        "Algorithm": "SlidingWindow",
        "PartitionBy": "User",
        "PermitLimit": 600,
        "Window": "00:01:00"
      }
    }
  }
}

Per-tenant partitioning is what makes the feed safe against a single tenant saturating the backend with a misconfigured Power BI refresh job. Accepted requests carry X-RateLimit-Limit + X-RateLimit-Remaining so the BI tool’s incremental refresh logic can throttle itself before hitting 429.

When UseFeatureBasedQuotas is enabled, the permit limit is resolved dynamically from Granit.Features (e.g., per-plan quotas). The convention-based feature name is RateLimit.{PolicyName}, overridable via RateLimitPolicyOptions.FeatureName.
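
The naming convention can be sketched as follows. QuotaSketch and the lookupFeature delegate are hypothetical stand-ins for a Granit.Features query, and falling back to the configured PermitLimit when the feature has no value is an assumption of this sketch, not documented behavior.

```csharp
#nullable enable
using System;

public static class QuotaSketch
{
    // Convention: RateLimit.{PolicyName}, unless FeatureName overrides it.
    public static string ResolveFeatureName(string policyName, string? featureNameOverride) =>
        string.IsNullOrWhiteSpace(featureNameOverride)
            ? $"RateLimit.{policyName}"
            : featureNameOverride;

    // lookupFeature is a hypothetical stand-in for a Granit.Features lookup.
    // Assumption: when it returns null, the statically configured limit applies.
    public static int ResolvePermitLimit(
        string policyName,
        string? featureNameOverride,
        int configuredPermitLimit,
        Func<string, int?> lookupFeature) =>
        lookupFeature(ResolveFeatureName(policyName, featureNameOverride))
            ?? configuredPermitLimit;
}
```

For the api-default policy with no override, the feature name under this convention would be RateLimit.api-default.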

| Property | Default | Description |
| --- | --- | --- |
| Enabled | true | Enable/disable rate limiting globally |
| KeyPrefix | "rl" | Redis key prefix |
| FallbackOnCounterStoreFailure | Deny | Behavior when Redis is down |
| BypassRoles | [] | Roles that skip rate limiting |
| UseFeatureBasedQuotas | false | Use Granit.Features for dynamic quotas |
| Policies.* | | Named rate limiting policies (see below) |

Policy options:

| Property | Default | Description |
| --- | --- | --- |
| Algorithm | SlidingWindow | Rate limiting algorithm |
| PartitionBy | Tenant | Key partitioning strategy (Tenant, TenantAndIp, Ip, User, TenantAndUser) |
| PermitLimit | 1000 | Max permits per window |
| Window | 00:01:00 | Window duration |
| SegmentsPerWindow | 6 | Sliding window segments (accuracy vs. memory) |
| TokenLimit | 50 | Max tokens (TokenBucket only) |
| TokensPerPeriod | 10 | Tokens added per replenishment (TokenBucket only) |
| ReplenishmentPeriod | 00:00:10 | Replenishment interval (TokenBucket only) |
| FeatureName | null | Override feature name for dynamic quotas |

| Category | Key types | Package |
| --- | --- | --- |
| Modules | GranitRateLimitingModule, GranitHttpRateLimitingModule, GranitRateLimitingWolverineModule | |
| Store | IRateLimitCounterStore, IRateLimitQuotaProvider, RateLimitResult, TenantPartitionedRateLimiter | Granit.RateLimiting |
| Exceptions | RateLimitExceededException | Granit.RateLimiting |
| Options | GranitRateLimitingOptions, RateLimitPolicyOptions, RateLimitPartition | Granit.RateLimiting |
| Core extensions | AddGranitRateLimiting() | Granit.RateLimiting |
| HTTP binding | .RequireGranitRateLimiting(), AddGranitHttpRateLimiting() (Granit.Http.RateLimiting.AspNetCore) | Granit.Http.RateLimiting |
| Wolverine binding | RateLimitedAttribute, RateLimitMiddleware (Granit.RateLimiting.Wolverine.Attributes) | Granit.RateLimiting.Wolverine |

Granit.RateLimiting was a single ASP.NET-Core-coupled package before the layer split. These are breaking moves (pre-1.0):

| Was | Now |
| --- | --- |
| using Granit.RateLimiting.AspNetCore; (endpoint filter) | using Granit.Http.RateLimiting.AspNetCore; + reference Granit.Http.RateLimiting |
| RateLimitedAttribute in Granit.RateLimiting.Attributes | Granit.RateLimiting.Wolverine.Attributes + reference Granit.RateLimiting.Wolverine |
| AddGranitRateLimiting() wired the 429 mapper | AddGranitHttpRateLimiting() (auto via GranitHttpRateLimitingModule) |
| One GranitRateLimitingModule | Core module + GranitHttpRateLimitingModule / GranitRateLimitingWolverineModule |