
# Rate Limiting Pattern — API Throttling

Rate Limiting controls the number of requests a client can send within a given time window. In a multi-tenant SaaS context, it protects against the noisy neighbor problem — a greedy tenant degrading performance for everyone else. Granit implements this pattern via Granit.RateLimiting with per-tenant partitioning, atomic Redis counters (Lua scripts), and dynamic quotas linked to pricing plans via Granit.Features.

```mermaid
flowchart LR
    R[HTTP Request] --> F{Bypass?}
    F -- Admin role --> A[Allowed]
    F -- No --> T[Tenant resolution]
    T --> Q[Quota resolution]
    Q --> C{Redis counter}
    C -- within limit --> A
    C -- over limit --> D[429 Too Many Requests]
    D --> RA[Retry-After header]
```

```mermaid
sequenceDiagram
    participant Client
    participant Filter as Endpoint Filter
    participant Limiter as TenantPartitionedRateLimiter
    participant Redis

    Client->>Filter: GET /api/patients
    Filter->>Limiter: CheckAsync("api", clientIp)
    Limiter->>Redis: EVALSHA sliding_window.lua
    Redis-->>Limiter: count: 42, oldest: 0
    Limiter-->>Filter: Allowed (remaining: 58)
    Filter-->>Client: 200 OK + X-RateLimit-Remaining: 58

    Note over Client,Redis: After 100 requests in 60s...

    Client->>Filter: GET /api/patients
    Filter->>Limiter: CheckAsync("api", clientIp)
    Limiter->>Redis: EVALSHA sliding_window.lua
    Redis-->>Limiter: count: 101, oldest: 18000
    Limiter-->>Filter: Rejected (retryAfter: 18s)
    Filter-->>Client: 429 + Retry-After: 18
```
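
To make the flow concrete, here is a minimal endpoint-filter sketch. It is illustrative only: `ITenantRateLimiter`, `RateLimitResult`, and their members are hypothetical stand-ins, not the actual Granit.RateLimiting types.

```csharp
// Minimal sketch of the filter's check-and-respond flow from the diagram.
// ITenantRateLimiter, RateLimitResult, CheckAsync: hypothetical stand-ins.
public sealed class RateLimitFilterSketch(ITenantRateLimiter limiter, string policy)
    : IEndpointFilter
{
    public async ValueTask<object?> InvokeAsync(
        EndpointFilterInvocationContext context, EndpointFilterDelegate next)
    {
        var ip = context.HttpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        var result = await limiter.CheckAsync(policy, ip);

        if (!result.IsAllowed)
        {
            // Mirrors the diagram: 429 plus a Retry-After hint in seconds.
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)result.RetryAfter.TotalSeconds).ToString();
            return Results.StatusCode(StatusCodes.Status429TooManyRequests);
        }

        context.HttpContext.Response.Headers["X-RateLimit-Remaining"] =
            result.Remaining.ToString();
        return await next(context);
    }
}
```
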
| Package | Role |
| --- | --- |
| Granit.RateLimiting | Complete module: counters, middleware, options, metrics |

Each algorithm is implemented as a Lua script executed atomically by Redis (`EVALSHA`). Timestamps are taken server-side (`redis.call('TIME')`) to avoid clock-drift issues between pods. A sketch of the sliding-window variant follows the table.

| Algorithm | Redis structure | Use case |
| --- | --- | --- |
| Sliding Window | Sorted set (`ZADD` + `ZREMRANGEBYSCORE`) | Public APIs — maximum precision |
| Fixed Window | Counter (`INCR` + `PEXPIRE`) | Low-volume endpoints — simplicity |
| Token Bucket | Hash (`HMGET`/`HSET` + refill) | Export jobs — controlled bursts |
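
As an illustration, a sliding-window script in this style could look like the sketch below. It is a hypothetical rendition, not the actual contents of LuaScripts.cs; `ARGV[3]` (a unique request id) is an assumed input.

```csharp
// Hypothetical sliding-window script, not the actual LuaScripts.cs contents.
// KEYS[1] = partition key, ARGV[1] = window in ms, ARGV[2] = permit limit,
// ARGV[3] = unique request id (assumed input for distinct sorted-set members).
internal static class SlidingWindowSketch
{
    public const string Script = @"
        local t = redis.call('TIME')                -- server-side clock
        local now_ms = t[1] * 1000 + math.floor(t[2] / 1000)
        redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, now_ms - ARGV[1])
        local count = redis.call('ZCARD', KEYS[1])
        if count < tonumber(ARGV[2]) then
            redis.call('ZADD', KEYS[1], now_ms, ARGV[3])
            redis.call('PEXPIRE', KEYS[1], ARGV[1])
            return {1, count + 1, 0}                -- allowed, new count
        end
        local oldest = redis.call('ZRANGE', KEYS[1], 0, 0, 'WITHSCORES')
        return {0, count, (oldest[2] + ARGV[1]) - now_ms} -- rejected, retry-after in ms";
}
```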

The Redis key is structured with a hash tag to guarantee co-location in Redis Cluster. The `PartitionBy` policy option controls the key strategy (a key-builder sketch follows the table):

| PartitionBy | Key pattern | Use case |
| --- | --- | --- |
| Tenant (default) | `rl:{tenantId}:api` | Shared tenant quota |
| TenantAndIp | `rl:{tenantId}:1.2.3.4:api` | Per-IP within a tenant |
| Ip | `rl:{1.2.3.4}:auth` | Pre-auth endpoints (login) |
| User | `rl:{userId}:export` | Per-user quota |
| TenantAndUser | `rl:{tenantId}:userId:api` | Per-user within a tenant |

Without multi-tenancy, the global segment is used. Each partition key has its own counters — a tenant (or IP, or user) can never consume another’s quota.
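
For illustration, the key construction could be sketched as below; `BuildKey` is a hypothetical helper, and only the braced segment is a Redis Cluster hash tag:

```csharp
// Hypothetical key builder matching the patterns above. The braced segment
// is a hash tag, so all keys of one partition land on the same cluster slot.
internal static string BuildKey(PartitionBy partitionBy, string policy,
    string? tenantId, string? userId, string ip) => partitionBy switch
{
    // "global" stands in for the tenant segment when multi-tenancy is off
    PartitionBy.Tenant        => $"rl:{{{tenantId ?? "global"}}}:{policy}",
    PartitionBy.TenantAndIp   => $"rl:{{{tenantId ?? "global"}}}:{ip}:{policy}",
    PartitionBy.Ip            => $"rl:{{{ip}}}:{policy}",
    PartitionBy.User          => $"rl:{{{userId}}}:{policy}",
    PartitionBy.TenantAndUser => $"rl:{{{tenantId ?? "global"}}}:{userId}:{policy}",
    _ => throw new ArgumentOutOfRangeException(nameof(partitionBy)),
};
```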

When `UseFeatureBasedQuotas` is enabled, the `PermitLimit` is resolved dynamically from Granit.Features instead of static configuration:

```csharp
// Convention: Numeric feature named "RateLimit.{policyName}"
context.Add(
    new FeatureDefinition("RateLimit.api", FeatureValueType.Numeric(100, 10, 10000))
);
```

The Features resolution chain (Default > Plan > Tenant) enables differentiated quotas; a provider sketch follows the table:

| Plan | RateLimit.api | RateLimit.export |
| --- | --- | --- |
| Free | 60/min | 5/h |
| Pro | 500/min | 50/h |
| Enterprise | 5000/min | Unlimited |
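
A provider following the "RateLimit.{policyName}" convention might look roughly like this; `IFeatureChecker` and `GetNumericAsync` are assumed stand-ins for the Granit.Features API:

```csharp
// Sketch only: IFeatureChecker and GetNumericAsync are hypothetical names, not
// the actual Granit.Features contract behind FeatureBasedRateLimitQuotaProvider.
internal sealed class QuotaProviderSketch(IFeatureChecker features)
{
    public async Task<int> ResolvePermitLimitAsync(string policyName, int staticLimit)
    {
        // The Default > Plan > Tenant resolution happens inside Granit.Features;
        // this provider only maps the policy name to the feature name.
        int? quota = await features.GetNumericAsync($"RateLimit.{policyName}");
        return quota ?? staticLimit; // fall back to the configured PermitLimit
    }
}
```
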
```csharp
// --- ASP.NET Core: endpoint filter ---
app.MapGet("/api/v1/patients", GetPatientsAsync)
    .RequireGranitRateLimiting("api");

// --- Wolverine: attribute on the message ---
[RateLimited("export")]
public sealed record GeneratePatientExportCommand(Guid PatientId);
```

The HTTP filter returns `429 Too Many Requests` with an RFC 7807 problem-details body and a `Retry-After` header. The Wolverine middleware throws `RateLimitExceededException`, which pairs naturally with `RetryWithCooldown` (see the sketch below).
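
For instance, a host could back off and retry rate-limited messages with Wolverine's cooldown policy; the pairing below is a sketch with arbitrary delays:

```csharp
// Sketch: retrying rate-limited messages via Wolverine's error policies.
// Requires JasperFx.Core for the x.Seconds() extensions; the delays are
// arbitrary and should roughly track the policy's window.
builder.Host.UseWolverine(opts =>
{
    opts.OnException<RateLimitExceededException>()
        .RetryWithCooldown(5.Seconds(), 15.Seconds(), 30.Seconds());
});
```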

When Redis is unavailable, the behavior is configurable; a fallback sketch follows the table:

| Mode | Behavior | When to use |
| --- | --- | --- |
| Deny (default) | Every request rejected with 429 | Fail-closed — security first |
| Allow | Request allowed + warning log | Availability over quota protection |
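
The degradation logic could be sketched roughly as follows; `FallbackMode`, `RateLimitResult`, and `EvaluateScriptAsync` are assumed names rather than the actual internals of RedisRateLimitCounterStore:

```csharp
// Sketch of the configurable fallback around the Redis call; names are assumed.
private async Task<RateLimitResult> CheckWithFallbackAsync(string key, RateLimitPolicy policy)
{
    try
    {
        return await EvaluateScriptAsync(key, policy); // normal EVALSHA path
    }
    catch (RedisConnectionException ex) // StackExchange.Redis connection failure
    {
        _logger.LogWarning(ex, "Redis unavailable, applying {Mode} fallback", _options.FallbackMode);
        return _options.FallbackMode == FallbackMode.Allow
            ? RateLimitResult.Allowed(remaining: policy.PermitLimit) // availability first
            : RateLimitResult.Rejected(retryAfter: policy.Window);   // fail-closed 429
    }
}
```
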
| File | Role |
| --- | --- |
| `src/Granit.RateLimiting/Internal/LuaScripts.cs` | 3 atomic Lua scripts |
| `src/Granit.RateLimiting/TenantPartitionedRateLimiter.cs` | Core logic (partition, bypass, quota, metrics) |
| `src/Granit.RateLimiting/Internal/RedisRateLimitCounterStore.cs` | Redis execution with fallback |
| `src/Granit.RateLimiting/Internal/FeatureBasedRateLimitQuotaProvider.cs` | Quota resolution via Granit.Features |
| `src/Granit.RateLimiting/AspNetCore/RateLimitEndpointExtensions.cs` | Endpoint filter: 429 + Retry-After |
| `src/Granit.RateLimiting/Wolverine/RateLimitMiddleware.cs` | Wolverine BeforeAsync middleware |

| Problem | Solution |
| --- | --- |
| Greedy tenant saturates the API for everyone (noisy neighbor) | Counters partitioned by tenant/IP/user via `PartitionBy` |
| Identical quota limits for all plans | Granit.Features Numeric feature resolves dynamically by plan |
| Redis failure = blocked service | Configurable graceful degradation (Allow/Deny) |
| Clock drift between pods = inconsistent counters | `redis.call('TIME')` in Lua scripts |
| Rate limiting HTTP but not messaging | Dual integration: endpoint filter + Wolverine middleware |
| Admin blocked by their own rate limiting | Configurable `BypassRoles` |

```json
// --- appsettings.json (comments are allowed by the .NET configuration loader) ---
{
  "RateLimiting": {
    "BypassRoles": ["Admin"],
    "UseFeatureBasedQuotas": true,
    "Policies": {
      "api": { "Algorithm": "SlidingWindow", "PermitLimit": 100, "Window": "00:01:00" },
      "auth": { "Algorithm": "FixedWindow", "PermitLimit": 5, "Window": "00:15:00", "PartitionBy": "Ip" }
    }
  }
}
```

```csharp
// --- Module registration ---
[DependsOn(typeof(GranitRateLimitingModule))]
public sealed class AppModule : GranitModule { }

// --- Applying policies ---
app.MapGet("/api/v1/appointments", ListAppointmentsAsync)
    .RequireGranitRateLimiting("api");

app.MapPost("/api/v1/auth/login", LoginAsync)
    .RequireGranitRateLimiting("auth"); // 5 attempts / 15 min
```