Rate Limiting Pattern — API Throttling
Definition
Rate Limiting controls the number of requests a client can send within a
given time window. In a multi-tenant SaaS context, it protects against the
noisy neighbor problem — a greedy tenant degrading performance for
everyone else. Granit implements this pattern via Granit.RateLimiting with
per-tenant partitioning, atomic Redis counters (Lua scripts), and dynamic
quotas linked to pricing plans via Granit.Features.
Diagram
flowchart LR
R[HTTP Request] --> F{Bypass?}
F -- Admin role --> A[Allowed]
F -- No --> T[Tenant resolution]
T --> Q[Quota resolution]
Q --> C{Redis counter}
C -- within limit --> A
C -- over limit --> D[429 Too Many Requests]
D --> RA[Retry-After header]
sequenceDiagram
participant Client
participant Filter as Endpoint Filter
participant Limiter as TenantPartitionedRateLimiter
participant Redis
Client->>Filter: GET /api/patients
Filter->>Limiter: CheckAsync("api", clientIp)
Limiter->>Redis: EVALSHA sliding_window.lua
Redis-->>Limiter: count: 42, oldest: 0
Limiter-->>Filter: Allowed (remaining: 58)
Filter-->>Client: 200 OK + X-RateLimit-Remaining: 58
Note over Client,Redis: After 100 requests in 60s...
Client->>Filter: GET /api/patients
Filter->>Limiter: CheckAsync("api", clientIp)
Limiter->>Redis: EVALSHA sliding_window.lua
Redis-->>Limiter: count: 101, oldest: 18000
Limiter-->>Filter: Rejected (retryAfter: 18s)
Filter-->>Client: 429 + Retry-After: 18
Implementation in Granit
Package
| Package | Role |
|---|---|
| Granit.RateLimiting | Complete module: counters, middleware, options, metrics |
Three algorithms via Lua scripts
Each algorithm is implemented as a Lua script executed atomically by Redis
(EVALSHA). Timestamps are taken server-side (redis.call('TIME')) to avoid
clock drift issues between pods.
| Algorithm | Redis structure | Use case |
|---|---|---|
| Sliding Window | Sorted set (ZADD + ZREMRANGEBYSCORE) | Public APIs — maximum precision |
| Fixed Window | Counter (INCR + PEXPIRE) | Low-volume endpoints — simplicity |
| Token Bucket | Hash (HMGET/HSET + refill) | Export jobs — controlled bursts |
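To make the sliding-window row concrete, here is a minimal in-memory sketch of the same idea. It is not Granit's Lua script: the class name `SlidingWindowSketch` and its members are hypothetical, the queue stands in for the Redis sorted set, and the caller supplies the current time to mirror the single shared clock that `redis.call('TIME')` provides. The sketch is not thread-safe; in Redis, atomicity comes from running the whole script as one unit.

```csharp
using System;
using System.Collections.Generic;

// Illustrative in-memory analogue of a sliding-window limiter (hypothetical, not Granit code).
public sealed class SlidingWindowSketch
{
    private readonly int _permitLimit;
    private readonly long _windowMs;
    private readonly Queue<long> _timestamps = new Queue<long>(); // accepted requests, oldest first

    public SlidingWindowSketch(int permitLimit, TimeSpan window)
    {
        if (permitLimit <= 0) throw new ArgumentOutOfRangeException(nameof(permitLimit));
        _permitLimit = permitLimit;
        _windowMs = (long)window.TotalMilliseconds;
    }

    // nowMs comes from one shared clock, as the server-side TIME call does in the Lua scripts.
    public (bool Allowed, int Remaining, long RetryAfterMs) TryAcquire(long nowMs)
    {
        // Drop entries that have left the window (the ZREMRANGEBYSCORE step).
        while (_timestamps.Count > 0 && _timestamps.Peek() <= nowMs - _windowMs)
        {
            _timestamps.Dequeue();
        }

        if (_timestamps.Count >= _permitLimit)
        {
            // A slot frees up when the oldest accepted request leaves the window.
            long retryAfterMs = _timestamps.Peek() + _windowMs - nowMs;
            return (false, 0, retryAfterMs);
        }

        _timestamps.Enqueue(nowMs); // the ZADD step
        return (true, _permitLimit - _timestamps.Count, 0);
    }
}
```

With a limit of 2 and a 60 s window, calls at 0 ms and 1000 ms are allowed, and a call at 2000 ms is rejected with a retry delay of 58000 ms, the moment the first entry expires.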
Key partitioning
The Redis key is structured with a hash tag to guarantee co-location in
Redis Cluster. The PartitionBy policy option controls the key strategy:
| PartitionBy | Key pattern | Use case |
|---|---|---|
| Tenant (default) | rl:{tenantId}:api | Shared tenant quota |
| TenantAndIp | rl:{tenantId}:1.2.3.4:api | Per-IP within a tenant |
| Ip | rl:{1.2.3.4}:auth | Pre-auth endpoints (login) |
| User | rl:{userId}:export | Per-user quota |
| TenantAndUser | rl:{tenantId}:userId:api | Per-user within a tenant |
Without multi-tenancy, the global segment is used. Each partition key has its
own counters — a tenant (or IP, or user) can never consume another’s quota.
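The key patterns in the table can be sketched as a small builder. This is an assumption-laden illustration, not the module's code: `RateLimitKeySketch.Build` and its parameters are hypothetical, the enum only mirrors the PartitionBy values listed above, and the literal `global` used when no tenant is present is a guess at what "the global segment" looks like. The part inside `{...}` is the Redis Cluster hash tag, so every key that shares it hashes to the same slot.

```csharp
using System;

// Mirrors the PartitionBy values from the table (hypothetical declaration).
public enum PartitionBy { Tenant, TenantAndIp, Ip, User, TenantAndUser }

public static class RateLimitKeySketch
{
    // Hypothetical helper: builds "rl:{hashTag}[:extra]:policy" following the table's patterns.
    public static string Build(PartitionBy partitionBy, string policy,
        string tenantId = null, string ip = null, string userId = null)
    {
        // Assumption: "global" is the segment used when multi-tenancy is off.
        string tenant = tenantId ?? "global";

        switch (partitionBy)
        {
            case PartitionBy.Tenant:
                return "rl:{" + tenant + "}:" + policy;
            case PartitionBy.TenantAndIp:
                return "rl:{" + tenant + "}:" + Require(ip, nameof(ip)) + ":" + policy;
            case PartitionBy.Ip:
                return "rl:{" + Require(ip, nameof(ip)) + "}:" + policy;
            case PartitionBy.User:
                return "rl:{" + Require(userId, nameof(userId)) + "}:" + policy;
            case PartitionBy.TenantAndUser:
                return "rl:{" + tenant + "}:" + Require(userId, nameof(userId)) + ":" + policy;
            default:
                throw new ArgumentOutOfRangeException(nameof(partitionBy));
        }
    }

    // Fails fast when a strategy needs a value the caller did not supply.
    private static string Require(string value, string name)
        => value ?? throw new ArgumentNullException(name);
}
```

For example, `Build(PartitionBy.TenantAndIp, "api", tenantId: "t1", ip: "1.2.3.4")` yields `rl:{t1}:1.2.3.4:api`, which shares the hash tag `t1` with the tenant-wide key `rl:{t1}:api`.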
Dynamic quotas by plan
When UseFeatureBasedQuotas is enabled, the PermitLimit is resolved
dynamically from Granit.Features instead of static configuration:

    // Convention: Numeric feature named "RateLimit.{policyName}"
    context.Add(
        new FeatureDefinition("RateLimit.api", FeatureValueType.Numeric(100, 10, 10000)));

The Features resolution chain (Default > Plan > Tenant) enables differentiated quotas:
| Plan | RateLimit.api | RateLimit.export |
|---|---|---|
| Free | 60/min | 5/h |
| Pro | 500/min | 50/h |
| Enterprise | 5000/min | Unlimited |
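One way to picture the resolution chain is as a lookup that falls through three levels. The sketch below assumes the most specific level wins (a tenant override beats the plan value, which beats the feature default); the source does not spell out the precedence, so treat that as an assumption. `QuotaResolverSketch` and its dictionaries are hypothetical stand-ins for Granit.Features, not its API.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for feature-based quota resolution (not the Granit.Features API).
public sealed class QuotaResolverSketch
{
    private readonly IReadOnlyDictionary<string, int> _defaults;
    private readonly IReadOnlyDictionary<(string Plan, string Feature), int> _planValues;
    private readonly IReadOnlyDictionary<(string Tenant, string Feature), int> _tenantOverrides;

    public QuotaResolverSketch(
        IReadOnlyDictionary<string, int> defaults,
        IReadOnlyDictionary<(string Plan, string Feature), int> planValues,
        IReadOnlyDictionary<(string Tenant, string Feature), int> tenantOverrides)
    {
        _defaults = defaults;
        _planValues = planValues;
        _tenantOverrides = tenantOverrides;
    }

    public int ResolvePermitLimit(string policyName, string plan, string tenantId)
    {
        // Naming convention taken from the text: "RateLimit.{policyName}".
        string feature = "RateLimit." + policyName;

        // Assumed precedence: tenant override, then plan value, then feature default.
        if (_tenantOverrides.TryGetValue((tenantId, feature), out int tenantValue)) return tenantValue;
        if (_planValues.TryGetValue((plan, feature), out int planValue)) return planValue;
        return _defaults[feature];
    }
}
```

With a default of 100 and a Pro plan value of 500 for `RateLimit.api`, a Pro tenant without an override resolves to 500, while a tenant on a plan with no configured value falls back to 100.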
Dual integration: HTTP + Messaging
    // --- ASP.NET Core: endpoint filter ---
    app.MapGet("/api/v1/patients", GetPatientsAsync)
        .RequireGranitRateLimiting("api");

    // --- Wolverine: attribute on the message ---
    [RateLimited("export")]
    public sealed record GeneratePatientExportCommand(Guid PatientId);

The HTTP filter returns 429 Too Many Requests with an RFC 7807 problem
details body and a Retry-After header. The Wolverine middleware throws
RateLimitExceededException, usable with RetryWithCooldown.
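The Retry-After header carries a whole number of seconds, while a limiter typically knows the remaining delay at millisecond precision. A minimal sketch of the conversion follows; `RetryAfterSketch.ToHeaderValue` is a hypothetical helper, not part of Granit. It rounds up so that a client honoring the header never retries before the window has actually reopened.

```csharp
using System;
using System.Globalization;

public static class RetryAfterSketch
{
    // Hypothetical helper: converts a retry delay into the delay-seconds form of Retry-After.
    public static string ToHeaderValue(TimeSpan retryAfter)
    {
        // Round up to whole seconds and clamp negative delays to zero.
        double seconds = Math.Max(0, Math.Ceiling(retryAfter.TotalSeconds));
        return ((long)seconds).ToString(CultureInfo.InvariantCulture);
    }
}
```

A delay of 18000 ms becomes `18`, and 17001 ms also becomes `18` rather than `17`.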
Graceful degradation
When Redis is unavailable, the behavior is configurable:
| Mode | Behavior | When to use |
|---|---|---|
| Deny (default) | Systematic 429 | Fail-closed: security first |
| Allow | Request allowed + warning | Availability > quota protection |
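The two modes amount to a choice of what to return when the counter store cannot be reached. The sketch below illustrates that decision under stated assumptions: `FallbackMode`, `DegradationSketch.IsAllowed`, and the callback parameter are hypothetical names, the check is synchronous for brevity, and a real implementation would catch the store's specific connection exceptions rather than every `Exception`.

```csharp
using System;

// Hypothetical names mirroring the Deny / Allow modes in the table.
public enum FallbackMode { Deny, Allow }

public static class DegradationSketch
{
    // check consults the counter store and is expected to throw when the store is unreachable.
    public static bool IsAllowed(Func<bool> check, FallbackMode mode,
        Action<Exception> onStoreFailure = null)
    {
        try
        {
            return check();
        }
        catch (Exception ex) // sketch only: narrow this to connection failures in real code
        {
            // Surface the failure (for example, log a warning) before falling back.
            onStoreFailure?.Invoke(ex);

            // Deny fails closed (caller answers 429); Allow fails open.
            return mode == FallbackMode.Allow;
        }
    }
}
```

When the store is healthy, the result of `check` is returned unchanged in both modes; the mode only matters on failure.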
Reference files
| File | Role |
|---|---|
| src/Granit.RateLimiting/Internal/LuaScripts.cs | 3 atomic Lua scripts |
| src/Granit.RateLimiting/TenantPartitionedRateLimiter.cs | Core logic (partition, bypass, quota, metrics) |
| src/Granit.RateLimiting/Internal/RedisRateLimitCounterStore.cs | Redis execution with fallback |
| src/Granit.RateLimiting/Internal/FeatureBasedRateLimitQuotaProvider.cs | Quota resolution via Granit.Features |
| src/Granit.RateLimiting/AspNetCore/RateLimitEndpointExtensions.cs | Endpoint filter 429 + Retry-After |
| src/Granit.RateLimiting/Wolverine/RateLimitMiddleware.cs | Wolverine BeforeAsync middleware |
Rationale
| Problem | Solution |
|---|---|
| Greedy tenant saturates the API for everyone (noisy neighbor) | Counters partitioned by tenant/IP/user via PartitionBy |
| Identical quota limits for all plans | Granit.Features Numeric resolves dynamically by plan |
| Redis failure = blocked service | Configurable graceful degradation (Allow/Deny) |
| Clock drift between pods = inconsistent counters | redis.call('TIME') in Lua scripts |
| Rate limiting HTTP but not messaging | Dual integration endpoint filter + Wolverine middleware |
| Admin blocked by their own rate limiting | Configurable BypassRoles |
Usage example
    // --- appsettings.json ---
    // {
    //   "RateLimiting": {
    //     "BypassRoles": ["Admin"],
    //     "UseFeatureBasedQuotas": true,
    //     "Policies": {
    //       "api": { "Algorithm": "SlidingWindow", "PermitLimit": 100, "Window": "00:01:00" },
    //       "auth": { "Algorithm": "FixedWindow", "PermitLimit": 5, "Window": "00:15:00", "PartitionBy": "Ip" }
    //     }
    //   }
    // }

    // --- Module registration ---
    [DependsOn(typeof(GranitRateLimitingModule))]
    public sealed class AppModule : GranitModule { }

    // --- Applying policies ---
    app.MapGet("/api/v1/appointments", ListAppointmentsAsync)
        .RequireGranitRateLimiting("api");

    app.MapPost("/api/v1/auth/login", LoginAsync)
        .RequireGranitRateLimiting("auth"); // 5 attempts / 15 min