
Granit.Observability & Diagnostics

Granit.Observability wires Serilog structured logging and OpenTelemetry (traces + metrics) into a single AddGranitObservability() call. Granit.Diagnostics adds Kubernetes-native health check endpoints with stampede-protected caching and a JSON response writer for Grafana dashboards.

  • Granit.Observability: Serilog + OpenTelemetry OTLP export
  • Granit.Diagnostics: Kubernetes health probes (liveness, readiness, startup)

| Package | Role | Depends on |
| --- | --- | --- |
| Granit.Observability | Serilog + OpenTelemetry (traces, metrics, logs) | Granit.Core |
| Granit.Diagnostics | Health check endpoints, response writer, caching | Granit.Timing |

graph LR
    App[ASP.NET Core App] --> Serilog
    App --> OTel[OpenTelemetry SDK]

    Serilog -->|WriteTo.OpenTelemetry| Collector[OTLP Collector :4317]
    OTel -->|OTLP gRPC| Collector

    Collector --> Loki[Loki — Logs]
    Collector --> Tempo[Tempo — Traces]
    Collector --> Mimir[Mimir — Metrics]

    Loki --> Grafana
    Tempo --> Grafana
    Mimir --> Grafana

[DependsOn(typeof(GranitObservabilityModule))]
public class AppModule : GranitModule { }

{
  "Observability": {
    "ServiceName": "my-backend",
    "ServiceVersion": "1.2.0",
    "OtlpEndpoint": "http://otel-collector:4317",
    "ServiceNamespace": "my-company",
    "Environment": "production"
  }
}

AddGranitObservability() configures two Serilog sinks:

| Sink | Purpose |
| --- | --- |
| Console | Local development, [HH:mm:ss LEV] SourceContext Message |
| OpenTelemetry | OTLP export to Loki via the collector |

Every log entry is enriched with ServiceName, ServiceVersion, and Environment properties, matching the OpenTelemetry resource attributes for correlation.

Additional Serilog settings (minimum level, overrides, extra sinks) can be added through standard Serilog configuration in appsettings.json. ReadFrom.Configuration is called before the Granit enrichers.

Three built-in instrumentations are registered automatically:

| Instrumentation | What it captures |
| --- | --- |
| ASP.NET Core | Inbound HTTP requests (method, route, status code) |
| HttpClient | Outbound HTTP calls (dependency tracking) |
| EF Core | Database queries (command text, duration) |

Health check endpoints (/health/*) are filtered out of traces to avoid noise.

Granit modules register their own ActivitySource names via GranitActivitySourceRegistry.Register() during host configuration. AddGranitObservability() reads the registry and calls AddSource() for each — no manual wiring needed.

// Inside a module's AddGranit*() extension method
GranitActivitySourceRegistry.Register("Granit.Workflow");

Modules that create spans register their ActivitySource name at startup. The table below lists every source and the span names it emits.

| ActivitySource | Span names |
| --- | --- |
| Granit.Vault | vault.encrypt, vault.decrypt, vault.get-secret, vault.check-rotation |
| Granit.Vault.Azure | akv.encrypt, akv.decrypt, akv.get-secret, akv.check-rotation |
| Granit.Wolverine | wolverine.send, wolverine.handle |
| Granit.Notifications | notification.dispatch, notification.deliver |
| Granit.Notifications.Email.Smtp | smtp.send |
| Granit.Notifications.Email.AwsSes | ses.send |
| Granit.Notifications.Email.AzureCommunicationServices | acs-email.send |
| Granit.Notifications.Sms.AzureCommunicationServices | acs-sms.send |
| Granit.Notifications.MobilePush.AzureNotificationHubs | anh.send |
| Granit.Notifications.Brevo | brevo.send |
| Granit.Notifications.Zulip | zulip.send |
| Granit.Workflow | workflow.transition |
| Granit.BlobStorage | blob.upload, blob.download, blob.delete |
| Granit.DataExchange | import.execute, export.execute |
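
As an illustration of how an application module could emit its own spans alongside the ones listed above, here is a minimal sketch using the standard System.Diagnostics.ActivitySource API. The class name, source name ("MyCompany.Invoicing"), span name, and tag are hypothetical. StartActivity returns null when nothing listens to the source, which is why the name has to be registered so that AddSource() is called for it.

```csharp
using System.Diagnostics;

public static class InvoiceTelemetry
{
    // Hypothetical source name. It would be registered once at startup with
    // GranitActivitySourceRegistry.Register(SourceName), as shown earlier.
    public const string SourceName = "MyCompany.Invoicing";

    private static readonly ActivitySource Source = new(SourceName);

    // Starts a span named "invoice.generate" around the work.
    // Returns true when a listener (for example the OpenTelemetry SDK)
    // sampled the span, false when nobody subscribed to the source.
    public static bool GenerateInvoice(int invoiceId)
    {
        using Activity? activity = Source.StartActivity("invoice.generate");

        // The null-conditional call keeps the code safe when tracing is off.
        activity?.SetTag("invoice.id", invoiceId);

        // ... the actual work would happen here ...

        return activity is not null;
    }
}
```

Keeping the source name in a single constant avoids a mismatch between the name used to create spans and the name handed to the registry.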

MapGranitHealthChecks() registers three endpoints, all AllowAnonymous (the kubelet cannot authenticate):

| Probe | Path | Behavior | Failure effect |
| --- | --- | --- | --- |
| Liveness | /health/live | Always returns 200; no dependency checks | Pod restart |
| Readiness | /health/ready | Checks tagged "readiness" | Pod removed from load balancer |
| Startup | /health/startup | Checks tagged "startup" | Liveness/readiness disabled until healthy |

Status code mapping (readiness and startup)

| HealthStatus | HTTP | Effect |
| --- | --- | --- |
| Healthy | 200 | Pod receives traffic |
| Degraded | 200 | Pod stays in load balancer (non-critical degradation) |
| Unhealthy | 503 | Pod removed from load balancer |
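
For readers who want to see what this mapping looks like in plain ASP.NET Core, the sketch below builds equivalent HealthCheckOptions by hand. It is a rough equivalent written for illustration, not Granit's actual implementation; the ProbeOptions class and its method names are hypothetical, while Predicate and ResultStatusCodes are standard ASP.NET Core members.

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class ProbeOptions
{
    // Runs only the checks carrying the given tag and applies the
    // Healthy/Degraded -> 200, Unhealthy -> 503 mapping from the table.
    public static HealthCheckOptions ForTag(string tag) => new()
    {
        Predicate = registration => registration.Tags.Contains(tag),
        ResultStatusCodes =
        {
            [HealthStatus.Healthy] = StatusCodes.Status200OK,
            [HealthStatus.Degraded] = StatusCodes.Status200OK,
            [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable,
        },
    };

    // Liveness: exclude every check, so the endpoint reports Healthy (200)
    // as long as the process can serve the request.
    public static HealthCheckOptions Liveness() => new()
    {
        Predicate = _ => false,
    };
}
```

With these options, an endpoint could be mapped as app.MapHealthChecks("/health/ready", ProbeOptions.ForTag("readiness")).AllowAnonymous(); and similarly for the other two paths.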

Granit modules provide opt-in health checks via AddGranit*HealthCheck() extension methods on IHealthChecksBuilder. Each check follows the same pattern: sanitized error messages (never exposing credentials), structured data where applicable, and appropriate tags for Kubernetes probes.

| Module | Extension method | Probe | Tags |
| --- | --- | --- | --- |
| Granit.Persistence | AddGranitDbContextHealthCheck() | EF Core CanConnectAsync | readiness, startup |
| Granit.Caching.StackExchangeRedis | AddGranitRedisHealthCheck() | Redis PING with latency threshold | readiness, startup |
| Granit.Vault | AddGranitVaultHealthCheck() | Vault seal status + auth | readiness, startup |
| Granit.Vault.Aws | AddGranitKmsHealthCheck() | KMS DescribeKey (Degraded on PendingDeletion) | readiness, startup |
| Granit.Identity.Keycloak | AddGranitKeycloakHealthCheck() | client_credentials token request | readiness, startup |
| Granit.Identity.EntraId | AddGranitEntraIdHealthCheck() | client_credentials token request | readiness, startup |
| Granit.BlobStorage.S3 | AddGranitS3HealthCheck() | ListObjectsV2(MaxKeys=1) | readiness, startup |
| Granit.Notifications.Email.Smtp | AddGranitSmtpHealthCheck() | EHLO handshake via MailKit | readiness |
| Granit.Notifications.Email.AwsSes | AddGranitAwsSesHealthCheck() | GetAccount() (Degraded if sending paused) | readiness, startup |
| Granit.Notifications.Brevo | AddGranitBrevoHealthCheck() | GET /account | readiness |
| Granit.Notifications.Zulip | AddGranitZulipHealthCheck() | GET /api/v1/users/me | readiness |
| Granit.Vault.Azure | AddGranitAzureKeyVaultHealthCheck() | GetKey probe | readiness |
| Granit.Notifications.Email.AzureCommunicationServices | AddGranitAcsEmailHealthCheck() | Send probe | readiness |
| Granit.Notifications.Sms.AzureCommunicationServices | AddGranitAcsSmsHealthCheck() | Send probe | readiness |
| Granit.Notifications.MobilePush.AzureNotificationHubs | AddGranitAzureNotificationHubsHealthCheck() | Hub description | readiness |

All built-in health checks wrap their external call with .WaitAsync(10s, cancellationToken). If the dependency does not respond within 10 seconds, the check returns Unhealthy immediately instead of blocking the Kubernetes probe cycle.
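
The timeout behavior described above can be sketched with the standard Task.WaitAsync(TimeSpan, CancellationToken) method (.NET 6 and later). This is an illustrative sketch, not the code of the built-in checks: the TimeBoxedHealthCheck class, its probe delegate, and the error messages are assumptions made for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public sealed class TimeBoxedHealthCheck : IHealthCheck
{
    private readonly Func<CancellationToken, Task> _probe;
    private readonly TimeSpan _timeout;

    // The probe is any call to an external dependency (hypothetical here).
    public TimeBoxedHealthCheck(Func<CancellationToken, Task> probe, TimeSpan? timeout = null)
    {
        _probe = probe;
        _timeout = timeout ?? TimeSpan.FromSeconds(10);
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // WaitAsync stops waiting after the timeout; it does not cancel
            // the probe itself, which may keep running in the background.
            await _probe(cancellationToken).WaitAsync(_timeout, cancellationToken);
            return HealthCheckResult.Healthy();
        }
        catch (TimeoutException)
        {
            return HealthCheckResult.Unhealthy("Dependency did not respond in time.");
        }
        catch (Exception) when (!cancellationToken.IsCancellationRequested)
        {
            // Fixed message: exception details are deliberately not echoed,
            // so connection strings or credentials cannot leak.
            return HealthCheckResult.Unhealthy("Dependency check failed.");
        }
    }
}
```

Returning a fixed description instead of the exception message follows the sanitized-error convention mentioned for the built-in checks.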

Tag your health checks with "readiness" and/or "startup" so they are picked up by the correct probe. Use the built-in extension methods when available:

builder.Services.AddHealthChecks()
    .AddGranitDbContextHealthCheck<AppDbContext>()
    .AddGranitRedisHealthCheck(degradedThreshold: TimeSpan.FromMilliseconds(100))
    .AddGranitKeycloakHealthCheck()
    .AddGranitAwsSesHealthCheck()
    .AddGranitBrevoHealthCheck()
    .AddGranitZulipHealthCheck()
    .AddGranitAzureKeyVaultHealthCheck()
    .AddGranitAcsEmailHealthCheck()
    .AddGranitAcsSmsHealthCheck()
    .AddGranitAzureNotificationHubsHealthCheck();
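
When no built-in extension method covers a dependency, a custom check can be tagged at registration time with the standard AddCheck<T>() method. The sketch below is one possible shape, assuming a hypothetical warm-up check; the WarmupHealthCheck class, the "warmup" name, and the AddWarmupHealthCheck() extension are invented for the example.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check: reports Unhealthy until the application signals
// that its warm-up work (cache priming, migrations, ...) has finished.
public sealed class WarmupHealthCheck : IHealthCheck
{
    private volatile bool _ready;

    public void MarkReady() => _ready = true;

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default) =>
        Task.FromResult(_ready
            ? HealthCheckResult.Healthy("Warm-up complete.")
            : HealthCheckResult.Unhealthy("Warm-up still running."));
}

public static class WarmupHealthCheckExtensions
{
    public static IHealthChecksBuilder AddWarmupHealthCheck(this IHealthChecksBuilder builder)
    {
        // Singleton so the instance that is marked ready is the one probed.
        builder.Services.AddSingleton<WarmupHealthCheck>();

        // The tags decide which probe endpoints include this check.
        return builder.AddCheck<WarmupHealthCheck>(
            "warmup",
            tags: new[] { "startup", "readiness" });
    }
}
```

The extension can then be chained after AddHealthChecks() like the built-in methods, and the application calls MarkReady() on the singleton once its warm-up is done.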

GranitHealthCheckWriter produces a structured JSON payload for Grafana/Loki dashboards. Kubernetes only reads the HTTP status code; the body is for operations teams.

{
  "status": "Healthy",
  "duration": 12.3,
  "checks": [
    {
      "name": "database",
      "status": "Healthy",
      "duration": 8.1,
      "tags": ["readiness", "startup"]
    }
  ]
}
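
To make the payload shape concrete, here is a sketch of a response writer that produces JSON with the same fields from an ASP.NET Core HealthReport. It is not the source of GranitHealthCheckWriter: the JsonHealthWriter class is hypothetical, and reporting durations in milliseconds is an assumption, since the example above does not state the unit.

```csharp
using System.Linq;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class JsonHealthWriter
{
    // Projects the report onto the fields shown in the example payload.
    public static string Serialize(HealthReport report) =>
        JsonSerializer.Serialize(new
        {
            status = report.Status.ToString(),
            // Assumption: durations are expressed in milliseconds.
            duration = report.TotalDuration.TotalMilliseconds,
            checks = report.Entries.Select(entry => new
            {
                name = entry.Key,
                status = entry.Value.Status.ToString(),
                duration = entry.Value.Duration.TotalMilliseconds,
                tags = entry.Value.Tags,
            }),
        });

    // Signature matches HealthCheckOptions.ResponseWriter.
    public static Task WriteAsync(HttpContext context, HealthReport report)
    {
        context.Response.ContentType = "application/json";
        return context.Response.WriteAsync(Serialize(report));
    }
}
```

A writer of this shape is plugged in through HealthCheckOptions.ResponseWriter; the HTTP status code is still set by the health check middleware, independently of the body.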

CachedHealthCheck wraps any IHealthCheck with a SemaphoreSlim double-check locking pattern to prevent stampede when many pods are probed simultaneously.

sequenceDiagram
    participant P1 as Probe 1
    participant P2 as Probe 2
    participant C as CachedHealthCheck
    participant DB as Dependency

    P1->>C: CheckHealthAsync()
    P2->>C: CheckHealthAsync()
    C->>C: Cache expired
    C->>C: Acquire SemaphoreSlim
    Note over P2,C: P2 waits (lock held)
    C->>DB: inner.CheckHealthAsync()
    DB-->>C: Healthy
    C->>C: Cache result (10s default)
    C->>C: Release lock
    C-->>P1: Healthy
    C->>C: Double-check → cache hit
    C-->>P2: Healthy (from cache)

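With 50 pods each probed every 3 seconds, an uncached database check generates roughly 17 requests per second (50 / 3 ≈ 16.7). With the cache, each pod issues at most one request per DefaultCacheDuration, which is about 5 requests per second in total at the 10-second default.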
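
The double-check locking pattern from the sequence diagram can be sketched as follows. This is a simplified illustration, not the actual CachedHealthCheck source: the class names, the injectable clock delegate, and the CountingHealthCheck demo dependency are all assumptions made for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public sealed class CachingHealthCheckSketch : IHealthCheck
{
    private sealed record Entry(HealthCheckResult Result, DateTimeOffset ExpiresAt);

    private readonly IHealthCheck _inner;
    private readonly TimeSpan _ttl;
    private readonly Func<DateTimeOffset> _now;
    private readonly SemaphoreSlim _gate = new(1, 1);
    private Entry? _cached;

    public CachingHealthCheckSketch(IHealthCheck inner, TimeSpan ttl, Func<DateTimeOffset>? now = null)
    {
        _inner = inner;
        _ttl = ttl;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        // First check: lock-free fast path while the cached result is fresh.
        if (TryGetFresh(out var fresh)) return fresh;

        await _gate.WaitAsync(cancellationToken);
        try
        {
            // Second check: a concurrent caller may have refreshed the
            // cache while this one was waiting for the semaphore.
            if (TryGetFresh(out fresh)) return fresh;

            var result = await _inner.CheckHealthAsync(context, cancellationToken);
            Volatile.Write(ref _cached, new Entry(result, _now() + _ttl));
            return result;
        }
        finally
        {
            _gate.Release();
        }
    }

    private bool TryGetFresh(out HealthCheckResult result)
    {
        var entry = Volatile.Read(ref _cached);
        if (entry is not null && _now() < entry.ExpiresAt)
        {
            result = entry.Result;
            return true;
        }

        result = default;
        return false;
    }
}

// Demo dependency that counts how often it is actually invoked.
public sealed class CountingHealthCheck : IHealthCheck
{
    private int _calls;

    public int Calls => Volatile.Read(ref _calls);

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        Interlocked.Increment(ref _calls);
        await Task.Delay(50, cancellationToken); // simulate a slow dependency
        return HealthCheckResult.Healthy();
    }
}
```

Wrapping a CountingHealthCheck and issuing many concurrent CheckHealthAsync calls should hit the inner check once per TTL window: the first caller refreshes the cache, and the waiters find a fresh entry on their second check.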

ObservabilityOptions (section: Observability)

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| ServiceName | string | "unknown-service" | Service name for OTEL resource |
| ServiceVersion | string | "0.0.0" | Service version |
| OtlpEndpoint | string | "http://localhost:4317" | OTLP gRPC endpoint |
| ServiceNamespace | string | "my-company" | Service namespace |
| Environment | string | "development" | Deployment environment |
| EnableTracing | bool | true | Enable trace export via OTLP |
| EnableMetrics | bool | true | Enable metrics export via OTLP |

DiagnosticsOptions (section: DiagnosticsOptions)

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| LivenessPath | string | "/health/live" | Liveness probe endpoint path |
| ReadinessPath | string | "/health/ready" | Readiness probe endpoint path |
| StartupPath | string | "/health/startup" | Startup probe endpoint path |
| DefaultCacheDuration | TimeSpan | 00:00:10 | Cache TTL for CachedHealthCheck |

| Category | Key types | Package |
| --- | --- | --- |
| Modules | GranitObservabilityModule, GranitDiagnosticsModule | |
| Options | ObservabilityOptions, DiagnosticsOptions | |
| Health checks | CachedHealthCheck, GranitHealthCheckWriter | Granit.Diagnostics |
| Activity sources | GranitActivitySourceRegistry | Granit.Core |
| Extensions | AddGranitObservability(), AddGranitDiagnostics(), MapGranitHealthChecks() | |