AWS IoT Core Bridge

The AWS bridge is a companion to the cloud-agnostic Device aggregate — not a replacement. Six Ring-3 packages map every device lifecycle event to AWS IoT Core resources (Things, X.509 certificates, Secrets Manager entries, Device Shadow documents, IoT Jobs) without ever modifying the core domain.

Why a bridge, not a replacement aggregate?

The cloud-agnostic Device already owns SerialNumber, Status, Credential, Heartbeat, Workflow state and Timeline entries. Stamping a second AWS-specific Device next to it would create two sources of truth and force a permanent synchronisation problem.

We instead introduce a 1:1 companion AwsThingBinding keyed by Device.Id. The binding stores everything AWS needs (ThingName, ThingArn, CertificateArn, CertificateSecretArn, ProvisioningStatus, LastShadowReportedAt, ClaimCertificateExpiresAt, ProvisionedViaJitp) — and nothing the core domain needs.
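As a rough illustration of the companion's shape, the sketch below lists the fields named above on a plain class. Only the property names come from this page; every type, the constructor, and the enum are assumptions, and the real aggregate (auditing base class, ThingName value object, its own Id) is deliberately left out.

```csharp
using System;

// Saga states as named on this page; the numeric order is an assumption.
public enum ProvisioningStatus
{
    Pending, ThingCreated, CertIssued, SecretStored, Active, Decommissioned, Failed
}

// Simplified sketch: property names are from the docs, types are guesses.
public sealed class AwsThingBinding
{
    public AwsThingBinding(Guid deviceId, string thingName)
    {
        DeviceId = deviceId;
        ThingName = thingName;
    }

    public Guid DeviceId { get; }                       // 1:1 key back to the core Device
    public string ThingName { get; }                    // a value object in the real package
    public string? ThingArn { get; set; }
    public string? CertificateArn { get; set; }
    public string? CertificateSecretArn { get; set; }
    public ProvisioningStatus ProvisioningStatus { get; set; } = ProvisioningStatus.Pending;
    public DateTimeOffset? LastShadowReportedAt { get; set; }
    public DateTimeOffset? ClaimCertificateExpiresAt { get; set; }
    public bool ProvisionedViaJitp { get; init; }
}
```

Nothing here duplicates SerialNumber, Status or Heartbeat: those stay on the core Device.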

Reactions to DeviceProvisionedEvent / DeviceActivatedEvent / DeviceDecommissionedEvent happen via Wolverine handlers in the bridge packages. This is the same pattern used by Notifications and Timeline, just sized for an entire cloud provider.

flowchart TB
  CORE["Granit.IoT (Ring 1)<br/>cloud-agnostic Device + events"]

  subgraph "AWS bridge family (Ring 3)"
    AWS["Granit.IoT.Aws<br/>companion AwsThingBinding,<br/>ThingName VO, IAwsIoTCredentialProvider"]
    EF["Granit.IoT.Aws.EntityFrameworkCore<br/>isolated AwsBindingDbContext (iotaws_*)"]
    P["Granit.IoT.Aws.Provisioning<br/>IAmazonIoT + Secrets Manager,<br/>idempotent saga handler"]
    S["Granit.IoT.Aws.Shadow<br/>reported push + delta polling →<br/>DeviceDesiredStateChangedEvent"]
    J["Granit.IoT.Aws.Jobs<br/>IDeviceCommandDispatcher backed by IoT Jobs"]
    FP["Granit.IoT.Aws.FleetProvisioning<br/>POST /verify + /registered (JITP),<br/>claim cert rotation"]
  end

  AWS --> CORE
  EF --> AWS
  P --> AWS
  S --> AWS
  J --> AWS
  FP --> AWS

The umbrella Granit.Bundle.IoT.Aws meta-package references all six. Hosts targeting AWS pull it once; sister bundles for other providers (Azure IoT Hub, Scaleway IoT Hub) ship separately and never collide.

| Package | Role |
| --- | --- |
| Granit.IoT.Aws | Companion AwsThingBinding, ThingName VO, IAwsIoTCredentialProvider |
| Granit.IoT.Aws.EntityFrameworkCore | Isolated AwsBindingDbContext (iotaws_* schema) |
| Granit.IoT.Aws.Provisioning | IAmazonIoT + IAmazonSecretsManager, idempotent saga handler |
| Granit.IoT.Aws.Shadow | IAmazonIotData, reported-state push + delta polling |
| Granit.IoT.Aws.Jobs | IDeviceCommandDispatcher backed by AWS IoT Jobs |
| Granit.IoT.Aws.FleetProvisioning | JITP endpoints + ClaimCertificateRotationCheckService |

AwsThingBinding.ProvisioningStatus is the saga state machine:

stateDiagram-v2
  [*] --> Pending
  Pending --> ThingCreated : DescribeThing / CreateThing
  ThingCreated --> CertIssued : CreateKeysAndCertificate
  CertIssued --> SecretStored : Secrets Manager (ClientRequestToken = binding.Id)
  SecretStored --> Active : AttachPolicy + AttachThingPrincipal
  Active --> Decommissioned : DeviceDecommissionedEvent
  CertIssued --> Failed : Crash between cert and secret
  Failed --> [*]
  Decommissioned --> [*]

AwsThingBridgeHandler reacts to DeviceProvisionedEvent and walks the binding through every checkpoint. Each forward step:

  1. Short-circuits if the binding has already crossed the matching checkpoint (lock-free read of ProvisioningStatus).
  2. Defensively calls AWS Describe* before any Create*, so a crash between an AWS-side success and the matching DB commit recovers cleanly on Wolverine replay.
  3. Mutates the in-memory binding and lets the handler persist the new state immediately afterwards.
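The three rules above can be sketched for the first checkpoint (Pending to ThingCreated). This is a minimal sketch, not the package's code: IThingGateway is a hypothetical wrapper around the AWS IoT client so the example stays SDK-free, Binding is a cut-down stand-in for AwsThingBinding, and ordering the enum so that `>=` means "already crossed" is an assumption.

```csharp
using System;
using System.Threading.Tasks;

// Forward-only states, ordered so a comparison tells whether a checkpoint was crossed.
public enum ProvisioningStatus { Pending = 0, ThingCreated = 1, CertIssued = 2, SecretStored = 3, Active = 4 }

public sealed class Binding
{
    public string ThingName { get; init; } = "";
    public string? ThingArn { get; set; }
    public ProvisioningStatus Status { get; set; } = ProvisioningStatus.Pending;
}

// Hypothetical abstraction over DescribeThing / CreateThing.
public interface IThingGateway
{
    Task<string?> DescribeThingArnAsync(string thingName); // null when the Thing does not exist
    Task<string> CreateThingAsync(string thingName);       // returns the new Thing ARN
}

public static class EnsureThingStep
{
    public static async Task RunAsync(Binding binding, IThingGateway aws)
    {
        // 1. Short-circuit: this checkpoint was already crossed on an earlier attempt.
        if (binding.Status >= ProvisioningStatus.ThingCreated) return;

        // 2. Describe before Create: reuse a Thing left behind by a crashed attempt.
        var arn = await aws.DescribeThingArnAsync(binding.ThingName)
                  ?? await aws.CreateThingAsync(binding.ThingName);

        // 3. Mutate in memory only; the calling handler persists the binding.
        binding.ThingArn = arn;
        binding.Status = ProvisioningStatus.ThingCreated;
    }
}
```

Running the step twice against the same binding issues at most one CreateThing call, which is what makes a replayed message harmless.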

ThingName format — per-tenant IAM isolation

AwsThingBinding.ThingName is imposed as t{tenantId:N}-{serialNumber} (literal t + 32-char hex tenant id + dash + serial). Two reasons:

  • Zero collision across tenants — Guid uniqueness baked in.
  • IAM policy isolation — AWS IoT policies can use ${iot:Connection.Thing.ThingName} with a strict t{tenantId}-* prefix to enforce per-tenant scoping at the broker level. The unique DB constraint on thing_name therefore also enforces tenant isolation in the persistence layer.
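The format string maps directly onto .NET's "N" Guid format (32 lowercase hex digits, no dashes). The helper below is a minimal sketch; the class and method names are hypothetical, and it does not validate the serial number against AWS Thing-name rules.

```csharp
using System;

// Hypothetical helper; only the t{tenantId:N}-{serialNumber} format comes from the docs.
public static class ThingNames
{
    // Full Thing name for one device.
    public static string Format(Guid tenantId, string serialNumber) =>
        $"t{tenantId:N}-{serialNumber}";

    // Prefix shared by every Thing of a tenant, e.g. for a t{tenantId}-* policy condition.
    public static string TenantPrefix(Guid tenantId) => $"t{tenantId:N}-";
}
```

For tenant 11111111-2222-3333-4444-555555555555 and serial SN-001 this yields t11111111222233334444555555555555-SN-001.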

Granit.IoT.Aws ships two IAwsIoTCredentialProvider implementations, selected by configuration:

| Mode | When picked | Behaviour |
| --- | --- | --- |
| IAM-role mode | FleetCredentialSecretArn is null | Returns null for every key; the AWS SDK default credential chain authenticates outbound traffic (instance role, ECS task role, env vars). IsReady is always true |
| Rotating mode | FleetCredentialSecretArn set | BackgroundService polls an IAwsIoTCredentialLoader on RotationCheckIntervalMinutes (default 5). Lock-free volatile reads, stale-ok on refresh failure, bounded initial fetch via TimeProvider |

The Secrets-Manager-backed loader ships in Granit.IoT.Aws.Provisioning so Granit.IoT.Aws itself never references the AWS SDK.

The IsReady gate must be checked by every endpoint that calls AWS; hosts publish a matching readiness probe via HealthChecks.
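The "lock-free volatile reads, stale-ok on refresh failure" behaviour of rotating mode could look roughly like the cache below. It is a sketch under assumptions: ICredentialLoader stands in for IAwsIoTCredentialLoader (whose real signature is not shown on this page), the credential set is modelled as a string dictionary, and the polling loop and initial-fetch timeout are omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for IAwsIoTCredentialLoader.
public interface ICredentialLoader
{
    Task<IReadOnlyDictionary<string, string>> LoadAsync(CancellationToken ct);
}

public sealed class RotatingCredentialCache
{
    private readonly ICredentialLoader _loader;

    // Replaced wholesale on refresh; readers never take a lock.
    private volatile IReadOnlyDictionary<string, string>? _snapshot;

    public RotatingCredentialCache(ICredentialLoader loader) => _loader = loader;

    // False until the first successful load; endpoints should refuse AWS calls until then.
    public bool IsReady => _snapshot is not null;

    public string? Get(string key) =>
        _snapshot is { } current && current.TryGetValue(key, out var value) ? value : null;

    // Called by the background poller on each rotation tick.
    public async Task RefreshAsync(CancellationToken ct)
    {
        try
        {
            _snapshot = await _loader.LoadAsync(ct);
        }
        catch (Exception) when (!ct.IsCancellationRequested)
        {
            // Stale-ok: keep serving the previous snapshot if the refresh fails.
        }
    }
}
```

Because a refresh swaps one immutable snapshot for another, a reader sees either the old set or the new one, never a mix.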

Granit.IoT.Aws.Shadow mirrors device lifecycle into the AWS Device Shadow:

sequenceDiagram
  participant G as Granit Device
  participant H as DeviceLifecycleShadowHandler
  participant AWS as AWS Device Shadow
  participant S as ShadowDeltaPollingService
  participant E as ILocalEventBus

  G->>H: DeviceActivatedEvent
  H->>AWS: UpdateThingShadow(reported = {status:"Active",...})
  Note over S,AWS: Every 30s (default)
  S->>AWS: GetThingShadow(active bindings)
  AWS-->>S: state.delta = {firmware:"1.5.0"}
  S->>E: DeviceDesiredStateChangedEvent

  • Granit → AWS: DeviceLifecycleShadowHandler reacts to Device{Activated,Suspended,Reactivated}Event and pushes {"status":"…","updatedAt":"…"} (IClock injected for deterministic timestamps).
  • AWS → Granit: ShadowDeltaPollingService walks active bindings every PollIntervalSeconds (default 30), parses state.delta, and publishes DeviceDesiredStateChangedEvent on ILocalEventBus.
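For the AWS → Granit direction, the parsing step might be sketched as follows with System.Text.Json. The state.delta and version fields are part of the AWS Device Shadow document; the ShadowDelta helper, its return shape, and the choice to flatten values to strings are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: extracts the shadow version and the top-level delta keys.
public static class ShadowDelta
{
    public static (long Version, IReadOnlyDictionary<string, string> Delta) Parse(string shadowJson)
    {
        using var doc = JsonDocument.Parse(shadowJson);
        var root = doc.RootElement;

        long version = root.TryGetProperty("version", out var v)
                       && v.ValueKind == JsonValueKind.Number
                       && v.TryGetInt64(out var n) ? n : 0;

        var delta = new Dictionary<string, string>();
        if (root.TryGetProperty("state", out var state)
            && state.ValueKind == JsonValueKind.Object
            && state.TryGetProperty("delta", out var d)
            && d.ValueKind == JsonValueKind.Object)
        {
            foreach (var property in d.EnumerateObject())
            {
                // Strings are unwrapped; numbers, booleans and nested objects keep their raw JSON text.
                delta[property.Name] = property.Value.ValueKind == JsonValueKind.String
                    ? property.Value.GetString()!
                    : property.Value.GetRawText();
            }
        }

        return (version, delta);
    }
}
```

A shadow with no pending delta simply produces an empty dictionary, so the poller can skip publishing an event for it.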

Granit.IoT.Aws.Jobs consumes that event and turns each delta key into an AWS IoT Job dispatched against the originating Thing. The correlationId is SHA-256(deviceId, shadowVersion) so re-delivery dispatches into the same Job.
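A deterministic identifier of this kind can be sketched as below. The page only states SHA-256 over (deviceId, shadowVersion); how the two values are serialized before hashing, and the lowercase hex output, are assumptions, as is the JobCorrelation name.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; the input encoding is an assumption, not the package's actual format.
public static class JobCorrelation
{
    public static string Compute(Guid deviceId, long shadowVersion)
    {
        // Culture-invariant text so every node hashes identical bytes.
        var input = deviceId.ToString("N") + ":" + shadowVersion.ToString(CultureInfo.InvariantCulture);
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
        return Convert.ToHexString(hash).ToLowerInvariant(); // 64 hex characters
    }
}
```

The same (deviceId, shadowVersion) pair always yields the same id, so a redelivered event resolves to the Job that already exists, while a newer shadow version produces a fresh id.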

Granit.IoT.Aws.FleetProvisioning handles the inverted flow where AWS IoT creates the Thing and certificate before notifying Granit:

sequenceDiagram
  participant D as Device (claim cert)
  participant AWS as AWS IoT
  participant L as Customer Lambda
  participant G as Granit FleetProvisioningEndpoints
  participant DB as iotaws_*

  D->>AWS: CONNECT with claim cert
  AWS->>L: Trigger pre-provisioning hook
  L->>G: POST /api/iot/fleet-provisioning/verify
  G-->>L: 200 OK (or 403 if serial is in deny-list)
  L->>AWS: CreateThing + CreateOperationalCert
  AWS->>L: New cert returned to device
  L->>G: POST /api/iot/fleet-provisioning/registered
  G->>DB: INSERT Device (Active) + AwsThingBinding (ProvisionedViaJitp = true)

The standard saga handler short-circuits on ProvisionedViaJitp == true to avoid attempting to recreate a Thing that AWS already created.

ClaimCertificateRotationCheckService runs daily and surfaces expiring claim certificates as ClaimCertificateExpiringEvent so operators rotate them before fleet enrolments break.
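The selection rule behind that daily check might look like the sketch below: pick every claim certificate whose expiry falls inside the warning window. The ClaimCertificate record and the ClaimCertificateExpiry helper are hypothetical; only ClaimCertificateExpiresAt and the idea of a warning window (ExpiryWarningWindowDays) come from this page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical projection of a binding: just the id and the claim certificate expiry.
public sealed record ClaimCertificate(Guid BindingId, DateTimeOffset? ExpiresAt);

public static class ClaimCertificateExpiry
{
    // Returns certificates that expire on or before now + warningWindowDays.
    // TimeProvider is injected so tests can control "now".
    public static IReadOnlyList<ClaimCertificate> Expiring(
        IEnumerable<ClaimCertificate> certificates, TimeProvider clock, int warningWindowDays)
    {
        var cutoff = clock.GetUtcNow().AddDays(warningWindowDays);
        return certificates
            .Where(c => c.ExpiresAt is { } expiresAt && expiresAt <= cutoff)
            .ToList();
    }
}
```

Bindings without a claim certificate (null expiry) are ignored; already expired certificates are included, since they are even more urgent.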

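var builder = WebApplication.CreateBuilder(args);

// Register each bridge package explicitly; connectionString is supplied by the host.
builder.Services.AddGranitIoTAwsCredentials();
builder.Services.AddGranitIoTAwsEntityFrameworkCore(o =>
    o.UseNpgsql(connectionString));
builder.Services.AddGranitIoTAwsProvisioning();
builder.Services.AddGranitIoTAwsShadow();
builder.Services.AddGranitIoTAwsJobs();
builder.Services.AddGranitIoTAwsFleetProvisioning();

var app = builder.Build();

// Expose the fleet-provisioning /verify and /registered endpoints.
app.MapGranitIoTAwsFleetProvisioningEndpoints();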

Or, with the meta-bundle:

builder.Services.AddGranitBundleIoTAws(o => o.UseNpgsql(connectionString));

Each package has its own section under IoT:Aws:*. Defaults are production-sane; override per environment.

| Section | Knobs |
| --- | --- |
| IoT:Aws:Credentials | FleetCredentialSecretArn, RotationCheckIntervalMinutes, InitialFetchTimeoutSeconds |
| IoT:Aws:Provisioning | DevicePolicyName (required), SecretNameTemplate, SecretKmsKeyId |
| IoT:Aws:Shadow | PollIntervalSeconds, PollBatchSize, AutoPushLifecycleStatus |
| IoT:Aws:Jobs | JobIdPrefix, JobTrackingTtlHours, StatusPollIntervalSeconds, StatusPollBatchSize |
| IoT:Aws:FleetProvisioning | ExpiryWarningWindowDays, RotationCheckIntervalHours, RotationCheckBatchSize |
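In an appsettings.json file, the colon-separated section names map onto nested objects. The fragment below is illustrative only: the two interval values are the defaults quoted on this page, the null ARN selects IAM-role mode, and the policy name is a made-up placeholder.

```json
{
  "IoT": {
    "Aws": {
      "Credentials": {
        "FleetCredentialSecretArn": null,
        "RotationCheckIntervalMinutes": 5
      },
      "Provisioning": {
        "DevicePolicyName": "example-device-policy"
      },
      "Shadow": {
        "PollIntervalSeconds": 30
      }
    }
  }
}
```

Knobs that are left out keep their package defaults.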

Every package ships a dedicated meter (Granit.IoT.Aws.*) tagged with tenant_id and, where it makes sense, the matching operation. Dashboards can roll up across the family or split per package. Architecture tests enforce that every Internal namespace stays internal-only.
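A per-package meter with tenant tagging can be sketched with System.Diagnostics.Metrics as below. The meter name follows the Granit.IoT.Aws.* family described above; the instrument name, the operation tag key, and the ProvisioningMetrics class are hypothetical.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical metrics holder for one package of the family.
public static class ProvisioningMetrics
{
    private static readonly Meter ProvisioningMeter = new("Granit.IoT.Aws.Provisioning");

    // Instrument name is an assumption, not the package's actual counter.
    private static readonly Counter<long> Transitions =
        ProvisioningMeter.CreateCounter<long>("granit.iot.aws.provisioning.transitions");

    // Records one saga transition, tagged so dashboards can split per tenant and operation.
    public static void RecordTransition(string tenantId, string operation) =>
        Transitions.Add(1,
            new KeyValuePair<string, object?>("tenant_id", tenantId),
            new KeyValuePair<string, object?>("operation", operation));
}
```

A collector that subscribes to meters by the Granit.IoT.Aws. name prefix picks up the whole family, while the individual meter name keeps each package separable.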

  • GDPR: AWS resources tied to a tenant are deleted by the DecommissionAsync path on DeviceDecommissionedEvent — Thing, certificate, secret, and binding all torn down idempotently.
  • ISO 27001: every saga transition is logged via LoggerMessage source-gen with structured fields; every counter is tagged by tenant_id; and the AwsThingBinding aggregate is FullAuditedAggregateRoot so CreatedBy / ModifiedBy / DeletedBy travel with every row.