# PII Detection

Free-text fields are a GDPR blind spot. A Comment column that starts as “Great product!” eventually contains “Call Jean Dupont at +32 478 123 456”. Your data map doesn’t list Comment as containing personal data — but it does.

Granit.Privacy.AI provides `IAIPiiDetector`: a service that scans arbitrary text and returns a structured list of detected PII types. It integrates with Granit.Privacy’s data provider registry and can be called anywhere you handle user-generated text.

| PiiType | Examples |
|---|---|
| PersonName | "Jean Dupont", "Dr. Smith" |
| Email | [email protected] |
| PhoneNumber | "+32 478 123 456", "(555) 867-5309" |
| Address | "14 Rue de la Paix, Paris" |
| NationalId | SSN, NISS, BSN, NIE — country-agnostic |
| DateOfBirth | "born 14/03/1985", "DOB: 1985-03-14" |
| BankAccount | IBAN, routing numbers |
| CreditCard | 4-group card numbers |
| Other | Other identifiers not in the list above |
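
The detector reports these categories via a `PiiType` value on each detected item. As a rough sketch (inferred from the table above, not the library's actual declaration), the enum could look like:

```csharp
// Hypothetical sketch of PiiType, inferred from the table above.
// The real declaration ships with Granit.Privacy.AI and may differ.
public enum PiiType
{
    PersonName,
    Email,
    PhoneNumber,
    Address,
    NationalId,   // SSN, NISS, BSN, NIE (country-agnostic)
    DateOfBirth,
    BankAccount,  // IBAN, routing numbers
    CreditCard,
    Other         // identifiers not covered by the cases above
}
```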
```csharp
[DependsOn(
    typeof(GranitPrivacyAIModule),
    typeof(GranitAIOllamaModule))] // strongly recommended for GDPR
public class AppModule : GranitModule { }
```
```csharp
public class CommentValidator : AbstractValidator<CreateCommentRequest>
{
    // Inject the detector through the constructor that also declares the rules —
    // a primary constructor plus a separate parameterless constructor would not compile.
    public CommentValidator(IAIPiiDetector piiDetector)
    {
        RuleFor(x => x.Body)
            .MustAsync(async (text, ct) =>
            {
                PiiDetectionResult result = await piiDetector
                    .ScanAsync(text, ct)
                    .ConfigureAwait(false);
                return !result.ContainsPii;
            })
            .WithMessage("Comments must not contain personal information (names, phone numbers, etc.).");
    }
}
```
```csharp
PiiDetectionResult result = await piiDetector.ScanAsync(userInput, ct).ConfigureAwait(false);
if (result.ContainsPii)
{
    foreach (DetectedPii item in result.Items)
    {
        // item.Type → PiiType.Email, PiiType.PhoneNumber, ...
        // item.Description → "email address detected in first sentence"
        // Note: Description never contains the actual PII value
        logger.LogWarning("PII detected: {Type} — {Description}", item.Type, item.Description);
    }
}
```

The Description field explains where and what kind of PII was found, but never contains the actual value — so it is safe to log.

Granit.Privacy.AI is designed to integrate with the `IDataProviderRegistry` — the registry that tracks which modules participate in GDPR right-to-erasure and data discovery.

```csharp
using System.Runtime.CompilerServices; // for [EnumeratorCancellation]

// A data provider that scans comment fields for PII
public class CommentPiiDataProvider(
    ICommentRepository comments,
    IAIPiiDetector piiDetector) : IDataProvider
{
    public async IAsyncEnumerable<PersonalDataEntry> DiscoverAsync(
        Guid subjectId,
        [EnumeratorCancellation] CancellationToken ct)
    {
        await foreach (Comment comment in comments.GetByAuthorAsync(subjectId, ct))
        {
            PiiDetectionResult scan = await piiDetector
                .ScanAsync(comment.Body, ct)
                .ConfigureAwait(false);

            if (scan.ContainsPii)
            {
                yield return new PersonalDataEntry(
                    Field: "Comment.Body",
                    EntityId: comment.Id,
                    PiiTypes: scan.Items.Select(i => i.Type.ToString()).ToArray());
            }
        }
    }
}
```
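
How the provider reaches the registry depends on your module wiring. As a minimal sketch, assuming standard Microsoft.Extensions.DependencyInjection registration inside the module (the `ConfigureServices` override and registration call are assumptions, not a confirmed Granit API):

```csharp
// Hypothetical wiring sketch — the exact registration mechanism is an
// assumption; consult the Granit.Privacy registry documentation.
public class AppModule : GranitModule
{
    public override void ConfigureServices(IServiceCollection services)
    {
        // Expose the provider so the IDataProviderRegistry can discover it
        services.AddTransient<IDataProvider, CommentPiiDataProvider>();
    }
}
```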

For PII detection, configure a separate AI workspace using a local model. This isolates PII traffic from other AI usage:

```json
{
  "AI": {
    "Workspaces": [
      {
        "Name": "pii-detection",
        "Provider": "Ollama",
        "Model": "llama3.1",
        "SystemPrompt": "You are a GDPR compliance assistant. Detect personally identifiable information in text. Be conservative — only flag clear PII, not ambiguous terms.",
        "Temperature": 0.0
      }
    ],
    "Privacy": {
      "WorkspaceName": "pii-detection"
    }
  }
}
```

Setting `"Temperature": 0.0` reduces hallucinations — you want deterministic detection, not creative interpretation.

PII detection calls the LLM synchronously. For high-throughput pipelines, consider running detection asynchronously via Wolverine:

```csharp
// Wolverine handler — triggered after record creation
public static async Task Handle(
    CommentCreatedEvent evt,
    IAIPiiDetector detector,
    IPrivacyFlagWriter flags,
    CancellationToken ct)
{
    PiiDetectionResult result = await detector.ScanAsync(evt.Body, ct).ConfigureAwait(false);

    if (result.ContainsPii)
    {
        await flags.FlagAsync(evt.CommentId, result.Items, ct).ConfigureAwait(false);
        // Optional: notify DPO, trigger pseudonymization workflow, etc.
    }
}
```
| Scenario | Recommended execution | Rationale |
|---|---|---|
| Form validation (user-facing) | Synchronous, 2 s timeout, fail-open | Blocking is acceptable for submission |
| Batch scan of existing data | Async Wolverine handler | Potentially millions of records |
| Import pipeline | Async, after persistence | Scan after save — upload should not wait |
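
The form-validation scenario pairs a short timeout with fail-open behavior. A sketch of that pattern, assuming only the `ScanAsync` call shown earlier (the two-second linked-token wrapper and the `IsCleanAsync` helper are illustrative, not library features):

```csharp
// Sketch: user-facing validation with a 2-second budget and fail-open
// semantics. The wrapper is illustrative, not part of Granit.Privacy.AI.
public static async Task<bool> IsCleanAsync(
    IAIPiiDetector detector, string text, CancellationToken ct)
{
    using var timeout = CancellationTokenSource.CreateLinkedTokenSource(ct);
    timeout.CancelAfter(TimeSpan.FromSeconds(2));

    try
    {
        PiiDetectionResult result = await detector
            .ScanAsync(text, timeout.Token)
            .ConfigureAwait(false);
        return !result.ContainsPii;
    }
    catch (OperationCanceledException) when (!ct.IsCancellationRequested)
    {
        // Timed out: fail open. The async pipeline can still flag the
        // record for review after the fact.
        return true;
    }
}
```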

When the LLM is unavailable or times out, `LlmPiiDetector` returns:

```csharp
new PiiDetectionResult { ContainsPii = false, Items = [] }
```

The call is logged as a warning. The calling code receives a “clean” result and continues. This is intentional: a PII scanner should never block the application — it should flag for review, not gate operations.

If you need hard blocking (reject on scanner failure), wrap the call:

```csharp
bool safe;
try
{
    PiiDetectionResult result = await piiDetector.ScanAsync(text, ct).ConfigureAwait(false);
    safe = !result.ContainsPii;
}
catch (Exception ex) when (ex is not OperationCanceledException)
{
    safe = false; // scanner failure: fail closed and reject
}
```
| Property | Type | Default | Description |
|---|---|---|---|
| WorkspaceName | `string` | `"default"` | AI workspace for PII detection. Use an Ollama workspace in production |
| TimeoutSeconds | `int` | `15` | LLM call timeout. PII detection is allowed more time than moderation |