PII Detection — GDPR-Compliant Data Scanning
Free-text fields are a GDPR blind spot. A Comment column that starts as “Great product!”
eventually contains “Call Jean Dupont at +32 478 123 456”. Your data map doesn’t list
Comment as containing personal data — but it does.
Granit.Privacy.AI provides IAIPiiDetector: a service that scans arbitrary text
and returns a structured list of detected PII types. It integrates with
Granit.Privacy’s data provider registry and can be called anywhere you handle
user-generated text.
PII types detected
| PiiType | Examples |
|---|---|
| PersonName | “Jean Dupont”, “Dr. Smith” |
| Email | “jean@example.com” |
| PhoneNumber | “+32 478 123 456”, “(555) 867-5309” |
| Address | “14 Rue de la Paix, Paris” |
| NationalId | SSN, NISS, BSN, NIE (country-agnostic) |
| DateOfBirth | “born 14/03/1985”, “DOB: 1985-03-14” |
| BankAccount | IBAN, routing numbers |
| CreditCard | 4-group card numbers |
| Other | Identifiers not covered by the types above |
```csharp
[DependsOn(
    typeof(GranitPrivacyAIModule),
    typeof(GranitAIOllamaModule))] // strongly recommended for GDPR
public class AppModule : GranitModule { }
```

```csharp
builder.AddGranitAI();
builder.AddGranitAIOllama(); // Ollama: data never leaves your server
builder.AddGranitPrivacyAI();
```

```json
{
  "AI": {
    "Privacy": {
      "WorkspaceName": "pii-detection",
      "TimeoutSeconds": 15
    }
  }
}
```

Using the detector
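```csharp
using FluentValidation;

public class CommentValidator : AbstractValidator<CreateCommentRequest>
{
    // The detector is injected through a regular constructor: a class with a
    // primary constructor cannot also declare a parameterless one that does not chain to it.
    public CommentValidator(IAIPiiDetector piiDetector)
    {
        // MustAsync rules only run when the validator is invoked with ValidateAsync.
        RuleFor(x => x.Body)
            .MustAsync(async (text, ct) =>
            {
                PiiDetectionResult result = await piiDetector
                    .ScanAsync(text, ct)
                    .ConfigureAwait(false);

                return !result.ContainsPii;
            })
            .WithMessage("Comments must not contain personal information (names, phone numbers, etc.).");
    }
}
```

Handling detection results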
```csharp
PiiDetectionResult result = await piiDetector.ScanAsync(userInput, ct).ConfigureAwait(false);

if (result.ContainsPii)
{
    foreach (DetectedPii item in result.Items)
    {
        // item.Type        → PiiType.Email, PiiType.PhoneNumber, ...
        // item.Description → "email address detected in first sentence"
        // Note: Description never contains the actual PII value
        logger.LogWarning("PII detected: {Type} - {Description}", item.Type, item.Description);
    }
}
```

The Description field explains where and what kind of PII was found but
never contains the actual value, so it is safe to log.
Composing PII detection into your pipeline
IAIPiiDetector is a standalone building block: you call ScanAsync from your own
validators, message handlers, or batch jobs and act on the result. The framework
ships only the detector; the action you take on a finding (queue for review,
pseudonymise, notify the DPO) is application code.
```csharp
// Wolverine handler: your app subscribes after a record is created.
// IPiiReviewQueue is application-defined; the framework provides IAIPiiDetector only.
public static class CommentCreatedHandler
{
    public static async Task Handle(
        CommentCreatedEvent evt,
        IAIPiiDetector detector,
        IPiiReviewQueue reviewQueue,
        CancellationToken ct)
    {
        PiiDetectionResult result = await detector.ScanAsync(evt.Body, ct).ConfigureAwait(false);

        if (result.ContainsPii)
        {
            // item.Description carries the location/kind, never the raw value, so it is safe to persist.
            await reviewQueue.EnqueueAsync(
                evt.CommentId,
                result.Items.Select(i => (i.Type, i.Description)),
                ct).ConfigureAwait(false);
        }
    }
}
```

Dedicated workspace (recommended)
For PII detection, configure a separate AI workspace using a local model. This isolates PII traffic from other AI usage:
```json
{
  "AI": {
    "Workspaces": [
      {
        "Name": "pii-detection",
        "Provider": "Ollama",
        "Model": "llama3.1",
        "SystemPrompt": "You are a GDPR compliance assistant. Detect personally identifiable information in text. Be conservative: only flag clear PII, not ambiguous terms.",
        "Temperature": 0.0
      }
    ],
    "Privacy": {
      "WorkspaceName": "pii-detection"
    }
  }
}
```

Setting Temperature to 0.0 makes the model's output as repeatable as possible and
reduces hallucinations: you want consistent detection, not creative interpretation.
Performance considerations
Every ScanAsync call waits for the LLM to respond before it returns. For high-throughput pipelines, run detection in the background via a Wolverine handler (see the example above) so the write path does not wait on the model.
| Scenario | Recommended execution | Rationale |
|---|---|---|
| Form validation (user-facing) | Synchronous, short timeout | Blocking is acceptable for submission; keep TimeoutSeconds low |
| Batch scan of existing data | Async Wolverine handler | Potentially millions of records |
| Import pipeline | Async, after persistence | Scan after save — upload should not wait |
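
For the batch row, the loop inside such a job may benefit from a cap on concurrent model calls so that a local model is not flooded. The following is a minimal sketch and not part of Granit: FindFlaggedAsync, its parameters, and the toy scanner are hypothetical, and the scan delegate stands in for a call to ScanAsync followed by reading ContainsPii. It relies on Parallel.ForEachAsync, available in .NET 6 and later.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: scans records with at most `maxParallel` concurrent calls
// and returns the ids whose text was flagged, in ascending order.
static async Task<IReadOnlyList<int>> FindFlaggedAsync(
    IEnumerable<(int Id, string Text)> records,
    Func<string, CancellationToken, Task<bool>> scan,
    int maxParallel,
    CancellationToken ct = default)
{
    var flagged = new ConcurrentBag<int>();
    var options = new ParallelOptions
    {
        MaxDegreeOfParallelism = maxParallel,
        CancellationToken = ct
    };

    await Parallel.ForEachAsync(records, options, async (record, token) =>
    {
        if (await scan(record.Text, token).ConfigureAwait(false))
        {
            flagged.Add(record.Id);
        }
    }).ConfigureAwait(false);

    return flagged.OrderBy(id => id).ToList();
}

var records = new[]
{
    (Id: 1, Text: "Great product!"),
    (Id: 2, Text: "Mail me at jean@example.com"),
};

// Toy scanner for the demo only: flags any text containing '@'.
var flaggedIds = await FindFlaggedAsync(
    records,
    (text, _) => Task.FromResult(text.Contains('@')),
    maxParallel: 4);

Console.WriteLine(string.Join(", ", flaggedIds)); // prints "2"
```

A low maxParallel keeps pressure on the model predictable; the right value depends on the hardware serving the workspace.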
Fail-closed by default
When the LLM is unavailable or times out, the detector returns a fallback result
governed by FailMode:
- Closed (default): assumes PII is present (ContainsPii = true). A scanner outage therefore blocks any validation that gates on a clean result, rather than silently waving the text through. This is the conservative, GDPR-safe posture.
- Open: assumes no PII (ContainsPii = false). Permissive, for development and testing only.
Every fallback is logged as a warning. FailMode = Open is rejected at startup
outside the Development environment — a permissive PII scanner in production is
treated as a misconfiguration, not a choice.
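
The fallback behaviour described above can be pictured with a small wrapper. This is an illustrative sketch only, not the framework's implementation: ScanWithFallbackAsync and its parameters are hypothetical, a bool stands in for PiiDetectionResult.ContainsPii, and Console.WriteLine stands in for the warning log.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper: runs `scan` under a timeout and, when the scanner fails
// or times out, returns the fallback dictated by `failClosed`
// (true = assume PII is present, false = assume none).
static async Task<bool> ScanWithFallbackAsync(
    Func<string, CancellationToken, Task<bool>> scan,
    string text,
    TimeSpan timeout,
    bool failClosed,
    CancellationToken ct = default)
{
    using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
    timeoutCts.CancelAfter(timeout);

    try
    {
        return await scan(text, timeoutCts.Token).ConfigureAwait(false);
    }
    catch (OperationCanceledException) when (!ct.IsCancellationRequested)
    {
        // The timeout fired (the caller did not cancel): use the fallback.
        Console.WriteLine("warning: PII scan timed out, using fallback result");
        return failClosed;
    }
    catch (Exception ex) when (ex is not OperationCanceledException)
    {
        // The scanner itself failed: use the fallback.
        Console.WriteLine($"warning: PII scan failed ({ex.GetType().Name}), using fallback result");
        return failClosed;
    }
}

// Simulated outage: the scanner never answers within the timeout.
bool blocked = await ScanWithFallbackAsync(
    async (_, token) => { await Task.Delay(Timeout.Infinite, token); return false; },
    "Call Jean Dupont at +32 478 123 456",
    TimeSpan.FromMilliseconds(50),
    failClosed: true);

Console.WriteLine(blocked); // prints "True"
```

With failClosed set to true, a validator that gates on a clean result rejects the text during an outage; with false, the same outage would let it through, which is why the permissive mode is limited to development.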
Configuration reference
| Property | Type | Default | Description |
|---|---|---|---|
| WorkspaceName | string | "default" | AI workspace used for PII detection. Use an Ollama workspace in production |
| TimeoutSeconds | int | 15 | LLM call timeout, in seconds. PII detection is allowed more time than moderation |
| FailMode | PiiDetectionFailMode | Closed | Behaviour on scanner failure. Closed assumes PII is present (safe default); Open assumes none (Development only, rejected at startup elsewhere) |
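
As an illustration of the table above, a development-only override might look like the fragment below. It assumes the options bind from the AI:Privacy section, as in the setup example earlier on this page, that FailMode binds from the enum member's name as a string, and that the file is the conventional appsettings.Development.json. Only the property names and values come from the table; the rest is an assumption.

```json
{
  "AI": {
    "Privacy": {
      "WorkspaceName": "pii-detection",
      "TimeoutSeconds": 5,
      "FailMode": "Open"
    }
  }
}
```

Because Open is rejected at startup outside the Development environment, a fragment like this should not be placed in the base settings file.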
See also
- Granit.AI setup: providers, workspaces
- Privacy & GDPR: the Privacy module
- AI: Content Moderation (detect toxic/inappropriate content)
- AI: Blob Classification (detect PII in filenames)