PII Detection
Free-text fields are a GDPR blind spot. A Comment column that starts as “Great product!”
eventually contains “Call Jean Dupont at +32 478 123 456”. Your data map doesn’t list
Comment as containing personal data — but it does.
Granit.Privacy.AI provides IAIPiiDetector: a service that scans arbitrary text
and returns a structured list of detected PII types. It integrates with
Granit.Privacy’s data provider registry and can be called anywhere you handle
user-generated text.
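The examples on this page all rely on a small result model. The sketch below shows the shape implied by those examples, with the enum members taken from the table of PII types. Every declaration here is an assumption inferred from usage (class vs. record, collection types, and the `ScanAsync` signature may differ in the real Granit.Privacy.AI package).

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Assumed shapes, inferred from how the types are used on this page.
// The actual Granit.Privacy.AI declarations may differ.
public enum PiiType
{
    PersonName,
    Email,
    PhoneNumber,
    Address,
    NationalId,
    DateOfBirth,
    BankAccount,
    CreditCard,
    Other
}

// One finding: the kind of PII plus a description that omits the value itself.
public sealed record DetectedPii(PiiType Type, string Description);

public sealed class PiiDetectionResult
{
    public bool ContainsPii { get; init; }
    public IReadOnlyList<DetectedPii> Items { get; init; } = [];
}

public interface IAIPiiDetector
{
    Task<PiiDetectionResult> ScanAsync(string text, CancellationToken ct = default);
}
```

Callers usually only need `ContainsPii` for a yes/no decision and `Items` when they want to report which kinds of PII were found.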
PII types detected
| PiiType | Examples |
|---|---|
| PersonName | “Jean Dupont”, “Dr. Smith” |
| Email | “[email protected]” |
| PhoneNumber | “+32 478 123 456”, “(555) 867-5309” |
| Address | “14 Rue de la Paix, Paris” |
| NationalId | SSN, NISS, BSN, NIE (country-agnostic) |
| DateOfBirth | “born 14/03/1985”, “DOB: 1985-03-14” |
| BankAccount | IBAN, routing numbers |
| CreditCard | 4-group card numbers |
| Other | Other identifiers not in the list above |

Module declaration:

    [DependsOn(
        typeof(GranitPrivacyAIModule),
        typeof(GranitAIOllamaModule))] // strongly recommended for GDPR
    public class AppModule : GranitModule { }

Service registration:

    builder.AddGranitAI();
    builder.AddGranitAIOllama(); // Ollama: data never leaves your server
    builder.AddGranitPrivacyAI();

Configuration:

    {
      "AI": {
        "Privacy": {
          "WorkspaceName": "pii-detection",
          "TimeoutSeconds": 15
        }
      }
    }

Using the detector

    public class CommentValidator : AbstractValidator<CreateCommentRequest>
    {
        // The detector arrives through a regular constructor so the rule can capture it.
        // (A primary constructor cannot be combined with a separate parameterless one.)
        public CommentValidator(IAIPiiDetector piiDetector)
        {
            RuleFor(x => x.Body)
                .MustAsync(async (text, ct) =>
                {
                    PiiDetectionResult result = await piiDetector
                        .ScanAsync(text, ct)
                        .ConfigureAwait(false);

                    return !result.ContainsPii;
                })
                .WithMessage("Comments must not contain personal information (names, phone numbers, etc.).");
        }
    }

Handling detection results

    PiiDetectionResult result = await piiDetector.ScanAsync(userInput, ct).ConfigureAwait(false);

    if (result.ContainsPii)
    {
        foreach (DetectedPii item in result.Items)
        {
            // item.Type → PiiType.Email, PiiType.PhoneNumber, ...
            // item.Description → "email address detected in first sentence"
            // Note: Description never contains the actual PII value
            logger.LogWarning("PII detected: {Type} — {Description}", item.Type, item.Description);
        }
    }

The Description field explains where and what kind of PII was found, but never contains the actual value, so it is safe to log.
Integration with Granit.Privacy
Granit.Privacy.AI is designed to integrate with IDataProviderRegistry, the registry that tracks which modules participate in GDPR right-to-erasure and data discovery.
Scanning unstructured fields

    using System.Runtime.CompilerServices; // for [EnumeratorCancellation]

    // Data provider that scans comment bodies for PII
    public class CommentPiiDataProvider(
        ICommentRepository comments,
        IAIPiiDetector piiDetector) : IDataProvider
    {
        public async IAsyncEnumerable<PersonalDataEntry> DiscoverAsync(
            Guid subjectId,
            [EnumeratorCancellation] CancellationToken ct)
        {
            await foreach (Comment comment in comments.GetByAuthorAsync(subjectId, ct))
            {
                PiiDetectionResult scan = await piiDetector
                    .ScanAsync(comment.Body, ct)
                    .ConfigureAwait(false);

                if (scan.ContainsPii)
                {
                    // Report the field and the PII kinds found, never the values themselves
                    yield return new PersonalDataEntry(
                        Field: "Comment.Body",
                        EntityId: comment.Id,
                        PiiTypes: scan.Items.Select(i => i.Type.ToString()).ToArray());
                }
            }
        }
    }

Dedicated workspace (recommended)
For PII detection, configure a separate AI workspace using a local model. This isolates PII traffic from other AI usage:

    {
      "AI": {
        "Workspaces": [
          {
            "Name": "pii-detection",
            "Provider": "Ollama",
            "Model": "llama3.1",
            "SystemPrompt": "You are a GDPR compliance assistant. Detect personally identifiable information in text. Be conservative — only flag clear PII, not ambiguous terms.",
            "Temperature": 0.0
          }
        ],
        "Privacy": {
          "WorkspaceName": "pii-detection"
        }
      }
    }

Setting Temperature to 0.0 keeps sampling greedy, so the same text tends to produce the same verdict. You want deterministic detection, not creative interpretation.
Performance considerations
Each scan waits on an LLM call, so detection adds model latency to whatever invokes it. For high-throughput pipelines, consider moving detection out of the request path and into a Wolverine message handler:

    // Wolverine handler, triggered after record creation
    public static async Task Handle(
        CommentCreatedEvent evt,
        IAIPiiDetector detector,
        IPrivacyFlagWriter flags,
        CancellationToken ct)
    {
        PiiDetectionResult result = await detector.ScanAsync(evt.Body, ct).ConfigureAwait(false);

        if (result.ContainsPii)
        {
            await flags.FlagAsync(evt.CommentId, result.Items, ct).ConfigureAwait(false);
            // Optional: notify DPO, trigger pseudonymization workflow, etc.
        }
    }

| Scenario | Recommended execution | Rationale |
|---|---|---|
| Form validation (user-facing) | Synchronous, 2s timeout, fail-open | Blocking is acceptable for submission |
| Batch scan of existing data | Async Wolverine handler | Potentially millions of records |
| Import pipeline | Async, after persistence | Scan after save — upload should not wait |
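
For the user-facing row in the table above, a shorter per-call deadline can be layered on top of the caller's cancellation token. The following is a minimal sketch, not part of Granit: `ScanWithDeadlineAsync` is a hypothetical helper name, the types are stand-ins that mirror the usage on this page, and the sketch assumes `ScanAsync` honors the token it receives.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Stand-ins so the sketch compiles on its own; not the real Granit types.
public sealed record DetectedPii(string Type, string Description);

public sealed class PiiDetectionResult
{
    public bool ContainsPii { get; init; }
    public IReadOnlyList<DetectedPii> Items { get; init; } = Array.Empty<DetectedPii>();
}

public interface IAIPiiDetector
{
    Task<PiiDetectionResult> ScanAsync(string text, CancellationToken ct);
}

public static class PiiDetectorDeadlineExtensions
{
    // Hypothetical helper: scan with a per-call deadline and fail open when it expires.
    public static async Task<PiiDetectionResult> ScanWithDeadlineAsync(
        this IAIPiiDetector detector,
        string text,
        TimeSpan deadline,
        CancellationToken ct)
    {
        // Linked source: cancelled when the caller cancels or when the deadline passes.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        cts.CancelAfter(deadline);

        try
        {
            return await detector.ScanAsync(text, cts.Token).ConfigureAwait(false);
        }
        catch (OperationCanceledException) when (!ct.IsCancellationRequested)
        {
            // The caller did not cancel, so in practice the deadline expired:
            // report a clean result instead of failing the submission.
            return new PiiDetectionResult { ContainsPii = false };
        }
    }
}
```

A validator could then call `detector.ScanWithDeadlineAsync(text, TimeSpan.FromSeconds(2), ct)`. Cancellation requested by the caller still propagates as an exception; only the deadline is turned into a clean result.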
Fail-open design
When the LLM is unavailable or times out, LlmPiiDetector returns:

    new PiiDetectionResult { ContainsPii = false, Items = [] }

The call is logged as a warning. The calling code receives a “clean” result and continues. This is intentional: a PII scanner should never block the application. It should flag for review, not gate operations. One consequence is that a check such as the CommentValidator shown earlier accepts the text whenever the scan fails.
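If you need hard blocking (reject on scanner failure), wrap the call:

    bool safe = await piiDetector.ScanAsync(text, ct)
        .ContinueWith(t => t.IsCompletedSuccessfully && !t.Result.ContainsPii);

This treats a faulted or cancelled scan as unsafe. It only helps for failures that surface as exceptions: a failure that the detector has already converted into a clean result, as described above, still reads as safe.

Configuration reference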

| Property | Type | Default | Description |
|---|---|---|---|
| WorkspaceName | string | "default" | AI workspace for PII detection. Use an Ollama workspace in production |
| TimeoutSeconds | int | 15 | LLM call timeout in seconds. PII detection is allowed more time than moderation |
See also
- Granit.AI setup (providers, workspaces)
- Privacy & GDPR (the Privacy module)
- AI: Content Moderation (detect toxic/inappropriate content)
- AI: Blob Classification (detect PII in filenames)