# PII Detection

Free-text fields are a GDPR blind spot. A Comment column that starts as “Great product!” eventually contains “Call Jean Dupont at +32 478 123 456”. Your data map doesn’t list Comment as containing personal data — but it does.

Granit.Privacy.AI provides `IAIPiiDetector`: a service that scans arbitrary text and returns a structured list of detected PII types. It integrates with Granit.Privacy’s data provider registry and can be called anywhere you handle user-generated text.

| PiiType | Examples |
|---|---|
| PersonName | "Jean Dupont", "Dr. Smith" |
| Email | [email protected] |
| PhoneNumber | "+32 478 123 456", "(555) 867-5309" |
| Address | "14 Rue de la Paix, Paris" |
| NationalId | SSN, NISS, BSN, NIE — country-agnostic |
| DateOfBirth | "born 14/03/1985", "DOB: 1985-03-14" |
| BankAccount | IBAN, routing numbers |
| CreditCard | 4-group card numbers |
| Other | Other identifiers not in the list above |
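
The detector reports these categories via a `PiiType` value on each detected item. As a rough sketch (inferred from the table above, not the library's actual declaration), the enum could look like:

```csharp
// Hypothetical sketch of PiiType, inferred from the table above.
// The real declaration ships with Granit.Privacy.AI and may differ.
public enum PiiType
{
    PersonName,
    Email,
    PhoneNumber,
    Address,
    NationalId,   // SSN, NISS, BSN, NIE (country-agnostic)
    DateOfBirth,
    BankAccount,  // IBAN, routing numbers
    CreditCard,
    Other         // identifiers not covered by the cases above
}
```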
```csharp
[DependsOn(
    typeof(GranitPrivacyAIModule),
    typeof(GranitAIOllamaModule))] // strongly recommended for GDPR
public class AppModule : GranitModule { }
```
```csharp
public class CommentValidator : AbstractValidator<CreateCommentRequest>
{
    // Inject the detector through the constructor that also declares the rules —
    // a primary constructor plus a separate parameterless constructor would not compile.
    public CommentValidator(IAIPiiDetector piiDetector)
    {
        RuleFor(x => x.Body)
            .MustAsync(async (text, ct) =>
            {
                PiiDetectionResult result = await piiDetector
                    .ScanAsync(text, ct)
                    .ConfigureAwait(false);
                return !result.ContainsPii;
            })
            .WithMessage("Comments must not contain personal information (names, phone numbers, etc.).");
    }
}
```
```csharp
PiiDetectionResult result = await piiDetector.ScanAsync(userInput, ct).ConfigureAwait(false);
if (result.ContainsPii)
{
    foreach (DetectedPii item in result.Items)
    {
        // item.Type → PiiType.Email, PiiType.PhoneNumber, ...
        // item.Description → "email address detected in first sentence"
        // Note: Description never contains the actual PII value
        logger.LogWarning("PII detected: {Type} — {Description}", item.Type, item.Description);
    }
}
```

The Description field explains where and what kind of PII was found, but never contains the actual value — so it is safe to log.

Granit.Privacy.AI is designed to integrate with the `IDataProviderRegistry` — the registry that tracks which modules participate in GDPR right-to-erasure and data discovery.

```csharp
using System.Runtime.CompilerServices; // for [EnumeratorCancellation]

// A data provider that scans comment fields for PII
public class CommentPiiDataProvider(
    ICommentRepository comments,
    IAIPiiDetector piiDetector) : IDataProvider
{
    public async IAsyncEnumerable<PersonalDataEntry> DiscoverAsync(
        Guid subjectId,
        [EnumeratorCancellation] CancellationToken ct)
    {
        await foreach (Comment comment in comments.GetByAuthorAsync(subjectId, ct))
        {
            PiiDetectionResult scan = await piiDetector
                .ScanAsync(comment.Body, ct)
                .ConfigureAwait(false);

            if (scan.ContainsPii)
            {
                yield return new PersonalDataEntry(
                    Field: "Comment.Body",
                    EntityId: comment.Id,
                    PiiTypes: scan.Items.Select(i => i.Type.ToString()).ToArray());
            }
        }
    }
}
```
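
How the provider reaches the registry depends on your module wiring. As a minimal sketch, assuming standard Microsoft.Extensions.DependencyInjection registration inside the module (the `ConfigureServices` override and registration call are assumptions, not a confirmed Granit API):

```csharp
// Hypothetical wiring sketch — the exact registration mechanism is an
// assumption; consult the Granit.Privacy registry documentation.
public class AppModule : GranitModule
{
    public override void ConfigureServices(IServiceCollection services)
    {
        // Expose the provider so the IDataProviderRegistry can discover it
        services.AddTransient<IDataProvider, CommentPiiDataProvider>();
    }
}
```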

For PII detection, configure a separate AI workspace using a local model. This isolates PII traffic from other AI usage:

```json
{
  "AI": {
    "Workspaces": [
      {
        "Name": "pii-detection",
        "Provider": "Ollama",
        "Model": "llama3.1",
        "SystemPrompt": "You are a GDPR compliance assistant. Detect personally identifiable information in text. Be conservative — only flag clear PII, not ambiguous terms.",
        "Temperature": 0.0
      }
    ],
    "Privacy": {
      "WorkspaceName": "pii-detection"
    }
  }
}
```

Setting `"Temperature": 0.0` reduces hallucinations — you want deterministic detection, not creative interpretation.

PII detection calls the LLM synchronously. For high-throughput pipelines, consider running detection asynchronously via Wolverine:

```csharp
// Wolverine handler — triggered after record creation
public static async Task Handle(
    CommentCreatedEvent evt,
    IAIPiiDetector detector,
    IPrivacyFlagWriter flags,
    CancellationToken ct)
{
    PiiDetectionResult result = await detector.ScanAsync(evt.Body, ct).ConfigureAwait(false);

    if (result.ContainsPii)
    {
        await flags.FlagAsync(evt.CommentId, result.Items, ct).ConfigureAwait(false);
        // Optional: notify DPO, trigger pseudonymization workflow, etc.
    }
}
```
| Scenario | Recommended execution | Rationale |
|---|---|---|
| Form validation (user-facing) | Synchronous, 2 s timeout, fail-open | Blocking is acceptable for submission |
| Batch scan of existing data | Async Wolverine handler | Potentially millions of records |
| Import pipeline | Async, after persistence | Scan after save — upload should not wait |
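
The form-validation scenario pairs a short timeout with fail-open behavior. A sketch of that pattern, assuming only the `ScanAsync` call shown earlier (the two-second linked-token wrapper and the `IsCleanAsync` helper are illustrative, not library features):

```csharp
// Sketch: user-facing validation with a 2-second budget and fail-open
// semantics. The wrapper is illustrative, not part of Granit.Privacy.AI.
public static async Task<bool> IsCleanAsync(
    IAIPiiDetector detector, string text, CancellationToken ct)
{
    using var timeout = CancellationTokenSource.CreateLinkedTokenSource(ct);
    timeout.CancelAfter(TimeSpan.FromSeconds(2));

    try
    {
        PiiDetectionResult result = await detector
            .ScanAsync(text, timeout.Token)
            .ConfigureAwait(false);
        return !result.ContainsPii;
    }
    catch (OperationCanceledException) when (!ct.IsCancellationRequested)
    {
        // Timed out: fail open. The async pipeline can still flag the
        // record for review after the fact.
        return true;
    }
}
```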

When the LLM is unavailable or times out, `LlmPiiDetector` returns:

```csharp
new PiiDetectionResult { ContainsPii = false, Items = [] }
```

The call is logged as a warning. The calling code receives a “clean” result and continues. This is intentional: a PII scanner should never block the application — it should flag for review, not gate operations.

If you need hard blocking (reject on scanner failure), wrap the call:

```csharp
bool safe;
try
{
    PiiDetectionResult result = await piiDetector.ScanAsync(text, ct).ConfigureAwait(false);
    safe = !result.ContainsPii;
}
catch (Exception ex) when (ex is not OperationCanceledException)
{
    safe = false; // scanner failure: fail closed and reject
}
```
| Property | Type | Default | Description |
|---|---|---|---|
| WorkspaceName | `string` | `"default"` | AI workspace for PII detection. Use an Ollama workspace in production |
| TimeoutSeconds | `int` | `15` | LLM call timeout. PII detection is allowed more time than moderation |