Import Mapping

Every B2B application faces the same nightmare: data onboarding. Your client exports a CSV from their legacy system (Sage, SAP, AS/400) with columns named Nom_Clt_V2_Final, dt_crea, or simply F_003. Your system expects CustomerName, CreatedAt, AmountExclTax. Someone has to map each column — manually, every time.

Granit.DataExchange.AI solves this by adding an AI-powered Tier 4 to the existing mapping pipeline. When exact and fuzzy matching fail, the LLM analyzes the column names and target schema to suggest the mapping automatically.

Traditional approaches fail on real-world data:

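| Source column | Target property | Exact match? | Fuzzy match? | AI match? |
|---|---|---|---|---|
| Email | Email | Yes | | |
| Courriel | Email | No | Yes (alias) | |
| Nom_Clt_V2_Final | CustomerName | No | No | Yes (0.92) |
| MONTANT HT | AmountExclTax | No | No | Yes (0.95) |
| dt_crea | CreatedAt | No | No | Yes (0.88) |
| COL1 | ??? | No | No | Needs preview rows |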

The exact and fuzzy tiers handle clean data. The AI tier handles the rest, which is most of what you encounter in production B2B onboarding.

```mermaid
flowchart TD
    H[File headers] --> T1[Tier 1: Saved mappings]
    T1 -->|unmapped columns| T2[Tier 2: Exact match]
    T2 -->|unmapped columns| T3[Tier 3: Fuzzy match]
    T3 -->|unmapped columns| T4[Tier 4: AI Semantic]
    T4 --> R[Final mapping]
    T1 --> R
    T2 --> R
    T3 --> R

    style T4 fill:#e8f5e9,stroke:#4caf50
```

Each tier only processes columns that previous tiers couldn’t match. The best confidence wins per column. This means the AI is only called when necessary — most columns are matched by cheaper tiers first.
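
The cascade described above can be sketched as a loop in which each tier receives only the columns that are still unmapped. This is a minimal illustration, not Granit's implementation: `MatchSaved`, `MatchExact`, `MatchNothing`, and `RunCascade` are hypothetical names, and the sketch assumes the first tier to match a column claims it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical target schema and saved mappings, for illustration only.
var targets = new[] { "Email", "CustomerName", "CreatedAt" };
var saved = new Dictionary<string, string> { ["Nom_Clt_V2_Final"] = "CustomerName" };

// Tier 1: mappings the user confirmed on a previous import.
Dictionary<string, string> MatchSaved(IReadOnlyList<string> columns) =>
    columns.Where(saved.ContainsKey).ToDictionary(c => c, c => saved[c]);

// Tier 2: case-insensitive comparison against target property names.
Dictionary<string, string> MatchExact(IReadOnlyList<string> columns)
{
    var result = new Dictionary<string, string>();
    foreach (var column in columns)
    {
        var target = targets.FirstOrDefault(
            t => string.Equals(t, column, StringComparison.OrdinalIgnoreCase));
        if (target is not null) result[column] = target;
    }
    return result;
}

// Stand-in for the later tiers: maps nothing, like a no-op default.
Dictionary<string, string> MatchNothing(IReadOnlyList<string> columns) => new();

// Run the tiers in order; each one only sees columns still unmapped.
Dictionary<string, string> RunCascade(
    IEnumerable<string> headers,
    params Func<IReadOnlyList<string>, Dictionary<string, string>>[] tiers)
{
    var mapping = new Dictionary<string, string>();
    var unmapped = headers.ToList();
    foreach (var tier in tiers)
    {
        if (unmapped.Count == 0) break; // nothing left: later tiers are never called
        foreach (var (column, target) in tier(unmapped))
            mapping[column] = target;
        unmapped = unmapped.Where(c => !mapping.ContainsKey(c)).ToList();
    }
    return mapping;
}

var finalMapping = RunCascade(
    new[] { "email", "Nom_Clt_V2_Final", "COL1" },
    MatchSaved, MatchExact, MatchNothing);

foreach (var (column, target) in finalMapping)
    Console.WriteLine($"{column} -> {target}");
```

In this run, `COL1` stays unmapped because no tier recognizes it, which is the case the AI tier and preview rows are meant to address.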

| Tier | Confidence | Speed | Cost |
|---|---|---|---|
| Saved | Highest (user-confirmed) | Instant | Free |
| Exact | High (case-insensitive name/alias) | Instant | Free |
| Fuzzy | Medium (Levenshtein ≥ 0.8) | Instant | Free |
| Semantic (AI) | Variable (0.0–1.0) | ~200ms | LLM tokens |

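
The page does not spell out how the "Levenshtein ≥ 0.8" score is computed. One common reading is a normalized similarity, `1 - distance / longest length`, and the sketch below assumes exactly that. `LevenshteinDistance` and `Similarity` are hypothetical helpers, and the case-insensitive comparison is also an assumption.

```csharp
using System;

// Classic dynamic-programming Levenshtein distance using two rolling rows.
static int LevenshteinDistance(string a, string b)
{
    var previous = new int[b.Length + 1];
    var current = new int[b.Length + 1];
    for (var j = 0; j <= b.Length; j++) previous[j] = j;

    for (var i = 1; i <= a.Length; i++)
    {
        current[0] = i;
        for (var j = 1; j <= b.Length; j++)
        {
            var substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            var insertion = current[j - 1] + 1;
            var deletion = previous[j] + 1;
            current[j] = Math.Min(substitution, Math.Min(insertion, deletion));
        }
        (previous, current) = (current, previous); // latest row is now "previous"
    }
    return previous[b.Length];
}

// Assumed normalization: 1 - distance / longest length, case-insensitive.
static double Similarity(string source, string target)
{
    var a = source.ToLowerInvariant();
    var b = target.ToLowerInvariant();
    var longest = Math.Max(a.Length, b.Length);
    return longest == 0 ? 1.0 : 1.0 - (double)LevenshteinDistance(a, b) / longest;
}

Console.WriteLine(Similarity("CustomerNam", "CustomerName") >= 0.8); // True
Console.WriteLine(Similarity("dt_crea", "CreatedAt") >= 0.8);        // False
```

Under this reading, a one-character typo clears the threshold, while an abbreviation such as `dt_crea` does not, which is why such columns fall through to the semantic tier.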

```csharp
[DependsOn(
    typeof(GranitDataExchangeAIModule),
    typeof(GranitAIOllamaModule))]
public class AppModule : GranitModule { }
```

```csharp
builder.AddGranitAI();
builder.AddGranitAIOllama(); // or any provider
builder.AddGranitDataExchangeAI();
```

```json
{
  "AI": {
    "DataExchange": {
      "WorkspaceName": "default",
      "TimeoutSeconds": 10,
      "MinConfidenceScore": 0.6
    }
  }
}
```

That’s it. The AI tier is automatically registered as the ISemanticMappingService implementation, replacing the default no-op.

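By default, the LLM receives only metadata: the source column headers and the target schema. No row values leave your system, which is critical for GDPR compliance. For example: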

```text
Source columns: Nom_Clt_V2_Final, dt_crea, MONTANT HT, TVA
Target properties:
| Property      | Type           | Display Name     | Description          | Required |
|---------------|----------------|------------------|----------------------|----------|
| CustomerName  | String         | Customer Name    | Full customer name   | Yes      |
| CreatedAt     | DateTimeOffset | Creation Date    | Record creation date | Yes      |
| AmountExclTax | Decimal        | Amount excl. tax | Net amount           | Yes      |
| VatRate       | Decimal        | VAT Rate         | VAT percentage       | No       |
```

With the default settings, the LLM never receives:

  • Row data (names, emails, amounts)
  • Database records
  • Tenant identifiers
  • Any business data whatsoever
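
One way to make the metadata-only guarantee structural is to build the prompt from a function that has no parameter for row data at all. The sketch below illustrates that idea; `BuildMetadataPrompt` is a hypothetical helper, and the prompt layout is copied from the example above rather than from Granit's source.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Builds the prompt from headers and schema metadata only; there is no
// parameter through which row values could reach the LLM.
static string BuildMetadataPrompt(
    IEnumerable<string> sourceColumns,
    IEnumerable<(string Property, string Type, string DisplayName, string Description, bool Required)> targets)
{
    var prompt = new StringBuilder();
    prompt.AppendLine($"Source columns: {string.Join(", ", sourceColumns)}");
    prompt.AppendLine("Target properties:");
    prompt.AppendLine("| Property | Type | Display Name | Description | Required |");
    prompt.AppendLine("|---|---|---|---|---|");
    foreach (var t in targets)
    {
        var required = t.Required ? "Yes" : "No";
        prompt.AppendLine($"| {t.Property} | {t.Type} | {t.DisplayName} | {t.Description} | {required} |");
    }
    return prompt.ToString();
}

var text = BuildMetadataPrompt(
    new[] { "Nom_Clt_V2_Final", "dt_crea", "MONTANT HT", "TVA" },
    new[]
    {
        ("CustomerName", "String", "Customer Name", "Full customer name", true),
        ("CreatedAt", "DateTimeOffset", "Creation Date", "Record creation date", true),
        ("AmountExclTax", "Decimal", "Amount excl. tax", "Net amount", true),
        ("VatRate", "Decimal", "VAT Rate", "VAT percentage", false),
    });

Console.WriteLine(text);
```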

When column headers are meaningless (COL1, F_003, FIELD_A), the headers alone don't give the LLM enough to work with. You can opt in to sending the first few data rows to the LLM:

```json
{
  "AI": {
    "DataExchange": {
      "IncludePreviewRows": true,
      "PreviewRowCount": 5
    }
  }
}
```

The LLM then sees:

```text
Source columns: COL1, COL2, COL3
Sample data (first rows):
| COL1              | COL2       | COL3    |
|-------------------|------------|---------|
| [email protected] | John Doe   | +32 123 |
| [email protected] | Jane Smith | +32 456 |
```
With this context, the LLM can infer that COL1 is an email, COL2 is a name, etc.
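
The opt-in behavior can be pictured as a guard plus a row limit. The sketch below is an assumption about how such a section might be rendered, not Granit's code: `BuildPreviewSection` is a hypothetical helper whose two last parameters mirror the `IncludePreviewRows` and `PreviewRowCount` settings, and the sample values are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Returns an empty string unless preview rows are explicitly enabled.
static string BuildPreviewSection(
    IReadOnlyList<string> headers,
    IEnumerable<IReadOnlyList<string>> rows,
    bool includePreviewRows,
    int previewRowCount)
{
    if (!includePreviewRows) return string.Empty;

    var section = new StringBuilder();
    section.AppendLine("Sample data (first rows):");
    section.AppendLine($"| {string.Join(" | ", headers)} |");
    section.AppendLine($"|{string.Join("|", headers.Select(_ => "---"))}|");
    // Never send more rows than the configured limit.
    foreach (var row in rows.Take(previewRowCount))
        section.AppendLine($"| {string.Join(" | ", row)} |");
    return section.ToString();
}

var fileHeaders = new[] { "COL1", "COL2", "COL3" };
var fileRows = new List<string[]>
{
    new[] { "john@example.test", "John Doe", "+32 123" },
    new[] { "jane@example.test", "Jane Smith", "+32 456" },
};

// Default settings: nothing is rendered, so no row data is sent.
Console.WriteLine(BuildPreviewSection(fileHeaders, fileRows, false, 5).Length); // 0

// Opt-in with a limit of one row: only the first row appears.
Console.WriteLine(BuildPreviewSection(fileHeaders, fileRows, true, 1));
```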

Soft dependency — zero changes to DataExchange

Granit.DataExchange.AI is a pure additive package. The DataExchange module defines ISemanticMappingService with a null-object default:

```text
Granit.DataExchange    → defines ISemanticMappingService
                         registers NullSemanticMappingService (IsAvailable = false)

Granit.DataExchange.AI → implements AISemanticMappingService (IsAvailable = true)
                         replaces via DI
                         references Granit.DataExchange
                         references Granit.AI
```

Without the AI package: Tier 4 is silently skipped. With it: Tier 4 activates. No if statements, no feature flags — just DI composition.
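
The null-object pattern behind this can be shown in a few lines. The contract below is a simplified, hypothetical stand-in (`ISemanticMapper`, `NullSemanticMapper`, `FakeAiSemanticMapper`); the real `ISemanticMappingService` signature is not shown on this page and may differ. The sketch also assumes the caller invokes the service unconditionally and simply receives an empty result from the default.

```csharp
using System;
using System.Collections.Generic;

// With only the default registered, the semantic tier contributes nothing.
ISemanticMapper mapper = new NullSemanticMapper();
Console.WriteLine(mapper.Suggest(new[] { "COL1" }).Count); // 0

// Swapping the implementation is the only change; the caller is untouched.
mapper = new FakeAiSemanticMapper();
Console.WriteLine(mapper.Suggest(new[] { "COL1" }).Count); // 1

// Hypothetical, simplified contract; the real interface may differ.
interface ISemanticMapper
{
    bool IsAvailable { get; }
    IReadOnlyDictionary<string, string> Suggest(IReadOnlyList<string> unmappedColumns);
}

// Null object: always available to call, never suggests anything.
sealed class NullSemanticMapper : ISemanticMapper
{
    public bool IsAvailable => false;

    public IReadOnlyDictionary<string, string> Suggest(IReadOnlyList<string> unmappedColumns) =>
        new Dictionary<string, string>();
}

// Stand-in for an AI-backed implementation; it maps every column to "Email".
sealed class FakeAiSemanticMapper : ISemanticMapper
{
    public bool IsAvailable => true;

    public IReadOnlyDictionary<string, string> Suggest(IReadOnlyList<string> unmappedColumns)
    {
        var suggestions = new Dictionary<string, string>();
        foreach (var column in unmappedColumns) suggestions[column] = "Email";
        return suggestions;
    }
}
```

In a real application the swap would happen in the DI container rather than by assignment, with the AI package's registration taking precedence over the default.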

| Risk | Mitigation |
|---|---|
| LLM timeout | 10s timeout (configurable), fallback to empty result; Tiers 1-3 still work |
| Wrong mapping | Minimum confidence threshold (0.6), user validates in wizard before import |
| PII in preview rows | Disabled by default, explicit opt-in required |
| Cost | Only called for unmapped columns after 3 free tiers, typically 1-5 columns per import |
| LLM hallucination | Structured output (JSON), validated against known property paths |

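
The "wrong mapping" and "hallucination" mitigations amount to filtering the model's structured output before anything reaches the user. The sketch below shows one way to do that with `System.Text.Json`. The JSON shape (`source`, `target`, `confidence`) and the `AcceptSuggestions` helper are assumptions for illustration; the actual response schema is not documented on this page.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Keeps only suggestions that name a known target property and meet the threshold.
static Dictionary<string, string> AcceptSuggestions(
    string llmJson, ISet<string> knownProperties, double minConfidence)
{
    var accepted = new Dictionary<string, string>();
    using var document = JsonDocument.Parse(llmJson);
    foreach (var item in document.RootElement.EnumerateArray())
    {
        var source = item.GetProperty("source").GetString();
        var target = item.GetProperty("target").GetString();
        var confidence = item.GetProperty("confidence").GetDouble();

        if (source is null || target is null) continue;
        if (!knownProperties.Contains(target)) continue; // property the schema does not have
        if (confidence < minConfidence) continue;         // below the configured minimum
        accepted[source] = target;
    }
    return accepted;
}

var knownTargets = new HashSet<string> { "CustomerName", "CreatedAt", "AmountExclTax", "VatRate" };

// Simulated LLM response in the assumed shape.
var response = JsonSerializer.Serialize(new[]
{
    new { source = "dt_crea", target = "CreatedAt", confidence = 0.88 },
    new { source = "TVA", target = "TaxCode", confidence = 0.90 },     // unknown property
    new { source = "COL1", target = "CustomerName", confidence = 0.40 } // too uncertain
});

var kept = AcceptSuggestions(response, knownTargets, 0.6);
Console.WriteLine(kept.Count); // 1
```

A production version would also need to handle malformed JSON and missing fields, which this sketch lets surface as exceptions.
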
| Property | Type | Default | Description |
|---|---|---|---|
| WorkspaceName | string | "default" | AI workspace for mapping suggestions |
| TimeoutSeconds | int | 10 | LLM call timeout |
| MinConfidenceScore | double | 0.6 | Minimum score to accept a suggestion |
| IncludePreviewRows | bool | false | Include sample data rows (GDPR opt-in) |
| PreviewRowCount | int | 5 | Number of preview rows when enabled |
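
To illustrate what `TimeoutSeconds` protects against, the sketch below wraps a simulated LLM call in a deadline and falls back to an empty result when it expires. It is a generic .NET pattern built on `CancellationTokenSource`, not Granit's implementation; `SuggestWithTimeoutAsync` and `SlowLlmAsync` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Runs the LLM call with a deadline; on timeout, returns an empty result
// so the mappings from the earlier tiers are used unchanged.
static async Task<IReadOnlyDictionary<string, string>> SuggestWithTimeoutAsync(
    Func<CancellationToken, Task<IReadOnlyDictionary<string, string>>> callLlm,
    TimeSpan timeout)
{
    using var cts = new CancellationTokenSource(timeout);
    try
    {
        return await callLlm(cts.Token);
    }
    catch (OperationCanceledException)
    {
        return new Dictionary<string, string>();
    }
}

// Simulated slow provider: waits 5 seconds unless cancelled first.
static async Task<IReadOnlyDictionary<string, string>> SlowLlmAsync(CancellationToken token)
{
    await Task.Delay(TimeSpan.FromSeconds(5), token);
    return new Dictionary<string, string> { ["COL1"] = "Email" };
}

var result = await SuggestWithTimeoutAsync(SlowLlmAsync, TimeSpan.FromMilliseconds(100));
Console.WriteLine(result.Count); // 0
```

Because the fallback is an empty result rather than an exception, the import wizard can still present whatever the saved, exact, and fuzzy tiers matched.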