Log Analysis

A spike in error logs tells you something is wrong. It does not tell you what is wrong or where to look first. Granit.Observability.AI adds IAILogAnalyzer: send a batch of log entries, get back a concise summary and a list of categorized insights.

This is batch analysis — run it as a scheduled job, not per request.

[DependsOn(
    typeof(GranitObservabilityAIModule),
    typeof(GranitAIOpenAIModule))]
public class AppModule : GranitModule { }

TimeoutSeconds is intentionally high (60s default) — log analysis batches can be large. This service is never called in the request path.

public class LogAnalysisService(IAILogAnalyzer analyzer)
{
    public async Task<LogAnalysisReport> AnalyzeRecentErrorsAsync(
        IReadOnlyList<LogEntry> recentLogs,
        CancellationToken ct)
    {
        return await analyzer.AnalyzeAsync(recentLogs, ct).ConfigureAwait(false);
    }
}
public sealed record LogEntry(
    DateTimeOffset Timestamp,
    string Level,        // "Error", "Warning", "Information", "Debug"
    string Message,
    string? Exception);  // Full exception string when available

Build log entries from your Serilog sink or OpenTelemetry exporter:

IReadOnlyList<LogEntry> entries = serilogEvents
    .Where(e => e.Level >= LogEventLevel.Warning)
    .Select(e => new LogEntry(
        e.Timestamp,
        e.Level.ToString(),
        e.RenderMessage(),
        e.Exception?.ToString()))
    .ToList();
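For OpenTelemetry, the same projection works against an exporter's batch of log records. A sketch only: the property names (`Timestamp`, `LogLevel`, `FormattedMessage`, `Exception`) follow the OpenTelemetry .NET `LogRecord` type, so verify them against your SDK version:

```csharp
// Sketch: mapping OpenTelemetry LogRecord instances to LogEntry.
// Property names assume the OpenTelemetry .NET LogRecord shape.
IReadOnlyList<LogEntry> entries = logRecords
    .Where(r => r.LogLevel >= LogLevel.Warning)
    .Select(r => new LogEntry(
        new DateTimeOffset(r.Timestamp, TimeSpan.Zero), // Timestamp is UTC
        r.LogLevel.ToString(),
        r.FormattedMessage ?? string.Empty,
        r.Exception?.ToString()))
    .ToList();
```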
public sealed record LogAnalysisReport(
    string Summary,                      // "3 distinct errors, 2 related to database connectivity"
    IReadOnlyList<LogInsight> Insights,  // Categorized findings
    int TotalEntries);                   // Entries analyzed

public sealed record LogInsight(
    string Description,  // "NpgsqlException: connection refused — 47 occurrences"
    string Severity,     // "Critical" | "High" | "Medium" | "Low"
    string Category);    // "Database" | "Authentication" | "Performance" | "Integration" | "Application"
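The report is plain data, so consumers can slice it freely. For example, counting actionable insights per category for a dashboard (illustrative only; `report` is a `LogAnalysisReport` returned by `AnalyzeAsync`):

```csharp
// Illustrative: count Critical/High insights per category.
Dictionary<string, int> actionableByCategory = report.Insights
    .Where(i => i.Severity is "Critical" or "High")
    .GroupBy(i => i.Category)
    .ToDictionary(g => g.Key, g => g.Count());
```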

Given 200 log entries from a 1-hour window:

Summary: 200 entries analyzed. 3 critical error clusters:
(1) Database connectivity failures — NpgsqlException recurring every ~30s
since 14:22 UTC (47 occurrences)
(2) JWT validation errors — 12 occurrences for user agent "curl/7.x",
suggesting automated probing
(3) Slow query warnings — avg 3.2s on /api/invoices/search
Insights:
[Critical/Database] NpgsqlException: connection refused — 47 occurrences since 14:22 UTC
[High/Authentication] JWT validation failures from single IP — possible credential stuffing
[Medium/Performance] P95 latency > 3s on invoice search endpoint
[Low/Application] Null reference in InvoiceMapper.ToDto — 2 occurrences

Log analysis is never interactive — use a cron job:

// Register with Granit.BackgroundJobs
builder.AddGranitBackgroundJob<HourlyLogAnalysisJob>(
    cron: "0 * * * *"); // Every hour

// Wolverine handler
public static async Task Handle(
    HourlyLogAnalysisJob job,
    ILogStore logStore,
    IAILogAnalyzer analyzer,
    INotificationPublisher notifier,
    ITimelineWriter timeline,
    CancellationToken ct)
{
    DateTimeOffset since = DateTimeOffset.UtcNow.AddHours(-1);

    IReadOnlyList<LogEntry> entries = await logStore
        .GetEntriesAsync(since, minLevel: "Warning", ct)
        .ConfigureAwait(false);
    if (entries.Count == 0) return;

    LogAnalysisReport report = await analyzer.AnalyzeAsync(entries, ct)
        .ConfigureAwait(false);

    // Only alert when Critical or High insights are found
    bool shouldAlert = report.Insights.Any(i => i.Severity is "Critical" or "High");
    if (shouldAlert)
    {
        await notifier.PublishAsync(
            LogAlertNotification.Type,
            new LogAlertData(report),
            recipients: ["on-call"],
            ct).ConfigureAwait(false);
    }

    // Always post summary to timeline for audit trail
    await timeline.PostEntryAsync(
        entityType: "System",
        entityId: "log-analysis",
        entryType: TimelineEntryType.SystemLog,
        body: $"[AI Log Analysis] {report.Summary}",
        parentEntryId: null,
        ct).ConfigureAwait(false);
}

Large log batches can be expensive. Use MaxLogEntries to cap the context window, and pre-filter before sending:

IReadOnlyList<LogEntry> filtered = allEntries
    // Only warnings and above
    .Where(e => e.Level is "Warning" or "Error" or "Critical")
    // Deduplicate — LLM doesn't need 500 identical connection errors
    .DistinctBy(e => e.Message[..Math.Min(100, e.Message.Length)])
    // Cap at configured max
    .Take(options.Value.MaxLogEntries)
    .ToList();

Deduplication before the LLM call significantly reduces token cost while preserving analytical quality.
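Note that `DistinctBy` discards occurrence counts, which the analyzer uses for insights like "47 occurrences". A variant (hypothetical, not part of the package) folds the count into the message before deduplicating, so the analyzer still sees frequency:

```csharp
// Hypothetical variant: collapse duplicates but keep occurrence counts,
// so the analyzer sees "[47x] NpgsqlException ..." instead of one sample.
IReadOnlyList<LogEntry> filtered = allEntries
    .Where(e => e.Level is "Warning" or "Error" or "Critical")
    .GroupBy(e => e.Message[..Math.Min(100, e.Message.Length)])
    .Select(g => g.First() with { Message = $"[{g.Count()}x] {g.First().Message}" })
    .Take(options.Value.MaxLogEntries)
    .ToList();
```

This works because `LogEntry` is a record, so `with` produces a copy with only the message changed.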

Property         Type     Default  Description
WorkspaceName    string?  null     AI workspace for log analysis
TimeoutSeconds   int      60       LLM timeout — batch analysis needs more time
MaxLogEntries    int      500      Maximum entries per analysis call
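These settings can be tuned via the standard .NET options pattern. A sketch, assuming the options type is named `GranitObservabilityAIOptions` (check the package for the actual type name):

```csharp
// Sketch: tightening limits via IOptions<T>. The options type name
// GranitObservabilityAIOptions is an assumption, not confirmed by the docs.
builder.Services.Configure<GranitObservabilityAIOptions>(o =>
{
    o.TimeoutSeconds = 90;  // larger batches need more headroom
    o.MaxLogEntries = 300;  // cap token cost per analysis call
});
```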