tes a TypeScript-based pipeline that normalizes logs, constructs task-specific prompts, invokes Nova Micro via Amazon Bedrock, and enforces output validation.
Architecture Decisions
- Model Selection: Amazon Nova Micro is chosen for its 14x cost advantage over legacy models and its 128,000-token context window. It suits high-throughput, token-heavy workloads where semantic understanding matters more than complex reasoning.
- Structured Preprocessing: Raw logs are normalized into a consistent JSON schema before LLM ingestion. This reduces hallucination rates and improves field mapping accuracy, as demonstrated by benchmark tests on CDN and web access logs.
- Deterministic Fallback Layer: Every LLM response is validated against a strict schema. Failed validations trigger a regex/Grok fallback parser, ensuring pipeline continuity during model drift or rate limits.
- Cost-Aware Batching: Logs are grouped into fixed-size batches (e.g., 50 entries) so that each request amortizes the prompt overhead across many entries while staying well inside the context window. This cuts the number of API calls and reduces per-event inference costs.
Implementation
import { BedrockRuntimeClient, InvokeModelCommand } from "@aws-sdk/client-bedrock-runtime";
import { z } from "zod";

// Strict output schema to prevent hallucination drift
const LogAnalysisSchema = z.object({
  template: z.string(),
  variables: z.array(z.string()),
  severity: z.enum(["INFO", "WARN", "ERROR", "CRITICAL"]),
  summary: z.string(),
});

type LogAnalysisResult = z.infer<typeof LogAnalysisSchema>;

interface LogEntry {
  timestamp: string;
  raw: string;
  source: string;
}

export class NovaLogAnalyzer {
  private client: BedrockRuntimeClient;
  private modelId: string;
  private maxBatchSize: number;

  constructor(region: string, modelId = "amazon.nova-micro-v1:0", maxBatchSize = 50) {
    this.client = new BedrockRuntimeClient({ region });
    this.modelId = modelId;
    this.maxBatchSize = maxBatchSize;
  }

  // Returns one result per batch: the prompt asks for a single JSON object
  // describing the whole batch, not one object per log line.
  async analyzeBatch(logs: LogEntry[]): Promise<LogAnalysisResult[]> {
    const batches = this.chunkLogs(logs, this.maxBatchSize);
    const results: LogAnalysisResult[] = [];
    for (const batch of batches) {
      const prompt = this.buildPrompt(batch);
      const rawResponse = await this.invokeModel(prompt);
      const validated = this.validateAndParse(rawResponse);
      results.push(...validated);
    }
    return results;
  }

  private buildPrompt(logs: LogEntry[]): string {
    const logBlock = logs.map((l, i) => `[${i + 1}] ${l.timestamp} | ${l.source} | ${l.raw}`).join("\n");
    return `
You are an observability engine. Analyze the following log batch.
Extract the log template, list all variable placeholders, classify severity, and provide a one-sentence summary.
Return ONLY valid JSON matching this structure:
{
  "template": "string",
  "variables": ["string"],
  "severity": "INFO|WARN|ERROR|CRITICAL",
  "summary": "string"
}
Logs:
${logBlock}
`;
  }

  private async invokeModel(prompt: string): Promise<string> {
    // Request body in the Nova messages format expected by InvokeModel
    const payload = {
      messages: [{ role: "user", content: [{ text: prompt }] }],
      inferenceConfig: { maxTokens: 1024, temperature: 0.1 },
    };
    const command = new InvokeModelCommand({
      body: JSON.stringify(payload),
      modelId: this.modelId,
      contentType: "application/json",
      accept: "application/json",
    });
    const response = await this.client.send(command);
    const decoded = new TextDecoder().decode(response.body);
    const parsed = JSON.parse(decoded);
    // Nova nests the generated text under output.message.content
    const text = parsed?.output?.message?.content?.[0]?.text;
    if (typeof text !== "string") {
      throw new Error("Unexpected response shape from model");
    }
    return text;
  }

  private validateAndParse(raw: string): LogAnalysisResult[] {
    try {
      // Strip Markdown code fences the model may wrap around its JSON
      const cleaned = raw.replace(/```json|```/g, "").trim();
      const parsed = JSON.parse(cleaned);
      return [LogAnalysisSchema.parse(parsed)];
    } catch {
      // Fallback to deterministic parser in production
      return [{
        template: "FALLBACK_REQUIRED",
        variables: [],
        severity: "WARN",
        summary: "LLM validation failed. Fallback parser triggered.",
      }];
    }
  }

  private chunkLogs(logs: LogEntry[], size: number): LogEntry[][] {
    const chunks: LogEntry[][] = [];
    for (let i = 0; i < logs.length; i += size) {
      chunks.push(logs.slice(i, i + size));
    }
    return chunks;
  }
}
Why This Architecture Works
- Schema-First Validation: LLMs are probabilistic. Wrapping responses in a Zod schema catches malformed JSON, missing fields, or hallucinated severity levels before they pollute downstream systems.
- Low Temperature (0.1): Log analysis requires consistency, not creativity. Lowering temperature reduces variance in template extraction and severity classification.
- Batch Chunking: Feeding 50 logs per request keeps each prompt far below the 128k context limit while keeping API call volume predictable. Fewer, fuller requests directly improve cost efficiency.
- Deterministic Fallback: The validateAndParse method never throws. When the LLM output fails schema validation, it returns a FALLBACK_REQUIRED placeholder that downstream code can route to a regex/Grok parser, maintaining observability continuity.
Pitfall Guide
1. Blind Trust in LLM Arithmetic
Explanation: LLMs lack native computational engines. When asked to count API calls, error frequencies, or request rates, they generate plausible-sounding numbers that are frequently inaccurate.
Fix: Never use LLMs for aggregation. Route counting tasks to ClickHouse, Prometheus, or Elasticsearch aggregations. Use the LLM only for semantic classification and template extraction.
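As a minimal sketch of keeping arithmetic out of the model, the counting below is done in plain code. The NormalizedLog shape and the countServerErrorsByPath helper are illustrative assumptions, and the in-process Map only stands in for an aggregation you would normally run in ClickHouse, Prometheus, or Elasticsearch.

```typescript
// Hypothetical shape for an already-normalized log record (not from the original pipeline).
interface NormalizedLog {
  timestamp: string;
  statusCode: number;
  reqPath: string;
}

// Count 5xx responses per request path deterministically; no LLM is involved.
function countServerErrorsByPath(logs: NormalizedLog[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const log of logs) {
    if (log.statusCode >= 500 && log.statusCode <= 599) {
      counts.set(log.reqPath, (counts.get(log.reqPath) ?? 0) + 1);
    }
  }
  return counts;
}
```

The LLM can then be handed the resulting counts as context for a summary, rather than being asked to derive them.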
2. Ignoring Log Structure Normalization
Explanation: Raw logs contain inconsistent timestamps, mixed delimiters, and unstructured payloads. Feeding them directly into an LLM increases token waste and reduces field mapping accuracy.
Fix: Preprocess logs into a uniform JSON schema before LLM ingestion. Normalize HTTP status codes, extract known fields (e.g., reqPath, statusCode), and strip redundant metadata. Structured inputs consistently outperform raw text in benchmark tests.
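The normalization step might look like the sketch below. It assumes input in Common Log Format, which the original does not specify; the NormalizedAccessLog shape and both helper functions are hypothetical names chosen for illustration.

```typescript
// Hypothetical normalized shape; reqPath and statusCode follow the field names used in the text.
interface NormalizedAccessLog {
  timestamp: string; // ISO 8601 with offset
  clientIp: string;
  method: string;
  reqPath: string;
  statusCode: number;
  bytes: number;
}

const MONTH_NUMBERS: Record<string, string> = {
  Jan: "01", Feb: "02", Mar: "03", Apr: "04", May: "05", Jun: "06",
  Jul: "07", Aug: "08", Sep: "09", Oct: "10", Nov: "11", Dec: "12",
};

// Convert a CLF timestamp such as "10/Oct/2024:13:55:36 +0000" to ISO 8601.
function clfTimestampToIso(ts: string): string | null {
  const m = /^(\d{2})\/([A-Za-z]{3})\/(\d{4}):(\d{2}:\d{2}:\d{2}) ([+-]\d{2})(\d{2})$/.exec(ts);
  if (!m) return null;
  const month = MONTH_NUMBERS[m[2]];
  if (!month) return null;
  return `${m[3]}-${month}-${m[1]}T${m[4]}${m[5]}:${m[6]}`;
}

// Parse one access-log line into the uniform schema; returns null when the line does not match.
function normalizeAccessLogLine(line: string): NormalizedAccessLog | null {
  const m = /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-)/.exec(line);
  if (!m) return null;
  const timestamp = clfTimestampToIso(m[2]);
  if (timestamp === null) return null;
  return {
    timestamp,
    clientIp: m[1],
    method: m[3],
    reqPath: m[4],
    statusCode: Number(m[5]),
    bytes: m[6] === "-" ? 0 : Number(m[6]),
  };
}
```

Lines that return null are candidates for a separate parser or a dead-letter queue rather than for the LLM.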
3. Overestimating Anomaly Detection Capabilities
Explanation: The benchmark showed 47% anomaly detection accuracy, with models flagging repetitive entries or end-of-batch logs as anomalies. LLMs lack statistical baselines and temporal context required for true anomaly detection.
Fix: Use LLMs for anomaly classification (e.g., "Is this a known failure pattern?"), not anomaly detection. Pair LLM outputs with statistical detectors (Z-score, isolation forests, or time-series forecasting) for production-grade alerting.
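One of the statistical detectors mentioned above, a Z-score check, can be sketched in a few lines. The function name and the default threshold of 3 are assumptions for illustration; the input is taken to be a series of per-interval counts (for example, errors per minute) computed outside the LLM.

```typescript
// Return the indices whose value lies more than `threshold` population
// standard deviations away from the mean of the series.
function zScoreAnomalies(values: number[], threshold = 3): number[] {
  if (values.length < 2) return [];
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance = values.reduce((acc, v) => acc + (v - mean) ** 2, 0) / values.length;
  const std = Math.sqrt(variance);
  if (std === 0) return []; // a flat series has no outliers
  const flagged: number[] = [];
  values.forEach((v, i) => {
    if (Math.abs(v - mean) / std > threshold) flagged.push(i);
  });
  return flagged;
}
```

Only the flagged intervals would then be passed to the LLM for labeling, which keeps detection deterministic and confines the model to classification.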
4. Context Window Bloat
Explanation: Feeding untrimmed logs into a 128k context window wastes tokens and increases latency. Irrelevant debug traces, stack traces, and verbose headers dilute the signal.
Fix: Implement a pre-inference filter that strips non-essential fields, truncates stack traces to the first 3 frames, and removes debug-level entries unless explicitly requested. This reduces token count by 40-60% without losing analytical value.
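A pre-inference filter along these lines could be sketched as follows. The RawLogRecord shape is hypothetical, and the frame detection assumes JavaScript-style stack traces whose frames begin with "at "; other runtimes would need a different test.

```typescript
// Hypothetical record shape prior to LLM ingestion.
interface RawLogRecord {
  level: string;
  message: string;
  stack?: string;
}

// Keep the header line(s) and the first `maxFrames` frames of a stack trace.
function truncateStack(stack: string, maxFrames = 3): string {
  const kept: string[] = [];
  let frames = 0;
  for (const line of stack.split("\n")) {
    const isFrame = line.trim().startsWith("at ");
    if (isFrame) {
      if (frames >= maxFrames) break;
      frames++;
    }
    kept.push(line);
  }
  return kept.join("\n");
}

// Drop debug-level entries unless requested, and trim stack traces on the rest.
function preInferenceFilter(records: RawLogRecord[], includeDebug = false): RawLogRecord[] {
  return records
    .filter((r) => includeDebug || r.level.toUpperCase() !== "DEBUG")
    .map((r) => (r.stack ? { ...r, stack: truncateStack(r.stack) } : r));
}
```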
5. Security False Confidence
Explanation: The benchmark reported 95% accuracy on malicious content detection, but the sampled datasets lacked obvious threats. High accuracy on clean data does not translate to production threat hunting.
Fix: Treat LLM security analysis as a triage layer, not a detection engine. Use it to classify suspicious patterns (e.g., "Does this log resemble a brute-force attempt?"), but validate findings with SIEM rules, threat intelligence feeds, and behavioral analytics.
6. Ignoring Model and Schema Drift
Explanation: Minor changes in log structure or model updates can cause JSON parsing failures, missing fields, or inconsistent severity labels.
Fix: Pin model versions, enforce strict output schemas, and implement retry logic with prompt reformatting. Log all validation failures to track drift over time.
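Retry with prompt reformatting can be sketched generically, independent of the Bedrock client. Everything here is an illustrative assumption: invokeWithRetry, its parameter names, and the wording of the corrective reminder are not part of the original pipeline. The invoke and validate callbacks would wrap your own model call and schema check.

```typescript
type InvokeFn = (prompt: string) => Promise<string>;
type ValidateFn<T> = (raw: string) => T | null;

// Call the model, validate the reply, and on failure retry with a stricter
// reminder appended to the original prompt. Every failure is reported through
// onFailure so drift can be tracked over time.
async function invokeWithRetry<T>(
  invoke: InvokeFn,
  validate: ValidateFn<T>,
  prompt: string,
  maxAttempts = 3,
  onFailure: (attempt: number, raw: string) => void = () => {},
): Promise<T | null> {
  let currentPrompt = prompt;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await invoke(currentPrompt);
    const result = validate(raw);
    if (result !== null) return result;
    onFailure(attempt, raw);
    currentPrompt =
      `${prompt}\n\nYour previous reply did not match the required JSON structure. ` +
      `Return ONLY the JSON object, with no prose or code fences.`;
  }
  return null; // caller should route to the deterministic fallback
}
```

A null return signals that all attempts failed validation, which is the point at which a fallback parser would take over.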
7. Skipping Fallback Parsers
Explanation: Relying solely on LLMs creates a single point of failure. Rate limits, model downtime, or schema mismatches can halt log processing.
Fix: Always maintain a deterministic fallback (Grok, regex, or schema-based parser). Route failed LLM responses to the fallback layer and alert engineering teams when fallback usage exceeds a threshold (e.g., >5%).
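A deterministic fallback and a usage tracker might look like the sketch below. The regexFallback heuristics (masking IPv4 addresses and integers, reading severity from a keyword) and the FallbackRateTracker class are illustrative assumptions, not a replacement for a full Grok pattern set; the 5% default mirrors the threshold in the text.

```typescript
type Severity = "INFO" | "WARN" | "ERROR" | "CRITICAL";

interface FallbackResult {
  template: string;
  variables: string[];
  severity: Severity;
  summary: string;
}

// Regex-based template extraction: mask IPv4 addresses and integers, in order of appearance.
function regexFallback(raw: string): FallbackResult {
  const variables: string[] = [];
  const template = raw.replace(/\b\d{1,3}(?:\.\d{1,3}){3}\b|\b\d+\b/g, (match) => {
    variables.push(match);
    return match.indexOf(".") !== -1 ? "<IP>" : "<NUM>";
  });
  const sev = /\b(CRITICAL|ERROR|WARN|INFO)\b/i.exec(raw);
  const severity = sev ? (sev[1].toUpperCase() as Severity) : "INFO";
  return { template, variables, severity, summary: "Parsed by deterministic fallback." };
}

// Track how often the fallback path is taken and signal when it exceeds a threshold.
class FallbackRateTracker {
  private total = 0;
  private fallbacks = 0;
  private readonly threshold: number;

  constructor(threshold = 0.05) {
    this.threshold = threshold;
  }

  record(usedFallback: boolean): void {
    this.total++;
    if (usedFallback) this.fallbacks++;
  }

  rate(): number {
    return this.total === 0 ? 0 : this.fallbacks / this.total;
  }

  shouldAlert(): boolean {
    return this.rate() > this.threshold;
  }
}
```

In practice the tracker would be fed from the validation step, with shouldAlert wired to whatever paging or metrics system the team already uses.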
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| High-volume template extraction | Amazon Nova Micro + structured preprocessing | 14x cheaper than legacy models, 89% accuracy on normalized logs | Low (optimized batching reduces token waste) |
| Security triage & threat classification | Nova Micro + SIEM validation | LLMs classify patterns well but lack threat intelligence context | Medium (requires dual-processing pipeline) |
| Anomaly hunting & alerting | Statistical detectors + LLM classification | LLMs produce false positives on irrelevant criteria | Low (LLM used only for post-detection labeling) |
| Cost-constrained observability | Nova Micro + deterministic fallbacks | Maintains pipeline continuity while minimizing inference costs | Very Low (fallbacks reduce LLM dependency) |
Configuration Template
# bedrock-log-pipeline.config.yaml
model:
  id: "amazon.nova-micro-v1:0"
  region: "us-east-1"
inference:
  temperature: 0.1
  max_tokens: 1024
  top_p: 0.9
batching:
  max_size: 50
  timeout_ms: 3000
  retry_attempts: 2
validation:
  schema_version: "v1.2"
  fallback_parser: "grok"
  alert_threshold: 0.05 # Trigger alert if fallback usage > 5%
preprocessing:
  strip_debug: true
  truncate_stack_frames: 3
  normalize_http_codes: true
  remove_redundant_metadata: true
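Once the YAML has been loaded into an object (with whichever YAML parser you prefer), a small sanity check can catch out-of-range values before the pipeline starts. The PipelineConfig interface mirrors part of the template above; checkPipelineConfig and the specific bounds it enforces are assumptions made for this sketch rather than documented limits.

```typescript
// Hypothetical in-memory mirror of part of bedrock-log-pipeline.config.yaml.
interface PipelineConfig {
  inference: { temperature: number; max_tokens: number; top_p: number };
  batching: { max_size: number; timeout_ms: number; retry_attempts: number };
  validation: { schema_version: string; fallback_parser: string; alert_threshold: number };
}

// Return a list of human-readable problems; an empty list means the config passed.
function checkPipelineConfig(cfg: PipelineConfig): string[] {
  const problems: string[] = [];
  const inUnitRange = (n: number) => n >= 0 && n <= 1;

  if (!inUnitRange(cfg.inference.temperature)) problems.push("inference.temperature must be between 0 and 1");
  if (!inUnitRange(cfg.inference.top_p)) problems.push("inference.top_p must be between 0 and 1");
  if (!Number.isInteger(cfg.inference.max_tokens) || cfg.inference.max_tokens < 1) {
    problems.push("inference.max_tokens must be a positive integer");
  }
  if (!Number.isInteger(cfg.batching.max_size) || cfg.batching.max_size < 1) {
    problems.push("batching.max_size must be a positive integer");
  }
  if (!Number.isInteger(cfg.batching.retry_attempts) || cfg.batching.retry_attempts < 0) {
    problems.push("batching.retry_attempts must be a non-negative integer");
  }
  if (!inUnitRange(cfg.validation.alert_threshold)) {
    problems.push("validation.alert_threshold must be between 0 and 1");
  }
  return problems;
}
```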
Quick Start Guide
- Install Dependencies:
npm install @aws-sdk/client-bedrock-runtime zod
- Configure Credentials: Set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION in your environment, or rely on an attached IAM role.
- Initialize Analyzer: Instantiate NovaLogAnalyzer with your target region and batch size.
- Ingest & Analyze: Pass an array of normalized LogEntry objects to analyzeBatch(). The pipeline handles chunking, prompting, invocation, and validation automatically.
- Monitor & Tune: Track validation failure rates and fallback triggers. Adjust batch size, temperature, or preprocessing rules based on your log volume and accuracy requirements.