persisted.
import crypto from 'crypto';

interface KeyRecord {
  id: string;
  hash: string;
  ownerRole: 'employee' | 'analyst' | 'finance' | 'admin';
  createdAt: Date;
  revoked: boolean;
}

class KeyRegistry {
  private store: Map<string, KeyRecord> = new Map();

  generateKey(ownerRole: KeyRecord['ownerRole']): { plain: string; record: KeyRecord } {
    const raw = crypto.randomBytes(32).toString('hex');
    const hash = crypto.createHash('sha256').update(raw).digest('hex');
    const id = crypto.randomUUID();
    const record: KeyRecord = { id, hash, ownerRole, createdAt: new Date(), revoked: false };
    this.store.set(id, record);
    return { plain: raw, record };
  }

  resolveRole(plainKey: string): KeyRecord['ownerRole'] | null {
    const inputHash = crypto.createHash('sha256').update(plainKey).digest('hex');
    for (const record of this.store.values()) {
      if (record.hash === inputHash && !record.revoked) {
        return record.ownerRole;
      }
    }
    return null;
  }
}
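Architecture Rationale: Storing only the SHA-256 hash means a database breach does not expose usable credentials. A fast hash is adequate here because the keys are 256-bit random values rather than human-chosen passwords. The resolveRole method hashes the presented key and compares it against the stored hashes, so the raw key is never written to storage. Role resolution happens before any retrieval logic executes, so the role used for filtering always comes from the key record and any client-supplied user_role field in the request body is ignored.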
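The lookup and the role override described above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: it assumes the registry may be indexed by hash instead of by id, and the names `HashIndexedRegistry` and `buildAuthContext` are hypothetical.

```typescript
import { createHash, randomBytes } from 'node:crypto';

type Role = 'employee' | 'analyst' | 'finance' | 'admin';

const sha256 = (value: string): string =>
  createHash('sha256').update(value).digest('hex');

// Hypothetical variant of the registry: records are indexed by key hash,
// so resolving a role is one Map lookup instead of a scan over all records.
class HashIndexedRegistry {
  private byHash = new Map<string, Role>();

  generateKey(role: Role): string {
    const plain = randomBytes(32).toString('hex');
    this.byHash.set(sha256(plain), role); // only the hash is stored
    return plain; // returned to the caller once
  }

  resolveRole(plainKey: string): Role | null {
    return this.byHash.get(sha256(plainKey)) ?? null;
  }
}

// Builds the context handed to retrieval. Any `user_role` in the body is
// dropped; the role comes only from the validated key.
function buildAuthContext(
  registry: HashIndexedRegistry,
  apiKey: string,
  body: Record<string, unknown>
): { role: Role; query: Record<string, unknown> } | null {
  const role = registry.resolveRole(apiKey);
  if (!role) return null;
  const { user_role: _ignored, ...query } = body;
  return { role, query };
}
```

Indexing by hash trades the id-keyed Map for constant-time resolution; a real registry would likely keep both indexes so that revocation by id still works.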
Revocation must bypass caching layers and session managers. Deleting the hash record forces the next validation attempt to fail instantly.
class KeyRegistry {
  // ... previous methods

  revokeKey(keyId: string, adminToken: string): boolean {
    if (!this.validateAdmin(adminToken)) return false;
    // Map.delete returns false when the key id is unknown
    return this.store.delete(keyId); // Immediate removal, no grace period
  }

  private validateAdmin(token: string): boolean {
    const expected = process.env.ADMIN_TOKEN;
    // Reject when the secret is unset or empty, so a missing token can never match
    return Boolean(expected) && token === expected;
  }
}
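Architecture Rationale: Map.delete() removes the record in O(1), and because every request is validated against the live store, the revocation takes effect on the very next lookup in that process. Unlike JWT expiration or session store TTLs, this approach requires no background cleanup jobs. Deleting the record also means the revoked flag on KeyRecord is never set; keep the record and set the flag instead if revoked keys must stay visible for auditing. The admin token check ensures only authorized operators can trigger revocation, which prevents an attacker from revoking keys to lock out legitimate users.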
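A plain string comparison for the admin token has two weaknesses: it can succeed when both sides are undefined at runtime, and its duration can depend on how many leading characters match. The following is a hedged sketch of a stricter check, assuming Node's built-in crypto module; `tokensMatch` is a hypothetical helper name, not part of the original code.

```typescript
import { createHash, timingSafeEqual } from 'node:crypto';

// Hashing both values first yields equal-length buffers, which
// timingSafeEqual requires (it throws on a length mismatch).
function tokensMatch(
  supplied: string | undefined,
  expected: string | undefined
): boolean {
  // An unset secret or a missing header never matches.
  if (!supplied || !expected) return false;
  const a = createHash('sha256').update(supplied).digest();
  const b = createHash('sha256').update(expected).digest();
  return timingSafeEqual(a, b);
}
```

A validateAdmin method could delegate to this helper with `process.env.ADMIN_TOKEN` as the expected value.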
Management operations and data retrieval require distinct authentication scopes. Mixing them creates lateral movement opportunities.
import express, { Request, Response, NextFunction } from 'express';

// Make the authContext property visible to TypeScript on Express requests
declare global {
  namespace Express {
    interface Request {
      authContext?: { role: KeyRecord['ownerRole'] };
    }
  }
}

const app = express();
const keyRegistry = new KeyRegistry();

const enforceAdminScope = (req: Request, res: Response, next: NextFunction) => {
  const adminHeader = req.get('x-admin-token');
  const expected = process.env.ADMIN_TOKEN;
  // Deny when the secret is unset; otherwise a missing header would equal undefined and pass
  if (!expected || adminHeader !== expected) {
    return res.status(401).json({ error: 'Management access denied' });
  }
  next();
};

const enforceQueryScope = (req: Request, res: Response, next: NextFunction) => {
  const apiKey = req.get('x-api-key');
  if (!apiKey) return res.status(401).json({ error: 'Query credential required' });
  const role = keyRegistry.resolveRole(apiKey);
  if (!role) return res.status(403).json({ error: 'Invalid or revoked key' });
  req.authContext = { role };
  next();
};

// Management routes (protected by admin token); handlers are defined elsewhere
app.post('/ingest', enforceAdminScope, handleIngestion);
app.post('/api-keys', enforceAdminScope, handleKeyCreation);
app.get('/audit-logs', enforceAdminScope, handleAuditRetrieval);

// Query route (protected by API key)
app.post('/query', enforceQueryScope, handleRetrieval);
Architecture Rationale: Separating x-admin-token and x-api-key limits what each leaked credential can do directly. A leaked query key cannot trigger ingestion or read audit trails. A leaked admin token carries no user role and cannot call /query by itself, but it can create new API keys, so it remains the higher-value secret and should be rotated immediately if exposed. This least-privilege split limits blast radius during security incidents.
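The scope split can also be expressed as data, which makes it easy to test without starting a server. This is a framework-free sketch under stated assumptions: the `routeScopes` table, the `Credentials` shape, and `isAuthorized` are hypothetical names that mirror the four routes above.

```typescript
type Scope = 'management' | 'query';

// Hypothetical route table mirroring the routes registered above.
const routeScopes: Record<string, Scope> = {
  'POST /ingest': 'management',
  'POST /api-keys': 'management',
  'GET /audit-logs': 'management',
  'POST /query': 'query',
};

interface Credentials {
  adminTokenValid: boolean; // outcome of checking X-Admin-Token
  apiKeyValid: boolean; // outcome of resolving X-API-Key to a role
}

// Each route accepts exactly one credential type; the other is irrelevant to it.
function isAuthorized(route: string, creds: Credentials): boolean {
  const scope = routeScopes[route];
  if (!scope) return false; // unknown routes are denied by default
  return scope === 'management' ? creds.adminTokenValid : creds.apiKeyValid;
}
```

For example, `isAuthorized('POST /ingest', { adminTokenValid: false, apiKeyValid: true })` is false: a valid query key alone never opens a management route.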
Step 4: Immutable Audit Trails
Administrative actions require tamper-evident logging. Query operations require separate tracking for retrieval analytics and RBAC enforcement metrics.
class AuditSink {
  private logs: Array<{ action: string; timestamp: Date; initiator: string }> = [];

  record(action: string, initiator: string): void {
    this.logs.push({
      action,
      timestamp: new Date(),
      initiator
    });
  }

  getLogs(): ReadonlyArray<{ action: string; timestamp: Date; initiator: string }> {
    return [...this.logs];
  }
}
Architecture Rationale: Admin logs track who modified system state. Query logs (handled separately in the retrieval pipeline) track question text, resolved role, citation sources, and RBAC-blocked chunk counts. Together, they satisfy security review requirements by answering: who changed configuration, what was queried, and what was filtered by access controls.
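An in-memory array by itself does not make tampering detectable. One common way to get tamper evidence is to chain entries by hash, so that altering any entry invalidates every later one. The sketch below is an assumption about how that could look, not the article's implementation; `ChainedAuditLog`, `ChainedEntry`, and `GENESIS` are hypothetical names.

```typescript
import { createHash } from 'node:crypto';

interface ChainedEntry {
  action: string;
  initiator: string;
  timestamp: string; // ISO 8601
  prevHash: string; // hash of the previous entry, or GENESIS for the first
  hash: string; // SHA-256 over this entry's fields plus prevHash
}

const GENESIS = '0'.repeat(64);

const entryHash = (e: Omit<ChainedEntry, 'hash'>): string =>
  createHash('sha256')
    .update(JSON.stringify([e.action, e.initiator, e.timestamp, e.prevHash]))
    .digest('hex');

class ChainedAuditLog {
  private entries: ChainedEntry[] = [];

  record(action: string, initiator: string): void {
    const last = this.entries[this.entries.length - 1];
    const base = {
      action,
      initiator,
      timestamp: new Date().toISOString(),
      prevHash: last ? last.hash : GENESIS,
    };
    this.entries.push({ ...base, hash: entryHash(base) });
  }

  // Copies each entry so callers cannot mutate the stored records.
  getLogs(): ChainedEntry[] {
    return this.entries.map((e) => ({ ...e }));
  }

  // Recomputes the chain; false means an entry was altered, reordered,
  // or removed from the middle. Truncating the tail is not detected here.
  static verify(entries: ChainedEntry[]): boolean {
    let prev = GENESIS;
    for (const e of entries) {
      if (e.prevHash !== prev || entryHash(e) !== e.hash) return false;
      prev = e.hash;
    }
    return true;
  }
}
```

Detecting tail truncation would additionally require anchoring the latest hash somewhere the writer cannot modify, such as an external store.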
Browser-based attack surfaces must be closed at the framework level, not left to operator configuration.
import helmet from 'helmet';
import cors from 'cors';

app.use(helmet()); // Enables CSP, X-Frame-Options, HSTS, etc. by default

const allowedOrigins = (process.env.CORS_ORIGINS || '').split(',').filter(Boolean);

app.use(cors({
  origin: allowedOrigins.length > 0 ? allowedOrigins : false,
  methods: ['GET', 'POST'],
  credentials: true
}));
Architecture Rationale: helmet applies a set of widely recommended security headers by default. CORS origins are explicitly enumerated, and an empty list disables cross-origin access rather than falling back to a wildcard. For Azure Container App deployments, this list should contain only the dashboard frontend URL and approved internal tool endpoints. This stops scripts on unlisted origins from reading API responses in a browser. CORS is enforced by the browser, so it does not restrict non-browser clients; those are still gated by the API key and admin token checks.
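Splitting CORS_ORIGINS on commas accepts stray spaces, trailing slashes, and even a literal `*` without complaint. A minimal sketch of stricter parsing follows, assuming the allowlist should hold only bare origins; `parseAllowedOrigins` and `isOriginAllowed` are hypothetical helpers, not part of the cors package.

```typescript
// Parses a comma-separated CORS_ORIGINS value into an exact-match allowlist.
function parseAllowedOrigins(raw: string | undefined): Set<string> {
  const origins = (raw ?? '')
    .split(',')
    .map((o) => o.trim())
    .filter(Boolean);
  for (const o of origins) {
    if (o === '*') throw new Error('Wildcard CORS origin is not allowed');
    // new URL throws on malformed values; comparing against .origin
    // rejects entries that carry a path or a trailing slash.
    if (new URL(o).origin !== o) throw new Error(`Not a bare origin: ${o}`);
  }
  return new Set(origins);
}

// Exact string match against the allowlist; a missing Origin header is not allowed.
function isOriginAllowed(origin: string | undefined, allowed: Set<string>): boolean {
  return origin !== undefined && allowed.has(origin);
}
```

Failing at startup on a malformed entry surfaces configuration mistakes early instead of silently producing an allowlist that never matches.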
Pitfall Guide
1. Trusting Client-Declared Roles
Explanation: Accepting user_role from the request body allows any caller to impersonate privileged accounts.
Fix: Derive authorization context exclusively from server-validated credentials. Strip or ignore role fields in the request payload before retrieval logic executes.
2. Caching Revoked Credentials
Explanation: Storing revoked keys in memory caches or relying on session TTLs creates a window where compromised credentials remain valid.
Fix: Delete the credential hash immediately upon revocation. Validate against the live store on every request. Avoid caching authorization decisions.
3. Blending Management and Query Endpoints
Explanation: Exposing ingestion, key creation, and audit retrieval through the same authentication mechanism as document queries enables lateral privilege escalation.
Fix: Enforce separate headers (X-Admin-Token vs X-API-Key) with distinct validation pipelines. Never allow a query credential to access administrative routes.
4. Storing Plaintext API Keys
Explanation: Persisting raw secrets in databases or logs allows credential extraction during breaches or backup leaks.
Fix: Hash keys using SHA-256 before storage. Return the plaintext value only once during creation. Implement immediate revocation workflows for lost keys.
5. Defaulting to Wildcard CORS
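Explanation: Returning Access-Control-Allow-Origin: * lets a script on any website call your retrieval API and read the response. Browsers refuse to combine a literal wildcard with credentialed requests, so the more dangerous variant is reflecting the request's Origin header back while also setting Access-Control-Allow-Credentials: true, which enables cross-site data exfiltration.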
Fix: Explicitly enumerate allowed origins in environment configuration. Reject requests from unlisted domains. Validate origins at the middleware layer before route handlers execute.
6. Assuming In-Memory Rate Limiting Scales
Explanation: Single-instance counters fail in horizontally scaled deployments, allowing attackers to bypass limits by distributing requests across nodes.
Fix: Migrate to Redis-backed sliding windows or API gateway rate limiting. Implement distributed token buckets that synchronize across instances.
7. Neglecting PII Classification at Ingestion
Explanation: Ingesting unclassified documents exposes sensitive personal or financial data to all authorized roles, violating compliance requirements.
Fix: Implement pre-ingestion scanning using regex patterns or ML-based classifiers. Apply retention policies to prompts and generated answers. Tag chunks with sensitivity levels for downstream RBAC filtering.
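The regex option from Pitfall 7 can be sketched as a small classifier that tags each chunk before it is indexed. The patterns below are illustrative assumptions only and would miss many real-world formats; `classifyChunk`, `PII_PATTERNS`, and the two sensitivity labels are hypothetical names.

```typescript
type Sensitivity = 'public' | 'restricted';

// Illustrative patterns only; a real deployment needs a broader, tested set.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  usSsn: /\b\d{3}-\d{2}-\d{4}\b/,
  cardLike: /\b(?:\d[ -]?){13,16}\b/,
};

// Returns the names of the patterns that matched and a sensitivity tag
// that downstream RBAC filtering can use.
function classifyChunk(text: string): { sensitivity: Sensitivity; matches: string[] } {
  const matches = Object.entries(PII_PATTERNS)
    .filter(([, pattern]) => pattern.test(text))
    .map(([name]) => name);
  return { sensitivity: matches.length > 0 ? 'restricted' : 'public', matches };
}
```

None of the patterns use the global flag, so repeated calls to `test` carry no state between chunks.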
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Single-instance internal deployment | In-memory key registry + local audit logs | Low operational overhead, sufficient for controlled environments | Minimal (no external dependencies) |
| Multi-instance production cluster | Redis-backed credential store + distributed rate limiter | Ensures consistent revocation and limit enforcement across nodes | Moderate (Redis infrastructure + monitoring) |
| External or partner-facing access | Entra ID/OIDC JWT validation + strict CORS + PII scanning | Meets enterprise identity standards and compliance requirements | High (identity provider licensing + scanning pipelines) |
| High-security regulated data | Role-bound API keys + immediate revocation + separate admin/query scopes + immutable audit trails | Eliminates privilege escalation and satisfies audit requirements | Moderate (engineering time for pipeline separation) |
Configuration Template
# Authentication & Authorization
ADMIN_TOKEN=your_strong_admin_secret_here
SECURITY_HEADERS_ENABLED=true
CORS_ORIGINS=https://dashboard.internal.corp,https://tools.internal.corp
# Rate Limiting (adjust for deployment scale)
RATE_LIMIT_PER_MINUTE=60
USE_DISTRIBUTED_RATE_LIMITER=false
# Identity Provider (optional, requires live tenant)
AUTH_PROVIDER=local
# AUTH_PROVIDER=entra
# OIDC_ISSUER=https://login.microsoftonline.com/{tenant}/v2.0
# OIDC_AUDIENCE={client_id}
# Storage & Logging
AUDIT_LOG_RETENTION_DAYS=90
QUERY_LOG_RETENTION_DAYS=30
ENABLE_PII_CLASSIFICATION=false
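Validating these values once at startup avoids discovering a missing secret on the first admin request. The loader below is a sketch under stated assumptions: `loadConfig` and `AppConfig` are hypothetical names, the 32-character minimum for ADMIN_TOKEN is an arbitrary illustrative threshold, and the fallback values are copied from the template above.

```typescript
interface AppConfig {
  adminToken: string;
  corsOrigins: string[];
  rateLimitPerMinute: number;
  authProvider: 'local' | 'entra';
  auditLogRetentionDays: number;
}

// Accepts an env-like object so it can be exercised without touching process.env.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const adminToken = env.ADMIN_TOKEN ?? '';
  if (adminToken.length < 32) {
    // Threshold is an assumption for illustration, not a requirement from the article.
    throw new Error('ADMIN_TOKEN must be set and at least 32 characters');
  }

  const positiveInt = (name: string, fallback: number): number => {
    const value = Number(env[name] ?? fallback);
    if (!Number.isInteger(value) || value <= 0) {
      throw new Error(`${name} must be a positive integer`);
    }
    return value;
  };

  const authProvider = env.AUTH_PROVIDER ?? 'local';
  if (authProvider !== 'local' && authProvider !== 'entra') {
    throw new Error('AUTH_PROVIDER must be "local" or "entra"');
  }

  return {
    adminToken,
    corsOrigins: (env.CORS_ORIGINS ?? '').split(',').map((o) => o.trim()).filter(Boolean),
    rateLimitPerMinute: positiveInt('RATE_LIMIT_PER_MINUTE', 60),
    authProvider,
    auditLogRetentionDays: positiveInt('AUDIT_LOG_RETENTION_DAYS', 90),
  };
}
```

Calling `loadConfig(process.env)` at startup would stop the service from booting with the placeholder token left in place.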
Quick Start Guide
- Initialize the environment: Copy the configuration template to .env. Set a strong ADMIN_TOKEN and define explicit CORS_ORIGINS.
- Start the service: Run the application server. Verify that POST /ingest returns 401 when called without the admin header.
- Generate a query credential: Call the key creation endpoint with the admin token. Store the returned plaintext key securely. It will not be available again.
- Validate access control: Submit a query using the X-API-Key header. Confirm the system resolves the role from the key, ignores any user_role in the body, and returns results filtered by that role.
- Test revocation: Revoke the key via the management endpoint. Immediately retry the query. The system must reject the request on the next attempt with zero delay.