he team level, trended over time, and tied directly to process improvements.
Phase 4: Iterate and Scale
Templates and rituals are refined based on metric feedback and team retrospectives. The system expands to cross-functional groups (product, security, QA) while maintaining consistent safety standards.
Technical Implementation (TypeScript)
The following modules demonstrate how to instrument the framework using type-safe TypeScript. The architecture separates concerns: metrics collection, decision logging, and incident workflow management.
1. Safety Metrics Engine
Tracks behavioral and operational indicators without exposing individual performance data.
interface SafetyMetric {
  id: string;
  teamId: string;
  metricType: 'speak_up_rate' | 'time_to_safe_commit' | 'containment_score' | 'postmortem_quality';
  value: number;
  timestamp: Date;
  metadata: Record<string, unknown>;
}

class SafetyMetricsEngine {
  // Metrics are grouped by team; no per-person identifiers are stored.
  private metrics: Map<string, SafetyMetric[]> = new Map();

  // teamId is passed separately, so it is omitted from the metric payload.
  // crypto.randomUUID() relies on the global Web Crypto object (modern browsers, recent Node.js).
  record(teamId: string, metric: Omit<SafetyMetric, 'id' | 'teamId' | 'timestamp'>): void {
    const entry: SafetyMetric = {
      ...metric,
      teamId,
      id: crypto.randomUUID(),
      timestamp: new Date(),
    };
    const existing = this.metrics.get(teamId) ?? [];
    existing.push(entry);
    this.metrics.set(teamId, existing);
  }

  // Returns the values of one metric type inside the window, oldest first.
  getTrend(teamId: string, metricType: SafetyMetric['metricType'], windowDays: number = 30): number[] {
    const teamMetrics = this.metrics.get(teamId) ?? [];
    const cutoff = new Date();
    cutoff.setDate(cutoff.getDate() - windowDays);
    return teamMetrics
      .filter(m => m.metricType === metricType && m.timestamp >= cutoff)
      .sort((a, b) => a.timestamp.getTime() - b.timestamp.getTime())
      .map(m => m.value);
  }

  // Returns a percentage (0-100). teamId is not used in the calculation; it is kept for API symmetry.
  calculateSpeakUpRate(teamId: string, totalPrompts: number, elicitedResponses: number): number {
    return totalPrompts > 0 ? (elicitedResponses / totalPrompts) * 100 : 0;
  }
}
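The series returned by getTrend() is easier to act on once it is reduced to a direction. The sketch below is one possible way to do that, not part of the engine itself: trendSlope is a hypothetical helper that fits a least-squares line through the values, under the simplifying assumption that samples are evenly spaced in time. The sample values are illustrative only.

```typescript
// Hypothetical helper: least-squares slope of a metric series.
// Assumption: samples are evenly spaced, so the index (0, 1, 2, ...) stands in for time.
function trendSlope(values: number[]): number {
  const n = values.length;
  if (n < 2) return 0; // a single point (or none) has no trend

  const meanX = (n - 1) / 2;
  const meanY = values.reduce((sum, v) => sum + v, 0) / n;

  let numerator = 0;
  let denominator = 0;
  values.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) * (x - meanX);
  });
  return numerator / denominator;
}

// Illustrative series standing in for getTrend() output.
const improving = trendSlope([40, 50, 60, 70]); // positive slope: values rise over time
const flat = trendSlope([55, 55, 55]);          // zero slope: no change
```

A positive slope means the metric is rising across the window; whether rising is good depends on the metric (higher is better for speak-up rate, lower for time to safe commit).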
2. Decision Registry
Enforces structured decision logging with validity windows and explicit ownership.
interface DecisionRecord {
  id: string;
  title: string;
  context: string;
  proposedSolution: string;
  alternativesConsidered: string[];
  evaluationCriteria: string[];
  riskAssessment: 'low' | 'medium' | 'high';
  owner: string;
  rationale: string;
  invalidationTriggers: string[];
  createdAt: Date;
  reviewDate: Date;
  status: 'active' | 'superseded' | 'archived';
}

class DecisionRegistry {
  private decisions: Map<string, DecisionRecord> = new Map();

  // Assigns id, creation time, and initial status; the caller supplies everything else.
  register(decision: Omit<DecisionRecord, 'id' | 'createdAt' | 'status'>): DecisionRecord {
    const record: DecisionRecord = {
      ...decision,
      id: crypto.randomUUID(),
      createdAt: new Date(),
      status: 'active',
    };
    this.decisions.set(record.id, record);
    return record;
  }

  // Marks a decision as superseded, but only if the replacement is itself registered.
  supersede(id: string, replacementId: string): void {
    const current = this.decisions.get(id);
    if (current && this.decisions.has(replacementId)) {
      current.status = 'superseded';
    }
  }

  getActiveDecisions(): DecisionRecord[] {
    return Array.from(this.decisions.values()).filter(d => d.status === 'active');
  }
}
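The registry stores a reviewDate for each decision but does not show how that date gets used. As a minimal sketch, assuming a review is simply "due" once reviewDate has passed, the hypothetical helper dueForReview below filters active decisions by date. ReviewableDecision is an illustrative subset of DecisionRecord, and the sample records are invented for the example.

```typescript
// Minimal shape needed for the check; mirrors three DecisionRecord fields.
interface ReviewableDecision {
  id: string;
  reviewDate: Date;
  status: 'active' | 'superseded' | 'archived';
}

// Hypothetical helper: active decisions whose review date has passed, most overdue first.
function dueForReview<T extends ReviewableDecision>(decisions: T[], now: Date = new Date()): T[] {
  return decisions
    .filter(d => d.status === 'active' && d.reviewDate.getTime() <= now.getTime())
    .sort((a, b) => a.reviewDate.getTime() - b.reviewDate.getTime());
}

// Illustrative records only.
const sampleDecisions: ReviewableDecision[] = [
  { id: 'adr-1', reviewDate: new Date('2024-01-10T00:00:00Z'), status: 'active' },
  { id: 'adr-2', reviewDate: new Date('2024-03-01T00:00:00Z'), status: 'active' },
  { id: 'adr-3', reviewDate: new Date('2024-01-05T00:00:00Z'), status: 'superseded' },
];

// Only adr-1 is both active and past its review date on 1 February 2024.
const overdueDecisions = dueForReview(sampleDecisions, new Date('2024-02-01T00:00:00Z'));
```

Passing the result of getActiveDecisions() into a helper like this would give a team its review queue for a given day.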
3. Blameless Postmortem Pipeline
Manages incident review workflows with strict separation of system analysis and human factors.
type PostmortemState = 'timeline_capture' | 'root_cause_analysis' | 'action_assignment' | 'published';

// States in the only order a report may move through them.
const POSTMORTEM_STATE_ORDER: PostmortemState[] = [
  'timeline_capture',
  'root_cause_analysis',
  'action_assignment',
  'published',
];

interface PostmortemAction {
  id: string;
  description: string;
  owner: string;
  dueDate: Date;
  type: 'corrective' | 'preventive';
  status: 'open' | 'in_progress' | 'completed';
}

interface PostmortemReport {
  id: string;
  incidentId: string;
  state: PostmortemState;
  timeline: string[];
  impact: string;
  systemicRootCauses: string[];       // track 1: system constraints
  humanContributingFactors: string[]; // track 2: human factors, kept separate
  actions: PostmortemAction[];
  culturalLearnings: string[];
  publishedAt?: Date;
}

class PostmortemPipeline {
  private reports: Map<string, PostmortemReport> = new Map();

  initiate(incidentId: string): PostmortemReport {
    const report: PostmortemReport = {
      id: crypto.randomUUID(),
      incidentId,
      state: 'timeline_capture',
      timeline: [],
      impact: '',
      systemicRootCauses: [],
      humanContributingFactors: [],
      actions: [],
      culturalLearnings: [],
    };
    this.reports.set(report.id, report);
    return report;
  }

  // Advances a report by exactly one state; skipping or moving backwards throws.
  transitionState(reportId: string, newState: PostmortemState): void {
    const report = this.reports.get(reportId);
    if (!report) return;
    const expectedIndex = POSTMORTEM_STATE_ORDER.indexOf(report.state) + 1;
    if (POSTMORTEM_STATE_ORDER.indexOf(newState) !== expectedIndex) {
      throw new Error(`Invalid transition: ${report.state} -> ${newState}`);
    }
    report.state = newState;
    if (newState === 'published') {
      report.publishedAt = new Date();
    }
  }

  // New actions always start as 'open'.
  addAction(reportId: string, action: Omit<PostmortemAction, 'id' | 'status'>): void {
    const report = this.reports.get(reportId);
    if (report) {
      report.actions.push({
        ...action,
        id: crypto.randomUUID(),
        status: 'open',
      });
    }
  }
}
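The pipeline accepts actions but does not check them before a report is published. The quick start guide asks for at least one corrective and one preventive action with owners, and the configuration template caps action items at five. The sketch below shows one way those rules could be checked. checkActionsBeforePublish and ActionSummary are hypothetical names, not part of the pipeline.

```typescript
// Illustrative subset of PostmortemAction; only the fields the check needs.
interface ActionSummary {
  description: string;
  owner: string;
  type: 'corrective' | 'preventive';
}

// Hypothetical pre-publish check. Returns a list of problems; an empty list means ready.
// Assumption: the default limit of 5 mirrors maxActionItems in the configuration template.
function checkActionsBeforePublish(actions: ActionSummary[], maxActionItems = 5): string[] {
  const problems: string[] = [];
  if (!actions.some(a => a.type === 'corrective')) problems.push('missing corrective action');
  if (!actions.some(a => a.type === 'preventive')) problems.push('missing preventive action');
  if (actions.some(a => a.owner.trim() === '')) problems.push('action without owner');
  if (actions.length > maxActionItems) problems.push(`more than ${maxActionItems} actions`);
  return problems;
}

// A report with only a preventive action is not ready to publish.
const publishProblems = checkActionsBeforePublish([
  { description: 'Add a canary stage to the deploy pipeline', owner: 'platform-team', type: 'preventive' },
]);
```

A check like this could run just before the transition to 'published', so that an incomplete action list blocks publication instead of being discovered afterwards.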
Architecture Rationale
- TypeScript over dynamic languages: Static typing catches mismatched metric types and malformed decision records at compile time, which helps keep decision logs structurally consistent across distributed teams.
- Event-driven metric collection: Decouples safety tracking from HR systems, allowing engineering tooling (CI/CD, issue trackers, chat platforms) to feed data directly into the metrics engine.
- State machine for postmortems: Enforces a blameless progression. Teams cannot jump to action assignment without completing timeline and root cause analysis, preventing premature conclusions.
- Validity windows for decisions: Explicit reviewDate and invalidationTriggers prevent decision rot. Teams regularly reassess architectural choices as system constraints evolve.
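The event-driven collection described in the list above could be sketched as a small in-process publish/subscribe bus. This is an illustration under stated assumptions, not the framework's actual wiring: MetricEvent, MetricEventBus, and the source field are hypothetical names, and a real deployment would more likely sit behind a message queue or webhook receiver.

```typescript
// Hypothetical event shape emitted by tooling such as CI or an issue tracker.
interface MetricEvent {
  teamId: string;
  metricType: 'speak_up_rate' | 'time_to_safe_commit' | 'containment_score' | 'postmortem_quality';
  value: number;
  source: string; // e.g. 'ci', 'issue_tracker', 'chat'
}

type MetricListener = (event: MetricEvent) => void;

// Minimal in-process publish/subscribe bus (an assumption for illustration).
class MetricEventBus {
  private listeners: MetricListener[] = [];

  // Registers a listener and returns a function that removes it again.
  subscribe(listener: MetricListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter(l => l !== listener);
    };
  }

  // Delivers the event synchronously to every current listener.
  publish(event: MetricEvent): void {
    for (const listener of this.listeners) {
      listener(event);
    }
  }
}

const bus = new MetricEventBus();
const received: MetricEvent[] = [];
const unsubscribe = bus.subscribe(event => {
  received.push(event);
});

bus.publish({ teamId: 'payments', metricType: 'time_to_safe_commit', value: 36, source: 'ci' });
unsubscribe();
// This second event is not delivered because the listener was removed.
bus.publish({ teamId: 'payments', metricType: 'time_to_safe_commit', value: 30, source: 'ci' });
```

In practice the listener would forward each event to SafetyMetricsEngine.record() rather than to an array, which keeps the producing tools unaware of how metrics are stored.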
Pitfall Guide
| Pitfall | Explanation | Fix |
|---|---|---|
| Ritual Theater | Meetings and check-ins are scheduled but produce no behavioral change or actionable output. Teams treat safety as a calendar event rather than a workflow constraint. | Tie every ritual to a concrete artifact. Daily check-ins must produce a risk register. Code reviews must log safety improvements. If a ritual generates no traceable output, sunset it. |
| Leadership Disconnect | Managers preach safety but model blame, interrupt junior engineers, or hide their own mistakes. This creates a trust deficit that no template can fix. | Leadership must publicly document uncertainties, share postmortems of their own decisions, and explicitly invite challenge during design reviews. Track leadership participation in safety metrics. |
| Blame Leakage | Postmortems or retrospectives subtly shift focus to individual actions ("who deployed this?") rather than system constraints ("why did the pipeline allow this?"). | Enforce a strict two-track analysis: systemic root causes and human contributing factors. Use language filters in documentation tools to flag accusatory phrasing. Rotate facilitation to prevent authority bias. |
| Communication Over-Standardization | Teams enforce rigid speaking turns or mandatory participation, which alienates introverted engineers or those in different time zones. | Provide multiple participation channels: synchronous discussion, asynchronous written notes, and anonymous feedback forms. Measure comfort levels, not just vocal volume. |
| Metric Gaming | Teams optimize for high speak-up rates or fast postmortem closure by logging superficial concerns or rushing analysis. | Track trend lines over 90-day windows, not daily snapshots. Pair quantitative metrics with qualitative retrospectives. Audit metric inputs for authenticity; flag sudden spikes for review. |
| Async-Only Isolation | Remote teams rely exclusively on text-based communication, losing nuance and delaying trust-building. | Schedule lightweight synchronous touchpoints for complex discussions. Record async sessions for later review. Use reaction-based feedback in chat to lower participation barriers. |
| Template Rigidity | Teams apply identical postmortem or decision templates across all contexts, ignoring domain-specific constraints (e.g., embedded systems vs. web APIs). | Maintain a core template structure but allow domain-specific extensions. Review template effectiveness quarterly. Let teams vote on modifications based on delivery feedback. |
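
The language filter suggested for the Blame Leakage pitfall can be approximated with a few regular expressions. The sketch below is deliberately simple, and the pattern list is an illustrative assumption rather than a vetted vocabulary; a team would need to agree on its own phrases and expect false positives. flagAccusatoryLines and ACCUSATORY_PATTERNS are hypothetical names.

```typescript
// Illustrative patterns only; a real list would be agreed on by the team.
const ACCUSATORY_PATTERNS: RegExp[] = [
  /\bwho (deployed|broke|caused|approved)\b/i,
  /\b(fault|blame|careless|negligent|negligence)\b/i,
  /\bshould have known\b/i,
];

interface FlaggedLine {
  lineNumber: number; // 1-based position in the draft
  text: string;
}

// Returns every line of the draft that matches at least one pattern.
function flagAccusatoryLines(draft: string): FlaggedLine[] {
  return draft
    .split('\n')
    .map((text, index) => ({ lineNumber: index + 1, text }))
    .filter(line => ACCUSATORY_PATTERNS.some(pattern => pattern.test(line.text)));
}

const flaggedLines = flagAccusatoryLines(
  [
    'The pipeline allowed an unreviewed config change to reach production.',
    'Who deployed this without checking?',
    'Rollback took 12 minutes because the runbook was out of date.',
  ].join('\n'),
);
// Only the second line is flagged; the other two describe system behavior.
```

Flagged lines are prompts for the facilitator to rephrase, not verdicts; the goal is to redirect the question toward what the system allowed.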
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Small team (<8 engineers) with established trust | Light-touch rituals + async decision logging | Heavy frameworks create overhead; trust is already present | Low (minimal tooling investment) |
| Remote-first distributed team | Async-first rituals + recorded synchronous touchpoints + multi-channel participation | Time zone fragmentation requires flexible participation models | Medium (tooling for recording & async tracking) |
| Compliance-heavy environment (SOC2, HIPAA, etc.) | Safety rituals aligned with audit trails + transparent process documentation | Regulatory requirements demand traceability; safety metrics can double as compliance evidence | Medium-High (audit integration & documentation overhead) |
| High-velocity startup | Rapid iteration of rituals + weekly metric reviews + leadership modeling | Speed requires fast feedback loops; safety prevents technical debt accumulation | Low-Medium (focus on lightweight templates & CI integration) |
Configuration Template
Copy this TypeScript configuration to initialize the safety framework in your engineering repository. It exports a typed configuration object containing default metric thresholds, ritual settings, and pipeline state definitions.
// safety-framework.config.ts
export const SAFETY_CONFIG = {
  metrics: {
    thresholds: {
      // Rates and scores are fractions in [0, 1]. calculateSpeakUpRate returns a
      // percentage, so divide its result by 100 before comparing.
      speakUpRate: { target: 0.75, warning: 0.5 },
      timeToSafeCommit: { target: 48, unit: 'hours' },
      containmentScore: { target: 0.85, warning: 0.6 },
      postmortemQuality: { target: 4.0, scale: 5 },
    },
    aggregationWindow: 30, // days
  },
  rituals: {
    dailyCheckIn: { durationMinutes: 5, requiredOutput: 'risk_register' },
    codeReviewSafety: { durationMinutes: 60, checklist: ['security', 'reliability', 'observability', 'clarity', 'edge_cases'] },
    askAnything: { durationMinutes: 15, frequency: 'weekly', rotationRequired: true },
  },
  postmortem: {
    // Same order as the PostmortemState progression.
    states: ['timeline_capture', 'root_cause_analysis', 'action_assignment', 'published'],
    maxActionItems: 5,
    requiredOwners: true,
  },
  decisions: {
    reviewInterval: 14, // days
    // Names match the corresponding DecisionRecord fields.
    requiredFields: ['context', 'alternativesConsidered', 'evaluationCriteria', 'riskAssessment', 'invalidationTriggers'],
  },
} as const; // keeps literal types, so SafetyConfig is precise and read-only

export type SafetyConfig = typeof SAFETY_CONFIG;
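The target and warning thresholds only become useful once a measured value is classified against them. As a minimal sketch, assuming that higher values are better (true for speakUpRate and containmentScore, but not for timeToSafeCommit), a classifier could look like this. evaluateThreshold, HigherIsBetterThreshold, and ThresholdStatus are hypothetical names, not part of the configuration.

```typescript
// Threshold shape assumed from the speakUpRate and containmentScore entries.
interface HigherIsBetterThreshold {
  target: number;
  warning: number;
}

type ThresholdStatus = 'on_target' | 'warning' | 'critical';

// Hypothetical classifier for metrics where larger values are better.
function evaluateThreshold(value: number, threshold: HigherIsBetterThreshold): ThresholdStatus {
  if (value >= threshold.target) return 'on_target';
  if (value >= threshold.warning) return 'warning';
  return 'critical';
}

// Values copied from the speakUpRate entry in the template.
const speakUpThreshold: HigherIsBetterThreshold = { target: 0.75, warning: 0.5 };

const healthy = evaluateThreshold(0.8, speakUpThreshold);     // at or above target
const slipping = evaluateThreshold(0.6, speakUpThreshold);    // between warning and target
const needsAttention = evaluateThreshold(0.3, speakUpThreshold); // below warning
```

Metrics where lower is better, such as time to safe commit, would need the comparisons reversed or a separate classifier.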
Quick Start Guide
- Initialize the registry: Add the configuration template to your repository. Run npm init, install TypeScript (npm install --save-dev typescript), and compile the safety modules (for example with npx tsc --noEmit) to confirm that they type-check.
- Instrument your first ritual: Add the daily safety check-in to your team calendar. Require a single risk register entry per session. Log each entry in your issue tracker and record the corresponding metric with SafetyMetricsEngine.record().
- Deploy the postmortem pipeline: Create a new incident review using PostmortemPipeline.initiate(). Follow the state machine strictly. Assign at least one corrective and one preventive action with owners and due dates.
- Launch baseline measurement: Run the first anonymous safety survey. Record the psychological safety index. Compare against the 30-day trend window to establish your starting point.
- Review and iterate: After 14 days, audit ritual outputs and metric trends. Adjust thresholds, retire ineffective templates, and document changes in the decision registry. Schedule the next review cycle.