, models substitution probability, and dynamically adjusts feature exposure based on economic impact. The implementation follows four sequential phases.
Step 1: Isolate AI Features Behind Dynamic Routing
Hardcoded feature flags create brittle deployment boundaries. Replace them with a journey-aware routing service that evaluates user context, workflow state, and substitution probability before serving either the legacy or AI path. The router must support weighted traffic splitting, fallback chains, and cost-aware suppression.
// router-service.ts
import { FeatureFlagClient } from '@codcompass/flags';
import { SubstitutionModel } from './substitution-model';

export interface RoutingContext {
  userId: string;
  featureId: string;
  legacyWorkflow: string;
  aiWorkflow: string;
  costThreshold: number; // compared against the inference_cost_usd metric
}

export class CannibalizationRouter {
  constructor(
    private flags: FeatureFlagClient,
    private model: SubstitutionModel
  ) {}

  async resolve(context: RoutingContext): Promise<'legacy' | 'ai' | 'fallback'> {
    const flagState = await this.flags.getVariant(context.userId, context.featureId);
    const substitutionProb = await this.model.predict(context.userId, context.legacyWorkflow);

    // Likely migrators on the AI variant are subject to the cost gate.
    if (flagState === 'ai' && substitutionProb > 0.65) {
      const estimatedCost = await this.flags.getMetric('inference_cost_usd', context.userId);
      if (estimatedCost > context.costThreshold) return 'fallback';
      return 'ai';
    }

    // Everyone else follows the flag variant without a cost check.
    return flagState === 'ai' ? 'ai' : 'legacy';
  }
}
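The router above does not show the weighted traffic splitting mentioned in this step. One way it might be sketched is a deterministic, hash-based split, so that a given user always lands in the same bucket for a given feature. The names `WeightedRoute`, `bucketFor`, and `pickRoute` are hypothetical helpers introduced for illustration; they are not part of `@codcompass/flags`.

```typescript
// Hypothetical helpers for illustration; not part of any published package.
export type Route = 'legacy' | 'ai' | 'fallback';

export interface WeightedRoute {
  route: Route;
  weight: number; // relative weight, expected to be >= 0
}

// FNV-1a-style 32-bit hash over UTF-16 code units, mapped into [0, 1).
export function bucketFor(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) / 0x100000000;
}

// Picks a route by walking cumulative weights. The same (featureId, userId)
// pair always maps to the same point, so assignment is sticky across requests.
export function pickRoute(
  userId: string,
  featureId: string,
  routes: WeightedRoute[]
): Route {
  const total = routes.reduce((sum, r) => sum + r.weight, 0);
  // Assumption: with no usable weights, default to the legacy path.
  if (total <= 0) return 'legacy';

  const point = bucketFor(`${featureId}:${userId}`) * total;
  let cumulative = 0;
  let last: Route = 'legacy';
  for (const r of routes) {
    cumulative += r.weight;
    last = r.route;
    if (point < cumulative) return r.route;
  }
  return last; // guards against floating-point rounding at the upper edge
}
```

Hashing on the feature ID as well as the user ID keeps bucket assignments independent across features, which matters when several AI features are rolled out at the same time.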
Step 2: Implement Cross-Feature Telemetry Pipeline
Traditional analytics track clicks and completions. Cannibalization requires journey-level transition events. Emit structured events that capture workflow entry, exit, abandonment, and cross-feature handoff. Route these events through an event bus to a substitution analytics module.
// telemetry-interceptor.ts
import { EventEmitter } from 'events';

export interface TransitionEvent {
  eventType: 'workflow_enter' | 'workflow_exit' | 'cross_feature_handoff' | 'abandon';
  userId: string;
  sourceWorkflow: string;
  targetWorkflow?: string;
  timestamp: number;
  metadata: Record<string, unknown>;
}

export class JourneyTelemetry {
  private emitter = new EventEmitter();

  track(event: TransitionEvent): void {
    this.emitter.emit('transition', event);
  }

  onTransition(handler: (event: TransitionEvent) => void): void {
    this.emitter.on('transition', handler);
  }
}
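The substitution analytics module that consumes these events is not shown. As a rough sketch of what a consumer might do, the aggregator below counts workflow entries and cross-feature handoffs and derives a handoff rate per workflow pair. `HandoffAggregator` and its method names are assumptions made for illustration; the event shape mirrors the `TransitionEvent` interface above and is repeated so the block stands alone.

```typescript
// Mirrors the TransitionEvent interface from telemetry-interceptor.ts.
interface TransitionEvent {
  eventType: 'workflow_enter' | 'workflow_exit' | 'cross_feature_handoff' | 'abandon';
  userId: string;
  sourceWorkflow: string;
  targetWorkflow?: string;
  timestamp: number;
  metadata: Record<string, unknown>;
}

// Hypothetical consumer: keeps in-memory counts only, no persistence.
export class HandoffAggregator {
  private enters = new Map<string, number>();
  private handoffs = new Map<string, number>();

  record(event: TransitionEvent): void {
    if (event.eventType === 'workflow_enter') {
      const count = this.enters.get(event.sourceWorkflow) ?? 0;
      this.enters.set(event.sourceWorkflow, count + 1);
    } else if (event.eventType === 'cross_feature_handoff' && event.targetWorkflow) {
      const key = `${event.sourceWorkflow}->${event.targetWorkflow}`;
      this.handoffs.set(key, (this.handoffs.get(key) ?? 0) + 1);
    }
  }

  // Share of entries into `source` that were followed by a handoff to `target`.
  // Returns 0 when no entries into `source` have been recorded.
  handoffRate(source: string, target: string): number {
    const entered = this.enters.get(source) ?? 0;
    if (entered === 0) return 0;
    return (this.handoffs.get(`${source}->${target}`) ?? 0) / entered;
  }
}
```

Wiring it up would be a one-liner along the lines of `telemetry.onTransition(e => aggregator.record(e))`. A production pipeline would more likely write to an event bus and aggregate in a warehouse, as the step describes.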
Step 3: Build Substitution Probability Model
Substitution is not binary. It follows a Markov-like transition pattern where users gradually shift workflows based on latency, accuracy, pricing, and habit. Train a lightweight logistic or gradient-boosted model on transition events to predict migration probability per user. Retrain weekly on a rolling window so the weights track recent behavior and the effect of concept drift stays limited.
// substitution-model.ts
export class SubstitutionModel {
  private weights: Map<string, number> = new Map();
  private decayFactor = 0.92;

  async predict(userId: string, legacyWorkflow: string): Promise<number> {
    const historical = await this.fetchUserHistory(userId, legacyWorkflow);
    const score = this.calculateTransitionScore(historical);
    return this.applyDecay(score);
  }

  private calculateTransitionScore(history: Array<{ event: string; ts: number }>): number {
    // Simplified weighted-ratio scoring (not a fitted logistic model);
    // production uses trained coefficients
    const crossHandoffs = history.filter(e => e.event === 'cross_feature_handoff').length;
    const abandons = history.filter(e => e.event === 'abandon').length;
    const total = history.length || 1;
    const raw = (crossHandoffs * 0.7 - abandons * 0.3) / total;
    // Clamp to [0, 1]: abandon-heavy histories would otherwise go negative
    return Math.max(0, Math.min(1, raw));
  }

  private applyDecay(score: number): number {
    return score * this.decayFactor;
  }

  private async fetchUserHistory(
    userId: string,
    workflow: string
  ): Promise<Array<{ event: string; ts: number }>> {
    // Placeholder: fetch from telemetry warehouse
    return [];
  }
}
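The class above uses a hand-weighted ratio as a stand-in. For comparison, here is a minimal sketch of the logistic form the text refers to, assuming three per-user features whose names are borrowed from the configuration template in this article. The coefficient values are placeholders chosen for illustration and were not fitted to any data.

```typescript
// Hypothetical feature vector; names follow the config template's feature list.
export interface TransitionFeatures {
  crossFeatureHandoffCount: number;
  abandonRate: number;         // expected range 0..1
  sessionOverlapRatio: number; // expected range 0..1
}

export interface LogisticCoefficients {
  intercept: number;
  crossFeatureHandoffCount: number;
  abandonRate: number;
  sessionOverlapRatio: number;
}

// Standard logistic function: maps any real number into (0, 1).
export function sigmoid(z: number): number {
  return 1 / (1 + Math.exp(-z));
}

// Linear combination of features passed through the sigmoid.
export function predictMigration(
  features: TransitionFeatures,
  coef: LogisticCoefficients
): number {
  const z =
    coef.intercept +
    coef.crossFeatureHandoffCount * features.crossFeatureHandoffCount +
    coef.abandonRate * features.abandonRate +
    coef.sessionOverlapRatio * features.sessionOverlapRatio;
  return sigmoid(z);
}

// Placeholder values for illustration only; NOT trained coefficients.
export const EXAMPLE_COEFFICIENTS: LogisticCoefficients = {
  intercept: -2,
  crossFeatureHandoffCount: 0.5,
  abandonRate: -1.5,
  sessionOverlapRatio: 1,
};
```

Unlike the weighted ratio, the sigmoid output always falls strictly between 0 and 1, so no clamping is needed before the router compares it against a threshold.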
Step 4: Implement Economic Gating & Fallback Chains
Substitution must be bounded by unit economics. If inference cost exceeds the LTV delta gained from migration, the router suppresses AI routing and falls back to legacy or hybrid paths. Implement circuit breakers that trigger on cost thresholds, latency spikes, or model degradation. Fallback chains should prioritize user continuity over feature purity.
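The gating rule can be written down as a small pure function. The sketch below assumes the LTV delta is weighted by the user's substitution probability and that cost and LTV are measured over the same time horizon; both are modeling assumptions, and the names `EconomicInputs`, `expectedNetGainUsd`, and `shouldServeAi` are hypothetical.

```typescript
// Hypothetical inputs; all monetary values are assumed to cover the same horizon.
export interface EconomicInputs {
  inferenceCostUsd: number; // expected inference spend for this user
  ltvDeltaUsd: number;      // LTV gained if the user migrates to the AI workflow
  substitutionProb: number; // predicted migration probability, 0..1
}

// Expected gain from serving the AI path, net of inference cost.
// Assumption: the LTV delta is only realized if the user actually migrates.
export function expectedNetGainUsd(inputs: EconomicInputs): number {
  return inputs.substitutionProb * inputs.ltvDeltaUsd - inputs.inferenceCostUsd;
}

// Serve the AI path only when the expected gain is strictly positive.
export function shouldServeAi(inputs: EconomicInputs): boolean {
  return expectedNetGainUsd(inputs) > 0;
}
```

For example, with a 0.5 migration probability and a 10 USD LTV delta, the expected gain is 5 USD, so an inference cost of 4 USD passes the gate and 6 USD does not.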
Architecture decisions center on decoupling. The routing service communicates with the telemetry pipeline via asynchronous events, not synchronous calls. This prevents AI inference latency from blocking core product rendering. The substitution model runs offline on aggregated events, feeding predictions back to the router via a low-latency key-value store. Pricing and tier logic sit outside the routing layer, reading substitution probabilities to adjust feature visibility or usage caps. This separation ensures that economic decisions do not couple with real-time request paths, maintaining sub-50ms routing SLAs.
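To make the offline-model-to-router handoff concrete, the sketch below reads a precomputed probability from a key-value store and falls back to a default when no prediction exists yet. The `PredictionStore` interface, the key format, and the in-memory implementation are all assumptions for illustration; a real deployment would back the interface with whatever low-latency store is already in use.

```typescript
// Hypothetical store interface; a real backend would implement the same shape.
export interface PredictionStore {
  get(key: string): Promise<number | undefined>;
  set(key: string, value: number): Promise<void>;
}

// In-memory stand-in, useful for tests and local development only.
export class InMemoryPredictionStore implements PredictionStore {
  private data = new Map<string, number>();

  async get(key: string): Promise<number | undefined> {
    return this.data.get(key);
  }

  async set(key: string, value: number): Promise<void> {
    this.data.set(key, value);
  }
}

// Assumed key layout: one prediction per (legacy workflow, user) pair.
export function predictionKey(userId: string, legacyWorkflow: string): string {
  return `subst:${legacyWorkflow}:${userId}`;
}

// Router-side read. A missing prediction yields `defaultProb`, so users the
// offline job has not scored yet are treated as unlikely to migrate.
export async function lookupSubstitutionProb(
  store: PredictionStore,
  userId: string,
  legacyWorkflow: string,
  defaultProb = 0
): Promise<number> {
  const value = await store.get(predictionKey(userId, legacyWorkflow));
  return value ?? defaultProb;
}
```

Reading a cached prediction instead of scoring on the request path is what keeps the routing decision independent of model latency.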
Pitfall Guide
- Treating AI as purely additive
  AI workflows replace cognitive and operational steps. Assuming compounding engagement ignores attention economics. Best practice: Model every AI feature as a potential zero-sum replacement. Track workflow substitution rates, not just adoption rates.
- Ignoring cross-feature attribution
  Siloed dashboards cannot detect cannibalization. Users abandon legacy features while adopting AI, but total session time remains flat. Best practice: Implement journey-level telemetry that maps entry, exit, and handoff events across all product modules. Attribute revenue to workflow transitions, not isolated clicks.
- Hardcoding AI routing logic
  Static feature flags create brittle deployment boundaries. When model performance degrades or costs spike, hardcoded routes continue serving expensive paths. Best practice: Use probabilistic routing with cost thresholds and fallback chains. Evaluate routing decisions against real-time unit economics.
- Optimizing for AI adoption rate
  High adoption with low substitution yield indicates dual-execution overhead. Users run both workflows, inflating costs without increasing retention. Best practice: Optimize for net revenue per user and substitution velocity. Deprioritize adoption metrics that do not correlate with LTV delta.
- Skipping economic modeling
  Engineering teams track latency and accuracy. Product teams track engagement. Finance tracks margin. Cannibalization sits at the intersection. Best practice: Build a substitution LTV model that weights inference cost, legacy feature margin, and migration probability. Gate AI exposure when marginal cost exceeds marginal retention gain.
- Neglecting graceful degradation
  AI models fail silently. Routing logic that lacks fallback chains degrades user experience and accelerates churn. Best practice: Implement circuit breakers with tiered fallbacks: AI -> hybrid -> legacy -> cached response. Monitor degradation signals (latency p99, error rate, model confidence) and trigger fallbacks automatically.
- Over-collecting substitution data
  Tracking every micro-interaction violates privacy boundaries and inflates storage costs. Best practice: Use privacy-first event schemas. Emit aggregated transition signals instead of raw session traces. Apply differential privacy or k-anonymity thresholds for user-level predictions. Retain only what feeds the substitution model or routing decision.
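The tiered fallback described under "Neglecting graceful degradation" can be sketched as a walk down the chain that stops at the first tier whose signals are within limits. This is a stateless simplification of a circuit breaker (no open or half-open states, no cool-down timers), and the type names, limits, and the final `'cached'` default are assumptions made for illustration.

```typescript
export type Tier = 'ai' | 'hybrid' | 'legacy' | 'cached';

// Degradation signals named in the pitfall: latency p99, error rate, confidence.
export interface HealthSignals {
  latencyP99Ms: number;
  errorRate: number;       // 0..1
  modelConfidence: number; // 0..1; use 1 for tiers without a model
}

export interface HealthLimits {
  maxLatencyP99Ms: number;
  maxErrorRate: number;
  minModelConfidence: number;
}

export function isHealthy(signals: HealthSignals, limits: HealthLimits): boolean {
  return (
    signals.latencyP99Ms <= limits.maxLatencyP99Ms &&
    signals.errorRate <= limits.maxErrorRate &&
    signals.modelConfidence >= limits.minModelConfidence
  );
}

// Returns the first tier in `chain` that is healthy. Tiers with no reported
// signals are assumed healthy (for example, a static cached response).
export function selectTier(
  chain: Tier[],
  health: Partial<Record<Tier, HealthSignals>>,
  limits: HealthLimits
): Tier {
  for (const tier of chain) {
    const signals = health[tier];
    if (!signals || isHealthy(signals, limits)) return tier;
  }
  // Assumption: when every tier is degraded, serve the cached response.
  return 'cached';
}
```

A stateful breaker would additionally remember recent failures per tier and skip a tripped tier for a cool-down period instead of re-evaluating it on every request.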
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| High AI adoption, low substitution | Suppress AI routing, enforce workflow exclusivity | Dual execution inflates inference cost without improving retention | -35% inference spend |
| Model latency > 800ms p99 | Trigger fallback to legacy workflow | Latency degrades substitution probability and increases abandonment | -18% cloud compute |
| LTV delta negative after 30 days | Roll back to hybrid routing, adjust pricing | Economic gating prevents margin erosion during migration | +12% gross margin |
| Control group contamination > 15% | Isolate telemetry pipelines, harden flag boundaries | Spillover invalidates ROI calculations and misguides roadmap | +22% experiment accuracy |
| Substitution rate > 40% in cohort | Accelerate AI-only routing, retire legacy feature | High migration velocity justifies infrastructure consolidation | -28% maintenance overhead |
Configuration Template
# cannibalization-config.yaml
routing:
  feature_id: "ai-assistant-v2"
  strategy: probabilistic
  thresholds:
    substitution_probability: 0.65
    inference_cost_usd: 3.50
    latency_p99_ms: 800
  fallback_chain:
    - ai
    - hybrid
    - legacy
    - cached
telemetry:
  event_schema: journey_transition
  retention_days: 90
  privacy_mode: k_anonymity
  k_threshold: 5
model:
  type: logistic
  update_frequency: weekly
  decay_factor: 0.92
  features:
    - cross_feature_handoff_count
    - abandon_rate
    - session_overlap_ratio
    - latency_per_transition
economics:
  ltv_delta_threshold: 1.15
  margin_floor_percent: 28
  cost_cap_usd_per_retained_user: 2.00
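Once the template has been parsed into a plain object (by whichever YAML parser the project already uses), a typed shape plus a few sanity checks can catch mistakes before the router reads it. The sketch below assumes the four top-level sections implied by the key names (`routing`, `telemetry`, `model`, `economics`). The interface and the validation ranges are assumptions for illustration, not a published schema.

```typescript
// Assumed shape of the parsed cannibalization-config.yaml.
export interface CannibalizationConfig {
  routing: {
    feature_id: string;
    strategy: string;
    thresholds: {
      substitution_probability: number;
      inference_cost_usd: number;
      latency_p99_ms: number;
    };
    fallback_chain: string[];
  };
  telemetry: {
    event_schema: string;
    retention_days: number;
    privacy_mode: string;
    k_threshold: number;
  };
  model: {
    type: string;
    update_frequency: string;
    decay_factor: number;
    features: string[];
  };
  economics: {
    ltv_delta_threshold: number;
    margin_floor_percent: number;
    cost_cap_usd_per_retained_user: number;
  };
}

// Returns a list of human-readable problems; an empty list means the
// checked fields look sane. The ranges are illustrative assumptions.
export function validateConfig(config: CannibalizationConfig): string[] {
  const errors: string[] = [];
  const { thresholds, fallback_chain } = config.routing;

  if (thresholds.substitution_probability < 0 || thresholds.substitution_probability > 1) {
    errors.push('routing.thresholds.substitution_probability must be within [0, 1]');
  }
  if (thresholds.inference_cost_usd <= 0) {
    errors.push('routing.thresholds.inference_cost_usd must be positive');
  }
  if (thresholds.latency_p99_ms <= 0) {
    errors.push('routing.thresholds.latency_p99_ms must be positive');
  }
  if (fallback_chain.length === 0) {
    errors.push('routing.fallback_chain must list at least one tier');
  }
  if (config.model.decay_factor <= 0 || config.model.decay_factor > 1) {
    errors.push('model.decay_factor must be within (0, 1]');
  }
  if (config.economics.margin_floor_percent < 0 || config.economics.margin_floor_percent > 100) {
    errors.push('economics.margin_floor_percent must be within [0, 100]');
  }
  return errors;
}
```

Returning a list of messages instead of throwing on the first problem lets a deploy script report every misconfigured field in one pass.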
Quick Start Guide
- Install the routing and telemetry packages:
  npm install @codcompass/router @codcompass/telemetry
- Create a cannibalization-config.yaml matching your workflow IDs, cost thresholds, and fallback chain.
- Attach the JourneyTelemetry interceptor to your frontend and API gateways to emit transition events.
- Deploy the CannibalizationRouter behind your feature flag service, pointing it to the config and substitution model endpoint.
- Run a controlled rollout with 10% traffic, monitor substitution probability and inference cost per retained user, then scale routing weights as LTV delta stabilizes above threshold.