solution (MTTR) and enables fully autonomous AI remediation loops.
| Approach | Time to Root Cause | Manual Steps Required | AI/Agent Actionability | Context Accuracy (Construct Mapping) |
|---|
| Traditional Pipeline Debugging | 45β90 mins | 6β8 UI/console clicks | 0% (CFN-only context) | Low (Logical ID β Mental Translation) |
Local cdk deploy Fallback | 15β30 mins | 3β5 steps (sync/retry) | 30% (Partial CLI context) | Medium (Construct path only) |
cdk diagnose + AI Agent | 2β5 mins | 1 CLI command | 95% (Full CDK + source context) | High (Path + Line Number + Error) |
Key Findings:
cdk diagnose reduces root-cause identification time by ~85% by eliminating console navigation and manual CFN-to-CDK translation.
- AI agents achieve >90% actionability when fed
cdk diagnose output, as the tool provides the exact construct path, source file/line, and failure reason required for precise code generation.
- Sweet Spot: The tool is optimal for pipeline-deployed stacks, cross-account/region deployments, and AI-agent-driven remediation workflows where local CLI interception is unavailable.
Core Solution
cdk diagnose is a CDK CLI subcommand that inspects a CloudFormation stack's last failed deployment and surfaces the root cause with CDK-aware context. It queries CloudFormation directly via DescribeChangeSet and related APIs, then enriches the raw error using CDK metadata (aws:cdk:path) baked into the template during synthesis. This mapping bridges CloudFormation logical IDs back to the construct tree and original source files.
Architecture & Implementation Details:
- Deployment-Agnostic: Works regardless of deployment method (CDK Pipelines, CodePipeline, direct CFN API calls, or manual console). If the stack exists and failed, the tool can diagnose it.
- Metadata-Driven Mapping: Leverages the
aws:cdk:path resource tag/metadata generated during cdk synth to reconstruct the construct hierarchy and locate the exact source file and line number.
- Actionable Output: Returns a structured report containing the failed CFN resource, construct path, source location, and contextual hints for remediation.
CLI Usage:
cdk --unstable=diagnose diagnose MyStack
Real-World Example: The CDK Upgrade That Breaks Everything
The following scenario reflects a P0 issue impacting hundreds of CDK users (aws-cdk#34612). The developer wrote valid CDK code, but a CloudFormation state conflict caused deployment failure.
import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
export class MyAppStack extends cdk.Stack {
constructor(scope, id, props) {
super(scope, id, props);
new lambda.Function(this, 'MyFunction', {
runtime: lambda.Runtime.NODEJS_20_X,
handler: 'index.handler',
code: lambda.Code.fromAsset('lambda'),
logRetention: cdk.aws_logs.RetentionDays.ONE_WEEK,
});
}
}
Running cdk diagnose MyStack on a failed pipeline deployment surfaces:
β MyFunction
π LogGroup already exists
π lib/my-stack.ts:8:5
This output provides the exact construct path, the CFN rejection reason, and the precise source location, enabling both developers and AI agents to apply targeted fixes (e.g., adjusting removal policies, importing existing resources, or modifying feature flags) without guessing.
Pitfall Guide
- Feeding Raw CFN Errors to AI Agents: LLMs lack CDK context when given CloudFormation YAML or console error messages. Always pipe
cdk diagnose output to AI agents to ensure they modify CDK source, not synthesized templates.
- Assuming
cdk synth Success Guarantees Deployment Success: Synthesis only validates schema and syntax. Runtime conflicts (e.g., existing resources, IAM permissions, service limits) only surface during CloudFormation execution. Always validate pipeline deployments with diagnostic tooling.
- Stripping or Overriding
aws:cdk:path Metadata: The diagnostic engine relies entirely on CDK metadata baked during synthesis. Custom CloudFormation transforms, manual template edits, or CI/CD steps that strip metadata will break construct mapping.
- Misinterpreting CFN Logical IDs: Logical IDs are often hashed or auto-generated. Direct string matching to construct names fails. Always use the construct path (
Stack/Construct/SubConstruct) provided by cdk diagnose for accurate code navigation.
- Skipping ChangeSet Inspection:
cdk diagnose queries the failed change set. If pipelines are configured to skip change sets or auto-apply without retaining history, diagnostic context is lost. Ensure change sets are retained for post-mortem analysis.
- Hardcoding Resource Names/Identifiers: Explicit naming without proper lifecycle management causes "already exists" errors. Use CDK references, implicit naming, or explicit
RemovalPolicy configurations to avoid state conflicts.
- Running Diagnostics Against Stale/Deleted Stacks:
cdk diagnose requires an existing stack in a failed state. Running it against successfully deployed, deleted, or rolled-back stacks returns empty or misleading results. Verify stack status (CFN console or aws cloudformation describe-stacks) before diagnosing.
Deliverables
- Autonomous CDK Remediation Blueprint: A step-by-step architecture guide for integrating
cdk diagnose into CI/CD pipelines and AI agent workflows, including state machine diagrams for human-in-the-loop vs. fully autonomous remediation loops.
- Pipeline Diagnosis Readiness Checklist: A 12-point validation checklist covering metadata preservation, change set retention, IAM permissions for
DescribeChangeSet, and AI prompt templating for safe CDK code generation.
- CI/CD Integration Configuration Template: Ready-to-use GitHub Actions, GitLab CI, and AWS CodePipeline YAML snippets that automatically trigger
cdk diagnose on deployment failure, parse the output, and route actionable context to Slack, Jira, or AI remediation agents.