Learn AWS IAM by Solving 12 Policy Puzzles in the Browser

Architecting Browser-Based Cloud Simulators: State Machines, Canvas Layers, and Real-Time Validation

Current Situation Analysis

Complex cloud services like AWS IAM, networking, and resource policies present a steep learning curve. The evaluation logic is rarely linear; it involves the interaction of identity policies, resource policies, Service Control Policies (SCPs), permissions boundaries, and explicit denies. Developers often struggle to internalize these rules because documentation describes the mechanics, but the emergent behavior only becomes clear through trial and error.
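The interaction described above can be made concrete with a deliberately simplified sketch. It assumes a same-account request, exact-match actions and resources (only a bare `*` wildcard), and it ignores resource-based policies, session policies, and conditions. The type and function names are hypothetical, and this is not the full AWS evaluation algorithm.

```typescript
type Effect = 'Allow' | 'Deny';

interface Statement {
  Effect: Effect;
  Action: string[];   // exact action names or '*' (assumption: no partial wildcards)
  Resource: string[]; // exact ARNs or '*'
}

interface RequestContext {
  action: string;
  resource: string;
}

interface PolicySet {
  identity: Statement[];
  scp?: Statement[];      // undefined: no SCP applies to the account
  boundary?: Statement[]; // undefined: no permissions boundary attached
}

const matches = (s: Statement, r: RequestContext): boolean =>
  (s.Action.includes('*') || s.Action.includes(r.action)) &&
  (s.Resource.includes('*') || s.Resource.includes(r.resource));

const has = (stmts: Statement[], effect: Effect, r: RequestContext): boolean =>
  stmts.some((s) => s.Effect === effect && matches(s, r));

function evaluate(p: PolicySet, r: RequestContext): { allowed: boolean; reason: string } {
  // 1. An explicit deny in any applicable policy always wins.
  const all = [...p.identity, ...(p.scp ?? []), ...(p.boundary ?? [])];
  if (has(all, 'Deny', r)) return { allowed: false, reason: 'explicit deny' };
  // 2. SCPs and boundaries only cap permissions; each must also allow the request.
  if (p.scp && !has(p.scp, 'Allow', r)) return { allowed: false, reason: 'not allowed by SCP' };
  if (p.boundary && !has(p.boundary, 'Allow', r)) {
    return { allowed: false, reason: 'outside permissions boundary' };
  }
  // 3. Without an identity-policy allow, the default is an implicit deny.
  if (!has(p.identity, 'Allow', r)) return { allowed: false, reason: 'implicit deny' };
  return { allowed: true, reason: 'allowed by identity policy' };
}

const request: RequestContext = {
  action: 's3:GetObject',
  resource: 'arn:aws:s3:::demo-bucket/report.csv',
};
const allowGet: Statement = { Effect: 'Allow', Action: ['s3:GetObject'], Resource: ['*'] };
const allowEc2Only: Statement = { Effect: 'Allow', Action: ['ec2:DescribeInstances'], Resource: ['*'] };

console.log(evaluate({ identity: [allowGet] }, request).reason);
console.log(evaluate({ identity: [allowGet], scp: [allowEc2Only] }, request).reason);
```

Even in this reduced form, the second call is denied although the identity policy allows the action. Outcomes like that are hard to predict from documentation alone.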

Traditional learning methods fall short. Reading documentation lacks interactivity. Using live AWS sandboxes introduces friction: account setup, billing risks, resource cleanup, and the psychological barrier of potentially breaking production-adjacent environments. Furthermore, sandboxes cannot easily simulate edge cases like specific SCP blocks or cross-account trust relationships without significant configuration overhead.

The industry overlooks the architectural complexity required to build effective browser-based simulators. A simulator is not just a UI wrapper around a JSON editor; it requires a robust state management strategy to handle branching scenarios, a decoupled rendering layer to support animations, and a validation engine that provides instant feedback. Without these, simulators become brittle, slow, or fail to accurately model the service being taught.

Data from interactive learning platforms indicates that immediate feedback loops reduce cognitive load and improve retention. By removing infrastructure dependencies and providing a safe environment to fail, browser simulators enable a "fail-fast" learning model that accelerates mastery of complex policy evaluation logic.

WOW Moment: Key Findings

The following comparison highlights the operational and pedagogical advantages of a browser-based simulator over traditional learning approaches. The entries are qualitative, but together they show why investing in simulator architecture can raise engagement and lower the barrier to entry.

Approach	Setup Latency	Operational Cost	Safety Risk	Feedback Loop	Scenario Coverage
Live AWS Sandbox	5–10 minutes	Variable ($$$)	High (Billing/Breakage)	Slow (Console refresh)	Limited by account config
Static Documentation	Instant	$0	None	None	Comprehensive but passive
Browser Simulator	Instant	$0	None	Instant (Keystroke)	Unlimited (Code-defined)

Why this matters: The browser simulator approach decouples learning from infrastructure. It allows educators to define arbitrary scenarios, enforce strict validation, and provide real-time guidance without incurring cloud costs or risking resource leakage. The "Instant" feedback loop is critical for policy learning, where understanding the impact of a single condition operator change is the core educational objective.
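As an illustration of how much a single condition operator matters, here is a minimal sketch comparing StringEquals and StringLike. The function names are hypothetical, and the sketch covers only these two operators, with `*` matching any run of characters and `?` matching a single character.

```typescript
type Operator = 'StringEquals' | 'StringLike';

// Turn an IAM-style pattern into a RegExp: '*' = any run of characters, '?' = one character.
function likeToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters other than '*' and '?'.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$');
}

function conditionMatches(
  op: Operator,
  expected: string[],
  actual: string | undefined,
): boolean {
  // A context key that is absent from the request fails both operators.
  if (actual === undefined) return false;
  // Multiple expected values are combined with OR.
  return expected.some((value) =>
    op === 'StringEquals' ? value === actual : likeToRegExp(value).test(actual),
  );
}

const prefix = 'home/alice/notes.txt';
console.log(conditionMatches('StringEquals', ['home/alice/*'], prefix)); // false
console.log(conditionMatches('StringLike', ['home/alice/*'], prefix));   // true
```

Swapping one operator flips the result for the same policy value. A simulator can surface that change on every keystroke.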

Core Solution

Building a cloud simulator requires a disciplined architecture that separates domain logic, state orchestration, rendering, and validation. Below is a technical blueprint for implementing such a system, focusing on state machines, canvas layers, and real-time validation.

1. Layered Architecture with Unidirectional Dependencies

The codebase should be organized into strict layers to prevent coupling. Dependencies flow downward; lower layers never import from upper layers. This ensures that domain logic remains testable and UI components remain agnostic.

// Project Structure
// lib/          <- Utilities, types, hooks
// domain/       <- Cloud entities, ARN parsers, base schemas
// levels/       <- Scenario definitions, state machines, objectives
// runtime/      <- Level loader, persistence, event bus
// features/     <- Canvas, editor, dialogs (UI only)
// app/          <- Root composition

Rationale: This structure isolates the "what" (domain/levels) from the "how" (features). The canvas component does not know it is rendering IAM entities; it only knows how to render nodes and edges based on events. This makes the rendering layer reusable for other simulators.
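A small sketch of what "nodes and edges based on events" could look like follows. The event names and the reducer are assumptions made for illustration, not the article's actual event vocabulary. Nothing in the canvas-side types mentions IAM.

```typescript
// Generic graph types owned by the rendering layer.
interface GraphNode { id: string; label: string }
interface GraphEdge { id: string; source: string; target: string }
interface GraphState { nodes: GraphNode[]; edges: GraphEdge[] }

// Hypothetical event vocabulary the canvas understands.
type CanvasEvent =
  | { type: 'NODE_ADDED'; node: GraphNode }
  | { type: 'EDGE_ADDED'; edge: GraphEdge }
  | { type: 'NODE_REMOVED'; id: string };

function applyCanvasEvent(state: GraphState, event: CanvasEvent): GraphState {
  switch (event.type) {
    case 'NODE_ADDED':
      return { ...state, nodes: [...state.nodes, event.node] };
    case 'EDGE_ADDED':
      return { ...state, edges: [...state.edges, event.edge] };
    case 'NODE_REMOVED':
      // Removing a node also drops every edge attached to it.
      return {
        nodes: state.nodes.filter((n) => n.id !== event.id),
        edges: state.edges.filter((e) => e.source !== event.id && e.target !== event.id),
      };
  }
}

// The levels layer translates domain facts (a user, a role, a trust link) into generic events.
const emptyGraph: GraphState = { nodes: [], edges: [] };
const events: CanvasEvent[] = [
  { type: 'NODE_ADDED', node: { id: 'user-1', label: 'IAM User' } },
  { type: 'NODE_ADDED', node: { id: 'role-1', label: 'IAM Role' } },
  { type: 'EDGE_ADDED', edge: { id: 'e1', source: 'user-1', target: 'role-1' } },
];
const graph = events.reduce(applyCanvasEvent, emptyGraph);
console.log(graph.nodes.length, graph.edges.length); // 2 1
```

Because the reducer depends only on generic types, the same canvas code could back a networking or resource-policy simulator without changes.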

2. Serializable State Machines with a Registry Pattern

Each scenario (level) is managed by a dedicated state machine. The machine orchestrates the flow, tracks objectives, and controls UI restrictions. Crucially, the machine must be serializable to support checkpointing and state restoration.

The Problem: Context that has to be checkpointed cannot hold runtime functions (validators, evaluators), because functions do not survive JSON serialization.

The Solution: Use a Registry pattern. The machine stores string keys referencing functions, while the actual logic lives in a separate registry.

// domain/registry.ts
type RegistryKey = string;
type RegistryFn = (...args: any[]) => any;

class FunctionRegistry {
  private store: Map<RegistryKey, RegistryFn> = new Map();

  register(key: RegistryKey, fn: RegistryFn): void {
    this.store.set(key, fn);
  }

  resolve(key: RegistryKey): RegistryFn | undefined {
    return this.store.get(key);
  }
}

export const registry = new FunctionRegistry();

// Example: Registering a policy evaluator
registry.register('evaluate-iam-policy', (policy: any, request: any) => {
  // Complex evaluation logic here
  return { allowed: true, reasons: [] };
});

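// levels/machine.ts (written against the XState v5 API)
import { createMachine, assign, fromPromise } from 'xstate';
import { registry } from '../domain/registry';

interface EvalInput {
  evaluatorKey: string;
  policyJson: string;
  request: unknown;
}

export const levelMachine = createMachine({
  id: 'level-1',
  initial: 'editing',
  context: {
    // Store keys, not functions
    evaluatorKey: 'evaluate-iam-policy',
    policyJson: '{}',
    request: null, // the simulated request to evaluate, set per level
    result: null,
  },
  states: {
    editing: {
      on: {
        SUBMIT_POLICY: {
          target: 'evaluating',
          actions: assign({
            policyJson: ({ event }) => event.policy,
          }),
        },
      },
    },
    evaluating: {
      invoke: {
        // Promise actor: resolving fires onDone, throwing fires onError
        src: fromPromise(async ({ input }: { input: EvalInput }) => {
          // Resolve function from registry at runtime
          const evaluator = registry.resolve(input.evaluatorKey);
          if (!evaluator) {
            throw new Error(`No evaluator registered for "${input.evaluatorKey}"`);
          }
          return evaluator(input.policyJson, input.request);
        }),
        // Pass only the serializable data the actor needs
        input: ({ context }) => ({
          evaluatorKey: context.evaluatorKey,
          policyJson: context.policyJson,
          request: context.request,
        }),
        onDone: {
          target: 'editing',
          actions: assign({
            result: ({ event }) => event.output,
          }),
        },
        // Return to the editor if the key is unknown or the evaluator throws
        onError: { target: 'editing' },
      },
    },
  },
});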

Rationale: This pattern preserves serializability. When the machine snapshot is saved, it contains only strings and data. Upon restoration, the registry is re-populated, and the machine functions identically. This enables seamless checkpointing and replay capabilities.
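The round trip can be sketched without any state-machine library. The checkpoint shape and helper names below are hypothetical; the point is only that a context made of strings survives `JSON.stringify`, while a function stored in context is silently dropped.

```typescript
type Evaluator = (policyJson: string) => { allowed: boolean };

// Stand-in for the registry: populated at startup, never serialized.
const evaluators = new Map<string, Evaluator>();
evaluators.set('allow-if-statement-present', (policyJson) => {
  const policy = JSON.parse(policyJson);
  return { allowed: Array.isArray(policy.Statement) && policy.Statement.length > 0 };
});

interface LevelContext {
  evaluatorKey: string;
  policyJson: string;
}

interface Checkpoint {
  state: string;
  context: LevelContext;
}

function saveCheckpoint(state: string, context: LevelContext): string {
  return JSON.stringify({ state, context });
}

function restoreCheckpoint(raw: string): Checkpoint {
  const parsed = JSON.parse(raw);
  if (typeof parsed?.context?.evaluatorKey !== 'string') {
    throw new Error('Checkpoint is missing evaluatorKey');
  }
  return parsed as Checkpoint;
}

function runEvaluator(context: LevelContext): { allowed: boolean } {
  const fn = evaluators.get(context.evaluatorKey);
  if (!fn) throw new Error(`No evaluator registered for "${context.evaluatorKey}"`);
  return fn(context.policyJson);
}

// Keys survive the round trip...
const raw = saveCheckpoint('editing', {
  evaluatorKey: 'allow-if-statement-present',
  policyJson: '{"Statement":[{"Effect":"Allow"}]}',
});
const restored = restoreCheckpoint(raw);
console.log(runEvaluator(restored.context)); // { allowed: true }

// ...whereas a function placed directly in context is lost.
console.log(JSON.stringify({ evaluator: () => true })); // {}
```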

3. Canvas Animation Indirection

The visual representation of cloud entities (users, roles, buckets) is typically rendered on a canvas using a library like ReactFlow. Direct manipulation of the canvas state by the machine causes jarring UI updates, especially during deletions.

The Problem: If the machine removes a node, the DOM element vanishes immediately, preventing exit animations.

The Solution: Implement an indirection layer. The machine emits events; the canvas subscribes and manages its own UI state, including animation lifecycles.

// features/canvas/bridge.ts
import { useEffect } from 'react';
import type { AnyActorRef } from 'xstate';
import { create } from 'zustand';

interface CanvasNode {
  id: string;
  data: any;
  status: 'active' | 'deleting' | 'removed';
}

interface CanvasStore {
  nodes: CanvasNode[];
  addNodes: (nodes: CanvasNode[]) => void;
  markNodesForDeletion: (ids: string[]) => void;
  removeNodes: (ids: string[]) => void;
}

export const useCanvasStore = create<CanvasStore>((set) => ({
  nodes: [],
  addNodes: (newNodes) => set((state) => ({
    nodes: [...state.nodes, ...newNodes],
  })),
  markNodesForDeletion: (ids) => set((state) => ({
    nodes: state.nodes.map((n) =>
      ids.includes(n.id) ? { ...n, status: 'deleting' } : n
    ),
  })),
  removeNodes: (ids) => set((state) => ({
    nodes: state.nodes.filter((n) => !ids.includes(n.id)),
  })),
}));

// Exit animation length in ms; keep in sync with the canvas transition
const EXIT_ANIMATION_MS = 300;

// Bridge component subscribes to machine events
export function CanvasBridge({ machineActor }: { machineActor: AnyActorRef }) {
  const markForDeletion = useCanvasStore((s) => s.markNodesForDeletion);
  const removeNodes = useCanvasStore((s) => s.removeNodes);

  useEffect(() => {
    const subscription = machineActor.subscribe((snapshot) => {
      if (snapshot.matches('nodeDeleted')) {
        const ids: string[] = snapshot.context.deletedIds;
        // Step 1: Mark for animation
        markForDeletion(ids);

        // Step 2: Wait for animation, then remove
        setTimeout(() => removeNodes(ids), EXIT_ANIMATION_MS);
      }
    });
    return () => subscription.unsubscribe();
  }, [machineActor, markForDeletion, removeNodes]);

  // Renders nothing; it only synchronizes machine state with the canvas store
  return null;
}

Rationale: The bridge decouples the logical state from the visual state. The machine declares intent (here, by entering the nodeDeleted state with the affected IDs in its context), and the canvas handles the execution, including animations. This results in a polished user experience where entities fade out or slide away rather than popping out of existence.
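The two-phase deletion can also be sketched without React or a store library, which makes the lifecycle easier to test. The names here are hypothetical, and the scheduler is injected so that the animation delay can be replaced in tests.

```typescript
type Status = 'active' | 'deleting';
interface VisualNode { id: string; status: Status }
type Schedule = (fn: () => void, ms: number) => void;

function createDeletionQueue(schedule: Schedule, exitMs = 300) {
  let nodes: VisualNode[] = [];
  return {
    add(id: string): void {
      nodes = [...nodes, { id, status: 'active' }];
    },
    requestDelete(ids: string[]): void {
      // Phase 1: keep the node mounted, but flag it so the UI can play an exit animation.
      nodes = nodes.map((n): VisualNode =>
        ids.includes(n.id) ? { ...n, status: 'deleting' } : n,
      );
      // Phase 2: drop the node only after the animation window has elapsed.
      schedule(() => {
        nodes = nodes.filter((n) => !ids.includes(n.id));
      }, exitMs);
    },
    snapshot(): VisualNode[] {
      return nodes.map((n) => ({ ...n }));
    },
  };
}

// In the browser, the scheduler would typically be backed by setTimeout.
const deletionQueue = createDeletionQueue((fn, ms) => {
  setTimeout(fn, ms);
});
deletionQueue.add('user-1');
console.log(deletionQueue.snapshot()); // [ { id: 'user-1', status: 'active' } ]
```

The closure captures the IDs when deletion is requested, so the removal step still knows which nodes to drop even if the machine context has changed in the meantime.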

4. Dual-Layer Validation Engine

Policy editing requires strict validation. Users must adhere to the JSON schema of the cloud service, but also satisfy scenario-specific constraints.

The Solution: Use a JSON schema validator like AJV for structural validation, combined with custom business logic rules compiled once per level.

// features/editor/validation.ts
import Ajv from 'ajv';

const ajv = new Ajv({ allErrors: true });

// Base schema for cloud policies
const baseSchema = {
  type: 'object',
  required: ['Version', 'Statement'],
  properties: {
    Version: { type: 'string', const: '2012-10-17' },
    Statement: { type: 'array', minItems: 1 },
  },
};

const baseValidator = ajv.compile(baseSchema);

// Level-specific rules
interface ObjectiveRule {
  id: string;
  validate: (policy: any) => { valid: boolean; message?: string };
}

const levelRules: ObjectiveRule[] = [
  {
    id: 'trust-policy-service',
    validate: (policy) => {
      // Principal.Service may be a single string or an array of strings
      const hasEc2: boolean = policy.Statement?.some((s: any) =>
        [s.Principal?.Service].flat().includes('ec2.amazonaws.com')
      ) ?? false;
      return {
        valid: hasEc2,
        message: hasEc2 ? undefined : 'Trust policy must include ec2.amazonaws.com',
      };
    },
  },
];

export function validatePolicy(policyStr: string) {
  let policy;
  try {
    policy = JSON.parse(policyStr);
  } catch {
    return { valid: false, errors: ['Invalid JSON'] };
  }

  // Layer 1: Schema validation
  const schemaValid = baseValidator(policy);
  if (!schemaValid) {
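    // errors can be null and message can be undefined, so normalize to string[]
    return {
      valid: false,
      errors: (baseValidator.errors ?? []).map((e) => e.message ?? 'Schema violation'),
    };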
  }

  // Layer 2: Business logic validation
  const ruleResults = levelRules.map((rule) => rule.validate(policy));
  const failedRules = ruleResults.filter((r) => !r.valid);
  
  if (failedRules.length > 0) {
    return {
      valid: false,
      errors: failedRules.map((r) => r.message),
    };
  }

  return { valid: true, errors: [] };
}

Rationale: AJV compiles validators to JavaScript functions, offering high performance for keystroke-level validation. Separating schema validation from business logic allows the base schema to be reused across levels while enabling flexible, scenario-specific constraints. This dual approach catches syntax errors early and guides users toward correct policy configurations.

Pitfall Guide

Building interactive simulators introduces unique challenges. Below are common pitfalls and their resolutions based on production experience.

Pitfall	Explanation	Fix
Serialization Trap	Storing functions or complex objects in state machine context breaks serialization, preventing checkpointing and state restoration.	Use a `FunctionRegistry` pattern. Store string keys in context and resolve functions at runtime.
Animation Jank	Directly removing DOM elements when state changes causes visual popping and breaks exit animations.	Implement an indirection layer. Emit events from the machine, mark elements as "deleting" in the UI store, animate, then remove.
Tight Coupling	Canvas or editor components import domain-specific logic, making them non-reusable and hard to test.	Enforce unidirectional dependencies. UI components should only interact with generic types and events. Domain logic lives in lower layers.
Validation Performance	Running heavy validation logic on every keystroke can cause UI lag, especially with large policies.	Compile validators once (e.g., AJV). Debounce validation if necessary. Use Web Workers for complex evaluation logic.
State Explosion	Creating a single monolithic state machine for all levels leads to complex transitions and large bundle sizes.	Create one machine per level. Use dynamic imports to load level code only when needed.
Context Loss in Animations	When deleting nodes, the IDs may be lost if the state updates before the animation completes.	The indirection layer preserves IDs during the animation phase. The machine emits IDs, and the canvas store retains them until removal.
Ignoring Edge Cases	Simulators that only handle happy paths fail to teach robust policy understanding.	Design levels that specifically target common mistakes, such as implicit denies, boundary conflicts, and cross-account trust errors.
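For the Validation Performance row, a debounce wrapper is one common way to avoid validating on every keystroke. The following is a minimal sketch rather than a drop-in utility. The `Timers` interface is an assumption introduced here so that the timing can be faked in tests.

```typescript
// Timer functions are injectable so behaviour can be checked without real delays.
interface Timers {
  set: (fn: () => void, ms: number) => unknown;
  clear: (handle: unknown) => void;
}

const defaultTimers: Timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
  timers: Timers = defaultTimers,
): (...args: A) => void {
  let handle: unknown;
  let scheduled = false;
  return (...args: A): void => {
    // A newer call replaces any call that is still waiting.
    if (scheduled) timers.clear(handle);
    scheduled = true;
    handle = timers.set(() => {
      scheduled = false;
      fn(...args);
    }, waitMs);
  };
}

// Hypothetical wiring: validate 150 ms after the last keystroke.
const validateSoon = debounce((text: string) => {
  try {
    JSON.parse(text);
    console.log('valid JSON');
  } catch {
    console.log('invalid JSON');
  }
}, 150);
```

Calling `validateSoon` from the editor's change handler would run the parser once per pause in typing rather than once per character.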

Production Bundle

Action Checklist

Define Domain Schema: Create JSON schemas for all cloud entities and policies. Ensure schemas match the target service's specification.
Setup Function Registry: Implement a registry to decouple runtime logic from state machines. Register all evaluators and validators.
Build State Machines: Create one XState machine per level. Use keys for functions and ensure all context is serializable.
Implement Canvas Bridge: Develop an indirection layer between state machines and the canvas. Support node/edge events and animation lifecycles.
Configure Dual Validation: Set up AJV for schema validation. Define level-specific business rules. Integrate validation into the editor extension.
Add Lazy Loading: Use dynamic imports for level machines and heavy dependencies like the code editor to optimize initial load time.
Write Integration Tests: Use a testing framework to simulate user interactions and verify state transitions. Ensure checkpoints restore correctly.
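The lazy-loading item in the checklist could be approached with a small loader like the one below. `LevelModule` and `createLevelLoader` are hypothetical names. In a real app each importer would wrap a dynamic `import()`; the sketch takes importers as plain functions so it runs on its own.

```typescript
interface LevelModule {
  id: string;
  title: string;
}

type LevelImporter = () => Promise<LevelModule>;

function createLevelLoader(importers: Record<string, LevelImporter>) {
  // Cache the in-flight promise so concurrent requests share one load.
  const cache = new Map<string, Promise<LevelModule>>();

  return function loadLevel(id: string): Promise<LevelModule> {
    const importer = importers[id];
    if (!importer) {
      return Promise.reject(new Error(`Unknown level: ${id}`));
    }
    let pending = cache.get(id);
    if (!pending) {
      pending = importer();
      cache.set(id, pending);
    }
    return pending;
  };
}

// In the app this map might look like:
//   { 'level-1': () => import('./levels/level-1/machine') }
const loadLevel = createLevelLoader({
  'level-1': () => Promise.resolve({ id: 'level-1', title: 'First policy' }),
});

loadLevel('level-1').then((level) => console.log(level.title));
```

The module system already caches `import()` results, so the explicit cache mainly keeps the loader injectable and easy to test with fake importers.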

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Teaching IAM Policy Logic	Browser Simulator	Instant feedback, safe environment, covers edge cases without AWS accounts.	$0 infrastructure cost.
Hands-on Cloud Operations	Live Sandbox	Required for learning console workflows, CLI usage, and real resource management.	Variable cost based on usage.
Reference Documentation	Static Docs	Best for comprehensive API references and detailed explanations.	$0, but low engagement.
Complex Multi-Service Workflows	Hybrid (Simulator + Sandbox)	Simulator for policy logic, sandbox for deployment and integration testing.	Moderate cost for sandbox portion.

Configuration Template

State Machine with Registry Integration:

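// levels/level-1/machine.ts (written against the XState v5 API)
import { createMachine, assign, fromPromise } from 'xstate';
import { registry } from '../../domain/registry';

export const level1Machine = createMachine({
  id: 'level-1',
  initial: 'intro',
  context: {
    policy: '{}',
    // Must match a key that has been registered in domain/registry.ts
    evaluatorKey: 'iam-evaluator',
    checkpoint: null,
  },
  states: {
    intro: {
      on: { START: 'editing' },
    },
    editing: {
      on: {
        UPDATE_POLICY: {
          actions: assign({ policy: ({ event }) => event.policy }),
        },
        CHECK: {
          target: 'checking',
        },
      },
    },
    checking: {
      invoke: {
        // Promise actor: resolving fires onDone, throwing fires onError
        src: fromPromise(
          async ({ input }: { input: { evaluatorKey: string; policy: string } }) => {
            const fn = registry.resolve(input.evaluatorKey);
            if (!fn) {
              throw new Error(`No evaluator registered for "${input.evaluatorKey}"`);
            }
            return fn(input.policy);
          }
        ),
        input: ({ context }) => ({
          evaluatorKey: context.evaluatorKey,
          policy: context.policy,
        }),
        onDone: {
          target: 'editing',
          actions: assign({
            checkpoint: ({ event }) => event.output,
          }),
        },
        // Return to the editor if the evaluator is missing or throws
        onError: { target: 'editing' },
      },
    },
  },
});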

AJV Validator Setup:

// features/editor/validators.ts
import Ajv from 'ajv';

const ajv = new Ajv();

export const policySchema = {
  $id: 'cloud-policy',
  type: 'object',
  properties: {
    Version: { type: 'string' },
    Statement: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          Effect: { type: 'string', enum: ['Allow', 'Deny'] },
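          // IAM accepts a single string or an array of strings for both fields
          Action: { anyOf: [{ type: 'string' }, { type: 'array', items: { type: 'string' } }] },
          Resource: { anyOf: [{ type: 'string' }, { type: 'array', items: { type: 'string' } }] },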
        },
        required: ['Effect', 'Action', 'Resource'],
      },
    },
  },
  required: ['Version', 'Statement'],
};

export const validatePolicy = ajv.compile(policySchema);

Layered Folder Structure:

src/
├── lib/
│   ├── types.ts
│   └── utils.ts
├── domain/
│   ├── registry.ts
│   ├── schemas/
│   └── evaluators/
├── levels/
│   ├── level-1/
│   │   ├── machine.ts
│   │   └── objectives.ts
│   └── level-2/
├── runtime/
│   ├── level-loader.ts
│   ├── persistence.ts
│   └── event-bus.ts
├── features/
│   ├── canvas/
│   │   ├── bridge.ts
│   │   └── components.ts
│   └── editor/
│       ├── validators.ts
│       └── extension.ts
└── app/
    └── index.tsx

Quick Start Guide

1. Initialize Project: Create a React + TypeScript project using Vite. Install dependencies: xstate, @xstate/react, zustand, ajv, reactflow, codemirror.
2. Define Registry: Set up the FunctionRegistry in domain/registry.ts. Register a mock evaluator function to test the pattern.
3. Create First Machine: Implement a simple state machine in levels/level-1/machine.ts with two states and a transition that invokes the registry evaluator.
4. Build Canvas Bridge: Create the CanvasBridge component and useCanvasStore. Wire up a basic ReactFlow canvas to react to machine events.
5. Run and Verify: Start the dev server. Interact with the UI, submit policies, and verify that the state machine transitions correctly, the canvas animates, and validation provides feedback. Add a checkpoint test to ensure serialization works.
