Debugging Multi-Agent Systems in TypeScript: From Flat Logs to Execution Trees

Visualizing Agent Concurrency: A Tree-Based Approach to Multi-Agent Observability

Current Situation Analysis

Multi-agent architectures introduce a class of concurrency bugs that traditional observability tools cannot resolve. When a single agent executes, linear logs provide a sufficient audit trail. However, as soon as multiple agents interact—sharing state, competing for resources, or executing in parallel—the system's control flow transforms from a list into a directed acyclic graph (DAG).

Flat logs fail in this environment because they collapse the causal hierarchy into a timestamped sequence. A log entry might show Agent A writing to a database at 14:00:01 and Agent B reading stale data at 14:00:02, but the log does not explicitly encode the dependency or the race condition. Engineers are forced to manually reconstruct the execution tree by correlating timestamps, which is error-prone and unscalable.
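To make the contrast concrete, here is a minimal sketch of what changes once each log entry carries a parent reference. The `SpanRecord` shape and the `buildTree` and `render` helpers are hypothetical names chosen for illustration; they are not taken from any particular tracing library.

```typescript
// Hypothetical record shape: one entry per traced operation.
interface SpanRecord {
  id: string;
  parentId: string | null; // null marks the root of a run
  name: string;
}

interface TreeNode {
  name: string;
  children: TreeNode[];
}

// Rebuild the hierarchy from parent links alone; timestamps are never consulted.
function buildTree(records: SpanRecord[]): TreeNode {
  const nodes = new Map<string, TreeNode>();
  for (const r of records) {
    nodes.set(r.id, { name: r.name, children: [] });
  }
  let root: TreeNode | undefined;
  for (const r of records) {
    const node = nodes.get(r.id)!;
    if (r.parentId === null) {
      root = node;
    } else {
      const parent = nodes.get(r.parentId);
      if (!parent) throw new Error(`Unknown parent ${r.parentId}`);
      parent.children.push(node);
    }
  }
  if (!root) throw new Error('No root record found');
  return root;
}

// Render the tree as an indented outline, one line per node.
function render(node: TreeNode, depth = 0, out: string[] = []): string[] {
  out.push('  '.repeat(depth) + node.name);
  for (const child of node.children) render(child, depth + 1, out);
  return out;
}

const flatLog: SpanRecord[] = [
  { id: '1', parentId: null, name: 'coordinator' },
  { id: '2', parentId: '1', name: 'agent-a' },
  { id: '3', parentId: '1', name: 'agent-b' },
  { id: '4', parentId: '2', name: 'db-write' },
  { id: '5', parentId: '3', name: 'db-read' },
];

const outline = render(buildTree(flatLog));
console.log(outline.join('\n'));
```

In the rendered outline, `db-write` and `db-read` sit under sibling branches of the same coordinator, which is the structural hint that they may have run concurrently against the same database.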

This gap is often overlooked because developers initially treat agents as sequential functions. The complexity emerges only when orchestration patterns like fan-out, retry loops, and resource locking are introduced. Without a visualization of the execution tree, debugging becomes a game of inference rather than inspection. You can see the symptoms (errors, conflicts), but the structural cause (parallel branches touching the same resource, decision loops based on outdated state) remains hidden.

WOW Moment: Key Findings

The shift from flat logging to execution tree tracing fundamentally changes how concurrency failures are diagnosed. By capturing the hierarchical relationship between decisions, tool calls, and parallel branches, you gain immediate visibility into race conditions and coordination failures.

| Dimension | Linear Logging | Execution Tree Tracing |
| --- | --- | --- |
| Concurrency Visibility | Implicit; requires timestamp correlation | Explicit; parallel branches are structural nodes |
| Root Cause Isolation | Manual; scan logs for error keywords | Structural; trace path to failure node |
| State Freshness | Unknown; logs rarely capture state snapshots | Visible; each node can record state context |
| Coordination Gaps | Hidden; orchestrator logic is opaque | Clear; reveals if conflict resolution was bypassed |
| Debug Latency | High; minutes to hours of log parsing | Low; seconds to identify structural anomaly |

Why this matters: Execution trees allow you to distinguish between a tool failure, an LLM reasoning error, and an orchestration bug. In a tree view, a failure inside a parallel branch that prevents the coordinator from reaching a conflict-resolution step is immediately obvious, whereas logs might just show a timeout and a later conflict error with no causal link.
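As a rough illustration of "trace path to failure node", the sketch below walks a recorded tree and returns the path to the deepest failed node together with its kind. The `TraceNode` shape is an assumption made for this example, not a documented trace format.

```typescript
type NodeKind = 'step' | 'tool' | 'llm';

// Assumed node shape for illustration only.
interface TraceNode {
  name: string;
  kind: NodeKind;
  status: 'ok' | 'error';
  children: TraceNode[];
}

// Depth-first search: prefer a failed descendant over the failed ancestor,
// so the returned path ends at the most specific failing node.
function pathToFailure(node: TraceNode, trail: TraceNode[] = []): TraceNode[] | null {
  const here = trail.concat(node);
  for (const child of node.children) {
    const deeper = pathToFailure(child, here);
    if (deeper) return deeper;
  }
  return node.status === 'error' ? here : null;
}

const run: TraceNode = {
  name: 'fraud-mitigation-pipeline',
  kind: 'step',
  status: 'error',
  children: [
    { name: 'evaluate-risk-profile', kind: 'step', status: 'ok', children: [] },
    {
      name: 'execute-mitigation-protocol',
      kind: 'step',
      status: 'error',
      children: [
        { name: 'freeze-user-account', kind: 'tool', status: 'error', children: [] },
        { name: 'notify-compliance', kind: 'tool', status: 'ok', children: [] },
      ],
    },
  ],
};

const failurePath = pathToFailure(run);
const summary = failurePath
  ? failurePath.map((n) => n.name).join(' > ') +
    ' [' + failurePath[failurePath.length - 1].kind + ']'
  : 'no failure';
console.log(summary);
```

The kind of the last node on the path is what separates the three cases: a failing `tool` node points at an external system, a failing `llm` node at model output, and a failing `step` with no failing children at the orchestration logic itself.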

Core Solution

The solution involves instrumenting your TypeScript agent flows with a local-first execution tracer. Tools like agent-inspect provide a structured debugging layer that writes trace data to the local filesystem, enabling rapid iteration without the overhead of a hosted observability platform.

The implementation strategy focuses on three layers:

Session Boundaries: Wrap the top-level orchestration flow to define the scope of a single run.
Phase Instrumentation: Mark logical steps within agents to capture decision points.
Action Tagging: Explicitly label tool calls and LLM interactions to separate side effects from computation.
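The three layers above can be sketched with a small tracer that passes its context explicitly. This is not how agent-inspect is implemented; `traceRun`, `TraceContext`, and the node shape are hypothetical names used only to show how session, phase, and action wrappers could assemble a tree.

```typescript
type Kind = 'run' | 'step' | 'tool';

interface TraceNode {
  name: string;
  kind: Kind;
  status: 'running' | 'ok' | 'error';
  children: TraceNode[];
}

class TraceContext {
  constructor(readonly node: TraceNode) {}

  // Attach a child node, run the callback with a context scoped to that child,
  // and record the outcome once the callback settles.
  private async record<T>(
    kind: Kind,
    name: string,
    fn: (ctx: TraceContext) => Promise<T>,
  ): Promise<T> {
    const child: TraceNode = { name, kind, status: 'running', children: [] };
    this.node.children.push(child);
    try {
      const result = await fn(new TraceContext(child));
      child.status = 'ok';
      return result;
    } catch (err) {
      child.status = 'error';
      throw err;
    }
  }

  // Phase instrumentation: a logical step inside an agent or orchestrator.
  step<T>(name: string, fn: (ctx: TraceContext) => Promise<T>): Promise<T> {
    return this.record('step', name, fn);
  }

  // Action tagging: a side-effecting call to an external system.
  tool<T>(name: string, fn: (ctx: TraceContext) => Promise<T>): Promise<T> {
    return this.record('tool', name, fn);
  }
}

// Session boundary: creates the root node and always returns the tree,
// even when the run fails, so the failure can be inspected afterwards.
async function traceRun(
  name: string,
  fn: (ctx: TraceContext) => Promise<unknown>,
): Promise<TraceNode> {
  const root: TraceNode = { name, kind: 'run', status: 'running', children: [] };
  try {
    await fn(new TraceContext(root));
    root.status = 'ok';
  } catch {
    root.status = 'error';
  }
  return root;
}

const demo = traceRun('fraud-mitigation-pipeline', async (run) => {
  await run.step('evaluate-risk-profile', async () => 'critical');
  await run.step('execute-mitigation-protocol', (phase) =>
    Promise.all([
      phase.tool('freeze-user-account', async () => 'locked'),
      phase.tool('notify-compliance', async () => {
        throw new Error('alert service down');
      }),
    ]),
  );
});

demo.then((root) => console.log(JSON.stringify(root, null, 2)));
```

Passing the context as an argument keeps the sketch free of runtime dependencies. A library can hide that parameter behind ambient async context, which is presumably why the examples in this article call `step` without one.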

Implementation Strategy

Consider a payment fraud detection system where a RiskAnalyzer agent evaluates transactions, and a MitigationAgent handles account freezes. The system must ensure that if multiple agents attempt to modify the same account, they do not conflict.

1. Orchestrator Instrumentation

Wrap the main entry point with inspectRun. This creates the root of the execution tree. Use step for logical phases and step.tool for operations that interact with external systems or otherwise produce side effects.

import { inspectRun, step } from 'agent-inspect';

interface FraudCase {
  caseId: string;
  amount: number;
  userId: string;
}

async function processFraudCase(fraudCase: FraudCase) {
  return inspectRun(
    'fraud-mitigation-pipeline',
    async () => {
      // Phase 1: Risk Assessment
      const riskEvaluation = await step('evaluate-risk-profile', async () => {
        return riskAnalyzer.assess(fraudCase);
      });

      // Phase 2: Conditional Mitigation
      if (riskEvaluation.severity === 'critical') {
        return step('execute-mitigation-protocol', async () => {
          // Parallel execution of independent actions
          return Promise.all([
            step.tool('freeze-user-account', () => 
              accountService.lock(fraudCase.userId)
            ),
            step.tool('notify-compliance', () => 
              complianceService.alert(fraudCase.caseId)
            ),
            step.tool('block-payment-channel', () => 
              paymentGateway.revoke(fraudCase.userId)
            ),
          ]);
        });
      }

      return { status: 'monitored', reason: 'Risk below threshold' };
    },
    { traceDir: './.agent-traces' }
  );
}

2. Agent-Level Granularity

Instrument individual agents to capture internal decision loops. This is critical for identifying when an agent acts on stale state. Use descriptive names that reflect the business logic, not generic labels.

class MitigationAgent {
  async handleCriticalCase(caseId: string) {
    return step('mitigation-agent-flow', async () => {
      // Check current state before acting
      const accountState = await step.tool('fetch-account-lock-status', async () => {
        return this.accountRepo.getLockStatus(caseId);
      });

      // Guard against concurrent modifications
      if (accountState.isLockedByAnotherAgent) {
        return step('defer-to-coordinator', async () => {
          await this.waitForLockRelease();
          return this.handleCriticalCase(caseId);
        });
      }

      // LLM decision point
      const actionPlan = await step.llm('determine-mitigation-steps', async () => {
        return this.model.chat({
          messages: [{
            role: 'user',
            content: JSON.stringify({
              task: 'Select mitigation steps based on account state',
              state: accountState,
            }),
          }],
        });
      });

      // Execute selected actions
      return step.tool('apply-mitigation', async () => {
        return this.executePlan(actionPlan.steps);
      });
    });
  }
}

3. Resolving Coordination Failures

Execution traces often reveal that failures stem from parallel agents acting on shared resources without synchronization. The trace might show two agents attempting to scale a database or freeze an account simultaneously, causing a quorum loss or lock conflict.

Fix 1: State Refresh Guards. Ensure agents verify state freshness before committing actions. If a resource is in flux, the agent should wait or re-evaluate.

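// Assumes resourceClient, pollUntilStable, and performAction are module-level helpers.
const MAX_STABILITY_RETRIES = 3;

async function applyMitigation(plan: ActionPlan, attempt = 0): Promise<unknown> {
  return step('safe-mitigation-execution', async () => {
    // Re-read the resource immediately before acting, not at the start of the flow
    const currentState = await step.tool('verify-resource-state', async () => {
      return resourceClient.getStatus(plan.targetId);
    });

    if (currentState.isModifying) {
      // Bound the retries so a resource that never settles cannot recurse forever
      if (attempt >= MAX_STABILITY_RETRIES) {
        throw new Error(`Resource ${plan.targetId} did not stabilize`);
      }
      return step('wait-for-stability', async () => {
        await pollUntilStable(plan.targetId);
        return applyMitigation(plan, attempt + 1);
      });
    }

    return performAction(plan);
  });
}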

Fix 2: Distributed Locking. Protect critical sections with locks. The trace should show lock acquisition and release, making contention visible.

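// Assumes lockManager and accountService are module-level dependencies.
async function freezeAccount(userId: string) {
  return step.tool('secure-account-freeze', async () => {
    const lock = await lockManager.acquire(`account:${userId}`, 30_000);
    try {
      // Await inside the try block so the lock is held until the freeze completes;
      // a bare `return promise` would run `finally` before the promise settles.
      return await accountService.freeze(userId);
    } finally {
      await lock.release();
    }
  });
}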
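To see why the lock changes the shape of the trace, the following sketch uses an in-process keyed lock as a stand-in for a real distributed lock manager. `KeyedLock` and `withLock` are hypothetical helpers. A production lock would also need a shared store, expiry, and protection against a holder that outlives its lease, none of which is modeled here.

```typescript
// In-process stand-in for a lock manager: callers for the same key queue up
// behind each other. Entries are never evicted, which is fine for a sketch.
class KeyedLock {
  private tails = new Map<string, Promise<void>>();

  async acquire(key: string): Promise<{ release: () => void }> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });
    // The next caller waits until every earlier holder has released.
    this.tails.set(key, previous.then(() => current));
    await previous;
    return { release };
  }
}

async function withLock<T>(locks: KeyedLock, key: string, fn: () => Promise<T>): Promise<T> {
  const lock = await locks.acquire(key);
  try {
    return await fn(); // hold the lock until the work has settled
  } finally {
    lock.release();
  }
}

const events: string[] = [];
const locks = new KeyedLock();

async function freeze(agent: string, userId: string): Promise<void> {
  await withLock(locks, `account:${userId}`, async () => {
    events.push(`${agent}:start`);
    await Promise.resolve(); // yield, giving the other agent a chance to interleave
    await Promise.resolve();
    events.push(`${agent}:end`);
  });
}

const done = Promise.all([freeze('agent-a', 'u1'), freeze('agent-b', 'u1')]);
done.then(() => console.log(events.join(', ')));
```

Both agents are started together, yet the second critical section begins only after the first has ended. In a trace, that shows up as one branch waiting on lock acquisition rather than two overlapping writes.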

Fix 3: Orchestrator Sequencing. The coordinator should analyze resource targets and sequence actions that overlap, rather than blindly parallelizing.

async function coordinateRemediation(diagnosis: Diagnosis) {
  return step('orchestrate-remediation', async () => {
    const targets = extractResourceTargets(diagnosis);
    
    // Sequence actions if they target the same resource
    if (targets.hasOverlap(['account', 'payment'])) {
      return step('sequential-remediation', async () => {
        await step.tool('handle-account', () => accountAgent.resolve(targets.account));
        return step.tool('handle-payment', () => paymentAgent.resolve(targets.payment));
      });
    }

    // Parallelize independent resources
    return Promise.all([
      step.tool('handle-network', () => networkAgent.resolve(targets.network)),
      step.tool('handle-storage', () => storageAgent.resolve(targets.storage)),
    ]);
  });
}
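A more general form of the same idea is to group actions by the resource they touch, run each group in order, and let the groups proceed in parallel. The sketch below assumes each action declares a single `target` string; `RemediationAction`, `groupByTarget`, and `runSequenced` are hypothetical names, and real target extraction is likely to be less tidy.

```typescript
interface RemediationAction {
  name: string;
  target: string; // identifier of the resource the action mutates (assumed field)
  run: () => Promise<void>;
}

// Bucket actions by target, preserving submission order within each bucket.
function groupByTarget(actions: RemediationAction[]): RemediationAction[][] {
  const groups = new Map<string, RemediationAction[]>();
  for (const action of actions) {
    const group = groups.get(action.target);
    if (group) group.push(action);
    else groups.set(action.target, [action]);
  }
  return Array.from(groups.values());
}

// Groups run concurrently; actions inside a group run one after another.
async function runSequenced(actions: RemediationAction[]): Promise<void> {
  await Promise.all(
    groupByTarget(actions).map(async (group) => {
      for (const action of group) {
        await action.run();
      }
    }),
  );
}

const log: string[] = [];

function makeAction(name: string, target: string): RemediationAction {
  return {
    name,
    target,
    run: async () => {
      log.push(`${name}:start`);
      await Promise.resolve(); // simulate asynchronous work
      log.push(`${name}:end`);
    },
  };
}

const finished = runSequenced([
  makeAction('freeze-account', 'account:u1'),
  makeAction('revoke-sessions', 'account:u1'),
  makeAction('notify-compliance', 'case:c1'),
]);

finished.then(() => console.log(log.join(', ')));
```

The two actions on `account:u1` never overlap, while `notify-compliance` starts without waiting for either of them.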

Pitfall Guide

1. The Timestamp Trap

Explanation: Relying on log timestamps to determine execution order in concurrent systems. Clock skew and asynchronous scheduling make timestamps unreliable for causality.
Fix: Use execution trees with causal IDs. The tree structure inherently encodes order and parallelism, removing ambiguity.

2. Blind Parallelism on Shared Resources

Explanation: Orchestrating agents to run in parallel without checking if they modify the same resource. This leads to race conditions, data corruption, or conflicting actions.
Fix: Implement resource target analysis in the coordinator. Sequence actions that share resources; parallelize only independent ones.

3. Stale State Decisions

Explanation: Agents making decisions based on state fetched at the start of a long-running flow. By the time the action executes, the state may have changed.
Fix: Add state refresh guards before critical actions. If the state is inconsistent, retry or escalate.

4. Trace Bloat and Noise

Explanation: Instrumenting every function call or internal loop, resulting in massive traces that obscure the important structure.
Fix: Trace boundaries, not internals. Use step for logical phases and step.tool for side effects. Avoid tracing pure computation or tight loops unless debugging specific performance issues.

5. Missing LLM Context

Explanation: Tracing tool calls but omitting the prompts or responses from LLM calls. This makes it impossible to understand why an agent made a specific decision.
Fix: Use step.llm to capture prompt summaries and response structures. Ensure sensitive data is redacted but the reasoning path is preserved.
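One possible shape for that redaction step is a recursive pass over the payload that masks values by key name before anything is written to disk. The key list mirrors the patterns used elsewhere in this article. The `redact` helper and the case-insensitive substring matching rule are assumptions for illustration, not documented tracer behavior.

```typescript
const SENSITIVE_KEYS = ['apiKey', 'token', 'password', 'creditCard'];

// Case-insensitive substring match, so `sessionToken` is caught by `token`.
function isSensitive(key: string): boolean {
  const lower = key.toLowerCase();
  return SENSITIVE_KEYS.some((pattern) => lower.indexOf(pattern.toLowerCase()) !== -1);
}

// Returns a redacted copy; the input object is left untouched.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = isSensitive(key) ? '[REDACTED]' : redact(source[key]);
    }
    return out;
  }
  return value;
}

const promptPayload = {
  task: 'Select mitigation steps based on account state',
  state: { userId: 'u1', sessionToken: 'abc123' },
  apiKey: 'sk-example',
};

const safePayload = JSON.stringify(redact(promptPayload));
console.log(safePayload);
```

Matching on key names preserves the structure of the prompt, so the reasoning path stays readable. It does not catch secrets embedded inside free-text values, which would need content-based scanning.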

6. Orchestration Bypass

Explanation: Agents calling tools directly without going through the coordinator, bypassing conflict resolution and locking mechanisms.
Fix: Enforce entry points. Agents should expose high-level methods that the coordinator invokes, rather than allowing direct tool access.

7. Ignoring Trace Retention

Explanation: Local traces accumulate and consume disk space, or are lost after a crash, hindering post-mortem analysis.
Fix: Configure retention policies. Rotate traces based on age or count. For production, consider streaming traces to a centralized store.
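A retention policy based on age and total size can be expressed as a pure function over file metadata, which keeps it easy to test before wiring it to the filesystem. `TraceFile`, `RetentionPolicy`, and `selectForDeletion` are hypothetical names for this sketch. Reading real file ages and sizes is left out.

```typescript
interface TraceFile {
  path: string;
  ageDays: number;
  sizeMB: number;
}

interface RetentionPolicy {
  maxAgeDays: number;
  maxSizeMB: number;
}

// Returns the paths to delete: everything past the age limit first,
// then the oldest remaining files until the total fits the size budget.
function selectForDeletion(files: TraceFile[], policy: RetentionPolicy): string[] {
  const doomed: string[] = [];
  const kept: TraceFile[] = [];
  for (const file of files) {
    if (file.ageDays > policy.maxAgeDays) doomed.push(file.path);
    else kept.push(file);
  }

  kept.sort((a, b) => b.ageDays - a.ageDays); // oldest first
  let total = kept.reduce((sum, file) => sum + file.sizeMB, 0);
  for (const file of kept) {
    if (total <= policy.maxSizeMB) break;
    doomed.push(file.path);
    total -= file.sizeMB;
  }
  return doomed;
}

const toDelete = selectForDeletion(
  [
    { path: 'a.json', ageDays: 10, sizeMB: 50 },
    { path: 'b.json', ageDays: 5, sizeMB: 300 },
    { path: 'c.json', ageDays: 2, sizeMB: 200 },
    { path: 'd.json', ageDays: 1, sizeMB: 100 },
  ],
  { maxAgeDays: 7, maxSizeMB: 500 },
);
console.log(toDelete);
```

Here `a.json` is removed for age, and `b.json` is removed because the remaining 600 MB exceeds the 500 MB budget and it is the oldest of the survivors.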

Production Bundle

Action Checklist

Define Trace Boundaries: Wrap all agent entry points and orchestration flows with inspectRun or equivalent session wrappers.
Instrument Tool Calls: Tag every external interaction (API calls, DB writes, file ops) with step.tool to capture side effects.
Capture LLM Decisions: Use step.llm to record prompts and responses for all model interactions.
Implement Resource Locking: Add distributed locks for any resource accessed by multiple agents.
Add State Guards: Insert state verification steps before critical actions to prevent stale data usage.
Configure Local Output: Set up local trace directories for development to enable fast iteration.
Review Failed Runs: Establish a workflow to inspect execution trees for every failed agent run.
Redact Sensitive Data: Ensure prompts and tool inputs are sanitized before writing to traces.

Decision Matrix

| Scenario | Recommended Approach | Why | Cost Impact |
| --- | --- | --- | --- |
| Single Agent Flow | Structured Logging | Low overhead; linear execution is easy to follow with logs. | Minimal |
| Multi-Agent Development | Local Execution Tree | Fast feedback loop; no infrastructure setup; reveals concurrency bugs. | Low (Disk I/O) |
| Multi-Agent Production | Distributed Tracing | Centralized storage; long-term retention; cross-service correlation. | Moderate (Network/Storage) |
| High-Frequency Agents | Sampled Tracing | Reduces overhead while maintaining statistical visibility. | Low |
| Compliance Auditing | Immutable Trace Store | Provides tamper-evident records of agent decisions and actions. | High (Storage/Integrity) |

Configuration Template

Use this configuration to set up local tracing with retention and redaction policies.

{
  "tracer": {
    "enabled": true,
    "mode": "local",
    "outputDirectory": "./.agent-traces",
    "retention": {
      "maxAgeDays": 7,
      "maxSizeMB": 500
    },
    "redaction": {
      "enabled": true,
      "patterns": [
        "apiKey",
        "token",
        "password",
        "creditCard"
      ]
    },
    "sampling": {
      "enabled": false,
      "rate": 1.0
    }
  }
}
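If sampling is enabled, one way to interpret a `rate` between 0 and 1 is a deterministic decision per run ID, so that every step of a given run is either traced or skipped as a unit. The sketch below uses a 32-bit FNV-1a hash for that purpose. `shouldTrace` is a hypothetical helper, and how the tracer itself applies the `rate` setting is not specified here.

```typescript
// 32-bit FNV-1a hash of a string, returned as an unsigned integer.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Map the hash into [0, 1) and compare against the sampling rate.
// The same run ID always yields the same decision.
function shouldTrace(runId: string, rate: number): boolean {
  return fnv1a(runId) / 0x100000000 < rate;
}

const runIds = ['run-001', 'run-002', 'run-003', 'run-004'];
const sampled = runIds.filter((id) => shouldTrace(id, 0.5));
console.log(`tracing ${sampled.length} of ${runIds.length} runs`);
```

A rate of 1.0, as in the template, traces every run, and a rate of 0 traces none. Hash-based sampling avoids partial trees, which random per-step sampling would produce.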

Quick Start Guide

Install the Tracer: Add agent-inspect to your project dependencies.
```
npm install agent-inspect
```

Wrap Your Flow: Import inspectRun and step, then wrap your main agent orchestration function.

import { inspectRun, step } from 'agent-inspect';

async function main() {
  return inspectRun('my-agent-run', async () => {
    // Your agent logic here
  });
}

Run and Inspect: Execute your application. After the run, list and view the trace using the CLI.
```
npx agent-inspect list
npx agent-inspect view <run-id>
```
Analyze the Tree: Look for failed nodes, parallel branches, and resource contention. Use the tree structure to identify the root cause of coordination failures.
