Gotanda Style: Do AI Agents Really Need Meetings?
Stigmergic Orchestration: Decoupling AI Agents via Environmental Signals
Current Situation Analysis
Multi-agent architectures have become the standard for complex AI workflows, but the industry is hitting a scaling wall rooted in coordination overhead. The default pattern for multi-agent systems is a conversational mesh: agents exchange messages to negotiate plans, share context, and hand off tasks. This model works for isolated, short-lived tasks, but it degrades rapidly in long-running maintenance loops.
The fundamental issue is that conversation consumes context. As the number of agents increases, the volume of inter-agent dialogue grows quadratically. Agents spend an increasing proportion of their token budget reading each other's history rather than performing work. In production environments managing large codebases, this manifests as context window saturation, escalating costs, and latency spikes.
This problem is often overlooked because developers extrapolate from single-agent or small-team demos. However, in sustained operations, the "meeting fatigue" of AI agents becomes a hard constraint. Teams maintaining repositories exceeding 100,000 lines of code report that conversational coordination becomes the primary bottleneck for continuous improvement loops. The cost of maintaining shared state through chat logs outweighs the value of the coordination, leading to fragile systems that require frequent manual resets.
The industry needs a coordination primitive that decouples agents from synchronous dialogue, allowing them to operate asynchronously while maintaining a coherent view of the system state.
WOW Moment: Key Findings
The shift from conversational coordination to environmental signaling (stigmergy) fundamentally alters the scaling properties of multi-agent systems. When message passing is replaced with shared state updates, coordination cost grows linearly with the number of agents, and memory persists in the signal store instead of consuming context window.
The following comparison illustrates the operational differences between a conversational mesh and a stigmergic signal-based architecture:
| Metric | Conversational Mesh | Stigmergic Signal-Based |
|---|---|---|
| Context Scaling | O(N²) per coordination cycle | O(N) signal deposits |
| Memory Persistence | Volatile (truncated by context limit) | Persistent (decay-based retention) |
| Agent Coupling | Tight (requires protocol alignment) | Loose (schema-only dependency) |
| Latency Profile | Synchronous blocking | Asynchronous eventual consistency |
| Cost Driver | Token volume per message | Storage I/O and aggregation compute |
| Conflict Detection | Implicit in dialogue | Explicit via signal variance |
Why this matters: Stigmergic orchestration enables systems to accumulate weak signals over time, detect structural drift, and prioritize work based on aggregated evidence rather than immediate urgency. This pattern allows organizations to run continuous maintenance loops on large repositories where conversational agents would exhaust their context windows before completing a single remediation cycle.
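The Context Scaling row in the comparison above can be made concrete with a little arithmetic. The sketch below assumes a full mesh in which every pair of agents may need to exchange context, and a signal-based design in which each agent only writes to the shared store. The function names are illustrative and not taken from any library.

```typescript
// Number of distinct agent pairs in a full conversational mesh: N * (N - 1) / 2.
function meshChannels(agentCount: number): number {
  return (agentCount * (agentCount - 1)) / 2;
}

// In a signal-based design each agent has one write path to the shared store,
// so the number of paths equals the number of agents.
function signalWritePaths(agentCount: number): number {
  return agentCount;
}

const comparison = [4, 8, 16].map((n) => ({
  agents: n,
  meshChannels: meshChannels(n),
  signalWritePaths: signalWritePaths(n),
}));
// 4 agents: 6 pairs vs 4 write paths; 16 agents: 120 pairs vs 16 write paths.
```

Quadrupling the agent count from 4 to 16 multiplies the number of potential conversations by 20, while the number of write paths only quadruples.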
Core Solution
Stigmergy is a coordination mechanism where agents modify a shared environment, and those modifications trigger subsequent actions by other agents. In software terms, this replaces direct API calls or message queues with a structured signal store. Agents deposit traces indicating observations, and consumers read aggregated traces to determine actions.
Architecture Overview
The system comprises three core components:
- Signal Schema: A strict contract defining the structure of environmental traces.
- Signal Store: A persistent backend that ingests, decays, and aggregates signals.
- Agent Workers: Specialized agents that deposit signals based on observations or consume aggregated signals to trigger workflows.
Implementation Details
1. Signal Schema Definition
Signals must be structured to support aggregation and decay. The schema includes metadata for source attribution, target location, intensity weighting, and temporal decay.
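```typescript
interface SignalSchema {
  id: string;          // unique identifier for this trace
  target: string;      // location the signal refers to, e.g. a file path
  source: string;      // agent or worker that deposited the signal
  category: string;    // signal type, e.g. 'runtime_error'
  intensity: number;   // signed weight; the sign separates supporting from opposing evidence
  timestamp: number;   // deposit time in epoch milliseconds
  halfLifeMs: number;  // time for the intensity to decay by half
  metadata: Record<string, string | number>;
}
```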
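To make the decay fields concrete, the following sketch computes a signal's effective intensity at a given moment, assuming exponential half-life decay. The helper name `decayedIntensity` and the injected `nowMs` parameter are illustrative choices, made so the numbers are reproducible.

```typescript
// Only the fields needed for decay; the full schema carries more.
interface DecayingSignal {
  intensity: number;
  timestamp: number; // epoch milliseconds at deposit time
  halfLifeMs: number;
}

// Exponential decay: the intensity halves every halfLifeMs.
// `nowMs` is passed in rather than read from the clock so results are reproducible.
function decayedIntensity(signal: DecayingSignal, nowMs: number): number {
  const ageMs = Math.max(0, nowMs - signal.timestamp);
  return signal.intensity * Math.pow(0.5, ageMs / signal.halfLifeMs);
}

const DAY_MS = 24 * 60 * 60 * 1000;
const sample: DecayingSignal = { intensity: 4.0, timestamp: 0, halfLifeMs: 14 * DAY_MS };

const atDeposit = decayedIntensity(sample, 0);                   // 4.0
const afterOneHalfLife = decayedIntensity(sample, 14 * DAY_MS);  // 2.0
const afterTwoHalfLives = decayedIntensity(sample, 28 * DAY_MS); // 1.0
```

A signal with a 14-day half-life therefore still contributes a quarter of its original weight after four weeks.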
2. Signal Store and Aggregation
The store manages signal lifecycle. Critical to this pattern is the separation of positive and negative intensities during aggregation. Summing signals blindly creates a "zero-sum illusion" where conflicting observations cancel out, masking contested areas that require human intervention.
```typescript
class SignalAggregator {
  // Signals grouped by "target::category".
  private signals: Map<string, SignalSchema[]> = new Map();
  // Distinct targets seen so far, so consumers can enumerate them.
  private targets: Set<string> = new Set();

  deposit(signal: SignalSchema): void {
    const key = `${signal.target}::${signal.category}`;
    const existing = this.signals.get(key) || [];
    existing.push(signal);
    this.signals.set(key, existing);
    this.targets.add(signal.target);
  }

  // Lists every target that has received at least one signal.
  getAllTargets(): string[] {
    return Array.from(this.targets);
  }

  getAggregatedState(target: string, category: string): AggregatedState {
    const key = `${target}::${category}`;
    const rawSignals = this.signals.get(key) || [];
    const now = Date.now();

    // Ignore signals older than four half-lives (less than 6.25% of their original intensity).
    const activeSignals = rawSignals.filter(s =>
      now - s.timestamp < s.halfLifeMs * 4
    );

    let positiveIntensity = 0;
    let negativeIntensity = 0;
    const sources = new Set<string>();

    for (const signal of activeSignals) {
      // Exponential decay: the intensity halves every halfLifeMs.
      const decayedIntensity = signal.intensity *
        Math.pow(0.5, (now - signal.timestamp) / signal.halfLifeMs);

      // Accumulate each sign separately so opposing signals do not cancel out.
      if (decayedIntensity > 0) {
        positiveIntensity += decayedIntensity;
      } else {
        negativeIntensity += Math.abs(decayedIntensity);
      }
      sources.add(signal.source);
    }

    return {
      target,
      category,
      positiveIntensity,
      negativeIntensity,
      netIntensity: positiveIntensity - negativeIntensity,
      isContested: positiveIntensity > 0.5 && negativeIntensity > 0.5,
      sourceCount: sources.size,
      lastUpdated: now
    };
  }
}

interface AggregatedState {
  target: string;
  category: string;
  positiveIntensity: number;
  negativeIntensity: number;
  netIntensity: number;
  isContested: boolean;
  sourceCount: number;
  lastUpdated: number;
}
```
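The effect of keeping the two accumulators apart is easiest to see on a tiny input. The sketch below is a simplified, standalone tally rather than the store itself: decay is left out so the arithmetic stays visible, and the source name `owner-directive` is a made-up example of an agent that deposits "leave this alone" signals.

```typescript
interface Observation {
  source: string;
  intensity: number; // positive = "needs attention", negative = "leave alone"
}

interface Tally {
  positive: number;
  negative: number;
  net: number;
  contested: boolean;
}

// Sum positive and negative intensities separately (decay omitted for clarity).
function tally(observations: Observation[], threshold = 0.5): Tally {
  let positive = 0;
  let negative = 0;
  for (const o of observations) {
    if (o.intensity > 0) positive += o.intensity;
    else negative += Math.abs(o.intensity);
  }
  return {
    positive,
    negative,
    net: positive - negative,
    contested: positive > threshold && negative > threshold,
  };
}

const quiet = tally([]);
const disputed = tally([
  { source: "sentry-observer", intensity: 3.0 },
  { source: "owner-directive", intensity: -3.0 },
]);
// Both results have net === 0, but only `disputed` has contested === true.
```

A consumer that looked only at `net` could not tell these two cases apart, which is why the aggregated state exposes both sides.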
3. Worker Implementation
Workers operate independently. Observers deposit signals based on external data; synthesizers read aggregated states to trigger workflows.
```typescript
// Assumed shapes, inferred from how the fields are used below; adapt them to
// your own alert payload and issue-tracker client.
interface SentryAlert {
  file_path: string;
  severity: number;
  error_type: string;
  environment: string;
}

interface GitHubClient {
  createReviewIssue(issue: { title: string; body: string; labels: string[] }): Promise<void>;
  createRemediationIssue(issue: { title: string; priority: 'high' | 'medium' }): Promise<void>;
}

class ObservabilityWorker {
  constructor(private store: SignalAggregator) {}

  async processAlert(alert: SentryAlert): Promise<void> {
    const signal: SignalSchema = {
      id: crypto.randomUUID(), // global Web Crypto API (browsers, recent Node.js)
      target: alert.file_path,
      source: 'sentry-observer',
      category: 'runtime_error',
      intensity: alert.severity * 2.0,
      timestamp: Date.now(),
      halfLifeMs: 14 * 24 * 60 * 60 * 1000, // 14 days
      metadata: {
        error_type: alert.error_type,
        environment: alert.environment
      }
    };
    this.store.deposit(signal);
  }
}

class RemediationSynthesizer {
  constructor(
    private store: SignalAggregator,
    private issueClient: GitHubClient
  ) {}

  async evaluateHotspots(): Promise<void> {
    const targets = this.store.getAllTargets();
    for (const target of targets) {
      const state = this.store.getAggregatedState(target, 'runtime_error');
      if (state.isContested) {
        // Escalate to human review
        await this.issueClient.createReviewIssue({
          title: `Contested Signal: ${target}`,
          body: `Positive: ${state.positiveIntensity}, Negative: ${state.negativeIntensity}`,
          labels: ['needs-human-review']
        });
      } else if (state.positiveIntensity > 3.0) {
        // Auto-trigger remediation
        await this.issueClient.createRemediationIssue({
          title: `Fix runtime errors in ${target}`,
          priority: state.positiveIntensity > 5.0 ? 'high' : 'medium'
        });
      }
    }
  }
}
```
Architecture Rationale
- Decoupling: Agents interact only with the signal schema. Adding a new observer requires no changes to existing consumers.
- Persistence: Signals persist beyond agent lifecycles, enabling accumulation of evidence over days or weeks.
- Decay Mechanism: Half-life parameters ensure that stale signals fade, preventing historical noise from dominating current decisions.
- Contested Zone Detection: By tracking positive and negative intensities separately, the system identifies areas where agents disagree, routing these to human judgment rather than automated action.
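The decoupling claim can be illustrated with a deliberately small model. In the sketch below, a second observer is added after the consumer was written, and the consumer needs no change. The in-memory store, the `ci-observer` source, the `test_failure` category, and the single summed threshold are simplifying assumptions for illustration only; a real consumer would aggregate per category and track signs separately as described above.

```typescript
interface Signal {
  target: string;
  source: string;
  category: string;
  intensity: number;
}

// Minimal in-memory stand-in for the signal store.
class InMemoryStore {
  private readonly signals: Signal[] = [];

  deposit(signal: Signal): void {
    this.signals.push(signal);
  }

  totalFor(target: string): number {
    return this.signals
      .filter((s) => s.target === target)
      .reduce((sum, s) => sum + s.intensity, 0);
  }
}

// Existing consumer: it knows only the store, not who deposited what.
function needsAttention(store: InMemoryStore, target: string): boolean {
  return store.totalFor(target) > 3.0;
}

// Existing observer.
function reportRuntimeError(store: InMemoryStore, file: string): void {
  store.deposit({ target: file, source: "sentry-observer", category: "runtime_error", intensity: 2.0 });
}

// Observer added later; the consumer above is untouched.
function reportFlakyTest(store: InMemoryStore, file: string): void {
  store.deposit({ target: file, source: "ci-observer", category: "test_failure", intensity: 1.5 });
}

const store = new InMemoryStore();
reportRuntimeError(store, "src/billing.ts");
const before = needsAttention(store, "src/billing.ts"); // false: 2.0 is not above 3.0
reportFlakyTest(store, "src/billing.ts");
const after = needsAttention(store, "src/billing.ts");  // true: 3.5 is above 3.0
```

Neither observer references the other or the consumer; the only shared dependency is the signal shape.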
Pitfall Guide
1. The Zero-Sum Illusion
Explanation: Aggregating signals by simple summation causes positive and negative observations to cancel out. A file with strong error reports and strong "do not fix" directives may show a net intensity of zero, appearing as a quiet area when it is actually contested.
Fix: Always maintain separate accumulators for positive and negative intensities. Implement explicit contested zone detection logic that triggers human escalation when both thresholds are exceeded.
2. Schema Drift and Incompatibility
Explanation: As new agents are added, they may introduce signals with incompatible structures or semantics, breaking aggregation logic.
Fix: Enforce strict schema validation at the store boundary. Implement versioned signal schemas and a migration strategy. Reject signals that do not conform to the current schema version.
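One way to enforce this at the store boundary is a small validation function that runs before every deposit. The sketch below is an assumption-laden example: the `schemaVersion` field is hypothetical (the schema shown earlier has no version field), and the list of required fields is borrowed from the configuration template later in this article.

```typescript
const CURRENT_SCHEMA_VERSION = "1.0";
const REQUIRED_FIELDS = ["target", "source", "category", "intensity", "halfLifeMs"] as const;

type ValidationResult = { ok: true } | { ok: false; reason: string };

// Validate an untrusted payload before it is allowed into the store.
function validateSignal(payload: Record<string, unknown>): ValidationResult {
  // `schemaVersion` is a hypothetical field added for this example.
  if (payload.schemaVersion !== CURRENT_SCHEMA_VERSION) {
    return { ok: false, reason: `unsupported schema version: ${String(payload.schemaVersion)}` };
  }
  for (const field of REQUIRED_FIELDS) {
    if (payload[field] === undefined || payload[field] === null) {
      return { ok: false, reason: `missing field: ${field}` };
    }
  }
  const { intensity, halfLifeMs } = payload;
  if (typeof intensity !== "number" || !Number.isFinite(intensity)) {
    return { ok: false, reason: "intensity must be a finite number" };
  }
  if (typeof halfLifeMs !== "number" || !(halfLifeMs > 0)) {
    return { ok: false, reason: "halfLifeMs must be a positive number" };
  }
  return { ok: true };
}
```

Returning a reason string instead of throwing lets the store log rejected deposits per source, which helps spot an agent that has drifted from the schema.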
3. Improper Decay Configuration
Explanation: Setting half-life too short causes signals to vanish before sufficient evidence accumulates. Setting it too long causes stale issues to dominate the signal space, masking new problems.
Fix: Calibrate half-life based on the domain. Runtime errors may warrant shorter decay (e.g., 7 days) than architectural drift (e.g., 30 days). Implement dynamic decay that adjusts based on signal frequency.
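When calibrating, it can help to ask how long a single signal stays relevant under a candidate half-life. The sketch below solves the exponential decay equation for the time at which a signal drops to a chosen floor. The floor of 0.5 reuses the contested threshold from the aggregation code as an example cut-off, and the function name is illustrative.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Time until a signal deposited with `intensity` decays to `floor`,
// obtained by solving intensity * 0.5^(t / halfLifeMs) = floor for t.
function timeToFloorMs(intensity: number, floor: number, halfLifeMs: number): number {
  const magnitude = Math.abs(intensity);
  if (magnitude <= floor) return 0; // already at or below the floor
  return halfLifeMs * Math.log2(magnitude / floor);
}

// Intensity 4.0 with a 7-day half-life: three halvings to reach 0.5, about 21 days.
const runtimeErrorDays = timeToFloorMs(4.0, 0.5, 7 * DAY_MS) / DAY_MS;
// Intensity 2.0 with a 30-day half-life: two halvings to reach 0.5, about 60 days.
const driftDays = timeToFloorMs(2.0, 0.5, 30 * DAY_MS) / DAY_MS;
```

If the resulting window is shorter than the interval at which corroborating signals typically arrive, the half-life is probably too short for that category.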
4. Signal Sprawl and Storage Bloat
Explanation: Unrestricted signal deposition can lead to excessive storage usage and slow aggregation queries.
Fix: Implement signal deduplication and compression. Aggregate signals periodically into snapshots and purge raw signals older than the maximum half-life. Use spatial indexing for location-based queries.
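A compaction pass along these lines might look like the sketch below. It is one possible approach, not a prescribed algorithm: it drops signals past a retention cutoff and merges signals that share target, category, source, and sign. Merging keeps the latest timestamp, which slightly overstates the weight of older deposits, so treat it as a lossy approximation.

```typescript
interface StoredSignal {
  target: string;
  category: string;
  source: string;
  intensity: number;
  timestamp: number; // epoch milliseconds
}

// Drop expired signals, then merge duplicates. Positive and negative signals
// are kept in separate buckets so compaction never cancels opposing evidence.
function compact(signals: StoredSignal[], nowMs: number, retentionMs: number): StoredSignal[] {
  const merged = new Map<string, StoredSignal>();
  for (const s of signals) {
    if (nowMs - s.timestamp > retentionMs) continue; // past retention
    const key = JSON.stringify([s.target, s.category, s.source, Math.sign(s.intensity)]);
    const prev = merged.get(key);
    if (prev === undefined) {
      merged.set(key, { ...s }); // copy so the input array is not mutated
    } else {
      prev.intensity += s.intensity;
      prev.timestamp = Math.max(prev.timestamp, s.timestamp);
    }
  }
  return Array.from(merged.values());
}
```

Running this on a schedule bounds the number of stored entries per target by the number of distinct category, source, and sign combinations.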
5. Ignoring Structural Attractors
Explanation: Signals may accumulate in ways that reinforce bad patterns. If agents consistently deposit signals that encourage deviation from architectural standards, the codebase drifts toward a poor attractor state.
Fix: Include structural constraints in signal metadata. Synthesizers should penalize signals that violate architectural boundaries, even if intensity is high. Implement "attractor guards" that filter or downweight signals conflicting with design principles.
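An attractor guard can be modeled as a function that returns a weight multiplier for each signal. The sketch below is purely illustrative: the metadata key `suggested_import_layer`, the `src/ui/` path convention, and the 0.1 penalty are invented for the example and would need to reflect a project's actual architectural rules.

```typescript
interface WeightedSignal {
  target: string;
  intensity: number;
  metadata: Record<string, string | number>;
}

// A guard returns a multiplier in [0, 1]; 1 leaves the signal untouched.
type AttractorGuard = (signal: WeightedSignal) => number;

// Hypothetical rule: downweight signals whose suggested change would make a
// UI module import from the database layer.
const noUiToDbImports: AttractorGuard = (signal) =>
  signal.metadata["suggested_import_layer"] === "db" && signal.target.startsWith("src/ui/")
    ? 0.1
    : 1;

// Apply every guard in turn and return the adjusted intensity.
function applyGuards(signal: WeightedSignal, guards: AttractorGuard[]): number {
  return guards.reduce((intensity, guard) => intensity * guard(signal), signal.intensity);
}
```

Because guards multiply rather than delete, a penalized signal still leaves a trace that a human reviewer can inspect.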
6. Over-Aggregation and Loss of Granularity
Explanation: Aggregating signals too aggressively can obscure root causes. A hotspot in a module may hide specific line-level issues that require targeted fixes.
Fix: Maintain hierarchical aggregation. Support queries at multiple granularities (file, function, line). Ensure synthesizers can drill down into aggregated states to identify precise remediation targets.
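Hierarchical aggregation can be sketched as a roll-up over a structured location. The example below assumes signals carry separate `file`, `fn`, and `line` fields, which differs from the single `target` string in the schema shown earlier, and it sums raw intensities without decay or sign separation to keep the focus on granularity.

```typescript
interface LocatedSignal {
  file: string;
  fn?: string;
  line?: number;
  intensity: number;
}

type Granularity = "file" | "function" | "line";

// Build the bucket key for a signal at the requested level of detail.
function bucketKey(s: LocatedSignal, level: Granularity): string {
  if (level === "file") return s.file;
  if (level === "function") return `${s.file}#${s.fn ?? "?"}`;
  return `${s.file}#${s.fn ?? "?"}:${s.line ?? "?"}`;
}

// Sum intensity per bucket at the requested granularity.
function rollUp(signals: LocatedSignal[], level: Granularity): Map<string, number> {
  const totals = new Map<string, number>();
  for (const s of signals) {
    const key = bucketKey(s, level);
    totals.set(key, (totals.get(key) ?? 0) + s.intensity);
  }
  return totals;
}

const located: LocatedSignal[] = [
  { file: "src/billing.ts", fn: "chargeCard", line: 42, intensity: 2.0 },
  { file: "src/billing.ts", fn: "chargeCard", line: 42, intensity: 1.5 },
  { file: "src/billing.ts", fn: "refund", line: 90, intensity: 0.5 },
];

const byFile = rollUp(located, "file"); // one bucket totalling 4.0
const byLine = rollUp(located, "line"); // two buckets: 3.5 and 0.5
```

The file-level view flags `src/billing.ts` as a hotspot, while the line-level view shows that most of the weight sits on one line of `chargeCard`.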
7. Lack of Human-in-the-Loop Escalation
Explanation: Automated systems may attempt to act on contested zones or high-variance signals, causing disruptive changes.
Fix: Define clear escalation thresholds. Route contested zones, high-impact signals, and signals from untrusted sources to human review queues. Implement approval workflows for automated actions on critical paths.
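These routing rules can be written as a single pure function, which makes them easy to test before any automation is switched on. The sketch below is one possible encoding: the `onCriticalPath` flag and the `trustedSources` set are assumed inputs, and the thresholds of 0.5 and 3.0 are the ones used elsewhere in this article.

```typescript
interface HotspotState {
  positiveIntensity: number;
  negativeIntensity: number;
  sources: string[];
  onCriticalPath: boolean; // assumed to be supplied by the caller
}

type Route = "human_review" | "auto_remediate" | "no_action";

interface EscalationPolicy {
  contestedThreshold: number;
  autoRemediateThreshold: number;
  trustedSources: ReadonlySet<string>;
}

function route(state: HotspotState, policy: EscalationPolicy): Route {
  const contested =
    state.positiveIntensity > policy.contestedThreshold &&
    state.negativeIntensity > policy.contestedThreshold;
  const hasUntrustedSource = state.sources.some((s) => !policy.trustedSources.has(s));
  const actionable = state.positiveIntensity > policy.autoRemediateThreshold;

  if (contested) return "human_review";
  if (!actionable) return "no_action";
  // Actionable but risky: untrusted evidence or a critical path needs approval.
  if (hasUntrustedSource || state.onCriticalPath) return "human_review";
  return "auto_remediate";
}

const defaultPolicy: EscalationPolicy = {
  contestedThreshold: 0.5,
  autoRemediateThreshold: 3.0,
  trustedSources: new Set(["sentry-observer"]),
};
```

Automated remediation is the last branch on purpose: every earlier check has to pass before the system acts without a human.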
Production Bundle
Action Checklist
- Define Signal Schema: Establish a strict, versioned schema for all signals including target, source, intensity, and decay parameters.
- Implement Signal Store: Deploy a persistent store with support for deposition, decay calculation, and hierarchical aggregation.
- Configure Decay Rates: Set half-life values based on signal category and business criticality. Validate decay behavior in staging.
- Build Aggregation Logic: Implement separate tracking for positive and negative intensities. Add contested zone detection.
- Develop Observer Workers: Create agents that deposit signals based on observability data, test results, and static analysis.
- Create Synthesizer Workers: Build consumers that evaluate aggregated states and trigger workflows based on thresholds and escalation rules.
- Add Human Escalation: Integrate with issue tracking systems to route contested zones and high-impact signals for human review.
- Monitor Costs and Latency: Track token usage, storage I/O, and aggregation latency. Optimize query patterns and storage retention.
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Real-time debugging | Conversational Mesh | Requires immediate consensus and rapid context sharing. | Higher token cost per incident. |
| Long-term maintenance | Stigmergic Signals | Accumulates evidence over time; decouples agents. | Lower token cost; higher storage cost. |
| Exploratory brainstorming | Conversational Mesh | Benefits from dynamic idea exchange and iteration. | Moderate token cost. |
| Multi-source monitoring | Stigmergic Signals | Aggregates diverse observations without coordination overhead. | Low coordination cost; scalable. |
| Critical path remediation | Stigmergic + Human Escalation | Ensures contested or high-impact changes receive review. | Adds human latency; reduces risk. |
Configuration Template
```yaml
signal_schema:
  version: "1.0"
  required_fields:
    - target
    - source
    - category
    - intensity
    - half_life_ms
  categories:
    runtime_error:
      default_half_life_ms: 1209600000 # 14 days
      intensity_range: [-5.0, 5.0]
      aggregation: weighted_sum
    architectural_drift:
      default_half_life_ms: 2592000000 # 30 days
      intensity_range: [-3.0, 3.0]
      aggregation: weighted_sum
      requires_structural_check: true
aggregation_rules:
  contested_threshold:
    positive: 0.5
    negative: 0.5
  escalation:
    contested: human_review
    high_impact: human_review
    auto_remediate_threshold: 3.0
storage:
  retention_policy:
    raw_signals_days: 60
    snapshot_interval_hours: 24
  indexing:
    - target
    - category
    - source
```
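The per-category settings in this template could be applied at deposit time roughly as follows. The sketch mirrors the two categories as a TypeScript object instead of parsing YAML, and it makes two assumptions the template does not state: out-of-range intensities are clamped rather than rejected, and the default half-life is used only when the depositor supplies none.

```typescript
interface CategoryConfig {
  defaultHalfLifeMs: number;
  intensityRange: [number, number];
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Values copied from the template; key names are camelCased for TypeScript.
const categories: Record<string, CategoryConfig> = {
  runtime_error: { defaultHalfLifeMs: 14 * DAY_MS, intensityRange: [-5.0, 5.0] },
  architectural_drift: { defaultHalfLifeMs: 30 * DAY_MS, intensityRange: [-3.0, 3.0] },
};

// Clamp the intensity to the category's range and fill in the default half-life.
function normalize(category: string, intensity: number, halfLifeMs?: number) {
  const cfg = categories[category];
  if (cfg === undefined) throw new Error(`unknown category: ${category}`);
  const [min, max] = cfg.intensityRange;
  return {
    intensity: Math.min(max, Math.max(min, intensity)),
    halfLifeMs: halfLifeMs ?? cfg.defaultHalfLifeMs,
  };
}
```

Clamping keeps a single miscalibrated observer from dominating a category; rejecting out-of-range values is the stricter alternative.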
Quick Start Guide
- Initialize Signal Store: Deploy the signal store using the provided configuration. Ensure persistence and indexing are configured.

  ```bash
  npm install @codcompass/signal-store
  signal-store init --config signal-config.yaml
  ```

- Define Observer Worker: Create a worker that deposits signals based on your observability data.

  ```typescript
  const worker = new ObservabilityWorker(store);
  worker.on('alert', (alert) => worker.processAlert(alert));
  ```

- Deploy Synthesizer: Run the synthesizer to evaluate aggregated states and trigger workflows.

  ```typescript
  const synthesizer = new RemediationSynthesizer(store, githubClient);
  synthesizer.scheduleEvaluation(cron('0 */6 * * *'));
  ```

- Validate Escalation: Inject test signals to verify contested zone detection and human escalation routing.

  ```bash
  signal-store inject --target test_file.py --category runtime_error --intensity 2.0
  signal-store inject --target test_file.py --category runtime_error --intensity -2.0
  # Verify human review issue is created
  ```

- Monitor and Tune: Review aggregation metrics and adjust decay rates or thresholds based on operational feedback.

  ```bash
  signal-store metrics --category runtime_error --window 7d
  ```
