nteraction occurs.
import { registerAgent, MemoryScope, TransportAdapter } from '@hot-dev/sdk';
import { createChatTurnExecutor } from 'hot-ai-agent';

// Personal agent: memory follows the authenticated user
const personalAssistant = registerAgent({
  id: 'personal-assistant-v1',
  scope: MemoryScope.IDENTITY,
  transport: new TransportAdapter(),
  config: {
    model: 'claude-3-5-sonnet-20240620',
    streaming: true,
    memoryRetention: 'persistent'
  }
});

// Team agent: memory is bound to the active channel
const workspaceBot = registerAgent({
  id: 'workspace-bot-v1',
  scope: MemoryScope.SESSION,
  transport: new TransportAdapter(),
  config: {
    model: 'claude-3-5-sonnet-20240620',
    streaming: true,
    memoryRetention: 'session-bound'
  }
});
Step 2: Implement Transport-Agnostic Command Routing
Slash commands should be parsed into a neutral shape before reaching the agent logic. This prevents transport-specific formatting from leaking into the execution layer.
import { parseCommand, IncomingMessage } from '@hot-dev/sdk';

function handleIncomingPayload(raw: unknown) {
  const message: IncomingMessage = parseCommand(raw);

  // Extract normalized command structure
  const command = {
    name: message.command?.name ?? 'default',
    argument: message.command?.argument ?? '',
    metadata: {
      userId: message.identity?.id,
      sessionId: message.session?.id,
      timestamp: message.timestamp
    }
  };

  return command;
}
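To make the "neutral shape" concrete, here is a minimal sketch of what the parsing step might do for plain text input. `NeutralCommand` and `parseSlashCommand` are hypothetical names introduced for illustration; they are not part of `@hot-dev/sdk`, and the `/name argument` syntax is an assumption.

```typescript
// Hypothetical neutral command shape (illustrative, not the SDK's type).
interface NeutralCommand {
  name: string;
  argument: string;
}

// Parse raw text such as "/remind buy milk" into a neutral command.
// Text without a leading slash falls back to the 'default' command,
// mirroring the `?? 'default'` fallback used in the handler above.
function parseSlashCommand(text: string): NeutralCommand {
  const trimmed = text.trim();
  const match = /^\/([A-Za-z][\w-]*)(?:\s+([\s\S]*))?$/.exec(trimmed);
  if (!match) {
    return { name: 'default', argument: trimmed };
  }
  return {
    name: match[1].toLowerCase(),
    argument: (match[2] ?? '').trim()
  };
}
```

Because the result carries no transport-specific fields, the same parser can sit behind any adapter that can hand over the message text.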
Step 3: Orchestrate the Chat Turn Lifecycle
The most critical architectural decision is enforcing the correct execution order. The hot-ai-agent package provides a canonical lifecycle function that prevents retrieval contamination.
import { executeChatTurn, ChatTurnConfig } from 'hot-ai-agent';

async function processUserQuery(command: ReturnType<typeof handleIncomingPayload>) {
  const turnConfig: ChatTurnConfig = {
    // A session id means the message came from a shared channel
    agentId: command.metadata.sessionId ? 'workspace-bot-v1' : 'personal-assistant-v1',
    modelProvider: 'anthropic',
    streamingEvents: ['reply:start', 'reply:delta', 'reply:end'],
    mcpTools: ['search_memory', 'fetch_docs']
  };

  // Strict lifecycle: recall -> persist user -> bind request -> stream -> persist assistant
  // Attachments are omitted here: the command built in Step 2 does not carry them.
  const result = await executeChatTurn(turnConfig, {
    userId: command.metadata.userId,
    sessionId: command.metadata.sessionId,
    query: command.argument
  });

  return result;
}
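The ordering that the lifecycle function enforces can be illustrated with a small standalone sketch. Everything here (`InMemoryStore`, `ModelStream`, `runChatTurn`) is a hypothetical stand-in written for this example; it does not show how `hot-ai-agent` is implemented internally.

```typescript
// Hypothetical in-memory stand-ins for the harness's store and model client.
interface StoredMessage {
  role: 'user' | 'assistant';
  content: string;
}

class InMemoryStore {
  private messages: StoredMessage[] = [];

  // Return a snapshot copy so later writes do not alter recalled context.
  recall(): StoredMessage[] {
    return [...this.messages];
  }

  persist(message: StoredMessage): void {
    this.messages.push(message);
  }
}

type ModelStream = (context: StoredMessage[], query: string) => AsyncIterable<string>;

async function runChatTurn(
  store: InMemoryStore,
  query: string,
  model: ModelStream,
  onDelta: (chunk: string) => void = () => {}
): Promise<{ context: StoredMessage[]; reply: string }> {
  // 1. Recall first, so the fresh query is not part of the retrieved context.
  const context = store.recall();
  // 2. Persist the user message.
  store.persist({ role: 'user', content: query });
  // 3. Bind the request: here that is just the recalled context plus the query.
  const stream = model(context, query);
  // 4. Stream the reply, forwarding each delta as it arrives.
  let reply = '';
  for await (const chunk of stream) {
    reply += chunk;
    onDelta(chunk);
  }
  // 5. Persist the assistant output last.
  store.persist({ role: 'assistant', content: reply });
  return { context, reply };
}
```

Swapping steps 1 and 2 would make the current query show up in its own retrieved context, which is the contamination the strict ordering is meant to prevent.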
Step 4: Stream Responses with Stable Event Labels
Streaming must emit consistent event types regardless of whether the response originates from an LLM or a slash command handler. This allows the UI to render deltas uniformly.
import { EventEmitter } from 'events';

const agentStream = new EventEmitter();

agentStream.on('reply:start', (payload) => {
  console.log(`[Stream] Initializing response for ${payload.agentId}`);
});

agentStream.on('reply:delta', (chunk) => {
  // Render partial tokens to UI
  process.stdout.write(chunk.content);
});

agentStream.on('reply:end', (metadata) => {
  console.log(`\n[Stream] Completed. Tokens: ${metadata.usage?.total_tokens}`);
});
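One way to keep the event types identical for LLM output and slash command replies is to route both through a single helper. The sketch below is an assumption about how that could look; `ReplyEvent`, `ReplySink`, and `emitReply` are illustrative names, and a plain callback stands in for the event emitter so the example has no dependencies.

```typescript
// The three stable event labels, modelled as a discriminated union.
type ReplyEvent =
  | { type: 'reply:start'; agentId: string }
  | { type: 'reply:delta'; content: string }
  | { type: 'reply:end'; totalChunks: number };

type ReplySink = (event: ReplyEvent) => void;

// Emit the same start/delta/end envelope for any chunk source:
// an async token stream from a model, or a fixed array from a command handler.
async function emitReply(
  agentId: string,
  chunks: AsyncIterable<string> | Iterable<string>,
  sink: ReplySink
): Promise<void> {
  sink({ type: 'reply:start', agentId });
  let totalChunks = 0;
  for await (const content of chunks) {
    sink({ type: 'reply:delta', content });
    totalChunks += 1;
  }
  sink({ type: 'reply:end', totalChunks });
}
```

A slash command that answers with a single string would call `emitReply(agentId, ['pong'], sink)`, so the UI sees exactly one delta between the start and end events.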
Architecture Rationale
- Transport Abstraction: By normalizing incoming payloads into a neutral IncomingMessage shape, the agent layer remains decoupled from Slack, Telegram, Discord, or web adapters. This keeps the dependency tree minimal and allows swapping frontends without rewriting agent logic.
- Strict Lifecycle Ordering: The recall -> persist user -> bind request -> stream -> persist assistant sequence is non-negotiable. Running retrieval before the user message is persisted ensures the fresh query doesn't contaminate the retrieved context. Binding the request mid-turn guarantees that MCP tools and RAG calls operate within the correct session boundary.
- Event-Driven Command Registration: Each slash command registers as an independent event handler rather than funneling through a centralized dispatcher. This improves testability, enables parallel execution, and simplifies debugging via agent graph visualization.
- Per-Agent State Isolation: Each agent maintains its own state ledger, error queue, and notification buffer. Scheduled jobs fan out per session with error isolation, preventing a single failed background task from crashing the entire agent runtime.
Pitfall Guide
1. Lifecycle Order Inversion
Explanation: Persisting the user message before executing retrieval causes the fresh query to appear in the context window during RAG. This creates circular references and degrades response quality.
Fix: Always invoke retrieval first, then persist the user message, bind the request, stream the response, and finally persist the assistant output. Use the harness's built-in executeChatTurn to enforce this sequence.
2. Session/Identity Collision
Explanation: Mixing user_id and session_id resolution leads to memory leakage. Personal notes may appear in team channels, or team decisions may overwrite user preferences.
Fix: Explicitly declare memory scope at agent registration. Validate that MemoryScope.IDENTITY only resolves user_id, while MemoryScope.SESSION only resolves channel_id or thread_id. Reject payloads that contain mismatched identifiers.
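The validation rule in this fix might be sketched as a single resolver function. The names `Scope`, `ScopedPayload`, and `resolveMemoryKey` are hypothetical, and rejecting any payload that carries the other scope's identifier is one strict reading of "mismatched"; a real deployment may prefer to ignore the extra field instead.

```typescript
type Scope = 'identity' | 'session';

// Hypothetical payload carrying whichever identifiers the transport supplied.
interface ScopedPayload {
  userId?: string;
  channelId?: string;
  threadId?: string;
}

// Resolve the memory key for the agent's declared scope.
// Throws when the required identifier is missing or when an identifier
// belonging to the other scope is present (strict-mode assumption).
function resolveMemoryKey(scope: Scope, payload: ScopedPayload): string {
  // Prefer the thread when both are present, since it is the narrower boundary.
  const sessionId = payload.threadId ?? payload.channelId;

  if (scope === 'identity') {
    if (!payload.userId) throw new Error('identity scope requires userId');
    if (sessionId) throw new Error('identity scope rejects session identifiers');
    return `user:${payload.userId}`;
  }

  if (!sessionId) throw new Error('session scope requires channelId or threadId');
  if (payload.userId) throw new Error('session scope rejects userId');
  return `session:${sessionId}`;
}
```

Keeping the two key prefixes distinct means a bug elsewhere cannot make a user key collide with a channel key in the same store.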
3. Transport Coupling
Explanation: Hardcoding Slack or Telegram message formats directly into agent handlers creates vendor lock-in and forces rewrites when switching platforms.
Fix: Implement a translation layer that converts vendor-specific payloads into the neutral IncomingMessage shape. Keep transport adapters in the application layer, never in the agent harness.
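A translation layer of this kind could look roughly like the sketch below. The `SlackLikeEvent` and `TelegramLikeUpdate` interfaces are simplified assumptions modelled on a small subset of fields those platforms send, trimmed for illustration; `NeutralMessage` is a hypothetical stand-in for the SDK's IncomingMessage shape.

```typescript
// Hypothetical neutral shape consumed by the agent layer.
interface NeutralMessage {
  userId: string;
  sessionId: string;
  text: string;
  timestamp: number; // milliseconds since the Unix epoch
}

// Simplified, assumed subsets of vendor payloads.
interface SlackLikeEvent {
  user: string;
  channel: string;
  text: string;
  ts: string; // seconds with a fractional part, as a string
}

interface TelegramLikeUpdate {
  message: {
    from: { id: number };
    chat: { id: number };
    text: string;
    date: number; // whole seconds
  };
}

function fromSlack(event: SlackLikeEvent): NeutralMessage {
  return {
    userId: event.user,
    sessionId: event.channel,
    text: event.text,
    timestamp: Math.floor(Number(event.ts) * 1000)
  };
}

function fromTelegram(update: TelegramLikeUpdate): NeutralMessage {
  return {
    userId: String(update.message.from.id),
    sessionId: String(update.message.chat.id),
    text: update.message.text,
    timestamp: update.message.date * 1000
  };
}
```

Each adapter is a pure function, so adding a new platform means adding one more function in the application layer without touching agent handlers.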
4. Streaming Backpressure Ignorance
Explanation: Emitting delta events faster than the UI or downstream consumer can process causes dropped tokens, UI flickering, and memory leaks in long-running streams.
Fix: Implement backpressure handling in the stream consumer. Buffer deltas, apply frame-rate limiting for UI updates, and monitor queue depth. Use stable event labels (reply:start, reply:delta, reply:end) to synchronize state.
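As a rough illustration of the buffering and frame-rate limiting described in this fix, the sketch below coalesces deltas and renders them at most once per interval. `DeltaBuffer` is a hypothetical class, the injectable clock exists only to make it testable, and this is consumer-side throttling rather than a signal that slows the producer.

```typescript
// Coalesces streamed deltas and renders at most once per `minIntervalMs`,
// or immediately once `maxPending` deltas have queued up.
class DeltaBuffer {
  private pending: string[] = [];
  private lastFlush = -Infinity;
  private readonly render: (text: string) => void;
  private readonly minIntervalMs: number;
  private readonly maxPending: number;
  private readonly now: () => number;

  constructor(
    render: (text: string) => void,
    minIntervalMs: number,
    maxPending: number,
    now: () => number = () => Date.now()
  ) {
    this.render = render;
    this.minIntervalMs = minIntervalMs;
    this.maxPending = maxPending;
    this.now = now;
  }

  // Queue a delta; flush if a frame is due or the queue is full.
  push(delta: string): void {
    this.pending.push(delta);
    const frameDue = this.now() - this.lastFlush >= this.minIntervalMs;
    if (frameDue || this.pending.length >= this.maxPending) {
      this.flush();
    }
  }

  // Render everything queued so far as one update.
  flush(): void {
    if (this.pending.length === 0) return;
    this.render(this.pending.join(''));
    this.pending = [];
    this.lastFlush = this.now();
  }

  // Current queue depth, useful for monitoring.
  get depth(): number {
    return this.pending.length;
  }
}
```

A consumer would call `push` from its delta handler and `flush` from its end handler so the final partial frame is not lost. A 30 fps target corresponds to a `minIntervalMs` of about 33.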
5. Centralized Dispatch Anti-Pattern
Explanation: Funneling all slash commands through a single switch or if/else block creates a monolithic handler that is difficult to test, scale, or debug.
Fix: Register each command as an independent event handler with explicit on-event annotations. This enables parallel execution, simplifies agent graph visualization, and isolates failures to specific commands.
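The difference from a switch block can be seen in a small registry sketch. `CommandBus`, `CommandHandler`, and `CommandResult` are hypothetical names for this example and do not represent the harness's on-event annotation mechanism; the point is only that each handler is registered separately and fails separately.

```typescript
type CommandHandler = (argument: string) => Promise<string> | string;

type CommandResult =
  | { ok: true; reply: string }
  | { ok: false; error: string };

class CommandBus {
  private handlers = new Map<string, CommandHandler>();

  // Each command registers on its own; no shared switch statement to edit.
  on(name: string, handler: CommandHandler): void {
    if (this.handlers.has(name)) {
      throw new Error(`duplicate command: ${name}`);
    }
    this.handlers.set(name, handler);
  }

  // Run one command. A thrown error is captured in the result,
  // so a failing handler cannot take down the others.
  async emit(name: string, argument: string): Promise<CommandResult> {
    const handler = this.handlers.get(name);
    if (!handler) {
      return { ok: false, error: `unknown command: ${name}` };
    }
    try {
      return { ok: true, reply: await handler(argument) };
    } catch (err) {
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  }
}
```

Because every handler is a plain function, each one can be unit tested without constructing the bus at all.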
6. Unscoped MCP Tool Exposure
Explanation: Exposing agent functions as MCP tools without boundary checks allows external clients (Claude Desktop, Cursor) to bypass memory scoping and access cross-session data.
Fix: Annotate MCP tools with explicit scope restrictions. Validate user_id and session_id inside every tool implementation. Use the harness's per-request session binding to enforce context isolation automatically.
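A boundary check of this kind might be written as a wrapper around each tool implementation. The sketch below is an assumption, not the harness's session binding: `RequestBinding`, `ToolCall`, and `withScope` are hypothetical, and the restriction strings simply reuse the `identity-only` and `session-only` values from the configuration template.

```typescript
// Identifiers the harness bound to the current request (hypothetical shape).
interface RequestBinding {
  userId?: string;
  sessionId?: string;
}

// Arguments an external MCP client supplies when calling a tool.
interface ToolCall {
  userId?: string;
  sessionId?: string;
  query: string;
}

type ToolImpl = (call: ToolCall) => string[];

// Wrap a tool so it only runs when the caller's identifier matches
// the identifier bound to the request for the declared scope.
function withScope(
  restriction: 'identity-only' | 'session-only',
  binding: RequestBinding,
  impl: ToolImpl
): ToolImpl {
  return (call) => {
    const key = restriction === 'identity-only' ? 'userId' : 'sessionId';
    const bound = binding[key];
    if (!bound) {
      throw new Error(`request is not bound to a ${key}`);
    }
    if (call[key] !== bound) {
      throw new Error(`${key} does not match the bound request`);
    }
    return impl(call);
  };
}
```

Putting the check in a wrapper keeps individual tools small, though the fix above still recommends validating inside each implementation as a second line of defence.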
7. Neglecting Error Isolation in Fan-out Jobs
Explanation: Scheduled jobs that iterate over multiple sessions without error isolation cause a single failure to halt the entire batch, leaving other sessions unprocessed.
Fix: Wrap each session iteration in a try/catch block. Log failures to the per-agent error ledger, continue processing remaining sessions, and implement retry logic with exponential backoff for transient errors.
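The isolation and retry logic from this fix could be sketched as follows. `withRetry`, `fanOut`, and `JobFailure` are hypothetical helpers; the returned failure list stands in for the per-agent error ledger, and for simplicity the sketch retries every error rather than classifying which ones are transient.

```typescript
interface JobFailure {
  sessionId: string;
  error: string;
}

type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry a task with exponential backoff: baseDelayMs, 2x, 4x, ...
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts: number,
  baseDelayMs: number,
  sleep: Sleep = defaultSleep
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}

// Run a job for every session; one session's failure never stops the rest.
async function fanOut(
  sessionIds: string[],
  job: (sessionId: string) => Promise<void>,
  maxAttempts = 3,
  baseDelayMs = 100,
  sleep?: Sleep
): Promise<JobFailure[]> {
  const failures: JobFailure[] = [];
  for (const sessionId of sessionIds) {
    try {
      await withRetry(() => job(sessionId), maxAttempts, baseDelayMs, sleep);
    } catch (err) {
      failures.push({ sessionId, error: err instanceof Error ? err.message : String(err) });
    }
  }
  return failures;
}
```

The injectable `sleep` parameter lets tests run without real delays; production code would rely on the default timer-based implementation.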
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Personal productivity assistant | Identity-First Memory | Ensures preferences and notes follow the user across devices and sessions | Low: Single-user storage scales linearly |
| Team channel bot / Support inbox | Session-First Memory | Keeps decisions and context visible to all participants in the thread | Medium: Shared storage requires indexing and access controls |
| Multi-platform deployment (Slack + Web) | Transport-Abstraction Layer | Prevents vendor lock-in and allows swapping adapters without rewriting agent logic | Low: One-time translation layer implementation |
| High-volume streaming UI | Stable Event Labels + Backpressure | Prevents dropped tokens, UI flickering, and memory leaks during long responses | Low: Standard event emitter pattern with buffering |
| External tool integration (Cursor/Claude Desktop) | Scoped MCP Tools | Maintains memory boundaries while exposing agent capabilities to third-party clients | Medium: Requires explicit scope validation per tool |
Configuration Template
// agent.config.ts
import { AgentConfig, MemoryScope, StreamingConfig } from '@hot-dev/sdk';

export const personalAgentConfig: AgentConfig = {
  id: 'personal-assistant-prod',
  scope: MemoryScope.IDENTITY,
  model: 'claude-3-5-sonnet-20240620',
  streaming: {
    enabled: true,
    eventPrefix: 'personal-agent',
    backpressure: {
      bufferSize: 50,
      frameRate: 30
    }
  },
  memory: {
    retention: 'persistent',
    vectorDb: 'pinecone',
    index: 'user-notes-prod'
  },
  mcp: {
    enabled: true,
    scopeRestriction: 'identity-only',
    tools: ['search_personal_memory', 'update_preferences']
  },
  errorHandling: {
    isolation: true,
    retry: { maxAttempts: 3, backoff: 'exponential' }
  }
};

export const teamAgentConfig: AgentConfig = {
  id: 'workspace-bot-prod',
  scope: MemoryScope.SESSION,
  model: 'claude-3-5-sonnet-20240620',
  streaming: {
    enabled: true,
    eventPrefix: 'workspace-bot',
    backpressure: {
      bufferSize: 100,
      frameRate: 60
    }
  },
  memory: {
    retention: 'session-bound',
    vectorDb: 'pinecone',
    index: 'channel-decisions-prod'
  },
  mcp: {
    enabled: true,
    scopeRestriction: 'session-only',
    tools: ['search_channel_history', 'fetch_team_docs']
  },
  errorHandling: {
    isolation: true,
    retry: { maxAttempts: 5, backoff: 'exponential' }
  }
};
Quick Start Guide
- Initialize the project: Install the SDK and agent harness (npm install @hot-dev/sdk hot-ai-agent). Configure your .env with ANTHROPIC_API_KEY and HOT_API_KEY.
- Define agent scopes: Register two agents using the configuration template above, specifying MemoryScope.IDENTITY for personal assistants and MemoryScope.SESSION for team bots.
- Implement transport normalization: Create a handler that converts incoming payloads into IncomingMessage shapes, extracting user_id, session_id, and command arguments.
- Execute chat turns: Use executeChatTurn with the strict lifecycle sequence. Attach streaming event listeners for reply:start, reply:delta, and reply:end.
- Deploy and validate: Run the harness locally, switch between personal and team modes, verify memory isolation, and monitor agent graph visualization for command routing accuracy.