AgentGuard 0.3.0: macOS menu bar app, Telegram rollback, and more
Securing AI Coding Workflows: File Integrity Monitoring for CLI Agents
Current Situation Analysis
The rapid adoption of terminal-based AI coding agents has fundamentally changed how developers interact with codebases. Tools like Claude Code, Codex, and Copilot CLI now operate with broad filesystem access, executing multi-step refactors, dependency updates, and configuration changes autonomously. While this accelerates development velocity, it introduces a critical blind spot: unattended file mutation.
When a developer steps away from their machine, an AI agent can silently overwrite environment variables, modify CI/CD pipelines, or alter persistent instruction files. Traditional development workflows assume human oversight for every file change. AI agents break that assumption by operating at machine speed, often without explicit confirmation dialogs.
This problem is frequently overlooked because the industry has focused heavily on agent capabilities rather than safety boundaries. Most teams treat CLI agents as extensions of their own terminal session, assuming that if a command runs successfully, the output is correct. In reality, LLMs can misinterpret context, apply changes to the wrong directory, or chain operations that cascade into unintended modifications. File integrity monitoring exists in enterprise environments, but it lacks the contextual awareness needed for AI-driven workflows. Generic watchers like chokidar or fsnotify trigger on every write, creating noise without distinguishing between a developer's intentional edit and an agent's autonomous operation.
The gap becomes critical when considering persistent agent memory files. Modern CLI agents rely on markdown-based instruction files (e.g., CLAUDE.md, .cursorrules, .hermes/, Aider configs) to maintain context across sessions. If an agent accidentally modifies these files, it can poison its own future behavior, leading to compounding errors that are difficult to trace. Without a dedicated monitoring layer, developers are left reacting to broken builds or misconfigured environments long after the damage occurs.
WOW Moment: Key Findings
Transitioning from passive file watching to AI-agent-aware monitoring reveals a stark difference in operational safety. The table below compares traditional filesystem monitoring against an agent-governance approach:
| Approach | Context Awareness | Rollback Capability | Memory File Protection | Notification Granularity | Agent Compatibility |
|---|---|---|---|---|---|
| Traditional File Watcher | None (blind to source) | Manual only | Not tracked | All events (high noise) | Universal but unfiltered |
| AI-Agent Monitor | Agent-aware routing | Automated/Approved | Dedicated tracking | Severity-based (HIGH/CRITICAL) | Optimized for CLI agents |
This finding matters because it shifts file monitoring from a logging utility to an active governance layer. By classifying files by sensitivity and routing events through approval workflows, developers can maintain autonomy while establishing a safety net. The ability to approve or roll back changes via external channels (like Telegram) means protection extends beyond the terminal, covering periods when developers are away from their machines. This transforms reactive debugging into proactive risk mitigation.
Core Solution
Building an AI-agent file integrity monitor requires a daemon-based architecture that decouples file observation from event handling. The system must track writes, classify them by sensitivity, route them through notification channels, and provide controlled rollback mechanisms.
Architecture Decisions
- File Watcher Over Process Interception: Attempting to intercept commands at the process level requires hooking into TUI frameworks or Rust binaries, which breaks frequently with agent updates. A filesystem watcher operates at the OS level, remains stable across agent versions, and captures the actual state change regardless of how it was triggered.
- Severity-Based Event Routing: Not all file changes require intervention. Grouping files into LOW, MEDIUM, HIGH, and CRITICAL tiers allows the system to suppress noise while escalating sensitive modifications.
- External Approval Channels: Relying solely on terminal prompts fails when developers are away from their desks. Integrating with messaging platforms enables asynchronous governance without blocking the agent's workflow.
- Memory File Isolation: Agent instruction files require separate tracking because they influence future behavior. Modifying them without oversight can corrupt the agent's contextual baseline.
Implementation Walkthrough
1. Configuration Schema
The monitor relies on a structured configuration that defines watch paths, sensitivity tiers, and notification routing.
interface AgentGuardConfig {
  watchPaths: string[];
  sensitivityRules: SensitivityRule[];
  notifications: NotificationConfig;
  daemon: DaemonSettings;
}

interface SensitivityRule {
  pattern: string;
  severity: 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL';
  requiresApproval: boolean;
  rollbackEnabled: boolean;
}

interface NotificationConfig {
  telegram: {
    botToken: string;
    chatId: string;
    webhookPort: number;
  };
  macOS: {
    notifyHigh: boolean;
    notifyCritical: boolean;
  };
}

interface DaemonSettings {
  logLevel: 'info' | 'debug' | 'warn';
  reportInterval: number; // days
  maxEventHistory: number;
}
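To make the role of sensitivityRules concrete, here is a minimal sketch of how a changed path might be matched against the rule list. The globToRegExp and matchSensitivityRule helpers are illustrative assumptions rather than AgentGuard's actual implementation; the sketch supports only a small glob subset and assumes the first matching rule wins.

```typescript
type Severity = 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL';

interface SensitivityRule {
  pattern: string;
  severity: Severity;
  requiresApproval: boolean;
  rollbackEnabled: boolean;
}

// Hypothetical helper: converts a small glob subset ("**/", "**", "*")
// into an anchored RegExp. A real project would likely use a glob library.
function globToRegExp(glob: string): RegExp {
  let source = '';
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith('**/', i)) {
      source += '(?:.*/)?'; // zero or more leading directories
      i += 3;
    } else if (glob.startsWith('**', i)) {
      source += '.*'; // anything, including "/"
      i += 2;
    } else if (glob.charAt(i) === '*') {
      source += '[^/]*'; // anything within a single path segment
      i += 1;
    } else {
      // Escape regex metacharacters in literal characters.
      source += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, '\\$&');
      i += 1;
    }
  }
  return new RegExp('^' + source + '$');
}

// Hypothetical helper: returns the first rule whose pattern matches the path.
// Paths are normalized to forward slashes with any leading "./" removed.
function matchSensitivityRule(
  filePath: string,
  rules: SensitivityRule[],
): SensitivityRule | undefined {
  const normalized = filePath.replace(/\\/g, '/').replace(/^\.\//, '');
  return rules.find((rule) => globToRegExp(rule.pattern).test(normalized));
}
```

Because the first match wins in this sketch, more specific or more severe rules would need to be listed before broader ones.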
2. Telegram Webhook Handler
When a sensitive file changes, the daemon posts a message with inline buttons. The webhook processes approval or rollback requests.
import { Router } from 'express';
import { TelegramBotAPI } from './telegram-client';
// EventStore and FileRollbackService are assumed to be provided elsewhere in the project.

const router = Router();
const bot = new TelegramBotAPI(process.env.TELEGRAM_BOT_TOKEN!);

// Handles inline-button presses; assumes JSON body parsing is enabled upstream.
router.post('/webhook/agent-guard', async (req, res) => {
  const { callback_query } = req.body;
  // Updates that are not button presses are acknowledged and ignored.
  if (!callback_query) return res.status(200).send('OK');

  const { data, message } = callback_query;
  // Callback data has the form "<action>:<eventId>"; guard against a missing payload.
  const [action, eventId] = String(data ?? '').split(':');

  try {
    if (action === 'approve') {
      await bot.answerCallbackQuery(callback_query.id, 'Change approved');
      await EventStore.markApproved(eventId);
    } else if (action === 'rollback') {
      await bot.answerCallbackQuery(callback_query.id, 'Rolling back...');
      await FileRollbackService.restore(eventId);
      await bot.editMessageText('Rollback completed', {
        chat_id: message.chat.id,
        message_id: message.message_id,
      });
    }
  } catch (err) {
    console.error('Webhook processing failed:', err);
    await bot.answerCallbackQuery(callback_query.id, 'Action failed');
  }

  res.status(200).send('OK');
});

export default router;
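Since the callback payload arrives from outside the daemon, it is worth validating before acting on it. The following is a minimal sketch, assuming the "action:eventId" format used above and UUID event IDs; parseCallbackData is a hypothetical helper, not part of any Telegram client. Telegram limits callback_data to 64 bytes, which an action prefix plus a 36-character UUID fits comfortably.

```typescript
type GuardAction = 'approve' | 'rollback';

interface ParsedCallback {
  action: GuardAction;
  eventId: string;
}

const KNOWN_ACTIONS: readonly GuardAction[] = ['approve', 'rollback'];
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: returns a typed action and event ID,
// or null when the payload is missing or malformed.
function parseCallbackData(data: unknown): ParsedCallback | null {
  if (typeof data !== 'string') return null;
  const separator = data.indexOf(':');
  if (separator === -1) return null;

  const rawAction = data.slice(0, separator);
  const eventId = data.slice(separator + 1);

  // Only accept actions the daemon knows how to handle.
  const action = KNOWN_ACTIONS.find((candidate) => candidate === rawAction);
  if (action === undefined) return null;
  // Reject IDs that are not UUID-shaped before they reach storage.
  if (!UUID_PATTERN.test(eventId)) return null;

  return { action, eventId };
}
```

A handler could call this first and acknowledge the update without side effects whenever it returns null.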
3. Daemon Lifecycle Manager
The daemon orchestrates the watcher, event queue, and reporting subsystem.
import { FSWatcher } from 'chokidar';
import { EventEmitter } from 'events';
import { randomUUID } from 'crypto';

// Shape of a tracked change, inferred from how events are used below.
// The 'pending' initial status is an assumption.
interface FileEvent {
  id: string;
  path: string;
  severity: 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL';
  timestamp: number;
  requiresApproval: boolean;
  status: 'pending' | 'approved' | 'rolled_back';
}

interface ReportData {
  period: string;
  totalChanges: number;
  approved: number;
  rolledBack: number;
  criticalEvents: number;
}

// matchSensitivityRule, sendMacOSNotification, and sendTelegramApproval are omitted for brevity.
class AgentDaemon extends EventEmitter {
  private watcher: FSWatcher;
  private eventQueue: Map<string, FileEvent>;

  constructor(config: AgentGuardConfig) {
    super();
    this.watcher = new FSWatcher({
      // Match whole path segments so that .github/ is not ignored along with .git/.
      ignored: /(^|[\/\\])(node_modules|\.git)([\/\\]|$)/,
      persistent: true,
      ignoreInitial: true,
    });
    this.eventQueue = new Map();
    this.loadConfig(config);
  }

  private loadConfig(config: AgentGuardConfig): void {
    config.watchPaths.forEach((dir) => this.watcher.add(dir));
    // Only modifications to existing files are handled here.
    this.watcher.on('change', async (filePath, stats) => {
      const rule = this.matchSensitivityRule(filePath);
      if (!rule) return;
      const event: FileEvent = {
        id: randomUUID(),
        path: filePath,
        severity: rule.severity,
        timestamp: Date.now(),
        requiresApproval: rule.requiresApproval,
        status: 'pending',
      };
      this.eventQueue.set(event.id, event);
      await this.routeEvent(event);
    });
  }

  private async routeEvent(event: FileEvent): Promise<void> {
    if (event.severity === 'HIGH' || event.severity === 'CRITICAL') {
      await this.sendMacOSNotification(event);
    }
    if (event.requiresApproval) {
      await this.sendTelegramApproval(event);
    }
  }

  public async generateReport(days: number): Promise<ReportData> {
    const cutoff = Date.now() - (days * 24 * 60 * 60 * 1000);
    const recentEvents = Array.from(this.eventQueue.values())
      .filter((e) => e.timestamp >= cutoff);
    return {
      period: `${days} days`,
      totalChanges: recentEvents.length,
      approved: recentEvents.filter((e) => e.status === 'approved').length,
      rolledBack: recentEvents.filter((e) => e.status === 'rolled_back').length,
      criticalEvents: recentEvents.filter((e) => e.severity === 'CRITICAL').length,
    };
  }
}
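The daemon keeps every event in an in-memory Map, while the configuration declares a maxEventHistory setting. One way to honor that limit, sketched here under the assumption that the oldest events can simply be discarded, relies on the fact that a Map iterates in insertion order. The pruneHistory helper is hypothetical.

```typescript
// Hypothetical helper: removes the oldest entries until the map holds
// at most maxEntries items, and returns how many were removed.
// Map preserves insertion order, so the first keys are the oldest.
function pruneHistory<T>(events: Map<string, T>, maxEntries: number): number {
  const limit = Math.max(0, maxEntries);
  const overflow = events.size - limit;
  if (overflow <= 0) return 0;

  // Copy the keys first so deletion does not interfere with iteration.
  const oldestIds = Array.from(events.keys()).slice(0, overflow);
  oldestIds.forEach((id) => events.delete(id));
  return oldestIds.length;
}
```

Calling this after each insertion would keep memory bounded. A real deployment might instead archive pruned events so that reports covering longer periods stay accurate.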
Rationale Behind Choices
- Chokidar over native fs.watch: Provides more consistent cross-platform behavior, lets you exclude paths such as node_modules through its ignored option, and can be configured (for example with awaitWriteFinish) to coalesce rapid successive writes into a single event.
- UUID-based event tracking: Gives every change a unique identifier, which enables precise rollback targeting and avoids ID collisions when multiple files change at once.
- Severity routing: Separates noise from actionable events. LOW/MEDIUM changes are logged silently, while HIGH/CRITICAL changes trigger immediate alerts.
- External approval: Decouples governance from the terminal, ensuring protection continues during breaks or off-hours.
Pitfall Guide
1. Relying on Process Interception
Explanation: Attempting to hook into CLI agent processes (especially Rust binaries or TUI frameworks) to intercept commands before execution is fragile. Agent updates frequently change internal APIs, breaking hooks and leaving the system blind.
Fix: Use filesystem-level monitoring. It captures the actual state change regardless of how the agent triggers it, and remains stable across version updates.
2. Ignoring Agent Memory Files
Explanation: Files like CLAUDE.md, .cursorrules, .hermes/, and Aider configs store persistent instructions. If an agent modifies these unintentionally, it corrupts the instructions loaded into its context in future sessions, leading to compounding errors.
Fix: Explicitly track memory files in a dedicated sensitivity tier. Require approval for any modification and maintain versioned backups for quick restoration.
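The versioned-backup idea can be sketched as a small store that keeps the last few approved versions of each tracked file. This is an in-memory illustration only; SnapshotStore and its method names are assumptions, and a real implementation would persist snapshots to disk so they survive a daemon restart.

```typescript
// Hypothetical in-memory store of approved file contents, keyed by path.
class SnapshotStore {
  private history = new Map<string, string[]>();

  constructor(private readonly maxVersions: number = 5) {}

  // Records content that a human approved (or the initial baseline).
  recordApproved(path: string, content: string): void {
    const versions = this.history.get(path) ?? [];
    // Skip duplicates so repeated approvals do not evict older versions.
    if (versions[versions.length - 1] === content) return;
    versions.push(content);
    while (versions.length > this.maxVersions) versions.shift();
    this.history.set(path, versions);
  }

  // Returns the content a rollback should restore, if any is known.
  lastApproved(path: string): string | undefined {
    const versions = this.history.get(path);
    return versions ? versions[versions.length - 1] : undefined;
  }

  versionCount(path: string): number {
    return this.history.get(path)?.length ?? 0;
  }
}
```

On an unapproved change to a memory file, a rollback would write lastApproved(path) back to disk; on approval, the new content would be passed to recordApproved.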
3. Webhook Timeout Misconfiguration
Explanation: Telegram's webhook API expects a response within 30 seconds. If the rollback or approval logic performs heavy I/O or waits for external services, the connection drops, causing duplicate messages or lost events.
Fix: Acknowledge the webhook immediately with a 200 OK, then process the action asynchronously. Use a message queue or background worker for rollback operations.
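The acknowledge-first pattern can be sketched with a small in-process queue: the handler enqueues the work and responds right away, and the jobs run afterwards. BackgroundQueue is a hypothetical helper, and an in-memory queue loses pending jobs if the process exits, so a durable queue may be preferable in production.

```typescript
type Job = () => Promise<void>;

// Hypothetical in-process queue that runs jobs sequentially in the background.
class BackgroundQueue {
  private jobs: Job[] = [];
  private draining = false;
  readonly failures: unknown[] = [];

  get pending(): number {
    return this.jobs.length;
  }

  // Returns immediately; work starts in a later microtask.
  enqueue(job: Job): void {
    this.jobs.push(job);
    if (!this.draining) {
      this.draining = true;
      void Promise.resolve().then(() => this.drain());
    }
  }

  private async drain(): Promise<void> {
    let job = this.jobs.shift();
    while (job !== undefined) {
      try {
        await job();
      } catch (err) {
        // Record the failure and keep processing later jobs.
        this.failures.push(err);
      }
      job = this.jobs.shift();
    }
    this.draining = false;
  }
}

// Inside a webhook handler, usage might look like this (sketch):
//   queue.enqueue(() => FileRollbackService.restore(eventId));
//   res.status(200).send('OK');
```

Running jobs one at a time also keeps two rollbacks from touching the same file concurrently, at the cost of throughput.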
4. Over-Filtering Critical Paths
Explanation: Aggressively ignoring directories to reduce noise can accidentally exclude sensitive configuration files. A misplaced glob pattern might skip .env.production or CI pipeline definitions.
Fix: Use explicit allowlists for sensitive patterns rather than broad ignores. Validate watch rules against a test suite that includes edge-case file paths.
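A concrete case of over-filtering: a loose ignore pattern such as /node_modules|\.git/ also matches .github/, so CI workflow files would never be observed. The sketch below shows a check that could run in a test suite; findShadowedPaths and the sample paths are illustrative assumptions.

```typescript
// Hypothetical check: returns the paths that must be watched
// but would be swallowed by the ignore pattern.
function findShadowedPaths(ignored: RegExp, mustWatch: string[]): string[] {
  return mustWatch.filter((path) => ignored.test(path));
}

// Matches ".git" anywhere in the path, including inside ".github".
const looseIgnore = /node_modules|\.git/;
// Matches only whole path segments named "node_modules" or ".git".
const strictIgnore = /(^|\/)(node_modules|\.git)(\/|$)/;

const sensitivePaths = [
  '.github/workflows/ci.yml',
  '.env.production',
  'CLAUDE.md',
];

const shadowedByLoose = findShadowedPaths(looseIgnore, sensitivePaths);
// shadowedByLoose contains '.github/workflows/ci.yml'
const shadowedByStrict = findShadowedPaths(strictIgnore, sensitivePaths);
// shadowedByStrict is empty
```

Failing the build whenever this list is non-empty catches an overly broad ignore rule before it silently disables monitoring.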
5. Race Conditions During Rollback
Explanation: If an agent continues writing while a rollback is in progress, the restored file can be immediately overwritten, creating a loop of conflicting states.
Fix: Implement a temporary write lock on the target file during rollback. Pause the watcher for that specific path until the restoration completes, then resume monitoring.
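The pause-during-restore part of this fix can be sketched as a gate that the watcher consults before handling an event. RollbackGate is a hypothetical helper. It only prevents the monitor from reacting to its own restore write and from starting two rollbacks on one path; it does not stop the agent process from writing, which would need an OS-level lock or pausing the agent itself.

```typescript
// Hypothetical gate tracking which paths are currently being restored.
class RollbackGate {
  private suppressed = new Set<string>();

  // The watcher should skip events for paths where this returns true.
  isSuppressed(path: string): boolean {
    return this.suppressed.has(path);
  }

  // Runs the restore while the path is suppressed, and always releases it,
  // even when the restore throws.
  async runExclusive(path: string, restore: () => Promise<void>): Promise<void> {
    if (this.suppressed.has(path)) {
      throw new Error(`Rollback already in progress for ${path}`);
    }
    this.suppressed.add(path);
    try {
      await restore();
    } finally {
      this.suppressed.delete(path);
    }
  }
}
```

In a change handler, an early return when isSuppressed(filePath) is true would keep the restored write from being reported as a new unapproved change.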
6. Platform-Specific Watcher Limitations
Explanation: macOS uses FSEvents, Linux uses inotify, and Windows uses ReadDirectoryChangesW. Each has different limits on watch descriptors and event batching. Exceeding these limits can cause missed events or, on Linux, watcher errors such as ENOSPC once the inotify watch limit is reached.
Fix: Monitor watcher health metrics. Alert when descriptor limits approach thresholds, and implement recursive directory splitting for large projects.
7. Silent Daemon Failures
Explanation: If the monitoring daemon crashes or loses network connectivity, developers remain unaware until a critical change goes unmonitored.
Fix: Implement a watchdog process that verifies daemon heartbeat every 60 seconds. Use macOS launchd or systemd to auto-restart on failure, and route health checks to a dedicated alert channel.
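The heartbeat check itself can be a pure function that the watchdog calls on a timer. The sketch below assumes the daemon periodically records a heartbeat with its PID and a timestamp; the Heartbeat shape, the isDaemonStale helper, and the tolerance of two missed beats are illustrative assumptions.

```typescript
interface Heartbeat {
  pid: number;
  timestamp: number; // milliseconds since the Unix epoch
}

// Hypothetical check: the daemon is considered stale when no heartbeat
// exists or the latest one is older than the allowed number of intervals.
function isDaemonStale(
  beat: Heartbeat | undefined,
  now: number,
  intervalMs: number = 60000,
  missedBeatsAllowed: number = 2,
): boolean {
  if (beat === undefined) return true;
  return now - beat.timestamp > intervalMs * missedBeatsAllowed;
}
```

Allowing more than one missed beat avoids false alarms when the machine briefly sleeps or the daemon is busy, at the cost of slower detection.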
Production Bundle
Action Checklist
- Define sensitivity tiers: Map project files to LOW, MEDIUM, HIGH, and CRITICAL based on impact scope.
- Configure memory file tracking: Add CLAUDE.md, .cursorrules, .hermes/, and Aider configs to the approval-required list.
- Set up Telegram bot: Create a bot via BotFather, configure webhook URL, and store tokens securely in environment variables.
- Implement write locking: Add temporary file locks during rollback to prevent agent overwrite conflicts.
- Enable macOS notifications: Register for HIGH/CRITICAL alerts and verify notification center permissions.
- Schedule daily reports: Configure agentguard daemon report --days=7 to run via cron or launchd.
- Test rollback paths: Simulate sensitive file changes and verify approval/rollback flows before production deployment.
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Solo developer, local only | File watcher + macOS notifications | Low overhead, immediate feedback, no external dependencies | Free |
| Team with CI/CD pipelines | File watcher + Telegram approval | Asynchronous governance, covers away-from-desk periods, audit trail | Telegram API free, minimal infra |
| Enterprise compliance | File watcher + Slack/Teams + SIEM integration | Centralized logging, role-based approval, audit retention | Higher infra cost, requires SSO/SCIM |
| High-frequency refactoring | File watcher + memory file isolation + write locks | Prevents context poisoning, handles rapid changes safely | Moderate dev time for lock management |
Configuration Template
# agentguard.config.yaml
watch_paths:
  - ./src
  - ./config
  - ./scripts
sensitivity_rules:
  - pattern: "**/.env*"
    severity: CRITICAL
    requires_approval: true
    rollback_enabled: true
  - pattern: "**/.github/workflows/*.yml"
    severity: HIGH
    requires_approval: true
    rollback_enabled: true
  - pattern: "**/CLAUDE.md"
    severity: HIGH
    requires_approval: true
    rollback_enabled: true
  - pattern: "**/.cursorrules"
    severity: HIGH
    requires_approval: true
    rollback_enabled: true
  - pattern: "**/.hermes/**"
    severity: HIGH
    requires_approval: true
    rollback_enabled: true
  - pattern: "**/.aider*"
    severity: MEDIUM
    requires_approval: false
    rollback_enabled: true
notifications:
  telegram:
    bot_token: "${TELEGRAM_BOT_TOKEN}"
    chat_id: "${TELEGRAM_CHAT_ID}"
    webhook_port: 3001
  macos:
    notify_high: true
    notify_critical: true
daemon:
  log_level: info
  report_interval: 7
  max_event_history: 500
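The template uses snake_case keys and ${VAR} placeholders, while the TypeScript schema uses camelCase. How AgentGuard bridges the two is not shown, so the following is only a sketch of one possible post-parse step: interpolateEnv and camelizeKeys are hypothetical helpers that operate on an already parsed YAML object. Note that a generic conversion leaves macos unchanged, so mapping it to the schema's macOS key would need an explicit rename.

```typescript
// Hypothetical helper: replaces ${NAME} placeholders with values from env
// and fails loudly when a variable is missing.
function interpolateEnv(
  value: string,
  env: Record<string, string | undefined>,
): string {
  return value.replace(/\$\{([A-Z0-9_]+)\}/g, (_match, name: string) => {
    const resolved = env[name];
    if (resolved === undefined) {
      throw new Error(`Missing environment variable: ${name}`);
    }
    return resolved;
  });
}

// Converts a single snake_case key to camelCase.
function toCamelCase(key: string): string {
  return key.replace(/_([a-z])/g, (_match, letter: string) => letter.toUpperCase());
}

// Hypothetical helper: recursively converts object keys to camelCase.
// Arrays are mapped element by element; primitives are returned as-is.
function camelizeKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(camelizeKeys);
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const result: Record<string, unknown> = {};
    Object.keys(source).forEach((key) => {
      result[toCamelCase(key)] = camelizeKeys(source[key]);
    });
    return result;
  }
  return value;
}
```

Throwing on a missing variable keeps the daemon from starting with an empty bot token, which would otherwise surface later as failed notifications.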
Quick Start Guide
1. Install the package globally:
npm install -g agentguard-dev
2. Initialize configuration:
agentguard init
This generates agentguard.config.yaml in your project root with default sensitivity rules.
3. Configure Telegram (optional): Create a bot via BotFather, set TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID in your environment, and update the config file.
4. Start the daemon:
agentguard daemon start
Verify operation with agentguard daemon status.
5. Launch the menu bar app (macOS):
cd $(npm root -g)/agentguard-dev/tray && npm install
agentguard tray
Click the shield icon to monitor daemon status, watched directories, and recent events. Use the popup to start/stop the watcher or generate a 7-day report.
