
Detecting unusual processes on your servers without writing a single rule

By Codcompass Team · 9 min read

Zero-Configuration Server Behavior Baselines: Kernel Telemetry Meets Vector Search

Current Situation Analysis

Traditional endpoint security relies on a fundamental assumption: you can enumerate what malicious activity looks like before it happens. Tools like Falco, OSSEC, and Wazuh operationalize this assumption through signature databases and YAML rule engines. Wazuh, for example, ships with approximately 5,000 preconfigured rules. While comprehensive, this approach suffers from a structural limitation: it can only detect what has already been documented.

The industry pain point isn't rule quality; it's rule coverage. Novel deployment patterns, forgotten debug utilities, custom CI/CD scripts, and zero-day execution chains bypass signature engines entirely. Security teams spend disproportionate time maintaining rule sets, tuning false positives, and writing exceptions for legitimate but unusual workload behavior. This maintenance burden scales linearly with fleet complexity, yet detection coverage plateaus.

The overlooked reality is that server behavior follows predictable patterns per workload. A web server, a background worker, and a database node each have distinct execution profiles. Instead of defining "bad," modern telemetry pipelines can learn "normal" per environment and flag statistical deviations. This shifts security from reactive signature matching to proactive behavioral baselining, eliminating manual rule authoring while improving coverage against unknown execution patterns.

WOW Moment: Key Findings

The transition from signature-based detection to unsupervised vector baselining fundamentally changes the cost-to-coverage ratio. The table below contrasts the operational characteristics of traditional rule engines against a kernel-telemetry vector pipeline.

| Approach | Detection Coverage | Maintenance Overhead | Ingestion Latency | Adaptability to Novel Patterns |
| --- | --- | --- | --- | --- |
| Signature/Rule Engine | High for known threats, near-zero for novel | High (manual rule authoring, tuning, exceptions) | Low (pattern matching) | None (requires explicit updates) |
| Vector Baseline (eBPF + ANN) | High for deviations from learned normal | Low (auto-baselining, threshold tuning only) | Low (<0.1ms per event) | High (learns new normal automatically) |

This finding matters because it decouples security coverage from human rule-writing velocity. A vector baseline continuously updates its understanding of acceptable behavior. When a developer introduces a new deployment script, the system registers it as anomalous on first execution, then incorporates it into the baseline after repeated runs. Security teams stop chasing false positives and start investigating genuine behavioral drift.
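The "anomalous first, baseline after repeated runs" behavior can be illustrated with a deliberately simplified sketch. It matches patterns by exact string rather than by vector similarity, and the `Baseline` type, the `Verdict` enum, and the `promote_after` threshold are all hypothetical names introduced here for illustration; the article does not specify how many repetitions trigger promotion.

```rust
use std::collections::HashMap;

/// Outcome for a single observed execution.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Anomalous,
    Baseline,
}

/// Hypothetical baseline: a pattern becomes "normal" once it has been
/// seen `promote_after` times. The threshold is an assumption.
pub struct Baseline {
    seen: HashMap<String, u32>,
    promote_after: u32,
}

impl Baseline {
    pub fn new(promote_after: u32) -> Self {
        Self { seen: HashMap::new(), promote_after }
    }

    /// Record one execution and report how it compares to prior sightings.
    pub fn observe(&mut self, pattern: &str) -> Verdict {
        let count = self.seen.entry(pattern.to_string()).or_insert(0);
        // Judge against history before counting this run, so the first
        // execution of a new script is always flagged.
        let verdict = if *count >= self.promote_after {
            Verdict::Baseline
        } else {
            Verdict::Anomalous
        };
        *count += 1;
        verdict
    }
}
```

With `Baseline::new(3)`, a new deployment script is reported as `Verdict::Anomalous` on its first three runs and as `Verdict::Baseline` from the fourth onward. A vector pipeline would replace the exact-match lookup with a similarity search, so near-identical invocations would share one baseline entry.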

Core Solution

Building an unsupervised process baseline requires four coordinated components: kernel-level telemetry capture, deterministic vector representation, embedded vector storage with nearest-neighbor scoring, and a query interface for behavioral search.
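To make "deterministic vector representation" and "nearest-neighbor scoring" concrete, here is a minimal sketch under stated assumptions. It uses feature hashing with FNV-1a over whitespace-separated tokens and a brute-force cosine search as a stand-in for an ANN index. The dimension `DIM`, the tokenization, and the function names `embed` and `anomaly_score` are illustrative choices, not the article's actual encoding.

```rust
/// Vector width; 64 is an arbitrary choice for this sketch.
const DIM: usize = 64;

/// FNV-1a hash, used because it yields the same value on every run and machine.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hash whitespace-separated tokens into a fixed-size, L2-normalized vector.
pub fn embed(cmdline: &str) -> [f32; DIM] {
    let mut v = [0.0f32; DIM];
    for token in cmdline.split_whitespace() {
        let idx = (fnv1a(token.as_bytes()) % DIM as u64) as usize;
        v[idx] += 1.0;
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}

/// Cosine similarity of two already-normalized vectors (a plain dot product).
fn cosine(a: &[f32; DIM], b: &[f32; DIM]) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// Anomaly score in [0, 1]: one minus the similarity to the closest
/// baseline vector. An empty baseline scores 1.0 (everything is new).
pub fn anomaly_score(event: &[f32; DIM], baseline: &[[f32; DIM]]) -> f32 {
    let best = baseline
        .iter()
        .map(|b| cosine(event, b))
        .fold(0.0f32, f32::max);
    (1.0 - best).clamp(0.0, 1.0)
}
```

Because the embedding is a pure function of its input, the same command line always lands on the same vector, which is what lets a stored baseline stay comparable across agent restarts. The linear scan keeps the sketch short; a production store would presumably swap it for an approximate index once the baseline grows.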

Step 1: Kernel Telemetry Capture

Process execution must be observed at the point of creation. Attaching to the sys_enter_execve tracepoint captures every execve() syscall before the new process image loads. This provides complete visibility into process names, command-line arguments, parent context, and initiating user IDs.

The telemetry agent is implemented in Rust using the Aya framework. The eBPF program attaches to the tracepoint, extracts fields, and pushes structured events to a ring buffer. Userspace consumes the buffer, batches events, and forwards them to the backend.
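On the userspace side, the "consume, batch, forward" loop can be sketched without any eBPF dependencies. The example below assumes events arrive with fixed-size, NUL-padded byte buffers like the `comm` field, and it batches purely by count. `ExecEvent`, `decode_c_buf`, and `Batcher` are hypothetical names; the article's real agent may batch on a timer, by byte size, or both.

```rust
/// Userspace view of one exec event (a reduced, illustrative subset of fields).
#[derive(Debug, Clone, PartialEq)]
pub struct ExecEvent {
    pub pid: u32,
    pub comm: String,
}

/// Decode a fixed-size, NUL-padded kernel buffer into an owned string.
/// Bytes after the first NUL are ignored; invalid UTF-8 is replaced.
pub fn decode_c_buf(buf: &[u8]) -> String {
    let end = buf.iter().position(|b| *b == 0).unwrap_or(buf.len());
    String::from_utf8_lossy(&buf[..end]).into_owned()
}

/// Hypothetical count-based batcher sitting between the ring buffer
/// reader and the backend client.
pub struct Batcher {
    pending: Vec<ExecEvent>,
    max_batch: usize,
}

impl Batcher {
    pub fn new(max_batch: usize) -> Self {
        Self { pending: Vec::with_capacity(max_batch), max_batch }
    }

    /// Queue one event; returns a full batch once the threshold is reached.
    pub fn push(&mut self, event: ExecEvent) -> Option<Vec<ExecEvent>> {
        self.pending.push(event);
        if self.pending.len() >= self.max_batch {
            Some(std::mem::take(&mut self.pending))
        } else {
            None
        }
    }

    /// Drain whatever is queued, for example on a timer or at shutdown.
    pub fn flush(&mut self) -> Vec<ExecEvent> {
        std::mem::take(&mut self.pending)
    }
}
```

Returning the batch from `push` instead of sending it inside the batcher keeps network I/O out of this type, so the reader loop decides how and when to forward each batch.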

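// kernel_telemetry.rs (eBPF side; the crate is aya-ebpf, named aya-bpf in older Aya releases)
use aya_ebpf::helpers::{bpf_get_current_pid_tgid, bpf_get_current_uid_gid};
use aya_ebpf::macros::tracepoint;
use aya_ebpf::programs::TracePointContext;

// Fixed-layout event shared between the eBPF program and userspace.
#[repr(C)]
pub struct ProcessSnapshot {
    pub pid: u32,
    pub ppid: u32,
    pub uid: u32,
    pub comm: [u8; 16],
    pub cmdline: [u8; 128],
    pub timestamp_ns: u64,
}

#[tracepoint]
pub fn capture_exec(ctx: TracePointContext) -> u32 {
    // The upper 32 bits hold the tgid, which is the PID as userspace sees it.
    let pid_tgid = bpf_get_current_pid_tgid();
    let pid = (pid_tgid >> 32) as u32;
    // NOTE: the kernel provides no bpf_get_current_ppid helper. This call is assumed to be
    // a project-local wrapper that reads real_parent->tgid from the current task_struct.
    let ppid = bpf_get_current_ppid();
    // The lower 32 bits hold the uid; the helper returns u64, so cast to match the field.
    let uid = (bpf_get_current_uid_gid() & 0xFFFF_FFFF) as u32;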

    let mut snapshot = ProcessSnapshot {
        pid,
        ppid,
        uid,
        comm: [0; 16],
        cmdline:
