
By Codcompass Team·2026-05-19·7 min read

Current Situation Analysis

Software energy consumption remains largely invisible in modern observability stacks. Engineering teams track latency, throughput, error rates, and memory pressure, yet energy usage is treated as an infrastructure abstraction rather than a first-class engineering metric. This blind spot stems from cloud billing models that charge for provisioned or consumed compute cycles, not for the actual joules drawn from the grid. Carbon accounting is typically siloed in ESG compliance teams, while developers operate under the assumption that CPU utilization correlates linearly with cost and efficiency. It does not.

The disconnect is measurable. Industry benchmarks indicate that 30–40% of cloud compute cycles are wasted on inefficient code paths, idle polling, or misconfigured autoscaling thresholds. Energy-proportional computing research demonstrates that power draw scales non-linearly with utilization: a server at 20% CPU may consume 70% of its peak power, while a poorly optimized algorithm can increase energy consumption by 3–5x without significantly impacting latency. Without hardware-level energy telemetry, optimization becomes speculative. Teams reduce CPU allocation or refactor code based on cost projections, not actual energy deltas, leading to suboptimal carbon reduction and missed efficiency gains.

The oversight persists because energy metrics require hardware access (RAPL, ACPI, IPMI) that is abstracted away in containerized and serverless environments. Standard APMs do not expose joules, watts, or carbon intensity. Even when tools like Kepler, Scaphandre, or CodeCarbon are deployed, they are often treated as compliance dashboards rather than engineering feedback loops. The result is a fragmented workflow where sustainability reporting happens post-deployment, while real-time code optimization remains decoupled from energy impact.

WOW Moment: Key Findings

Integrating energy monitoring into the observability pipeline transforms sustainability from a retrospective report into an engineering control surface. The following comparison demonstrates the operational divergence between traditional monitoring and energy-aware monitoring:

Approach	Metric 1	Metric 2	Metric 3
Traditional Monitoring	CPU/Memory Utilization	Request Latency	Cost per Hour
Energy-Aware Monitoring	Joules per Request	Grid Carbon Intensity (gCO2/kWh)	Energy Proportionality Index

Traditional stacks optimize for throughput and latency, often increasing energy waste during traffic spikes or idle periods. Energy-aware monitoring normalizes consumption by workload, correlates it with real-time grid carbon intensity, and measures how efficiently the system converts power into useful work. This shift enables carbon-aware scheduling, energy SLOs, and code-level profiling that directly ties algorithmic changes to joule reduction. The finding matters because it moves sustainability from accounting to architecture: teams can now set burn-down targets for energy per transaction, trigger autoscaling based on grid carbon intensity, and validate refactors against measurable efficiency gains rather than speculative cost models.
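To make the link between joules per request and grid carbon intensity concrete: one kilowatt-hour is 3.6 million joules, so emissions per request follow from a unit conversion. The sketch below is illustrative only; the function name and the sample values (0.12 J per request, 400 gCO2/kWh) are assumptions, not measurements.

```typescript
// Hypothetical helper: converts energy per request into grams of CO2 per request.
const JOULES_PER_KWH = 3.6e6; // 1 kWh = 3.6 million joules

function gramsCo2PerRequest(joulesPerRequest: number, gridIntensityGramsPerKwh: number): number {
  if (joulesPerRequest < 0 || gridIntensityGramsPerKwh < 0) {
    throw new RangeError('inputs must be non-negative');
  }
  // joules -> kWh, then kWh * (gCO2 / kWh) -> gCO2
  return (joulesPerRequest / JOULES_PER_KWH) * gridIntensityGramsPerKwh;
}

// Illustrative values only, scaled to a million requests for readability.
const co2PerRequest = gramsCo2PerRequest(0.12, 400);
console.log((co2PerRequest * 1e6).toFixed(2), 'gCO2 per million requests');
```

Because the grid intensity changes through the day, the same joules per request can map to different emissions depending on when and where the work runs, which is what carbon-aware scheduling exploits.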

Core Solution

Implementing energy monitoring requires bridging hardware telemetry with application-level metrics. The architecture follows a sidecar or host-agent pattern, depending on deployment environment, with a Prometheus-compatible exporter feeding a time-series database. The application instruments request boundaries to calculate energy deltas, enabling joules-per-request normalization.

Step 1: Select the Measurement Layer

Kubernetes: Deploy Kepler (Kubernetes-based Efficient Power Level Exporter) as a DaemonSet. It derives per-container energy estimates from RAPL and performance counters and exposes them as Prometheus metrics.
Bare Metal/VM: Use Scaphandre, or node_exporter with its RAPL collector enabled, to expose node_rapl_*_joules_total metrics. On VMs, RAPL is only available if the hypervisor passes it through.
Serverless/Managed: Rely on cloud provider carbon APIs or statistical estimators (less precise, requires calibration).

Step 2: Instrument Application Boundaries

Wrap request handlers to capture energy deltas at ingress and egress. Normalize by request count to derive joules per operation.

import { readFileSync } from 'node:fs';
import { Counter, Gauge, Registry } from 'prom-client';
import { Request, Response, NextFunction } from 'express';

export const registry = new Registry();

// Assumption: Intel RAPL powercap path for CPU package 0. Adjust per host;
// reading it usually requires root or an explicit permission grant.
const RAPL_ENERGY_PATH = '/sys/class/powercap/intel-rapl:0/energy_uj';

// Cumulative energy observed during request handling.
// A Counter (not a Gauge) so that PromQL rate() behaves correctly.
const energyJoulesTotal = new Counter({
  name: 'app_energy_joules_total',
  help: 'Cumulative energy observed during request handling, in joules',
  registers: [registry],
});

const requestsTotal = new Counter({
  name: 'app_requests_total',
  help: 'Total number of processed requests',
  registers: [registry],
});

const joulesPerRequest = new Gauge({
  name: 'app_joules_per_request',
  help: 'Average energy per request since process start, in joules',
  registers: [registry],
});

let cumulativeJoules = 0;
let requestCount = 0;

// Middleware to track the energy delta across each request.
// Package-level counters cover the whole CPU socket, so overlapping requests
// and other processes inflate each delta; treat the result as an upper bound.
export const energyAwareMiddleware = (req: Request, res: Response, next: NextFunction) => {
  const startEnergy = getHardwareEnergySnapshot();

  res.on('finish', () => {
    const endEnergy = getHardwareEnergySnapshot();
    // Clamp at zero: the hardware counter wraps around periodically.
    const deltaJoules = Math.max(0, endEnergy - startEnergy);

    energyJoulesTotal.inc(deltaJoules);
    requestsTotal.inc();

    // Running average: total observed joules divided by total requests.
    cumulativeJoules += deltaJoules;
    requestCount++;
    joulesPerRequest.set(cumulativeJoules / requestCount);
  });

  next();
};

// Reads the cumulative package energy in joules from RAPL sysfs.
// In production, prefer a local socket or an exporter scrape over direct reads.
function getHardwareEnergySnapshot(): number {
  try {
    const microjoules = Number(readFileSync(RAPL_ENERGY_PATH, 'utf8').trim());
    return Number.isFinite(microjoules) ? microjoules / 1e6 : 0;
  } catch {
    return 0; // Counter unavailable (no RAPL, no permission): report no energy
  }
}
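If the energy reading comes from an exporter scrape rather than a direct hardware read, getHardwareEnergySnapshot has to pull a counter value out of Prometheus text exposition output. The following is a minimal sketch, assuming the scrape body is already available as a string. The helper name sumMetricSamples and the sample payload are hypothetical, and node_rapl_package_joules_total is used only as an example metric name.

```typescript
// Hypothetical helper: sums every sample of one metric in Prometheus text
// exposition format (one sample per line, optional {labels}, then the value).
function sumMetricSamples(expositionText: string, metricName: string): number {
  let total = 0;
  for (const rawLine of expositionText.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip HELP/TYPE/comments

    // The metric name ends at the first '{' (labels) or whitespace (value).
    const nameEnd = line.search(/[{\s]/);
    if (nameEnd === -1 || line.slice(0, nameEnd) !== metricName) continue;

    // The value is the first token after the label set, if there is one.
    const rest = line.includes('}')
      ? line.slice(line.lastIndexOf('}') + 1)
      : line.slice(nameEnd);
    const value = Number(rest.trim().split(/\s+/)[0]);
    if (Number.isFinite(value)) total += value;
  }
  return total;
}

// Made-up payload for illustration; real label sets vary by exporter version.
const exampleScrape = [
  '# TYPE node_rapl_package_joules_total counter',
  'node_rapl_package_joules_total{index="0"} 1500.5',
  'node_rapl_package_joules_total{index="1"} 499.5',
  'node_rapl_dram_joules_total{index="0"} 42',
].join('\n');

console.log(sumMetricSamples(exampleScrape, 'node_rapl_package_joules_total'));
```

Summing across label sets gives a host-wide figure; for per-container attribution the label filter would need to be narrower than this sketch allows.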

Step 3: Export and Normalize

The host agent exposes raw hardware counters such as node_rapl_*_joules_total, while the application middleware exports app_energy_joules_total and app_requests_total at request boundaries. Use PromQL to calculate joules per request:

rate(app_energy_joules_total[5m]) / rate(app_requests_total[5m])

This yields joules per request over a rolling window, enabling trend analysis and anomaly detection.
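The arithmetic behind that query can be sketched outside Prometheus. Both rates are taken over the same window, so the window length cancels and the ratio reduces to the energy increase divided by the request increase. The type and function names below are hypothetical, and the counter-reset handling is deliberately cruder than what rate() does.

```typescript
// Hypothetical snapshot of the two counters at one scrape.
interface CounterSnapshot {
  joulesTotal: number;
  requestsTotal: number;
}

// Joules per request between two scrapes of the same window.
// Returns null when the ratio is undefined: no requests in the window,
// or a counter went backwards (for example after a process restart).
function joulesPerRequestBetween(first: CounterSnapshot, last: CounterSnapshot): number | null {
  const energyDelta = last.joulesTotal - first.joulesTotal;
  const requestDelta = last.requestsTotal - first.requestsTotal;
  if (energyDelta < 0 || requestDelta <= 0) return null;
  return energyDelta / requestDelta;
}

// Illustrative values: 36 J spent on 300 requests.
const windowStartSnapshot: CounterSnapshot = { joulesTotal: 1000, requestsTotal: 5000 };
const windowEndSnapshot: CounterSnapshot = { joulesTotal: 1036, requestsTotal: 5300 };
console.log(joulesPerRequestBetween(windowStartSnapshot, windowEndSnapshot));
```

Returning null rather than zero keeps idle windows from dragging a dashboard average toward an efficiency the service never achieved.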

Step 4: Dashboard and Alerting

Configure Grafana panels for:

Real-time joules/request
Grid carbon intensity overlay (via Open Power System Data or cloud APIs)
Energy proportionality curve (power vs. throughput)
Alerting on thresholds: app_joules_per_request > 0.15 sustained for 10 minutes triggers a P2 incident.
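The "sustained" part of that alert is what keeps a single noisy sample from paging anyone. A rough sketch of the logic, assuming readings arrive as timestamped values no later than the evaluation time; the names here are hypothetical, and a real deployment would express this as an alerting rule rather than application code.

```typescript
// Hypothetical reading of the joules-per-request gauge at a point in time.
interface EnergyReading {
  timestampMs: number;
  joulesPerRequest: number;
}

// True when the threshold has been exceeded continuously for at least
// windowMs, judged at nowMs. Any reading at or below the threshold
// restarts the clock.
function isSustainedBreach(
  readings: EnergyReading[],
  threshold: number,
  windowMs: number,
  nowMs: number,
): boolean {
  const ordered = [...readings].sort((a, b) => a.timestampMs - b.timestampMs);
  let breachStartMs: number | null = null;
  for (const reading of ordered) {
    if (reading.joulesPerRequest > threshold) {
      if (breachStartMs === null) breachStartMs = reading.timestampMs;
    } else {
      breachStartMs = null;
    }
  }
  return breachStartMs !== null && breachStartMs <= nowMs - windowMs;
}
```

With a threshold of 0.15 and a window of 600000 ms, eleven one-minute readings above the threshold trip the check, while a single dip in the middle resets it.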

Architecture Rationale

Sidecar over in-process: Hardware counters require root/privileged access. A sidecar or DaemonSet isolates measurement from application logic, prevents permission drift, and ensures consistent metric collection across languages.
Delta normalization: Raw joules are meaningless without workload context. Normalizing by requests, batch size, or data processed enables cross-service comparison.
Sampling strategy: RAPL energy counters update roughly every millisecond, but polling at that rate introduces overhead and noise. Batch aggregation at 1–5 second intervals balances precision and performance.
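Batch aggregation also has to cope with the hardware counter wrapping around. The sketch below assumes readings in microjoules and a known counter range (on Linux powercap this is published alongside the counter as max_energy_range_uj); the class and function names are hypothetical, and at most one wrap between readings is assumed.

```typescript
// Difference between two readings of a counter that wraps at maxRange.
// Assumes at most one wrap between readings.
function wrappedDelta(previous: number, current: number, maxRange: number): number {
  return current >= previous ? current - previous : maxRange - previous + current;
}

// Hypothetical aggregator: accumulates counter deltas between flushes so
// that one joule value is emitted per batch interval.
class RaplBatcher {
  private lastReadingUj: number | null = null;
  private accumulatedUj = 0;
  private readonly maxRangeUj: number;

  constructor(maxRangeUj: number) {
    this.maxRangeUj = maxRangeUj;
  }

  // Call on every poll with the raw counter value in microjoules.
  record(readingUj: number): void {
    if (this.lastReadingUj !== null) {
      this.accumulatedUj += wrappedDelta(this.lastReadingUj, readingUj, this.maxRangeUj);
    }
    this.lastReadingUj = readingUj;
  }

  // Call once per batch interval; returns joules since the previous flush.
  flushJoules(): number {
    const joules = this.accumulatedUj / 1e6;
    this.accumulatedUj = 0;
    return joules;
  }
}
```

A caller would invoke record on its polling timer and flushJoules on the slower 1–5 second timer, exporting only the flushed value.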

Pitfall Guide

Confusing Watts and Joules: Watts measure instantaneous power; joules measure energy accumulated over time (1 J = 1 W sustained for 1 s). Alerting on raw watts flags momentary spikes that may not change total energy. Integrate power over time (E = ∫ P dt) or use hardware counters that expose joules directly.
Ignoring Baseline/Idle Consumption: A system at idle still draws 40–60% of peak power. Failing to subtract baseline skews delta calculations. Measure average idle power over a 5-minute window, then subtract idle power multiplied by the measurement duration from each active measurement.
Over-Sampling Measurement: Polling RAPL or ACPI every 10ms adds CPU overhead that inflates energy consumption. Sample at 1–5 second intervals. Use kernel-level counters instead of userspace polling where possible.
Not Normalizing by Workload: Raw joules cannot be compared across services or traffic patterns. Always divide by meaningful units: requests, records processed, or GB transferred. Without normalization, optimization metrics are misleading.
Relying on Software Estimators Without Calibration: Tools like CodeCarbon fall back to statistical models based on CPU model and TDP when hardware counters are unavailable. These estimates can deviate by 15–30% from hardware counters. Use estimators only when hardware access is impossible, and calibrate against a reference node with RAPL.
Missing Network and Storage Energy Attribution: CPU-only monitoring ignores NIC and disk energy, which can account for 20–35% of total draw in I/O-heavy workloads. Include disk and network I/O counters as energy proxies, or use cgroup-aware exporters that attribute power by container.
Treating Energy as a One-Time Metric: Energy efficiency degrades with dependency updates, traffic shifts, and cloud region changes. Treat joules/request as an SLO, not a dashboard. Integrate into CI/CD with regression gates.
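The first two pitfalls come down to simple arithmetic, sketched here under stated assumptions: power samples arrive in time order, trapezoidal integration is accurate enough for the sampling interval, and idle power has been measured separately. All names and sample values are hypothetical.

```typescript
// Hypothetical power sample from a meter that reports watts, not joules.
interface PowerSample {
  timeSeconds: number;
  watts: number;
}

// Trapezoidal integration of power over time: watts * seconds = joules.
// Expects samples sorted by time.
function integrateJoules(samples: PowerSample[]): number {
  let joules = 0;
  let previous: PowerSample | null = null;
  for (const sample of samples) {
    if (previous !== null) {
      const meanWatts = (previous.watts + sample.watts) / 2;
      joules += meanWatts * (sample.timeSeconds - previous.timeSeconds);
    }
    previous = sample;
  }
  return joules;
}

// Energy attributable to the workload once the idle baseline is removed.
function activeJoules(totalJoules: number, idleWatts: number, durationSeconds: number): number {
  return Math.max(0, totalJoules - idleWatts * durationSeconds);
}

// Illustrative values: 20 seconds around 110 W on a host that idles at 60 W.
const exampleSamples: PowerSample[] = [
  { timeSeconds: 0, watts: 100 },
  { timeSeconds: 10, watts: 120 },
  { timeSeconds: 20, watts: 100 },
];
const measuredJoules = integrateJoules(exampleSamples);
console.log(measuredJoules, activeJoules(measuredJoules, 60, 20));
```

In this example more than half of the measured energy is baseline, which is why comparing raw totals across hosts with different idle draw is misleading.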

Production Bundle

Action Checklist

Deploy Kepler or Scaphandre as a DaemonSet with RAPL/sysfs access enabled
Instrument request boundaries to capture energy deltas at ingress/egress
Normalize energy metrics by workload unit (requests, batches, or data volume)
Configure PromQL queries for joules/request and energy proportionality curves
Overlay real-time grid carbon intensity using cloud or OPSD APIs
Set alerting thresholds on sustained joules/request deviations (>20% baseline)
Add energy regression gates to CI/CD pipelines for critical services
Schedule quarterly hardware counter calibration against known TDP baselines
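A regression gate for the CI/CD item can be as small as a relative comparison against a stored baseline. This sketch assumes both figures come from comparable load tests; the function name, result shape, and the 20% default (borrowed from the alerting item above) are illustrative choices rather than a prescribed policy.

```typescript
// Hypothetical result of an energy regression check.
interface EnergyGateResult {
  passed: boolean;
  changeRatio: number; // 0.25 means 25% more energy per request than baseline
}

// Fails when the candidate build uses more than maxIncrease (as a fraction)
// additional joules per request compared with the baseline.
function checkEnergyRegression(
  baselineJoulesPerRequest: number,
  candidateJoulesPerRequest: number,
  maxIncrease = 0.2,
): EnergyGateResult {
  if (baselineJoulesPerRequest <= 0) {
    throw new RangeError('baseline must be positive');
  }
  const changeRatio =
    (candidateJoulesPerRequest - baselineJoulesPerRequest) / baselineJoulesPerRequest;
  return { passed: changeRatio <= maxIncrease, changeRatio };
}

// Illustrative values: a build that moves from 0.12 to 0.15 J per request.
const gateResult = checkEnergyRegression(0.12, 0.15);
console.log(gateResult.passed ? 'energy gate passed' : 'energy gate failed');
```

A pipeline step would typically exit non-zero when passed is false, so the regression blocks the merge instead of surfacing later on a dashboard.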

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Kubernetes microservices	Kepler DaemonSet + Prometheus	Native cgroup attribution, RAPL access, low overhead	Neutral (uses existing monitoring infra)
Bare metal/VM	Scaphandre + node_exporter	Direct sysfs access, language-agnostic, stable	Low (minimal agent footprint)
Serverless/Managed	Cloud carbon API + statistical estimator	No hardware access, compliant with provider limits	Medium (API calls, estimation drift)
High-frequency trading	In-process RAPL polling + eBPF	Sub-millisecond precision, bypasses sidecar latency	High (requires privileged access, tuning)
Legacy monolith	Host agent + request tracing correlation	Non-invasive, retrofittable without code changes	Low (agent deployment only)

Configuration Template

# kepler-deployment.yaml (abbreviated)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kepler-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: kepler-exporter
  template:
    metadata:
      labels:
        app: kepler-exporter
    spec:
      containers:
      - name: kepler
        # Pin a specific release tag in production instead of :latest
        image: quay.io/sustainable-computing/kepler:latest
        ports:
        - containerPort: 9102
          name: metrics
        securityContext:
          privileged: true
        volumeMounts:
        - name: sysfs
          mountPath: /sys
          readOnly: true
        - name: proc
          mountPath: /proc
          readOnly: true
      volumes:
      - name: sysfs
        hostPath:
          path: /sys
      - name: proc
        hostPath:
          path: /proc
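---
# prometheus-scrape-config.yaml (a separate file: Prometheus configuration, not a Kubernetes manifest)
scrape_configs:
  - job_name: 'kepler'
    static_configs:
      # Assumes a Service named kepler-exporter exists in the monitoring namespace;
      # the abbreviated manifest above does not define one.
      - targets: ['kepler-exporter.monitoring:9102']
    metrics_path: /metrics
    scrape_interval: 5s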

Quick Start Guide

Install Kepler as a DaemonSet: kubectl apply -f kepler-deployment.yaml
Verify metrics exposure: curl http://<node-ip>:9102/metrics | grep kepler_
Add energy-aware middleware to your application entry point
Configure Prometheus to scrape the Kepler exporter and the application metrics
Deploy Grafana dashboard with rate(app_energy_joules_total[5m]) / rate(app_requests_total[5m]) panel and set baseline alert at 0.12 J/req

Energy monitoring transitions sustainability from compliance reporting to engineering control. By measuring joules per request, correlating with grid carbon intensity, and normalizing by workload, teams gain actionable feedback loops that reduce cost, cut emissions, and improve system efficiency. Treat energy as a first-class metric, and optimization becomes measurable, repeatable, and automated.
