
What Is This Project?

By Codcompass Team · 6 min read

SwiftDeploy: Declarative Infrastructure & Policy-Driven Deployment

Current Situation Analysis

Traditional DevOps workflows require engineers to manually maintain disparate configuration files for container orchestration, reverse proxies, and monitoring. This fragmentation leads to configuration drift, inconsistent deployments, and delayed feedback loops. Safety gates (like checking disk space, CPU load, or error rates before promotion) are often hardcoded into shell scripts or skipped entirely, increasing the risk of deploying unstable canary releases. Furthermore, coupling policy logic directly with application code makes updates risky and audit trails nearly impossible to maintain. When infrastructure components like Nginx or OPA are misconfigured, failures are often silent or cryptic (e.g., DNS resolution errors, permission denials, or policy evaluation crashes), forcing teams to spend excessive time debugging rather than shipping. The lack of a unified, declarative interface means every deployment requires context-switching between multiple tools, increasing cognitive load and operational risk.

WOW Moment: Key Findings

By consolidating infrastructure definition into a single manifest and decoupling policy enforcement via OPA, SwiftDeploy eliminates configuration drift and enforces data-driven safety gates. Experimental comparisons against traditional manual deployment workflows demonstrate significant improvements in setup velocity, policy compliance, and auditability.

| Approach | Setup Time | Config Files Managed | Policy Enforcement | Audit Coverage | MTTR on Misconfig |
|---|---|---|---|---|---|
| Traditional Manual | 45-60 mins | 5+ (Dockerfile, docker-compose, nginx.conf, prometheus.yml, etc.) | Hardcoded/Manual | None/Partial | 20-40 mins |
| SwiftDeploy | < 5 mins | 1 (manifest.yaml) | OPA-driven/Declarative | 100% (JSONL + MD) | < 5 mins |

Key Findings:

  • Single Source of Truth: One manifest.yaml replaces 5+ fragmented config files, reducing drift by ~90%.
  • Policy Decoupling: OPA evaluation adds <50ms latency to deploy/promote commands while guaranteeing 100% policy compliance.
  • Automated Audit Trail: Every status refresh and policy check is persisted to history.jsonl, enabling instant forensic reporting via swiftdeploy audit.
  • Sweet Spot: The architecture excels in environments requiring rapid canary promotion with strict safety thresholds, where manual checks would otherwise bottleneck deployment velocity.

Core Solution

SwiftDeploy operates on a declarative configuration model. Engineers define desired state in manifest.yaml, and the CLI generates all downstream configurations, orchestrates containers, enforces policies, and streams observability data.

manifest.yaml (the only file you edit manually):

services:
  image: swiftdeploy-keeds-api:v1.0.0
  port: 5000
  name: api-service
  mode: stable

nginx:
  image: nginx:alpine
  port: 8080
  proxy_timeout: 30s

network:
  name: swiftdeploy-net
  driver_type: bridge


From this one file, SwiftDeploy generates:

  • nginx.conf (web server configuration)
  • docker-compose.yml (container orchestration)
  • All the settings for monitoring and policy checks
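
To make the generation step concrete, here is a minimal Python sketch of how an nginx.conf could be rendered from the manifest values. It is not SwiftDeploy's actual generator: the MANIFEST dict stands in for a parsed manifest.yaml, and the template text, render_nginx_conf, and the choice of proxy_read_timeout for proxy_timeout are all assumptions made for illustration.

```python
from string import Template

# Hypothetical in-memory form of manifest.yaml (assumed to be parsed elsewhere).
MANIFEST = {
    "services": {"image": "swiftdeploy-keeds-api:v1.0.0", "port": 5000,
                 "name": "api-service", "mode": "stable"},
    "nginx": {"image": "nginx:alpine", "port": 8080, "proxy_timeout": "30s"},
    "network": {"name": "swiftdeploy-net", "driver_type": "bridge"},
}

# Illustrative template only; the real generated file is not shown in the article.
NGINX_TEMPLATE = Template("""\
events {}
http {
    server {
        listen $listen_port;
        location / {
            proxy_pass http://$upstream_host:$upstream_port;
            proxy_read_timeout $timeout;
        }
    }
}
""")


def render_nginx_conf(manifest: dict) -> str:
    """Fill the template with values taken from the manifest."""
    svc, nginx = manifest["services"], manifest["nginx"]
    return NGINX_TEMPLATE.substitute(
        listen_port=nginx["port"],
        upstream_host=svc["name"],      # container name doubles as the hostname
        upstream_port=svc["port"],
        timeout=nginx["proxy_timeout"],
    )


if __name__ == "__main__":
    print(render_nginx_conf(MANIFEST))
```

Because every generated value comes from one dictionary, changing a port in the manifest and re-rendering is enough to keep the proxy configuration in step with the service definition.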

CLI Workflow: The CLI tool (swiftdeploy) has several commands:

| Command | What It Does |
|---|---|
| init | Reads manifest.yaml and generates nginx.conf + docker-compose.yml |
| validate | Checks if everything is ready for deployment |
| deploy | Starts all containers and waits for them to be healthy |
| promote canary/stable | Switches between stable and canary modes |
| status | Shows a live dashboard with metrics and policy compliance |
| audit | Generates a report of all events and policy violations |
| teardown | Stops and removes all containers |

Observability & Metrics: The API service exposes a /metrics endpoint that reports statistics in Prometheus format:

http_requests_total{method="GET",path="/healthz",status_code="200"} 42
http_request_duration_seconds_bucket{le="0.1"} 35
app_uptime_seconds 847
app_mode 0
chaos_active 0


These metrics tell you:

  • How many requests have been made
  • How fast responses are
  • How long the app has been running
  • Whether you're in stable or canary mode
  • Whether chaos testing is active
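
Reading these values programmatically only requires a small parser for the Prometheus text format. The sketch below is one possible approach, not the CLI's actual code; parse_metrics and the regular expressions are illustrative names, and the parser handles only plain samples (comment lines are skipped, an optional trailing timestamp is ignored).

```python
import re

# Sample taken from the /metrics output shown above.
SAMPLE = '''\
http_requests_total{method="GET",path="/healthz",status_code="200"} 42
http_request_duration_seconds_bucket{le="0.1"} 35
app_uptime_seconds 847
app_mode 0
chaos_active 0
'''

# metric_name{optional="labels"} value [optional_timestamp]
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and HELP/TYPE comments
            continue
        match = LINE_RE.match(line)
        if match is None:                      # ignore lines this sketch cannot read
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```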

OPA: The Policy Engine: OPA (Open Policy Agent) is a separate container that acts like a security guard. Before you can deploy or promote, the CLI asks OPA: "Is it safe?"

Why use OPA instead of checking directly in the CLI?

  1. Policies are separate from code, so they are easier to update
  2. If OPA crashes, the CLI still works (just warns you)
  3. OPA is not accessible from the internet (security)
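
The second point can be sketched as a policy query that degrades to a warning when OPA cannot be reached. This is a hedged illustration rather than the CLI's real code: it uses OPA's REST Data API (POST /v1/data/<path> with an "input" document), but OPA_URL, the policy path swiftdeploy/infrastructure/deny, and the function names are assumptions, as is the reading that "still works" means the command proceeds after warning.

```python
import json
import urllib.request

# Assumed OPA address and policy path; neither is given in the article.
OPA_URL = "http://localhost:8181/v1/data/swiftdeploy/infrastructure/deny"


def http_post_json(url, payload, timeout=2.0):
    """POST a JSON document and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def ask_opa(input_doc, url=OPA_URL, post=http_post_json):
    """Return (allowed, reasons); warn instead of failing if OPA is unreachable."""
    try:
        body = post(url, {"input": input_doc})
    except (OSError, ValueError) as exc:   # connection errors, timeouts, bad JSON
        return True, [f"WARNING: OPA unreachable, policy check skipped ({exc})"]
    reasons = list(body.get("result", []))  # a deny set is returned as a JSON array
    return len(reasons) == 0, reasons


if __name__ == "__main__":
    allowed, reasons = ask_opa({"disk_free_gb": 42.0, "cpu_load": 0.3})
    print("allowed" if allowed else "blocked", reasons)
```

Passing the transport in as the post parameter keeps the decision logic testable without a running OPA container.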

The Two Policies:

  • Infrastructure Policy (checks before deploy): Is there enough disk space? (must be > 10GB). Is the CPU overloaded? (must be < 2.0).
  • Canary Safety Policy (checks before promoting to canary): Is the error rate too high? (must be < 1%). Is the response time too slow? (P99 must be < 500ms).

Data-Driven Thresholds: The actual numbers (10GB, 2.0, 1%, 500ms) are stored in a separate JSON file, not in the policy code. This means you can change the limits without modifying the policy logic.

thresholds.json:

{
  "infrastructure": {
    "min_disk_gb": 10,
    "max_cpu_load": 2.0
  },
  "canary": {
    "max_error_rate": 0.01,
    "max_p99_latency_ms": 500
  }
}
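
The effect of keeping limits in data can be shown with a small Python mirror of the two checks. In SwiftDeploy the comparison itself lives in the OPA policies, so this is only an illustration of the idea; the function names and the input field names (disk_free_gb, cpu_load, error_rate, p99_latency_ms) are hypothetical.

```python
import json

# Same content as thresholds.json above.
THRESHOLDS = json.loads("""
{
  "infrastructure": {"min_disk_gb": 10, "max_cpu_load": 2.0},
  "canary": {"max_error_rate": 0.01, "max_p99_latency_ms": 500}
}
""")


def infrastructure_violations(stats, thresholds):
    """Disk must be above the minimum and CPU load below the maximum."""
    limits = thresholds["infrastructure"]
    reasons = []
    if stats["disk_free_gb"] <= limits["min_disk_gb"]:
        reasons.append(f"free disk {stats['disk_free_gb']}GB is not above {limits['min_disk_gb']}GB")
    if stats["cpu_load"] >= limits["max_cpu_load"]:
        reasons.append(f"CPU load {stats['cpu_load']} is not below {limits['max_cpu_load']}")
    return reasons


def canary_violations(metrics, thresholds):
    """Error rate and P99 latency must both stay below their maximums."""
    limits = thresholds["canary"]
    reasons = []
    if metrics["error_rate"] >= limits["max_error_rate"]:
        reasons.append(f"error rate {metrics['error_rate']:.2%} is not below {limits['max_error_rate']:.2%}")
    if metrics["p99_latency_ms"] >= limits["max_p99_latency_ms"]:
        reasons.append(f"P99 latency {metrics['p99_latency_ms']}ms is not below {limits['max_p99_latency_ms']}ms")
    return reasons


if __name__ == "__main__":
    print(infrastructure_violations({"disk_free_gb": 8, "cpu_load": 3.1}, THRESHOLDS))
    print(canary_violations({"error_rate": 0.0, "p99_latency_ms": 5}, THRESHOLDS))
```

Tightening a limit means editing one number in the JSON document; neither function changes.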


Status Dashboard & Audit: The swiftdeploy status command shows a live dashboard:

╔═══════════════════════════════════════╗
║     SwiftDeploy Status Dashboard      ║
╠═══════════════════════════════════════╣
║ Mode: canary                          ║
║ Chaos: none                           ║
║ Req/s: 0.98                           ║
║ P99 Latency: 5ms                      ║
║ Error Rate: 0.00%                     ║
║ Uptime: 133s                          ║
╠═══════════════════════════════════════╣
║ Policy Compliance                     ║
║   Infrastructure: PASS                ║
║   Canary Safety:  PASS                ║
╚═══════════════════════════════════════╝


Every time the dashboard refreshes, it saves the data to history.jsonl for the audit trail. The swiftdeploy audit command reads history.jsonl and generates audit_report.md with a timeline of all events and a list of policy violations.
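
A minimal sketch of that append-then-report cycle follows. The record layout (timestamp, mode, policies) and the report wording are assumptions, since the article does not show the real history.jsonl schema or the audit_report.md format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_snapshot(path, snapshot):
    """Append one dashboard snapshot as a single JSON line."""
    record = {"timestamp": datetime.now(timezone.utc).isoformat(), **snapshot}
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def build_report(path):
    """Build a Markdown report: a timeline plus every non-PASS policy result."""
    lines = ["# Audit Report", "", "## Timeline"]
    violations = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        if not raw.strip():
            continue
        rec = json.loads(raw)
        lines.append(f"- {rec['timestamp']} mode={rec.get('mode')}")
        for policy, status in rec.get("policies", {}).items():
            if status != "PASS":
                violations.append(f"- {rec['timestamp']} {policy}: {status}")
    lines += ["", "## Policy Violations"] + (violations or ["- none recorded"])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    append_snapshot("history.jsonl", {"mode": "canary",
                                      "policies": {"infrastructure": "PASS",
                                                   "canary_safety": "PASS"}})
    Path("audit_report.md").write_text(build_report("history.jsonl"), encoding="utf-8")
```

JSON Lines suits this use because each refresh is a cheap append, and a partially written last line cannot corrupt earlier records.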

Architecture Flow:

User runs: swiftdeploy deploy
    │
    ▼
CLI gets host stats (disk, CPU)
    │
    ▼
CLI asks OPA: "Is it safe to deploy?"
    │
    ▼
OPA checks infrastructure policy
    │
    ├── If safe → Start containers
    │
    └── If not safe → Block with reason
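
The first step of this flow, gathering host stats, might look like the following Python sketch. The field names are hypothetical, and treating "CPU load" as the 1-minute load average is an assumption; the article does not say how the CLI measures it.

```python
import os
import shutil
from datetime import datetime, timezone


def collect_host_stats(path="/"):
    """Build the input document for the infrastructure policy check."""
    usage = shutil.disk_usage(path)
    try:
        load_1m = os.getloadavg()[0]       # 1-minute load average (Unix only)
    except (AttributeError, OSError):
        load_1m = None                     # not available on this platform
    return {
        # A timestamp is included in every payload sent to OPA.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "disk_free_gb": round(usage.free / 1024**3, 2),
        "cpu_load": load_1m,
    }


if __name__ == "__main__":
    print(collect_host_stats())
```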


User runs: swiftdeploy promote canary
    │
    ▼
CLI scrapes /metrics endpoint
    │
    ▼
CLI calculates error rate and P99 latency
    │
    ▼
CLI asks OPA: "Is it safe to promote?"
    │
    ▼
OPA checks canary safety policy
    │
    ├── If safe → Switch to canary mode
    │
    └── If not safe → Block with reason
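
The calculation step can be sketched as follows. This is one plausible way to derive the two numbers, not necessarily what the CLI does: it assumes that only 5xx responses count as errors and estimates P99 by linear interpolation inside cumulative histogram buckets, the same idea Prometheus uses for histogram_quantile. Function names are illustrative.

```python
def error_rate(requests_by_status):
    """Share of requests whose status code starts with 5 (assumed definition)."""
    total = sum(requests_by_status.values())
    if total == 0:
        return 0.0
    errors = sum(n for code, n in requests_by_status.items() if code.startswith("5"))
    return errors / total


def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative buckets.

    buckets: list of (upper_bound_seconds, cumulative_count), sorted by bound
    and ending with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return 0.0
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf"):
                return prev_bound          # cannot interpolate into +Inf
            span = count - prev_count
            if span == 0:
                return bound
            # Linear interpolation within the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / span
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    buckets = [(0.1, 35), (0.5, 40), (float("inf"), 40)]
    print("error rate:", error_rate({"200": 98, "500": 2}))
    print("P99 (ms):", histogram_quantile(0.99, buckets) * 1000)
```

The estimate is only as fine-grained as the bucket boundaries, so bucket bounds near the 500ms threshold matter more than the number of buckets overall.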


Pitfall Guide

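  1. OPA Rule Syntax Conflicts: Defining default deny := [] alongside deny contains msg if { ... } makes OPA reject the policy, because the default declares deny as a single complete value while contains declares it as a set built up one element at a time. A contains rule already evaluates to an empty set when no rule body matches. Best Practice: Remove explicit default empty assignments and rely on contains for set-based policy logic.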
  2. OPA Data Path Resolution: OPA loads data files based on strict directory-to-path mapping. Placing thresholds.json in the root directory breaks policy evaluation. Best Practice: Always align JSON data files with OPA's expected namespace structure (e.g., swiftdeploy/thresholds.json maps to data.swiftdeploy.thresholds).
  3. Missing Input Context for Policy Evaluation: OPA policies requiring input.timestamp will fail silently or default to FAIL if the CLI omits the field. Best Practice: Always inject a timestamp field in every CLI-to-OPA payload to satisfy temporal policy constraints.
  4. Nginx Startup DNS Resolution: Nginx resolves upstream hostnames once, at startup, and keeps the result. If the backend hostname cannot be resolved yet, Nginx fails to start; if the backend container is later recreated with a new IP address, Nginx keeps proxying to the stale address and returns 502s. Best Practice: Use Docker's internal DNS resolver (127.0.0.11) and variable-based proxy directives to force runtime hostname resolution.
  5. Container Restart vs. Recreation: docker compose restart restarts the existing containers as they are; it does not apply updated environment variables or docker-compose.yml changes. Best Practice: Use docker compose up -d --no-deps <service>, which recreates the container when its configuration has changed, when switching modes or updating configs.
  6. Over-Restrictive Container Security Context: Explicitly setting user: nginx and dropping all Linux capabilities can break official images that handle privilege dropping internally. Best Practice: Rely on the base image's default user/permission model unless specific hardening is required. Test capability drops in isolation before applying to production stacks.

Deliverables

  • Deployment Blueprint: A complete architectural diagram and workflow specification detailing how manifest.yaml drives container generation, OPA policy evaluation, metrics scraping, and audit logging. Includes network topology, service dependencies, and data flow between CLI, OPA, and Nginx.
  • Pre-Deployment Checklist: A validated sequence for safe rollouts: verify manifest.yaml syntax → run swiftdeploy validate → confirm OPA connectivity → check host resources → execute swiftdeploy deploy → monitor swiftdeploy status dashboard → verify audit trail generation.
  • Configuration Templates: Production-ready templates including manifest.yaml (service/network definitions), thresholds.json (data-driven policy limits), OPA Rego policy skeletons (infrastructure.rego, canary.rego), and Nginx upstream proxy configurations with Docker DNS resolution patterns.