mplementation demonstrates a production-ready architecture using a Go API service, PostgreSQL, and an Nginx reverse proxy.
Step 1: Define the OCI-Compliant Build Context
Traditional Dockerfiles remain fully compatible. The difference lies in how the build process interacts with the host filesystem and user namespaces.
# Containerfile
FROM golang:1.21-alpine AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /bin/api-server ./cmd/server
FROM alpine:3.18
RUN apk add --no-cache ca-certificates tzdata
COPY --from=builder /bin/api-server /usr/local/bin/
EXPOSE 8080
USER 1000:1000
ENTRYPOINT ["/usr/local/bin/api-server"]
Architecture Decision: The explicit USER directive drops root privileges inside the container, and stripping debug symbols (-ldflags="-s -w") from the static binary keeps the final image small. The multi-stage build ensures no build tooling leaks into the final image. This aligns with rootless execution because the container process never requires host-level capabilities.
Step 2: Orchestrate Multi-Service Dependencies
Compose workflows translate directly. The engine parses the YAML, resolves dependencies, and spawns isolated processes using the same OCI runtime specification.
# compose.yaml
services:
  api:
    build:
      context: .
      dockerfile: Containerfile
    ports:
      - "127.0.0.1:8080:8080"
    environment:
      DB_HOST: postgres
      DB_PORT: "5432"
    depends_on:
      postgres:
        condition: service_healthy
  postgres:
    image: docker.io/library/postgres:16-alpine
    environment:
      POSTGRES_DB: appdb
      POSTGRES_USER: devuser
      POSTGRES_PASSWORD: securelocal
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U devuser -d appdb"]
      interval: 5s
      timeout: 3s
      retries: 5
volumes:
  pgdata:
Architecture Decision: Binding the API port to 127.0.0.1 prevents external exposure during development. Health checks replace arbitrary sleep delays, ensuring deterministic startup sequencing. The volume declaration uses named volumes, which the engine manages under the user's storage directory, avoiding host path permission conflicts.
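The health check gates the first start, but the API can still lose its database later (for example, when the postgres container restarts). A minimal sketch of how the Go service might resolve DB_HOST and DB_PORT and retry the TCP connection is shown below. The helper names dbAddr and waitForTCP, the fallback values, and the retry limits are assumptions made for illustration; they are not taken from the original service.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// dbAddr builds the host:port pair from DB_HOST and DB_PORT, the two
// variables set in compose.yaml. The fallback values are assumptions.
func dbAddr(getenv func(string) string) string {
	host := getenv("DB_HOST")
	if host == "" {
		host = "localhost"
	}
	port := getenv("DB_PORT")
	if port == "" {
		port = "5432"
	}
	return net.JoinHostPort(host, port)
}

// waitForTCP dials addr up to attempts times, sleeping interval after
// each failed try, and returns the last error if none succeed.
func waitForTCP(addr string, attempts int, interval time.Duration) error {
	var lastErr error
	for i := 0; i < attempts; i++ {
		conn, err := net.DialTimeout("tcp", addr, time.Second)
		if err == nil {
			return conn.Close()
		}
		lastErr = err
		time.Sleep(interval)
	}
	return fmt.Errorf("database at %s not reachable after %d attempts: %w", addr, attempts, lastErr)
}

func main() {
	addr := dbAddr(os.Getenv)
	fmt.Println("database address:", addr)
	// Short limits keep the sketch quick to run; tune them for real use.
	if err := waitForTCP(addr, 2, 100*time.Millisecond); err != nil {
		fmt.Println("warning:", err)
	}
}
```

Passing the environment lookup as a function keeps dbAddr testable without mutating the process environment.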
Step 3: Execute and Verify Rootless Isolation
The runtime constructs an OCI spec, forks the process, and delegates execution to the configured OCI runtime (crun or runc). No background daemon intercepts the call.
$ podman compose up -d
[+] Running 3/3
⠿ Network api_default Created
⠿ Container api-postgres-1 Started
⠿ Container api-api-1 Started
$ podman ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a1b2c3d4e5f6 localhost/api:latest /usr/local/bin/api 12 seconds ago Up 12 seconds ago 127.0.0.1:8080->8080/tcp api-api-1
f6e5d4c3b2a1 docker.io/library/postgres:16-alpine postgres 12 seconds ago Up 12 seconds ago 5432/tcp api-postgres-1
Architecture Decision: Direct fork-exec means no long-running daemon holds state shared between invocations. In rootless mode, each container runs inside the invoking user's cgroup hierarchy and user namespace, with its own network namespace. A container crash therefore cannot corrupt central daemon state, and resource limits apply predictably per user session.
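One way to check the cgroup placement described above is to read a process's cgroup entry and test whether it sits under the user's systemd slice. The sketch below assumes a cgroup v2 host managed by systemd; the helper names cgroupV2Path and underUserSlice are hypothetical, and the exact scope names below the user slice vary by system.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// cgroupV2Path extracts the path from the unified-hierarchy entry
// ("0::<path>") in the contents of a /proc/<pid>/cgroup file.
func cgroupV2Path(content string) (string, bool) {
	for _, line := range strings.Split(content, "\n") {
		if strings.HasPrefix(line, "0::") {
			return strings.TrimPrefix(line, "0::"), true
		}
	}
	return "", false
}

// underUserSlice reports whether path sits below the systemd slice
// that belongs to the given UID.
func underUserSlice(path string, uid int) bool {
	return strings.HasPrefix(path, fmt.Sprintf("/user.slice/user-%d.slice/", uid))
}

func main() {
	// Inspects the current process; for a container, read the same file
	// for the container's host PID instead.
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		fmt.Println("cannot read cgroup info:", err)
		return
	}
	path, ok := cgroupV2Path(string(data))
	if !ok {
		fmt.Println("no cgroup v2 entry found")
		return
	}
	fmt.Printf("cgroup path: %s (under user slice: %v)\n", path, underUserSlice(path, os.Getuid()))
}
```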
Step 4: Integrate into CI/CD Without Privilege Escalation
Pipeline jobs execute the same commands under the runner's unprivileged user. Static binaries and chroot isolation replace daemon requirements.
# .gitlab-ci.yml
build-and-push:
  stage: build
  image: quay.io/podman/stable:latest
  script:
    - podman build -t registry.example.com/team/api:${CI_COMMIT_SHORT_SHA} .
    # Read the password from stdin so it never appears in the process list.
    - echo "${CI_REGISTRY_PASSWORD}" | podman login -u "${CI_REGISTRY_USER}" --password-stdin registry.example.com
    - podman push registry.example.com/team/api:${CI_COMMIT_SHORT_SHA}
  variables:
    BUILDAH_ISOLATION: chroot
Architecture Decision: BUILDAH_ISOLATION=chroot makes the build engine execute RUN steps in a chroot instead of launching a full OCI runtime container, so the build does not need to create the nested namespaces that unprivileged CI runners often block. The pipeline never requests privileged: true, eliminating device passthrough and cgroup manipulation risks. This configuration supports the least-privilege controls that SOC 2 and ISO 27001 audits typically examine in shared runner environments.
Pitfall Guide
1. Assuming Global Image Cache
Explanation: Rootless storage isolates images to ~/.local/share/containers/storage/. Multiple users on the same host will download identical layers independently, wasting disk space and network bandwidth.
Fix: Point additionalimagestores in storage.conf at a shared, read-only image store, or run a local pull-through registry cache (e.g., registry:2). For CI, build with podman build --pull=always so each job uses the current base image rather than a stale cached copy.
2. Ignoring Lingering Mode for Persistent Services
Explanation: Systemd user services terminate when the user session ends. Containers configured to start on boot will stop after logout or SSH disconnect.
Fix: Run sudo loginctl enable-linger $USER to keep the user manager alive across reboots and session changes. Verify with loginctl show-user $USER | grep Linger.
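The verification step can also be scripted, for instance as a pre-deployment check. The sketch below parses the KEY=VALUE output of loginctl show-user and reports whether lingering is on. The helper name lingerEnabled is hypothetical, and the program assumes loginctl is on PATH.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"os/user"
	"strings"
)

// lingerEnabled scans KEY=VALUE output from `loginctl show-user` and
// reports whether it contains Linger=yes.
func lingerEnabled(output string) bool {
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		key, value, found := strings.Cut(strings.TrimSpace(sc.Text()), "=")
		if found && key == "Linger" {
			return value == "yes"
		}
	}
	return false
}

func main() {
	u, err := user.Current()
	if err != nil {
		fmt.Println("cannot determine current user:", err)
		return
	}
	out, err := exec.Command("loginctl", "show-user", u.Username).Output()
	if err != nil {
		fmt.Println("loginctl unavailable or failed:", err)
		return
	}
	if lingerEnabled(string(out)) {
		fmt.Println("lingering is enabled for", u.Username)
	} else {
		fmt.Println("lingering is NOT enabled; user services stop at logout")
	}
}
```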
3. Network Namespace Binding Failures
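Explanation: Rootless containers reach the outside world through user-space networking: slirp4netns by default before Podman 5.0, pasta from 5.0 onward. Unprivileged users also cannot bind host ports below 1024 by default, so publishing privileged ports or expecting exact host network behavior will fail.
Fix: Keep netavark with aardvark-dns as the network backend (network_backend = "netavark" under [network] in containers.conf) for container-to-container DNS. To publish a port below 1024, either map a high host port (for example 8080:80) or lower net.ipv4.ip_unprivileged_port_start via sysctl. For host-equivalent performance, use --network host where security boundaries allow.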
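On Linux, the boundary between privileged and unprivileged ports is exposed through the sysctl net.ipv4.ip_unprivileged_port_start (1024 by default). The following sketch reads that value and reports which host ports a rootless process could publish. The helper names parsePortStart and canBindUnprivileged are hypothetical, and the port list is only an example.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// sysctlPath is the procfs view of net.ipv4.ip_unprivileged_port_start.
const sysctlPath = "/proc/sys/net/ipv4/ip_unprivileged_port_start"

// parsePortStart converts the sysctl file contents into an integer.
func parsePortStart(content string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(content))
}

// canBindUnprivileged reports whether a process without
// CAP_NET_BIND_SERVICE may bind the given port.
func canBindUnprivileged(port, unprivilegedStart int) bool {
	return port >= unprivilegedStart
}

func main() {
	data, err := os.ReadFile(sysctlPath)
	if err != nil {
		fmt.Println("cannot read sysctl:", err)
		return
	}
	start, err := parsePortStart(string(data))
	if err != nil {
		fmt.Println("unexpected sysctl value:", err)
		return
	}
	for _, port := range []int{80, 443, 8080} {
		fmt.Printf("port %d bindable without privileges: %v\n", port, canBindUnprivileged(port, start))
	}
}
```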
4. Volume Permission Mismatches on Bind Mounts
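Explanation: In rootless mode, container UID 0 maps to the invoking user on the host, while every other container UID (such as the 1000 set by USER in the Containerfile) maps into the subordinate range from /etc/subuid (e.g., 100000+). Bind mounts keep their host ownership, so a non-root container process often gets Permission denied when it writes to the mount.
Fix: Add the :U volume option so Podman chowns the source to the container user, run with --userns=keep-id, or use --uidmap/--gidmap to align container and host UIDs. On SELinux hosts, also apply a label (:Z for private, :z for shared) to the bind mount. Alternatively, run the container process as a user whose mapped host UID owns the directory.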
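To see which host UID a container user ends up as, the default rootless mapping can be computed by hand: container UID 0 becomes the invoking user, and container UID n (n ≥ 1) becomes the start of the user's /etc/subuid range plus n − 1. The sketch below works through that arithmetic. The type subIDRange, the helper names, and the sample entry developer:100000:65536 are illustrative assumptions, and the calculation does not apply when custom --uidmap or --userns options are in use.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// subIDRange is one entry from /etc/subuid in the form name:start:count.
type subIDRange struct {
	Name  string
	Start int
	Count int
}

// parseSubIDLine parses a single /etc/subuid line.
func parseSubIDLine(line string) (subIDRange, error) {
	parts := strings.Split(strings.TrimSpace(line), ":")
	if len(parts) != 3 {
		return subIDRange{}, fmt.Errorf("malformed subuid entry %q", line)
	}
	start, err := strconv.Atoi(parts[1])
	if err != nil {
		return subIDRange{}, fmt.Errorf("bad start in %q: %w", line, err)
	}
	count, err := strconv.Atoi(parts[2])
	if err != nil {
		return subIDRange{}, fmt.Errorf("bad count in %q: %w", line, err)
	}
	return subIDRange{Name: parts[0], Start: start, Count: count}, nil
}

// hostUID applies the default rootless mapping: container UID 0 is the
// invoking user; container UID n >= 1 is r.Start + n - 1.
func hostUID(containerUID, userUID int, r subIDRange) (int, error) {
	switch {
	case containerUID < 0:
		return 0, fmt.Errorf("negative container UID %d", containerUID)
	case containerUID == 0:
		return userUID, nil
	case containerUID > r.Count:
		return 0, fmt.Errorf("container UID %d is outside the mapped range", containerUID)
	default:
		return r.Start + containerUID - 1, nil
	}
}

func main() {
	// Sample values; on a real host read /etc/subuid and os.Getuid().
	r, err := parseSubIDLine("developer:100000:65536")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, uid := range []int{0, 1000} {
		host, err := hostUID(uid, 1000, r)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("container UID %d -> host UID %d\n", uid, host)
	}
}
```

With the sample values, container UID 0 resolves to host UID 1000 and container UID 1000 resolves to host UID 100999, which is why a bind-mounted directory owned by the developer account is not writable by the USER 1000:1000 process.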
5. CI Runner Capability Assumptions
Explanation: Expecting docker commands to work in CI without configuration leads to daemon startup failures. Many pipelines assume sudo or privileged mode is available.
Fix: Install static podman and buildah binaries in the runner image. Set CONTAINER_HOST if using a remote socket, or rely on local fork-exec. Never request privileged: true unless absolutely necessary for kernel testing.
6. Treating Swarm as a Production Path
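Explanation: Docker Swarm mode still ships with Docker Engine but receives little new feature development, and Podman does not implement it. New deployments rarely adopt it, and it lacks modern service mesh integration.
Fix: Migrate multi-node orchestration to Kubernetes, Nomad, or cloud-managed control planes. Use the container engine strictly as a node-level runtime. Podman itself does not implement the Container Runtime Interface; CRI-O does, and the two share the same storage and image libraries. Podman can also bridge local work to a cluster through podman kube generate and podman kube play.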
7. Overlooking OCI Runtime Selection
Explanation: runc, written in Go, is the reference implementation, but it typically uses more memory and starts containers more slowly than crun, which is written in C.
Fix: Benchmark both runtimes on your own workload. To switch, set runtime = "crun" under [engine] in containers.conf. Expect gains on the order of a 30-50% reduction in cold-start latency in high-density workloads, but confirm the figure against your own measurements.
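A rough way to run that benchmark is to time a short-lived container several times per runtime and compare medians. The sketch below uses Podman's global --runtime option. The helper names timeRuns and medianMillis, the run count of five, and the alpine:3.18 test image are assumptions chosen for illustration, and wall-clock timing of this kind also includes Podman's own setup work. Pull the image beforehand so the first sample does not include the download.

```go
package main

import (
	"fmt"
	"os/exec"
	"sort"
	"time"
)

// medianMillis returns the median of the samples (in milliseconds).
func medianMillis(samples []float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	mid := len(sorted) / 2
	if len(sorted)%2 == 1 {
		return sorted[mid]
	}
	return (sorted[mid-1] + sorted[mid]) / 2
}

// timeRuns starts a container that exits immediately, n times, using
// the given OCI runtime, and records each wall-clock duration.
func timeRuns(ociRuntime, image string, n int) ([]float64, error) {
	samples := make([]float64, 0, n)
	for i := 0; i < n; i++ {
		start := time.Now()
		cmd := exec.Command("podman", "--runtime", ociRuntime, "run", "--rm", image, "true")
		if err := cmd.Run(); err != nil {
			return nil, fmt.Errorf("%s run %d: %w", ociRuntime, i, err)
		}
		samples = append(samples, float64(time.Since(start).Microseconds())/1000)
	}
	return samples, nil
}

func main() {
	if _, err := exec.LookPath("podman"); err != nil {
		fmt.Println("podman not found; skipping benchmark")
		return
	}
	const image = "docker.io/library/alpine:3.18" // assumed test image
	for _, rt := range []string{"runc", "crun"} {
		samples, err := timeRuns(rt, image, 5)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		fmt.Printf("%s median start-to-exit: %.1f ms\n", rt, medianMillis(samples))
	}
}
```

The median is used instead of the mean so that a single slow run, such as one hit by disk cache misses, does not skew the comparison.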
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Local Development | Daemonless rootless with podman compose | Eliminates sudo requirements, matches production security model, reduces host exposure | Zero; uses existing developer workstations |
| CI/CD Pipeline | Unprivileged podman + chroot isolation | Compliant with shared runner policies, removes kernel attack surface, faster feedback | Low; static binaries require minimal runner configuration |
| Production Linux Server | Systemd user services + lingering mode | Boot persistence without root, predictable resource limits, audit-friendly process tree | Medium; requires systemd tuning and storage planning |
| Multi-Tenant Edge Node | CRI-O + Podman as node runtime | Kubernetes-native, isolates tenant workloads, avoids daemon contention | High; requires cluster management but reduces long-term operational debt |
Configuration Template
# ~/.config/containers/containers.conf
[engine]
cgroup_manager = "systemd"
events_logger = "file"
runtime = "crun"

[engine.service_destinations]
[engine.service_destinations.local]
uri = "unix:///run/user/1000/podman/podman.sock"

# network_backend belongs to the [network] table, not [engine]
[network]
network_backend = "netavark"

# ~/.config/containers/storage.conf
# Storage settings are read from storage.conf, not containers.conf
[storage]
driver = "overlay"
graphroot = "/home/developer/.local/share/containers/storage"
runroot = "/run/user/1000/containers"

[storage.options.overlay]
mount_program = "/usr/bin/fuse-overlayfs"
# .gitlab-ci.yml (CI/CD Integration)
container-build:
  stage: build
  image: quay.io/podman/stable:latest
  variables:
    BUILDAH_ISOLATION: chroot
    CONTAINER_TMPDIR: /tmp/build
  script:
    - mkdir -p $CONTAINER_TMPDIR
    - podman build --format docker -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - podman push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
Quick Start Guide
- Install the runtime: Use your distribution's package manager (apt install podman or dnf install podman), or brew install podman on macOS. Verify the installation with podman --version.
- Initialize rootless storage: If you are migrating from a previous configuration, run podman system reset; note that it removes all existing containers, images, and volumes. Confirm the storage path with podman info | grep graphRoot.
- Deploy a test service: Create a compose.yaml with a single stateless service. Execute podman compose up -d and verify with podman ps.
- Enable boot persistence: Run sudo loginctl enable-linger $USER. Generate a systemd unit with podman generate systemd --name <service> --files --new, then enable it via systemctl --user enable <unit>.service. On Podman 4.4 and later, Quadlet unit files are the recommended replacement for the deprecated podman generate systemd.
- Validate CI compatibility: Run a local build with BUILDAH_ISOLATION=chroot podman build . and confirm that no sudo or privileged flags are required.
The transition from daemon-based to daemonless container execution is no longer experimental. It is the operational baseline for secure, scalable, and compliant infrastructure. By aligning with rootless defaults, user-scoped storage, and fork-exec semantics, teams eliminate privilege escalation vectors, reduce CI blast radius, and future-proof their runtime architecture against evolving security standards.