--dearmor -o /etc/apt/keyrings/docker.gpg
Configure the repository
echo
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
Install Docker packages
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Configure user permissions to avoid sudo for docker commands
sudo usermod -aG docker $USER
newgrp docker
*Note: Running `newgrp docker` applies the group change to the current session without requiring a logout. In production, verify Docker is running with `docker info`.*
#### 2. Project Scaffolding and Security
Create a dedicated directory for the deployment. We use a `.env` file to manage configuration variables, separating secrets from the compose definition.
```bash
mkdir -p ~/hermes-deploy/data
cd ~/hermes-deploy
Environment Configuration:
Create .env with the following variables. This structure isolates domain settings, ACME contact info, and authentication credentials.
# Domain configuration
HERMES_DOMAIN=agent.yourdomain.com
ACME_EMAIL=ops@yourdomain.com
# Authentication
ADMIN_USER=agent_admin
# Generate hash: htpasswd -nbB <user> <pass> | cut -d: -f2
ADMIN_HASH=$2y$10$YourGeneratedBcryptHashHere
Best Practice: Never store plaintext passwords. Use htpasswd or openssl to generate bcrypt hashes for the ADMIN_HASH variable. Restrict .env file permissions with chmod 600 .env.
3. Orchestration with Docker Compose
The docker-compose.yml file defines three services: the edge router, the Hermes core, and an authentication gateway. This separation of concerns allows independent scaling and security policies.
services:
edge-router:
image: traefik:v3
command:
- --providers.docker=true
- --providers.docker.exposedbydefault=false
- --entrypoints.web.address=:80
- --entrypoints.websecure.address=:443
- --certificatesresolvers.letsencrypt.acme.email=${ACME_EMAIL}
- --certificatesresolvers.letsencrypt.acme.storage=/data/acme.json
- --certificatesresolvers.letsencrypt.acme.tlschallenge=true
- --api.dashboard=false
ports:
- "80:80"
- "443:443"
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./data/traefik:/data
restart: unless-stopped
networks:
- hermes-net
hermes-core:
image: nousresearch/hermes-agent:latest
volumes:
- ./data/hermes:/opt/data
labels:
- traefik.enable=true
- traefik.http.routers.hermes.rule=Host(`${HERMES_DOMAIN}`)
- traefik.http.routers.hermes.entrypoints=websecure
- traefik.http.routers.hermes.tls.certresolver=letsencrypt
- traefik.http.routers.hermes.middlewares=auth-gateway
restart: unless-stopped
networks:
- hermes-net
auth-gateway:
image: nginx:alpine
volumes:
- ./data/auth/.htpasswd:/etc/nginx/.htpasswd:ro
labels:
- traefik.enable=true
- traefik.http.middlewares.auth-gateway.basicauth.usersfile=/etc/nginx/.htpasswd
restart: unless-stopped
networks:
- hermes-net
networks:
hermes-net:
driver: bridge
Architecture Rationale:
- Traefik v3: Chosen for dynamic configuration via Docker labels and built-in ACME support. The
tlschallenge method verifies domain ownership without requiring DNS provider credentials, simplifying deployment.
- Service Isolation: The
auth-gateway service handles basic authentication middleware. This keeps auth logic decoupled from the Hermes core, allowing the agent to focus on inference and tool execution.
- Persistent Volumes: The
./data/hermes:/opt/data mapping ensures that agent memory, configuration, and tool states survive container restarts.
- Security:
exposedbydefault=false ensures only explicitly labeled services are routable. The Docker socket is mounted read-only (:ro) to minimize attack surface.
4. LLM Backend Configuration
Hermes Agent requires an LLM provider to function. The framework supports any OpenAI-compatible endpoint. For cost-effective production workloads, Vultr Serverless Inference provides a high-performance, OpenAI-compatible API.
Configure the Model:
docker run -it --rm \
-v ./data/hermes:/opt/data \
nousresearch/hermes-agent:latest model
The interactive CLI will prompt for provider selection. Choose the OpenAI-compatible option and input:
- API Key: Your provider's secret key.
- Base URL: For Vultr, this is typically
https://inference.vultr.com/v1.
- Model ID: The specific model identifier supported by the provider.
Production Tip: Store API keys in a secure vault or Docker secrets for large-scale deployments. Rotate keys regularly and monitor usage via the provider's dashboard to prevent unexpected costs.
5. Deployment and Verification
Launch Services:
docker compose up -d
Verify Status:
docker compose ps
Ensure all services report Up status. Check Traefik logs for certificate issuance:
docker compose logs edge-router | grep -i "acme"
Access the Agent:
Navigate to https://agent.yourdomain.com. The browser will prompt for credentials defined in the .env file. Once authenticated, the dashboard loads. Send a test message to verify LLM connectivity. A successful response confirms the routing chain: Browser β Traefik β Auth Gateway β Hermes Core β LLM Provider.
Pitfall Guide
Deploying self-hosted AI agents introduces specific operational risks. The following pitfalls are common in production environments and include mitigation strategies.
| Pitfall | Explanation | Fix |
|---|
| Volume Permission Mismatch | The Hermes container runs as a non-root user. If the host directory is owned by root, the agent cannot write memory or config files, causing silent failures. | Run sudo chown -R 1000:1000 ./data/hermes (or the UID/GID specified in the Dockerfile) before starting the container. |
| ACME Rate Limiting | Traefik may hit Let's Encrypt rate limits if certificates are requested too frequently, often due to misconfigured domains or repeated restarts during development. | Use the --certificatesresolvers.letsencrypt.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory flag during testing. Switch to production only after verification. |
| LLM API Key Exposure | Storing API keys in plain text within .env files accessible to all users on the host compromises the LLM account and incurs financial risk. | Restrict .env permissions (chmod 600). For multi-user hosts, use Docker secrets or a vault integration. Audit access logs on the LLM provider side. |
| Memory State Corruption | Abrupt termination of the container (e.g., kill -9) can corrupt the persistent memory database, leading to data loss or agent instability. | Always use docker compose down or docker compose stop for graceful shutdowns. Implement automated backups of the ./data/hermes volume. |
| Firewall Blocking | Ubuntu's UFW may block inbound traffic on ports 80 and 443, preventing Traefik from serving requests or completing ACME challenges. | Run sudo ufw allow 80/tcp and sudo ufw allow 443/tcp. Verify rules with sudo ufw status. |
| Dashboard Auth Bypass | Weak passwords or misconfigured Nginx auth files can allow unauthorized access to the agent dashboard, potentially exposing tools or memory. | Use strong bcrypt hashes. Regularly rotate credentials. Consider adding IP allowlisting via Traefik middleware for internal deployments. |
| Resource Exhaustion | Hermes Agent with persistent memory and tool execution can consume significant RAM/CPU, especially with complex tool chains or large context windows. | Set resource limits in docker-compose.yml using deploy.resources.limits. Monitor usage with docker stats and adjust based on workload. |
Production Bundle
This section provides actionable assets for deploying and managing Hermes Agent in production environments.
Action Checklist
Decision Matrix
Select the appropriate LLM routing strategy based on operational requirements.
| Scenario | Recommended Approach | Why | Cost Impact |
|---|
| High-Volume, Cost-Sensitive | Vultr Serverless Inference | OpenAI-compatible API with competitive pricing; scalable inference. | Low variable cost; predictable scaling. |
| Data Privacy Critical | Local LLM (e.g., Ollama) | Inference runs on-premise; no data leaves the network. | High upfront hardware cost; zero API fees. |
| Complex Reasoning Tasks | Premium Cloud Model (e.g., OpenAI) | State-of-the-art performance for difficult queries. | High per-token cost; use sparingly. |
| Hybrid Workloads | Router/Proxy (e.g., LiteLLM) | Route requests based on task complexity; optimize cost/performance. | Moderate infrastructure cost; optimized API spend. |
Configuration Template
Use this template as a baseline for production deployments. Customize variables and add resource limits as needed.
# docker-compose.prod.yml
services:
edge-router:
image: traefik:v3
command:
- --providers.docker=true
- --providers.docker.exposedbydefault=false
- --entrypoints.web.address=:80
- --entrypoints.websecure.address=:443
- --certificatesresolvers.letsencrypt.acme.email=${ACME_EMAIL}
- --certificatesresolvers.letsencrypt.acme.storage=/data/acme.json
- --certificatesresolvers.letsencrypt.acme.tlschallenge=true
- --api.dashboard=false
- --log.level=INFO
ports:
- "80:80"
- "443:443"
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./data/traefik:/data
restart: always
networks:
- hermes-net
hermes-core:
image: nousresearch/hermes-agent:latest
volumes:
- ./data/hermes:/opt/data
environment:
- TZ=UTC
labels:
- traefik.enable=true
- traefik.http.routers.hermes.rule=Host(`${HERMES_DOMAIN}`)
- traefik.http.routers.hermes.entrypoints=websecure
- traefik.http.routers.hermes.tls.certresolver=letsencrypt
- traefik.http.routers.hermes.middlewares=auth-gateway
restart: always
deploy:
resources:
limits:
cpus: '2.0'
memory: 4G
networks:
- hermes-net
auth-gateway:
image: nginx:alpine
volumes:
- ./data/auth/.htpasswd:/etc/nginx/.htpasswd:ro
labels:
- traefik.enable=true
- traefik.http.middlewares.auth-gateway.basicauth.usersfile=/etc/nginx/.htpasswd
restart: always
networks:
- hermes-net
networks:
hermes-net:
driver: bridge
Quick Start Guide
- Prepare Environment: Install Docker on Ubuntu 26.04, create project directory, and generate
.env with domain and auth credentials.
- Initialize Volumes: Create
data directories and set ownership to prevent permission errors.
- Deploy Stack: Run
docker compose -f docker-compose.prod.yml up -d to start services.
- Configure LLM: Execute
docker run -it --rm -v ./data/hermes:/opt/data nousresearch/hermes-agent:latest model to set up the inference backend.
- Verify Access: Open
https://<HERMES_DOMAIN> in a browser, authenticate, and test agent responsiveness.
By following this guide, you establish a secure, scalable, and cost-efficient foundation for running autonomous AI agents. The architecture supports persistent memory, flexible LLM routing, and robust security, enabling production-grade AI workflows without vendor lock-in.