ockerfile.visual-runner
```dockerfile
FROM mcr.microsoft.com/playwright:v1.40.0-jammy

# Install application-specific fonts
COPY ./fonts /usr/share/fonts/custom
RUN fc-cache -f -v

# Set environment variables for deterministic rendering
ENV PLAYWRIGHT_BROWSERS_PATH=/ms-playwright
ENV FONTCONFIG_PATH=/etc/fonts
ENV NODE_OPTIONS="--max-old-space-size=4096"

# Pre-install Playwright browsers to avoid runtime downloads.
# The CLI is pinned to the base image's version so a newer Chromium is not pulled in.
RUN npx -y playwright@1.40.0 install --with-deps chromium
```
Push this image to your project's container registry. Tag it with a version suffix (e.g., `visual-runner:1.0.0`) to prevent unexpected updates from breaking your pipeline.
### Step 2: Configure Playwright for CI Determinism
Replace implicit defaults with explicit retries, a single CI worker, and fixed rendering flags. This reduces flakiness from transient network states and minor anti-aliasing variations.
```typescript
// visual.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './visual-suites',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  use: {
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    viewport: { width: 1280, height: 720 },
    launchOptions: {
      args: [
        '--font-render-hinting=none',
        '--disable-gpu',
        '--disable-software-rasterizer',
      ],
    },
  },
  projects: [
    {
      name: 'chromium-stable',
      use: { ...devices['Desktop Chrome'] },
    },
  ],
  // In CI, write the JUnit file that the GitLab job publishes, plus an HTML report.
  reporter: process.env.CI
    ? [['junit', { outputFile: 'visual-output/results.xml' }], ['html', { open: 'never' }]]
    : 'list',
  outputDir: './visual-output',
});
```
Key architectural choices:
- `retries: 2` in CI absorbs transient rendering glitches without failing the job.
- `workers: 1` prevents CPU contention and memory thrashing during image comparison.
- `--disable-gpu` and `--font-render-hinting=none` standardize pixel output across headless environments.
- `outputDir` isolates generated artifacts from source code.
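
The config above does not yet set a pixel-diff tolerance. The sketch below shows one way to express it through Playwright's `expect.toHaveScreenshot` options. The `0.01` ratio is an illustrative assumption, not a recommendation, and `exceedsBudget` is a hypothetical helper that only demonstrates how a ratio budget reads.

```typescript
// Illustrative values only; tune them against your own false-positive data.
export const screenshotExpectations = {
  toHaveScreenshot: {
    maxDiffPixelRatio: 0.01, // assumption: tolerate up to 1% differing pixels
    threshold: 0.2, // per-pixel color tolerance (Playwright's documented default)
    animations: 'disabled' as const,
    caret: 'hide' as const,
  },
};

// Hypothetical helper: a diff is over budget when its share of changed pixels
// is strictly greater than the allowed ratio.
export function exceedsBudget(
  diffPixels: number,
  totalPixels: number,
  maxRatio: number,
): boolean {
  if (totalPixels <= 0) {
    throw new RangeError('totalPixels must be positive');
  }
  return diffPixels / totalPixels > maxRatio;
}

// Usage in visual.config.ts:
//   export default defineConfig({ /* ...existing options... */ expect: screenshotExpectations });
```

At the configured 1280x720 viewport a screenshot holds 921,600 pixels, so a 1% budget tolerates a little over 9,000 changed pixels before the comparison fails.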
### Step 3: Orchestrate Pipeline Stages with DAG Execution
Visual tests must run against a deployed environment, not a local dev server. Use GitLab CI's `needs` keyword to create a directed acyclic graph (DAG) that skips unnecessary artifact downloads and enforces execution order.
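```yaml
# .gitlab-ci.yml (excerpt)
stages:
  - build
  - deploy-review
  - validate-ui
  - cleanup

variables:
  VISUAL_IMAGE: "$CI_REGISTRY_IMAGE/visual-runner:1.0.0"
  # CI_COMMIT_REF_SLUG is set in every job; CI_ENVIRONMENT_SLUG exists only in
  # jobs that declare an environment, so it would be empty in the visual job.
  # CI_DEFAULT_DOMAIN is not a predefined GitLab variable; define it in your
  # project's CI/CD settings.
  REVIEW_URL: "https://${CI_COMMIT_REF_SLUG}.${CI_DEFAULT_DOMAIN}"

build-app:
  stage: build
  image: node:20-slim
  script:
    - npm ci
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 hour

deploy-review-env:
  stage: deploy-review
  image: alpine:latest
  script:
    - echo "Deploying to ${REVIEW_URL}"
  environment:
    name: review/$CI_COMMIT_REF_SLUG
    url: $REVIEW_URL
    on_stop: teardown-review-env

validate-visual-regression:
  stage: validate-ui
  image: $VISUAL_IMAGE
  needs:
    - job: deploy-review-env
      artifacts: false
  script:
    # The runner image ships browsers only; project dependencies still need installing.
    - npm ci
    - npx playwright test --config=visual.config.ts
  artifacts:
    when: always
    paths:
      - visual-output/
    reports:
      junit: visual-output/results.xml
    expire_in: 30 days
  allow_failure: true

teardown-review-env:
  stage: cleanup
  when: manual
  script:
    - echo "Tearing down review environment"
  # A job referenced by on_stop must declare the same environment with action: stop.
  environment:
    name: review/$CI_COMMIT_REF_SLUG
    action: stop
```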
Rationale:
- `needs: [deploy-review-env]` ensures the UI validation job starts immediately after deployment finishes, without waiting for other parallel jobs.
- `artifacts: false` in the `needs` block prevents unnecessary artifact downloads, cutting pipeline latency.
- `allow_failure: true` keeps the pipeline green during the stabilization phase. Flip to `false` once false positives drop below 2%.
- `reports: junit` enables GitLab to parse test results directly in the Merge Request widget.
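
The pipeline exposes `REVIEW_URL`, but nothing in the Playwright config shown earlier consumes it. One way to wire it in, sketched here under the assumption that your tests navigate with relative paths, is a small guard that fails fast when the variable is missing or malformed. `resolveBaseUrl` is a hypothetical helper name.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: turns REVIEW_URL into an origin suitable for Playwright's
// `baseURL`, and fails fast instead of letting tests hit a broken address.
export function resolveBaseUrl(env: Env): string {
  const raw = env['REVIEW_URL'];
  if (!raw) {
    throw new Error('REVIEW_URL is not set; the visual job needs a deployed target');
  }
  const url = new URL(raw); // throws a TypeError on malformed input
  if (url.hostname.startsWith('.')) {
    // Typical symptom of an empty slug variable, e.g. "https://.example.com"
    throw new Error(`REVIEW_URL has an empty subdomain: ${raw}`);
  }
  return url.origin;
}

// Usage in visual.config.ts:
//   use: { baseURL: resolveBaseUrl(process.env), /* ...existing options... */ }
```

Failing in the config keeps a misconfigured variable from surfacing later as dozens of unrelated screenshot mismatches.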
## Pitfall Guide
### 1. Caching Baseline Images
Explanation: GitLab CI cache is a best-effort store keyed by `cache:key`. Entries can expire, be evicted, or be missing on a different runner, so baseline PNGs kept there can silently disappear, forcing regeneration and bypassing version control.
Fix: Store baselines in the repository. Enable Git LFS (`git lfs track "*.png"`) to prevent history bloat. Never add baseline directories to `cache:paths`.
### 2. Ignoring Protected Variable Scope
Explanation: GitLab CI/CD variables marked as "Protected" are only injected into pipelines running on protected branches or tags. On feature branches the variables are simply absent, so jobs fail to authenticate with cloud services or internal APIs.
Fix: Either protect your feature branches, or create a separate unprotected variable group for visual testing credentials. Validate availability in a debug job without printing the secret, for example with `test -n "$VAR_NAME"`.
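
The same check can run inside the test process. The following is a minimal sketch, assuming the required variable names are known up front; both function names and the example variable names are hypothetical. It reports names only, so secret values never reach the job log.

```typescript
type Env = Record<string, string | undefined>;

// Returns the names of variables that are unset or blank. Values are never
// returned or logged.
export function findMissingVariables(required: readonly string[], env: Env): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Throws with a list of missing names so the job fails before any test runs.
export function assertVariablesPresent(required: readonly string[], env: Env): void {
  const missing = findMissingVariables(required, env);
  if (missing.length > 0) {
    throw new Error(`Missing CI/CD variables: ${missing.join(', ')}`);
  }
}

// Usage in a Playwright global setup file (variable names are placeholders):
//   assertVariablesPresent(['REVIEW_URL', 'VISUAL_API_TOKEN'], process.env);
```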
### 3. Skipping Environment Stabilization
Explanation: Headless browsers inherit system font metrics and GPU rasterization rules. Shared runners change hardware profiles between jobs, causing pixel-level drift.
Fix: Use a versioned Docker image with pinned fonts, disabled GPU acceleration, and explicit anti-aliasing flags. Rebuild the image only when browser versions or font families change.
### 4. Blocking Merge Requests Prematurely
Explanation: Enforcing visual tests on day one guarantees pipeline failures due to baseline mismatches and environmental noise. Teams quickly disable the job entirely.
Fix: Start with `allow_failure: true`. Run the job in non-blocking mode for 2 to 3 weeks. Collect false positive data, tune thresholds, and switch to blocking mode only when the failure rate stabilizes below 5%.
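
To make the switch-over rule concrete, here is a rough sketch of how the collected data could be evaluated. The `VisualRun` shape, the definition of the rate (false positives divided by all runs), and the minimum of 20 runs are assumptions for illustration; only the 5% ceiling comes from the guidance above.

```typescript
// Hypothetical record of one visual job run, filled in during triage.
interface VisualRun {
  failed: boolean; // the job reported at least one screenshot mismatch
  realRegression: boolean; // triage confirmed an actual UI change
}

// Share of runs that failed without a confirmed regression behind them.
export function falsePositiveRate(runs: readonly VisualRun[]): number {
  if (runs.length === 0) return 0;
  const falsePositives = runs.filter((r) => r.failed && !r.realRegression).length;
  return falsePositives / runs.length;
}

// True once there is enough history and the rate sits under the ceiling.
export function readyToBlock(
  runs: readonly VisualRun[],
  maxRate = 0.05, // ceiling taken from the guidance above
  minRuns = 20, // assumption: avoid deciding on a handful of runs
): boolean {
  return runs.length >= minRuns && falsePositiveRate(runs) < maxRate;
}
```

Failures caused by genuine regressions are deliberately excluded, since those are the job doing its work rather than noise.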
### 5. Misusing `dependencies` vs `needs`
Explanation: By default, a job downloads artifacts from every job in all earlier stages. The `dependencies` keyword only narrows that download list; it does not change execution order, so the job still waits for every earlier stage to finish.
Fix: Use `needs` to declare explicit job dependencies. Set `artifacts: false` when you only need execution order, not file transfer. This enables DAG execution and can reduce CI wait times by 30 to 40%.
### 6. Running Parallel Workers on Memory-Constrained Runners
Explanation: Image comparison algorithms load full-resolution bitmaps into RAM. Running multiple workers simultaneously triggers OOM kills on shared runners.
Fix: Limit parallelism to `workers: 1` for visual jobs. Increase Node.js heap space via `NODE_OPTIONS="--max-old-space-size=4096"`. For suites exceeding 50 snapshots, migrate to self-managed runners with 8GB+ RAM.
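
A back-of-the-envelope estimate shows why parallel workers add up quickly. The sketch below assumes uncompressed RGBA buffers (4 bytes per pixel) and three buffers held per comparison (expected, actual, diff); the buffer count is an assumption about the comparator, not a measured figure.

```typescript
const BYTES_PER_PIXEL = 4; // RGBA, 8 bits per channel

// Bytes needed to hold one decoded screenshot in memory.
export function bitmapBytes(width: number, height: number, deviceScaleFactor = 1): number {
  const w = Math.round(width * deviceScaleFactor);
  const h = Math.round(height * deviceScaleFactor);
  return w * h * BYTES_PER_PIXEL;
}

// Rough peak across all workers when each holds one comparison in flight.
export function comparisonPeakBytes(
  width: number,
  height: number,
  workers: number,
  deviceScaleFactor = 1,
): number {
  const buffersPerComparison = 3; // assumption: expected + actual + diff
  return bitmapBytes(width, height, deviceScaleFactor) * buffersPerComparison * workers;
}
```

A 1280x720 viewport decodes to 3,686,400 bytes (about 3.5 MiB), which is harmless. A full-page capture 20,000 pixels tall decodes to roughly 98 MiB per buffer, and four workers comparing such pages at once would hold more than 1 GiB in image buffers alone.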
### 7. Generating Baselines Locally
Explanation: Local machines have different DPI scaling, font smoothing, and browser extensions. Baselines generated locally will not match CI renders, so the job fails on environmental noise rather than real regressions.
Fix: Create a manual trigger job in GitLab CI that runs `npx playwright test --update-snapshots`. This guarantees baselines are captured in the exact same environment used for validation.
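
The config can also discourage accidental local baselines. The following sketch maps the environment to a value for Playwright's `updateSnapshots` option; `UPDATE_BASELINES` is a hypothetical variable that the manual CI job would set, and `snapshotUpdateMode` is a hypothetical helper name.

```typescript
type UpdateSnapshotsMode = 'all' | 'none' | 'missing';
type Env = Record<string, string | undefined>;

// Baselines are written only when running in CI with an explicit opt-in.
// Everywhere else a missing or mismatched baseline fails the test and
// nothing is written to disk.
export function snapshotUpdateMode(env: Env): UpdateSnapshotsMode {
  if (!env['CI']) return 'none';
  return env['UPDATE_BASELINES'] === 'true' ? 'all' : 'none';
}

// Usage in visual.config.ts:
//   export default defineConfig({ /* ... */ updateSnapshots: snapshotUpdateMode(process.env) });
```

Playwright's default mode is `'missing'`, which writes a new baseline the first time a screenshot has none, including on a developer laptop. Passing `--update-snapshots` on the command line is expected to override the config value, so treat this as a guard rail rather than a lock.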
## Production Bundle
### Action Checklist
### Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Small team, limited infra | Percy / Cloud SaaS | Zero environment management, vendor handles drift | Per-snapshot pricing scales linearly |
| Strict compliance, air-gapped | Containerized Playwright + Self-Managed Runners | Full control over rendering, no external dependencies | Higher upfront runner cost, predictable CI minutes |
| Rapid prototyping, frequent UI changes | BackstopJS + GitLab Artifacts | Fast setup, readable HTML reports, easy baseline review | Moderate maintenance overhead, intermittent project updates |
| Enterprise scale, custom design system | Custom Docker Image + Playwright + LFS | Deterministic diffs, version-controlled baselines, parallelizable | Initial image build time, but lowest long-term CI cost |
### Configuration Template
```yaml
# .gitlab-ci.yml - Visual Regression Pipeline
stages:
  - build
  - deploy-review
  - validate-ui
  - cleanup

variables:
  VISUAL_IMAGE: "$CI_REGISTRY_IMAGE/visual-runner:1.0.0"
  # CI_COMMIT_REF_SLUG is set in every job; CI_ENVIRONMENT_SLUG exists only in
  # jobs that declare an environment. CI_DEFAULT_DOMAIN is not a predefined
  # GitLab variable; define it in your project's CI/CD settings.
  REVIEW_URL: "https://${CI_COMMIT_REF_SLUG}.${CI_DEFAULT_DOMAIN}"
  PLAYWRIGHT_JUNIT_OUTPUT: "visual-output/results.xml"

build-application:
  stage: build
  image: node:20-slim
  script:
    - npm ci --prefer-offline
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 hour

deploy-review-instance:
  stage: deploy-review
  image: alpine:latest
  script:
    - echo "Provisioning review environment at ${REVIEW_URL}"
  environment:
    name: review/$CI_COMMIT_REF_SLUG
    url: $REVIEW_URL
    on_stop: destroy-review-instance

run-visual-validation:
  stage: validate-ui
  image: $VISUAL_IMAGE
  needs:
    - job: deploy-review-instance
      artifacts: false
  script:
    # The runner image ships browsers only; project dependencies still need installing.
    - npm ci --prefer-offline
    - npx playwright test --config=visual.config.ts
  artifacts:
    when: always
    paths:
      - visual-output/
    reports:
      junit: $PLAYWRIGHT_JUNIT_OUTPUT
    expire_in: 30 days
  allow_failure: true

destroy-review-instance:
  stage: cleanup
  when: manual
  script:
    - echo "Decommissioning review environment"
  # A job referenced by on_stop must declare the same environment with action: stop.
  environment:
    name: review/$CI_COMMIT_REF_SLUG
    action: stop
```
### Quick Start Guide
- Create the runner image: Copy the Dockerfile example, add your application fonts, and push it to your GitLab container registry with a version tag.
- Initialize Playwright config: Save the `visual.config.ts` template in your repository root. Adjust viewport dimensions and retry counts to match your design system.
- Add the CI job: Paste the `run-visual-validation` job into your `.gitlab-ci.yml`. Ensure it references your custom image and declares a `needs` dependency on your deployment job.
- Generate first baselines: Trigger the pipeline manually. Once it completes, run `npx playwright test --update-snapshots` inside the CI environment (or via a dedicated manual job) to capture the initial reference set.
- Commit and monitor: Push the baseline PNGs to Git LFS. Keep `allow_failure: true` active for two weeks. Review artifact reports in Merge Requests, tune thresholds, and switch to blocking mode once false positives stabilize.