er round-trips.
Architecture Decisions
- Single Round-Trip Extraction: Geometry data is collected in a single page.evaluate() call per batch. This avoids the latency of multiple DOM queries and ensures a consistent snapshot of the render tree.
- Structured Output: Results are returned as JSON primitives (numbers, booleans, strings) rather than images or HTML. This allows LLMs to parse and reason about the data without vision models.
- Viewport Simulation: The server accepts viewport dimensions as input, enabling agents to test responsive behavior without changing the browser window size manually.
- Parallel Breakpoint Processing: Responsive drift analysis processes multiple viewports concurrently to speed up cross-device validation.
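The parallel breakpoint processing described above can be sketched with Promise.all. This is a minimal illustration, not the server's actual code: ViewportSize, measureAcrossViewports, and the measure callback are hypothetical names, and the callback stands in for whatever per-viewport extraction the server performs.

```typescript
// Hypothetical types for this sketch; not the server's actual API.
interface ViewportSize { width: number; height: number }

// Run one measurement per viewport concurrently. Promise.all preserves
// input order, so results line up with the viewports array.
async function measureAcrossViewports<T>(
  viewports: ViewportSize[],
  measure: (viewport: ViewportSize) => Promise<T>,
): Promise<Array<{ viewport: ViewportSize; result: T }>> {
  return Promise.all(
    viewports.map(async (viewport) => ({
      viewport,
      result: await measure(viewport),
    })),
  );
}
```

One design consequence worth noting: Promise.all rejects as soon as any single viewport fails, so a server that wants partial results would likely reach for Promise.allSettled instead.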
Implementation Tools
The MCP server exposes four core tools, rewritten here with distinct interfaces and naming conventions to demonstrate the pattern.
1. Geometry Snapshot Extraction
Retrieves position, size, z-index, and viewport visibility for a batch of selectors.
interface GeometryRequest {
  targetUrl: string;
  nodeSelectors: string[];
  renderingContext: { width: number; height: number };
}
interface GeometryResult {
  nodeId: string;
  geometry: { top: number; left: number; width: number; height: number };
  zIndex: string | number;
  visibilityFlags: { inViewport: boolean; isRendered: boolean };
}
// Agent usage
const snapshot = await spatialTool.fetch_geometry_snapshot({
  targetUrl: 'https://app.com/checkout',
  nodeSelectors: ['#pay-action', '#cookie-consent', 'header'],
  renderingContext: { width: 375, height: 812 }
});
// Result
// [
//   {
//     nodeId: '#pay-action',
//     geometry: { top: 1450, left: 16, width: 343, height: 48 },
//     zIndex: 'auto',
//     visibilityFlags: { inViewport: false, isRendered: true }
//   }
// ]
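To make the visibilityFlags in the result concrete, here is one way they could be derived from a bounding box and the requested viewport. This is a sketch under stated assumptions: computeVisibilityFlags is a hypothetical helper, inViewport is taken to mean "any part of the box intersects the unscrolled viewport", and isRendered is taken to mean "the box has non-zero area". The real server may define both differently.

```typescript
interface BoxRect { top: number; left: number; width: number; height: number }
interface ViewportSize { width: number; height: number }

// Assumptions: coordinates are relative to the viewport origin (0, 0),
// and a zero-area box counts as not rendered.
function computeVisibilityFlags(rect: BoxRect, viewport: ViewportSize) {
  const isRendered = rect.width > 0 && rect.height > 0;
  const overlapsHorizontally =
    rect.left < viewport.width && rect.left + rect.width > 0;
  const overlapsVertically =
    rect.top < viewport.height && rect.top + rect.height > 0;
  return {
    inViewport: isRendered && overlapsHorizontally && overlapsVertically,
    isRendered,
  };
}

// The '#pay-action' box from the example result, at 375x812:
const payFlags = computeVisibilityFlags(
  { top: 1450, left: 16, width: 343, height: 48 },
  { width: 375, height: 812 },
);
// payFlags is { inViewport: false, isRendered: true } because top (1450)
// lies below the viewport height (812).
```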
2. Intersection Ratio Computation
Calculates the overlap between two elements to detect occlusion.
interface OverlapRequest {
  targetUrl: string;
  primaryNode: string;
  blockingNode: string;
}
interface OverlapResult {
  isOccluded: boolean;
  intersectionRatio: number; // 0.0 to 1.0
  occludedAreaPx: number;
}
// Agent usage
const overlap = await spatialTool.compute_intersection_ratio({
  targetUrl: 'https://app.com/checkout',
  primaryNode: '#pay-action',
  blockingNode: '#cookie-consent'
});
// Result
// { isOccluded: true, intersectionRatio: 0.61, occludedAreaPx: 4128 }
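The overlap math behind a result like this is plain rectangle intersection. The sketch below assumes the ratio is measured against the primary node's area (the source does not say which denominator the server uses), and computeOverlap plus the sample rectangles are illustrative, not taken from the server.

```typescript
interface BoxRect { top: number; left: number; width: number; height: number }
interface OverlapSummary {
  isOccluded: boolean;
  intersectionRatio: number; // share of the primary box that is covered
  occludedAreaPx: number;    // overlapping area in square pixels
}

function computeOverlap(primary: BoxRect, blocking: BoxRect): OverlapSummary {
  // Width and height of the shared region, clamped at zero when disjoint.
  const overlapWidth = Math.max(
    0,
    Math.min(primary.left + primary.width, blocking.left + blocking.width) -
      Math.max(primary.left, blocking.left),
  );
  const overlapHeight = Math.max(
    0,
    Math.min(primary.top + primary.height, blocking.top + blocking.height) -
      Math.max(primary.top, blocking.top),
  );
  const occludedAreaPx = overlapWidth * overlapHeight;
  const primaryArea = primary.width * primary.height;
  return {
    isOccluded: occludedAreaPx > 0,
    intersectionRatio: primaryArea > 0 ? occludedAreaPx / primaryArea : 0,
    occludedAreaPx,
  };
}

// Illustrative boxes: a banner covering the lower half of a button.
const sampleOverlap = computeOverlap(
  { top: 700, left: 16, width: 343, height: 48 },
  { top: 724, left: 0, width: 375, height: 88 },
);
// sampleOverlap is { isOccluded: true, intersectionRatio: 0.5, occludedAreaPx: 8232 }
```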
3. Layout Topology Validation
Asserts spatial relationships between elements using declarative rules.
type LayoutRule = 'above' | 'below' | 'left_of' | 'right_of' | 'contains' | 'no_overlap';
interface TopologyRequest {
  targetUrl: string;
  constraints: Array<{
    rule: LayoutRule;
    nodeA: string;
    nodeB: string;
  }>;
}
interface TopologyResult {
  passed: boolean;
  details: Array<{ rule: LayoutRule; passed: boolean; explanation: string }>;
}
// Agent usage
const topology = await spatialTool.assert_layout_topology({
  targetUrl: 'https://app.com',
  constraints: [
    { rule: 'above', nodeA: 'nav', nodeB: '.hero-section' },
    { rule: 'no_overlap', nodeA: '.sidebar', nodeB: '.main-content' }
  ]
});
// Result
// {
//   passed: false,
//   details: [
//     { rule: 'above', passed: true, explanation: 'nav bottom (64px) is at or above .hero-section top (64px)' },
//     { rule: 'no_overlap', passed: false, explanation: '.sidebar and .main-content overlap by 12%' }
//   ]
// }
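Each declarative rule reduces to a comparison of box edges. The following sketch shows one plausible evaluation; checkRule is a hypothetical helper, and it assumes that touching edges satisfy a rule (so a nav whose bottom equals the hero's top still counts as 'above', matching the example result).

```typescript
type LayoutRule = 'above' | 'below' | 'left_of' | 'right_of' | 'contains' | 'no_overlap';
interface BoxRect { top: number; left: number; width: number; height: number }

// Returns true when box A relates to box B as the rule demands.
// Assumption: shared edges (e.g. a.bottom === b.top) pass the rule.
function checkRule(rule: LayoutRule, a: BoxRect, b: BoxRect): boolean {
  const aRight = a.left + a.width;
  const aBottom = a.top + a.height;
  const bRight = b.left + b.width;
  const bBottom = b.top + b.height;
  switch (rule) {
    case 'above':
      return aBottom <= b.top;
    case 'below':
      return a.top >= bBottom;
    case 'left_of':
      return aRight <= b.left;
    case 'right_of':
      return a.left >= bRight;
    case 'contains':
      return a.left <= b.left && a.top <= b.top && aRight >= bRight && aBottom >= bBottom;
    case 'no_overlap':
      return aRight <= b.left || bRight <= a.left || aBottom <= b.top || bBottom <= a.top;
  }
}

// Illustrative boxes for the two constraints in the request above.
const navAboveHero = checkRule(
  'above',
  { top: 0, left: 0, width: 1280, height: 64 },
  { top: 64, left: 0, width: 1280, height: 400 },
); // true: nav bottom (64) does not pass hero top (64)
const sidebarClear = checkRule(
  'no_overlap',
  { top: 64, left: 0, width: 260, height: 600 },
  { top: 64, left: 240, width: 1040, height: 600 },
); // false: the sidebar's right edge (260) extends past main's left edge (240)
```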
4. Responsive Drift Measurement
Tracks geometry changes across multiple viewports to identify volatile elements.
interface DriftRequest {
  targetUrl: string;
  nodeSelectors: string[];
  viewports: Array<{ width: number; height: number }>;
}
interface DriftResult {
  nodeId: string;
  isVolatile: boolean;
  maxDelta: { x: number; y: number; width: number; height: number };
}
// Agent usage
const drift = await spatialTool.measure_responsive_drift({
  targetUrl: 'https://app.com',
  nodeSelectors: ['.cta-button', 'nav'],
  viewports: [
    { width: 375, height: 812 },
    { width: 768, height: 1024 },
    { width: 1280, height: 720 }
  ]
});
// Result
// [
//   { nodeId: '.cta-button', isVolatile: true, maxDelta: { x: 442, y: 318, width: 897, height: 0 } }
// ]
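One way to read maxDelta is as the spread (maximum minus minimum) of each geometry property across the measured viewports. The sketch below works under that assumption; computeMaxDelta, isVolatileDelta, and the pixel threshold are hypothetical, since the source does not state how the server decides volatility.

```typescript
interface BoxRect { top: number; left: number; width: number; height: number }
interface DriftDelta { x: number; y: number; width: number; height: number }

// Spread of a property across all snapshots; 0 for an empty list.
function spread(values: number[]): number {
  return values.length === 0 ? 0 : Math.max(...values) - Math.min(...values);
}

// One snapshot per viewport for the same element.
function computeMaxDelta(snapshots: BoxRect[]): DriftDelta {
  return {
    x: spread(snapshots.map((s) => s.left)),
    y: spread(snapshots.map((s) => s.top)),
    width: spread(snapshots.map((s) => s.width)),
    height: spread(snapshots.map((s) => s.height)),
  };
}

// Assumption: an element is volatile when any property moves by more
// than thresholdPx between viewports.
function isVolatileDelta(delta: DriftDelta, thresholdPx: number): boolean {
  return Math.max(delta.x, delta.y, delta.width, delta.height) > thresholdPx;
}

// Illustrative snapshots of one button at three viewports.
const ctaDelta = computeMaxDelta([
  { top: 620, left: 16, width: 343, height: 48 },
  { top: 480, left: 212, width: 344, height: 48 },
  { top: 302, left: 458, width: 364, height: 48 },
]);
// ctaDelta is { x: 442, y: 318, width: 21, height: 0 }
```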
Pitfall Guide
Integrating spatial validation into AI workflows requires careful handling of browser rendering nuances. Below are common pitfalls and production-tested mitigations.
1. Stacking Context Illusions
- Issue: Bounding-box math shows that two rectangles intersect, but not which element is painted on top. When the elements live in different stacking contexts, overlap detection can report false positives: an element with a high z-index inside a child stacking context may still render beneath an element in a sibling context, so the apparent blocker never covers the target.
- Fix: The MCP server must account for getComputedStyle().zIndex and parent stacking contexts. When analyzing occlusion, verify that the blocking element is actually painted above the target (that is, it sits in a higher stacking context) before flagging a bug.
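Resolving stacking contexts by hand is intricate, so one pragmatic alternative is to ask the browser which element is topmost at a point inside the overlapping region. The sketch below swaps in that hit-testing technique in place of stacking-context analysis. NodeLike and HitTester are minimal stand-ins defined here so the logic can run outside a browser; in a page, document (via its elementFromPoint method) and DOM elements (via contains) would play those roles. isVisuallyCovered is a hypothetical helper.

```typescript
// Minimal structural stand-ins for the DOM pieces this sketch relies on.
interface NodeLike { contains(other: NodeLike): boolean }
interface HitTester { elementFromPoint(x: number, y: number): NodeLike | null }
interface BoxRect { top: number; left: number; width: number; height: number }

// Checks whether the blocker (or one of its descendants) is the topmost
// element at the center of the overlapping region.
function isVisuallyCovered(
  doc: HitTester,
  primary: NodeLike,
  blocker: NodeLike,
  overlapRegion: BoxRect,
): boolean {
  const centerX = overlapRegion.left + overlapRegion.width / 2;
  const centerY = overlapRegion.top + overlapRegion.height / 2;
  const topmost = doc.elementFromPoint(centerX, centerY);
  if (topmost === null) return false; // point lies outside the viewport
  return blocker.contains(topmost) && !primary.contains(topmost);
}
```

Two caveats apply to this approach: elementFromPoint takes viewport coordinates, so the target must be scrolled into view first, and it skips elements styled with pointer-events: none, which means a purely decorative overlay would not register as a blocker.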
2. Viewport Mismatch
- Issue: A layout is tested at a desktop viewport but reported as mobile-compatible. Agents may default to the browser's current size, which makes the inViewport flags inaccurate for the target device.
- Fix: Always explicitly pass the renderingContext or viewports parameter. Never assume the browser window size matches the target device. Use the tool's viewport simulation rather than resizing the window manually.
3. Dynamic Content Races
- Issue: Ads, cookie banners, or lazy-loaded components may inject themselves after the initial geometry snapshot, causing occlusion that the agent misses.
- Fix: Wait for network idle or specific dynamic selectors before extracting geometry. Use Playwright's waitForLoadState('networkidle') or explicit waits for overlay elements before calling spatial tools.
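The wait-then-measure sequence could look roughly like the sketch below. PageLike is a minimal structural type declared here so the example has no dependencies; it mirrors the two Playwright Page methods used, waitForLoadState and waitForSelector, and a real Playwright page can be passed in its place. settleBeforeSnapshot, the overlay selector list, and the default timeout are assumptions for illustration.

```typescript
// Subset of Playwright's Page that this sketch needs.
interface PageLike {
  waitForLoadState(state: 'load' | 'domcontentloaded' | 'networkidle'): Promise<void>;
  waitForSelector(
    selector: string,
    options: { state: 'visible'; timeout: number },
  ): Promise<unknown>;
}

// Waits for the network to go quiet, then gives each known overlay a short
// window to appear. Returns the overlays that actually showed up.
async function settleBeforeSnapshot(
  page: PageLike,
  overlaySelectors: string[],
  overlayTimeoutMs = 3000,
): Promise<string[]> {
  await page.waitForLoadState('networkidle');
  const appeared: string[] = [];
  for (const selector of overlaySelectors) {
    try {
      await page.waitForSelector(selector, { state: 'visible', timeout: overlayTimeoutMs });
      appeared.push(selector);
    } catch {
      // Any failure (normally a timeout) is treated as "overlay absent".
    }
  }
  return appeared;
}
```

The returned list doubles as a hint for the agent: any selector in it is a candidate blockingNode for an occlusion check.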
4. Selector Fragility
- Issue: Brittle CSS selectors (e.g., div:nth-child(3)) break when the DOM structure changes, leading to missing geometry data.
- Fix: Prefer robust selectors like data-testid attributes, ARIA labels, or semantic roles. Ensure selectors are stable across refactors to maintain reliable spatial assertions.
5. Performance Bloat
- Issue: Extracting geometry for hundreds of elements in a single call can degrade performance and overwhelm the LLM context window.
- Fix: Batch selectors intelligently. Focus on critical paths and interactive elements. Use the nodeSelectors array to target only relevant components, and avoid dumping the entire DOM geometry.
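A simple way to keep each call small is to deduplicate the selector list and split it into fixed-size batches, one nodeSelectors array per call. batchSelectors and the batch size are hypothetical; the right size depends on the page and on how much context the agent can spare.

```typescript
// Deduplicates selectors (keeping first-seen order) and splits them into
// batches of at most batchSize entries.
function batchSelectors(selectors: string[], batchSize: number): string[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const unique = Array.from(new Set(selectors));
  const batches: string[][] = [];
  for (let i = 0; i < unique.length; i += batchSize) {
    batches.push(unique.slice(i, i + batchSize));
  }
  return batches;
}

const selectorBatches = batchSelectors(
  ['#pay-action', '#cookie-consent', '#pay-action', 'header', 'footer'],
  2,
);
// selectorBatches is [['#pay-action', '#cookie-consent'], ['header', 'footer']]
```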
6. Z-Index "Auto" Ambiguity
- Issue: Elements with z-index: auto may have unexpected stacking behavior based on DOM order. Agents might misinterpret visibility if they assume auto means no stacking.
- Fix: Treat z-index: auto as context-dependent. When validating overlaps, rely on the computed intersection ratio rather than z-index values alone. The intersection math is the source of truth for occlusion.
7. Async Rendering Delays
- Issue: CSS transitions or animations may leave elements in a transient state during geometry extraction, producing inconsistent bounding boxes.
- Fix: Disable animations in the test environment or wait for transitions to complete. Use page.evaluate(() => document.getAnimations().forEach(a => a.finish())) to jump running animations to their end state before extracting metrics. Note that finish() throws an InvalidStateError for animations that repeat indefinitely, so pause or cancel those instead.
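Because Animation.finish() throws on animations with an infinite end time, a slightly more defensive version of the one-liner finishes finite animations and pauses the rest. AnimationLike is a structural stand-in declared here so the logic can run outside a browser; in a page, the objects returned by document.getAnimations() would fill that role. settleAnimations is a hypothetical helper, and it does not handle the separate case of a zero playback rate.

```typescript
// Subset of the Web Animations API surface this sketch touches.
interface AnimationLike {
  effect: { getComputedTiming(): { endTime?: unknown } } | null;
  finish(): void;
  pause(): void;
}

// Finishes animations that have a finite end; pauses looping ones so the
// layout stops changing without triggering an InvalidStateError.
function settleAnimations(animations: AnimationLike[]): { finished: number; paused: number } {
  let finished = 0;
  let paused = 0;
  for (const animation of animations) {
    const endTime = animation.effect?.getComputedTiming().endTime;
    if (typeof endTime === 'number' && Number.isFinite(endTime)) {
      animation.finish();
      finished += 1;
    } else {
      animation.pause();
      paused += 1;
    }
  }
  return { finished, paused };
}
```

A paused looping animation freezes at whatever frame it had reached, so geometry taken afterwards is stable but not necessarily representative of the element's resting position.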
Production Bundle
Action Checklist
Decision Matrix
Use this matrix to select the appropriate validation strategy based on your testing goals.
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Functional Regression | DOM Assertions | Fast, reliable for logic and state changes. Low overhead. | Low |
| Visual Polish / Branding | Pixel Diff | Detects color, font, and image shifts. Essential for design fidelity. | Medium (Storage/Compute) |
| Layout / UX Integrity | Spatial MCP | Catches overlaps, off-screen elements, and responsive drift. Structured data for AI. | Low |
| Accessibility Compliance | ARIA / axe-core | Validates semantic structure and screen reader compatibility. | Low |
| Performance / Load Time | Lighthouse / WebPageTest | Measures rendering performance and resource loading. | Low |
Configuration Template
Copy this configuration to integrate the spatial layout MCP server with your AI agent.
{
  "mcpServers": {
    "spatial-layout-agent": {
      "command": "npx",
      "args": ["-y", "playwright-spatial-layout-mcp"],
      "env": {
        "PLAYWRIGHT_BROWSERS_PATH": "/usr/bin/chromium",
        "LOG_LEVEL": "info"
      }
    }
  }
}
Note that Playwright reads PLAYWRIGHT_BROWSERS_PATH as the directory where it stores downloaded browsers, not as the path to a browser executable. Set it to the directory used at install time, or omit it to use Playwright's default cache location.
Quick Start Guide
Get spatial validation running in under five minutes.
- Install Dependencies:
  npm install -g playwright-spatial-layout-mcp
  npx playwright install chromium
- Configure Agent: Add the MCP server block to your agent's configuration file as shown in the template above. Restart the agent to load the new tools.
- Run a Spatial Check: Prompt your agent with a spatial query:
  "Check if the cookie banner is blocking the checkout button on a 375px viewport. Report the intersection ratio."
- Validate Responsive Layout: Ask the agent to analyze layout stability:
  "Measure the responsive drift of the hero CTA between mobile and desktop viewports. Flag any element with a horizontal shift greater than 200px."
- Integrate into Test Suite: Instruct the agent to generate a Playwright test that includes spatial assertions:
  "Generate a Playwright test for the checkout flow that verifies the pay button is visible, in the viewport, and not occluded by any overlays."
By injecting spatial awareness into AI agents, you close the gap between DOM structure and user experience. This enables agents to catch layout bugs that traditional tests miss, ensuring that green tests actually mean a working application.