lates model loading, worker communication, and memory-safe processing.
// types.ts
export interface SegmentationResult {
blob: Blob;
width: number;
height: number;
processingTimeMs: number;
}
export interface EngineConfig {
modelUrl: string;
workerUrl: string;
maxInputDimension: number;
confidenceThreshold: number;
}
// AssetSegmentationEngine.ts
export class AssetSegmentationEngine {
private worker: Worker | null = null;
private isInitialized: boolean = false;
private config: EngineConfig;
constructor(config: EngineConfig) {
this.config = config;
}
async initialize(): Promise<void> {
if (this.isInitialized) return;
// Validate Web Worker support
if (typeof Worker === 'undefined') {
throw new Error('Web Workers are required for segmentation.');
}
this.worker = new Worker(this.config.workerUrl, { type: 'module' });
// Send configuration to worker to load model
await this.sendWorkerCommand('INIT', {
modelUrl: this.config.modelUrl,
threshold: this.config.confidenceThreshold
});
this.isInitialized = true;
}
async processImage(file: File): Promise<SegmentationResult> {
if (!this.isInitialized) {
throw new Error('Engine not initialized. Call initialize() first.');
}
const startTime = performance.now();
// Resize image if it exceeds max dimension to save compute
const processedFile = await this.preprocessImage(file);
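// File objects carry no pixel dimensions, so decode once to read them
const bitmap = await createImageBitmap(processedFile);
const { width, height } = bitmap;
bitmap.close(); // release the decoded pixels
// Convert to ArrayBuffer for transfer to worker
const imageBuffer = await processedFile.arrayBuffer();
const result = await this.sendWorkerCommand('SEGMENT', {
imageBuffer,
width,
height
});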
const processingTimeMs = performance.now() - startTime;
return {
blob: result.blob,
width: result.width,
height: result.height,
processingTimeMs
};
}
private async preprocessImage(file: File): Promise<File> {
// Implementation would use OffscreenCanvas or Canvas to resize
// preserving aspect ratio while capping max dimension
// Returns a new File object
return file; // Placeholder for resize logic
}
private sendWorkerCommand(type: string, payload: any): Promise<any> {
return new Promise((resolve, reject) => {
if (!this.worker) {
reject(new Error('Worker unavailable'));
return;
}
// Unique id used to match the worker's reply to this request
const messageId = `${Date.now()}-${Math.random().toString(36).slice(2)}`;
const handler = (event: MessageEvent) => {
if (event.data.id === messageId) {
this.worker?.removeEventListener('message', handler);
if (event.data.error) {
reject(new Error(event.data.error));
} else {
resolve(event.data.payload);
}
}
};
this.worker.addEventListener('message', handler);
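// Transfer the image buffer only when the payload carries one (INIT does not)
const transfer: Transferable[] = payload?.imageBuffer instanceof ArrayBuffer ? [payload.imageBuffer] : [];
this.worker.postMessage({ id: messageId, type, payload }, transfer);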
});
}
destroy(): void {
this.worker?.terminate();
this.worker = null;
this.isInitialized = false;
}
}
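The preprocessImage placeholder above leaves the resize step open. One way to compute the target size is sketched below. The helper name fitWithin and the Size type are hypothetical, not part of the engine as written, and the actual pixel resampling (for example with createImageBitmap and OffscreenCanvas) is left out so the sketch stays runnable outside a browser.

```typescript
// Hypothetical helper: scale (width, height) so that the longer side is at
// most maxDimension while preserving aspect ratio. Never upscales.
interface Size {
  width: number;
  height: number;
}

function fitWithin(width: number, height: number, maxDimension: number): Size {
  if (width <= 0 || height <= 0 || maxDimension <= 0) {
    throw new RangeError('Dimensions must be positive.');
  }
  const longest = Math.max(width, height);
  if (longest <= maxDimension) {
    return { width, height }; // already small enough
  }
  const scale = maxDimension / longest;
  return {
    // Clamp to 1 so extreme aspect ratios never round down to zero pixels
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}
```

Inside preprocessImage, the result of fitWithin(bitmap.width, bitmap.height, this.config.maxInputDimension) would give the canvas size to draw into.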
Worker Implementation Snippet
The worker handles the actual inference. This example uses a conceptual structure compatible with ONNX Runtime Web.
// segmentation.worker.ts
import { InferenceSession, Tensor } from 'onnxruntime-web';
let session: InferenceSession | null = null;
let threshold: number = 0.5;
self.onmessage = async (event) => {
const { id, type, payload } = event.data;
try {
if (type === 'INIT') {
session = await InferenceSession.create(payload.modelUrl);
threshold = payload.threshold;
self.postMessage({ id, payload: { status: 'ready' } });
return;
}
if (type === 'SEGMENT' && session) {
const { imageBuffer, width, height } = payload;
// Decode image to tensor (simplified). decodeImageToTensor and applyMaskAndEncode
// are helpers you must supply, e.g. using a library like 'pngjs' or canvas decoding.
const inputTensor = await decodeImageToTensor(imageBuffer, width, height);
// 'input' and 'output' must match your model's tensor names
// (see session.inputNames and session.outputNames)
const feeds = { input: inputTensor };
const results = await session.run(feeds);
// Extract mask and apply to image
const mask = results.output.data as Float32Array;
const outputBlob = await applyMaskAndEncode(imageBuffer, mask, width, height, threshold);
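// A Blob is not a Transferable, so it is sent without a transfer list
self.postMessage({
id,
payload: {
blob: outputBlob,
width,
height
}
});
return;
}
// Reply to unknown commands, or to SEGMENT before INIT, so the caller's promise settles
self.postMessage({ id, error: `Cannot handle '${type}': unknown command or model not loaded` });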
} catch (error) {
self.postMessage({ id, error: error instanceof Error ? error.message : String(error) });
}
};
Rationale:
- Transferable Objects: sendWorkerCommand passes the image buffer in the transfer list ([payload.imageBuffer]), which moves ownership to the worker instead of copying it. This avoids holding two copies of the image in memory; the buffer is detached on the main thread afterward and must not be read again.
- Promise Wrapper: The worker communication is wrapped in a Promise-based pattern to allow async/await usage in the main thread, simplifying error handling and flow control.
- Lifecycle Management: The destroy method terminates the worker, freeing its resources when the component unmounts or the session ends.
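The transfer behaviour described above can be observed without a worker. The sketch below uses structuredClone, which applies the same transfer rules as postMessage; the variable names are illustrative only, and it assumes a runtime that provides structuredClone (current browsers, Node 17 or later).

```typescript
// A transferred ArrayBuffer is moved, not copied: the receiver gets the
// bytes and the sender's handle is detached.
const source = new ArrayBuffer(8);
new Uint8Array(source)[0] = 42;

const moved = structuredClone(source, { transfer: [source] });

// After the transfer the original buffer reports a length of zero.
const sourceLengthAfter = source.byteLength;
// The data itself is intact on the receiving side.
const movedFirstByte = new Uint8Array(moved)[0];
```

The practical consequence for the engine is that any code needing the image bytes after calling sendWorkerCommand('SEGMENT', ...) has to read them from the original File again, not from imageBuffer.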
Pitfall Guide
Production deployments of client-side ML often fail due to overlooked environmental constraints. The following pitfalls represent common failure modes and their remediations.
-
Main Thread Blocking
- Explanation: Running inference on the main thread freezes the UI, causing the browser to display "Page Unresponsive" warnings on complex images.
- Fix: Always offload inference to a Web Worker. Use OffscreenCanvas if canvas manipulation is required within the worker.
-
Memory Leaks via Blob URLs
- Explanation: Creating object URLs for processed images without revoking them causes the browser's memory heap to grow indefinitely, leading to crashes in long-running sessions.
- Fix: Implement a strict lifecycle for URL.createObjectURL. Call URL.revokeObjectURL immediately after the blob is consumed or uploaded.
-
Model Loading Race Conditions
- Explanation: Attempting to process an image before the model weights are fully downloaded and initialized results in null reference errors.
- Fix: Implement a state machine in the engine. Reject processing requests until the INIT command returns a success status. Use a loading queue if multiple images are submitted during initialization.
-
Ignoring Mobile Thermal Throttling
- Explanation: Mobile devices reduce CPU clock speeds under sustained load. A model that runs in 200ms on desktop may take 2s on a throttled mobile CPU, degrading UX.
- Fix: Use quantized models (INT8). Monitor processing time; if it exceeds a threshold, pause the queue or switch to a lower-resolution model variant.
-
CORS Failures on Model Assets
- Explanation: Browsers block loading model files (.onnx, .bin) from CDNs if the server does not send proper CORS headers. This is a silent failure in some runtimes.
- Fix: Host model assets on the same origin or ensure the CDN is configured with Access-Control-Allow-Origin: *. Validate headers during the build pipeline.
-
Format Incompatibility
- Explanation: Segmentation models expect a fixed tensor layout, commonly a 3-channel RGB float tensor. Passing encoded bytes (JPEG, PNG, WebP) without decoding them, or RGBA pixel data without handling the alpha channel, results in corrupted masks.
- Fix: Decode every input to pixels and normalize it to the layout the model expects (e.g., a 3-channel RGB tensor) before inference. Handle alpha channel preservation explicitly if the source image contains transparency.
-
Lack of Fallback Strategy
- Explanation: Users on legacy browsers or extremely low-end devices may not support WASM or Web Workers, causing the feature to break entirely.
- Fix: Implement capability detection. If the client environment is insufficient, automatically route the request to a server-side API. This hybrid approach keeps the feature available to users whose devices cannot run the model locally.
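The capability detection from the last pitfall can be sketched as follows. The type names, the function names, and the routing policy are assumptions made for illustration; the article does not prescribe them, and the server-side API itself is whatever endpoint you operate.

```typescript
type Backend = 'client' | 'server';

interface Capabilities {
  hasWorker: boolean;
  hasWasm: boolean;
}

// Probe the current environment. Lookups go through globalThis so the
// function can be called in any JavaScript runtime without throwing.
function detectCapabilities(): Capabilities {
  const g = globalThis as unknown as Record<string, unknown>;
  return {
    hasWorker: typeof g['Worker'] === 'function',
    hasWasm: typeof g['WebAssembly'] === 'object' && g['WebAssembly'] !== null,
  };
}

// Hypothetical policy: run locally only when both Web Workers and
// WebAssembly are available; otherwise fall back to the server.
function chooseBackend(caps: Capabilities): Backend {
  return caps.hasWorker && caps.hasWasm ? 'client' : 'server';
}
```

A caller would evaluate chooseBackend(detectCapabilities()) once at startup and only construct AssetSegmentationEngine when the result is 'client'.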
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| High Volume, Low Margin | Client-Side Inference | Eliminates per-image API costs. Bandwidth savings are significant. | CapEx on dev; OpEx near zero. |
| Enterprise, Strict SLA | Hybrid (Client + Server Fallback) | Ensures reliability. Client handles bulk; server handles edge cases. | Moderate dev cost; reduced API spend. |
| Mobile-First App | Client-Side with Quantized Model | Reduces data usage for users. Preserves privacy on device. | Zero API cost; improved UX. |
| Legacy Browser Support | Server-Side API | Older browsers lack WASM/Worker support. | Linear API cost; higher bandwidth. |
| Privacy-Regulated Data | Client-Side Only | Images never leave the device, which simplifies GDPR/HIPAA compliance. | No image data transmitted; smaller compliance surface. |
Configuration Template
Use this Vite configuration to ensure WASM files and model assets are handled correctly during the build process. It helps avoid common 404 errors and MIME type issues.
// vite.config.ts
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';
export default defineConfig({
plugins: [react()],
optimizeDeps: {
exclude: ['onnxruntime-web'], // Prevent premature bundling of WASM
},
build: {
assetsInlineLimit: 0, // Do not inline large model assets
rollupOptions: {
output: {
assetFileNames: (assetInfo) => {
// Keep WASM file names unhashed so the runtime can locate them by name
if (assetInfo.name?.endsWith('.wasm')) {
return 'assets/[name][extname]';
}
return 'assets/[name]-[hash][extname]';
},
},
},
},
server: {
headers: {
// Required for SharedArrayBuffer if using multi-threading.
// These apply to the dev server only; the production host must send the same headers.
'Cross-Origin-Opener-Policy': 'same-origin',
'Cross-Origin-Embedder-Policy': 'require-corp',
},
},
});
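The server.headers block above only affects the Vite dev server, so at runtime it is worth checking whether the page is actually cross-origin isolated before asking for multiple WASM threads. The helper below is a sketch under that assumption; pickWasmThreads and its default cap of 4 are hypothetical choices, not values from ONNX Runtime Web.

```typescript
// Hypothetical helper: decide how many WASM threads to request.
// Multi-threading needs SharedArrayBuffer, which browsers expose only on
// cross-origin isolated pages (COOP and COEP headers present).
function pickWasmThreads(isolated: boolean, logicalCores: number, cap = 4): number {
  if (!isolated || logicalCores < 2) {
    return 1; // fall back to single-threaded execution
  }
  return Math.min(cap, Math.floor(logicalCores));
}

// In a browser or worker it could be wired up like this (not executed here):
//   import { env } from 'onnxruntime-web';
//   env.wasm.numThreads = pickWasmThreads(
//     self.crossOriginIsolated, navigator.hardwareConcurrency ?? 1);
```

Setting the thread count has to happen before the first InferenceSession is created for it to take effect.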
Quick Start Guide
-
Install Dependencies:
npm install onnxruntime-web
The onnxruntime-web package ships its own TypeScript declarations, so no separate @types package is needed.
-
Download Model Assets:
Obtain a quantized background removal model (e.g., u2net_quantized.onnx). Place the model file and its associated data files in your public/models directory.
-
Initialize Engine:
Import the AssetSegmentationEngine in your component. Construct it with the paths to the model and the worker script, then call initialize() on mount.
const engine = new AssetSegmentationEngine({
modelUrl: '/models/u2net_quantized.onnx',
workerUrl: '/workers/segmentation.worker.js',
maxInputDimension: 1024,
confidenceThreshold: 0.5
});
await engine.initialize();
-
Process Image:
Attach the processImage method to your file input handler. Display the resulting blob in an <img> tag or prepare it for upload.
const handleFileChange = async (file: File) => {
const result = await engine.processImage(file);
const imageUrl = URL.createObjectURL(result.blob);
setPreviewUrl(imageUrl);
};
-
Cleanup:
Ensure you call engine.destroy() and URL.revokeObjectURL() when the component unmounts or the session ends to prevent memory leaks.
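One way to keep object URLs from accumulating is to route every preview through a small holder that revokes the URL it replaces. The class below is a minimal sketch; PreviewUrlHolder is a hypothetical name, and the create and revoke functions are injected so the logic does not depend on a browser being present.

```typescript
// Holds at most one live object URL and revokes the previous one whenever
// a new preview is set. In a browser, pass URL.createObjectURL and
// URL.revokeObjectURL as the two functions.
class PreviewUrlHolder<T> {
  private current: string | null = null;
  private readonly create: (source: T) => string;
  private readonly revoke: (url: string) => void;

  constructor(create: (source: T) => string, revoke: (url: string) => void) {
    this.create = create;
    this.revoke = revoke;
  }

  // Replace the preview, releasing the URL it supersedes.
  set(source: T): string {
    this.clear();
    this.current = this.create(source);
    return this.current;
  }

  // Release the current URL, if any. Call this on unmount.
  clear(): void {
    if (this.current !== null) {
      this.revoke(this.current);
      this.current = null;
    }
  }
}
```

In the handler from the previous step, setPreviewUrl(holder.set(result.blob)) would replace the direct URL.createObjectURL call, and holder.clear() would sit next to engine.destroy() in the unmount cleanup.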