Running On-Device AI in a React Native App: Real-Time Hazard Detection with CoreML
Architecting Offline Computer Vision for React Native: A Production Guide to CoreML Integration
Current Situation Analysis
Field service, construction, agriculture, and industrial inspection workflows share a brutal reality: connectivity is unreliable. Teams operating in basements, remote sites, or dense urban canyons cannot depend on cloud APIs for critical decision support. When an application requires AI-driven hazard detection, PPE compliance verification, or equipment inspection, network latency or total signal loss renders cloud-dependent solutions useless.
Despite the maturity of edge AI, many cross-platform teams still default to server-side inference. This stems from three persistent misconceptions:
- Binary bloat fear: Developers assume bundling ML models will explode app size and trigger App Store rejection.
- Performance anxiety: The belief that JavaScript bridges cannot handle computer vision workloads without freezing the UI thread.
- Accuracy trade-off myth: The assumption that on-device models are inherently too coarse for production-grade detection.
Modern hardware and model optimization pipelines have dismantled these barriers. CoreML on Apple Silicon devices can execute YOLOv8s-based object detection in roughly 300ms while consuming less than 200MB of RAM. A quantized .mlpackage typically stays under 50MB, well within App Store guidelines and user download tolerances. The real bottleneck is no longer hardware capability; it is architectural discipline. Teams that treat on-device inference as an afterthought rather than a core system constraint inevitably face memory leaks, thread contention, and inconsistent UX under load.
WOW Moment: Key Findings
When comparing cloud-dependent inference against a properly architected on-device pipeline, the divergence isn't just about speed. It's about deterministic behavior, cost predictability, and data sovereignty. The following comparison reflects production metrics captured on an iPhone 14 Pro running Expo SDK 52 with a bundled YOLOv8s .mlpackage.
| Approach | Avg Latency | Connectivity Requirement | Monthly Cost (10k requests) | Data Privacy |
|---|---|---|---|---|
| Cloud Vision API | 850–1200ms | Mandatory | $45–$90 | Sent to vendor |
| On-Device CoreML | 280β320ms | None | $0 | Local only |
Cutting latency by more than 60% eliminates the perceptual gap between user action and system feedback. More importantly, removing network jitter transforms inference from a probabilistic operation into a predictable UX primitive. Inspectors can receive real-time hazard overlays while framing a shot, rather than waiting for a spinner to resolve after capture. This enables continuous feedback loops that cloud architectures cannot support in disconnected environments.
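The reduction figure can be checked against the table's own ranges. The sketch below is only arithmetic on those published numbers; the function name `latencyReductionPct` is illustrative and not part of any library.

```typescript
// Percentage reduction when moving from a cloud latency to an on-device latency.
function latencyReductionPct(cloudMs: number, deviceMs: number): number {
  return ((cloudMs - deviceMs) / cloudMs) * 100;
}

// Ranges taken from the comparison table.
const cloud = { min: 850, max: 1200 };
const device = { min: 280, max: 320 };

// Worst case: fastest cloud response against slowest on-device response.
const worstCase = latencyReductionPct(cloud.min, device.max);
// Best case: slowest cloud response against fastest on-device response.
const bestCase = latencyReductionPct(cloud.max, device.min);

console.log(`${worstCase.toFixed(1)}% to ${bestCase.toFixed(1)}%`);
// prints "62.4% to 76.7%"
```

Even the least favorable pairing of the two ranges stays above 60%.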
Core Solution
Building a reliable offline vision pipeline requires three coordinated layers: model preparation, native bridge architecture, and JavaScript orchestration. Each layer must be optimized for memory, thread safety, and deterministic execution.
Step 1: Model Preparation & Quantization
Start with a YOLOv8s checkpoint. Export it to CoreML format using Apple's coremltools pipeline. Apply INT8 quantization to shrink the model. For large, high-contrast objects like hard hats or high-visibility vests, the reduced precision typically costs little detection accuracy, but that should be confirmed rather than assumed. Validate the resulting .mlpackage against a representative dataset of field conditions before bundling.
Step 2: Native Swift Bridge Architecture
React Native cannot execute CoreML directly. A Swift module acts as the execution boundary. The module must:
- Initialize the model exactly once during app launch
- Accept image URIs from the JavaScript layer
- Execute inference on a background queue and return the result through a promise
- Return structured bounding box data as plain JSON
- Gracefully degrade when model loading fails
import CoreML
import ExpoModulesCore
import Vision

// Sketch against the Expo Modules API: loads the model once and serves detections to JavaScript.
public class VisionInferenceBridge: Module {
  private var model: VNCoreMLModel?
  // Serial queue: keeps inference off the main thread and runs one request at a time.
  private let requestQueue = DispatchQueue(label: "com.app.vision.inference", qos: .userInitiated)

  public func definition() -> ModuleDefinition {
    Name("VisionInferenceBridge")
    Events("onInferenceComplete", "onModelError")

    // Load the model a single time, when the module is created.
    OnCreate {
      self.loadModel()
    }

    // Evaluated on each call, so it reflects the actual load state.
    Function("isReady") {
      return self.model != nil
    }

    AsyncFunction("detectHazard") { (imageURI: String, promise: Promise) in
      self.detectHazard(imageURI: imageURI, promise: promise)
    }
  }

  private func loadModel() {
    // Xcode compiles the bundled .mlpackage into an .mlmodelc directory at build time.
    guard let modelURL = Bundle.main.url(forResource: "HazardDetector", withExtension: "mlmodelc") else {
      sendEvent("onModelError", ["reason": "Model bundle not found"])
      return
    }
    do {
      let coreMLModel = try MLModel(contentsOf: modelURL)
      model = try VNCoreMLModel(for: coreMLModel)
    } catch {
      sendEvent("onModelError", ["reason": error.localizedDescription])
    }
  }

  private func detectHazard(imageURI: String, promise: Promise) {
    guard let visionModel = model else {
      promise.reject("MODEL_UNAVAILABLE", "Inference engine not initialized")
      return
    }
    guard let url = URL(string: imageURI), let ciImage = CIImage(contentsOf: url) else {
      promise.reject("INVALID_IMAGE", "Could not decode image URI")
      return
    }
    requestQueue.async {
      // Release Vision's temporary buffers after every inference cycle.
      autoreleasepool {
        let detectionRequest = VNCoreMLRequest(model: visionModel)
        detectionRequest.imageCropAndScaleOption = .scaleFill
        let handler = VNImageRequestHandler(ciImage: ciImage, options: [:])
        do {
          // perform(_:) is synchronous; results are available once it returns.
          try handler.perform([detectionRequest])
        } catch {
          // Settle the promise on failure so the JavaScript caller never hangs.
          promise.reject("INFERENCE_FAILED", error.localizedDescription)
          return
        }
        let results = (detectionRequest.results ?? []).compactMap { observation -> [String: Any]? in
          guard let obj = observation as? VNRecognizedObjectObservation,
                let label = obj.labels.first else { return nil }
          return [
            "identifier": label.identifier,
            "confidence": Double(label.confidence),
            // Vision boxes are normalized (0 to 1) with the origin at the bottom-left.
            "boundingBox": [
              "x": Double(obj.boundingBox.origin.x),
              "y": Double(obj.boundingBox.origin.y),
              "width": Double(obj.boundingBox.size.width),
              "height": Double(obj.boundingBox.size.height)
            ]
          ]
        }
        promise.resolve(results)
      }
    }
  }
}
Step 3: JavaScript Orchestration & Frame Sampling
Continuous inference requires throttling. Running detection on every camera frame will saturate the CPU and drain battery. A 750ms interval provides ~1.3 AI updates per second, which aligns with human perceptual thresholds for real-time feedback.
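Camera frames should be captured at reduced quality. The quality option sets JPEG compression on a scale from 0 to 1, so a value of 0.3 produces smaller files that are faster to write and decode. Detection accuracy for large objects remains stable at 30% quality, while preprocessing time drops significantly. The JavaScript layer manages the inference loop, state updates, and UI rendering.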
import { useCallback, useEffect, useRef, useState, type RefObject } from 'react';
import { CameraView } from 'expo-camera';
import { VisionInferenceBridge } from '../native-modules';

interface DetectionResult {
  identifier: string;
  confidence: number;
  // Normalized Vision coordinates (0 to 1, origin at the bottom-left).
  boundingBox: { x: number; y: number; width: number; height: number };
}

export function useOfflineDetector(cameraRef: RefObject<CameraView>, isActive: boolean) {
  const [detections, setDetections] = useState<DetectionResult[]>([]);
  // Refs hold loop bookkeeping so that changing them never triggers a render.
  const intervalRef = useRef<ReturnType<typeof setInterval> | null>(null);
  const isProcessingRef = useRef(false);

  const runInference = useCallback(async () => {
    // Skip this tick if the camera is unavailable or the previous cycle is still running.
    if (!cameraRef.current || isProcessingRef.current || !isActive) return;
    isProcessingRef.current = true;
    try {
      const frame = await cameraRef.current.takePictureAsync({
        quality: 0.3,
        skipProcessing: true,
        base64: false,
      });
      if (frame) {
        const results = await VisionInferenceBridge.detectHazard(frame.uri);
        setDetections(results as DetectionResult[]);
      }
    } catch (error) {
      console.warn('Inference cycle failed:', error);
    } finally {
      // Always release the guard, even when capture or inference throws.
      isProcessingRef.current = false;
    }
  }, [cameraRef, isActive]);

  useEffect(() => {
    if (isActive) {
      intervalRef.current = setInterval(runInference, 750);
    } else {
      if (intervalRef.current) clearInterval(intervalRef.current);
      setDetections([]);
    }
    // Stop the loop on unmount or whenever the dependencies change.
    return () => {
      if (intervalRef.current) clearInterval(intervalRef.current);
    };
  }, [isActive, runInference]);

  return detections;
}
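Before the detections can be drawn, the bounding boxes need converting. Vision reports them as normalized values with the origin at the bottom-left of the image, whereas React Native lays out views from the top-left. The helper below is a minimal sketch of that conversion; `toOverlayRect` and both interfaces are hypothetical names, and it assumes the overlay view has the same aspect ratio as the captured frame and that the frame is neither mirrored nor rotated.

```typescript
// Normalized box as returned by the native module (Vision convention:
// values in 0..1, origin at the bottom-left corner of the image).
interface NormalizedBox { x: number; y: number; width: number; height: number }

// Pixel rectangle in layout space (origin at the top-left corner).
interface OverlayRect { left: number; top: number; width: number; height: number }

function toOverlayRect(box: NormalizedBox, viewWidth: number, viewHeight: number): OverlayRect {
  return {
    left: box.x * viewWidth,
    // Flip the vertical axis: Vision measures y from the bottom edge.
    top: (1 - box.y - box.height) * viewHeight,
    width: box.width * viewWidth,
    height: box.height * viewHeight,
  };
}

const rect = toOverlayRect({ x: 0.25, y: 0.125, width: 0.5, height: 0.25 }, 400, 800);
console.log(rect);
// → { left: 100, top: 500, width: 200, height: 200 }
```

The resulting values can be applied directly to an absolutely positioned view layered over the camera preview. If the preview crops or letterboxes the frame, an additional offset and scale step would be needed.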
Architecture Rationale
- Promise-based bridge with background execution: The Swift module exposes a promise-based method and queues inference on a dedicated background queue. This prevents JS thread blocking while maintaining predictable return semantics.
- 750ms sampling window: Balances CPU utilization against UI responsiveness. Shorter intervals cause thermal throttling on sustained sessions; longer intervals break the illusion of real-time feedback.
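- Quality 0.3 frame capture: Applies heavier JPEG compression, which shrinks each captured file and cuts the time spent writing and decoding it into a CIImage. The setting does not change pixel dimensions; Vision still rescales the frame to the model's input size. YOLOv8s is robust to compression loss for large, high-contrast targets.
- State isolation: Loop bookkeeping (the timer handle and the in-flight flag) lives in refs, outside the render cycle. Only the detections array is React state, and the hook returns it as a plain array, so renders are driven by new results rather than by timer bookkeeping.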
Pitfall Guide
1. Model Reinitialization on Every Call
Explanation: Developers often instantiate MLModel or VNCoreMLModel inside the inference function. This triggers file I/O and model compilation repeatedly, adding 150–300ms of overhead per call.
Fix: Initialize the model once during module startup. Store it as a private property and reuse it across all inference cycles.
2. Full-Resolution Frame Processing
Explanation: Capturing 12MP frames and passing them to CoreML forces the vision framework to downscale internally. This wastes CPU cycles on preprocessing and increases memory pressure.
Fix: Configure camera capture to output reduced-quality frames (0.3–0.5). Because the quality setting controls JPEG compression rather than pixel dimensions, also lower the capture resolution where the camera API allows it. Validate that detection accuracy remains acceptable for your target object sizes.
3. UI-Layer Quota Enforcement
Explanation: Checking detection limits inside React components or screen logic allows users to bypass restrictions by manipulating state or calling native modules directly.
Fix: Enforce entitlements at the data/service layer. Wrap the inference call in a guard function that validates subscription state before invoking the native bridge.
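One way to sketch such a guard is a wrapper that every caller must go through to reach the detector. Everything here is hypothetical: the `Entitlements` shape, the `canRunDetection` rule, and the `DETECTION_QUOTA_EXCEEDED` code are assumptions for illustration, not part of the article's codebase or any SDK.

```typescript
// Hypothetical subscription snapshot read from a local store.
interface Entitlements {
  plan: "free" | "pro";
  usedToday: number;
  dailyLimit: number;
}

// Assumed rule: paid plans are unlimited, free plans are capped per day.
function canRunDetection(e: Entitlements): boolean {
  return e.plan === "pro" || e.usedToday < e.dailyLimit;
}

// Wraps a detector so every call path passes through the same check.
function guardDetector<T>(
  detect: (imageUri: string) => Promise<T>,
  readEntitlements: () => Entitlements,
): (imageUri: string) => Promise<T> {
  return (imageUri) => {
    if (!canRunDetection(readEntitlements())) {
      // Reject before the native bridge is ever invoked.
      return Promise.reject(new Error("DETECTION_QUOTA_EXCEEDED"));
    }
    return detect(imageUri);
  };
}
```

In the hook, the guarded function would replace the direct call to `VisionInferenceBridge.detectHazard`, so screens never hold a reference to the unguarded method. Keeping the check in one place also makes it testable without rendering any UI.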
4. Ignoring Memory Warnings During Continuous Inference
Explanation: CIImage and VNImageRequestHandler allocate temporary buffers. Without explicit cleanup, continuous inference causes memory accumulation, triggering OS-level memory warnings and app termination.
Fix: Use autoreleasepool patterns in Swift, avoid retaining frame references, and monitor memory footprint during extended sessions. Target peak usage under 200MB.
5. Hardcoded Confidence Thresholds
Explanation: Shipping raw model confidence scores without calibration leads to false positives in varying lighting conditions. Construction sites have harsh shadows and reflective surfaces that skew predictions.
Fix: Implement a dynamic threshold layer. Start with 0.65 confidence for PPE detection, then log false positives/negatives to adjust thresholds per environment. Never expose raw scores to end users.
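A threshold layer of this kind can be a small pure function between the native result and the UI. The sketch below assumes per-class overrides with 0.65 as the fallback; the class names (`hard_hat`, `vest`) and the override value are placeholders that would come from your own field logs.

```typescript
interface Detection {
  identifier: string;
  confidence: number;
}

// Per-class overrides; any class not listed uses the fallback.
type ThresholdMap = Partial<Record<string, number>>;

const DEFAULT_THRESHOLD = 0.65;

function filterByThreshold<T extends Detection>(
  detections: T[],
  thresholds: ThresholdMap,
  fallback: number = DEFAULT_THRESHOLD,
): T[] {
  return detections.filter(
    (d) => d.confidence >= (thresholds[d.identifier] ?? fallback),
  );
}

// Placeholder values: a stricter bar for one class, the default for the rest.
const siteThresholds: ThresholdMap = { hard_hat: 0.7 };

const kept = filterByThreshold(
  [
    { identifier: "hard_hat", confidence: 0.68 }, // below the 0.7 override
    { identifier: "vest", confidence: 0.66 },     // above the 0.65 fallback
    { identifier: "hard_hat", confidence: 0.9 },
  ],
  siteThresholds,
);
console.log(kept.map((d) => d.identifier));
// → [ 'vest', 'hard_hat' ]
```

Because the map is plain data, it can be swapped per site or updated remotely without touching the model or the native module.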
6. Blocking the Main Thread with Synchronous Bridges
Explanation: While short sync calls are acceptable, running heavy inference on the main thread will freeze UI animations and touch handling.
Fix: Always dispatch CoreML requests to a background queue. Return results to the main thread only when updating React state. Use requestAnimationFrame or custom throttling to align updates with render cycles.
7. Skipping Model Quantization Validation
Explanation: INT8 quantization reduces size but can degrade accuracy on small or low-contrast objects. Assuming quantization is universally safe leads to production failures.
Fix: Run a validation suite comparing FP32 vs INT8 outputs on 500+ field images. If accuracy drops below 85% for critical classes, switch to FP16 or retain FP32 for specific layers.
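The comparison itself reduces to per-class bookkeeping once each field image has been scored under both precisions. The following is a sketch under stated assumptions: `ValidationSample` is a hypothetical record, and what counts as "correct" (for example, an IoU-matched box with the right label) is left to your evaluation harness.

```typescript
// One scored field image: was the ground-truth object found by each variant?
interface ValidationSample {
  classId: string;
  fp32Correct: boolean;
  int8Correct: boolean;
}

interface ClassReport {
  classId: string;
  fp32Accuracy: number;
  int8Accuracy: number;
  passes: boolean; // true when INT8 accuracy meets the minimum
}

function compareQuantization(samples: ValidationSample[], minAccuracy = 0.85): ClassReport[] {
  const byClass = new Map<string, { total: number; fp32: number; int8: number }>();
  for (const s of samples) {
    const c = byClass.get(s.classId) ?? { total: 0, fp32: 0, int8: 0 };
    c.total += 1;
    if (s.fp32Correct) c.fp32 += 1;
    if (s.int8Correct) c.int8 += 1;
    byClass.set(s.classId, c);
  }
  return Array.from(byClass, ([classId, c]) => ({
    classId,
    fp32Accuracy: c.fp32 / c.total,
    int8Accuracy: c.int8 / c.total,
    passes: c.int8 / c.total >= minAccuracy,
  }));
}
```

Any class whose report has `passes` set to false is a candidate for FP16 or for keeping selected layers at FP32. Reporting FP32 accuracy alongside it helps distinguish a quantization regression from a class the model never handled well.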
Production Bundle
Action Checklist
- Quantize YOLOv8s to INT8 and validate accuracy on field-condition images
- Bundle .mlpackage under 50MB and verify App Store size constraints
- Implement singleton model initialization in Swift module
- Configure camera capture to 0.3 quality for inference frames
- Throttle inference loop to 750ms intervals with processing guards
- Enforce detection quotas at the service layer, not UI components
- Monitor peak memory usage and implement buffer cleanup strategies
- Calibrate confidence thresholds using real-site false positive logs
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Remote field operations with intermittent connectivity | On-device CoreML | Eliminates dependency on network stability; guarantees deterministic latency | $0 infrastructure; higher initial dev cost |
| High-volume public analytics with strict privacy | Cloud API + anonymization | Centralized compute scales better; privacy handled via data stripping | $45–90/mo per 10k requests; compliance overhead |
| Real-time safety alerts requiring <400ms feedback | On-device CoreML | Network jitter cannot meet SLA; local inference provides a consistent response around 300ms | Battery optimization required; no recurring API costs |
| Batch compliance reporting with 24h delay | Cloud API | No real-time requirement; cloud processing enables richer post-processing | Lower dev complexity; predictable monthly billing |
Configuration Template
// VisionInferenceBridge.swift (Expo Module Structure)
import CoreML
import ExpoModulesCore
import Vision

// Template sketched against the Expo Modules API; rename the model and events to suit.
public class VisionInferenceBridge: Module {
  private var inferenceEngine: VNCoreMLModel?
  // Serial background queue for all inference work.
  private let executionQueue = DispatchQueue(label: "com.app.vision.queue", qos: .userInitiated)

  public func definition() -> ModuleDefinition {
    Name("VisionInferenceBridge")
    Events("onInferenceReady", "onInferenceError")

    // Load the model once, when the module is created.
    OnCreate {
      self.initializeEngine()
    }

    AsyncFunction("processFrame") { (imagePath: String, promise: Promise) in
      self.processFrame(imagePath: imagePath, promise: promise)
    }
  }

  private func initializeEngine() {
    guard let modelURL = Bundle.main.url(forResource: "SafetyDetector", withExtension: "mlmodelc") else {
      sendEvent("onInferenceError", ["code": "BUNDLE_MISSING"])
      return
    }
    do {
      let mlModel = try MLModel(contentsOf: modelURL)
      inferenceEngine = try VNCoreMLModel(for: mlModel)
      sendEvent("onInferenceReady", ["status": "loaded"])
    } catch {
      sendEvent("onInferenceError", ["code": "INIT_FAILED", "detail": error.localizedDescription])
    }
  }

  private func processFrame(imagePath: String, promise: Promise) {
    guard let engine = inferenceEngine else {
      promise.reject("ENGINE_IDLE", "Model not loaded")
      return
    }
    guard let url = URL(string: imagePath), let sourceImage = CIImage(contentsOf: url) else {
      promise.reject("DECODE_ERROR", "Invalid image path")
      return
    }
    executionQueue.async {
      // Drain temporary Vision buffers after each frame.
      autoreleasepool {
        let request = VNCoreMLRequest(model: engine)
        request.imageCropAndScaleOption = .scaleFill
        let handler = VNImageRequestHandler(ciImage: sourceImage, options: [:])
        do {
          try handler.perform([request])
        } catch {
          // Reject instead of swallowing the error, so the promise always settles.
          promise.reject("RUNTIME_ERROR", error.localizedDescription)
          return
        }
        let output = (request.results ?? []).compactMap { obs -> [String: Any]? in
          guard let obj = obs as? VNRecognizedObjectObservation,
                let primary = obj.labels.first else { return nil }
          return [
            "class": primary.identifier,
            "score": Double(primary.confidence),
            // Normalized coordinates (0 to 1), origin at the bottom-left.
            "rect": [
              "x": Double(obj.boundingBox.origin.x),
              "y": Double(obj.boundingBox.origin.y),
              "w": Double(obj.boundingBox.size.width),
              "h": Double(obj.boundingBox.size.height)
            ]
          ]
        }
        promise.resolve(output)
      }
    }
  }
}
Quick Start Guide
- Convert & Quantize: Export your YOLOv8s checkpoint using coremltools. Apply INT8 quantization and verify the output .mlpackage stays under 50MB.
- Bundle with Expo: Place the .mlpackage in your project's assets/ directory. Configure app.json to include it in the iOS build target so it ships with the binary.
- Initialize Native Module: Create the Swift bridge using expo-modules-core. Load the model on startup, expose a promise-based detection method, and route inference to a background queue.
- Wire JavaScript Hook: Implement a throttled inference loop using setInterval or requestAnimationFrame. Capture frames at 0.3 quality, pass URIs to the native module, and map results to absolute-positioned overlays.
- Validate & Ship: Run a 10-minute continuous session on target hardware. Monitor memory peak, CPU temperature, and inference latency. Adjust sampling interval or confidence thresholds before production release.
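For the validation session, per-cycle latencies recorded in JavaScript can be reduced to a few summary numbers. This is a minimal sketch: `summarizeLatencies` is a hypothetical helper, the sample values are made up for illustration, and it uses the nearest-rank definition of the 95th percentile.

```typescript
interface LatencySummary {
  count: number;
  meanMs: number;
  p95Ms: number;
  maxMs: number;
}

function summarizeLatencies(samplesMs: number[]): LatencySummary {
  if (samplesMs.length === 0) {
    throw new Error("No latency samples recorded");
  }
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  // Nearest rank: the smallest sample with at least 95% of samples at or below it.
  const rank = Math.ceil(0.95 * sorted.length);
  return {
    count: sorted.length,
    meanMs: sum / sorted.length,
    p95Ms: sorted[rank - 1],
    maxMs: sorted[sorted.length - 1],
  };
}

// Illustrative data only: nine steady cycles and one slow outlier.
const summary = summarizeLatencies([300, 280, 320, 290, 310, 305, 295, 285, 315, 900]);
console.log(summary);
// → { count: 10, meanMs: 360, p95Ms: 900, maxMs: 900 }
```

A mean that looks healthy while the 95th percentile climbs over the session is a typical sign of thermal throttling, which is the cue to lengthen the sampling interval.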
