Phoenix LiveView vs Rails Hotwire: I Built the Same Real-Time App in Both. The Numbers Aren't Close.
Architecting Real-Time Collaboration: Concurrency Models, WebSocket Overhead, and Framework Trade-offs
Current Situation Analysis
Building collaborative, real-time interfaces—live cursors, presence indicators, instant state synchronization—forces engineering teams into architectural corners that traditional request-response models cannot navigate. The industry pain point is not a lack of WebSocket libraries; it is the mismatch between stateless HTTP paradigms and stateful concurrency requirements. Teams routinely attempt to bolt real-time features onto CRUD-first architectures, resulting in fragmented transport layers, race conditions, and operational debt that compounds under load.
This problem is frequently overlooked because framework selection is often driven by ecosystem maturity, hiring availability, or developer familiarity rather than concurrency primitives. Engineering leaders assume that "WebSockets + HTTP" is a drop-in replacement for long-lived processes, underestimating the scheduling overhead, connection management complexity, and cross-process synchronization costs that emerge at scale. The result is a false equivalence: two frameworks may both support real-time updates, but their underlying execution models dictate entirely different operational ceilings.
Controlled benchmarking on identical hardware (2 vCPU, 7GB RAM, PostgreSQL 16) exposes the architectural reality. When identical collaborative applications are put through the same sequential load tests, the stateful process model demonstrates approximately 5× higher HTTP throughput and establishes WebSocket connections roughly 14× faster under concurrent load. Memory consumption remains nearly identical (~70MB at 500 persistent connections), which points to thread scheduling and I/O blocking, not RAM allocation, as the bottleneck. These numbers are not tuning artifacts; they are direct consequences of preemptive lightweight process scheduling versus OS-thread-bound request handling.
WOW Moment: Key Findings
The data reveals a clear divergence in how each architecture handles concurrent state. Rather than treating this as a framework popularity contest, the metrics highlight which concurrency primitive aligns with your workload.
| Approach | HTTP p50 Latency | WS Handshake p50 | Throughput (70s) | Client-Side JS | Cross-Process Sync |
|---|---|---|---|---|---|
| Stateful Process Model | 130 ms | 30 ms | ~3,200 requests | 0 lines | Built-in PubSub |
| Split-Transport Model | 1,484 ms | 418 ms | ~640 requests | ~200 lines | External Adapter Required |
Why this matters: The stateful process model collapses transport, state management, and broadcasting into a single execution context. Each connection maps to a lightweight process (~2KB RAM) that holds room state in memory, processes events, and pushes minimal DOM diffs over a single WebSocket. The split-transport model decouples concerns: HTTP handles mutations, one WebSocket streams HTML fragments, and another streams JSON presence data. This separation simplifies mental models for traditional web apps but introduces serialization overhead, connection multiplexing complexity, and cross-process coordination latency. When concurrent users exceed dozens, the scheduling model becomes the primary determinant of user-perceived latency.
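The headline ratios can be reproduced from the table with a few lines of arithmetic. The sketch below only restates the table values (the request counts are the approximate figures shown there); the object and function names are illustrative.

```typescript
// Values copied from the findings table; request counts are approximate.
const statefulModel = { httpP50Ms: 130, wsHandshakeP50Ms: 30, requests: 3200 };
const splitModel = { httpP50Ms: 1484, wsHandshakeP50Ms: 418, requests: 640 };

// Ratio rounded to one decimal place.
function ratio(numerator: number, denominator: number): number {
  return Math.round((numerator / denominator) * 10) / 10;
}

// Throughput: higher is better, so stateful / split.
const throughputRatio = ratio(statefulModel.requests, splitModel.requests);
// Latencies: lower is better, so split / stateful.
const handshakeRatio = ratio(splitModel.wsHandshakeP50Ms, statefulModel.wsHandshakeP50Ms);
const httpLatencyRatio = ratio(splitModel.httpP50Ms, statefulModel.httpP50Ms);

console.log({ throughputRatio, handshakeRatio, httpLatencyRatio });
```

The throughput and handshake ratios line up with the "approximately 5×" and "roughly 14×" figures quoted earlier; the median HTTP latency gap works out to a little over 11×.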
Core Solution
Building a production-ready real-time collaboration layer requires aligning your architecture with the concurrency model that matches your traffic pattern. Below is a step-by-step implementation strategy, followed by framework-specific adaptations.
Step 1: Define the Synchronization Boundary
Identify which state changes require immediate propagation versus eventual consistency. Real-time collaboration typically demands:
- Presence tracking (who is in the room)
- Typing/activity indicators
- Instant CRUD propagation
- Conflict resolution strategy (last-write-wins, CRDTs, or operational transforms)
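Of the conflict resolution strategies listed, last-write-wins is the simplest to reason about. The following is a minimal sketch, assuming each record carries a timestamp and an author id; the `TaskRecord` shape and the tie-breaking rule are illustrative assumptions, not part of the benchmarked applications.

```typescript
// Hypothetical task shape; field names are illustrative.
interface TaskRecord {
  id: string;
  title: string;
  updatedAt: number; // timestamp in milliseconds
  updatedBy: string; // used only as a deterministic tie-breaker
}

// Last-write-wins: keep whichever version has the newer timestamp.
// Ties are broken by comparing updatedBy so every replica picks the same winner.
function mergeLww(local: TaskRecord, remote: TaskRecord): TaskRecord {
  if (remote.updatedAt > local.updatedAt) return remote;
  if (remote.updatedAt < local.updatedAt) return local;
  return remote.updatedBy > local.updatedBy ? remote : local;
}

// Merge a batch of remote records into a copy of the local map, keyed by task id.
function mergeAll(
  local: Map<string, TaskRecord>,
  incoming: TaskRecord[],
): Map<string, TaskRecord> {
  const merged = new Map(local);
  for (const remote of incoming) {
    const existing = merged.get(remote.id);
    merged.set(remote.id, existing ? mergeLww(existing, remote) : remote);
  }
  return merged;
}
```

Last-write-wins silently discards the losing edit, so it suits fields like a task title better than long-form text, where CRDTs or operational transforms are the usual choice.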
Step 2: Choose the Transport Topology
- Single-Channel Topology: All real-time events flow over one WebSocket. State lives in a long-lived process. Ideal for high-frequency updates and presence-heavy apps.
- Multi-Channel Topology: Separate channels for mutations (HTTP), DOM updates (HTML streams), and metadata (JSON). Ideal for apps where real-time is secondary to traditional navigation.
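In a single-channel topology, every event shares one socket, so each message needs a discriminator the receiver can route on. Below is a hedged sketch of a typed envelope and a pure state reducer; the message types (`task_added`, `presence_changed`) and the `RoomState` shape are assumptions made for illustration.

```typescript
// Hypothetical message envelope: one discriminated union for all room events.
type Envelope =
  | { type: "task_added"; payload: { id: string; title: string }; correlation_id?: string }
  | { type: "presence_changed"; payload: { userId: string; status: "online" | "offline" }; correlation_id?: string };

interface RoomState {
  tasks: { id: string; title: string }[];
  online: Set<string>;
}

// Parse a raw frame; returns null for malformed JSON or unknown message types.
// Only the envelope shape is checked here; payload fields are trusted.
function parseEnvelope(raw: string): Envelope | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const candidate = data as { type?: unknown; payload?: unknown };
  if (candidate.type !== "task_added" && candidate.type !== "presence_changed") return null;
  if (typeof candidate.payload !== "object" || candidate.payload === null) return null;
  return data as Envelope;
}

// Route each message type to its state update without mutating the input state.
function reduceRoom(state: RoomState, msg: Envelope): RoomState {
  switch (msg.type) {
    case "task_added":
      return { ...state, tasks: [msg.payload, ...state.tasks] };
    case "presence_changed": {
      const online = new Set(state.online);
      if (msg.payload.status === "online") online.add(msg.payload.userId);
      else online.delete(msg.payload.userId);
      return { ...state, online };
    }
  }
}
```

A multi-channel client would instead run one handler per transport, which is where the extra client-side stitching comes from.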
Step 3: Implement Stateful Process Model (Elixir/OTP Pattern)
This approach treats each browser session as an isolated, long-lived process. The Erlang VM schedules these processes preemptively across all CPU cores, so a process waiting on I/O does not stall the others.
defmodule TaskSync.RoomLive do
  use TaskSync.Web, :live_view

  alias TaskSync.PresenceTracker
  alias TaskSync.RoomBroadcast

  @impl true
  def mount(%{"room_id" => room_id}, _session, socket) do
    # Subscribe only on the connected (WebSocket) mount, not the initial static render.
    if connected?(socket) do
      RoomBroadcast.subscribe(room_id)
      PresenceTracker.track(self(), room_id, socket.assigns.user_id)
    end

    {:ok,
     socket
     |> assign(:room_id, room_id)
     |> assign(:tasks, RoomBroadcast.list_tasks(room_id))
     |> assign(:presence, PresenceTracker.list(room_id))}
  end

  @impl true
  def handle_event("add_task", %{"title" => title}, socket) do
    task = RoomBroadcast.create_task(socket.assigns.room_id, title, socket.assigns.user_id)

    # Assumes RoomBroadcast.broadcast/2 skips the calling process (as
    # Phoenix.PubSub.broadcast_from/4 does). If it also delivers to the sender,
    # drop the local update below or the task will be added twice.
    RoomBroadcast.broadcast(socket.assigns.room_id, {:task_added, task})
    {:noreply, update(socket, :tasks, &[task | &1])}
  end

  @impl true
  def handle_info({:task_added, task}, socket) do
    # Broadcasts from other processes in the room arrive here.
    {:noreply, update(socket, :tasks, &[task | &1])}
  end
end
Architecture Rationale:
- mount/3 establishes the subscription and presence tracking only when the socket is connected, preventing background process leaks.
- handle_event/3 processes mutations synchronously within the process, then broadcasts to the room. The process updates its own state immediately, ensuring optimistic UI consistency.
- handle_info/2 receives broadcasts from other processes. Because each process holds its own state, no database round-trip is required for local updates.
- Why this choice? Preemptive scheduling ensures that a blocked I/O operation in one room never stalls another. The single WebSocket carries both commands and patches, eliminating client-side routing logic and reducing handshake overhead.
Step 4: Implement Split-Transport Model (Ruby/ActionCable Pattern)
This approach separates concerns across HTTP and WebSocket channels, mapping closely to REST conventions.
# app/channels/task_sync_channel.rb
class TaskSyncChannel < ApplicationCable::Channel
  def subscribed
    stream_from "task_sync_room_#{params[:room_id]}"
  end

  def receive(data)
    # Use the room this subscription was opened for, not a client-supplied room_id.
    room_id = params[:room_id]

    case data["action"]
    when "create_task"
      TaskSyncBroadcast.perform_later(room_id, data["title"])
    when "update_presence"
      PresenceBroadcast.perform_later(room_id, data)
    end
  end
end

# app/broadcasters/task_sync_broadcast.rb
class TaskSyncBroadcast
  # Despite the name, this class method runs inline. Wrap it in an ActiveJob
  # to move the database write off the cable worker thread.
  def self.perform_later(room_id, title)
    task = Task.create!(room_id: room_id, title: title)
    html = ApplicationController.render(partial: "tasks/task", locals: { task: task })

    # Publish the rendered fragment to every subscriber of the room stream.
    ActionCable.server.broadcast(
      "task_sync_room_#{room_id}",
      { action: "turbo_stream", html: html }
    )
  end
end
Architecture Rationale:
- Subscriptions are scoped to room-specific channels. ActionCable routes messages to connected clients via the configured adapter.
- Mutations trigger background jobs that serialize HTML fragments and push them over the cable. This decouples database writes from real-time delivery.
- Why this choice? Maps directly to familiar MVC patterns. Caching, authentication, and routing leverage existing Rails middleware. The trade-off is increased client-side complexity to stitch HTTP responses, Turbo streams, and JSON presence data together.
Step 5: Client-Side Integration (TypeScript)
Regardless of server architecture, the client must handle connection lifecycle, reconnection backoff, and DOM patching.
class RealtimeSyncClient {
  // Assigned in connect(), which the constructor always calls.
  private socket!: WebSocket;
  private reconnectAttempts = 0;
  private maxReconnectDelay = 30000;

  constructor(private endpoint: string, private roomId: string) {
    this.connect();
  }

  private connect(): void {
    this.socket = new WebSocket(`${this.endpoint}/rooms/${this.roomId}`);

    this.socket.onopen = () => {
      this.reconnectAttempts = 0;
      // Announce presence through the same command path as every other action.
      this.sendCommand('update_presence', { status: 'online' });
    };

    this.socket.onmessage = (event) => {
      const payload = JSON.parse(event.data);
      this.applyPatch(payload);
    };

    this.socket.onclose = () => this.scheduleReconnect();
  }

  private scheduleReconnect(): void {
    // Delay doubles on every attempt (1s, 2s, 4s, ...) up to maxReconnectDelay.
    const delay = Math.min(1000 * Math.pow(2, this.reconnectAttempts), this.maxReconnectDelay);
    setTimeout(() => {
      this.reconnectAttempts++;
      this.connect();
    }, delay);
  }

  public sendCommand(action: string, data: Record<string, unknown>): void {
    // Commands issued while the socket is not open are dropped, not queued.
    if (this.socket.readyState === WebSocket.OPEN) {
      this.socket.send(JSON.stringify({ action, data, room_id: this.roomId }));
    }
  }

  private applyPatch(payload: { type: string; diff: unknown }): void {
    // DOM diffing or framework-specific update logic
    console.log(`Applying ${payload.type} patch:`, payload.diff);
  }
}
Architecture Rationale:
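- Exponential backoff spaces out reconnection attempts during server restarts or network partitions. Without added jitter, clients that dropped at the same moment still retry in lockstep, so backoff alone reduces but does not eliminate thundering herd scenarios.
- Command sending checks readyState to avoid throwing exceptions on closed sockets.
- Patch application is abstracted, allowing integration with virtual DOM libraries or direct DOM manipulation.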
- Why this choice? Decouples transport logic from UI rendering. The client remains framework-agnostic while handling connection resilience, which is critical for production WebSocket deployments.
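Backoff is usually paired with jitter so that clients disconnected by the same event do not all retry at the same instant. The sketch below uses the "full jitter" variant, which picks a random delay between zero and the exponential ceiling; it is a swapped-in refinement rather than what the client above does, and the function name and injectable `random` parameter are assumptions for illustration and testing.

```typescript
// Full-jitter exponential backoff: a random delay in [0, min(cap, base * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
function backoffDelay(
  attempt: number,
  baseMs = 1000,
  capMs = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Example: delays for the first five attempts with real randomness.
const sampleDelays = [0, 1, 2, 3, 4].map((attempt) => backoffDelay(attempt));
console.log(sampleDelays);
```

To use it in the client, the delay computation in scheduleReconnect could call `backoffDelay(this.reconnectAttempts)` instead of the fixed doubling.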
Pitfall Guide
Real-time architectures introduce failure modes that traditional web apps rarely encounter. Below are the most common production mistakes and their fixes.
| Pitfall | Explanation | Fix |
|---|---|---|
| Blocking the Scheduler with Synchronous I/O | Long-running database queries or external API calls inside a WebSocket handler block the entire process/thread, stalling all connected clients. | Offload heavy work to background workers. Keep WebSocket handlers strictly for state routing and lightweight transformations. Use async I/O or connection pooling. |
| Assuming Single-Process Pub/Sub Scales | In-memory event buses work locally but fail in multi-node deployments. Clients on different servers never receive broadcasts. | Deploy a distributed pub/sub adapter (Redis, PostgreSQL LISTEN/NOTIFY, or Erlang distribution). Verify cross-node message delivery in staging before production rollout. |
| Over-Diffing the DOM | Sending full HTML payloads or unoptimized JSON diffs increases bandwidth, CPU usage on the client, and latency. | Transmit minimal structural diffs. Use content hashes or version vectors to skip unchanged nodes. Compress payloads at the WebSocket layer with the permessage-deflate extension. |
| Ignoring WebSocket Origin Validation | Attackers can forge WebSocket upgrade requests from malicious origins, hijacking sessions or injecting unauthorized events. | Enforce strict origin checking during the HTTP upgrade phase. Reject connections where the Origin header does not match allowed domains. Rotate signing keys periodically. |
| Memory Leaks in Long-Lived Connections | Failing to clean up subscriptions, timers, or event listeners when clients disconnect causes gradual memory growth and eventual OOM crashes. | Implement explicit unsubscribe and cleanup hooks on socket close. Use weak references for client metadata. Monitor RSS memory per connection in production. |
| Treating WebSockets as HTTP | Attempting to use REST conventions (status codes, headers, caching) over persistent connections leads to protocol mismatches and broken middleware. | Design a message-based protocol with explicit envelopes: { type, payload, correlation_id }. Handle authentication via initial handshake tokens, not per-message headers. |
| Missing Reconnection State Reconciliation | Clients that reconnect after network drops receive stale state or duplicate events, causing UI desynchronization. | Implement sequence numbers or version vectors. On reconnect, request a delta or full state snapshot. Discard out-of-order messages using monotonic counters. |
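The reconnection reconciliation fix in the last row can be sketched with a monotonic sequence counter. This assumes the server stamps every event with an incrementing `seq` per room, which is an assumption of the example and not something either framework provides by default; the class and method names are hypothetical.

```typescript
interface SequencedEvent<T> {
  seq: number; // assumed to increase by exactly 1 per event within a room
  payload: T;
}

type ReceiveResult = "applied" | "duplicate" | "gap";

class SequenceTracker<T> {
  private lastSeq: number;

  constructor(private readonly apply: (payload: T) => void, initialSeq = 0) {
    this.lastSeq = initialSeq;
  }

  // Apply the event only if it is the next one in order.
  receive(event: SequencedEvent<T>): ReceiveResult {
    if (event.seq <= this.lastSeq) return "duplicate"; // already seen: discard
    if (event.seq > this.lastSeq + 1) return "gap"; // missed events: caller should resync
    this.apply(event.payload);
    this.lastSeq = event.seq;
    return "applied";
  }

  // After loading a full snapshot, continue from the snapshot's sequence number.
  resetTo(seq: number): void {
    this.lastSeq = seq;
  }

  get lastApplied(): number {
    return this.lastSeq;
  }
}
```

On reconnect, the client would send `lastApplied` to the server and request either the missing events or a snapshot; a "gap" result during normal operation is the trigger for the same resync path.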
Production Bundle
Action Checklist
- Define sync boundaries: Map which features require real-time propagation versus eventual consistency.
- Select concurrency model: Choose stateful processes for high-frequency collaboration; choose split-transport for CRUD-heavy apps.
- Configure distributed pub/sub: Deploy Redis or PostgreSQL adapter before multi-node rollout.
- Implement connection lifecycle hooks: Add subscribe, unsubscribe, and cleanup handlers to prevent memory leaks.
- Enforce origin validation: Reject unauthorized WebSocket upgrades at the gateway level.
- Add reconnection logic: Build exponential backoff and state reconciliation on the client.
- Monitor per-connection metrics: Track memory, CPU, and message throughput per active socket.
- Load test under realistic conditions: Simulate network partitions, high concurrency, and sustained connection pools.
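For the connection lifecycle item in the checklist, one way to avoid leaked timers and subscriptions is to register every teardown function against the connection and release them all on close. This is a minimal sketch under that assumption; `ConnectionScope` is a hypothetical helper, not an API from either framework.

```typescript
type Cleanup = () => void;

// Collects teardown callbacks for one connection and runs them exactly once.
class ConnectionScope {
  private cleanups: Cleanup[] = [];
  private closed = false;

  // Register a teardown callback. If the scope is already closed, run it immediately.
  add(cleanup: Cleanup): void {
    if (this.closed) {
      cleanup();
      return;
    }
    this.cleanups.push(cleanup);
  }

  // Run callbacks in reverse registration order; safe to call more than once.
  close(): void {
    if (this.closed) return;
    this.closed = true;
    for (const cleanup of this.cleanups.reverse()) {
      try {
        cleanup();
      } catch {
        // Keep releasing the remaining resources even if one teardown fails.
      }
    }
    this.cleanups = [];
  }
}

// Usage: tie a heartbeat timer to the scope so closing the scope clears it.
const scope = new ConnectionScope();
const heartbeat = setInterval(() => {}, 25000);
scope.add(() => clearInterval(heartbeat));
scope.close();
```

Calling `scope.close()` from the socket's close handler gives a single place to verify that nothing outlives the connection.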
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| High-frequency collaboration (live cursors, presence, instant sync) | Stateful Process Model | Preemptive scheduling handles thousands of concurrent connections with minimal overhead. Built-in presence reduces custom code. | Higher initial learning curve; lower infra cost at scale due to efficient resource utilization. |
| Standard web app with occasional real-time features | Split-Transport Model | Maps to REST/MVC patterns. Leverages existing caching, authentication, and deployment pipelines. | Lower initial development cost; higher infra cost at scale due to thread limits and cross-process serialization. |
| Multi-node deployment with strict compliance requirements | Split-Transport Model + Redis Adapter | Familiar middleware stack simplifies audit trails. Redis provides reliable cross-node messaging with enterprise support. | Moderate infra cost; Redis cluster adds operational overhead but ensures predictable scaling. |
| Edge deployment or low-memory environments | Stateful Process Model | Lightweight processes (~2KB) minimize RAM footprint. No external pub/sub required for single-node setups. | Lowest infra cost; limited to single-node or Erlang distribution clusters. |
Configuration Template
Erlang VM Scheduler Tuning (Production)
# /etc/default/elixir_app
ERL_FLAGS="+S 2:2 +sbwt none +sbt db"
# +S 2:2: 2 scheduler threads, 2 of them online (one per vCPU)
# +sbwt none: Disables scheduler busy wait to reduce CPU spin
# +sbt db: Binds schedulers to logical processors using the default bind type
ActionCable Connection Pooling (Rails)
# config/cable.yml
production:
  adapter: postgresql
  pool: 5
  url: <%= ENV["DATABASE_URL"] %>
  channel_prefix: task_sync_prod
WebSocket Gateway Origin Policy (Nginx)
map $http_origin $allowed_origin {
    default "";
    "~^https://(www\.)?yourdomain\.com$" $http_origin;
}

server {
    location /ws/ {
        # Reject upgrades whose Origin header is not on the allow-list.
        if ($allowed_origin = "") {
            return 403;
        }

        proxy_pass http://app_server;
        # WebSocket upgrades require HTTP/1.1 between nginx and the upstream.
        proxy_http_version 1.1;
        proxy_set_header Origin $allowed_origin;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
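The same allow-list can be enforced in application code as a second line of defense when the gateway is bypassed or misconfigured. The sketch below mirrors the nginx pattern above; `yourdomain.com` is the same placeholder, and the function name is illustrative.

```typescript
// Mirrors the nginx map: exact scheme, optional "www.", exact host, nothing after.
const ALLOWED_ORIGIN = /^https:\/\/(www\.)?yourdomain\.com$/;

// Returns true only for an Origin header that matches the allow-list exactly.
// A missing header is treated as not allowed.
function isAllowedOrigin(origin: string | undefined): boolean {
  return origin !== undefined && ALLOWED_ORIGIN.test(origin);
}

console.log(isAllowedOrigin("https://www.yourdomain.com")); // true
console.log(isAllowedOrigin("https://yourdomain.com.evil.example")); // false
```

Anchoring the pattern at both ends matters: without the trailing `$`, a hostname that merely starts with the allowed domain would pass.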
Quick Start Guide
- Initialize the project: Create a new application using your chosen framework. Configure PostgreSQL 16 as the primary datastore and set up the real-time adapter (Erlang PubSub or PostgreSQL LISTEN/NOTIFY).
- Define the room schema: Create a rooms table with id, name, and updated_at. Add a room_memberships join table to track active connections and presence state.
- Implement the connection handler: Build the WebSocket subscription logic. Add presence tracking, typing indicators, and a basic CRUD event handler. Test locally with two browser tabs.
- Add client resilience: Integrate the TypeScript sync client. Implement exponential backoff, connection state monitoring, and a basic DOM patching function. Verify reconnection behavior by toggling network connectivity.
- Deploy and validate: Push to a staging environment with 2 vCPU/7GB RAM. Run sequential load tests (HTTP ramp, WebSocket flood, persistent connection hold). Monitor memory per connection, handshake latency, and error rates. Adjust scheduler or pool settings based on observed bottlenecks.
Real-time collaboration is not a feature toggle; it is an architectural commitment. The frameworks that succeed in this space do so by aligning their execution model with the concurrency demands of persistent state. Choose the primitive that matches your traffic pattern, enforce strict connection hygiene, and measure before you scale.
