rim() || isStreaming) return;
// Reset error state
setError(null);
// Create user turn
const userTurn: ChatTurn = {
id: crypto.randomUUID(),
role: 'user',
content: userContent.trim(),
timestamp: Date.now(),
};
setTurns(prev => [...prev, userTurn]);
// Prepare assistant turn placeholder
const assistantTurnId = crypto.randomUUID();
const assistantTurn: ChatTurn = {
id: assistantTurnId,
role: 'assistant',
content: '',
timestamp: Date.now(),
};
setTurns(prev => [...prev, assistantTurn]);
setIsStreaming(true);
streamBufferRef.current = '';
// Cancel any existing request
if (abortControllerRef.current) {
abortControllerRef.current.abort();
}
abortControllerRef.current = new AbortController();
try {
const response = await fetch(config.endpoint, {
method: 'POST',
headers: {
'Authorization': `Bearer ${config.apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: config.model,
messages: [...turns, userTurn].map(t => ({
role: t.role,
content: t.content,
})),
stream: true,
max_tokens: config.maxTokens || 1024,
}),
signal: abortControllerRef.current.signal,
});
if (!response.ok) {
throw new Error(`Request failed with status ${response.status}`);
}
const reader = response.body?.getReader();
const decoder = new TextDecoder();
if (!reader) {
throw new Error('ReadableStream not supported');
}
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value, { stream: true });
const lines = chunk.split('\n');
for (const line of lines) {
if (line.startsWith('data: ')) {
const payload = line.slice(6);
if (payload === '[DONE]') continue;
try {
const json = JSON.parse(payload);
const deltaContent = json.choices?.[0]?.delta?.content;
if (deltaContent) {
streamBufferRef.current += deltaContent;
// Update state with accumulated content
setTurns(prev =>
prev.map(t =>
t.id === assistantTurnId
? { ...t, content: streamBufferRef.current }
: t
)
);
}
} catch (parseErr) {
// Ignore malformed JSON chunks common in streaming
console.warn('Stream parse warning:', parseErr);
}
}
}
}
} catch (err) {
if (err instanceof Error && err.name === 'AbortError') {
console.log('Stream aborted by user');
} else {
const message = err instanceof Error ? err.message : 'Unknown stream error';
setError(message);
// Remove failed assistant turn
setTurns(prev => prev.filter(t => t.id !== assistantTurnId));
}
} finally {
setIsStreaming(false);
abortControllerRef.current = null;
}
}, [turns, isStreaming, config]);
// Abort the in-flight request, if any. The stream's AbortError branch in
// submitQuery handles the resulting state cleanup, so nothing else is needed.
const cancelStream = useCallback(() => {
  abortControllerRef.current?.abort();
}, []);
return { turns, isStreaming, error, submitQuery, cancelStream };
}
```

#### 3. Implementation: The Chat Interface

The component focuses on rendering, auto-scrolling, and markdown handling.
```typescript
import React, { useEffect, useRef, useState } from 'react';
import ReactMarkdown from 'react-markdown';
import { Prism as SyntaxHighlighter } from 'react-syntax-highlighter';
import { vscDarkPlus } from 'react-syntax-highlighter/dist/esm/styles/prism';
import { useConversationStream, ChatTurn } from './useConversationStream';
/**
 * Props for ChatDeck. `config` is forwarded verbatim to useConversationStream.
 *
 * NOTE(review): shipping an apiKey to the browser exposes it to end users —
 * confirm this is a restricted proxy token, or route requests through a backend.
 */
interface ChatDeckProps {
  config: {
    apiKey: string;
    endpoint: string;
    model: string;
    /** Optional completion cap; the hook falls back to 1024 when omitted. */
    maxTokens?: number;
  };
}
/**
 * Chat surface: renders the conversation, a streaming indicator, an error
 * banner, and the composer. Scrolling follows new content only while the
 * user is already pinned near the bottom of the message list.
 */
export function ChatDeck({ config }: ChatDeckProps) {
  const { turns, isStreaming, error, submitQuery, cancelStream } = useConversationStream(config);
  const [inputValue, setInputValue] = useState('');
  const messagesEndRef = useRef<HTMLDivElement>(null);
  const chatContainerRef = useRef<HTMLDivElement>(null);

  // Sticky-scroll flag: true while the viewport sits near the bottom.
  const shouldAutoScroll = useRef(true);

  // Track the user's scroll position so streaming updates never yank the
  // viewport away from older messages they scrolled up to read.
  useEffect(() => {
    const el = chatContainerRef.current;
    if (!el) return;
    const onScroll = () => {
      const distanceFromBottom = el.scrollHeight - el.scrollTop - el.clientHeight;
      // Within 50px of the bottom counts as "pinned" — keep following.
      shouldAutoScroll.current = distanceFromBottom < 50;
    };
    el.addEventListener('scroll', onScroll);
    return () => el.removeEventListener('scroll', onScroll);
  }, []);

  // Follow new content only when the user was already at the bottom.
  useEffect(() => {
    if (!shouldAutoScroll.current) return;
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  }, [turns, isStreaming]);

  const handleSend = () => {
    if (!inputValue.trim()) return;
    submitQuery(inputValue);
    setInputValue('');
  };

  // Enter submits; Shift+Enter inserts a newline as usual.
  const onComposerKeyDown = (e: React.KeyboardEvent<HTMLTextAreaElement>) => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      handleSend();
    }
  };

  return (
    <div className="chat-container">
      <div ref={chatContainerRef} className="messages-area">
        {turns.map(turn => (
          <MessageBubble key={turn.id} turn={turn} />
        ))}
        {isStreaming && (
          <div className="typing-indicator">
            <span>Model is thinking...</span>
          </div>
        )}
        {/* Sentinel element that scrollIntoView targets. */}
        <div ref={messagesEndRef} />
      </div>
      {error && (
        <div className="error-banner">
          {error}
          <button onClick={() => window.location.reload()}>Retry</button>
        </div>
      )}
      <div className="input-area">
        <textarea
          value={inputValue}
          onChange={e => setInputValue(e.target.value)}
          onKeyDown={onComposerKeyDown}
          placeholder="Type your message..."
          disabled={isStreaming}
        />
        {isStreaming ? (
          <button onClick={cancelStream} className="stop-btn">Stop</button>
        ) : (
          <button onClick={handleSend} disabled={!inputValue.trim()}>Send</button>
        )}
      </div>
    </div>
  );
}
/**
 * Renders one conversation turn. Content is treated as markdown; fenced code
 * blocks are routed through react-syntax-highlighter, while inline code keeps
 * the default <code> rendering.
 *
 * Bug fix: the blinking cursor previously referenced `isStreaming`, which is
 * not defined in this component's scope (a compile error). An assistant turn
 * with empty content only exists while its stream is pending (the hook removes
 * it on failure), so the cursor is keyed off the turn itself instead.
 */
function MessageBubble({ turn }: { turn: ChatTurn }) {
  // Placeholder assistant turns have empty content until the first token lands.
  const awaitingFirstToken = turn.role === 'assistant' && turn.content === '';
  return (
    <div className={`message ${turn.role}`}>
      <div className="role-label">
        {turn.role === 'user' ? 'You' : 'Assistant'}
      </div>
      <div className="content">
        <ReactMarkdown
          components={{
            code({ className, children, ...props }) {
              // Fenced ```lang blocks arrive as className="language-lang".
              const match = /language-(\w+)/.exec(className || '');
              return match ? (
                <SyntaxHighlighter
                  style={vscDarkPlus}
                  language={match[1]}
                  PreTag="div"
                  {...props}
                >
                  {String(children).replace(/\n$/, '')}
                </SyntaxHighlighter>
              ) : (
                <code className={className} {...props}>
                  {children}
                </code>
              );
            },
          }}
        >
          {turn.content}
        </ReactMarkdown>
        {awaitingFirstToken && <span className="cursor-blink">▋</span>}
      </div>
    </div>
  );
}
```

## Pitfall Guide

- **Stream Abandonment and Memory Leaks**
  - Explanation: If a user navigates away or sends a new message while a stream is active, the previous fetch request continues consuming bandwidth and CPU.
  - Fix: Always use `AbortController`. Cancel the previous request before starting a new one, and abort on component unmount.
- **Excessive Re-renders During Streaming**
  - Explanation: Updating React state on every token can cause the UI to re-render hundreds of times per second, leading to jank and high CPU usage.
  - Fix: Accumulate content in a `useRef` and update state only when necessary. Alternatively, use a custom hook that batches updates or updates state every N tokens.
- **Auto-Scroll Jitter**
  - Explanation: Forcing scroll-to-bottom on every update interrupts the user if they are scrolling up to read previous messages.
  - Fix: Track scroll position. Only auto-scroll if the user is already near the bottom of the container.
- **Markdown Security Vulnerabilities**
  - Explanation: LLMs can generate malicious HTML or JavaScript within markdown blocks.
  - Fix: Use `rehype-sanitize` to strip dangerous attributes and tags. Restrict `react-markdown` plugins to safe extensions like `remark-gfm`.
- **Context Window Overflow**
  - Explanation: Sending the entire conversation history without limits can exceed the model's context window, causing API errors or degraded quality.
  - Fix: Implement a sliding window strategy. Trim older messages when the token count approaches the model's limit.
- **Ignoring Partial Stream Errors**
  - Explanation: Network drops can result in partial responses. The UI might display incomplete text without indicating an error.
  - Fix: Catch errors in the stream loop. If the stream ends unexpectedly, mark the turn with an error state or remove the incomplete message.
- **Blocking the Main Thread with Markdown Parsing**
  - Explanation: Parsing large markdown blocks synchronously can freeze the UI.
  - Fix: For very long responses, consider virtualizing the message list or using web workers for markdown parsing if performance degrades.
## Production Bundle

### Action Checklist

### Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Real-time UX Critical | Chunked Streaming | Lowest perceived latency; keeps users engaged. | Higher bandwidth usage; complex state management. |
| Simple Bot / Low Traffic | Batch Request | Simpler implementation; easier error handling. | Higher perceived latency; potential user drop-off. |
| Code-Heavy Responses | Streaming + Syntax Highlighting | Essential for readability of code blocks. | Requires react-syntax-highlighter; slightly heavier bundle. |
| Mobile / Low Bandwidth | Batch with Loading Skeleton | Reduces connection overhead; predictable UX. | May feel slower on high-latency networks. |
### Configuration Template

Use this template to configure the streaming hook for different environments.

```typescript
// chatConfig.ts
// Environment-specific settings consumed by useConversationStream.
export const CHAT_CONFIG = {
  production: {
    // NOTE(review): a REACT_APP_* env var is inlined into the client bundle at
    // build time, so this key is visible to end users — confirm it is safe to
    // expose, or proxy requests through a backend.
    apiKey: process.env.REACT_APP_LLM_API_KEY,
    endpoint: 'https://api.provider.com/v1/chat/completions',
    model: 'claude-3-5-sonnet-20241022',
    maxTokens: 2048, // larger completion cap for real traffic
  },
  development: {
    apiKey: 'dev-key-placeholder', // never commit a real key here
    endpoint: 'https://api.provider.com/v1/chat/completions',
    model: 'claude-3-5-sonnet-20241022',
    maxTokens: 512, // keep dev responses short and cheap
  },
};
```

## Quick Start Guide
1. **Install dependencies**: `npm install react-markdown react-syntax-highlighter`
2. **Create the hook**: copy the `useConversationStream` implementation into `hooks/useConversationStream.ts`.
3. **Create the component**: copy the `ChatDeck` implementation into `components/ChatDeck.tsx`.
4. **Wire up**: import `ChatDeck` in your app and pass the configuration object: `<ChatDeck config={CHAT_CONFIG.production} />`
5. **Test**: verify streaming works, auto-scroll behaves correctly, and markdown renders safely. Check the network tab for proper stream handling.