Architecting a Local LLM Registry: Client-Side Filtering and Metadata Normalization in Next.js

By Codcompass Team·2026-05-12·72 min read

Current Situation Analysis

The explosion of open-weight language models has created a discovery problem. Developers running local inference through runtimes like Ollama face a fragmented catalog experience. The official model directory relies on endless scrolling, inconsistent capability tagging, and zero programmatic sorting. When evaluating models for specific workloads, engineers need to cross-reference parameter counts, context window limits, and modality support (text, vision, embeddings) across multiple pages. This friction is rarely addressed because model distribution platforms prioritize weight hosting over discovery UX.

The metadata inconsistency is the core technical debt. Context windows are published in mixed formats: raw integers (32768), kilo suffixes (128k), and mega suffixes (1M). Capability tags lack standardization, and search functionality is typically limited to exact name matching. Without a normalized data layer, developers perform mental arithmetic and manual filtering, which scales poorly as model libraries grow beyond fifty entries.

This problem is overlooked because most teams treat model selection as a one-time setup task. In reality, local LLM stacks require frequent model swapping for benchmarking, cost optimization, and task-specific routing. A dedicated registry UI that normalizes metadata, enforces precise filtering logic, and delivers instant client-side interactions transforms model selection from a manual chore into a deterministic engineering workflow.

WOW Moment: Key Findings

Building a structured registry UI reveals a critical divergence between default UI patterns and actual developer workflows. The most impactful change isn't visual polish; it's the shift from OR-based tag matching to AND-based capability filtering, combined with strict metadata normalization.

Approach	Filter Precision	Metadata Consistency	Interaction Latency	Cognitive Load
Official Catalog	Low (OR logic, inconsistent tags)	Mixed formats (`k`, `M`, raw)	High (server-rendered pagination)	High (manual cross-referencing)
Structured Registry UI	High (AND logic, strict schema)	Normalized to base units	Near-zero (client-side state)	Low (deterministic sorting)

This finding matters because it directly impacts inference pipeline reliability. When you need a model that supports both vision and chat, OR filtering returns irrelevant results that match either capability. AND filtering guarantees the model satisfies all workload requirements before deployment. Normalizing context windows to a single numeric unit enables accurate sorting, preventing costly mistakes like deploying a 32k model for a 128k document pipeline. The registry approach shifts model selection from guesswork to configuration.
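
The difference between the two filter modes can be seen on a three-model toy catalog. This is a minimal sketch: the model names and tags are invented for illustration and do not come from the real catalog.

```typescript
// Illustrative data only; these model names and tags are made up for the sketch.
type Capability = 'chat' | 'vision' | 'embedding' | 'code';

interface TaggedModel {
  name: string;
  capabilities: Capability[];
}

const catalog: TaggedModel[] = [
  { name: 'chat-only', capabilities: ['chat'] },
  { name: 'vision-only', capabilities: ['vision'] },
  { name: 'chat-and-vision', capabilities: ['chat', 'vision'] },
];

const required: Capability[] = ['chat', 'vision'];

// OR (union): keep a model if it has at least one required capability.
const orMatches = catalog.filter((m) =>
  required.some((cap) => m.capabilities.includes(cap)),
);

// AND (intersection): keep a model only if it has every required capability.
const andMatches = catalog.filter((m) =>
  required.every((cap) => m.capabilities.includes(cap)),
);

console.log(orMatches.map((m) => m.name).join(', '));  // chat-only, vision-only, chat-and-vision
console.log(andMatches.map((m) => m.name).join(', ')); // chat-and-vision
```

With OR logic, two of the three results cannot serve a vision-plus-chat workload; with AND logic, only the usable model remains.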

Core Solution

The architecture prioritizes static data generation, client-side state management, and strict type safety. We avoid runtime API calls to the model catalog, instead shipping a pre-validated JSON dataset with the application. This guarantees instant deployment, predictable performance, and zero external dependencies during runtime.

Step 1: Data Architecture and Static Generation

We fetch the Ollama model catalog during the build phase and serialize it into a local JSON file. Next.js App Router handles this via a server component that imports the static asset. The data shape is strictly typed to prevent runtime mismatches.

// types/model-registry.ts
export interface ModelSpec {
  id: string;
  name: string;
  sizeBytes: number;
  contextTokens: number;
  capabilities: ('chat' | 'vision' | 'embedding' | 'code')[];
  quantization: string;
}

export interface ModelRegistry {
  version: string;
  lastUpdated: string;
  models: ModelSpec[];
}

The build script pulls the raw catalog, normalizes the context window field, and writes registry.json. Shipping this file means the UI loads instantly without network waterfalls.
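
The assembly step of such a build script might look like the sketch below. The `RawCatalogEntry` shape, the `buildRegistry` name, and the decision to drop unknown tags are all assumptions made for illustration; the article does not specify the upstream format. The context parser and the timestamp are passed in so the function stays pure and testable.

```typescript
// Hypothetical shape of one raw catalog entry; the real upstream format may differ.
interface RawCatalogEntry {
  id: string;
  name: string;
  sizeBytes: number;
  context: string | number; // e.g. 32768, "128k", "1M"
  capabilities: string[];
  quantization: string;
}

type Capability = 'chat' | 'vision' | 'embedding' | 'code';

interface ModelSpec {
  id: string;
  name: string;
  sizeBytes: number;
  contextTokens: number;
  capabilities: Capability[];
  quantization: string;
}

interface ModelRegistry {
  version: string;
  lastUpdated: string;
  models: ModelSpec[];
}

const KNOWN_CAPABILITIES: readonly Capability[] = ['chat', 'vision', 'embedding', 'code'];

function isCapability(tag: string): tag is Capability {
  return (KNOWN_CAPABILITIES as readonly string[]).includes(tag);
}

// Pure assembly step: the caller supplies the context parser and the timestamp.
function buildRegistry(
  raw: RawCatalogEntry[],
  parseContext: (value: string | number) => number,
  now: Date,
): ModelRegistry {
  const models: ModelSpec[] = raw.map((entry) => ({
    id: entry.id,
    name: entry.name,
    sizeBytes: entry.sizeBytes,
    contextTokens: parseContext(entry.context),
    // Assumption: unknown tags are dropped; a stricter pipeline could fail the build instead.
    capabilities: entry.capabilities.filter(isCapability),
    quantization: entry.quantization,
  }));

  return {
    // Date-based version string such as "2024.06.15".
    version: now.toISOString().slice(0, 10).replace(/-/g, '.'),
    lastUpdated: now.toISOString(),
    models,
  };
}
```

The script would then serialize the result with `JSON.stringify(registry, null, 2)` and write it to registry.json before the Next.js build runs.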

Step 2: Metadata Normalization

Context windows arrive in inconsistent formats. A dedicated parser converts all variations to base tokens before the data reaches the UI.

// utils/normalize-context.ts
export function parseContextWindow(raw: string | number): number {
  if (typeof raw === 'number') return raw;
  
  const normalized = raw.toLowerCase().trim();
  const match = normalized.match(/^(\d+(?:\.\d+)?)\s*(k|m)?$/);
  
  if (!match) throw new Error(`Invalid context format: ${raw}`);
  
  const value = parseFloat(match[1]);
  const suffix = match[2];
  
  if (suffix === 'm') return Math.round(value * 1_000_000);
  if (suffix === 'k') return Math.round(value * 1_000);
  return Math.round(value);
}

This function runs during the data pipeline, not in the browser. By normalizing at build time, we eliminate runtime parsing overhead and guarantee that sorting operations work on pure integers.
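
A few sample inputs show what the parser returns. The function is repeated from above so the snippet runs on its own; the sample values are illustrative. Note that the multipliers are decimal, so "128k" becomes 128000 rather than 131072 (128 × 1024). Whether the upstream catalog means decimal or binary units is an assumption worth verifying against real entries.

```typescript
// Same parser as above, repeated so the snippet is self-contained.
function parseContextWindow(raw: string | number): number {
  if (typeof raw === 'number') return raw;

  const normalized = raw.toLowerCase().trim();
  const match = normalized.match(/^(\d+(?:\.\d+)?)\s*(k|m)?$/);

  if (!match) throw new Error(`Invalid context format: ${raw}`);

  const value = parseFloat(match[1]);
  const suffix = match[2];

  if (suffix === 'm') return Math.round(value * 1_000_000);
  if (suffix === 'k') return Math.round(value * 1_000);
  return Math.round(value);
}

const samples: Array<string | number> = [32768, '128k', '1M', ' 8K ', '1.5k'];
for (const sample of samples) {
  console.log(JSON.stringify(sample), '->', parseContextWindow(sample));
}
// 32768   -> 32768
// "128k"  -> 128000
// "1M"    -> 1000000
// " 8K "  -> 8000
// "1.5k"  -> 1500

// Anything outside the three supported formats is rejected instead of guessed.
try {
  parseContextWindow('32,768');
} catch (error) {
  console.log((error as Error).message); // Invalid context format: 32,768
}
```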

Step 3: AND-Based Capability Filtering

Most filter implementations use OR logic, returning models that match any selected tag. For local inference, developers require intersection logic. The filter function checks that every selected capability exists in the model's array.

// utils/filter-registry.ts
import { ModelSpec } from '@/types/model-registry';

export function applyCapabilityFilter(
  models: ModelSpec[],
  selectedCapabilities: string[]
): ModelSpec[] {
  if (selectedCapabilities.length === 0) return models;
  
  return models.filter((model) =>
    selectedCapabilities.every((cap) =>
      model.capabilities.includes(cap as ModelSpec['capabilities'][number])
    )
  );
}

This approach prevents false positives. If a pipeline requires vision and embedding support, the registry only surfaces models that natively support both, reducing trial-and-error deployment cycles.
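
One possible variant types the selection as the capability union, which removes the need for the `as` cast, and uses a Set for the membership test. This is a sketch, not the article's implementation: `filterByAllCapabilities` is a hypothetical name and `ModelSpec` is trimmed to the fields the filter needs.

```typescript
type Capability = 'chat' | 'vision' | 'embedding' | 'code';

// Trimmed to the fields this sketch needs.
interface ModelSpec {
  id: string;
  capabilities: Capability[];
}

// Hypothetical typed variant: `required` is already a Capability[], so no cast is needed.
function filterByAllCapabilities(
  models: ModelSpec[],
  required: Capability[],
): ModelSpec[] {
  if (required.length === 0) return models;

  return models.filter((model) => {
    const owned = new Set<Capability>(model.capabilities);
    // Intersection check: every required capability must be owned by the model.
    return required.every((cap) => owned.has(cap));
  });
}

const sample: ModelSpec[] = [
  { id: 'a', capabilities: ['chat', 'code'] },
  { id: 'b', capabilities: ['vision', 'embedding'] },
  { id: 'c', capabilities: ['chat', 'vision', 'embedding'] },
];

console.log(filterByAllCapabilities(sample, ['vision', 'embedding']).map((m) => m.id).join(', ')); // b, c
```

With only a handful of capabilities per model, the Set offers no measurable speedup over `includes`; the main gain here is that a misspelled capability fails at compile time.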

Step 4: Client-Side Sorting and Search

With normalized data, sorting becomes a deterministic array operation. We use React state to manage sort direction and active filters, updating the view without server roundtrips.

```typescript
// components/registry-table.tsx
'use client';

import { useState, useMemo } from 'react';
import { ModelSpec } from '@/types/model-registry';
import { applyCapabilityFilter } from '@/utils/filter-registry';

interface RegistryTableProps {
  initialData: ModelSpec[];
}

export function RegistryTable({ initialData }: RegistryTableProps) {
  const [searchQuery, setSearchQuery] = useState('');
  const [activeFilters, setActiveFilters] = useState<string[]>([]);
  const [sortConfig, setSortConfig] = useState<{ key: keyof ModelSpec; dir: 'asc' | 'desc' } | null>(null);

  const processedData = useMemo(() => {
    let result = applyCapabilityFilter(initialData, activeFilters);
    
    if (searchQuery) {
      const lower = searchQuery.toLowerCase();
      result = result.filter((m) => m.name.toLowerCase().includes(lower));
    }
    
    if (sortConfig) {
      const { key, dir } = sortConfig;
      result = [...result].sort((a, b) => {
        const aVal = a[key];
        const bVal = b[key];
        if (typeof aVal === 'number' && typeof bVal === 'number') {
          return dir === 'asc' ? aVal - bVal : bVal - aVal;
        }
        return dir === 'asc' 
          ? String(aVal).localeCompare(String(bVal)) 
          : String(bVal).localeCompare(String(aVal));
      });
    }
    
    return result;
  }, [initialData, activeFilters, searchQuery, sortConfig]);

  // Render table using shadcn/ui components
  // ...
}
```

useMemo ensures sorting and filtering only recompute when dependencies change. The numeric comparison branch handles sizeBytes and contextTokens correctly, while string fields fall back to locale-aware comparison.
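
The component above leaves out how `sortConfig` changes when a column header is clicked. One common pattern, sketched here with a hypothetical `nextSortConfig` helper, cycles each column through ascending, descending, and unsorted.

```typescript
type SortDirection = 'asc' | 'desc';

interface SortConfig<K extends string> {
  key: K;
  dir: SortDirection;
}

// Hypothetical helper. Cycle for the clicked column: unsorted -> asc -> desc -> unsorted.
// Clicking a different column starts again at ascending.
function nextSortConfig<K extends string>(
  current: SortConfig<K> | null,
  clicked: K,
): SortConfig<K> | null {
  if (current === null || current.key !== clicked) {
    return { key: clicked, dir: 'asc' };
  }
  if (current.dir === 'asc') {
    return { key: clicked, dir: 'desc' };
  }
  return null;
}
```

In the table, a header click handler could then call `setSortConfig((prev) => nextSortConfig(prev, 'contextTokens'))`, keeping the transition logic out of the JSX.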

Architecture Rationale

Static JSON over API Proxy: Eliminates runtime latency, reduces server costs, and guarantees consistent data shape across environments.
Client-Side Filtering: Model catalogs rarely exceed 200-300 entries. At that size, filtering and sorting in the browser typically complete within a few milliseconds, providing instant feedback without network overhead.
AND Logic by Default: Matches actual engineering constraints. Workloads require capability intersection, not union.
Build-Time Normalization: Prevents runtime type errors and ensures sorting algorithms operate on homogeneous data types.

Pitfall Guide

1. Defaulting to OR-Based Tag Filtering

Explanation: Most UI libraries implement tag filters as unions. This returns models matching any selected capability, flooding results with irrelevant options. Fix: Enforce intersection logic using Array.every() or Set operations. Validate that all selected capabilities exist in the model's capability array before inclusion.

2. Naive String Sorting for Context Windows

Explanation: Sorting ["128k", "32768", "1M"] as strings produces incorrect order (128k < 1M < 32768). Lexicographic comparison breaks numeric intent. Fix: Normalize all context values to base tokens during the data pipeline. Sort exclusively on numeric fields. Never sort raw metadata strings in the UI.
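
The ordering problem is easy to reproduce. The sketch below sorts the same three values once as raw strings and once as normalized token counts; the normalized numbers are hardcoded and assume decimal multipliers.

```typescript
const rawWindows = ['128k', '32768', '1M'];

// The default sort compares strings by UTF-16 code units, not by numeric value.
const lexicographic = [...rawWindows].sort();
console.log(lexicographic.join(' < ')); // 128k < 1M < 32768

// The same three values after build-time normalization to base tokens.
const normalized = [
  { label: '128k', contextTokens: 128_000 },
  { label: '32768', contextTokens: 32_768 },
  { label: '1M', contextTokens: 1_000_000 },
];

const numeric = [...normalized].sort((a, b) => a.contextTokens - b.contextTokens);
console.log(numeric.map((m) => m.label).join(' < ')); // 32768 < 128k < 1M
```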

3. Blocking the Main Thread During Sort/Filter

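Explanation: Running heavy array operations on the main thread causes UI jank, especially when users rapidly toggle filters or type search queries. Fix: Memoize expensive computations with useMemo and stable dependencies so they rerun only when their inputs change, and debounce the search input so that rapid keystrokes do not each trigger a full pass. useMemo does not move work off the main thread; for catalogs exceeding 1,000 entries, offload sorting to a Web Worker or implement virtualization to limit DOM nodes.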
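
For the rapid-typing case, a small debounce helper reduces how often the filter state changes at all. The sketch below is framework-agnostic; `debounce` and `applySearch` are hypothetical names, and the 150 ms delay is an arbitrary choice.

```typescript
// Hypothetical helper: delays `fn` until `waitMs` has passed without a new call.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  waitMs: number,
): (...args: Args) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;

  return (...args: Args) => {
    // Each new call cancels the pending one, so only the last call in a burst runs.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

const applied: string[] = [];
const applySearch = debounce((query: string) => {
  applied.push(query);
}, 150);

// Three keystrokes in quick succession.
applySearch('l');
applySearch('ll');
applySearch('lla');

// Nothing has run yet; only 'lla' is applied once 150 ms pass without another call.
console.log(applied.length); // 0
```

In a React component, the debounced function would need a stable identity across renders (for example via `useMemo` or a ref); otherwise each render creates a fresh timer.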

4. Ignoring Type Safety in Metadata Pipelines

Explanation: Importing raw JSON without validation allows malformed entries to crash the UI or produce silent sorting bugs. Fix: Validate the JSON against a runtime schema (for example with Zod) at build time. TypeScript interfaces alone are erased at compile time and do not check the data itself. Fail the build if required fields (contextTokens, capabilities) are missing or malformed.
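
To show the kind of check meant here without adding a dependency, the sketch below hand-rolls a validator for one entry. A schema library such as Zod would express the same rules declaratively. `validateModelSpec` is a hypothetical name, and the exact rules (positive integers, known capability tags) are assumptions based on the `ModelSpec` interface.

```typescript
const CAPABILITIES: string[] = ['chat', 'vision', 'embedding', 'code'];

// Returns a list of human-readable problems; an empty list means the entry is usable.
function validateModelSpec(entry: unknown): string[] {
  if (typeof entry !== 'object' || entry === null) {
    return ['entry is not an object'];
  }
  const e = entry as Record<string, unknown>;
  const errors: string[] = [];

  for (const field of ['id', 'name', 'quantization']) {
    if (typeof e[field] !== 'string' || e[field] === '') {
      errors.push(`${field} must be a non-empty string`);
    }
  }

  for (const field of ['sizeBytes', 'contextTokens']) {
    const value = e[field];
    if (typeof value !== 'number' || !Number.isInteger(value) || value <= 0) {
      errors.push(`${field} must be a positive integer`);
    }
  }

  const caps = e['capabilities'];
  if (!Array.isArray(caps)) {
    errors.push('capabilities must be an array');
  } else {
    for (const cap of caps) {
      if (typeof cap !== 'string' || !CAPABILITIES.includes(cap)) {
        errors.push(`unknown capability: ${String(cap)}`);
      }
    }
  }

  return errors;
}
```

A build script could run this over every entry, print the collected errors, and exit with a non-zero status if any are found, which fails the build before a malformed registry ships.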

5. Stale Data in Static Builds

Explanation: Shipping a static JSON file means the registry doesn't reflect new model releases until the next deployment. Fix: Implement a CI/CD job that fetches the latest catalog, runs the normalization script, and commits the updated JSON. Trigger deployments automatically on schema changes.

6. Accessibility Gaps in Custom Tables

Explanation: Replacing native HTML tables with div-based layouts breaks screen reader navigation and keyboard focus management. Fix: Use semantic <table>, <thead>, <tbody>, and <th> elements. Put aria-sort on the sortable column header (<th>) with the sort control as a <button> inside it, and make filter pills focusable with role="checkbox" and a matching aria-checked state.
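
Deriving the aria-sort value from the sort state keeps the attribute in sync with what the table actually shows. The helper below is a sketch; `ariaSortFor` is a hypothetical name, and the `SortConfig` shape mirrors the state used in the registry table.

```typescript
type AriaSort = 'ascending' | 'descending' | 'none';

interface SortConfig {
  key: string;
  dir: 'asc' | 'desc';
}

// Maps the table's sort state to the aria-sort value for one column header.
function ariaSortFor(column: string, sort: SortConfig | null): AriaSort {
  if (sort === null || sort.key !== column) return 'none';
  return sort.dir === 'asc' ? 'ascending' : 'descending';
}

console.log(ariaSortFor('contextTokens', { key: 'contextTokens', dir: 'desc' })); // descending
console.log(ariaSortFor('name', { key: 'contextTokens', dir: 'desc' }));          // none
```

Each header cell would then render as `<th aria-sort={ariaSortFor('contextTokens', sortConfig)}>`, so only the active column announces a sort direction.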

7. Over-Engineering with Server-Side Filtering

Explanation: Routing filter requests through API endpoints adds latency and complexity for datasets that comfortably fit in browser memory. Fix: Keep filtering and sorting client-side for catalogs under 500 entries. Reserve server-side pagination for enterprise datasets requiring database queries or row-level security.

Production Bundle

Action Checklist

Validate JSON schema with Zod before shipping to production
Implement AND-based capability filtering using Array.every()
Normalize context windows and sizes to base units during build time
Wrap sort/filter logic in useMemo to avoid recomputing it on unrelated re-renders
Add debounced search input to reduce state update frequency
Ensure all interactive elements meet WCAG 2.1 AA contrast and focus standards
Configure CI pipeline to refresh catalog data weekly or on tag changes
Add error boundaries around the registry component to handle malformed entries gracefully

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Catalog < 300 models, infrequent updates	Static JSON + client-side filtering	Zero runtime cost, instant load, predictable performance	$0 infrastructure
Catalog > 1,000 models, frequent updates	Server-side pagination + database index	Reduces payload size, enables efficient querying	Moderate compute cost
Multi-tenant registry with access control	API proxy + row-level security	Enforces permissions, isolates tenant data	Higher backend cost
Real-time benchmarking data required	Hybrid: static metadata + dynamic metrics API	Keeps discovery fast, fetches live scores on demand	Low-medium API cost

Configuration Template

// public/registry.json
{
  "version": "2024.06.15",
  "lastUpdated": "2024-06-15T08:00:00Z",
  "models": [
    {
      "id": "llama3.1:8b",
      "name": "llama3.1",
      "sizeBytes": 4900000000,
      "contextTokens": 131072,
      "capabilities": ["chat", "code"],
      "quantization": "Q4_K_M"
    },
    {
      "id": "llava:13b",
      "name": "llava",
      "sizeBytes": 7800000000,
      "contextTokens": 4096,
      "capabilities": ["chat", "vision"],
      "quantization": "Q5_K_M"
    }
  ]
}

// next.config.js
/** @type {import('next').NextConfig} */
const nextConfig = {
  output: 'export',
  trailingSlash: true,
  images: { unoptimized: true },
  // Stubs Node-only modules in the client bundle; plain JSON imports do not require this
  webpack: (config) => {
    config.resolve.fallback = { ...config.resolve.fallback, fs: false, path: false };
    return config;
  }
};

module.exports = nextConfig;

Quick Start Guide

Initialize Project: Run npx create-next-app@latest registry-ui --typescript --tailwind --app. Install @radix-ui/react-slot, class-variance-authority, clsx, tailwind-merge, and lucide-react.
Add Data Pipeline: Create a scripts/fetch-catalog.ts file that pulls the Ollama model list, runs parseContextWindow() on each entry, and writes public/registry.json. Execute it before building.
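Build Registry Component: Import the JSON file in a server component and pass its models to a client component (such as RegistryTable) as initialData. Implement useMemo-wrapped filtering and sorting logic in the client component. Render the table using shadcn/ui components with proper ARIA attributes.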
Verify & Deploy: Run npm run build to confirm static export succeeds. Test filter intersection logic and numeric sorting. Deploy to any static host (Vercel, Netlify, Cloudflare Pages) with zero configuration.