Picking black or white text: a tiny trained model vs WCAG luminance

Statistical Contrast: A Lightweight Logistic Model for Dynamic UI Text Color

Current Situation Analysis

Modern interfaces increasingly rely on user-generated or algorithmically assigned colors for tags, calendar events, avatars, and data visualization elements. A persistent engineering challenge in these systems is selecting a legible text color—typically black or white—based on the background hue.

The industry standard response is to implement the W3C's WCAG 2.0 relative luminance formula. This approach converts sRGB values to linear light, applies perceptual weights, and thresholds the result. While WCAG is the correct standard for accessibility compliance and contrast ratios for low-vision users, it is frequently misapplied as a general-purpose aesthetic heuristic.

This misapplication creates two distinct problems:

Objective Mismatch: WCAG optimizes for minimum contrast ratios to ensure readability for users with visual impairments. It does not optimize for the binary aesthetic decision of "which text color looks better to a typical user." Consequently, WCAG's decision boundary diverges from human perception in approximately 14% of color cases, particularly in saturated mid-luminance regions.
Computational Overhead: The luminance formula requires gamma decoding via power functions (pow), which are computationally expensive. In high-throughput rendering scenarios—such as rendering thousands of data points or virtualized list items—this overhead accumulates, impacting frame rates.
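
For comparison, the luminance approach described above can be sketched as follows. This is a minimal reference version based on the WCAG 2.0 definitions of relative luminance and contrast ratio. The function names, and the rule of picking whichever of black or white gives the higher contrast ratio, are assumptions made for illustration rather than code from the original implementation. It shows where the power-function cost comes from: up to three `Math.pow` calls per color.

```typescript
// Sketch of the WCAG 2.0 baseline. Function names are illustrative.

// Convert one 8-bit sRGB channel to linear light (WCAG 2.0 definition).
function srgbChannelToLinear(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance in [0, 1]: weighted sum of the linearized channels.
function relativeLuminance(r: number, g: number, b: number): number {
  return (
    0.2126 * srgbChannelToLinear(r) +
    0.7152 * srgbChannelToLinear(g) +
    0.0722 * srgbChannelToLinear(b)
  );
}

// Assumed decision rule: choose the text color with the higher contrast ratio.
function wcagTextColor(r: number, g: number, b: number): 'light' | 'dark' {
  const lum = relativeLuminance(r, g, b);
  const contrastWithWhite = 1.05 / (lum + 0.05); // white has luminance 1
  const contrastWithBlack = (lum + 0.05) / 0.05; // black has luminance 0
  return contrastWithWhite >= contrastWithBlack ? 'light' : 'dark';
}
```

Very dark channels (values of 10 or below) take the linear branch and skip `Math.pow`, so the cost varies slightly with the input color.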

A statistically trained binary logistic regression model offers a superior alternative for UI text color selection. By fitting a linear decision boundary to human-labeled data, this approach achieves higher agreement with human judgment while reducing computational complexity by eliminating transcendental functions entirely.

WOW Moment: Key Findings

The following comparison highlights the divergence between the standard luminance approach and the logistic regression model. The data demonstrates that a simpler geometric model, trained on perceptual data, outperforms the complex physical model for this specific use case.

Metric	WCAG Relative Luminance	Logistic Regression Model
Human Agreement	~83.1%	~92.0%
Disagreement Resolution	Baseline	Wins 4:1 on conflicts
Decision Boundary	Curved Surface (Gamma-corrected)	Hyperplane (Linear)
Transcendental Operations	Up to 3 (`pow` calls, one per channel)	0 (Optimized linear check)
Primary Optimization Goal	Accessibility Compliance	Aesthetic Readability

Why this matters: The logistic model not only aligns better with human perception but also reduces the decision logic to a single linear combination. By recognizing that the threshold at 0.5 probability corresponds to a linear sum of zero, the model can be implemented without any exponential or power functions, resulting in near-zero latency even in tight rendering loops.

Core Solution

The solution replaces the luminance calculation with a binary logistic regression classifier. The model was trained on a balanced dataset of approximately 600 hand-labeled RGB colors, where human annotators selected the preferred text color.

Architecture Decisions

Linear Decision Boundary: Analysis of the training data reveals that the separation between "black text" and "white text" regions in RGB space is effectively planar. A linear model captures this boundary with high fidelity, rendering complex gamma corrections unnecessary for this task.
Coefficient Analysis: The learned weights for the RGB channels are [0.027291, 0.0688366, 0.006275]. These weights preserve the perceptual ordering found in vision science (Green > Red > Blue) but adjust the ratios to better fit the binary classification task. Specifically, the model reduces the dominance of the green channel relative to red compared to WCAG, and aggressively underweights blue, which aligns with how humans perceive brightness in saturated colors.
Optimization Insight: The standard logistic function computes a probability via the sigmoid: P = 1 / (1 + exp(-z)), where z is the weighted sum of the channels plus the bias. The model selects light text when P <= 0.5. Because the sigmoid is strictly increasing and sigmoid(0) = 0.5, the condition sigmoid(z) <= 0.5 is equivalent to z <= 0. This allows the implementation to bypass the sigmoid calculation entirely, reducing the operation to a simple comparison of the linear combination against zero.
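
The equivalence between the sigmoid threshold and the sign of z can be checked empirically. The sketch below is illustrative only: the helper names are hypothetical, and the coefficients and bias are the ones used by this article's model. It evaluates both forms of the decision over a sampled grid of colors and counts disagreements.

```typescript
// Illustrative check that sigmoid(z) <= 0.5 and z <= 0 give the same decision.
// All names here are hypothetical helpers, not part of the original utility.
const W_RED = 0.027291;
const W_GREEN = 0.0688366;
const W_BLUE = 0.006275;
const BIAS = -13.9369834;

// Linear combination z = w·x + b on raw 0-255 channel values.
function linearScoreOf(r: number, g: number, b: number): number {
  return r * W_RED + g * W_GREEN + b * W_BLUE + BIAS;
}

// Decision via the full sigmoid (one Math.exp call).
function isLightViaSigmoid(r: number, g: number, b: number): boolean {
  const p = 1 / (1 + Math.exp(-linearScoreOf(r, g, b)));
  return p <= 0.5;
}

// Decision via the sign of z (no transcendental calls).
function isLightViaLinear(r: number, g: number, b: number): boolean {
  return linearScoreOf(r, g, b) <= 0;
}

// Count colors on a sampled grid where the two decisions disagree.
function countMismatches(step: number): number {
  let mismatches = 0;
  for (let r = 0; r <= 255; r += step) {
    for (let g = 0; g <= 255; g += step) {
      for (let b = 0; b <= 255; b += step) {
        if (isLightViaSigmoid(r, g, b) !== isLightViaLinear(r, g, b)) {
          mismatches++;
        }
      }
    }
  }
  return mismatches;
}
```

Calling `countMismatches(17)` samples 4096 colors (16 levels per channel). The two forms could in principle differ through floating-point rounding when z is within a few ulps of zero, which is another reason to prefer the direct comparison.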

Implementation

The following TypeScript implementation provides a production-ready utility. It includes input validation and leverages the linear optimization to achieve maximum performance.

/**
 * Parameters derived from binary logistic regression on human-labeled color data.
 * Coefficients represent the weight of each channel in the decision hyperplane.
 */
const TEXT_COLOR_MODEL = {
  weights: {
    red: 0.027291,
    green: 0.0688366,
    blue: 0.006275,
  },
  bias: -13.9369834,
} as const;

/**
 * Determines the optimal text color (light or dark) for a given background RGB.
 * 
 * This function implements a logistic regression decision boundary.
 * By checking if the linear combination is <= 0, we avoid computing the sigmoid,
 * eliminating all transcendental operations.
 * 
 * @param r - Red channel (0-255)
 * @param g - Green channel (0-255)
 * @param b - Blue channel (0-255)
 * @returns 'light' for white text, 'dark' for black text
 */
export function getTextColor(r: number, g: number, b: number): 'light' | 'dark' {
  // Round and clamp inputs to the valid 0-255 channel range.
  // NaN is not caught here: it propagates through the sum and yields 'dark'.
  const clampedR = Math.max(0, Math.min(255, Math.round(r)));
  const clampedG = Math.max(0, Math.min(255, Math.round(g)));
  const clampedB = Math.max(0, Math.min(255, Math.round(b)));

  // Compute the linear combination z = w·x + b
  const linearScore = 
    (clampedR * TEXT_COLOR_MODEL.weights.red) +
    (clampedG * TEXT_COLOR_MODEL.weights.green) +
    (clampedB * TEXT_COLOR_MODEL.weights.blue) +
    TEXT_COLOR_MODEL.bias;

  // Decision boundary: sigmoid(z) <= 0.5 iff z <= 0
  // z <= 0 implies the background is "dark enough" for light text
  return linearScore <= 0 ? 'light' : 'dark';
}

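/**
 * Helper to parse hex strings ("#rrggbb", "rrggbb", or shorthand "#rgb") into RGB components.
 * Throws on malformed input instead of silently treating it as black.
 */
export function getTextColorFromHex(hex: string): 'light' | 'dark' {
  let cleanHex = hex.trim().replace(/^#/, '');
  // Expand shorthand such as "f80" to "ff8800"
  if (/^[0-9a-fA-F]{3}$/.test(cleanHex)) {
    cleanHex = cleanHex.split('').map((ch) => ch + ch).join('');
  }
  if (!/^[0-9a-fA-F]{6}$/.test(cleanHex)) {
    throw new Error(`Invalid hex color: ${hex}`);
  }
  const packed = parseInt(cleanHex, 16);
  const r = (packed >> 16) & 255;
  const g = (packed >> 8) & 255;
  const b = packed & 255;
  return getTextColor(r, g, b);
}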

Rationale for Choices

Raw RGB Input: The model operates directly on raw RGB values. There is no need for color space conversion because the training process implicitly learned the mapping from raw values to human preference.
Zero Transcendental Calls: The optimization linearScore <= 0 removes the need for Math.exp or Math.pow. This is critical for performance in scenarios rendering tens of thousands of elements.
Explicit Thresholding: The bias term -13.9369834 encodes the threshold. Adjusting this value allows developers to bias the model toward lighter or darker text if specific UI requirements demand it.
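
One way to expose the bias adjustment is a small factory that adds an offset to the trained bias. This is a sketch under stated assumptions: `createTextColorPicker` and `biasOffset` are hypothetical names, and the offsets of +1.0 and -1.0 are arbitrary illustration values, not tuned recommendations.

```typescript
// Hypothetical factory for a bias-adjusted picker. `biasOffset` is not part
// of the trained model; it only shifts the decision threshold.
type TextColorChoice = 'light' | 'dark';

const BASE_MODEL = {
  red: 0.027291,
  green: 0.0688366,
  blue: 0.006275,
  bias: -13.9369834,
} as const;

function createTextColorPicker(
  biasOffset = 0
): (r: number, g: number, b: number) => TextColorChoice {
  // A positive offset raises every score, so more backgrounds get dark text.
  const bias = BASE_MODEL.bias + biasOffset;
  return (r, g, b) => {
    const score =
      r * BASE_MODEL.red + g * BASE_MODEL.green + b * BASE_MODEL.blue + bias;
    return score <= 0 ? 'light' : 'dark';
  };
}

const defaultPicker = createTextColorPicker();
const preferDark = createTextColorPicker(1.0); // arbitrary illustrative offset
const preferLight = createTextColorPicker(-1.0); // arbitrary illustrative offset
```

With the default bias, a mid gray of (128, 128, 128) scores about -0.83 and gets light text. The +1.0 offset pushes the same color to about +0.17, so `preferDark` returns dark text for it.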

Pitfall Guide

Using WCAG for Aesthetic Decisions
- Explanation: WCAG ensures minimum contrast for accessibility. Using it for general UI text selection results in suboptimal aesthetics, as it fails to match human perception in ~14% of cases.
- Fix: Use the logistic model for dynamic UI text color selection. Reserve WCAG calculations for accessibility audits and compliance checks.
The Magenta/Orange Trap
- Explanation: The model exhibits higher error rates in the saturated magenta, pink, and orange regions of color space. In these areas, human perception is highly variable, and the model may classify colors as darker than they appear.
- Fix: Acknowledge this trade-off. For critical UI elements in these hue ranges, consider adding a heuristic override or allowing manual overrides. The error rate remains low overall, but specific saturated hues may require attention.
Computing the Sigmoid Unnecessarily
- Explanation: Implementing the full sigmoid function 1 / (1 + exp(-z)) adds unnecessary computational cost.
- Fix: Use the linear optimization z <= 0. This reduces the operation to basic arithmetic and comparison, improving throughput significantly.
Ignoring Input Validation
- Explanation: Passing unvalidated or out-of-range values can lead to incorrect classifications or unexpected behavior.
- Fix: Always clamp RGB inputs to the [0, 255] range and ensure integer values before processing.
Misinterpreting Coefficients as Luminance Weights
- Explanation: The model coefficients resemble WCAG weights but have different ratios. Treating them as perceptual luminance weights is incorrect.
- Fix: Treat the coefficients strictly as parameters of a classification boundary. Do not use them to calculate luminance values.
Overlooking Threshold Tuning
- Explanation: The default threshold is optimized for balanced accuracy. Some UI designs may benefit from a bias toward lighter or darker text.
- Fix: Adjust the bias parameter to shift the decision boundary. Increasing the bias makes the model more likely to select dark text; decreasing it favors light text.
Performance Bottlenecks in Virtualized Lists
- Explanation: In virtualized rendering, color calculations occur frequently during scroll events. Inefficient implementations can cause jank.
- Fix: Use the optimized linear check. Profile the rendering loop to ensure color calculation is not the bottleneck. The optimized model adds negligible overhead.
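
The manual override suggested for problem hues can be sketched as a lookup that runs before the model. Everything here is hypothetical: the function names, the packed-integer key, and the single table entry, which is a placeholder showing the mechanism rather than a measured correction.

```typescript
// Hypothetical override layer in front of the model. Assumes integer channels
// in 0-255; the table entry below is a placeholder, not a measured correction.
type Tone = 'light' | 'dark';

// Linear check using the model coefficients from this article.
function modelTextColor(r: number, g: number, b: number): Tone {
  const score = r * 0.027291 + g * 0.0688366 + b * 0.006275 - 13.9369834;
  return score <= 0 ? 'light' : 'dark';
}

// Pack three 8-bit channels into one integer key (0xRRGGBB).
function packRgb(r: number, g: number, b: number): number {
  return (r << 16) | (g << 8) | b;
}

// Design-team decisions for specific colors take priority over the model.
const manualOverrides = new Map<number, Tone>([
  [packRgb(255, 0, 255), 'dark'], // placeholder entry for a saturated magenta
]);

function textColorWithOverrides(r: number, g: number, b: number): Tone {
  return manualOverrides.get(packRgb(r, g, b)) ?? modelTextColor(r, g, b);
}
```

An exact-match table only covers colors known in advance, such as a fixed tag palette. Covering a whole hue range would need a range-based rule instead, which is outside this sketch.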

Production Bundle

Action Checklist

Audit existing code for WCAG luminance usage in UI text color selection.
Replace luminance calculations with the logistic regression utility.
Ensure all color inputs are clamped and validated before processing.
Benchmark the new implementation in high-throughput rendering scenarios.
Document the trade-off between WCAG compliance and aesthetic optimization.
Review saturated magenta/orange colors in the UI for potential overrides.
Adjust the bias parameter if the design system requires a text color preference.
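
For the benchmarking item, a minimal timing harness might look like the following. It is a sketch, not a rigorous benchmark: the names are hypothetical, `Date.now()` is used for portability and has only millisecond resolution, and JIT warm-up is not accounted for.

```typescript
// Hypothetical micro-benchmark harness; names are illustrative only.
type Picker = (r: number, g: number, b: number) => 'light' | 'dark';

// Linear check using the model coefficients from this article.
const logisticPicker: Picker = (r, g, b) =>
  r * 0.027291 + g * 0.0688366 + b * 0.006275 - 13.9369834 <= 0
    ? 'light'
    : 'dark';

function benchmarkPicker(
  picker: Picker,
  iterations: number
): { elapsedMs: number; lightCount: number } {
  let lightCount = 0; // accumulated so the loop body has an observable result
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    // Derive a deterministic pseudo-color from the loop counter.
    const r = i & 255;
    const g = (i >> 8) & 255;
    const b = (i >> 16) & 255;
    if (picker(r, g, b) === 'light') lightCount++;
  }
  return { elapsedMs: Date.now() - start, lightCount };
}

const benchmarkResult = benchmarkPicker(logisticPicker, 1000000);
console.log(
  `logistic picker: ${benchmarkResult.elapsedMs} ms for 1000000 calls ` +
    `(${benchmarkResult.lightCount} light)`
);
```

Any function with the same signature, such as a luminance-based picker, can be passed as the first argument to compare the two approaches on the same inputs.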

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Accessibility Audit	WCAG Relative Luminance	Required for compliance and low-vision support.	High compute, mandatory.
User-Generated Tags/Avatars	Logistic Regression Model	Higher human agreement, faster execution.	Low compute, improved UX.
Data Visualization	Logistic Regression Model	Handles large datasets efficiently; aesthetic match.	Minimal compute, scalable.
Low-Vision Mode	WCAG + High Contrast	Ensures safety and readability for impaired users.	High compute, safety critical.
High-Volume Rendering	Optimized Logistic Model	Zero transcendental ops; maximizes frame rate.	Negligible compute, performance gain.

Configuration Template

Copy this template to integrate the model into your project. Adjust the bias if needed.

// text-color.model.ts

export const TEXT_COLOR_CONFIG = {
  weights: {
    red: 0.027291,
    green: 0.0688366,
    blue: 0.006275,
  },
  // Default bias. Adjust to shift the threshold.
  // Raising the bias -> favors dark text.
  // Lowering the bias -> favors light text.
  bias: -13.9369834, 
} as const;

export type TextColor = 'light' | 'dark';

export function computeTextColor(r: number, g: number, b: number): TextColor {
  const score = 
    (r * TEXT_COLOR_CONFIG.weights.red) +
    (g * TEXT_COLOR_CONFIG.weights.green) +
    (b * TEXT_COLOR_CONFIG.weights.blue) +
    TEXT_COLOR_CONFIG.bias;
  
  return score <= 0 ? 'light' : 'dark';
}

Quick Start Guide

Integrate: Copy the TEXT_COLOR_CONFIG and computeTextColor function into your utility library.
Replace: Locate calls to WCAG luminance functions used for text color selection and replace them with computeTextColor(r, g, b).
Validate: Test with a diverse set of colors, paying attention to saturated blues, pinks, and olives where the model improves over WCAG.
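Optimize: Ensure inputs are numbers in [0, 255] before calling computeTextColor. Unlike getTextColor, the template does not round or clamp its inputs, so out-of-range values shift the score and can flip the result.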
Deploy: The model is ready for production with zero external dependencies and minimal runtime cost.
