
What 500 curated failure pairs actually fix: a breakdown across 3 seeds

By Codcompass Team · 4 min read


Current Situation Analysis

Small-scale DPO/RLHF experiments frequently suffer from signal dilution and evaluation blindness. Traditional approaches rely on either synthetic bug generation or massive preference datasets (50k+ samples), both of which introduce critical failure modes:

  • Synthetic failures lack distributional fidelity: Artificially injected bugs do not reflect the model's actual confidence gaps or reasoning boundaries, so the preference signal teaches the model to avoid injection artifacts rather than to close genuine capability gaps.
  • Aggregate metrics mask failure mode shifts: Pass@1 or pass@k scores collapse diverse failure types into a single scalar. A +3% gain could come from better algorithmic reasoning or from formatting cleanup, and it could even hide distribution collapse elsewhere (e.g., refusal spikes or new syntax errors).
  • Mismatched quality bars break contrastive learning: Pairing model outputs against external "ideal" answers introduces domain shift. The model learns to mimic a different distribution rather than correcting its own failure modes.
  • Single-seed evaluation creates false confidence: Coding benchmarks like HumanEval (164 problems) are small enough that each problem is worth about 0.61 percentage points of pass@1, so integer-count ties and seed variance produce misleading deltas. Without multi-seed validation, small improvements are statistically indistinguishable from noise.
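The last three points can be illustrated with a minimal sketch that labels each evaluated completion with a failure category and then reports pass@1 together with per-category counts for every seed. Everything here is an assumption made for illustration: the helper names `classify_failure`, `run_candidate`, and `summarize` are hypothetical, the `SYNTAX_ERROR` and `OTHER` labels are not taken from the article (only `NAME_ERROR` and `ASSERTION_FAIL` appear in it), and `exec` is used without any sandboxing, which a real harness would need.

```python
from collections import Counter
from statistics import mean, stdev
from typing import Dict, List


def classify_failure(exc: Exception) -> str:
    """Map an exception raised during evaluation to a coarse failure label."""
    if isinstance(exc, SyntaxError):
        return "SYNTAX_ERROR"      # assumed label, not from the article
    if isinstance(exc, NameError):
        return "NAME_ERROR"
    if isinstance(exc, AssertionError):
        return "ASSERTION_FAIL"
    return "OTHER"                 # assumed catch-all label


def run_candidate(code: str, test: str) -> str:
    """Run a candidate solution and its test in a fresh namespace.

    WARNING: plain exec() is NOT sandboxed; illustration only.
    """
    namespace: Dict[str, object] = {}
    try:
        exec(code, namespace)
        exec(test, namespace)
    except Exception as exc:
        return classify_failure(exc)
    return "PASS"


def summarize(labels_by_seed: Dict[int, List[str]]) -> Dict[str, object]:
    """Report pass@1 mean/stdev across seeds plus failure counts per seed."""
    pass_rates = [
        100.0 * labels.count("PASS") / len(labels)
        for labels in labels_by_seed.values()
    ]
    return {
        "pass_at_1_mean": mean(pass_rates),
        # stdev needs at least two seeds; report 0.0 for a single seed
        "pass_at_1_stdev": stdev(pass_rates) if len(pass_rates) > 1 else 0.0,
        "failure_counts": {
            seed: Counter(label for label in labels if label != "PASS")
            for seed, labels in labels_by_seed.items()
        },
    }
```

Comparing `failure_counts` for the base and the tuned model, seed by seed, shows which categories actually moved, which a single pass@1 number cannot.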

WOW Moment: Key Findings

The experiment isolates the impact of 500 curated preference pairs in which both the chosen and the rejected side originate from the same internal validation pipeline. Because both sides come from the same distribution, the contrastive signal compares an honest failure with an honest success rather than with an external idealization.
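One way such same-pipeline pairs could be assembled is sketched below. This is not the article's curation procedure, which is not described in the visible text: the `Sample` record, the `build_pairs` helper, and the choice of one pair per prompt are all hypothetical. The `prompt` / `chosen` / `rejected` keys follow the layout commonly used for preference datasets, which is also an assumption about the training setup.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Sample:
    """One model completion with its verdict from the validation pipeline."""
    prompt: str
    completion: str
    passed: bool


def build_pairs(
    samples: List[Sample], max_pairs: int = 500, seed: int = 0
) -> List[Dict[str, str]]:
    """Pair a passing and a failing completion for the same prompt.

    Both sides come from the same pool of samples, so no external
    reference answers are mixed in. Prompts that have only passes or
    only failures are skipped because they offer no contrast.
    """
    by_prompt: Dict[str, Dict[bool, List[str]]] = {}
    for s in samples:
        buckets = by_prompt.setdefault(s.prompt, {True: [], False: []})
        buckets[s.passed].append(s.completion)

    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    pairs: List[Dict[str, str]] = []
    for prompt in sorted(by_prompt):
        buckets = by_prompt[prompt]
        if not buckets[True] or not buckets[False]:
            continue
        pairs.append({
            "prompt": prompt,
            "chosen": rng.choice(buckets[True]),
            "rejected": rng.choice(buckets[False]),
        })

    rng.shuffle(pairs)
    return pairs[:max_pairs]
```

Limiting the output to one pair per prompt is a design choice made here to keep any single problem from dominating the set; a real curation step would likely add further filtering by failure category.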

Approach | Pass@1 (Mean) | NAME_ERROR Count | ASSERTION_FAIL Count
Base (Qwen2.5-Coder-3B) | 80.4
