Building a free Polymarket screener: how I turned 13,963 markets into a single scannable page
Architecting a Zero-Cost Prediction Market Scanner: Static Generation Over Real-Time Streams
Current Situation Analysis
Prediction market interfaces are fundamentally optimized for execution, not discovery. When you land on a typical platform, you're greeted by order books, candlestick charts, and bet slips designed for a single asset. This depth-first design creates a critical blind spot: there is no native mechanism to scan across the entire market universe for anomalies, momentum shifts, or liquidity pockets. If you need to identify which contracts dropped 15 percentage points overnight, or which low-priced assets are accumulating genuine volume, you're forced to manually browse or rely on third-party aggregators that often obscure their methodology.
This gap persists because developers conflate screening with execution. Real-time data pipelines, WebSocket connections, and low-latency order routing are essential when placing trades, but they introduce unnecessary complexity when the goal is simply to identify setups. The industry overlooks a simple truth: market discovery operates on a different time horizon than trade execution. You don't need millisecond precision to spot a trend; you need reliable, cross-sectional visibility.
The data reality reinforces this architectural mismatch. Platforms like Polymarket index tens of thousands of historical contracts, but the actively tradable universe is a fraction of that total. At any given moment, roughly 1,200 markets remain open and liquid. The remaining ~12,800 are resolved, expired, or suspended. Building real-time infrastructure to monitor all ~14,000 markets is computationally wasteful when roughly 90% of them are static. A screening architecture should reflect this asymmetry: lightweight, batch-oriented, and focused exclusively on the active subset.
WOW Moment: Key Findings
The most counterintuitive insight in building a market scanner is that data staleness is a feature, not a liability. When comparing architectural approaches for cross-market discovery, the trade-offs become stark:
| Approach | Infrastructure Cost | Data Freshness | Development Complexity | Screening Suitability |
|---|---|---|---|---|
| Static Regeneration (2-4 hr cycles) | $0 (CDN/Storage) | 2-4 hours | Low | High |
| Real-Time WebSocket Stream | $50-$200/mo (Compute + Bandwidth) | <100 ms | High | Low |
| Server-Side Backend + DB | $30-$100/mo (Compute + Storage) | 1-5 min | Medium | Medium |
Why this matters: Screening is a high-signal, low-frequency activity. You're looking for structural shifts (volume spikes, mean-reversion candidates, or liquidity migrations), not tick-by-tick price movements. A static regeneration pipeline eliminates database overhead, removes authentication requirements, and guarantees deterministic outputs. The 2-4 hour refresh window aligns with human decision cycles. You scan, analyze, and plan. Execution happens later, on the primary platform. This architecture decouples discovery from trading, reducing both cost and cognitive load.
Core Solution
Building a reliable scanner requires three distinct phases: data acquisition, signal transformation, and static compilation. Each phase must be isolated to ensure maintainability, testability, and predictable deployment.
Phase 1: Paginated Data Acquisition
The public Gamma API (https://gamma-api.polymarket.com/markets) returns market metadata in paginated JSON. It requires no authentication but enforces a maximum page size. The acquisition layer must handle offset-based pagination gracefully, respecting implicit rate limits and implementing timeout controls for transient network failures.
```typescript
import fetch from 'node-fetch';

// Subset of the fields returned per market. Field names follow this article;
// confirm them against a live response before relying on them.
interface MarketPayload {
  condition_id: string;
  group_item_title: string;
  active: boolean;
  volume24hr: number;
  one_day_change: number;
  last_trade_price: number;
}

class GammaDataFetcher {
  private readonly endpoint = 'https://gamma-api.polymarket.com/markets';
  private readonly batchSize = 500;

  async fetchActiveUniverse(): Promise<MarketPayload[]> {
    const allMarkets: MarketPayload[] = [];
    let currentOffset = 0;

    while (true) {
      const params = new URLSearchParams({
        closed: 'false',
        limit: String(this.batchSize),
        offset: String(currentOffset),
      });

      const response = await fetch(`${this.endpoint}?${params}`, {
        headers: { Accept: 'application/json' },
        signal: AbortSignal.timeout(15000), // abort a hung request after 15 s
      });

      if (!response.ok) {
        throw new Error(`Gamma API failed with status ${response.status}`);
      }

      // json() is untyped; the cast trusts the payload shape, so validate downstream.
      const batch = (await response.json()) as MarketPayload[];
      if (batch.length === 0) {
        break; // an empty page means the listing is exhausted
      }

      allMarkets.push(...batch);
      // Advance by the number of rows actually received, so no records are
      // skipped if the API caps the page size below batchSize.
      currentOffset += batch.length;

      // Polite delay to respect implicit rate limits
      await new Promise((resolve) => setTimeout(resolve, 800));
    }

    return allMarkets;
  }
}
```
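The acquisition loop above fails on the first transient error. As a minimal sketch of the retry behavior this phase calls for, the helper below retries an async operation with capped exponential backoff. The names `withRetry`, `backoffDelay`, and `RetryOptions` are hypothetical and not part of any Polymarket tooling; the delay values are illustrative assumptions.

```typescript
// Hypothetical helper names; delays are illustrative, not documented API limits.
type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

interface RetryOptions {
  maxAttempts: number; // total tries, including the first
  baseDelayMs: number; // wait before the first retry
  maxDelayMs: number;  // upper bound on any single wait
  sleep?: Sleep;       // injectable so tests do not have to wait
}

// Wait before retry number `retryIndex` (0-based): base * 2^retryIndex, capped.
function backoffDelay(retryIndex: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, retryIndex));
}

async function withRetry<T>(operation: () => Promise<T>, options: RetryOptions): Promise<T> {
  const sleep = options.sleep ?? defaultSleep;
  let lastError: unknown;

  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === options.maxAttempts - 1;
      if (isLastAttempt) break;
      await sleep(backoffDelay(attempt, options.baseDelayMs, options.maxDelayMs));
    }
  }

  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

One way to use it is to wrap each page request, for example `withRetry(() => fetch(url), { maxAttempts: 4, baseDelayMs: 1000, maxDelayMs: 8000 })`, which would wait 1 s, 2 s, and 4 s between the four attempts.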
Phase 2: Signal Generation & Filtering
Raw market data is noisy. The transformation layer applies business logic to isolate actionable signals. Two primary rankings emerge: momentum movers and liquidity leaders. Crucially, price changes must be filtered against volume thresholds to eliminate illiquid "dust" markets that produce false signals.
```typescript
interface ScreenedMarket extends MarketPayload {
  signal_type: 'mover' | 'volume_leader' | 'crash_proxy';
  score: number;
}

class MarketTransformer {
  private readonly minVolumeThreshold = 1000; // minimum 24h volume to count as liquid
  private readonly crashThreshold = -0.15;    // decimal fraction: -0.15 = a 15-point drop

  transform(rawMarkets: MarketPayload[]): ScreenedMarket[] {
    // Drop illiquid "dust" markets before any ranking.
    const liquidMarkets = rawMarkets.filter(m => m.volume24hr > this.minVolumeThreshold);

    // Largest absolute one-day moves first.
    const movers: ScreenedMarket[] = liquidMarkets
      .map(m => ({ ...m, signal_type: 'mover' as const, score: Math.abs(m.one_day_change) }))
      .sort((a, b) => b.score - a.score);

    // Highest 24h volume first.
    const volumeLeaders: ScreenedMarket[] = liquidMarkets
      .map(m => ({ ...m, signal_type: 'volume_leader' as const, score: m.volume24hr }))
      .sort((a, b) => b.score - a.score);

    // Markets that fell by at least the crash threshold in one day.
    const crashProxies: ScreenedMarket[] = liquidMarkets
      .filter(m => m.one_day_change <= this.crashThreshold)
      .map(m => ({ ...m, signal_type: 'crash_proxy' as const, score: Math.abs(m.one_day_change) }));

    // A market can appear once per signal type; use signal_type to tell the entries apart.
    return [...movers, ...volumeLeaders, ...crashProxies];
  }
}
```
Phase 3: Static Compilation
The final phase converts the transformed dataset into a self-contained HTML document. This eliminates server runtime, database queries, and authentication layers. The output is a single index.html file with embedded JSON data and client-side rendering logic. Deployment reduces to a git push to a static hosting provider.
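A minimal sketch of this compilation step, assuming a row shape that reuses the article's field names. The function names `toEmbeddableJson` and `renderPage`, the element ids, and the table layout are all hypothetical. The one detail worth noting is that `<` is escaped in the embedded JSON so a market title containing `</script>` cannot terminate the data block early.

```typescript
// Hypothetical row shape and function names, shown for illustration only.
interface TableRow {
  group_item_title: string;
  signal_type: string;
  last_trade_price: number;
  one_day_change: number;
  volume24hr: number;
}

// Serialize data for embedding inside a <script> element. "<" only occurs
// inside JSON strings, where the \u003c escape decodes back to "<".
function toEmbeddableJson(value: unknown): string {
  return JSON.stringify(value).replace(/</g, '\\u003c');
}

function renderPage(rows: TableRow[], generatedAt: Date): string {
  return `<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Market screener</title>
</head>
<body>
<p>Generated at ${generatedAt.toISOString()}</p>
<table id="screener"><thead><tr>
<th>Market</th><th>Signal</th><th>Price</th><th>1d change</th><th>24h volume</th>
</tr></thead><tbody></tbody></table>
<script id="market-data" type="application/json">${toEmbeddableJson(rows)}</script>
<script>
  const rows = JSON.parse(document.getElementById('market-data').textContent);
  const tbody = document.querySelector('#screener tbody');
  for (const row of rows) {
    const tr = document.createElement('tr');
    const cells = [row.group_item_title, row.signal_type, row.last_trade_price,
                   row.one_day_change, row.volume24hr];
    for (const value of cells) {
      const td = document.createElement('td');
      td.textContent = String(value); // textContent avoids HTML injection
      tr.appendChild(td);
    }
    tbody.appendChild(tr);
  }
</script>
</body>
</html>`;
}
```

Writing the result to disk is then a single call such as `fs.writeFileSync('docs/index.html', renderPage(rows, new Date()))`. Printing the generation timestamp on the page keeps the staleness of the data visible to the reader.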
Architecture Decisions & Rationale:
- TypeScript over Python: Provides strict typing for market payloads, reducing runtime errors during transformation and making refactoring safer.
- Batch pagination: Prevents memory exhaustion and aligns with API constraints. Offset-based iteration is predictable and easy to debug.
- Volume-first filtering: Ensures price movements are evaluated against actual liquidity, not theoretical spreads. A 20% swing on $5 volume is noise; the same swing on $50,000 volume is a signal.
- Static output: Guarantees zero infrastructure cost and deterministic behavior. The 2-4 hour refresh cycle matches human scanning patterns, not algorithmic trading loops.
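To make the volume-first rule concrete, here is a small worked example with made-up sample data. The `Quote` type and `rankMovers` function are hypothetical names for illustration; the field names and the 1,000 volume floor mirror the ones used in this article.

```typescript
// Hypothetical type and function; sample values are invented for illustration.
interface Quote {
  group_item_title: string;
  volume24hr: number;     // 24h volume in dollars
  one_day_change: number; // decimal fraction of price: 0.20 = 20 percentage points
}

// Rank by absolute one-day move, but only among markets above the volume floor.
// filter() returns a new array, so the caller's input is not reordered.
function rankMovers(quotes: Quote[], minVolume: number): Quote[] {
  return quotes
    .filter((q) => q.volume24hr > minVolume)
    .sort((a, b) => Math.abs(b.one_day_change) - Math.abs(a.one_day_change));
}

const sample: Quote[] = [
  { group_item_title: 'Dust market', volume24hr: 5, one_day_change: 0.2 },
  { group_item_title: 'Liquid market', volume24hr: 50000, one_day_change: 0.2 },
  { group_item_title: 'Quiet market', volume24hr: 12000, one_day_change: -0.03 },
];

const ranked = rankMovers(sample, 1000);
console.log(ranked.map((q) => q.group_item_title));
// → [ 'Liquid market', 'Quiet market' ]
```

Both 20-point swings are identical in size, but only the one backed by $50,000 of volume survives the filter. Without the floor, the dust market would tie for first place.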
Pitfall Guide
- Pagination Drift & Offset Limits. Explanation: Relying on `offset` without verifying batch size can cause duplicate or missing records if the dataset mutates between calls. New markets opening or closing during a fetch cycle shift the offset window. Fix: Treat a short page (`batch.length < batchSize`) as the termination condition, but only after confirming that `batchSize` does not exceed the API's maximum page size; otherwise a capped first page ends the loop early. Implement idempotent writes or checksums if storing locally. Consider fetching by `updated_at` timestamps when the API supports it.
- Ignoring Liquidity Thresholds. Explanation: Sorting by price change alone surfaces illiquid markets where a single small trade can swing prices by 20%. These are statistical noise, not actionable signals. Fix: Enforce a minimum `volume24hr` filter (e.g., >1000) before applying any ranking logic. Document the threshold clearly in UI labels.
- Misinterpreting Percent vs. Percentage-Point Changes. Explanation: A 0.15 change represents 15 percentage points, not 15%. A contract moving from 0.30 to 0.45 has gained 15 percentage points but 50% in relative terms. Confusing the two leads to incorrect threshold configuration and misaligned user expectations. Fix: Explicitly document and label thresholds as decimal fractions (0.15 = 15pp) in both code and UI. Use consistent formatting across all displays.
- Over-Engineering for Real-Time Latency. Explanation: Building WebSocket listeners or Redis caches for a screening tool introduces unnecessary complexity. Screening doesn't require sub-second updates. Fix: Stick to batch regeneration. Reserve real-time architectures for execution or monitoring dashboards. Keep the scanner stateless.
- Missing API Rate Limit & Retry Logic. Explanation: The Gamma API doesn't publish explicit rate limits, but aggressive polling triggers temporary blocks or timeouts. Network blips can crash the pipeline. Fix: Implement exponential backoff, fixed delays between pages, and circuit breakers for consecutive failures. Log response times to detect degradation early.
- Hardcoding Market Categories. Explanation: Filtering by keyword or hardcoded titles breaks when platforms restructure categories or introduce new event types. Fix: Use dynamic tagging systems or allow configuration-driven filters. Parse `group_item_title` generically and apply regex or semantic matching only when necessary.
- Neglecting Data Validation on Price Fields. Explanation: API responses occasionally return `null` or malformed numbers for `last_trade_price` or `one_day_change`, crashing the transformation pipeline. Fix: Apply strict type guards and fallback defaults during the fetch phase. Never assume API payloads are perfectly shaped. Sanitize before sorting.
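Two of the pitfalls above, duplicate records from pagination drift and null or malformed numeric fields, can be handled in one pass before the transformer runs. The sketch below is one possible approach, not the article's implementation: `sanitizeMarkets` and `toFiniteNumber` are hypothetical names, and the policy choices (drop rows without a usable price or change, default missing volume to 0, keep the latest copy of a duplicate) are assumptions you may want to change.

```typescript
// Hypothetical helpers; the drop/default policy is an assumption, not a documented rule.
interface CleanMarket {
  condition_id: string;
  group_item_title: string;
  volume24hr: number;
  one_day_change: number;
  last_trade_price: number;
}

// Coerce a value to a finite number, or return null if it cannot be trusted.
function toFiniteNumber(value: unknown): number | null {
  if (typeof value === 'number') return Number.isFinite(value) ? value : null;
  if (typeof value === 'string' && value.trim() !== '') {
    const parsed = Number(value);
    return Number.isFinite(parsed) ? parsed : null;
  }
  return null;
}

function sanitizeMarkets(raw: unknown[]): CleanMarket[] {
  // Keyed by condition_id so a record fetched twice is stored once (last copy wins).
  const byId = new Map<string, CleanMarket>();

  for (const item of raw) {
    if (typeof item !== 'object' || item === null) continue;
    const record = item as Record<string, unknown>;

    const id = record.condition_id;
    if (typeof id !== 'string' || id === '') continue;

    const price = toFiniteNumber(record.last_trade_price);
    const change = toFiniteNumber(record.one_day_change);
    if (price === null || change === null) continue; // drop rather than guess a price

    byId.set(id, {
      condition_id: id,
      group_item_title: typeof record.group_item_title === 'string' ? record.group_item_title : '',
      volume24hr: toFiniteNumber(record.volume24hr) ?? 0, // missing volume falls below any floor
      one_day_change: change,
      last_trade_price: price,
    });
  }

  return Array.from(byId.values());
}
```

Defaulting a missing volume to 0 means such a market is removed later by the liquidity filter instead of crashing the sort, which keeps the failure mode quiet but visible in the row counts.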
Production Bundle
Action Checklist
- Verify API pagination limits and implement offset-based iteration with batch validation
- Apply liquidity filters before any sorting or signal generation to eliminate dust markets
- Implement exponential backoff and request timeouts for all external API calls
- Decouple data fetching from UI rendering to enable independent testing and regeneration
- Document threshold logic explicitly (e.g., 0.15 = 15 percentage points, not 15%)
- Add data validation guards for nullable or malformed price fields
- Schedule regeneration via CI/CD or cron jobs rather than manual execution
- Monitor API response times and implement alerting for sustained degradation
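For the checklist item on documenting threshold logic, a small formatting helper keeps the percentage-point convention consistent wherever a change is displayed. This is a sketch with hypothetical function names (`formatChangePp`, `relativeChangePercent`), assuming changes are stored as decimal fractions of price as elsewhere in this article.

```typescript
// Hypothetical helpers; assumes one_day_change is a decimal fraction of price.

// Format a decimal price change as percentage points: 0.15 -> "+15.0 pp".
function formatChangePp(change: number): string {
  const points = change * 100;
  const sign = points > 0 ? '+' : '';
  return `${sign}${points.toFixed(1)} pp`;
}

// Relative change in percent, for contrast with the percentage-point figure.
function relativeChangePercent(previousPrice: number, currentPrice: number): number {
  return ((currentPrice - previousPrice) / previousPrice) * 100;
}

console.log(formatChangePp(0.15));  // "+15.0 pp"
console.log(formatChangePp(-0.15)); // "-15.0 pp"
```

A contract that moves from 0.30 to 0.45 shows as "+15.0 pp" here, while `relativeChangePercent(0.30, 0.45)` is approximately 50. Labeling the unit in the UI avoids readers mistaking one for the other.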
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Cross-market discovery & setup scanning | Static regeneration (2-4 hr) | Matches human decision cycles, eliminates infrastructure overhead | $0 |
| Live execution & order placement | Real-time WebSocket + CLOB API | Requires sub-second latency and authenticated routing | $50-$200/mo |
| Historical backtesting & signal validation | Batch export to local database | Enables complex queries, time-series analysis, and reproducible research | $10-$30/mo (storage) |
| Alerting on threshold breaches | Event-driven webhook + static base | Triggers only when conditions are met, avoids constant polling | $5-$15/mo |
Configuration Template
```json
{
  "api": {
    "endpoint": "https://gamma-api.polymarket.com/markets",
    "batch_size": 500,
    "request_timeout_ms": 15000,
    "delay_between_pages_ms": 800
  },
  "filters": {
    "exclude_closed": true,
    "min_volume_24h": 1000,
    "crash_signal_threshold": -0.15,
    "mover_sort_field": "one_day_change"
  },
  "output": {
    "format": "static_html",
    "regeneration_interval_hours": 3,
    "include_crash_proxies": true,
    "max_displayed_movers": 50
  },
  "infrastructure": {
    "hosting": "github_pages",
    "ci_trigger": "cron",
    "cost_projection": "zero"
  }
}
```
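A configuration file like this is easy to mistype, for example by entering the crash threshold as a positive 0.15. The following sketch shows one way to load and sanity-check the `api` and `filters` sections at startup. The `ScreenerConfig` interface, `validateConfig`, and `loadConfig` are hypothetical names, and the specific bounds checked are assumptions rather than rules stated by the API.

```typescript
// Hypothetical types and functions covering only the api and filters sections.
interface ScreenerConfig {
  api: {
    endpoint: string;
    batch_size: number;
    request_timeout_ms: number;
    delay_between_pages_ms: number;
  };
  filters: {
    exclude_closed: boolean;
    min_volume_24h: number;
    crash_signal_threshold: number;
    mover_sort_field: string;
  };
}

// Returns a list of human-readable problems; an empty list means the config passed.
function validateConfig(config: ScreenerConfig): string[] {
  if (!config.api || !config.filters) return ['config must contain api and filters sections'];
  const { api, filters } = config;
  const problems: string[] = [];

  if (typeof api.endpoint !== 'string' || !api.endpoint.startsWith('https://')) {
    problems.push('api.endpoint must be an https URL');
  }
  if (!Number.isInteger(api.batch_size) || api.batch_size <= 0) {
    problems.push('api.batch_size must be a positive integer');
  }
  if (!(api.request_timeout_ms > 0)) problems.push('api.request_timeout_ms must be positive');
  if (!(api.delay_between_pages_ms >= 0)) problems.push('api.delay_between_pages_ms must be non-negative');
  if (!(filters.min_volume_24h >= 0)) problems.push('filters.min_volume_24h must be non-negative');
  // Prices live in [0, 1], so a one-day drop is a decimal fraction in [-1, 0).
  if (!(filters.crash_signal_threshold >= -1 && filters.crash_signal_threshold < 0)) {
    problems.push('filters.crash_signal_threshold must be in [-1, 0)');
  }
  return problems;
}

function loadConfig(jsonText: string): ScreenerConfig {
  const config = JSON.parse(jsonText) as ScreenerConfig;
  const problems = validateConfig(config);
  if (problems.length > 0) {
    throw new Error(`Invalid config: ${problems.join('; ')}`);
  }
  return config;
}
```

Failing fast at load time means a bad threshold stops the regeneration job instead of silently publishing an empty or misleading page.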
Quick Start Guide
- Initialize the project: Create a new TypeScript project (`npm init -y && npm install typescript node-fetch @types/node-fetch --save-dev`) and configure `tsconfig.json` for ES modules with strict mode enabled.
- Implement the fetcher: Copy the `GammaDataFetcher` class into `src/fetcher.ts`. Run a dry execution to verify pagination, payload structure, and timeout behavior.
- Build the transformer: Add the `MarketTransformer` class to `src/transformer.ts`. Apply volume filters, generate ranked arrays, and validate threshold logic against sample data.
- Compile static output: Write a simple HTML generator that embeds the transformed JSON into a `<script>` tag and renders a responsive table using vanilla DOM manipulation or a lightweight templating engine.
- Deploy & schedule: Push the generated `docs/index.html` to a repository configured for static hosting. Schedule regeneration using GitHub Actions, GitLab CI, or a local cron job. Verify the pipeline runs successfully and outputs match expectations.
