Manifest-First Indie Dev: Why I Replaced All My INDEX.md Files with YAML Frontmatter
Current Situation Analysis
Indie developers and solo technical operators typically manage 30-50+ markdown files across directories like reports/, products/, and INBOX/. The traditional approach relies on a manually maintained INDEX.md to catalog assets. This pattern fails rapidly due to three core failure modes:

- Metadata Decoupling & Stale Indexes: `INDEX.md` lives outside the asset. Any file rename, move, or status update requires manual synchronization. Within 48 hours, the index drifts from reality, forcing daily maintenance overhead (~30 min/day).
- Shallow Auto-Generation: Filename-based auto-indexing scripts eliminate manual work but strip semantic context. Filenames cannot encode priority, ETA, revenue forecasts, execution commands, or dependency graphs. The resulting index is structurally flat and programmatically useless.
- Query & View Rigidity: Traditional indexes are static text. Extracting programmatic views (e.g., `P0` todos ordered by `eta_min`, or dependency chains for launch blockers) requires custom parsing logic per request. There is no single source of truth, making dashboards, revenue tracking, and cross-referencing brittle.

The break-even threshold for this pattern is ~20-30 files. Beyond that, manual curation becomes a tax on shipping velocity, while auto-generated indexes lack the semantic depth required for operational decision-making.
WOW Moment: Key Findings
| Approach | Daily maintenance | Semantic fields per file | Drift and queryability |
|---|---|---|---|
| Manual INDEX.md | 30-45 min/day | 1-2 | ~60% sync drift after 1 week |
| Filename Auto-INDEX | ~5 min/day (script run) | 0 | ~95% structural drift, no query capability |
| Manifest-First (YAML + Scanner) | 0 min/day (on-demand) | 13 | 0% drift, instant programmatic queries |
Key Findings:
- Overhead Elimination: Shifting metadata into YAML frontmatter and generating views on-demand via a 200-line Python scanner eliminates daily index maintenance entirely.
- Semantic Density: 13 standardized fields (5 required, 8 optional) cover 95% of indie asset metadata, enabling revenue forecasting, execution routing, and dependency mapping.
- ROI Timeline: ~9 hours of upfront migration (schema design, scanner build, manual/helper migration) yields ~30 min/day savings, achieving full payback within 1 month.
- Architectural Sweet Spot: Optimal for solo/indie scale (1-3 files/authors, <100 assets). Multi-person teams require strict schema governance to avoid merge conflicts and validation drift.
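
The payback claim in the ROI finding is simple arithmetic, and a quick check makes it concrete. The sketch below only restates the figures quoted above (~9 hours upfront, ~30 min/day saved); the function name `payback_days` is a hypothetical helper, not part of the toolkit.

```python
def payback_days(upfront_hours: float, saved_min_per_day: float) -> float:
    """Days of use needed before the time saved covers the upfront cost."""
    if saved_min_per_day <= 0:
        raise ValueError("saved_min_per_day must be positive")
    return upfront_hours * 60 / saved_min_per_day


# 9 hours upfront, 30 minutes saved per day
print(payback_days(9, 30))  # 18.0
```

At 18 days, the migration pays for itself comfortably inside the one-month window stated above.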
Core Solution
The manifest-first pattern treats every file as a self-describing node. Metadata is embedded as YAML frontmatter, parsed by a lightweight scanner, and rendered into dynamic dashboard views.
1. Schema Design (5 Required + 8 Optional)
The schema enforces a consistent contract while remaining extensible. New categories or fields do not break existing assets; the scanner only consumes what is defined.
```yaml
---
id: ios-pricing-decision            # URL-safe slug, unique
title: iOS Pricing Decision Report
category: ios-pricing               # one of N enums
priority: P0                        # P0 / P1 / P2 / P3
status: ready                       # ready / scaffold / draft / done / archived
# Optional below
eta_min: 30                         # minutes user-action time
revenue_usd_month: "200-500"        # forecast revenue contribution
actions: [preview, copy-clipboard]  # dashboard buttons to surface
tags: [ios, pricing, paywall]       # free-form
ice_score: 6.48                     # Impact × Confidence × Ease
tier_price_usd: 19.0                # if it's a paid SKU
command: python orchestrator/foo.py # if runnable
depends_on: [other-asset-id]        # graph relationships
live_url: https://gumroad.com/...   # if it's been published
---
```
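
One way to picture the contract is a small normalization step that checks the required fields, validates the two enums, and silently ignores keys the scanner does not know. This is a minimal sketch under assumptions: `normalize_manifest` and the constant names are hypothetical, and the field and enum lists are copied from the example header above rather than from the real scanner.

```python
REQUIRED_FIELDS = ("id", "title", "category", "priority", "status")
# Optional keys as they appear in the example header above.
OPTIONAL_FIELDS = (
    "eta_min", "revenue_usd_month", "actions", "tags", "ice_score",
    "tier_price_usd", "command", "depends_on", "live_url",
)
PRIORITIES = {"P0", "P1", "P2", "P3"}
STATUSES = {"ready", "scaffold", "draft", "done", "archived"}


def normalize_manifest(raw: dict) -> tuple[dict, list[str]]:
    """Return (known fields only, list of validation errors)."""
    errors = [f"missing required field: {name}"
              for name in REQUIRED_FIELDS if name not in raw]
    if "priority" in raw and raw["priority"] not in PRIORITIES:
        errors.append(f"invalid priority: {raw['priority']!r}")
    if "status" in raw and raw["status"] not in STATUSES:
        errors.append(f"invalid status: {raw['status']!r}")
    # Unknown keys are dropped rather than rejected, so new fields
    # added to some files do not break older scanner versions.
    known = {k: v for k, v in raw.items()
             if k in REQUIRED_FIELDS or k in OPTIONAL_FIELDS}
    return known, errors
```

Returning the errors instead of raising lets the caller decide whether a bad manifest should be skipped, logged, or treated as fatal.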
2. Scanner Architecture (200-line Python)
The scanner traverses the directory tree, filters excluded paths, parses frontmatter, and maps raw YAML to a strict @dataclass. It supports .md, .py (docstring manifests), .sh (comment blocks), and .yaml (directory-level manifests).
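```python
import os
from dataclasses import dataclass, field
from pathlib import Path

# Assumed values for illustration; point these at your own repo layout.
ROOT = Path(".")
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


@dataclass
class Asset:
    # Required frontmatter fields
    id: str
    title: str
    category: str
    priority: str
    status: str
    # Not in the schema header; presumably derived from the file location
    path: str
    file_type: str
    # Optional frontmatter fields
    eta_min: int | None = None
    revenue_usd_month: str | None = None
    actions: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    ice_score: float | None = None
    tier_price_usd: float | None = None
    command: str | None = None
    depends_on: list[str] = field(default_factory=list)
    live_url: str | None = None


def scan_assets() -> list[Asset]:
    """Walk ROOT and collect every file that carries a valid manifest.

    parse_frontmatter() and manifest_to_asset() are defined elsewhere
    in the scanner.
    """
    discovered = []
    for current_dir, dirnames, filenames in os.walk(ROOT):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for filename in filenames:
            file_path = Path(current_dir) / filename
            manifest = parse_frontmatter(file_path)
            if manifest:
                asset = manifest_to_asset(manifest, file_path)
                if asset:
                    discovered.append(asset)
    return discovered
```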
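
The scanner above leans on `parse_frontmatter`, which is not shown. A minimal sketch of the markdown case might look like the following. It is an illustration, not the toolkit's implementation: `split_frontmatter` is a hypothetical helper, the `load_yaml` parameter is an assumption added here to keep the block dependency-free, and the docstring, comment-block, and directory-level manifest formats are not covered.

```python
from pathlib import Path
from typing import Callable, Optional


def split_frontmatter(text: str) -> Optional[str]:
    """Return the raw YAML between the leading '---' lines, or None if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i])
    return None  # opening delimiter without a closing one


def parse_frontmatter(path: Path,
                      load_yaml: Callable[[str], object]) -> Optional[dict]:
    """Parse a file's manifest; log and return None instead of raising."""
    try:
        block = split_frontmatter(path.read_text(encoding="utf-8"))
        if block is None:
            return None
        data = load_yaml(block)
    except Exception as exc:  # unreadable file, bad encoding, or invalid YAML
        print(f"[scanner] skipped {path}: {exc}")
        return None
    # A manifest must be a mapping; scalars and lists are ignored.
    return data if isinstance(data, dict) else None
```

In practice the loader would typically be `yaml.safe_load` from PyYAML, for example `parse_frontmatter(Path("reports/foo.md"), yaml.safe_load)`, where the path is only a placeholder.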
3. Dashboard & View Generation
The scanner output feeds a stateless rendering layer that generates three primary panels on-demand:
- TODO Panel: Filters `category: todo` or `priority: P0` where `status != done`. Ordered by `eta_min`.
- Check Panel: Aggregates `category: idea` and `category: roadmap` for review cycles.
- Run Panel: Surfaces assets with a `command:` field, enabling one-click execution via Flask/cron.
- Cross-Cutting Views: Categories grid, LIVE panel (auto-extracts `live_url`), and Revenue tracker (pulls MRR per channel via dedicated aggregator). Dependency graphs are constructed by resolving `depends_on` references at scan time.
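
Because the views are stateless, each panel can be a plain function over the scanner output. The sketch below shows the TODO and Run panels under assumptions: `Item` is a trimmed stand-in for the scanner's `Asset`, the function names are hypothetical, the TODO filter is read as "(todo or P0) and not done", and placing assets without an ETA last is a choice made here, not something the source specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:  # trimmed stand-in for the scanner's Asset
    id: str
    category: str
    priority: str
    status: str
    eta_min: Optional[int] = None
    command: Optional[str] = None


def todo_panel(assets: list[Item]) -> list[Item]:
    """category == todo OR priority == P0, excluding done; shortest ETA first."""
    picked = [a for a in assets
              if (a.category == "todo" or a.priority == "P0")
              and a.status != "done"]
    # Assets without an ETA sort after all assets that have one.
    return sorted(picked, key=lambda a: (a.eta_min is None, a.eta_min or 0))


def run_panel(assets: list[Item]) -> list[Item]:
    """Every asset that declares a command."""
    return [a for a in assets if a.command]
```

Each new view is then one more small function over the same list, with no index file to regenerate.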
Pitfall Guide
- Schema Versioning Neglect: YAML frontmatter lacks native versioning. As the schema evolves, older assets may fail validation or misalign with new scanner logic. Best Practice: Add `schema_version: 1` to new assets and implement scanner-side migration functions that normalize legacy fields before mapping.
- Orphaned Dependency References: `depends_on` relies on raw string IDs. Typos or deleted assets create silent graph breaks. Best Practice: Implement a cross-asset health check during the scan phase that validates all `depends_on` targets exist in the discovered asset list and logs warnings for unresolved references.
- Category Trait Enforcement Gaps: Relying on convention for required fields per category leads to inconsistent data (e.g., missing `tier_price_usd` for `gumroad-sku`). Best Practice: Build a trait-based validation layer that enforces field requirements dynamically based on the `category` enum, failing fast on schema violations.
- Scanner Performance Degradation: `os.walk` on deep or large trees causes I/O bottlenecks and redundant YAML parsing. Best Practice: Strictly maintain `SKIP_DIRS`, implement file-type whitelisting, and cache parsed manifests in memory or a lightweight SQLite/JSON store to avoid re-parsing unchanged files on subsequent runs.
- Multi-User Schema Drift: This pattern assumes solo/indie scale. Team collaboration introduces merge conflicts, inconsistent field usage, and schema disagreements. Best Practice: Restrict to single-owner repos or enforce a strict PR-based schema review process with pre-commit hooks that validate frontmatter against the canonical schema.
- Malformed Frontmatter Breakage: Invalid YAML syntax or incorrect indentation halts the scanner or returns `None`, silently dropping assets. Best Practice: Wrap `parse_frontmatter` in robust try/except blocks, log parsing errors with file paths, and implement graceful degradation so one broken file doesn't crash the entire dashboard render.
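
Two of the checks above, dangling `depends_on` targets and per-category required fields, are small enough to sketch directly. The code works on plain dicts to stay self-contained; `unresolved_dependencies`, `trait_violations`, and the `CATEGORY_TRAITS` table are hypothetical names, and the single trait rule is taken from the `gumroad-sku` example in the list.

```python
def unresolved_dependencies(assets: list[dict]) -> list[tuple[str, str]]:
    """Return (asset_id, missing_target) for every dangling depends_on entry."""
    known_ids = {a["id"] for a in assets}
    return [(a["id"], target)
            for a in assets
            for target in a.get("depends_on", [])
            if target not in known_ids]


# Hypothetical trait table: category -> fields that category must carry.
CATEGORY_TRAITS = {"gumroad-sku": ("tier_price_usd",)}


def trait_violations(asset: dict) -> list[str]:
    """List the category-specific fields this asset is missing."""
    required = CATEGORY_TRAITS.get(asset.get("category"), ())
    return [name for name in required if asset.get(name) is None]


assets = [
    {"id": "a", "category": "gumroad-sku", "depends_on": ["b", "ghost"]},
    {"id": "b", "category": "idea"},
]
for owner, target in unresolved_dependencies(assets):
    print(f"warning: {owner} depends on unknown asset {target}")
for asset in assets:
    for name in trait_violations(asset):
        print(f"warning: {asset['id']} is missing {name}")
```

Running both checks once per scan, after all assets are discovered, keeps the warnings in one place and avoids false positives from assets that simply have not been visited yet.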
Deliverables
- Blueprint (`MANIFEST_SCHEMA.md`): Complete field definitions, type constraints, enum values, and scanner architecture diagram. Maps the 13-field contract to dashboard rendering logic and dependency resolution.
- Checklist (Migration & Validation Protocol): Step-by-step guide for converting legacy `INDEX.md` repos: schema documentation, scanner implementation, manual/helper migration, cross-reference validation, and dashboard deployment. Includes break-even calculation template.
- Configuration Templates:
  - YAML Frontmatter Boilerplate (ready-to-paste header with all 13 fields)
  - Scanner Config (`ROOT`, `SKIP_DIRS`, file-type filters, caching layer setup)
  - Dashboard View Definitions (Flask route mappings for TODO/Check/Run panels, cron job templates for revenue aggregation)
- Source Artifacts:
  - Schema & Documentation: `github.com/jiejuefuyou/autoapp-toolkit/blob/main/MANIFEST_SCHEMA.md`
  - Scanner & Dashboard Backend: `github.com/jiejuefuyou/autoapp-toolkit/blob/main/dashboard/app.py`
  - Production-Ready Package: AutoApp Dashboard (Flask backend + 3-panel UI + cron cleanup) available under MIT-friendly licensing for buyer use.
