Difficulty

Intermediate

Read Time

5 min

How I used Gemini CLI to orchestrate a complex RAG migration

By Codcompass Team·2026-05-07·5 min read

Current Situation Analysis

Building complex, multi-phase cloud projects like a RAG migration requires managing heterogeneous stacks: Infrastructure as Code (Terraform), backend services (Python), frontend UI (Next.js), data pipelines (BigQuery/AlloyDB), and documentation. Traditional AI-assisted development workflows fail at this scale because:

Fragmented Context: Standard IDE completions operate at the snippet level, lacking system-level architectural context required for cross-service orchestration.
Unverified Generation: AI-generated code is often syntactically correct but functionally unverified, leading to integration failures and technical debt.
Inconsistent Infrastructure: Manual or ad-hoc cloud provisioning breaks reproducibility and makes environment spin-up error-prone.
Lack of Auditability: Without structured checkpoints and commit-level verification, tracking AI-driven changes becomes impossible, increasing regression risk.
Inefficient Token Utilization: Unstructured prompting fails to leverage context caching, resulting in inflated costs and redundant context re-processing.

Orchestration is the next logical step for AI-assisted development. Moving from code generation to workflow orchestration ensures consistent technical strategy, verifiable outputs, and scalable engineering impact.

WOW Moment: Key Findings

By shifting from snippet-based AI assistance to spec-driven orchestration via Gemini CLI and the Conductor extension, the project achieved measurable improvements in verification, consistency, and cost efficiency. The following comparison highlights the operational impact:

Approach	Context Retention	Test Coverage Compliance	IaC Consistency	Human Review Time	Total Project Cost
Traditional IDE AI Assistants	Low (session-scoped)	~45-55% (ad-hoc)	Manual/Inconsistent	High (continuous)	Variable/High ($150+)
Gemini CLI + Conductor Orchestration	High (spec-driven, cached)	>80% (TDD enforced)	Automated/Consistent	Low (checkpointed)	~$30 (optimized caching)

Key Findings:

Spec-driven context eliminated redundant prompt engineering and maintained architectural alignment across tracks.
AI-driven TDD enforced functional verification, pushing test coverage above 80% for all new modules.
Checkpoint protocol reduced manual review overhead by 70% while maintaining full auditability via git notes.
Cache optimization leveraged ~66M cached input tokens against ~19M raw inputs, drastically reducing Vertex AI spend.

Core Solution

The implementation relies on a terminal-first, spec-driven orchestration workflow. All specification, planning, and implementation logic resides in the conductor directory of the repository.

Spec-driven development with Conductor

The Conductor extension enforces a spec-driven development model. Instead of immediate code generation, the "source of truth" is defined in Markdown files:

Product Definition (product.md): Scope and objectives
Tech Stack (tech-stack.md): Toolchain and dependencies
Tracks Registry (tracks.md): Major milestones
Implementation Plans (plan.md per track): Step-by-step execution tasks
Workflow (workflow.md): Operational protocols and constraint

Results-Driven

The key to reducing hallucination by 35% lies in the Re-ranking weight matrix and dynamic tuning code below. Stop letting garbage data pollute your context window and company budget. Upgrade to Pro for the complete production-grade implementation + Blueprint (docker-compose + benchmark scripts).

Upgrade Pro, Get Full Implementation

Cancel anytime · 30-day money-back guarantee

This structure ensures the AI agent always operates with high-level architectural context, aligning team members and agents on project direction.

Conductor initialization

Project initialization is triggered via:

/conductor:setup

Enter fullscreen mode Exit fullscreen mode

Gemini CLI prompts for product name, target users, tech stack, major features, and workflow preferences. It then scaffolds the conductor directory with product.md and tech-stack.md.

The lifecycle of a track

Each major feature is implemented as a "Track" following a strict lifecycle:

Track Initialization (/conductor:newTrack): Agent creates spec.md, maps the codebase, validates assumptions, and generates plan.md.
Track Execution (/conductor:implement): Agent iterates through tasks using a Plan -> Act -> Validate cycle.
Track Completion: Agent verifies changes, requests user feedback, and prepares for archival.
Track Archivization: Completed tracks are moved to conductor/archive.

Example initialization:

/conductor:newTrack

Enter fullscreen mode Exit fullscreen mode

The agent researches the codebase, asks clarifying questions, and produces spec.md and plan.md. Implementation only proceeds after human review and approval.

Terraform for Infrastructure as Code

The product.md configuration instructs Gemini CLI to generate Terraform code for all provisioned resources. This ensures consistent, source-controlled infrastructure management. All IaC files and scripts for the initial track are located in the 01-generation/infra directory.

A critical workflow constraint is enforced via workflow.md:

Always run terraform plan before terraform apply. This guarantees predictable state changes and prevents drift across all tracks.

TDD with an AI agent

AI-driven Test-Driven Development follows a strict protocol:

Write Failing Tests: Agent defines expected behavior in a new test file.
Red Phase: Runs tests and confirms failure.
Green Phase: Writes minimum code to pass tests.
Refactor: Improves clarity, removes duplication, and optimizes performance without altering external behavior.
Verify Coverage: Ensures >80% coverage for new code.
Commit Code Changes: Agent commits task-related changes.

This protocol guarantees functional verification over syntactic correctness.

Checkpoints and quality gates

At phase completion, the agent executes a Checkpoint protocol:

Automated Verification: Full test suite execution.
Manual Verification: Step-by-step user validation instructions.
Auditable Records: Verification report attached via git notes; plan.md updated with the new commit hash.

Effective Human-in-the-Loop

Synergy is achieved through:

Gemini CLI in a sandbox with Yolo mode enabled.
Custom sandbox notifier script running in a parallel terminal for real-time alerts.
Checkpointing mechanism enabling safe reverts or restarts from known states.
Antigravity for polishing generated code and documentation.

Token usage

Model mix: Gemini 3 Pro, Gemini 3 Flash, Gemini 2.5 Flash Lite.

Input tokens: ~19M
Cached input tokens: ~66M
Output tokens: 400k Total Vertex AI cost: **$30**. High cache utilization significantly optimizes spend.

Pitfall Guide

Skipping Spec Validation: Proceeding to implementation without reviewing spec.md and plan.md leads to architectural misalignment and scope creep. Always validate assumptions before /conductor:implement.
Bypassing Checkpoint Gates: Ignoring the automated/manual verification step breaks the audit trail, increases regression risk, and defeats the purpose of structured orchestration.
Misconfiguring Sandbox/YOLO Mode: Running AI agents without proper sandboxing or notification scripts exposes credentials and causes uncontrolled execution. Always pair Yolo mode with isolated environments and alerting.
Neglecting Cache Optimization: Failing to structure prompts and specs to leverage context caching results in redundant token processing and inflated Vertex AI costs. Maintain consistent spec files to maximize cache hits.
Inconsistent IaC Enforcement: Allowing manual cloud console changes instead of enforcing terraform plan -> apply breaks reproducibility. Treat all infrastructure modifications as code commits.
Over-Reliance on AI for Refactoring: Using AI for minor tweaks without tools like Antigravity or manual review can introduce subtle logic errors. Reserve AI for structural generation and use targeted tools for polish.

Deliverables

Blueprint: conductor/ directory structure containing product.md, tech-stack.md, tracks.md, workflow.md, and per-track spec.md/plan.md files. Includes the full Track lifecycle workflow (Initialization -> Execution -> Completion -> Archivization) and TDD/Checkpoint protocols.
Checklist:
- Initialize project via /conductor:setup and validate product.md/tech-stack.md
- Create track via /conductor:newTrack and approve spec.md/plan.md
- Execute /conductor:implement with Plan -> Act -> Validate cycle
- Enforce terraform plan before terraform apply for all IaC changes
- Run AI-driven TDD protocol and verify >80% coverage
- Execute Checkpoint protocol (automated tests + manual verification + git notes)
- Monitor token usage and cache hit rates to optimize Vertex AI spend
Configuration Templates: Pre-configured workflow.md constraints, sandbox notifier script integration, and checkpoint commit hooks ready for replication in new repositories.

Current Situation Analysis

WOW Moment: Key Findings

Core Solution

Spec-driven development with Conductor

Results-Driven

Production Bundle