ions
- File-Based Context Generation: The package outputs Markdown and structured reference files. LLMs parse Markdown efficiently, and file-based storage ensures deterministic loading without runtime network calls.
- Agent-Agnostic Output: Context files are written to standard workspace directories recognized by Claude Code, Cursor, Trae, GitHub Copilot, Hermes, OpenCode, Codex, and pi. This decouples the knowledge base from any single vendor's proprietary format.
- Version-Synced Documentation Mapping: The injected context aligns with specific DolphinDB release cycles. This prevents the AI from mixing deprecated APIs with current implementations.
- Local-First Execution: All processing occurs on the host machine. No telemetry, no external API calls, and no data exfiltration.
Implementation Workflow
The installation process initializes the context generator, verifies workspace compatibility, and writes the reference files to the appropriate agent configuration directories. The verification script below checks that the expected context files are present and reports which agent configuration directories exist in the workspace.
```python
import json
import os
import sys
from pathlib import Path


def verify_context_injection(workspace_root: str) -> dict:
    """
    Validates that DolphinDB agent context files are correctly installed
    and accessible to the local AI coding assistant.
    """
    context_dir = Path(workspace_root) / ".agent-skills" / "dolphindb"
    validation_report = {
        "status": "pending",
        "files_found": [],
        "missing_sections": [],
        "agent_compatibility": []
    }
    required_sections = [
        "script_syntax.md",
        "sql_analytics.md",
        "stream_processing.md",
        "sdk_references.md",
        "admin_tuning.md"
    ]
    if not context_dir.exists():
        validation_report["status"] = "failed"
        validation_report["error"] = "Context directory not found. Run installation first."
        return validation_report
    # Record which of the required context files are present
    for section in required_sections:
        target_file = context_dir / section
        if target_file.exists():
            validation_report["files_found"].append(section)
        else:
            validation_report["missing_sections"].append(section)
    # Verify agent configuration directories
    agent_dirs = [".cursor", ".github", ".claude", ".vscode"]
    for agent in agent_dirs:
        agent_path = Path(workspace_root) / agent
        if agent_path.exists():
            validation_report["agent_compatibility"].append(agent.replace(".", "").upper())
    validation_report["status"] = "success" if not validation_report["missing_sections"] else "partial"
    return validation_report


if __name__ == "__main__":
    # Use the path given on the command line, or the current directory
    root = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    report = verify_context_injection(root)
    print(json.dumps(report, indent=2))
```
Why This Architecture Works
Traditional RAG systems chunk documents and embed them in vector databases. While effective for semantic search, vector retrieval introduces latency and occasionally returns semantically similar but technically incorrect snippets. File-based context injection bypasses embedding entirely. The AI reads the exact syntax, parameter signatures, and usage patterns directly from the source files during prompt construction. This deterministic approach is critical for time-series databases where a single misplaced argument in a window function or streaming operator can cause silent data corruption or pipeline failures.
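To make the contrast with vector retrieval concrete, here is a minimal sketch of deterministic context assembly. It is an illustration only: the function name `build_context_block` and the `## Source:` header format are assumptions, and real coding agents load workspace files through their own mechanisms.

```python
from pathlib import Path


def build_context_block(context_dir: str, sections: list[str]) -> str:
    """Concatenate the requested Markdown files, in the given order, into one prompt block.

    No embedding or similarity search is involved, so the same inputs
    always produce the same output.
    """
    parts = []
    for name in sections:
        path = Path(context_dir) / name
        if not path.is_file():
            # Fail loudly instead of silently dropping a section.
            raise FileNotFoundError(f"Missing context file: {path}")
        text = path.read_text(encoding="utf-8").strip()
        parts.append(f"## Source: {name}\n\n{text}")
    return "\n\n".join(parts)
```

Because the result depends only on the file contents and the requested order, two runs over the same workspace yield byte-identical context, which is the property the paragraph above relies on.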
Pitfall Guide
1. Context Window Overflow
Explanation: Developers manually paste entire SDK references or documentation pages into prompts, exhausting the context window and degrading the AI's reasoning capacity.
Fix: Rely on the package's scoped extraction. The generated files are optimized for token efficiency. If additional context is needed, use targeted file references rather than bulk pasting.
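One way to keep targeted references within a budget is to estimate the cost of each file before including it. The sketch below assumes a rough heuristic of four characters per token, which is not a real tokenizer, and the helper names are hypothetical.

```python
CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count by rounding up len(text) / CHARS_PER_TOKEN."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def select_within_budget(files: dict[str, str], budget_tokens: int) -> list[str]:
    """Pick files in the given priority order while the running total fits the budget."""
    selected = []
    used = 0
    for name, text in files.items():
        cost = estimate_tokens(text)
        if used + cost > budget_tokens:
            continue  # skip a file that does not fit; a later, smaller one may
        selected.append(name)
        used += cost
    return selected
```

Dictionary order acts as the priority order here, so the most relevant file should be listed first.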
2. Version Drift Between Context and Runtime
Explanation: The injected context can lag behind the runtime. For example, the context reflects DolphinDB v2.0 while the production environment runs v3.0, so the AI generates code that uses deprecated APIs or omits newer parameters.
Fix: Always run the context generator after upgrading the database or SDK. Pin context versions in your project's dependency manifest and validate against runtime versions during CI/CD.
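A CI check for this kind of drift can be as small as comparing two version strings. The sketch below assumes both strings are obtained elsewhere (the pinned context version from your manifest, the runtime version from the server) and that matching on major and minor is the right granularity; both helper names are hypothetical.

```python
def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as 'v2.0' or '3.00.1'."""
    parts = version.strip().lstrip("vV").split(".")
    return int(parts[0]), int(parts[1])


def check_version_sync(context_version: str, runtime_version: str) -> None:
    """Raise if the context was generated for a different major.minor release."""
    if major_minor(context_version) != major_minor(runtime_version):
        raise RuntimeError(
            f"Context targets {context_version} but runtime reports "
            f"{runtime_version}; regenerate the context files."
        )
```

Calling `check_version_sync` early in the pipeline makes the build fail before any AI-generated code is exercised against a mismatched server.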
3. Cross-Agent Configuration Conflicts
Explanation: Mixing instruction files for Cursor, Copilot, and Claude Code in the same directory causes the AI to load conflicting system prompts or duplicate context.
Fix: Isolate agent configurations. Use separate directories (.cursor/rules, .github/copilot-instructions.md, etc.) and ensure the context generator targets only the active agent's workspace.
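Before running the generator, it can help to see which agent configurations already exist in a workspace. The following sketch uses the paths named above plus `.claude` from the verification script; the `AGENT_MARKERS` mapping is an assumption for illustration, not a list maintained by any of these tools.

```python
from pathlib import Path

# Marker paths are illustrative assumptions; adjust them to your setup.
AGENT_MARKERS = {
    "cursor": ".cursor/rules",
    "copilot": ".github/copilot-instructions.md",
    "claude": ".claude",
}


def detect_agents(workspace_root: str) -> list[str]:
    """Return the names of agents whose marker path exists in the workspace."""
    root = Path(workspace_root)
    return sorted(
        name for name, marker in AGENT_MARKERS.items() if (root / marker).exists()
    )
```

If the returned list has more than one entry, that is a cue to review whether each agent is loading only its own instructions.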
4. Stream Processing vs. Batch Query Confusion
Explanation: LLMs frequently conflate time-series window calculations with real-time streaming pipelines. They may apply batch aggregation logic to continuous data streams, causing memory leaks or incorrect watermark handling.
Fix: Explicitly tag context files with #streaming or #batch directives. In prompts, state the execution model outright, for example: "Use real-time streaming operators, not batch window functions."
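A tagging scheme like this is easy to enforce mechanically. The sketch below assumes the directive sits on the first line of each context file as a whitespace-separated token; that placement and the helper names are assumptions, since the source does not specify where the tags go.

```python
def tags_of(text: str) -> set[str]:
    """Collect '#tag' tokens from the first line; a bare '#' heading marker is ignored."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    return {tok for tok in first_line.split() if tok.startswith("#") and len(tok) > 1}


def files_for_mode(files: dict[str, str], mode: str) -> list[str]:
    """Return the names of files tagged for the given execution model."""
    if mode not in {"streaming", "batch"}:
        raise ValueError(f"Unknown execution model: {mode}")
    return sorted(name for name, text in files.items() if f"#{mode}" in tags_of(text))
```

Selecting files by mode before building the prompt keeps batch examples out of a streaming task, and the reverse.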
5. Accidental Credential Exposure in Context Files
Explanation: Developers embed connection strings, API keys, or authentication tokens directly into context or configuration files, which are then committed to version control.
Fix: Never store secrets in context files. Use environment variables or secret managers. The AI should generate code that references os.environ.get("DDB_AUTH_TOKEN") rather than hardcoding credentials.
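A lightweight scan of context files before committing can catch the most obvious leaks. The pattern below is a deliberately simple assumption, not a complete secret detector; dedicated scanning tools cover far more cases.

```python
import re

# Matches assignments such as password = "..." or api_key: ... (illustrative only).
SECRET_PATTERN = re.compile(
    r"(?i)\b(password|passwd|api[_-]?key|auth[_-]?token|secret)\b"
    r"\s*[:=]\s*['\"]?([^\s'\"]{6,})"
)


def find_suspect_lines(text: str) -> list[int]:
    """Return 1-based line numbers that look like hardcoded credentials."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Lines that read the value from the environment are treated as safe.
        if "os.environ" in line:
            continue
        if SECRET_PATTERN.search(line):
            hits.append(number)
    return hits
```

Running this over each generated file in a pre-commit hook gives early warning, while the environment-variable pattern from the fix above passes through unflagged.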
6. Ignoring Administrative and Performance Tuning Context
Explanation: Teams focus exclusively on query syntax and SDK methods, neglecting deployment, indexing, and partitioning strategies. The AI generates functionally correct queries that perform poorly at scale.
Fix: Ensure the admin_tuning.md context file is active. Prompt the AI to consider data distribution, partition keys, and memory allocation when designing time-series schemas.
7. Over-Reliance on AI for Critical Pipeline Logic
Explanation: Assuming AI-generated streaming code is production-ready without validation. Time-series pipelines require precise watermark alignment, state management, and fault tolerance.
Fix: Implement mandatory code review and unit testing for all AI-generated stream processing logic. Use deterministic test datasets to verify window boundaries and aggregation accuracy.
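A deterministic test dataset is most useful when paired with an independent reference calculation. The sketch below is a plain-Python oracle for tumbling-window sums, not DolphinDB code; it assumes left-closed, right-open windows, which must match the boundary convention of the engine under test.

```python
from collections import defaultdict


def tumbling_sum(events: list[tuple[int, float]], window_ms: int) -> dict[int, float]:
    """Sum values per tumbling window; each window covers [start, start + window_ms)."""
    totals: dict[int, float] = defaultdict(float)
    for timestamp_ms, value in events:
        start = (timestamp_ms // window_ms) * window_ms
        totals[start] += value
    return dict(totals)


# Events placed exactly on and just before window edges exercise the boundaries.
events = [(0, 1.0), (999, 2.0), (1000, 4.0), (1999, 8.0), (2000, 16.0)]
expected = {0: 3.0, 1000: 12.0, 2000: 16.0}
assert tumbling_sum(events, 1000) == expected
```

Feeding the same events to the AI-generated pipeline and comparing its output with `expected` exposes off-by-one window boundaries that a casual review tends to miss.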
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Local development with niche DB | Local Context Injection | Zero latency, offline access, deterministic syntax | Free |
| Multi-team cloud collaboration | Cloud RAG + Local Fallback | Centralized knowledge, but requires egress controls | API costs + infrastructure |
| High-frequency streaming pipelines | Local Context Injection | Prevents silent logic errors, ensures operator accuracy | Free |
| Legacy system migration | Manual Documentation + AI Assist | Context may be outdated; requires human validation | Engineering time |
| Security-compliant environments | Local Context Injection | No data leaves the host, full auditability | Free |
Configuration Template
Use this template to configure your AI agent's system prompt to leverage the injected context. Adjust paths based on your workspace structure.
```yaml
# .agent-skills/config.yaml
context_root: "./.agent-skills/dolphindb"
active_sections:
  - script_syntax.md
  - sql_analytics.md
  - stream_processing.md
  - sdk_references.md
  - admin_tuning.md
agent_directives:
  - "Reference local context files for DolphinDB syntax and API signatures."
  - "Distinguish between batch window calculations and real-time streaming operators."
  - "Never hardcode credentials; use environment variable references."
  - "Validate partition keys and data distribution for time-series schemas."
  - "Flag deprecated APIs and suggest current equivalents."
version_policy: "sync_with_runtime"
telemetry: false
```
Quick Start Guide
- Install the package: Run pip install dolphindb-agent-skills in your terminal.
- Initialize context: Execute dolphindb-agent-skills to generate and inject reference files into your workspace.
- Verify installation: Check that .agent-skills/dolphindb/ contains the five core context files. Run the validation script if needed.
- Configure your AI agent: Point your coding assistant to the generated context directory or add the agent directives to your system prompt.
- Test and iterate: Prompt the AI to generate a time-series window query or streaming pipeline. Verify syntax against the injected context and refine prompts based on output accuracy.
Local context injection turns AI coding assistants from generic code generators into domain-specialized engineering partners. By grounding the model in verified, offline documentation, you reduce hallucinations, accelerate development cycles, and keep strict control over data and dependencies. For time-series databases, where precision dictates system reliability, this approach is a strong default for production-grade AI-assisted development.