variable interpolation, and model routing into distinct architectural layers. The following implementation demonstrates a type-safe, model-agnostic approach that mirrors the functionality of established Korean prompt libraries while introducing a cleaner compilation pipeline.
Step 1: Define the Template Registry
Instead of scattering prompt strings across codebases, centralize them in a typed registry. Each template declares its category, default speech level, and required variables.
```typescript
type SpeechLevel = 'formal' | 'informal';
type PromptCategory = 'business' | 'coding' | 'customer_service' | 'writing' | 'analysis';

interface PromptTemplate {
  id: string;
  category: PromptCategory;
  defaultSpeechLevel: SpeechLevel;
  systemInstruction: string;
  userInstruction: string;
  requiredVars: string[];
}

const PROMPT_REGISTRY: Record<string, PromptTemplate> = {
  'coding/code-review': {
    id: 'coding/code-review',
    category: 'coding',
    defaultSpeechLevel: 'formal',
    systemInstruction: `You are a senior software engineer conducting a technical review.
Maintain professional Korean (합쇼체). Focus on security, performance, and readability.
Structure feedback using: 1. Critical Issues, 2. Suggestions, 3. Positive Notes.`,
    userInstruction: `Language: {{language}}
Code:
\`\`\`{{language}}
{{code}}
\`\`\`
Review Focus: {{focus}}`,
    requiredVars: ['language', 'code', 'focus']
  }
};
```
Architecture Rationale: Centralizing templates in a typed registry enables IDE autocompletion, compile-time validation, and version control. Separating systemInstruction and userInstruction aligns with modern chat API conventions, allowing the model to establish behavioral constraints before processing user input.
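A registry typed as `Record<string, PromptTemplate>` accepts any string as a key, so the template IDs themselves are not checked by the compiler. One possible way to get that checking is to derive an ID union from the registry object. The sketch below is illustrative only: `MiniTemplate`, `defineRegistry`, `getTemplate`, `isTemplateId`, and the `business/email-reply` entry are hypothetical names, not part of the implementation above.

```typescript
// Hypothetical, trimmed-down template shape used only for this sketch.
interface MiniTemplate {
  category: 'business' | 'coding';
  requiredVars: readonly string[];
}

// Identity helper: validates every entry against MiniTemplate
// while preserving the literal keys of the object passed in.
function defineRegistry<T extends Record<string, MiniTemplate>>(registry: T): T {
  return registry;
}

const MINI_REGISTRY = defineRegistry({
  'coding/code-review': { category: 'coding', requiredVars: ['language', 'code', 'focus'] },
  // Hypothetical second entry, included only to show a multi-member union.
  'business/email-reply': { category: 'business', requiredVars: ['recipient', 'body'] },
});

// Union of the registry keys: 'coding/code-review' | 'business/email-reply'
type TemplateId = keyof typeof MINI_REGISTRY;

// A misspelled ID passed here is rejected at compile time.
function getTemplate(id: TemplateId): MiniTemplate {
  return MINI_REGISTRY[id];
}

// Runtime guard for IDs that arrive as plain strings (CLI arguments, HTTP input).
function isTemplateId(id: string): id is TemplateId {
  return Object.prototype.hasOwnProperty.call(MINI_REGISTRY, id);
}
```

With this shape, editors can suggest valid IDs at call sites of `getTemplate`, and untrusted strings pass through `isTemplateId` first.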
Step 2: Implement the Compiler
Variable interpolation must be type-safe and defensive. The compiler validates required variables, applies speech level adjustments, and returns a structured payload ready for API consumption.
```typescript
interface CompiledPrompt {
  system: string;
  user: string;
  metadata: {
    templateId: string;
    speechLevel: SpeechLevel;
    tokenEstimate: number;
  };
}

function compilePrompt(
  templateId: string,
  variables: Record<string, string>,
  overrideSpeechLevel?: SpeechLevel
): CompiledPrompt {
  const template = PROMPT_REGISTRY[templateId];
  if (!template) throw new Error(`Template ${templateId} not found`);

  // Fail fast if any declared variable was not supplied.
  const missing = template.requiredVars.filter(v => !(v in variables));
  if (missing.length > 0) throw new Error(`Missing variables: ${missing.join(', ')}`);

  const speechLevel = overrideSpeechLevel ?? template.defaultSpeechLevel;
  let system = template.systemInstruction;

  // Apply speech level modifiers to the template text before interpolation,
  // so caller-supplied values are never rewritten.
  if (speechLevel === 'informal') {
    system = system.replace(/합쇼체/g, '해체').replace(/professional/g, 'casual');
  }

  // Single-pass interpolation. The replacer function inserts values literally:
  // sequences such as `$&` in pasted code are not treated as replacement patterns,
  // and placeholder-like text inside a value is not expanded a second time.
  // Placeholder names are limited to word characters (letters, digits, underscore).
  const interpolate = (text: string): string =>
    text.replace(/\{\{(\w+)\}\}/g, (placeholder: string, key: string) =>
      Object.prototype.hasOwnProperty.call(variables, key) ? variables[key] : placeholder
    );
  system = interpolate(system);
  const user = interpolate(template.userInstruction);

  return {
    system,
    user,
    metadata: {
      templateId,
      speechLevel,
      // Rough heuristic of about four characters per token.
      tokenEstimate: Math.ceil((system.length + user.length) / 4)
    }
  };
}
```
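Architecture Rationale: The compiler acts as a deterministic transformation layer. It validates inputs before any request is sent, so a missing variable fails fast locally instead of producing a malformed prompt. The token estimate (character count divided by four) gives early cost visibility, but it is a rough heuristic tuned for English; Korean text usually consumes more tokens per character, so treat the figure as a low estimate. Speech level overrides allow runtime flexibility without duplicating templates.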
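Defensive interpolation matters most when variables carry pasted source code. `String.prototype.replace` expands patterns such as `$&` when the replacement is passed as a string, and replacing one key at a time can re-expand placeholder-like text that arrived inside an earlier value. The standalone sketch below shows one way to avoid both problems; `interpolate` is a hypothetical helper name, and the single-pass strategy is one option among several.

```typescript
// Hypothetical helper: single-pass interpolation of {{name}} placeholders.
// Passing a function as the replacement makes the returned value literal,
// and text that was just inserted is never scanned again.
function interpolate(template: string, variables: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder: string, key: string) =>
    Object.prototype.hasOwnProperty.call(variables, key) ? variables[key] : placeholder
  );
}

const rendered = interpolate('Code: {{code}} | Focus: {{focus}}', {
  code: "price.replace(/x/, '$&')", // contains a replacement pattern
  focus: '{{code}}',                // looks like a placeholder
});

console.log(rendered);
// prints: Code: price.replace(/x/, '$&') | Focus: {{code}}
```

Unknown placeholders are left in place rather than replaced with an empty string, which makes a forgotten variable visible in the compiled prompt.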
Step 3: Route to Target Models
Model routing should remain decoupled from prompt compilation. This enables seamless switching between Claude and GPT without rewriting prompt logic.
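```typescript
import Anthropic from '@anthropic-ai/sdk';
import OpenAI from 'openai';

// By default both clients read their API keys from the environment
// (ANTHROPIC_API_KEY and OPENAI_API_KEY).
const anthropic = new Anthropic();
const openai = new OpenAI();

async function dispatchToClaude(compiled: CompiledPrompt): Promise<string> {
  const response = await anthropic.messages.create({
    model: 'claude-opus-4-7', // substitute a model ID available to your account
    system: compiled.system,  // Claude takes the system prompt as a top-level field
    messages: [{ role: 'user', content: compiled.user }],
    max_tokens: 2048,
    temperature: 0.2
  });
  // Return the first content block only if it is a text block.
  const first = response.content[0];
  return first?.type === 'text' ? first.text : '';
}

async function dispatchToGPT(compiled: CompiledPrompt): Promise<string> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: compiled.system }, // system prompt travels as a message
      { role: 'user', content: compiled.user }
    ],
    max_tokens: 2048,
    temperature: 0.2
  });
  return response.choices[0].message.content ?? '';
}
```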
Architecture Rationale: Decoupling compilation from dispatch keeps prompt engineering logic model-agnostic. Temperature is kept low (0.2) to make technical outputs more consistent across runs, although sampling at a non-zero temperature is still not strictly deterministic. Max tokens are explicitly bounded to control costs and prevent runaway generation.
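One way to keep call sites unaware of the provider is to put the dispatch functions behind a shared function type and select one by key. The sketch below uses stub dispatchers so it runs without network access; `CompiledPromptLike`, `Provider`, `Dispatcher`, and `createRouter` are hypothetical names introduced for illustration.

```typescript
// Minimal structural stand-in for the compiled prompt (hypothetical name).
interface CompiledPromptLike {
  system: string;
  user: string;
}

type Provider = 'claude' | 'gpt';
type Dispatcher = (compiled: CompiledPromptLike) => Promise<string>;

// Builds a routing function from a table of provider-specific dispatchers.
function createRouter(dispatchers: Record<Provider, Dispatcher>) {
  return function route(provider: Provider, compiled: CompiledPromptLike): Promise<string> {
    return dispatchers[provider](compiled);
  };
}

// Stub dispatchers for demonstration; real ones would call the provider SDKs.
const route = createRouter({
  claude: async (compiled) => `claude:${compiled.user}`,
  gpt: async (compiled) => `gpt:${compiled.user}`,
});

route('gpt', { system: 'You are a reviewer.', user: 'hi' }).then((text) => {
  console.log(text); // prints: gpt:hi
});
```

Because `Record<Provider, Dispatcher>` requires an entry for every provider, adding a new member to `Provider` becomes a compile error until a dispatcher is supplied for it.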
Step 4: CLI Discovery & Filtering
Production teams benefit from a command-line interface that enables rapid template discovery without leaving the terminal.
```bash
# List all available templates
npx prompt-engine list

# Retrieve a specific template
npx prompt-engine get coding/code-review

# Filter by category and speech level
npx prompt-engine search --category business --level formal
```
Architecture Rationale: CLI tools reduce onboarding friction and enable CI/CD integration. Search filters allow developers to locate templates by domain and tone requirements, accelerating prototyping and reducing documentation dependency.
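The source does not show how the `search` command is implemented. As a rough sketch under that caveat, the `--category` and `--level` filters could reduce to a predicate over template metadata. `TemplateSummary`, `SearchFilter`, `searchTemplates`, and the two `business/...` entries below are hypothetical.

```typescript
type SpeechLevel = 'formal' | 'informal';
type PromptCategory = 'business' | 'coding' | 'customer_service' | 'writing' | 'analysis';

// Metadata subset needed for discovery (hypothetical shape).
interface TemplateSummary {
  id: string;
  category: PromptCategory;
  defaultSpeechLevel: SpeechLevel;
}

// Each field is optional; an omitted field matches every template.
interface SearchFilter {
  category?: PromptCategory;
  level?: SpeechLevel;
}

function searchTemplates(templates: TemplateSummary[], filter: SearchFilter): TemplateSummary[] {
  return templates.filter((t) =>
    (filter.category === undefined || t.category === filter.category) &&
    (filter.level === undefined || t.defaultSpeechLevel === filter.level)
  );
}

const catalog: TemplateSummary[] = [
  { id: 'coding/code-review', category: 'coding', defaultSpeechLevel: 'formal' },
  { id: 'business/email-reply', category: 'business', defaultSpeechLevel: 'formal' },  // hypothetical
  { id: 'business/team-chat', category: 'business', defaultSpeechLevel: 'informal' },  // hypothetical
];

// Equivalent in spirit to: search --category business --level formal
const formalBusiness = searchTemplates(catalog, { category: 'business', level: 'formal' });
console.log(formalBusiness.map((t) => t.id)); // prints: [ 'business/email-reply' ]
```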
Pitfall Guide
1. Hardcoding Speech Levels in System Prompts
Explanation: Embedding speech level instructions directly into static strings forces developers to duplicate templates for formal vs. informal use cases. This creates maintenance debt and increases the risk of tone drift.
Fix: Parameterize speech levels at compile time. Use a single template with conditional modifiers that swap honorific markers based on the SpeechLevel type.
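A minimal sketch of the fix above, assuming the template reserves a `{{speechDirective}}` placeholder for the tone instruction. The placeholder name, `SPEECH_DIRECTIVES`, and `applySpeechLevel` are hypothetical, and the directive wording is illustrative.

```typescript
type SpeechLevel = 'formal' | 'informal';

// One directive per speech level; wording is illustrative only.
const SPEECH_DIRECTIVES: Record<SpeechLevel, string> = {
  formal: 'Respond in formal Korean (합쇼체).',
  informal: 'Respond in casual Korean (해체).',
};

// A single template serves both tones through one reserved placeholder.
const REVIEW_SYSTEM_TEMPLATE =
  'You are a senior software engineer conducting a technical review.\n{{speechDirective}}';

function applySpeechLevel(systemTemplate: string, level: SpeechLevel): string {
  // The function form of the replacement inserts the directive literally.
  return systemTemplate.replace('{{speechDirective}}', () => SPEECH_DIRECTIVES[level]);
}

const formalSystem = applySpeechLevel(REVIEW_SYSTEM_TEMPLATE, 'formal');
const informalSystem = applySpeechLevel(REVIEW_SYSTEM_TEMPLATE, 'informal');
```

Compared with rewriting words inside a finished prompt, a dedicated placeholder keeps the tone instruction in one known position and avoids accidental edits to unrelated text.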
2. Ignoring Korean Honorific Morphology
Explanation: Korean verbs change endings based on social context. LLMs trained primarily on English data may default to neutral or inconsistent endings unless explicitly constrained.
Fix: Include explicit morphological constraints in the system instruction. Example: Use formal verb endings (-습니다/-ㅂ니다) for all customer-facing responses. Avoid casual endings (-아/-어) unless explicitly requested.
3. Overloading Templates with Unbounded Variables
Explanation: Passing large code blocks, lengthy documents, or unstructured text directly into prompt variables inflates token counts and degrades output quality.
Fix: Implement variable sanitization and chunking strategies. Truncate or summarize inputs before interpolation. Use metadata tags to separate context from instructions.
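The truncation and chunking steps can be sketched as two small helpers that run before interpolation. Both function names, the default marker text, and the character-based limits are assumptions made for illustration; a production version might budget by tokens instead.

```typescript
// Truncates by Unicode code point so surrogate pairs are never split.
function truncateForPrompt(value: string, maxChars: number, marker = '\n[truncated]'): string {
  const chars = Array.from(value);
  if (chars.length <= maxChars) return value;
  return chars.slice(0, maxChars).join('') + marker;
}

// Groups whole lines into chunks of at most maxChars characters.
// A single line longer than maxChars is kept intact as its own oversized chunk.
function chunkByLines(text: string, maxChars: number): string[] {
  const chunks: string[] = [];
  let current = '';
  for (const line of text.split('\n')) {
    const candidate = current === '' ? line : `${current}\n${line}`;
    if (candidate.length > maxChars && current !== '') {
      chunks.push(current);
      current = line;
    } else {
      current = candidate;
    }
  }
  if (current !== '') chunks.push(current);
  return chunks;
}

console.log(truncateForPrompt('abcdef', 3));    // prints "abc" followed by the marker line
console.log(chunkByLines('aaa\nbbb\nccc', 7));  // prints: [ 'aaa\nbbb', 'ccc' ]
```

Splitting on line boundaries keeps code statements whole, which tends to matter more for review prompts than hitting the limit exactly.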
4. Ignoring Model-Specific Behavior
Explanation: Claude and GPT interpret Korean formatting instructions differently. Claude tends to follow structural constraints more rigidly, while GPT may prioritize fluency over formatting.
Fix: Add model-specific post-processing rules. For GPT, enforce markdown structure explicitly. For Claude, leverage its native instruction-following strength with minimal formatting directives.
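One reading of this fix is a per-model directive table applied after compilation, so the shared template stays untouched. The sketch below is an assumption about how that could look; `PROVIDER_DIRECTIVES`, `withProviderDirectives`, and the directive wording are hypothetical.

```typescript
type Provider = 'claude' | 'gpt';

// Extra formatting directives per provider; an empty list means "add nothing".
const PROVIDER_DIRECTIVES: Record<Provider, string[]> = {
  claude: [],
  gpt: ['Format the response as Markdown and keep the three numbered sections exactly as listed.'],
};

// Appends provider-specific directives to an already compiled system prompt.
function withProviderDirectives(system: string, provider: Provider): string {
  const extra = PROVIDER_DIRECTIVES[provider];
  return extra.length === 0 ? system : `${system}\n${extra.join('\n')}`;
}

const baseSystem = 'You are a senior software engineer conducting a technical review.';
const gptSystem = withProviderDirectives(baseSystem, 'gpt');
const claudeSystem = withProviderDirectives(baseSystem, 'claude');
```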
5. Neglecting Domain-Specific Terminology
Explanation: Business, coding, and customer service domains use distinct Korean terminology. A generic prompt will mix registers, producing unnatural outputs.
Fix: Maintain a terminology glossary per category. Inject domain-specific vocabulary constraints into the system prompt. Example: Use standard software engineering terms in Korean (예: 버그, 리팩토링, 테스트 커버리지).
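A per-category glossary can be injected with a small helper. In the sketch below the coding terms come from the example above, while `GLOSSARY`, `withGlossary`, and the "Preferred terminology" wording are hypothetical.

```typescript
type PromptCategory = 'business' | 'coding' | 'customer_service' | 'writing' | 'analysis';

// Partial: categories without a glossary leave the prompt unchanged.
const GLOSSARY: Partial<Record<PromptCategory, string[]>> = {
  coding: ['버그', '리팩토링', '테스트 커버리지'],
};

// Appends a terminology constraint line to the system prompt when terms exist.
function withGlossary(system: string, category: PromptCategory): string {
  const terms = GLOSSARY[category];
  if (!terms || terms.length === 0) return system;
  return `${system}\nPreferred terminology: ${terms.join(', ')}`;
}

const codingSystem = withGlossary('You are a senior software engineer.', 'coding');
const writingSystem = withGlossary('You are an editor.', 'writing');
```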
6. Treating Prompts as Immutable Strings
Explanation: Static prompts cannot adapt to evolving business requirements or model updates. Hardcoded strings become technical debt.
Fix: Version control templates. Implement a template registry with semantic versioning. Use CI checks to validate prompt changes against regression tests.
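As an example of a check that could run in CI, the sketch below compares the placeholders used in a template with its declared `requiredVars`. `VersionedTemplate`, the `version` field, and both function names are hypothetical; a real regression suite would likely also compare model outputs.

```typescript
// Hypothetical template shape with an added semantic version field.
interface VersionedTemplate {
  id: string;
  version: string; // semantic version, e.g. '1.2.0'
  systemInstruction: string;
  userInstruction: string;
  requiredVars: string[];
}

// Collects the distinct {{name}} placeholders that appear in a text.
function extractPlaceholders(text: string): string[] {
  const pattern = /\{\{(\w+)\}\}/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    if (names.indexOf(match[1]) === -1) names.push(match[1]);
  }
  return names;
}

// undeclared: used in the text but missing from requiredVars.
// unused: declared in requiredVars but never referenced in the text.
function findPlaceholderMismatches(t: VersionedTemplate): { undeclared: string[]; unused: string[] } {
  const used = extractPlaceholders(`${t.systemInstruction}\n${t.userInstruction}`);
  return {
    undeclared: used.filter((n) => t.requiredVars.indexOf(n) === -1),
    unused: t.requiredVars.filter((n) => used.indexOf(n) === -1),
  };
}

const draft: VersionedTemplate = {
  id: 'coding/code-review',
  version: '1.1.0',
  systemInstruction: 'You are a senior software engineer.',
  userInstruction: 'Language: {{language}}\n{{code}}',
  requiredVars: ['language', 'code', 'focus'],
};

console.log(findPlaceholderMismatches(draft)); // prints: { undeclared: [], unused: [ 'focus' ] }
```

A CI job could fail the build whenever either list is non-empty, catching a removed placeholder before the template ships.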
7. Failing to Validate Output Against Cultural Norms
Explanation: LLMs can generate grammatically correct Korean that violates business etiquette or customer service standards.
Fix: Implement a lightweight evaluation layer. Use a secondary model or rule-based checker to validate tone, honorific usage, and structural compliance before returning outputs to users.
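A rule-based checker can start as a simple heuristic. The sketch below flags sentences that do not end in a common formal (합쇼체) ending; the ending list, `FORMAL_ENDING`, and `findNonFormalSentences` are assumptions made for illustration. It will also flag headings, list items, and code fragments, so it is best treated as a signal for review rather than a verdict.

```typescript
// Common 합쇼체 sentence endings: declarative, interrogative, imperative.
// This list is deliberately small and not exhaustive.
const FORMAL_ENDING = /(니다|니까|십시오)$/;

// Returns the sentences that do not end with one of the formal endings.
function findNonFormalSentences(text: string): string[] {
  return text
    .normalize('NFC')          // compose Hangul syllables so the regex can match
    .split(/[.?!]+\s*/)        // naive sentence split on terminal punctuation
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .filter((s) => !FORMAL_ENDING.test(s));
}

const formalReply = '검토를 완료했습니다. 수정이 필요합니다.';
const casualReply = '검토 끝났어. 수정해.';

console.log(findNonFormalSentences(formalReply)); // prints: []
console.log(findNonFormalSentences(casualReply)); // prints: [ '검토 끝났어', '수정해' ]
```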
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Internal developer tooling | Informal speech level, coding category | Developers prefer direct, concise feedback. Reduces token overhead. | Low (shorter outputs, higher throughput) |
| Customer-facing support bot | Formal speech level, customer service category | Maintains professional tone and cultural appropriateness for end users. | Medium (longer system prompts, higher per-request cost) |
| Technical documentation generator | Formal speech level, analysis/writing category | Ensures consistent terminology and structured formatting for public consumption. | Medium-High (requires chunking and post-processing) |
| Marketing copy generation | Informal/neutral speech level, writing category | Balances creativity with brand voice. Allows flexible tone overrides. | Low-Medium (variable length, requires A/B testing) |
Configuration Template
```typescript
// prompt.config.ts
import { PromptRegistry, compilePrompt, dispatchToClaude } from '@your-org/korean-prompt-engine';

const registry = new PromptRegistry({
  source: './templates',
  cache: true,
  validation: 'strict'
});

export async function generateCodeReview(code: string, focus: string) {
  // Packaged variant of compilePrompt: it takes the registry as its first argument
  // and an options object in place of the positional speech-level override from Step 2.
  const compiled = compilePrompt(registry, 'coding/code-review', {
    language: 'typescript',
    code,
    focus
  }, {
    speechLevel: 'formal',
    maxTokens: 2048,
    temperature: 0.2
  });

  const output = await dispatchToClaude(compiled);

  // Post-processing validation. This check assumes the model keeps the English
  // section heading; adjust it if the headings come back translated into Korean.
  if (!output.includes('1. Critical Issues')) {
    throw new Error('Output structure validation failed');
  }
  return output;
}
```
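Because the system prompt asks for Korean output, a check against an English heading can fail even when the structure is correct. A language-agnostic alternative is to look only for the numbered markers in order. This is a sketch under that assumption; `hasNumberedSections` is a hypothetical helper and the Korean headings in the example are illustrative.

```typescript
// Returns true when lines beginning with "1.", "2.", ... "count." appear in order.
// An optional leading Markdown heading marker (#) is tolerated.
function hasNumberedSections(output: string, count: number): boolean {
  let searchFrom = 0;
  for (let n = 1; n <= count; n++) {
    const pattern = new RegExp(`^\\s*(?:#+\\s*)?${n}\\.\\s`, 'm');
    const match = pattern.exec(output.slice(searchFrom));
    if (match === null) return false;
    // Continue searching after this marker so the order is enforced.
    searchFrom += match.index + match[0].length;
  }
  return true;
}

const koreanReview = '1. 심각한 문제\n- ...\n2. 제안\n- ...\n3. 긍정적인 점\n- ...';
const incompleteReview = '1. 심각한 문제\n- ...\n2. 제안\n- ...';

console.log(hasNumberedSections(koreanReview, 3));     // prints: true
console.log(hasNumberedSections(incompleteReview, 3)); // prints: false
```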
Quick Start Guide
- Initialize the registry: Run `npm install @your-org/korean-prompt-engine` and import the core compiler module into your project.
- Discover templates: Execute `npx prompt-engine list` to view available categories and `npx prompt-engine search --category coding` to filter by domain.
- Compile and dispatch: Call `compilePrompt()` with your variables, select the target speech level, and route the result to Claude or GPT using the provided dispatch functions.
- Validate and iterate: Run the output through a tone checker or manual review. Adjust variable mappings or speech level overrides based on production feedback.
- Monitor metrics: Track token consumption, latency, and error rates. Version control prompt changes and establish a rollback strategy for model updates.