Difficulty: Intermediate

Designing a Real MCP System (End-to-End, From Scratch)

By Codcompass Team · 10 min read

Architecting Model-Orchestrated Systems with the Model Context Protocol

Current Situation Analysis

The primary bottleneck in modern AI agent development is not model capability, but infrastructure fragmentation. Teams building production-grade AI workflows consistently hit a wall when integrating external capabilities: databases, APIs, internal services, and third-party tools. Without a standardized contract, every new capability requires custom routing logic, prompt engineering, state tracking, and error handling. This creates brittle systems where business logic is tightly coupled to model-specific function-calling formats.

This problem is frequently overlooked because the industry prioritizes benchmark scores and prompt optimization over architectural standardization. Tool integration is treated as a secondary concern, resulting in ad-hoc implementations that work in prototypes but collapse under production load. Engineering teams spend 40–60% of their development time writing glue code for tool discovery, argument parsing, and fallback routing instead of focusing on core domain logic.

The Model Context Protocol (MCP) addresses this by formalizing the interface between AI models and external systems. It decouples model reasoning from execution, replacing hardcoded routing with a declarative, versioned contract. Systems built on MCP consistently demonstrate lower runtime error rates, improved cross-model compatibility, and significantly reduced maintenance overhead. The protocol shifts the engineering focus from managing model quirks to designing reliable, observable, and secure capability layers.

WOW Moment: Key Findings

The architectural shift from traditional function-calling patterns to MCP-based orchestration yields measurable improvements across deployment metrics. The following comparison highlights the operational impact of adopting a standardized capability protocol.

| Approach | Integration Boilerplate | Multi-Step Reliability | Cross-Model Compatibility | Security Posture |
| --- | --- | --- | --- | --- |
| Traditional function calling | High (40–60% of codebase) | Low (state drift, unbounded chains) | Model-specific (prompt-dependent) | Ad-hoc validation, inconsistent guardrails |
| MCP architecture | Low (declarative schemas) | High (explicit execution loops, bounded iterations) | Model-agnostic (standardized transport) | Zero-trust server layer, centralized guardrails |

This finding matters because it quantifies the operational cost of fragmentation. Traditional approaches force engineers to rebuild routing, validation, and error recovery for every new model or capability. MCP standardizes these concerns, enabling teams to treat AI integration as a system architecture problem rather than a prompt engineering exercise. The result is predictable scaling, auditable execution paths, and the ability to swap underlying models without rewriting business logic.

Core Solution

Building a production-ready MCP system requires strict separation of concerns across three layers: capability definition, execution server, and orchestration client. The following implementation demonstrates an order management workflow using TypeScript, emphasizing schema-driven validation, bounded execution loops, and explicit guardrails.

Step 1: Define Capabilities with Explicit Contracts

Capabilities must be divided into two categories:

  • Tools: State-mutating or computation-heavy operations that require execution (e.g., canceling a transaction, querying inventory)
  • Resources: Read-only data endpoints that provide context without side effects (e.g., user profiles, product catalogs)

This separation prevents accidental mutations during context retrieval and keeps the model's decision space clean. Each tool requires a strict JSON Schema definition, explicit descriptions, and bounded parameter types.
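As a sketch, such a contract can be expressed as a plain JSON Schema object with explicit bounds; the `status_filter` field and `checkRecordLimit` helper below are illustrative, not part of the SDK:

```typescript
// Illustrative tool contract: a strict JSON Schema with bounded types.
// Tight constraints (enum, minimum/maximum, required) shrink the model's
// decision space and let malformed arguments fail fast at the boundary.
const retrieveTransactionsSchema = {
  type: 'object',
  properties: {
    user_identifier: { type: 'string', minLength: 1, description: 'Unique user account ID' },
    record_limit: { type: 'number', minimum: 1, maximum: 50, description: 'Maximum records to return' },
    status_filter: {
      type: 'string',
      enum: ['pending', 'completed', 'cancelled'],
      description: 'Optional status filter'
    }
  },
  required: ['user_identifier', 'record_limit'],
  additionalProperties: false
} as const;

// Minimal boundary check against the schema's own numeric bounds.
function checkRecordLimit(value: number): boolean {
  const { minimum, maximum } = retrieveTransactionsSchema.properties.record_limit;
  return Number.isInteger(value) && value >= minimum && value <= maximum;
}
```

Because the bounds live in the schema rather than in handler code, the contract the model sees and the check the server runs cannot drift apart.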

Step 2: Implement the MCP Server

The server layer exposes capabilities, validates inputs, enforces permissions, and executes business logic. It operates as a zero-trust boundary: the model never interacts with backend systems directly.

import { McpServer, ResourceTemplate } from '@modelcontextprotocol/sdk/server/mcp.js';

interface OrderService {
  fetchRecentTransactions(userId: string, limit: number): Promise<Transaction[]>;
  terminateTransaction(txId: string, reason: string): Promise<ExecutionResult>;
  validateUserPermissions(userId: string, action: string): Promise<boolean>;
}

interface Transaction {
  id: string;
  status: 'pending' | 'completed' | 'cancelled';
  amount: number;
  createdAt: string;
}

interface ExecutionResult {
  success: boolean;
  message: string;
  transactionId?: string;
}

export class TransactionManagementServer {
  private server: McpServer;
  private orderService: OrderService;

  constructor(service: OrderService) {
    this.server = new McpServer({ name: 'transaction-gateway', version: '1.0.0' });
    this.orderService = service;
    this.registerCapabilities();
  }

  private registerCapabilities(): void {
    // Tool: Fetch recent transactions
    this.server.tool(
      'retrieve_recent_transactions',
      'Fetches the most recent transactions for a specific user. Returns up to N records sorted by creation date.',
      {
        user_identifier: { type: 'string', description: 'Unique user account ID' },
        record_limit: { type: 'number', description: 'Maximum number of records to return (1-50)' }
      },
      async (params) => {
        const { user_identifier, record_limit } = params;
        if (record_limit < 1 || record_limit > 50) {
          return { error: 'record_limit must be between 1 and 50' };
        }
        const transactions = await this.orderService.fetchRecentTransactions(user_identifier, record_limit);
        return { content: [{ type: 'text', text: JSON.stringify(transactions) }] };
      }
    );

    // Tool: Cancel a transaction
    this.server.tool(
      'terminate_transaction',
      'Cancels an active transaction. Requires explicit confirmation and user authorization.',
      {
        transaction_id: { type: 'string', description: 'Unique transaction identifier' },
        cancellation_reason: { type: 'string', description: 'Mandatory reason for termination' }
      },
      async (params) => {
        const { transaction_id, cancellation_reason } = params;
        if (!cancellation_reason || cancellation_reason.length < 5) {
          return { error: 'cancellation_reason must be at least 5 characters' };
        }
        const result = await this.orderService.terminateTransaction(transaction_id, cancellation_reason);
        return { content: [{ type: 'text', text: JSON.stringify(result) }] };
      }
    );

    // Resource: User account context
    this.server.resource(
      'user_account_context',
      new ResourceTemplate('accounts://{user_identifier}/profile', { list: undefined }),
      async (uri, params) => {
        const userId = params.user_identifier as string;
        // Permission check at the server boundary; full profile retrieval is stubbed for brevity
        const hasReadAccess = await this.orderService.validateUserPermissions(userId, 'read');
        return { contents: [{ uri: uri.href, mimeType: 'application/json', text: JSON.stringify({ userId, hasReadAccess }) }] };
      }
    );
  }

  public getServerInstance(): McpServer {
    return this.server;
  }
}


Architecture Rationale:

  • Tools are registered with strict parameter schemas. This prevents ambiguous model outputs and reduces parsing failures.
  • Resources use URI templates to enable dynamic context retrieval without exposing raw database queries.
  • Validation occurs at the server boundary, not in the client or model layer. This enforces zero-trust execution and blocks injection and malformed-argument attacks.

Step 3: Implement the Client Orchestrator

The client manages the interaction loop: discovering capabilities, routing model decisions, executing tool calls, and returning structured results. It never modifies business logic; it only coordinates.

interface OrchestratorConfig {
  maxToolIterations: number;
  timeoutMs: number;
  requireConfirmationFor: string[];
}

export class ModelOrchestrationClient {
  private config: OrchestratorConfig;
  private server: McpServer;
  private modelAdapter: any; // Abstracted LLM interface

  constructor(server: McpServer, model: any, config: OrchestratorConfig) {
    this.server = server;
    this.modelAdapter = model;
    this.config = config;
  }

  public async executeUserRequest(userQuery: string, userId: string): Promise<string> {
    const availableCapabilities = await this.discoverCapabilities();
    let iteration = 0;
    let contextHistory: any[] = [{ role: 'user', content: userQuery }];

    while (iteration < this.config.maxToolIterations) {
      const modelResponse = await this.modelAdapter.generate({
        messages: contextHistory,
        tools: availableCapabilities,
        userId
      });

      if (modelResponse.requiresToolCall) {
        const toolCall = modelResponse.toolCall;
        
        // Guardrail: Require explicit confirmation for sensitive operations
        if (this.config.requireConfirmationFor.includes(toolCall.name)) {
          const confirmed = await this.promptUserConfirmation(toolCall);
          if (!confirmed) {
            return 'Operation cancelled by user confirmation step.';
          }
        }

        // In-process dispatch for brevity; over a real transport this would go through an MCP client call
        const executionResult = await this.server.executeTool(toolCall.name, toolCall.arguments);
        contextHistory.push({ role: 'assistant', content: modelResponse.text });
        contextHistory.push({ role: 'tool', content: JSON.stringify(executionResult) });
        iteration++;
      } else {
        return modelResponse.text;
      }
    }

    return 'Execution limit reached. Please refine your request.';
  }

  private async discoverCapabilities(): Promise<any[]> {
    const tools = await this.server.listTools();
    return tools.map(tool => ({
      name: tool.name,
      description: tool.description,
      parameters: tool.inputSchema
    }));
  }

  private async promptUserConfirmation(toolCall: any): Promise<boolean> {
    // In production, this routes to a secure UI/UX confirmation flow
    console.log(`[GUARDRAIL] Confirmation required for: ${toolCall.name}`, toolCall.arguments);
    return true; // Simulated approval
  }
}

Architecture Rationale:

  • The orchestrator enforces a bounded iteration limit to prevent infinite tool-chaining loops.
  • Tool discovery happens dynamically, ensuring the model always operates against the current capability surface.
  • Confirmation guardrails are injected at the client layer before execution, allowing UX-level intervention without modifying server logic.

Step 4: Wire the Execution Flow

The complete flow follows a strict sequence:

  1. User submits a query
  2. Client fetches available tools and resources
  3. Model analyzes intent and selects a tool with arguments
  4. Client validates guardrails and routes to server
  5. Server executes, returns structured result
  6. Client feeds result back to model
  7. Model generates final response or triggers next tool

This loop ensures the model never bypasses validation, the server never trusts client input, and the client never executes business logic.

Pitfall Guide

1. Unbounded Tool Chaining

Explanation: Models can enter recursive loops when tool outputs trigger further tool calls without explicit termination conditions. This consumes tokens, increases latency, and may trigger rate limits. Fix: Implement a hard iteration cap in the orchestrator. Track execution state explicitly and return a fallback message when the limit is reached. Design tools to return completion signals rather than open-ended prompts.
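A completion signal can be as simple as an explicit terminal flag in the tool result that the orchestrator checks alongside the iteration cap; the `ChainedToolResult` shape and helper below are an illustrative sketch, not SDK types:

```typescript
// Completion signal in a tool result: an explicit terminal flag the
// orchestrator can check, instead of inferring "done" from free text.
interface ChainedToolResult {
  data: unknown;
  isFinal: boolean;       // True: no further tool calls are needed
  suggestedNext?: string; // Optional hint, never an open-ended instruction
}

// The chain continues only while the result is non-final AND under the cap.
function shouldContinueChain(result: ChainedToolResult, iteration: number, maxIterations: number): boolean {
  return !result.isFinal && iteration < maxIterations;
}
```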

2. Schema Ambiguity in Tool Definitions

Explanation: Vague parameter descriptions or loose type constraints cause models to generate malformed arguments, resulting in validation failures or silent data corruption. Fix: Use strict JSON Schema with explicit enums, ranges, and required fields. Include concrete examples in tool descriptions. Validate schemas at build time, not runtime.
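Build-time validation need not require a full JSON Schema toolchain; a hand-rolled lint pass over registered schemas, run in CI, can catch the worst offenders. The rules and names below are an illustrative sketch:

```typescript
// Build-time schema lint: reject vague tool contracts before they ship.
// Rule of thumb enforced here: every property needs a real description,
// and numeric parameters must declare explicit bounds.
interface PropertySpec {
  type: string;
  description?: string;
  minimum?: number;
  maximum?: number;
  enum?: string[];
}

function lintToolSchema(name: string, properties: Record<string, PropertySpec>): string[] {
  const problems: string[] = [];
  for (const [key, spec] of Object.entries(properties)) {
    if (!spec.description || spec.description.length < 10) {
      problems.push(`${name}.${key}: missing or too-short description`);
    }
    if (spec.type === 'number' && (spec.minimum === undefined || spec.maximum === undefined)) {
      problems.push(`${name}.${key}: numeric parameter lacks explicit bounds`);
    }
  }
  return problems;
}
```

Failing the build on a non-empty result list turns schema quality into a gate rather than a code-review suggestion.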

3. Bypassing Server-Side Validation

Explanation: Relying on client-side checks or model-generated safety prompts creates attack surfaces. Malicious or malformed inputs can reach backend services if the server trusts the orchestrator. Fix: Treat the MCP server as a zero-trust boundary. Validate all inputs, enforce rate limits, and implement role-based access control at the capability layer. Never assume model output is safe.
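A minimal sketch of server-side rate limiting, assuming an in-memory sliding window (a production deployment would back this with a shared store such as Redis):

```typescript
// Sliding-window rate limiter at the capability layer: a per-user request
// budget enforced server-side, regardless of what the client claims.
class UserRateLimiter {
  private requests = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  allow(userId: string, now: number = Date.now()): boolean {
    const windowStart = now - this.windowMs;
    // Keep only timestamps inside the current window
    const recent = (this.requests.get(userId) ?? []).filter(t => t > windowStart);
    if (recent.length >= this.limit) {
      this.requests.set(userId, recent);
      return false; // Over budget: reject before any business logic runs
    }
    recent.push(now);
    this.requests.set(userId, recent);
    return true;
  }
}
```

Placing this check inside the server's tool handlers, rather than in the orchestrator, keeps the guarantee intact even if a client is buggy or malicious.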

4. Resource/Tool Coupling

Explanation: Mixing read-only data retrieval with state-mutating operations in the same capability causes unpredictable behavior. Models may attempt to modify resources or treat tools as static data. Fix: Enforce strict separation. Resources must be idempotent and side-effect free. Tools must explicitly declare state changes. Document this boundary in capability descriptions.

5. Context Window Pollution

Explanation: Returning full database records or verbose error traces in tool responses consumes context tokens rapidly, degrading model performance and increasing costs. Fix: Implement token-aware response trimming. Return only essential fields. Use pagination for large datasets. Strip stack traces and internal metadata before serialization.
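One possible trimming sketch, using a rough characters-per-token heuristic rather than a real tokenizer:

```typescript
// Token-aware response trimming: keep tool output within a budget before it
// re-enters the context window. The 4-characters-per-token ratio is a crude
// approximation; a production system would use the model's actual tokenizer.
function trimToolResponse(payload: unknown, maxTokens: number): string {
  const serialized = JSON.stringify(payload);
  const approxCharBudget = maxTokens * 4;
  if (serialized.length <= approxCharBudget) return serialized;
  // Truncate and flag, so the model knows the result is partial
  return JSON.stringify({
    truncated: true,
    preview: serialized.slice(0, approxCharBudget),
    originalLength: serialized.length
  });
}
```

The explicit `truncated` flag matters: the model can then choose to paginate or narrow its query instead of reasoning over silently clipped data.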

6. Hardcoded Fallback Logic

Explanation: Embedding model-specific recovery paths in the client creates maintenance debt. When models update or switch, fallback logic breaks silently. Fix: Design error schemas that models can interpret. Return structured failure objects with actionable recovery hints. Let the model decide the next step based on standardized error codes.
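A structured failure object might look like the following sketch; the error codes and field names are illustrative, not a standard:

```typescript
// Model-readable failure object: a standardized code plus a recovery hint
// lets any model decide the next step without client-side fallback branches.
interface ToolFailure {
  success: false;
  code: 'VALIDATION_FAILED' | 'NOT_FOUND' | 'PERMISSION_DENIED' | 'RATE_LIMITED';
  detail: string;
  recoveryHint: string;
}

function validationFailure(field: string, constraint: string): ToolFailure {
  return {
    success: false,
    code: 'VALIDATION_FAILED',
    detail: `${field} violated constraint: ${constraint}`,
    recoveryHint: `Retry the call with ${field} adjusted to satisfy: ${constraint}`
  };
}
```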

7. Ignoring Execution Idempotency

Explanation: Network retries or model re-prompting can cause duplicate tool executions, leading to double charges, duplicate records, or inconsistent state. Fix: Implement idempotency keys at the server layer. Track execution fingerprints and return cached results for duplicate requests. Design tools to be safely repeatable.
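An idempotency guard can be sketched as a keyed cache wrapped around the mutation; the class below is a simplified in-memory version (a production system would use a shared store with real expiry):

```typescript
// Idempotency guard: duplicate requests with the same key return the cached
// result instead of re-executing the mutation.
class IdempotencyGuard {
  private cache = new Map<string, { result: unknown; storedAt: number }>();

  constructor(private ttlMs: number) {}

  async execute<T>(key: string, operation: () => Promise<T>): Promise<T> {
    const hit = this.cache.get(key);
    if (hit && Date.now() - hit.storedAt < this.ttlMs) {
      return hit.result as T; // Duplicate: replay cached result, no side effect
    }
    const result = await operation();
    this.cache.set(key, { result, storedAt: Date.now() });
    return result;
  }
}
```

A natural key for a tool call is a fingerprint of the tool name plus its arguments, so a model re-prompt or network retry replays the first outcome instead of mutating state twice.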

Production Bundle

Action Checklist

  • Define capability boundaries: Separate tools (mutations) from resources (read-only context)
  • Enforce strict JSON Schema validation on all tool parameters at build time
  • Implement bounded execution loops with explicit iteration limits in the orchestrator
  • Add server-side zero-trust validation: Never trust client or model input
  • Configure idempotency keys for all state-mutating operations
  • Implement token-aware response trimming to preserve context windows
  • Add explicit confirmation guardrails for high-risk operations (cancellations, payments, deletions)
  • Structure error responses as machine-readable schemas for model-driven recovery

Decision Matrix

| Scenario | Recommended Approach | Why | Cost Impact |
| --- | --- | --- | --- |
| Single-model deployment with stable tooling | Traditional function calling | Lower initial setup, direct integration | Low upfront, high maintenance as tools scale |
| Multi-model routing or frequent model swaps | MCP architecture | Model-agnostic contract, standardized transport | Higher initial setup, significantly lower long-term maintenance |
| High-security environment (financial, healthcare) | MCP with strict server guardrails | Zero-trust execution, centralized validation, audit trails | Moderate infrastructure cost, reduced compliance risk |
| Rapid prototyping or internal tools | Lightweight orchestrator with hardcoded routing | Faster iteration, less boilerplate | High technical debt, poor scalability |

Configuration Template

// mcp.production.config.ts
import { TransactionManagementServer } from './servers/TransactionManagementServer';
import { ModelOrchestrationClient } from './clients/ModelOrchestrationClient';

export const MCP_SYSTEM_CONFIG = {
  server: {
    name: 'commerce-gateway',
    version: '2.1.0',
    transport: 'stdio', // or 'http' for distributed deployments
    maxConcurrentExecutions: 50,
    timeoutMs: 8000
  },
  orchestrator: {
    maxToolIterations: 5,
    timeoutMs: 12000,
    requireConfirmationFor: ['terminate_transaction', 'refund_payment', 'update_pricing'],
    contextWindowLimit: 120000,
    responseTrimThreshold: 0.85
  },
  guardrails: {
    enableIdempotency: true,
    idempotencyTTLSeconds: 300,
    rateLimitPerUser: 30,
    rateLimitWindowSeconds: 60,
    sensitiveOperationsRequireMFA: true
  },
  observability: {
    enableExecutionTracing: true,
    logToolCalls: true,
    metricPrefix: 'mcp.commerce',
    alertOnIterationLimit: true
  }
};

export function initializeSystem(orderService: any) {
  const server = new TransactionManagementServer(orderService);
  const client = new ModelOrchestrationClient(
    server.getServerInstance(),
    null, // Inject model adapter
    MCP_SYSTEM_CONFIG.orchestrator
  );
  return { server, client, config: MCP_SYSTEM_CONFIG };
}

Quick Start Guide

  1. Initialize the capability server: Create a TypeScript project, install @modelcontextprotocol/sdk, and define your tools/resources using strict JSON Schema. Register them with explicit descriptions and parameter constraints.
  2. Build the orchestrator client: Implement a bounded execution loop that discovers capabilities, routes model decisions, enforces guardrails, and feeds results back to the model. Set iteration limits and timeout thresholds.
  3. Wire the transport layer: Choose stdio for local development or http/websocket for distributed deployments. Configure CORS, authentication, and rate limiting at the transport boundary.
  4. Deploy with observability: Enable execution tracing, log tool calls, and monitor iteration limits. Set up alerts for unbounded loops, validation failures, and idempotency collisions.
  5. Test with adversarial inputs: Verify that malformed arguments, missing parameters, and rapid retries are handled gracefully. Confirm that guardrails trigger correctly and that context windows remain stable under load.