Difficulty: Intermediate

Genkit 2.0 GA: Build and Deploy a TypeScript MCP Server to Cloud Run

By Codcompass Team·2026-05-28·9 min read

Architecting Production-Ready Agent Tools: From Local MCP Discovery to Cloud Run Deployment

Current Situation Analysis

The modern AI engineering stack has a persistent gap: local agent tooling works flawlessly in isolation, but crossing the boundary into production introduces architectural friction that most teams underestimate. Developers routinely build Model Context Protocol (MCP) servers using frameworks like Genkit, verify tool discovery over stdio, and assume the application is ready for deployment. This assumption collapses when the tool must handle authenticated traffic, integrate with cloud observability, manage dependency surface area, and enforce strict authorization policies.

The problem is overlooked because local verification masks production complexity. A stdio-based MCP server requires zero network configuration, zero IAM policies, and zero container orchestration. It also hides the reality that agent-discoverable tools are fundamentally different from authenticated service endpoints. When teams attempt to lift a local MCP server directly to Cloud Run, they encounter unbounded invocation surfaces, missing trace context, unscoped secrets, and dependency vulnerabilities that only surface under load.

Data from recent sandbox validations confirms this friction. A minimal Genkit + MCP setup using genkit@1.36.0, @genkit-ai/mcp@1.36.0, and @modelcontextprotocol/sdk@1.29.0 on Node.js v25.9.0 pulls 486 total dependencies. Running npm audit --omit=dev surfaces 23 vulnerabilities, including 7 high-severity items. While these warnings do not automatically invalidate the framework, they demonstrate that AI tooling inherits the same supply-chain risks as any internet-facing Node service. Production teams cannot treat dependency review as optional. Furthermore, the official Genkit documentation explicitly separates local MCP exposure from Cloud Run deployment, requiring startFlowServer from @genkit-ai/express, IAM authorization, Secret Manager integration, and Cloud Trace configuration. The architectural shift from local discovery to production execution is not incremental; it is structural.

WOW Moment: Key Findings

The critical insight is that MCP and Cloud Run serve fundamentally different purposes in the AI application lifecycle. MCP is a discovery and integration protocol for agents. Cloud Run is a managed execution boundary for authenticated workflows. Confusing the two leads to premature exposure, uncontrolled costs, and untraceable agent behavior.

Deployment Surface	Discovery Mechanism	Authorization Model	Observability Depth	Deployment Complexity	Ideal Stage
Local MCP (stdio)	Process launch, tool listing	None (process-bound)	Console logs only	Minimal	Development & contract validation
Remote MCP (HTTP)	Endpoint registration, agent handshake	App-level tokens or OAuth	Custom logging only	High (network, auth, scaling)	Controlled agent ecosystems
Cloud Run Flow	Direct HTTPS routing	IAM + App-level policies	Cloud Trace, Metrics, Logging	Moderate (container, IAM, secrets)	Production execution

This finding matters because it forces teams to decouple tool definition from tool exposure. You define and verify tools locally using MCP stdio. You then expose the production workflow as an authenticated Cloud Run flow endpoint. Remote MCP surfaces should only be deployed when you have a narrow client roster, explicit authorization policies, and audit trails. Treating MCP as a production API gateway is a structural anti-pattern.

Core Solution

Building a production-ready AI tool requires a phased architecture: deterministic contract definition, local MCP verification, HTTP flow transition, and platform integration. Each phase isolates a specific risk domain.

Step 1: Define a Strict, Deterministic Tool Contract

Start with a tool that has a clear business boundary and deterministic logic. Avoid model calls until the schema, validation, and client wiring are stable. Model variance obscures contract failures.

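import { genkit, z } from 'genkit/beta';
import { logger } from 'genkit/logging';

// In Genkit 1.x the log level is set on the logger, not passed to genkit().
logger.setLogLevel('debug');

const ai = genkit({});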

const complianceCheck = ai.defineTool(
  {
    name: 'policyAudit',
    description: 'Evaluates infrastructure changes against organizational compliance rules.',
    inputSchema: z.object({
      resourceType: z.enum(['compute', 'storage', 'network']),
      modificationCount: z.number().int().min(0),
      containsPublicAccess: z.boolean(),
      region: z.string().min(2),
    }),
    outputSchema: z.object({
      status: z.enum(['compliant', 'warning', 'violation']),
      flaggedRules: z.array(z.string()),
      estimatedRiskScore: z.number().min(0).max(100),
    }),
  },
  async (input) => {
    const flags: string[] = [];
    let risk = 0;

    if (input.containsPublicAccess) {
      flags.push('PUBLIC_ACCESS_DETECTED');
      risk += 40;
    }
    if (input.modificationCount > 5) {
      flags.push('HIGH_CHANGE_VOLUME');
      risk += 25;
    }
    if (input.resourceType === 'network' && input.region === 'us-east-1') {
      flags.push('RESTRICTED_REGION');
      risk += 35;
    }

    const cappedRisk = Math.min(risk, 100);
    const status = cappedRisk >= 60 ? 'violation' : cappedRisk >= 30 ? 'warning' : 'compliant';

    return {
      status,
      flaggedRules: flags,
      estimatedRiskScore: cappedRisk,
    };
  }
);

Rationale: Explicit input/output schemas prevent schema drift. Deterministic logic ensures predictable behavior during contract testing. Separating risk calculation from model inference keeps the tool auditable.
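Because the handler is deterministic, its contract can be pinned down with a table of fixed cases before any MCP wiring exists. The sketch below is a dependency-free mirror of the scoring rules shown above; scorePolicy, contractCases, and runContractCases are illustrative names, not part of Genkit, and in a real project the cases would live in whatever test runner the team already uses.

```typescript
// Hypothetical standalone mirror of the policyAudit handler, kept free of
// framework imports so the contract can be checked in isolation.
type ResourceType = 'compute' | 'storage' | 'network';
type AuditStatus = 'compliant' | 'warning' | 'violation';

interface AuditInput {
  resourceType: ResourceType;
  modificationCount: number;
  containsPublicAccess: boolean;
  region: string;
}

interface AuditOutput {
  status: AuditStatus;
  flaggedRules: string[];
  estimatedRiskScore: number;
}

function scorePolicy(input: AuditInput): AuditOutput {
  const flaggedRules: string[] = [];
  let risk = 0;
  if (input.containsPublicAccess) {
    flaggedRules.push('PUBLIC_ACCESS_DETECTED');
    risk += 40;
  }
  if (input.modificationCount > 5) {
    flaggedRules.push('HIGH_CHANGE_VOLUME');
    risk += 25;
  }
  if (input.resourceType === 'network' && input.region === 'us-east-1') {
    flaggedRules.push('RESTRICTED_REGION');
    risk += 35;
  }
  const estimatedRiskScore = Math.min(risk, 100);
  const status: AuditStatus =
    estimatedRiskScore >= 60 ? 'violation' : estimatedRiskScore >= 30 ? 'warning' : 'compliant';
  return { status, flaggedRules, estimatedRiskScore };
}

// Fixed input/expected pairs: any change to the scoring rules must update these.
const contractCases: Array<{ input: AuditInput; status: AuditStatus; score: number }> = [
  { input: { resourceType: 'compute', modificationCount: 0, containsPublicAccess: false, region: 'eu-west-1' }, status: 'compliant', score: 0 },
  { input: { resourceType: 'storage', modificationCount: 6, containsPublicAccess: false, region: 'eu-west-1' }, status: 'compliant', score: 25 },
  { input: { resourceType: 'storage', modificationCount: 1, containsPublicAccess: true, region: 'eu-west-1' }, status: 'warning', score: 40 },
  { input: { resourceType: 'network', modificationCount: 8, containsPublicAccess: true, region: 'us-east-1' }, status: 'violation', score: 100 },
];

// Returns a description of every case whose actual output differs from the expectation.
function runContractCases(): string[] {
  return contractCases
    .filter((c) => {
      const out = scorePolicy(c.input);
      return out.status !== c.status || out.estimatedRiskScore !== c.score;
    })
    .map((c) => JSON.stringify(c.input));
}
```

An empty array from runContractCases() means the handler still honors the contract; a non-empty one lists the inputs that drifted.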

Step 2: Expose via Local MCP Server

Use the Genkit MCP plugin to wrap the tool. Stdio transport is intentional: it keeps the server process-bound and eliminates network exposure during development.

import { createMcpServer } from '@genkit-ai/mcp';

const mcpInstance = createMcpServer(ai, {
  name: 'compliance-audit-agent',
  version: '1.0.0',
});

// start() falls back to the stdio transport when no transport is passed.
mcpInstance.start();

Rationale: createMcpServer automatically registers all defined tools and prompts. Stdio ensures the host process controls lifecycle and I/O, making it safe for local agent sandboxes.

Step 3: Verify with MCP SDK Client

Validate discovery and invocation before touching production infrastructure.

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

const transport = new StdioClientTransport({
  command: 'node',
  args: ['./dist/mcp-server.js'],
});

const client = new Client({ name: 'test-runner', version: '0.1.0' });
await client.connect(transport);

const tools = await client.listTools();
console.log('Available tools:', tools.tools.map(t => t.name));

const result = await client.callTool({
  name: 'policyAudit',
  arguments: {
    resourceType: 'network',
    modificationCount: 8,
    containsPublicAccess: true,
    region: 'us-east-1',
  },
});

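// This assumes the first content item is a text item carrying the JSON-encoded tool output.
const content = result.content as Array<{ type: string; text?: string }>;
console.log('Audit result:', JSON.parse(content[0]?.text ?? 'null'));
await client.close();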

Rationale: Direct SDK verification proves the contract works end-to-end. It isolates framework wiring from cloud infrastructure, making failures easier to diagnose.
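The client snippet reads result.content[0].text directly, which throws an unhelpful error if the tool reported a failure or returned non-text content. A slightly more defensive reader might look like the sketch below. ToolResultLike is a local, deliberately minimal stand-in for the MCP tool-result shape (a content array plus an optional isError flag), and parseToolJson is a hypothetical helper, not an SDK function.

```typescript
// Minimal structural subset of an MCP tool result, declared locally so the
// helper has no SDK dependency.
interface ContentItemLike {
  type: string;
  text?: unknown;
}

interface ToolResultLike {
  content?: ContentItemLike[];
  isError?: boolean;
}

// Extracts and parses the first text content item, failing loudly otherwise.
function parseToolJson(result: ToolResultLike): unknown {
  if (result.isError) {
    throw new Error('Tool call reported an error');
  }
  const firstText = (result.content ?? []).find(
    (item) => item.type === 'text' && typeof item.text === 'string'
  );
  if (!firstText) {
    throw new Error('Tool result contains no text content');
  }
  return JSON.parse(firstText.text as string);
}
```

With this in place, the final log line of the verification script could become console.log('Audit result:', parseToolJson(result)), and a failed tool call would surface as an explicit error instead of a JSON parse failure.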

Step 4: Transition to Cloud Run Flow Endpoint

For production, switch from MCP discovery to authenticated HTTPS execution. Genkit provides startFlowServer to expose flows as web endpoints.

import { startFlowServer } from '@genkit-ai/express';

// startFlowServer serves flows, not tools, so wrap the tool in a thin flow.
// The tool still validates input and output against its own schemas.
const policyAuditFlow = ai.defineFlow(
  { name: 'policyAuditFlow' },
  async (input) => complianceCheck(input)
);

// startFlowServer creates and starts its own Express app; no app.listen() is needed.
startFlowServer({
  flows: [policyAuditFlow], // Expose only the flows listed here
  port: Number(process.env.PORT) || 8080,
  cors: { origin: process.env.ALLOWED_ORIGINS?.split(',') || [] },
});

Rationale: Cloud Run expects a single HTTP entry point. startFlowServer handles routing, serialization, and error boundaries. Explicit CORS and flow whitelisting prevent unbounded exposure. IAM handles authentication at the platform level; application logic handles authorization.
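The split between platform authentication and application authorization can be made concrete with a small deny-by-default policy check. The sketch below assumes the caller's identity has already been verified upstream (for example by Cloud Run IAM) and that the application has the caller's email claim in hand; CallerClaims, FlowPolicy, and isAuthorized are hypothetical names, and how the claims reach the handler is left out.

```typescript
// Identity claims the application receives after the platform has
// authenticated the request. The shape is an assumption for illustration.
interface CallerClaims {
  email: string;
  emailVerified: boolean;
}

// Maps a flow name to the set of caller emails allowed to invoke it.
type FlowPolicy = ReadonlyMap<string, ReadonlySet<string>>;

// Deny by default: unknown flows, unverified emails, and unlisted callers
// are all rejected.
function isAuthorized(claims: CallerClaims, flowName: string, policy: FlowPolicy): boolean {
  if (!claims.emailVerified) {
    return false;
  }
  const allowedCallers = policy.get(flowName);
  if (!allowedCallers) {
    return false;
  }
  return allowedCallers.has(claims.email.toLowerCase());
}

// Example policy with a made-up service account address.
const examplePolicy: FlowPolicy = new Map([
  ['policyAuditFlow', new Set(['ci-runner@example-project.iam.gserviceaccount.com'])],
]);
```

Keeping the policy as data rather than scattered conditionals makes it reviewable and auditable, which matters once more than one service account can reach the endpoint.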

Step 5: Integrate Observability and Secrets

Production AI tools require trace context, not just application logs. Genkit Monitoring integrates with Cloud Trace, Metrics, and Logging. Secrets must never live in plaintext environment variables or source code.

import { SecretManagerServiceClient } from '@google-cloud/secret-manager';
import { enableGoogleCloudTelemetry } from '@genkit-ai/google-cloud';
import { genkit } from 'genkit/beta';
import { logger } from 'genkit/logging';

// In Genkit 1.x, export to Cloud Trace, Metrics, and Logging is enabled
// through the @genkit-ai/google-cloud plugin.
enableGoogleCloudTelemetry();
logger.setLogLevel('info');

const ai = genkit({});

// Secret retrieval via Secret Manager SDK (example); replace PROJECT_ID with your project.
const secretClient = new SecretManagerServiceClient();
const [version] = await secretClient.accessSecretVersion({
  name: 'projects/PROJECT_ID/secrets/MODEL_API_KEY/versions/latest',
});
const apiKey = version.payload?.data?.toString();

Rationale: Cloud Trace captures latency, token usage, and error paths across middleware and model calls. Secret Manager enforces least-privilege access and rotation. Separating platform auth (IAM) from application secrets (Secret Manager) reduces blast radius.
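The secret name in the snippet above is a hardcoded string with a PROJECT_ID placeholder. One way to avoid typos and make the version explicit is to build the resource name from its parts. The helper below is a hypothetical sketch: secretVersionName is not part of the Secret Manager SDK, and the validation is a simplification that accepts only 'latest' or a positive integer version.

```typescript
// Builds a Secret Manager version resource name of the form
// projects/{project}/secrets/{secret}/versions/{version}.
function secretVersionName(
  projectId: string,
  secretId: string,
  version: number | 'latest' = 'latest'
): string {
  if (projectId.trim() === '') {
    throw new Error('projectId is required');
  }
  // Secret IDs are limited to letters, digits, hyphens, and underscores.
  if (!/^[A-Za-z0-9_-]{1,255}$/.test(secretId)) {
    throw new Error(`invalid secret id: ${secretId}`);
  }
  if (version !== 'latest' && (!Number.isInteger(version) || version < 1)) {
    throw new Error(`invalid secret version: ${version}`);
  }
  return `projects/${projectId}/secrets/${secretId}/versions/${version}`;
}
```

The result can be passed as the name field of accessSecretVersion, with the project ID read from deployment configuration rather than embedded in source.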

Pitfall Guide

1. Premature Remote Exposure

Explanation: A stdio MCP server has no HTTP listener, so it cannot serve Cloud Run traffic as-is. Teams that work around this by bolting on an unauthenticated HTTP transport put tool discovery and invocation on the public internet, leading to uncontrolled costs and data leakage. Fix: Always transition to startFlowServer for production. Use IAM to gate access. Reserve MCP stdio for local development and controlled agent sandboxes.

2. Dependency Surface Neglect

Explanation: A minimal Genkit + MCP setup pulls ~486 dependencies. npm audit frequently surfaces high-severity vulnerabilities in transitive packages. Dismissing these because "the demo works" creates supply-chain risk in production. Fix: Pin exact versions in package.json and commit the lockfile. Run npm audit in CI/CD. Use npm overrides (or Yarn resolutions) to patch vulnerable transitive dependencies. Treat AI tooling like any other internet-facing service.
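Running npm audit in CI is only useful if something decides whether the result blocks the build. A minimal gate over the JSON report could look like the sketch below. It assumes the npm 7+ report layout, where severity counts live under metadata.vulnerabilities; auditGate is a hypothetical helper, and the sample counts in the usage note are invented for illustration, not measurements.

```typescript
const SEVERITIES = ['info', 'low', 'moderate', 'high', 'critical'] as const;
type Severity = (typeof SEVERITIES)[number];

// Only the part of the npm audit JSON report that the gate reads.
interface AuditReportLike {
  metadata?: { vulnerabilities?: Partial<Record<Severity, number>> };
}

// Counts vulnerabilities at or above the threshold severity and reports
// whether the build should pass.
function auditGate(
  rawJson: string,
  threshold: Severity = 'high'
): { pass: boolean; blocking: number } {
  const report = JSON.parse(rawJson) as AuditReportLike;
  const counts = report.metadata?.vulnerabilities;
  if (!counts) {
    throw new Error('Unexpected npm audit JSON: metadata.vulnerabilities is missing');
  }
  const blocking = SEVERITIES.slice(SEVERITIES.indexOf(threshold)).reduce(
    (sum, level) => sum + (counts[level] ?? 0),
    0
  );
  return { pass: blocking === 0, blocking };
}
```

Fed the output of npm audit --omit=dev --json, a report with counts such as low 2, moderate 3, high 1 would fail the default gate with one blocking finding, while the same report would pass with the threshold raised to 'critical'.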

3. Observability Afterthought

Explanation: Agent failures rarely produce clean stack traces. Without Cloud Trace, you cannot reconstruct which input schema arrived, which middleware intercepted it, or whether a model call was blocked or transformed. Fix: Enable Cloud Trace export from day one. Configure trace sampling to balance cost and visibility. Export metrics to Cloud Monitoring. Never deploy without trace context.
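To make the sampling trade-off concrete, the sketch below shows the general idea behind ratio-based sampling: derive a stable number from the trace ID and keep the trace if it falls under the configured ratio, so that every service seeing the same trace ID makes the same decision. This is an illustration of the concept only, not the sampler that Genkit or OpenTelemetry ships; shouldSample is a hypothetical function.

```typescript
// Decides whether to keep a trace, given its hex trace ID and a sampling
// ratio between 0 and 1. The decision is deterministic per trace ID.
function shouldSample(traceId: string, ratio: number): boolean {
  if (ratio <= 0) {
    return false;
  }
  if (ratio >= 1) {
    return true;
  }
  // Interpret the last 8 hex characters (32 bits) as a value in [0, 1).
  const low32 = parseInt(traceId.slice(-8), 16);
  if (Number.isNaN(low32)) {
    return false; // Malformed IDs are dropped rather than guessed at.
  }
  return low32 / 0x100000000 < ratio;
}
```

A low ratio keeps tracing costs down on high-volume endpoints, while error paths are usually worth capturing at a higher rate; how that is configured depends on the telemetry plugin in use.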

4. Overly Broad Tool Contracts

Explanation: MCP makes tools discoverable. If a tool accepts generic inputs like command: string or query: string, agents can trigger unintended side effects. Broad schemas bypass validation and increase attack surface. Fix: Use strict enums, bounded numbers, and explicit required fields. Validate inputs at the schema level before business logic executes. Document tool boundaries in descriptions.
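The difference between a strict and a broad contract is easiest to see when a generic payload is thrown at the validator. The sketch below is a dependency-free mirror of the policyAudit input schema from Step 1, written by hand so it runs without zod; validateAuditInput is a hypothetical helper. Unlike a default zod object, it also rejects unknown fields, which corresponds to zod's strict mode.

```typescript
const RESOURCE_TYPES = ['compute', 'storage', 'network'] as const;
type ResourceType = (typeof RESOURCE_TYPES)[number];

interface AuditInput {
  resourceType: ResourceType;
  modificationCount: number;
  containsPublicAccess: boolean;
  region: string;
}

type ValidationResult =
  | { ok: true; value: AuditInput }
  | { ok: false; errors: string[] };

const ALLOWED_KEYS = ['resourceType', 'modificationCount', 'containsPublicAccess', 'region'];

// Validates an untrusted payload against the audit contract and collects
// every violation instead of stopping at the first one.
function validateAuditInput(raw: unknown): ValidationResult {
  if (typeof raw !== 'object' || raw === null || Array.isArray(raw)) {
    return { ok: false, errors: ['input must be an object'] };
  }
  const record = raw as Record<string, unknown>;
  const errors: string[] = [];

  for (const key of Object.keys(record)) {
    if (!ALLOWED_KEYS.includes(key)) errors.push(`unexpected field: ${key}`);
  }
  const { resourceType, modificationCount, containsPublicAccess, region } = record;
  if (!RESOURCE_TYPES.some((t) => t === resourceType)) {
    errors.push('resourceType must be compute, storage, or network');
  }
  if (typeof modificationCount !== 'number' || !Number.isInteger(modificationCount) || modificationCount < 0) {
    errors.push('modificationCount must be a non-negative integer');
  }
  if (typeof containsPublicAccess !== 'boolean') {
    errors.push('containsPublicAccess must be a boolean');
  }
  if (typeof region !== 'string' || region.length < 2) {
    errors.push('region must be a string of at least 2 characters');
  }

  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: record as unknown as AuditInput };
}
```

A payload such as { command: 'rm -rf /' } fails on every field and is flagged for the unexpected command key, so nothing reaches the business logic.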

5. Middleware as Security Proxy

Explanation: Genkit middleware is designed for cross-cutting concerns: retries, logging, redaction, and policy gates. It is not a substitute for IAM, Secret Manager, or application-level authorization. Relying on middleware for security creates fragile, unenforceable boundaries. Fix: Use IAM for platform authentication. Use Secret Manager for credential access. Use middleware for telemetry, input sanitization, and fallback routing. Keep security layers separate.

6. Ignoring Provider Fallbacks

Explanation: Tying a tool to a single model provider (e.g., Gemini API) creates vendor lock-in and single-point-of-failure risk. If the provider rate-limits or deprecates an endpoint, the tool fails silently. Fix: Abstract model calls behind a provider interface. Use Genkit's multi-provider support to configure fallbacks. Test failure paths explicitly.
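A provider interface with ordered fallback can be sketched in a few lines. The code below is framework-agnostic and does not use Genkit's own provider configuration; ModelProvider and generateWithFallback are hypothetical names, and real providers would wrap actual SDK calls inside generate.

```typescript
// Minimal provider abstraction: anything that can turn a prompt into text.
interface ModelProvider {
  name: string;
  generate(prompt: string): Promise<string>;
}

// Tries each provider in order and returns the first successful response.
// Failures are collected so that a total outage is reported loudly rather
// than swallowed.
async function generateWithFallback(
  providers: readonly ModelProvider[],
  prompt: string
): Promise<{ provider: string; text: string }> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      const text = await provider.generate(prompt);
      return { provider: provider.name, text };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${provider.name}: ${reason}`);
    }
  }
  throw new Error(`All providers failed: ${failures.join('; ')}`);
}
```

Testing the failure path then amounts to passing a stub provider whose generate always rejects and asserting that the second provider's name comes back in the result.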

7. Treating Local Verification as Production Readiness

Explanation: A successful local MCP call proves wiring, not deployment. It does not validate IAM policies, container startup scripts, secret injection, or trace export. Assuming parity leads to runtime failures in production. Fix: Maintain separate test suites for local contract validation and production integration. Verify Cloud Run deployment with authenticated requests and trace inspection before marking as ready.

Production Bundle

Action Checklist

Define tool contracts with strict input/output schemas before adding model calls
Verify local MCP discovery using createMcpServer and stdio transport
Run npm audit --omit=dev and resolve high-severity vulnerabilities before deployment
Transition to startFlowServer for Cloud Run; never expose stdio MCP to the internet
Configure IAM policies to restrict flow endpoint access to authorized service accounts
Store API keys and secrets in Secret Manager; never use plaintext environment variables
Enable Cloud Trace export and configure sampling rates for production workloads
Validate deployment with authenticated HTTPS requests and inspect Cloud Trace records

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Local agent development & contract testing	MCP stdio server via createMcpServer	Zero network overhead, process-bound, fast iteration	None (local only)
Controlled internal agent ecosystem	Remote MCP over HTTP with OAuth	Enables agent discovery while maintaining app-level auth	Moderate (network, auth infra)
Production workflow execution	Cloud Run flow endpoint via startFlowServer	IAM enforcement, Cloud Trace integration, managed scaling	Low-Moderate (Cloud Run + Trace)
High-risk tool with strict compliance	Deterministic schema + middleware policy gates	Prevents unintended side effects, enforces business rules	Low (compute only)
Multi-provider model dependency	Abstracted provider interface + fallback routing	Prevents vendor lock-in, improves resilience	Moderate (multi-provider billing)

Configuration Template

// package.json scripts
{
  "scripts": {
    "build": "tsc",
    "start": "node dist/flow-server.js",
    "dev:mcp": "node --loader ts-node/esm src/mcp-server.ts",
    "lint": "eslint src/",
    "audit:check": "npm audit --omit=dev --json"
  }
}

// src/flow-server.ts
import { startFlowServer } from '@genkit-ai/express';
import { enableGoogleCloudTelemetry } from '@genkit-ai/google-cloud';
import { genkit } from 'genkit/beta';
import { logger } from 'genkit/logging';

// Export traces, metrics, and logs to Google Cloud.
enableGoogleCloudTelemetry();
logger.setLogLevel(process.env.NODE_ENV === 'production' ? 'info' : 'debug');

const ai = genkit({});

const policyAudit = ai.defineTool({ name: 'policyAudit', /* ... */ }, /* ... */);

// startFlowServer serves flows, so wrap the tool in a thin flow.
const policyAuditFlow = ai.defineFlow(
  { name: 'policyAuditFlow' },
  async (input) => policyAudit(input)
);

// startFlowServer creates and starts its own Express app on the given port.
startFlowServer({
  flows: [policyAuditFlow],
  port: Number(process.env.PORT) || 8080,
  cors: { origin: process.env.ALLOWED_ORIGINS?.split(',') || [] },
});

# Dockerfile for Cloud Run
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY dist/ ./dist/
ENV NODE_ENV=production
EXPOSE 8080
CMD ["npm", "start"]

Quick Start Guide

Initialize project: npm init -y && npm install genkit@1.36.0 @genkit-ai/mcp@1.36.0 @modelcontextprotocol/sdk@1.29.0 typescript ts-node
Create MCP server: Write a deterministic tool with strict schemas, wrap it with createMcpServer, and start it over stdio.
Verify locally: Build the server, then run the MCP SDK client, which launches the server as a child process over stdio, and confirm tool listing and invocation succeed.
Audit dependencies: Run npm audit --omit=dev, resolve high-severity issues, and pin versions in package.json.
Prepare for Cloud Run: Replace the stdio server with startFlowServer from @genkit-ai/express, add package.json start/build scripts, and containerize with the provided Dockerfile. Deploy using gcloud run deploy with --no-allow-unauthenticated so that IAM gates every request.
