Spring AI 2.0 MCP Annotations: From Tool to Production
Building Scalable MCP Servers with Spring AI 2.0: Annotation-Driven Architecture
Current Situation Analysis
The Model Context Protocol (MCP) has rapidly become the standard interface for connecting AI agents to external data and execution environments. While Python and TypeScript ecosystems adopted MCP server patterns early, Java developers faced a steep integration curve. Building an MCP server in Java traditionally required manual protocol wiring: constructing ToolCallback descriptors, hand-crafting JSON Schema definitions, managing transport lifecycles, and explicitly registering every endpoint with the runtime. This boilerplate consumed development cycles that should have been spent on domain logic.
The problem is often misunderstood as a framework limitation rather than an architectural mismatch. MCP expects a clean, schema-driven contract between client and server. When developers manually serialize parameters, manage transport state, and inject progress channels, they introduce fragility. Schema drift, transport timeouts, and context leakage become routine production issues. The mental model shifts from "implementing business capabilities" to "maintaining protocol compliance."
Spring AI 2.0.0-M6 (released May 2026) addresses this friction by introducing a native annotation layer. The framework now exposes @McpTool, @McpResource, @McpPrompt, and @McpComplete as first-class constructs. Spring Boot's auto-configuration scans these annotations, generates compliant JSON schemas from method signatures, and wires transport layers automatically. Framework-specific parameters like McpSyncRequestContext and McpAsyncRequestContext are recognized internally and stripped from client-facing schemas. This shifts the development paradigm from imperative protocol management to declarative capability exposure.
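To make "schemas from method signatures" concrete, here is a minimal sketch of the idea in plain JDK reflection. It is not Spring AI's implementation: `Param`, `RequestContext`, and `SchemaSketch` are hypothetical stand-ins invented for this illustration, and the generated schema covers only a handful of primitive types.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Spring AI types.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
@interface Param {
    String name();
    String description();
    boolean required() default true;
}

/** Stand-in for a framework-owned context type that must never reach the schema. */
final class RequestContext { }

final class SchemaSketch {

    /** Builds a minimal JSON Schema string for the client-facing parameters of a method. */
    static String schemaFor(Method method) {
        List<String> properties = new ArrayList<>();
        List<String> required = new ArrayList<>();
        for (Parameter p : method.getParameters()) {
            if (p.getType() == RequestContext.class) {
                continue; // framework parameter: skipped, so it never appears in the contract
            }
            Param meta = p.getAnnotation(Param.class);
            if (meta == null) {
                throw new IllegalStateException("Unannotated parameter on " + method.getName());
            }
            // Note: no JSON escaping here; a real generator would escape names and descriptions.
            properties.add("\"" + meta.name() + "\":{\"type\":\"" + jsonType(p.getType())
                    + "\",\"description\":\"" + meta.description() + "\"}");
            if (meta.required()) {
                required.add("\"" + meta.name() + "\"");
            }
        }
        return "{\"type\":\"object\",\"properties\":{" + String.join(",", properties)
                + "},\"required\":[" + String.join(",", required) + "]}";
    }

    private static String jsonType(Class<?> type) {
        if (type == String.class) return "string";
        if (type == int.class || type == Integer.class || type == long.class || type == Long.class) return "integer";
        if (type == double.class || type == Double.class) return "number";
        if (type == boolean.class || type == Boolean.class) return "boolean";
        return "object";
    }

    // Example capability: the context parameter is present in Java but invisible to clients.
    static String auditStock(RequestContext ctx,
                             @Param(name = "warehouseId", description = "Target warehouse identifier") String warehouseId) {
        return "audited " + warehouseId;
    }

    /** Generates the schema for the example capability above. */
    static String exampleSchema() {
        try {
            return schemaFor(SchemaSketch.class.getDeclaredMethod("auditStock", RequestContext.class, String.class));
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(exampleSchema());
    }
}
```

The point of the sketch is the `continue` branch: the context type is recognized by its class and dropped before any schema property is emitted, which is the behavior the article attributes to the annotation layer.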
WOW Moment: Key Findings
The transition from callback-based wiring to annotation-driven registration fundamentally changes how MCP servers are architected in Java. The following comparison highlights the operational impact:
| Approach | Schema Generation | Transport Wiring | Progress Integration | Code Footprint |
|---|---|---|---|---|
| Legacy ToolCallback API | Manual JSON construction via JsonSchemaObject | Explicit bean registration & transport config | Custom event emitter wiring | High (~120-180 lines per capability) |
| Spring AI 2.0 Annotations | Automatic from method signatures & @McpToolParam | Auto-configured via starters | Context injection (framework-hidden) | Low (~10-20 lines per capability) |
This reduction in boilerplate is not merely cosmetic. Automatic schema generation eliminates serialization mismatches that cause client-side validation failures. Context injection ensures progress and logging channels remain invisible to the MCP contract, preventing schema pollution. The annotation layer also enforces consistent error boundaries, making it easier to standardize how tools report failures to AI agents. Teams can now iterate on business logic without rewriting protocol adapters for every new endpoint.
Core Solution
Building a production-ready MCP server with Spring AI 2.0 requires aligning domain capabilities with the annotation model while making deliberate transport and serialization choices. The following implementation demonstrates a warehouse inventory system exposing tools, resources, and prompts.
Step 1: Project Initialization
Spring AI 2.0 requires Java 21 and Spring Boot 3.5+. Milestone artifacts reside in Spring's milestone repository, which must be explicitly declared. For synchronous I/O workloads (typical for relational database interactions), the WebMVC starter provides the most straightforward deployment path.
<repositories>
    <repository>
        <id>spring-milestones</id>
        <url>https://repo.spring.io/milestone</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId>
        <version>2.0.0-M6</version>
    </dependency>
</dependencies>
Step 2: Exposing Tools with Context-Aware Progress
Tools execute actions. In production, long-running operations require progress feedback to prevent client timeouts. Spring AI injects McpSyncRequestContext automatically, keeping it out of the JSON schema while exposing logging and progress channels.
package com.example.warehouse.mcp;

import org.springframework.ai.mcp.server.annotation.McpTool;
import org.springframework.ai.mcp.server.annotation.McpToolParam;
import org.springframework.ai.mcp.server.context.McpSyncRequestContext;
import org.springframework.stereotype.Component;

@Component
public class InventoryOperations {

    private final InventoryRepository inventoryRepo;

    public InventoryOperations(InventoryRepository inventoryRepo) {
        this.inventoryRepo = inventoryRepo;
    }

    @McpTool(
        name = "audit_stock_levels",
        description = "Perform a full warehouse stock audit. Returns aggregated metrics per aisle."
    )
    public AuditReport auditStock(
        McpSyncRequestContext ctx,
        @McpToolParam(description = "Target warehouse identifier", required = true) String warehouseId
    ) {
        ctx.logging().info("Initiating stock audit for warehouse: {}", warehouseId);
        ctx.progress().report(0.1, "Fetching inventory manifests");
        var manifests = inventoryRepo.fetchManifests(warehouseId);

        ctx.progress().report(0.4, "Validating SKU counts");
        var validated = inventoryRepo.validateQuantities(manifests);

        ctx.progress().report(0.8, "Computing discrepancy metrics");
        return inventoryRepo.generateAuditReport(validated);
    }
}
Architecture Rationale:
- Record-based DTOs: AuditReport should be a Java record. Records provide immutable state, predictable Jackson serialization, and implicit schema mapping. This eliminates the ambiguity of Map<String, Object> returns while avoiding verbose getter/setter chains.
- Context Injection: McpSyncRequestContext is recognized by the framework and excluded from the MCP JSON schema. This keeps the client contract clean while providing access to logging(), progress(), sampling(), and elicitation() channels.
- Progress vs Streaming: Progress reporting is metadata for UI/agent feedback. It does not stream incremental results. The method still returns a single payload upon completion. For token-by-token LLM streaming, use sampling endpoints instead.
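The difference between progress metadata and the actual result is easy to see in a framework-free sketch. Everything here is an assumption made for illustration: `ProgressSink` stands in for the context's progress channel, and the `AuditReport` fields are invented because the article does not show them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical result type; the fields are assumptions, not the article's AuditReport. */
record AuditReport(String warehouseId, int aislesAudited, int discrepancies) { }

/** Hypothetical stand-in for a progress channel; not the Spring AI context API. */
interface ProgressSink {
    void report(double fraction, String message);
}

final class ProgressSketch {

    /** Emits progress notifications along the way, but produces exactly one result. */
    static AuditReport auditStock(ProgressSink progress, String warehouseId) {
        progress.report(0.1, "Fetching inventory manifests");
        List<Integer> countsPerAisle = List.of(40, 38, 41); // placeholder data, expected count is 40

        progress.report(0.4, "Validating SKU counts");
        int discrepancies = 0;
        for (int count : countsPerAisle) {
            if (count != 40) {
                discrepancies++;
            }
        }

        progress.report(0.8, "Computing discrepancy metrics");
        // The report is built once and returned once; nothing partial was sent before this line.
        return new AuditReport(warehouseId, countsPerAisle.size(), discrepancies);
    }

    public static void main(String[] args) {
        List<String> notifications = new ArrayList<>();
        AuditReport report = auditStock((fraction, msg) -> notifications.add(fraction + " " + msg), "wh-7");
        System.out.println(notifications.size() + " progress notifications, one result: " + report);
    }
}
```

Because the result type is a record, its shape is fixed at compile time, which is what makes a generated output schema predictable.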
Step 3: Exposing Resources and Prompts
Resources expose readable state via URI templates. Prompts provide pre-structured text templates that clients can parameterize.
package com.example.warehouse.mcp;

import org.springframework.ai.mcp.server.annotation.McpPrompt;
import org.springframework.ai.mcp.server.annotation.McpResource;
import org.springframework.ai.mcp.server.annotation.McpToolParam;
import org.springframework.stereotype.Component;

@Component
public class WarehouseSurface {

    // The {warehouseId} template variable binds to the method parameter of the same name.
    @McpResource(
        uri = "warehouse://{warehouseId}/layout",
        description = "Returns the physical layout configuration for a specific warehouse."
    )
    public LayoutConfig getLayout(String warehouseId) {
        return LayoutConfig.fromId(warehouseId);
    }

    @McpPrompt(
        name = "reorder_analysis",
        description = "Generates a prompt template for analyzing low-stock reorder thresholds."
    )
    public String generateReorderPrompt(
        @McpToolParam(description = "SKU code to analyze", required = true) String skuCode
    ) {
        return """
            Analyze current stock levels for SKU %s.
            Evaluate historical demand velocity, supplier lead times, and safety stock policies.
            Recommend a reorder quantity and trigger threshold.
            """.formatted(skuCode);
    }
}
Architecture Rationale:
- URI Template Matching: The {warehouseId} placeholder is resolved automatically. Method parameter names must align with template variables, or Spring AI will fail to bind the path segment.
- Prompt Rendering: Prompts return plain strings. The client receives the rendered template and injects it into its model context. This decouples prompt engineering from tool execution.
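For intuition about what "resolved automatically" involves, here is a small regex-based sketch of URI template matching. It is a simplification written for this article, not the resolver Spring AI uses: `UriTemplateSketch` is a hypothetical name, and it assumes each placeholder spans exactly one path segment.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UriTemplateSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Returns the variable bindings if the URI fits the template, or empty if it does not. */
    static Optional<Map<String, String>> match(String template, String uri) {
        List<String> names = new ArrayList<>();
        StringBuilder regex = new StringBuilder();
        Matcher m = PLACEHOLDER.matcher(template);
        int last = 0;
        while (m.find()) {
            // Literal text between placeholders is quoted so it is matched verbatim.
            regex.append(Pattern.quote(template.substring(last, m.start())));
            regex.append("([^/]+)"); // assumption: one path segment per placeholder
            names.add(m.group(1));
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));

        Matcher uriMatcher = Pattern.compile(regex.toString()).matcher(uri);
        if (!uriMatcher.matches()) {
            return Optional.empty();
        }
        Map<String, String> bindings = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            bindings.put(names.get(i), uriMatcher.group(i + 1));
        }
        return Optional.of(bindings);
    }

    public static void main(String[] args) {
        String template = "warehouse://{warehouseId}/layout";
        System.out.println(match(template, "warehouse://east-01/layout"));
        System.out.println(match(template, "warehouse://east-01/stock"));
    }
}
```

The keys of the returned map are the placeholder names, which is why a method parameter with a different name has nothing to bind to.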
Pitfall Guide
1. Context Parameter Schema Leakage
Explanation: Developers sometimes place McpSyncRequestContext or McpAsyncRequestContext after user-facing parameters, or attempt to serialize them manually. While Spring AI typically filters framework types, inconsistent ordering or custom serializers can cause context objects to leak into the JSON schema, breaking client validation.
Fix: Always declare context parameters first in the method signature. Verify the generated schema before deployment by sending a tools/list request to the server (for example with the MCP Inspector) and checking each tool's input schema. Never annotate context parameters with @McpToolParam.
2. Blocking I/O in Reactive Contexts
Explanation: Using McpAsyncRequestContext with blocking database calls or synchronous HTTP clients defeats the purpose of reactive transport. This causes thread pool exhaustion under concurrent tool invocations.
Fix: Pair McpAsyncRequestContext with reactive repositories (ReactiveCrudRepository) or offload blocking work to Schedulers.boundedElastic(). Ensure the return type is Mono<T> or Flux<T>.
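The offloading idea can be sketched without Reactor. The example below uses CompletableFuture and a dedicated executor as a JDK-only analog; in a Reactor codebase the equivalent shape would be Mono.fromCallable(...).subscribeOn(Schedulers.boundedElastic()). The class name, pool size, and the fake repository call are all assumptions for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class OffloadSketch {

    /** Dedicated pool for blocking calls, kept separate from request-handling threads. */
    private static final ExecutorService BLOCKING_POOL = Executors.newFixedThreadPool(4, runnable -> {
        Thread t = new Thread(runnable, "blocking-io");
        t.setDaemon(true); // idle workers should not keep the JVM alive
        return t;
    });

    /** Placeholder for a blocking repository call (JDBC, synchronous HTTP, and so on). */
    static String fetchManifestBlocking(String warehouseId) {
        try {
            Thread.sleep(50); // simulated I/O latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "manifest:" + warehouseId + "@" + Thread.currentThread().getName();
    }

    /** Returns immediately; the blocking work runs on the dedicated pool. */
    static CompletableFuture<String> fetchManifestAsync(String warehouseId) {
        return CompletableFuture.supplyAsync(() -> fetchManifestBlocking(warehouseId), BLOCKING_POOL);
    }

    public static void main(String[] args) {
        // join() is used here only to print the result; a tool method would return the future-like type.
        System.out.println(fetchManifestAsync("wh-7").join());
    }
}
```

The thread name in the returned string shows where the blocking call actually ran, which is a cheap way to confirm in logs that request threads are not doing the waiting.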
3. Misinterpreting Progress as Result Streaming
Explanation: Teams expect ctx.progress().report() to stream partial JSON payloads or token chunks. Progress is strictly metadata for agent UI feedback. The actual tool result remains a single serialized object.
Fix: Use progress for long-running batch operations. For incremental LLM output, implement sampling endpoints or switch to a streaming-capable transport with explicit token chunking.
4. Stateful Transport in Serverless Environments
Explanation: Deploying streamable-http to stateless platforms (Cloud Run, Vercel, AWS Lambda) without session management causes request routing failures. The protocol expects session affinity for bidirectional notifications.
Fix: Use stateless-streamable-http for serverless deployments. Pass session tokens explicitly in request headers, or implement a lightweight session registry backed by Redis if stateful behavior is mandatory.
5. URI Template Variable Mismatch
Explanation: @McpResource fails silently or throws binding errors when path variables do not match method parameters. Typos in {variable} syntax or missing @PathVariable equivalents break resource resolution.
Fix: Ensure exact string matching between URI template placeholders and method parameter names. Use IDE refactoring tools to rename parameters safely. Validate resource listing endpoints before client integration.
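A check like this can run in a unit test so that a renamed parameter fails the build rather than the first client request. This is a hypothetical helper, not part of Spring AI; it takes the parameter names as an explicit set because reading them reflectively requires compiling with the -parameters flag.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateCheckSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}/]+)\\}");

    /** Returns the template placeholders that have no same-named method parameter. */
    static Set<String> unboundVariables(String template, Set<String> parameterNames) {
        Set<String> unbound = new LinkedHashSet<>();
        Matcher m = PLACEHOLDER.matcher(template);
        while (m.find()) {
            String variable = m.group(1);
            if (!parameterNames.contains(variable)) { // comparison is case-sensitive on purpose
                unbound.add(variable);
            }
        }
        return unbound;
    }

    /** Fails fast with a readable message when a placeholder cannot be bound. */
    static void requireBound(String template, Set<String> parameterNames) {
        Set<String> unbound = unboundVariables(template, parameterNames);
        if (!unbound.isEmpty()) {
            throw new IllegalStateException(
                "Template " + template + " has placeholders without matching parameters: " + unbound);
        }
    }

    public static void main(String[] args) {
        String template = "warehouse://{warehouseId}/layout";
        System.out.println(unboundVariables(template, Set.of("warehouseId"))); // nothing unbound
        System.out.println(unboundVariables(template, Set.of("warehouseID"))); // casing typo is caught
    }
}
```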
6. Unhandled Exception Propagation
Explanation: Throwing unchecked exceptions (NullPointerException, IllegalArgumentException) without MCP-aware wrapping corrupts the JSON-RPC response. Clients receive malformed errors instead of structured failure payloads.
Fix: Catch domain exceptions and wrap them in McpError or return a standardized error DTO. Implement a global @ControllerAdvice that translates exceptions into MCP-compliant error responses with clear, agent-readable messages.
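One way to picture the "standardized error DTO" option is a small guard that every tool body runs through. The sketch below is framework-free and entirely hypothetical: `ToolOutcome` and `ErrorBoundarySketch` are invented names, and the isError flag simply mirrors the field of the same name that the MCP specification defines on tool call results.

```java
import java.util.function.Supplier;

/** Hypothetical result envelope; not a Spring AI or MCP SDK type. */
record ToolOutcome<T>(T value, boolean isError, String message) {

    static <T> ToolOutcome<T> ok(T value) {
        return new ToolOutcome<>(value, false, null);
    }

    static <T> ToolOutcome<T> failure(String message) {
        return new ToolOutcome<>(null, true, message);
    }
}

final class ErrorBoundarySketch {

    /** Runs a tool body and converts any runtime failure into an agent-readable outcome. */
    static <T> ToolOutcome<T> guarded(String toolName, Supplier<T> body) {
        try {
            return ToolOutcome.ok(body.get());
        } catch (IllegalArgumentException e) {
            // Input problems are safe to echo back: the agent can correct them and retry.
            return ToolOutcome.failure(toolName + " rejected the input: " + e.getMessage());
        } catch (RuntimeException e) {
            // Anything else stays generic so stack traces and internals do not reach the agent.
            return ToolOutcome.failure(toolName + " failed unexpectedly; retry later or contact the operator");
        }
    }

    public static void main(String[] args) {
        System.out.println(guarded("audit_stock_levels", () -> "report for wh-7"));
        ToolOutcome<String> failed = guarded("audit_stock_levels", () -> {
            throw new IllegalArgumentException("unknown warehouse wh-0");
        });
        System.out.println(failed);
    }
}
```

Keeping the two catch branches separate is the design choice that matters: validation failures carry actionable detail, while unexpected failures carry none.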
7. Missing Milestone Repository Configuration
Explanation: Spring AI 2.0 milestones are not published to Maven Central. Omitting the milestone repository causes dependency resolution failures during build.
Fix: Explicitly declare https://repo.spring.io/milestone in pom.xml or build.gradle. Pin the exact milestone version (2.0.0-M6) to avoid unexpected snapshot breaks.
Production Bundle
Action Checklist
- Verify Java 21 and Spring Boot 3.5+ runtime compatibility before dependency resolution
- Declare Spring milestone repository explicitly in build configuration
- Replace manual ToolCallback registrations with @McpTool annotations
- Use Java records for all tool return types to ensure predictable schema generation
- Inject McpSyncRequestContext or McpAsyncRequestContext as the first method parameter
- Validate generated JSON schemas via a tools/list request before client integration
- Configure transport strategy based on deployment topology (stateful vs stateless)
- Implement global exception translation to MCP-compliant error responses
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Local CLI agent tooling | stdio transport | Zero network overhead, process isolation, ideal for single-user development | None (local only) |
| Internal microservice API | streamable-http (WebMVC) | Session affinity, bidirectional notifications, reverse-proxy compatible | Moderate (requires sticky sessions or session store) |
| Serverless / autoscaled deployment | stateless-streamable-http | No session affinity required, horizontal scaling friendly, request-driven | Low (pay-per-invocation, no session infrastructure) |
| High-throughput reactive workloads | streamable-http (WebFlux) | Non-blocking I/O, backpressure support, efficient thread utilization | Higher (requires reactive ecosystem alignment) |
Configuration Template
spring:
  ai:
    mcp:
      server:
        transport: streamable-http
        path: /api/mcp
        name: warehouse-inventory-server
        version: 1.2.0
        stateless: false
        logging:
          level: INFO
  jackson:
    serialization:
      write-dates-as-timestamps: false
      fail-on-empty-beans: false
    deserialization:
      fail-on-unknown-properties: true
Quick Start Guide
- Initialize Project: Create a Spring Boot 3.5+ project with Java 21. Add the spring-ai-starter-mcp-server-webmvc dependency and declare the Spring milestone repository.
- Define Capabilities: Annotate service methods with @McpTool, @McpResource, or @McpPrompt. Inject context parameters where progress or logging is required. Return Java records for predictable serialization.
- Configure Transport: Set spring.ai.mcp.server.transport to streamable-http for local testing. Adjust path, name, and version to match your deployment conventions.
- Validate Endpoints: Start the application and send a tools/list request to http://localhost:8080/api/mcp, for example with the MCP Inspector. Verify that schemas match method signatures and that context parameters are excluded.
- Connect Client: Configure your MCP client (Claude Code, custom agent, or test harness) to point to the server endpoint. Execute tool calls and monitor progress channels in the client UI.
