Grounding Maritime Intelligence: From Hallucination to Auditable Data Streams

Current Situation Analysis

The maritime industry operates on a foundation of highly structured, heavily regulated datasets. Port authorities, charterers, compliance officers, and P&I clubs routinely cross-reference AIS tracking feeds, emissions registries, and port state control databases to make time-sensitive decisions. The scale is non-trivial: a hub like Rotterdam processes approximately 27,000 seagoing vessel calls annually, which works out to roughly one vessel call every 19 minutes, around the clock. Behind each call sits a complex web of regulatory obligations, operational constraints, and financial exposures.

Large language models promise to collapse this complexity into natural-language queries. Describe what you need, and receive a structured briefing in seconds. The reality, however, exposes a critical architectural flaw: without live data grounding, LLMs default to pattern completion. They will confidently invent IMO numbers, fabricate inspection histories, or misstate regulatory thresholds. In maritime operations, a hallucinated deficiency count or an incorrect emissions figure is not a minor UX glitch: it can trigger compliance failures, mispriced charter rates, or unnecessary voyage delays.

The industry often misunderstands the role of tool calling. Many teams treat LLMs as standalone knowledge engines, wrapping them in custom API clients that manually stitch responses together. This approach fractures context, duplicates orchestration logic, and creates brittle pipelines that break when upstream schemas change. The missing layer isn't better prompting; it's a standardized execution protocol that decouples reasoning from I/O, enforces schema validation, and enables autonomous, auditable data retrieval.

WOW Moment: Key Findings

When an LLM is coupled with a standardized tool-calling protocol and live maritime APIs, the workflow shifts from manual data reconciliation to autonomous intelligence synthesis. The model doesn't just fetch data; it plans a research trajectory, parallelizes independent lookups, validates responses against regulatory frameworks, and structures the output for immediate operational use.

Approach	Development Overhead	Data Freshness	Regulatory Traceability
Custom API Orchestration	High (manual routing, error handling, schema mapping)	Stale (batch pulls, cached responses)	Low (hardcoded logic, difficult to audit)
MCP-Grounded LLM Workflow	Low (standardized tool registry, autonomous planning)	Real-time (live endpoint calls, on-demand)	High (explicit tool chain, auditable call logs)

This finding matters because it transforms how maritime teams interact with compliance and operational data. Instead of maintaining custom dashboards that require engineering cycles to update, teams can describe intelligence requirements in plain language and receive structured, source-verified reports. The parallel execution capability inherent in modern MCP clients reduces latency by dispatching independent lookups (vessel particulars, emissions, inspections) concurrently, while the protocol's standardized schema ensures every response maps cleanly to downstream systems.

Core Solution

The architecture rests on three layers: the reasoning layer (LLM), the execution layer (MCP client), and the data layer (MCP server wrapping VesselAPI). The LLM acts as the planner, breaking a high-level request into a sequence of tool calls. The MCP client handles execution, routing, and parallelization. The MCP server exposes standardized maritime endpoints without leaking internal API complexity.

Step 1: Define the Tool Registry

Instead of hardcoding API routes, expose maritime data as typed tools. Each tool declares its parameters, return schema, and execution constraints.

import { ToolDefinition, ToolExecutor } from '@codcompass/mcp-core';

const API_BASE = 'https://api.vesselapi.com/v1';

// Shared GET helper: throws on non-2xx responses so an error body is never
// returned to the planner as if it were data. Authentication is not shown.
async function getJson(path: string): Promise<any> {
  const response = await fetch(`${API_BASE}${path}`);
  if (!response.ok) {
    throw new Error(`VesselAPI request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}

// URL-encode anything interpolated into a path or query string.
const enc = (value: string | number) => encodeURIComponent(String(value));

export const maritimeTools: ToolDefinition[] = [
  {
    name: 'locate_port',
    description: 'Resolves a port name to its UN/LOCODE and retrieves infrastructure metadata',
    parameters: { port_name: 'string' },
    handler: async (params) => {
      const data = await getJson(`/port/search?q=${enc(params.port_name)}`);
      const match = data.results?.[0]; // undefined when nothing matches
      return { un_locode: match?.code, details: match };
    }
  },
  {
    name: 'fetch_vessel_profile',
    description: 'Enriches an IMO number with registry, flag, tonnage, and class society data',
    parameters: { imo_number: 'string' },
    handler: async (params) => getJson(`/vessel/${enc(params.imo_number)}`)
  },
  {
    name: 'retrieve_emissions_record',
    description: 'Pulls EU MRV annual CO2, fuel consumption, and efficiency indices',
    parameters: { imo_number: 'string', year: 'number' },
    handler: async (params) =>
      getJson(`/vessel/${enc(params.imo_number)}/emissions?year=${enc(params.year)}`)
  },
  {
    name: 'query_inspection_history',
    description: 'Retrieves Paris MoU and regional PSC inspection records with deficiency tracking',
    parameters: { imo_number: 'string' },
    handler: async (params) => getJson(`/vessel/${enc(params.imo_number)}/inspections`)
  }
];

Step 2: Implement Parallel Execution with Error Boundaries

Modern MCP clients support concurrent tool dispatch. The planner identifies independent calls; the executor runs them safely.

export class MaritimeToolRouter {
  private executor: ToolExecutor;

  constructor(tools: ToolDefinition[]) {
    this.executor = new ToolExecutor(tools, {
      maxConcurrency: 4,
      timeoutMs: 8000,
      retryPolicy: { attempts: 2, backoff: 'exponential' }
    });
  }

  async executeResearchPlan(plan: string[]) {
    const tasks = plan.map(toolCall => this.executor.run(toolCall));
    const results = await Promise.allSettled(tasks);
    
    return results.map((res, idx) => ({
      tool: plan[idx],
      status: res.status,
      payload: res.status === 'fulfilled' ? res.value : res.reason?.message
    }));
  }
}

Step 3: Structure the Output Schema

Raw API responses bloat context windows. Filter and map responses to a production-ready schema before returning to the LLM.

export interface MaritimeBriefing {
  port_un_locode: string;
  traffic_snapshot: Array<{ imo: string; type: string; flag: string; dwt: number }>;
  emissions_summary: Array<{ imo: string; co2_tonnes: number; fuel_tonnes: number; efficiency_index: number }>;
  psc_risk_flags: Array<{ imo: string; deficiency_count: number; last_inspection: string; detention_risk: 'low' | 'medium' | 'high' }>;
  safety_warnings: string[];
}
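As a minimal sketch of this filtering step, the function below projects a raw emissions response onto one emissions_summary entry and drops everything else. The raw field names (total_co2_emissions, total_fuel_consumption, technical_efficiency) and the helper names are assumptions for illustration, not confirmed VesselAPI fields.

```typescript
// Shape of one emissions_summary entry in MaritimeBriefing.
interface EmissionsSummaryEntry {
  imo: string;
  co2_tonnes: number;
  fuel_tonnes: number;
  efficiency_index: number;
}

// Hypothetical raw payload: the field names read below are assumed.
type RawEmissionsRecord = Record<string, unknown>;

// Coerce a value to a finite number, or return null if it is missing or malformed.
function toFiniteNumber(value: unknown): number | null {
  if (typeof value === 'string' && value.trim() === '') return null;
  const n = typeof value === 'string' ? Number(value) : value;
  return typeof n === 'number' && Number.isFinite(n) ? n : null;
}

// Keep only the briefing fields. Returns null when a required value is absent,
// so the caller can flag the record instead of passing a partial row to the LLM.
export function toEmissionsSummary(
  imo: string,
  raw: RawEmissionsRecord,
): EmissionsSummaryEntry | null {
  const co2 = toFiniteNumber(raw['total_co2_emissions']);
  const fuel = toFiniteNumber(raw['total_fuel_consumption']);
  const efficiency = toFiniteNumber(raw['technical_efficiency']);
  if (co2 === null || fuel === null || efficiency === null) return null;
  return { imo, co2_tonnes: co2, fuel_tonnes: fuel, efficiency_index: efficiency };
}
```

Returning null rather than a half-filled entry keeps incomplete records out of the model's context, at the cost of requiring the caller to handle the missing case explicitly.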

Architecture Rationale

Why MCP over direct API calls? MCP standardizes tool schemas, enforces context boundaries, and decouples the reasoning engine from I/O orchestration. This eliminates custom async routing code and ensures consistent error handling across clients.
Why parallel execution? Vessel enrichment, emissions retrieval, and inspection lookups are independent. For three calls of similar duration, concurrent dispatch can cut wall-clock latency by up to roughly two thirds (60–70%) compared to sequential calls.
Why schema filtering at the edge? LLMs degrade when fed raw JSON blobs. Mapping responses to a strict interface preserves context window efficiency and guarantees downstream compatibility.

Pitfall Guide

1. Confusing Absolute Emissions with CII Ratings

Explanation: EU MRV reports absolute CO2 and fuel consumption. CII (Carbon Intensity Indicator) normalizes emissions by transport work, approximated as capacity × distance sailed (tonne-nautical miles). Treating raw CO2 figures as efficiency metrics leads to incorrect compliance assessments. Fix: Always calculate attained CII using the formula: CO2 / (Capacity × Distance), where capacity is DWT for most cargo ship types and GT for some others, such as cruise passenger ships. Cross-reference with EEXI/EEDI design indices to separate operational efficiency from vessel design limits.
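The attained CII formula can be sketched as a small function. This is an illustration under stated assumptions: the function name and parameter names are hypothetical, CO2 is supplied in tonnes and converted to grams, and the caller is responsible for choosing the right capacity measure for the ship type.

```typescript
// Attained CII in grams of CO2 per capacity-tonne nautical mile.
// co2Tonnes:      annual CO2 in tonnes (as reported under EU MRV)
// capacityTonnes: DWT for most cargo ship types (chosen by the caller)
// distanceNm:     distance sailed over the same period, in nautical miles
export function attainedCii(
  co2Tonnes: number,
  capacityTonnes: number,
  distanceNm: number,
): number {
  if (!(co2Tonnes >= 0) || !(capacityTonnes > 0) || !(distanceNm > 0)) {
    throw new RangeError('CO2 must be >= 0; capacity and distance must be > 0');
  }
  const co2Grams = co2Tonnes * 1e6; // 1 tonne = 1,000,000 g
  return co2Grams / (capacityTonnes * distanceNm);
}
```

For example, 20,000 t of CO2 on an 80,000 DWT vessel that sailed 60,000 nm gives about 4.17 g CO2 per tonne-nautical mile. The absolute figure alone says nothing about efficiency until it is divided through in this way.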

2. Ignoring EU ETS Phase-In Mathematics

Explanation: Maritime EU ETS doesn't apply 100% immediately. The phase-in schedule covers 40% of emissions reported for 2024, 70% for 2025, and 100% from 2026 onward. Applying the full rate prematurely overstates financial liability. Fix: Implement a year-aware calculator that multiplies verified emissions by the correct phase-in percentage, then applies the prevailing carbon price (~€65–70/t at the time of writing). Separate intra-EU voyages (100% coverage) from extra-EU voyages (50% coverage).
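A year-aware calculator along these lines might look like the sketch below. The type and function names are hypothetical, the carbon price is passed in by the caller rather than hardcoded, and years before 2024 are treated as out of scope.

```typescript
// Share of in-scope emissions that must be covered by allowances, by reporting year.
const PHASE_IN: Record<number, number> = { 2024: 0.4, 2025: 0.7 };

function phaseInFactor(year: number): number {
  if (year < 2024) return 0;    // assumption: no maritime liability before 2024
  return PHASE_IN[year] ?? 1.0; // 100% from 2026 onward
}

interface VoyageEmissions {
  intraEuCo2Tonnes: number; // voyages between EU ports: 100% coverage
  extraEuCo2Tonnes: number; // voyages into or out of the EU: 50% coverage
}

// Estimated allowance cost in euros for one reporting year.
export function etsLiabilityEur(
  year: number,
  emissions: VoyageEmissions,
  priceEurPerTonne: number,
): number {
  const inScopeTonnes = emissions.intraEuCo2Tonnes + 0.5 * emissions.extraEuCo2Tonnes;
  return inScopeTonnes * phaseInFactor(year) * priceEurPerTonne;
}
```

With 10,000 t intra-EU and 20,000 t extra-EU emissions at €70/t, the in-scope total is 20,000 t, so the 2025 liability is about €980,000 against €1,400,000 at the full 2026 rate.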

3. Over-Parallelizing Without Rate Limit Awareness

Explanation: VesselAPI and similar maritime data providers enforce request quotas. Blindly dispatching 20+ concurrent calls triggers throttling, returning 429 errors that break the research plan. Fix: Implement a token bucket or sliding window rate limiter at the MCP client layer. Queue excess requests and retry with exponential backoff. Log throttle events for capacity planning.
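One way to sketch the token bucket is shown below. The class and function names are hypothetical, and the capacity and refill rate are placeholders that should be set from the provider's documented quota, which is not stated here.

```typescript
// Token bucket: holds up to `capacity` tokens and refills at `refillPerSec`.
export class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;
  private readonly now: () => number;

  // `now` is injectable so tests can control the clock.
  constructor(capacity: number, refillPerSec: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity;
    this.lastRefillMs = now();
  }

  // Add the tokens earned since the last check, capped at capacity.
  private refill(): void {
    const nowMs = this.now();
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;
  }

  // Take one token if available; returns false when the caller should wait.
  tryAcquire(): boolean {
    this.refill();
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }

  // Milliseconds until the next token is available (0 if one is ready now).
  msUntilNextToken(): number {
    this.refill();
    if (this.tokens >= 1) return 0;
    return Math.ceil(((1 - this.tokens) / this.refillPerSec) * 1000);
  }
}

// Wait for a token, then run the request.
export async function withRateLimit<T>(
  bucket: TokenBucket,
  request: () => Promise<T>,
): Promise<T> {
  while (!bucket.tryAcquire()) {
    const waitMs = bucket.msUntilNextToken();
    await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
  }
  return request();
}
```

The bucket allows short bursts up to its capacity while holding the long-run rate to the refill rate. It smooths outbound traffic only; retries with backoff for 429 responses that still occur would sit alongside it.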

4. Treating Deficiency Counts as Binary Risk

Explanation: Paris MoU's New Inspection Regime (NIR) uses a weighted scoring system. A vessel with one detainable deficiency may pose higher risk than one with three minor deficiencies, so raw counts misrepresent PSC exposure. Fix: Map deficiencies to NIR severity weights. Track inspection type (initial vs. more detailed) and flag vessels with recurring patterns at the same port authority. Integrate this into a dynamic risk score rather than a static threshold.
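A severity-weighted score could be sketched as follows. The severity categories, the weights, and the band thresholds are all illustrative placeholders chosen for this example. They are not Paris MoU values and would need to be replaced with a calibrated scheme.

```typescript
type Severity = 'minor' | 'major' | 'detainable';

interface Deficiency {
  severity: Severity;
}

// Illustrative weights only: these are NOT Paris MoU values.
const SEVERITY_WEIGHT: Record<Severity, number> = {
  minor: 1,
  major: 3,
  detainable: 10,
};

// Sum of severity weights across all recorded deficiencies.
export function weightedDeficiencyScore(deficiencies: Deficiency[]): number {
  return deficiencies.reduce((sum, d) => sum + SEVERITY_WEIGHT[d.severity], 0);
}

// Map a score onto the detention_risk bands used in MaritimeBriefing.
// The cut-offs (4 and 10) are hypothetical.
export function detentionRisk(score: number): 'low' | 'medium' | 'high' {
  if (score >= 10) return 'high';
  if (score >= 4) return 'medium';
  return 'low';
}
```

Keeping the weights in a single table makes the scoring auditable: a reviewer can see exactly why a vessel landed in a given band and adjust the table without touching the logic.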

5. Assuming Complete MRV Data Availability

Explanation: Not all vessels report full annual data. Some entries show partial sea hours, missing fuel breakdowns, or zero emissions due to layup periods. Processing incomplete records skews YoY comparisons. Fix: Validate sea_hours > 0 and co2_tonnes > 0 before calculating year-over-year deltas. Flag vessels with <50% reporting completeness for manual review. Exclude layup periods from efficiency calculations.
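The validation guard can be sketched as below, using the sea_hours and co2_tonnes fields named above. The record shape and function names are assumptions for illustration; the 50% completeness check is left out because it depends on how completeness is defined for a given dataset.

```typescript
interface MrvRecord {
  year: number;
  sea_hours: number;
  co2_tonnes: number;
}

// A record is usable only if the vessel actually sailed and reported emissions.
function isUsable(record: MrvRecord): boolean {
  return (
    Number.isFinite(record.sea_hours) && record.sea_hours > 0 &&
    Number.isFinite(record.co2_tonnes) && record.co2_tonnes > 0
  );
}

type YoyResult =
  | { ok: true; deltaPct: number }
  | { ok: false; reason: string };

// Year-over-year CO2 change in percent, refused when either year is incomplete.
export function yoyCo2Delta(previous: MrvRecord, current: MrvRecord): YoyResult {
  if (!isUsable(previous)) return { ok: false, reason: `incomplete record for ${previous.year}` };
  if (!isUsable(current)) return { ok: false, reason: `incomplete record for ${current.year}` };
  const deltaPct = ((current.co2_tonnes - previous.co2_tonnes) / previous.co2_tonnes) * 100;
  return { ok: true, deltaPct };
}
```

Returning a reason instead of a number for unusable records means a layup year surfaces as a flag for review rather than as a misleading 100% drop in emissions.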

6. Context Window Bloat from Raw API Responses

Explanation: Feeding full JSON payloads into the LLM consumes context tokens rapidly, degrading reasoning quality and increasing latency. Fix: Implement response truncation at the MCP layer. Strip metadata, pagination tokens, and unused fields. Return only the schema-mapped subset required for the briefing.

7. Prompt Scoping Drift

Explanation: Open-ended prompts like "analyze port traffic" cause the model to fetch irrelevant ports, historical data, or non-seagoing craft, diluting the intelligence value. Fix: Enforce UN/LOCODE validation before tool dispatch. Require explicit temporal boundaries (e.g., "last 24 hours") and vessel class filters (e.g., "seagoing only, DWT > 5,000").
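A pre-dispatch scope check might look like the sketch below. It validates only the format of a UN/LOCODE (two-letter country code followed by three characters drawn from A-Z and 2-9), not whether the code exists in the official list. The QueryScope shape and function names are hypothetical.

```typescript
// UN/LOCODE format: two-letter country code + three-character location code
// (letters A-Z or digits 2-9), for example NLRTM for Rotterdam.
const UN_LOCODE_PATTERN = /^[A-Z]{2}[A-Z2-9]{3}$/;

// Returns the normalized code, or null if the input is not in UN/LOCODE format.
export function normalizeUnLocode(input: string): string | null {
  const compact = input.replace(/\s+/g, '').toUpperCase(); // accepts "nl rtm"
  return UN_LOCODE_PATTERN.test(compact) ? compact : null;
}

interface QueryScope {
  unLocode: string;
  windowHours: number; // explicit temporal boundary
  minDwt: number;      // vessel size filter
}

// Build a validated scope, rejecting open-ended or malformed requests.
export function buildScope(port: string, windowHours: number, minDwt: number): QueryScope {
  const unLocode = normalizeUnLocode(port);
  if (unLocode === null) throw new Error(`Not a valid UN/LOCODE: "${port}"`);
  if (!(windowHours > 0)) throw new RangeError('windowHours must be positive');
  if (!(minDwt >= 0)) throw new RangeError('minDwt must be non-negative');
  return { unLocode, windowHours, minDwt };
}
```

A free-text name such as "Rotterdam" fails the format check, which forces it through port resolution first instead of letting the model guess which port was meant.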

Production Bundle

Action Checklist

Validate UN/LOCODE resolution: Ensure port name inputs map to the correct UN/LOCODE (the UNECE location code, e.g. NLRTM for Rotterdam) before dispatching traffic queries.
Configure rate limiting: Implement token bucket throttling at the MCP client to prevent 429 errors during parallel execution.
Map MRV fields to CII logic: Build a normalization layer that converts absolute CO2/fuel into transport-work-adjusted efficiency metrics.
Implement parallel execution boundaries: Use Promise.allSettled with timeout and retry policies to handle partial failures gracefully.
Apply EU ETS phase-in math: Integrate a year-aware liability calculator that respects the 40%/70%/100% rollout schedule.
Filter API responses at the edge: Strip raw JSON bloat and return only schema-matched subsets to preserve context window efficiency.
Add audit logging: Record every tool call, response payload, and LLM reasoning step for compliance traceability and debugging.
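The audit logging item can be sketched as a wrapper around any tool handler. The AuditEntry shape and the withAudit name are assumptions for illustration, and the in-memory array stands in for whatever durable log store a deployment would use.

```typescript
interface AuditEntry {
  tool: string;
  params: unknown;
  startedAt: string; // ISO 8601 timestamp
  durationMs: number;
  status: 'ok' | 'error';
  detail: unknown;   // response payload on success, error message on failure
}

// Wraps a tool handler so every call appends exactly one entry to the log,
// whether it succeeds or throws. Errors are re-thrown unchanged.
export function withAudit<P, R>(
  tool: string,
  handler: (params: P) => Promise<R>,
  log: AuditEntry[],
): (params: P) => Promise<R> {
  return async (params: P): Promise<R> => {
    const started = Date.now();
    const base = { tool, params, startedAt: new Date(started).toISOString() };
    try {
      const result = await handler(params);
      log.push({ ...base, durationMs: Date.now() - started, status: 'ok', detail: result });
      return result;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      log.push({ ...base, durationMs: Date.now() - started, status: 'error', detail: message });
      throw err;
    }
  };
}
```

Because the wrapper preserves the handler's signature, it can be applied to each handler in a tool registry without changing how the executor calls them.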

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Rapid compliance prototyping	MCP-Grounded LLM Workflow	Zero infrastructure overhead; standard tool registry; immediate schema validation	Low (API subscription + LLM inference)
High-frequency chartering desk	Custom API Orchestration	Sub-second latency; deterministic routing; direct database caching	Medium-High (engineering maintenance + infra)
Regulatory audit preparation	MCP-Grounded LLM Workflow	Full tool-chain logging; explicit data provenance; reproducible research plans	Low (audit-ready output generation)
Fleet-wide emissions tracking	Hybrid (MCP + Batch Pipeline)	LLM for anomaly detection; batch jobs for historical MRV aggregation	Medium (compute + storage scaling)

Configuration Template

{
  "mcpServers": {
    "maritime_data_gateway": {
      "command": "npx",
      "args": ["-y", "vesselapi-mcp"],
      "env": {
        "VESSELAPI_API_KEY": "${MARITIME_API_KEY}",
        "MCP_MAX_CONCURRENCY": "4",
        "MCP_REQUEST_TIMEOUT_MS": "8000",
        "MCP_LOG_LEVEL": "info"
      },
      "tools": [
        "locate_port",
        "fetch_vessel_profile",
        "retrieve_emissions_record",
        "query_inspection_history",
        "fetch_navtex_warnings"
      ]
    }
  }
}

Quick Start Guide

Install the MCP server: Run npm install -g vesselapi-mcp and export your API key as MARITIME_API_KEY.
Initialize the client: Create a .mcp.json file using the Configuration Template above. Ensure Node.js 18+ is active.
Connect your LLM client: Launch any MCP-compatible interface (Claude Desktop, Cursor, or custom TypeScript client) and verify the tool registry loads without errors.
Execute a scoped query: Prompt with explicit boundaries: "Generate a 24-hour traffic and compliance briefing for UN/LOCODE NLRTM. Include seagoing vessels only, EU MRV emissions, and Paris MoU inspection flags."
Validate the output: Cross-check IMO numbers against the official IMO ship number registry maintained by S&P Global (formerly IHS Markit), verify CII calculations against transport work, and confirm EU ETS liability using the current phase-in percentage.
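As a cheap first pass before the registry cross-check, the IMO number's built-in check digit can be verified locally. The sketch below uses the standard scheme: multiply the first six digits by 7, 6, 5, 4, 3, 2, sum the products, and compare the last digit of the sum with the seventh digit. The function name is hypothetical.

```typescript
// Validates the format and check digit of an IMO ship number.
// Accepts a bare seven-digit string or one prefixed with "IMO".
export function isValidImoNumber(input: string): boolean {
  const digits = input.trim().replace(/^IMO\s*/i, '');
  if (!/^\d{7}$/.test(digits)) return false;
  let sum = 0;
  for (let i = 0; i < 6; i++) {
    sum += Number(digits[i]) * (7 - i); // weights 7, 6, 5, 4, 3, 2
  }
  return sum % 10 === Number(digits[6]);
}
```

For 9074729 the weighted sum is 63 + 0 + 35 + 16 + 21 + 4 = 139, whose last digit matches the check digit 9. A passing checksum only shows the number is well formed; it does not prove that a vessel with that number exists, so the registry lookup is still required.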
