Expose domain as MCP tools for LLM-native interaction
Context
The trip planning engine (ADR-0033) exposes its domain through GraphQL for the iOS app, where GPT-5 with CFG constraints generates effects from natural language (ADR-0032). That works well inside the app, but the in-app agent is not how everyone will discover a trip planner. LLM clients like Claude Desktop, ChatGPT, and whatever Apple ships are becoming a primary surface for interacting with services. MCP (Model Context Protocol) is the wire protocol for connecting these clients to tools.
We wanted to let any LLM client drive the trip planning engine directly. The user's own model reads the tool descriptions, composes atomic operations, and builds trips through conversation. No server-side LLM calls, no prompt engineering on our end -- the intelligence is in the user's client. This is a different interaction model from the iOS app, where our GPT-5 agent interprets natural language into effects. Here, the LLM client IS the agent.
The speculative bet: MCP-connected apps could be a user acquisition channel. Someone plans a trip in Claude Desktop, likes the result, installs the iOS app for on-trip reference. Two channels serving different moments -- AI-first exploration vs. UI-first execution.
Decision
Add an MCP server as a second transport alongside GraphQL, exposed as an Axum route (/mcp) in the same binary. The rmcp crate provides Streamable HTTP + SSE transport with session management backed by Redis.
15 tools (14 domain operations plus a health check), with the domain operations organized into four groups:
- Trips: list, get, create, delete
- Destinations: add, remove, reorder, set trip dates, set destination dates
- Discovery: discrepancies, neighborhoods
- Versions: list, create, switch
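A sketch of the tool catalog as the server might register it. The identifiers and grouping below are illustrative (the actual tool names may differ); the counts match the ADR: 14 domain operations plus one health check.

```rust
// Illustrative tool catalog: 14 domain operations plus a health check.
// Names are hypothetical stand-ins, not the server's actual identifiers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Group {
    Trips,
    Destinations,
    Discovery,
    Versions,
    Meta, // health check
}

const TOOLS: &[(&str, Group)] = &[
    ("list_trips", Group::Trips),
    ("get_trip", Group::Trips),
    ("create_trip", Group::Trips),
    ("delete_trip", Group::Trips),
    ("add_destination", Group::Destinations),
    ("remove_destination", Group::Destinations),
    ("reorder_destinations", Group::Destinations),
    ("set_trip_dates", Group::Destinations),
    ("set_destination_dates", Group::Destinations),
    ("list_discrepancies", Group::Discovery),
    ("list_neighborhoods", Group::Discovery),
    ("list_versions", Group::Versions),
    ("create_version", Group::Versions),
    ("switch_version", Group::Versions),
    ("health_check", Group::Meta),
];

fn main() {
    let domain = TOOLS.iter().filter(|(_, g)| *g != Group::Meta).count();
    println!("{} tools total, {} domain operations", TOOLS.len(), domain);
}
```

A surface this small is deliberate: the whole catalog fits in a model's context alongside its descriptions.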
Each tool is a thin wrapper around the same domain services that the GraphQL resolvers call: create_trip calls TripService; add_destination goes through ModificationService::store_modification and applies effects identically to the GraphQL path.
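The "thin wrapper" shape can be sketched as follows. The service and result types here are stand-ins defined inline for the sketch; the real handlers delegate to the same domain services the GraphQL resolvers use.

```rust
// Conceptual sketch of a thin MCP tool wrapper. TripService and Trip are
// stand-ins, not the actual domain types.
struct Trip {
    id: u64,
    name: String,
}

struct TripService;

impl TripService {
    // Stand-in for the shared domain operation both transports call.
    fn create_trip(&self, name: &str) -> Trip {
        Trip { id: 1, name: name.to_string() }
    }
}

// The tool handler does no domain logic of its own: it parses arguments,
// delegates to the service, and serializes the result for the client.
fn create_trip_tool(service: &TripService, name: &str) -> String {
    let trip = service.create_trip(name);
    format!("{{\"id\":{},\"name\":\"{}\"}}", trip.id, trip.name)
}

fn main() {
    println!("{}", create_trip_tool(&TripService, "Japan 2026"));
}
```

Keeping the wrapper this thin is what guarantees MCP and GraphQL mutations stay behaviorally identical: any rule enforced in the service applies to both transports automatically.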
Tools are deterministic -- pure domain operations with no LLM calls server-side. The user's model decides what to call and in what order. Tool descriptions teach the model the domain concepts (destinations are 0-indexed, versions work like git branches, dates are YYYY-MM-DD).
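Two of the conventions the descriptions teach -- 0-indexed destination positions and strict YYYY-MM-DD dates -- can be illustrated with deterministic helpers. These are hypothetical sketches, not the server's implementation.

```rust
// Hypothetical helpers matching the conventions the tool descriptions
// teach: 0-indexed positions and YYYY-MM-DD date strings.

/// Move the destination at `from` to position `to` (both 0-indexed).
fn reorder(destinations: &mut Vec<&str>, from: usize, to: usize) -> Result<(), String> {
    if from >= destinations.len() || to >= destinations.len() {
        return Err(format!(
            "index out of range (destinations are 0-indexed, len {})",
            destinations.len()
        ));
    }
    let d = destinations.remove(from);
    destinations.insert(to, d);
    Ok(())
}

/// Accept only the strict YYYY-MM-DD shape (format check only; it does not
/// validate month or day ranges).
fn is_iso_date(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() == 10
        && b[4] == b'-'
        && b[7] == b'-'
        && b.iter().enumerate().all(|(i, c)| i == 4 || i == 7 || c.is_ascii_digit())
}

fn main() {
    let mut stops = vec!["Tokyo", "Kyoto", "Osaka"];
    reorder(&mut stops, 2, 0).unwrap();
    println!("{:?}", stops); // ["Osaka", "Tokyo", "Kyoto"]
    println!("{}", is_iso_date("2026-04-01")); // true
    println!("{}", is_iso_date("04/01/2026")); // false
}
```

Note the error message restates the 0-indexing convention: when a model does call a tool wrongly, the error itself is another chance to teach it the domain.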
Mutations return both structured JSON and rendered HTML trip cards via structuredContent. Clients that support MCP's structured content can render trip state inline -- a destination list with dates, journey segments with transport icons, discrepancy badges. This is the "MCP Apps" idea: the LLM client becomes a lightweight UI for the service.
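The dual payload might look like the sketch below: one machine-readable JSON document for clients that support structured content, one rendered HTML card for inline display. Field names and markup are illustrative; the real server's serialization will differ.

```rust
// Sketch of a mutation's dual output: JSON for structuredContent plus a
// rendered HTML trip card. All names here are illustrative.
struct ToolOutput {
    structured: String, // JSON for clients that support structured content
    html: String,       // trip card for inline rendering in chat
}

fn render_trip_card(name: &str, destinations: &[(&str, &str)]) -> ToolOutput {
    let json_items: Vec<String> = destinations
        .iter()
        .map(|(city, date)| format!("{{\"city\":\"{city}\",\"date\":\"{date}\"}}"))
        .collect();
    let structured = format!(
        "{{\"trip\":\"{name}\",\"destinations\":[{}]}}",
        json_items.join(",")
    );
    let rows: String = destinations
        .iter()
        .map(|(city, date)| format!("<li>{city} <time>{date}</time></li>"))
        .collect();
    let html = format!("<div class=\"trip-card\"><h3>{name}</h3><ul>{rows}</ul></div>");
    ToolOutput { structured, html }
}

fn main() {
    let out = render_trip_card(
        "Japan 2026",
        &[("Tokyo", "2026-04-01"), ("Kyoto", "2026-04-05")],
    );
    println!("{}", out.structured);
    println!("{}", out.html);
}
```

Clients without structured-content support can fall back to the JSON as plain text, so no capability negotiation is strictly required to stay functional.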
MCP mutations publish the same Redis signals as GraphQL (ADR-0041), so an iOS app with an active subscription sees updates whether the change came from the app or from Claude Desktop. The MCP server also subscribes to trip signals and forwards them as notifications/message to the connected client.
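Conceptually the signal path is a single bus with multiple subscribers. The sketch below models it with std channels standing in for Redis pub/sub; the types and signal format are invented for illustration.

```rust
// Conceptual model of the shared signal path: std channels stand in for
// Redis pub/sub. A mutation from either transport publishes once; every
// subscriber (GraphQL subscription, MCP session) receives the same signal.
use std::sync::mpsc;

struct SignalBus {
    subscribers: Vec<mpsc::Sender<String>>,
}

impl SignalBus {
    fn new() -> Self {
        SignalBus { subscribers: Vec::new() }
    }

    fn subscribe(&mut self) -> mpsc::Receiver<String> {
        let (tx, rx) = mpsc::channel();
        self.subscribers.push(tx);
        rx
    }

    fn publish(&self, signal: &str) {
        for tx in &self.subscribers {
            let _ = tx.send(signal.to_string());
        }
    }
}

fn main() {
    let mut bus = SignalBus::new();
    let graphql_sub = bus.subscribe(); // iOS app's active subscription
    let mcp_session = bus.subscribe(); // forwarded as notifications/message

    // A mutation arriving over MCP publishes the same signal a GraphQL
    // mutation would, so both surfaces see the change.
    bus.publish("trip:42:updated");

    println!("{}", graphql_sub.recv().unwrap());
    println!("{}", mcp_session.recv().unwrap());
}
```

The point of the model: neither transport is special. Publish-once semantics are what keep Claude Desktop edits visible in the iOS app and vice versa.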
Auth uses OAuth 2.1 with PKCE (RFC 7636), since MCP clients expect standard OAuth flows. A separate OAuth domain handles dynamic client registration, authorization via passkey, and token exchange.
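The core PKCE check is small: the client sends a code_challenge at authorization time and the code_verifier at token exchange, and the server recomputes and compares. The sketch below uses RFC 7636's "plain" method (challenge equals verifier) to stay dependency-free; a production server would use S256, where challenge = BASE64URL(SHA256(verifier)). All names here are hypothetical.

```rust
// Sketch of the server-side PKCE check (RFC 7636). Uses the "plain"
// challenge method to avoid a crypto dependency; real deployments use S256.
struct PendingAuthorization {
    code_challenge: String,
    method: String, // "plain" or "S256"
}

fn verify_pkce(pending: &PendingAuthorization, code_verifier: &str) -> bool {
    match pending.method.as_str() {
        // "plain": the challenge IS the verifier.
        "plain" => pending.code_challenge == code_verifier,
        // "S256" would compare BASE64URL(SHA256(verifier)) here; omitted.
        _ => false,
    }
}

fn main() {
    let pending = PendingAuthorization {
        code_challenge: "example-verifier-string".to_string(),
        method: "plain".to_string(),
    };
    println!("{}", verify_pkce(&pending, "example-verifier-string")); // true
    println!("{}", verify_pkce(&pending, "wrong-verifier")); // false
}
```

PKCE is what makes the flow safe for public clients like desktop chat apps, which cannot hold a client secret -- the reason MCP clients expect it.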
Consequences
LLM clients that speak MCP can plan trips without client-specific integrations on our side. The tool surface is small enough (15 tools) that models hold the full API in context. HTML trip cards give visual feedback inside chat without a separate UI.
The cost is a second auth flow (OAuth alongside PASETO), a second transport to maintain, and betting on a protocol that's still early. MCP clients vary in capability -- some support structured content, others don't; some handle SSE reconnection well, others drop sessions. The tool descriptions do real work: poorly worded descriptions lead to models calling tools incorrectly, so they need the same care as API documentation. We built an eval suite (Promptfoo, testing against GPT-5.4, Claude Haiku 4.5, Claude Sonnet 4.6) to catch regressions in model comprehension.