ADR-0047 ACCEPTED · 2025-07-29
Adopt OpenTelemetry tracing for iOS

Context

The backend already exports OpenTelemetry traces to Grafana Cloud Tempo through an Alloy collector. Mobile requests, however, appear as orphaned traces: the backend shows a 67ms GraphQL request, but the user experienced 300ms+ once network latency, cache checks, and UI updates are included. Without iOS-side tracing, that part of the flow is invisible.

Mobile-specific issues (poor connectivity, app backgrounding, battery throttling) and user journey context (complete flows from tap to UI update) can't be diagnosed from backend traces alone.

Decision

Adopt OpenTelemetry Swift SDK with OTLP export for distributed tracing in the iOS app.

An Apollo interceptor creates a span for each GraphQL operation and injects W3C traceparent and tracestate headers into every request. The backend extracts the trace ID and parent span ID from these headers and continues the same trace. As a result, an iOS-initiated request and its backend processing appear as a single distributed trace in Grafana, joined by the shared trace ID.
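To make the header format concrete, here is a minimal sketch of how a traceparent value can be built, injected, and parsed. It uses only Foundation and follows the W3C Trace Context layout (version, trace ID, parent span ID, flags). The names TraceContext, randomHex, and inject are hypothetical helpers for illustration. They are not part of the OpenTelemetry Swift SDK or Apollo, which would normally handle this through their own propagator and interceptor types. The tracestate header is omitted.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on non-Apple platforms
#endif

/// Hypothetical value type holding the identifiers carried by `traceparent`.
struct TraceContext: Equatable {
    let traceId: String  // 32 lowercase hex characters (16 bytes)
    let spanId: String   // 16 lowercase hex characters (8 bytes)
    let sampled: Bool

    /// Serialized as `version-traceid-parentid-flags`, with version "00".
    var traceparent: String {
        "00-\(traceId)-\(spanId)-\(sampled ? "01" : "00")"
    }
}

extension TraceContext {
    /// Parses a version-00 header value; returns nil if it is malformed.
    /// This mirrors what the receiving side does before continuing the trace.
    init?(traceparent: String) {
        let parts = traceparent
            .split(separator: "-", omittingEmptySubsequences: false)
            .map(String.init)
        guard parts.count == 4, parts[0] == "00",
              TraceContext.isLowerHex(parts[1], length: 32),
              TraceContext.isLowerHex(parts[2], length: 16),
              TraceContext.isLowerHex(parts[3], length: 2),
              // All-zero IDs are invalid per the W3C spec.
              parts[1] != String(repeating: "0", count: 32),
              parts[2] != String(repeating: "0", count: 16),
              let flags = UInt8(parts[3], radix: 16)
        else { return nil }
        self.init(traceId: parts[1], spanId: parts[2], sampled: (flags & 0x01) == 0x01)
    }

    static func isLowerHex(_ text: String, length: Int) -> Bool {
        text.count == length && text.allSatisfy { "0123456789abcdef".contains($0) }
    }
}

/// Generates a random lowercase hex string of `byteCount` bytes.
func randomHex(byteCount: Int) -> String {
    (0..<byteCount)
        .map { _ in String(format: "%02x", UInt8.random(in: 0...255)) }
        .joined()
}

/// Adds the `traceparent` header to an outgoing request.
func inject(_ context: TraceContext, into request: inout URLRequest) {
    request.setValue(context.traceparent, forHTTPHeaderField: "traceparent")
}
```

In this sketch, an interceptor would create a TraceContext with randomHex(byteCount: 16) for a new trace and randomHex(byteCount: 8) for the span, call inject before the request is sent, and record the same IDs on the exported span so Tempo can join both sides.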

iOS App (OpenTelemetry) → OTLP → Alloy Collector → Grafana Cloud Tempo
    ↓ W3C traceparent header (trace ID + parent span ID)
Backend (OpenTelemetry) → OTLP → Alloy Collector → Grafana Cloud Tempo

Both iOS and backend export through the same Alloy collector, following Grafana's recommended pattern. This avoids iOS managing Grafana Cloud tokens directly and provides consistent retry/buffering behavior.

Consequences

The benefit is complete end-to-end request visibility. A single trace shows the iOS span (user tap → cache check → network request) as the parent of the backend span (GraphQL parse → execute → database query). The shared trace ID links iOS actions to backend processing in Tempo.

The costs are a new dependency, a small per-operation overhead (~1-2ms), battery impact from background exports, and trace data storage. Sampling (e.g., 10% in production) controls volume. PII must be kept out of span attributes.
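One way to apply a 10% sampling rate is to derive the decision from the trace ID itself, so every span in a trace gets the same answer. The sketch below is in the spirit of ratio-based sampling but is not the OpenTelemetry Swift SDK's sampler. The function name shouldSample and the choice of the low 8 bytes of the trace ID are assumptions made for illustration.

```swift
import Foundation

/// Hypothetical helper: decides whether a trace is sampled, based only on
/// its 32-character hex trace ID and the desired sampling ratio (0.0...1.0).
func shouldSample(traceId: String, ratio: Double) -> Bool {
    guard ratio > 0 else { return false }  // also rejects NaN
    if ratio >= 1 { return true }

    // Interpret the low 8 bytes (last 16 hex characters) as an unsigned integer.
    guard traceId.count == 32,
          let low = UInt64(traceId.suffix(16), radix: 16)
    else { return false }

    // Sample when the value falls in the lowest `ratio` fraction of the UInt64 range.
    let threshold = UInt64(ratio * Double(UInt64.max))
    return low < threshold
}
```

Because the decision depends only on the trace ID, it can be carried to the backend in the sampled flag of the traceparent header, keeping sampled traces complete on both sides.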