Architecture

How Trackser Live is built

Overview

Trackser Live is a Cloudflare Workers application. A cron trigger fires every 60 seconds. On each trigger, a timer schedules a second build shortly after — so snapshots are published twice per minute. One Durable Object per line handles all per-line state, fetching from TfL, merging and enriching the data, and publishing a gzip-compressed JSON snapshot to Cloudflare R2. The HTTP layer is almost entirely R2 reads — fast, cheap, and globally edge-cached.

cron (every 60 seconds)
└─ index.ts: scheduled()
   ├─ for each line: LineBuilderDO.buildOnce()    ← build #1
   │  ├─ fetchTrackerNet()           TfL TrackerNet XML
   │  ├─ fetchUnified()              TfL Unified Arrivals API
   │  ├─ mergeTrains()               combine sources, prefer TrackerNet
   │  ├─ normalizeAndEnrich()        direction, vehicle ID, stall state
   │  ├─ applyMetStoppingPattern()   timetable lookup (Met line)
   │  └─ publish to R2               gzipped JSON snapshot + archive
   └─ timer → LineBuilderDO.buildOnce()           ← build #2 (~30s later)

Platform

Everything runs on Cloudflare's infrastructure. There are no traditional servers involved — no instances to manage, no idle costs, and no capacity planning. The worker scales automatically with request volume and runs close to users on Cloudflare's global edge network.

Runtime

Cloudflare Workers — TypeScript, deployed via Wrangler

State management

Durable Objects — one per line, SQLite-backed, handles all per-line build state and TrackerNet error tracking

Object storage

Cloudflare R2 — stores live snapshots, archive snapshots (14 days full-res, downsampled 15–28 days), and timetable JSON files

Key-value store

Cloudflare KV — timetable schedule lookups, API key config, low-latency reads

Scheduling

Cron triggers — one-minute cadence for live data builds; five-minute cadence for Trackser Pulse

Monitoring

Better Stack — alerting on degraded/down incidents (email, no calls on current tier)

Data processing pipeline

Each LineBuilderDO runs through this sequence every minute:

1. Fetch
Fetch fresh XML from TfL's TrackerNet API for every station on the line. TrackerNet updates approximately every 30 seconds per station. Builds run twice per minute — once on the cron trigger and once via a scheduled timer — keeping snapshots as fresh as the upstream feed allows.
2. Parse & validate
Parse the XML, extract train records, validate required fields, reject malformed records. Log the count of received vs. expected records — a consistent shortfall triggers a degraded alert.
3. Merge sources
If Unified API data is enabled for the line, merge both sources. TrackerNet is preferred for Underground trains; Unified provides supplementary coverage and acts as a fallback.
4. Location normalisation
Standardise station names, track circuit codes, and platform identifiers. Where TrackerNet's track codes are available but station data is missing, infer the location from a per-line track circuit mapping.
5. Deduplication
Identify and remove duplicate train records. Multiple TrackerNet station responses can report the same train; the dedup layer prefers the most recent or most-detailed record.
6. Enrichment — direction inference
Direction is inferred using a layered strategy: platform designation → station-level direction constraints → SetType field → destination-based inference → position delta from prior snapshot. The Met line applies additional branch-aware logic.
7. Enrichment — stall detection
An EWMA (exponentially weighted moving average) of dwell time is maintained per vehicle per location. A train is flagged isMaybeStalled once it has remained at the same location for approximately five minutes beyond normal dwell. Stall state persists overnight to avoid false positives at stabling positions.
8. Enrichment — reformation detection
If a vehicle ID or set number changes while the train is at a standstill, the train is flagged isReformed. Reformation events are logged separately for analysis.
9. Met stopping patterns
For Metropolitan line trains, the worker looks up the train's trip against the working timetable (stored in R2, indexed via KV). This resolves the stopping pattern code, booked platform, and next destination — information that TrackerNet alone doesn't provide cleanly on the Met's branching network.
10. Destination cleansing
Depot movement codes, intermediate stop codes, and ambiguous TfL destination strings are resolved to human-readable terminus names. The Chiltern timetable is used for towards-inference on Met-shared track.
11. Publish
Write the gzip-compressed snapshot to R2 (two variants per line: map and line). Archive a copy with a timestamped key. Update the Durable Object's last-build metadata.
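The merge and deduplication steps (3 and 5) can be sketched as a single pass over both sources. The record shape here is an assumption for illustration, not the real schema; the rule matches the text: TrackerNet wins conflicts, and same-source duplicates resolve to the most recent record.

```typescript
// Assumed minimal record shape for illustration.
interface TrainRecord {
  vehicleId: string;
  source: "trackernet" | "unified";
  updatedAt: number; // epoch ms
}

function mergeTrains(trackerNet: TrainRecord[], unified: TrainRecord[]): TrainRecord[] {
  const byVehicle = new Map<string, TrainRecord>();
  // Seed with Unified, then overlay TrackerNet so it wins conflicts.
  for (const t of [...unified, ...trackerNet]) {
    const existing = byVehicle.get(t.vehicleId);
    if (!existing) {
      byVehicle.set(t.vehicleId, t);
    } else if (t.source === existing.source) {
      // Duplicate from the same source: keep the most recent record.
      if (t.updatedAt > existing.updatedAt) byVehicle.set(t.vehicleId, t);
    } else if (t.source === "trackernet") {
      // Cross-source conflict: TrackerNet is preferred over Unified.
      byVehicle.set(t.vehicleId, t);
    }
  }
  return [...byVehicle.values()];
}
```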
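The stall detection in step 7 reduces to a small amount of per-vehicle, per-location state. In this sketch the smoothing factor, the five-minute margin, and the function names are illustrative assumptions.

```typescript
const ALPHA = 0.2;                  // EWMA smoothing factor (assumed)
const STALL_MARGIN_MS = 5 * 60_000; // ~5 min beyond normal dwell

interface DwellState { ewmaDwellMs: number; }

// Fold a newly observed dwell into the running average for this
// vehicle+location pair; the first observation seeds the average.
function updateDwell(state: DwellState | undefined, observedDwellMs: number): DwellState {
  if (!state) return { ewmaDwellMs: observedDwellMs };
  return { ewmaDwellMs: ALPHA * observedDwellMs + (1 - ALPHA) * state.ewmaDwellMs };
}

// Flag once the current dwell exceeds the smoothed normal dwell by the margin.
function isMaybeStalled(state: DwellState, currentDwellMs: number): boolean {
  return currentDwellMs > state.ewmaDwellMs + STALL_MARGIN_MS;
}
```

Keeping a per-location average (rather than one global threshold) lets terminus platforms with long booked dwells avoid tripping the flag.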

Storage layout

Live snapshots (R2)

Key pattern                          Description
live/{lineId}/map/latest.json.gz     Map-area trains, gzip
live/{lineId}/map/latest.json        Map-area trains, plain
live/{lineId}/line/latest.json.gz    Full-line trains, gzip
live/{lineId}/line/latest.json       Full-line trains, plain
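The read path over these keys is nearly a straight R2 passthrough. In this sketch the bucket interface is reduced to the single method used, and the header values are illustrative choices rather than the deployed configuration.

```typescript
// Minimal bucket interface for illustration (the real binding is an R2 bucket).
interface SnapshotObject { body: ReadableStream; }
interface SnapshotBucket { get(key: string): Promise<SnapshotObject | null>; }

async function serveSnapshot(
  bucket: SnapshotBucket,
  lineId: string,
  variant: "map" | "line",
): Promise<Response> {
  const obj = await bucket.get(`live/${lineId}/${variant}/latest.json.gz`);
  if (!obj) return new Response("snapshot not available", { status: 404 });
  // Stream the stored gzip bytes as-is; clients decompress transparently.
  return new Response(obj.body, {
    headers: {
      "Content-Type": "application/json",
      "Content-Encoding": "gzip",
      "Cache-Control": "public, max-age=30", // assumed: snapshots change twice a minute
    },
  });
}
```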

Archive snapshots (R2)

Key pattern                                                Retention
archive/{lineId}/trains/map/{YYYY-MM-DD}/{HH-MM-SS}.json   14 days full-res
Downsampled (every N minutes)                              15–28 days
Cold storage bucket                                        28+ days

Timetable files (R2)

timetables/{lineId}/{timetableId}/{dayType}/northbound.json
timetables/{lineId}/{timetableId}/{dayType}/southbound.json

The active timetable schedule is stored in KV, mapping dates to timetable IDs and day directories.
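The date-to-timetable resolution can be sketched as below. The shape of the KV value is an assumption; the output keys follow the R2 layout shown above.

```typescript
// Assumed shape of the KV schedule value, keyed by YYYY-MM-DD.
interface ScheduleEntry { timetableId: string; dayType: string; }
type Schedule = Record<string, ScheduleEntry>;

// Resolve the R2 keys for a line and service date, or null if no
// timetable is published for that date.
function timetableKeys(schedule: Schedule, lineId: string, date: string): string[] | null {
  const entry = schedule[date];
  if (!entry) return null;
  const base = `timetables/${lineId}/${entry.timetableId}/${entry.dayType}`;
  return [`${base}/northbound.json`, `${base}/southbound.json`];
}
```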

Monitoring

Better Stack monitors the /admin/health endpoint. Alerts fire on degraded or down status (email only on current tier — no phone calls). The most significant past incidents were on 25–26 March 2026, caused by repeated TfL-side rate limiting between approximately 21:00 and 01:00 GMT.

Internal monitoring metrics to watch:

  • R2 Class A ops — normal baseline ~2.8M/month. Alert if approaching 5M.
  • Worker CPU (DO compute duration) — normal ~560K GB·s/month (~£12.50). Watch for spikes.
  • TrackerNet 429s — repeated rate-limit responses trigger degraded state and exponential backoff.
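The backoff behaviour in the last bullet can be sketched as follows; the base delay, cap, and degraded threshold are assumptions for illustration, not measured values from the worker.

```typescript
const BASE_DELAY_MS = 5_000;        // assumed starting delay
const MAX_DELAY_MS = 5 * 60_000;    // assumed cap

// Delay doubles with each consecutive 429, capped at MAX_DELAY_MS.
function backoffDelayMs(consecutive429s: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** (consecutive429s - 1), MAX_DELAY_MS);
}

// Repeated rate limiting marks the line degraded (assumed threshold of 3).
function isDegraded(consecutive429s: number): boolean {
  return consecutive429s >= 3;
}
```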

Security

  • All endpoints require a valid API key (X-API-Key header or ?key= query param)
  • Keys are stored in the API_KEYS environment variable or API_KEYS_KV, never in source code
  • HTTPS enforced for all requests; CORS headers allow cross-origin access
  • Cloudflare provides DDoS mitigation at the edge
  • Two TfL API keys in rotation to avoid rate-limit concentration
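The key check in the first bullet amounts to a small helper; this is a sketch under the assumptions that the header takes precedence over the query parameter and that the configured keys are loaded into a set elsewhere.

```typescript
// Accept the key from the X-API-Key header or the ?key= query parameter.
function extractApiKey(req: Request): string | null {
  return req.headers.get("X-API-Key") ?? new URL(req.url).searchParams.get("key");
}

// Validate against the configured key set (loaded from env/KV elsewhere).
function isAuthorized(req: Request, validKeys: Set<string>): boolean {
  const key = extractApiKey(req);
  return key !== null && validKeys.has(key);
}
```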