Data Sources

TrackerNet and the TfL Unified API

Overview

Trackser Live aggregates and processes data from two distinct TfL (Transport for London) APIs to provide real-time train tracking: TrackerNet (legacy, detailed Underground data) and the Unified API (modern, cross-network coverage). Each source has its own characteristics, strengths, and use cases, and Trackser Live merges the two to provide comprehensive coverage.

â„šī¸ Note: Both data sources are publicly available through TfL's Open Data platform.

For more information on TfL's Open Data platform, visit: https://tfl.gov.uk/info-for/open-data-users/

Thank you to Transport for London for providing access to these valuable data sources that power Trackser Live.

Below is a detailed breakdown of each data source: its characteristics, strengths, and limitations, and how Trackser Live uses it.

🚇 TrackerNet API

Primary Data Source

TrackerNet is TfL's legacy real-time tracking system that has been in operation for many years. It provides detailed train position data through XML feeds for each line.

Key Characteristics:

  • Coverage: Underground lines only (not Elizabeth line or Overground)
  • Update Frequency: Updates on each request
  • Data Format: XML-based station-specific feeds
  • Location Detail: Provides track codes and platform-level granularity
  • Historical Reliability: Proven, stable system with consistent data structure

Strengths:

  • Rich metadata including track codes for precise location inference
  • Detailed platform and station-level information
  • Leading Car Number information
  • Consistent train identification across updates
  • Lower latency for Underground trains

Limitations:

  • Requires significant processing to normalize inconsistent station names
  • Contains duplicate and stale records that need filtering
  • Legacy XML format requires more parsing overhead
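To illustrate the parsing overhead, here is a minimal sketch of flattening a station feed into one record per train. The sample XML is hand-written and simplified; the element and attribute names (S, P, T, LCID, SetNo, TrackCode and so on) are assumptions modelled on the TrackerNet feed format, and the track codes are made up. It is not Trackser Live's actual parser.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample in the style of a TrackerNet station feed.
# Real feeds carry an XML namespace and many more attributes (assumption).
SAMPLE_XML = """
<ROOT>
  <S Code="OXC" N="Oxford Circus.">
    <P N="Northbound - Platform 4" Num="4">
      <T LCID="1234" SetNo="201" TripNo="5" SecondsTo="45"
         Location="Approaching Oxford Circus"
         Destination="Walthamstow Central" TrackCode="TV1234" />
      <T LCID="1235" SetNo="203" TripNo="7" SecondsTo="210"
         Location="At Green Park"
         Destination="Walthamstow Central" TrackCode="TV1180" />
    </P>
  </S>
</ROOT>
"""


def parse_trackernet(xml_text):
    """Flatten station -> platform -> train nesting into one dict per train."""
    trains = []
    root = ET.fromstring(xml_text.strip())
    for station in root.iter("S"):
        for platform in station.iter("P"):
            for train in platform.iter("T"):
                trains.append({
                    # Station names may carry trailing dots; trim them here.
                    "station": station.get("N", "").rstrip(". "),
                    "platform": platform.get("N"),
                    "leading_car": train.get("LCID"),
                    "set_number": train.get("SetNo"),
                    "track_code": train.get("TrackCode"),
                    "location": train.get("Location"),
                    "seconds_to": int(train.get("SecondsTo", "0")),
                    "source": "trackernet",
                })
    return trains


if __name__ == "__main__":
    for t in parse_trackernet(SAMPLE_XML):
        print(t["set_number"], t["track_code"], t["location"])
```

The nested loops mirror the feed's station, platform, train hierarchy, which is why each train record has to inherit its station and platform from the enclosing elements.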

🔷 Unified API

Modern Alternative Source

The Unified API is TfL's newer, modernized API that provides a consistent interface across all TfL services including Underground, Elizabeth line, DLR, and Overground networks.

Key Characteristics:

  • Coverage: All TfL services (Underground, Elizabeth line, DLR, Overground)
  • Update Frequency: Variable, typically 30-60 seconds
  • Data Format: JSON-based RESTful API
  • Location Detail: Station-level, less granular than TrackerNet
  • Modern Architecture: Standardized across all TfL services

Strengths:

  • Broader network coverage including newer lines
  • More consistent data structure across all services
  • Modern JSON format easier to parse and process
  • Better standardization of station names

Limitations:

  • Less detailed location information (no track codes)
  • Adds an extra layer of data processing, which can introduce latency
  • Uses a different train identification scheme, which can make tracking harder
  • Occasionally lacks data that TrackerNet provides
  • Has a caching layer, which can delay real-time updates
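As a rough illustration of working with the JSON format, the sketch below reduces a list of arrival predictions to one record per vehicle by keeping each vehicle's nearest prediction. The sample data is hand-written; the field names (vehicleId, stationName, timeToStation, currentLocation) follow the Unified API's arrival predictions as commonly documented, but treat them, and the suffix-stripping rule, as assumptions rather than Trackser Live's actual logic.

```python
import json

# Hand-written sample shaped like a Unified API arrivals response (assumption).
SAMPLE_JSON = """
[
  {"vehicleId": "201", "lineId": "victoria",
   "stationName": "Oxford Circus Underground Station",
   "currentLocation": "Approaching Oxford Circus", "timeToStation": 45},
  {"vehicleId": "201", "lineId": "victoria",
   "stationName": "Warren Street Underground Station",
   "currentLocation": "Approaching Oxford Circus", "timeToStation": 150},
  {"vehicleId": "203", "lineId": "victoria",
   "stationName": "Oxford Circus Underground Station",
   "currentLocation": "At Green Park", "timeToStation": 210}
]
"""

SUFFIX = " Underground Station"


def parse_unified(json_text):
    """Reduce per-station predictions to one record per vehicle."""
    nearest = {}
    for prediction in json.loads(json_text):
        vid = prediction["vehicleId"]
        # The same vehicle appears once per upcoming station; keep the
        # prediction with the smallest time-to-station.
        best = nearest.get(vid)
        if best is None or prediction["timeToStation"] < best["timeToStation"]:
            nearest[vid] = prediction

    trains = []
    for vid, p in nearest.items():
        name = p["stationName"]
        if name.endswith(SUFFIX):
            name = name[: -len(SUFFIX)]
        trains.append({
            "vehicle_id": vid,
            "line": p["lineId"],
            "station": name,
            "location": p["currentLocation"],
            "seconds_to": p["timeToStation"],
            "source": "unified",
        })
    return trains


if __name__ == "__main__":
    for t in parse_unified(SAMPLE_JSON):
        print(t["vehicle_id"], t["station"], t["seconds_to"])
```

Note that the result is station-level only: there is no track code to refine the position, which matches the first limitation listed above.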

Data Source Comparison

Feature              | TrackerNet                   | Unified API
Network Coverage     | Underground only             | All TfL networks
Data Format          | XML                          | JSON
Update Frequency     | ~30 seconds                  | ~30-60 seconds
Location Granularity | Track code level             | Station level
Platform Information | Detailed                     | Basic
Train Identification | Vehicle ID + Set Number      | Set Number
Data Consistency     | Requires heavy normalization | More standardized
API Age              | Legacy system                | Modern system

How Trackser Live Uses Both Sources

Intelligent Merging Strategy

Trackser Live intelligently combines data from both sources to provide the most complete and accurate picture of train movements. The merging process follows these principles:

  • Primary Source Selection: TrackerNet is typically preferred for Underground lines due to its detail and reliability
  • Fallback Mechanism: If TrackerNet data is unavailable or stale, the system falls back to Unified API
  • Data Enrichment: Information from both sources is combined when available to fill gaps
  • Deduplication: Trains appearing in both sources are identified and merged based on location and vehicle IDs
  • Source Tracking: Each train record includes a source field indicating its origin
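The principles above can be sketched as a small merge function. This is an illustrative simplification, not Trackser Live's implementation: the record fields, the matching key (vehicle ID plus lower-cased station name), and the source values "trackernet", "unified" and "merged" are all assumptions.

```python
def match_key(train):
    """Hypothetical matching key: vehicle ID plus lower-cased station name."""
    return (train["vehicle_id"], train["station"].lower())


def merge_trains(trackernet, unified, trackernet_ok=True):
    """Prefer TrackerNet records, enrich them from Unified, add the rest."""
    # Fallback: TrackerNet unavailable or stale, so use Unified data alone.
    if not trackernet_ok:
        return [dict(t, source="unified") for t in unified]

    merged = {}
    for t in trackernet:
        merged[match_key(t)] = dict(t, source="trackernet")

    for u in unified:
        key = match_key(u)
        if key in merged:
            record = merged[key]
            # Enrichment: fill only the fields TrackerNet left empty.
            for field, value in u.items():
                if record.get(field) in (None, ""):
                    record[field] = value
            record["source"] = "merged"
        else:
            # Train seen only by the Unified API.
            merged[key] = dict(u, source="unified")
    return list(merged.values())


if __name__ == "__main__":
    tn = [{"vehicle_id": "201", "station": "Oxford Circus",
           "track_code": "TV1234", "destination": ""}]
    un = [{"vehicle_id": "201", "station": "oxford circus",
           "destination": "Walthamstow Central"},
          {"vehicle_id": "900", "station": "Paddington",
           "destination": "Abbey Wood"}]
    for train in merge_trains(tn, un):
        print(train["vehicle_id"], train["source"], train["destination"])
```

Keying the dictionary on the match key makes deduplication a lookup rather than a pairwise comparison, and writing the source field at every branch keeps each record's origin traceable.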

Configuration

The Unified API can be enabled or disabled per line through configuration. By default, it's disabled for most Underground lines where TrackerNet provides superior data, but it can be enabled for:

  • Lines where TrackerNet has known issues
  • Extended network coverage (Elizabeth line, Overground)
  • Backup/redundancy during TrackerNet outages
  • Comparative data analysis
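A per-line switch of this kind might look like the following sketch. The setting names, line identifiers, and chosen values are hypothetical and only illustrate the default-off behaviour described above; the real configuration format may differ.

```python
# Hypothetical set of lines served by TrackerNet (illustrative, not exhaustive).
TRACKERNET_LINES = {"bakerloo", "central", "northern", "victoria"}

# Hypothetical per-line overrides; lines not listed use the default.
UNIFIED_ENABLED = {
    "elizabeth": True,          # extended network coverage
    "london-overground": True,  # extended network coverage
    "northern": True,           # example: enabled alongside TrackerNet
}
UNIFIED_DEFAULT = False  # disabled unless explicitly enabled


def active_sources(line):
    """Return the data sources to poll for a given line."""
    sources = []
    if line in TRACKERNET_LINES:
        sources.append("trackernet")
    if UNIFIED_ENABLED.get(line, UNIFIED_DEFAULT):
        sources.append("unified")
    return sources


if __name__ == "__main__":
    for line in ("victoria", "northern", "elizabeth"):
        print(line, active_sources(line))
```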

💡 Pro Tip: You can see which data sources are active in the API response under the sources field, and track merge statistics in the stats.merged section to understand how data from both sources was combined.

Processing Pipeline

Regardless of source, all data goes through Trackser Live's comprehensive processing pipeline:

  1. Data Acquisition: Fetch data from TrackerNet XML feeds and/or Unified API JSON endpoints, twice per minute
  2. Parsing & Validation: Parse raw data and validate required fields
  3. Location Normalization: Standardize station names and infer locations from track codes
  4. Deduplication: Remove duplicate records within and across sources
  5. Enrichment: Add destination codes, stopping patterns, and historical context
  6. Merging: Intelligently combine trains from multiple sources
  7. Stall Detection: Analyze location duration against historical averages
  8. Output Generation: Format and compress final JSON response
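Three of these steps (location normalization, deduplication, and stall detection) can be sketched in a few lines each. The record fields, the suffix rules, and the stall thresholds below are assumptions chosen for illustration; the actual pipeline's rules are not documented here.

```python
import re


def normalize_station(name):
    """Collapse whitespace, drop trailing dots and common station suffixes."""
    name = re.sub(r"\s+", " ", name).strip().rstrip(".")
    return re.sub(r" (Underground|Rail) Station$", "", name)


def deduplicate(trains):
    """Keep the freshest record per train ID (largest 'seen_at' value)."""
    latest = {}
    for train in trains:
        key = train["train_id"]
        if key not in latest or train["seen_at"] > latest[key]["seen_at"]:
            latest[key] = train
    return list(latest.values())


def is_stalled(dwell_seconds, historical_avg_seconds,
               factor=3.0, min_seconds=120):
    """Flag a train whose dwell time far exceeds the historical average.

    'factor' and 'min_seconds' are hypothetical tuning values.
    """
    return dwell_seconds >= max(min_seconds, factor * historical_avg_seconds)


if __name__ == "__main__":
    print(normalize_station("  Oxford  Circus. "))
    records = [
        {"train_id": "A", "seen_at": 10},
        {"train_id": "A", "seen_at": 20},
        {"train_id": "B", "seen_at": 15},
    ]
    print(deduplicate(records))
    print(is_stalled(dwell_seconds=400, historical_avg_seconds=60))
```

The minimum threshold in is_stalled guards against false positives at locations where the historical average dwell is only a few seconds.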

Statistics & Monitoring

The API provides detailed statistics about data source performance in every response:

  • stats.trackernet - TrackerNet processing metrics (pre/post filtering, dropped IDs)
  • stats.unified - Unified API processing metrics (pre/post filtering, synthetic trains)
  • stats.merged - Merge operation results and train counts by source
  • sources - Current status of each data source ("ok", "disabled", "error")
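A client could read these fields along the following lines. The top-level names (sources, stats.trackernet, stats.unified, stats.merged) come from the list above; the nested keys such as pre_filter and post_filter, and the shape of sources as a name-to-status mapping, are hypothetical stand-ins for whatever the real response contains.

```python
import json

# Hand-written sample response; nested key names are assumptions.
SAMPLE_RESPONSE = json.loads("""
{
  "sources": {"trackernet": "ok", "unified": "disabled"},
  "stats": {
    "trackernet": {"pre_filter": 120, "post_filter": 96},
    "unified": {"pre_filter": 0, "post_filter": 0},
    "merged": {"total": 96, "by_source": {"trackernet": 96, "unified": 0}}
  }
}
""")


def summarize(response):
    """Build one human-readable line per data source."""
    lines = []
    for name, status in response["sources"].items():
        source_stats = response["stats"].get(name, {})
        # Records removed by filtering for this source.
        dropped = (source_stats.get("pre_filter", 0)
                   - source_stats.get("post_filter", 0))
        lines.append(f"{name}: {status}, dropped {dropped} records")
    return lines


if __name__ == "__main__":
    for line in summarize(SAMPLE_RESPONSE):
        print(line)
```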
