LAMP (Lightweight Application for Measuring Performance): Our goal is to consume real-time data from the MBTA network, process it to create helpful performance metrics, and make that data publicly available for any and all to use.
Our application code is available on GitHub: https://github.com/mbta/lamp
Data Dictionary documentation for LAMP data exports: Data Dictionary README
Please note that we are in the early days of making datasets publicly available. This page may change frequently, so check back for updates.
The MBTA publishes an implementation of the General Transit Feed Specification (GTFS) to communicate planned system service.
The LAMP team has created a compressed archive containing all MBTA GTFS schedules issued since 2009.
For each combination of year and GTFS File/Field Definition, a parquet file exists, e.g. (for feed_info.txt): https://performancedata.mbta.com/lamp/gtfs_archive/YYYY/feed_info.parquet
Additionally, each parquet file contains two integer columns added by LAMP: gtfs_active_date and gtfs_end_date. Both hold dates encoded as YYYYMMDD integers, and together they are used to filter the parquet file for the GTFS records that were applicable on a single service date.
To find the feed_info.txt records that were applicable on December 25, 2022, query https://performancedata.mbta.com/lamp/gtfs_archive/2022/feed_info.parquet, with (gtfs_active_date <= 20221225 AND gtfs_end_date >= 20221225) to retrieve:
feed_publisher_name | feed_lang | feed_version | feed_start_date | feed_publisher_url | feed_id | feed_end_date | feed_contact_email | gtfs_active_date | gtfs_end_date |
---|---|---|---|---|---|---|---|---|---|
str | str | str | i64 | str | str | i64 | str | i32 | i32 |
MBTA | EN | Winter 2023, 2022-12-22T20:50:07+00:00, version D | 20221215 | http://www.mbta.com | null | 20221221 | developer@mbta.com | 20221223 | 20221229 |
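The filter above can be sketched in Python. This is a minimal illustration rather than LAMP's own tooling: the helper names (`service_date_int`, `archive_url`, `read_applicable`) are hypothetical, and the download step assumes pandas with a parquet engine such as pyarrow is installed.

```python
from datetime import date

GTFS_ARCHIVE_BASE = "https://performancedata.mbta.com/lamp/gtfs_archive"


def service_date_int(service_date: date) -> int:
    """Encode a date as the YYYYMMDD integer used by gtfs_active_date / gtfs_end_date."""
    return service_date.year * 10000 + service_date.month * 100 + service_date.day


def archive_url(gtfs_file: str, service_date: date) -> str:
    """Build the yearly parquet URL for a GTFS file name such as 'feed_info'."""
    return f"{GTFS_ARCHIVE_BASE}/{service_date.year}/{gtfs_file}.parquet"


def read_applicable(gtfs_file: str, service_date: date):
    """Download one parquet file and keep the rows applicable on service_date.

    Assumption: pandas and a parquet engine (e.g. pyarrow) are available.
    """
    import pandas as pd  # imported lazily so the helpers above need only the stdlib

    day = service_date_int(service_date)
    frame = pd.read_parquet(archive_url(gtfs_file, service_date))
    # Same predicate as the text: gtfs_active_date <= day AND gtfs_end_date >= day
    mask = (frame["gtfs_active_date"] <= day) & (frame["gtfs_end_date"] >= day)
    return frame[mask]
```

Calling `read_applicable("feed_info", date(2022, 12, 25))` should return the feed_info records applicable on that service date, provided the archive is reachable.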
For each year, the LAMP team has also created an SQLite file that mirrors the parquet files discussed above.
Because of their size, the SQLite files are gzipped. URL construction: replace [YYYY] with the year of the requested service date.
Every GTFS File/Field Definition that is available as a parquet file has an equivalent table in the SQLite file. Filter that table on the integer columns (gtfs_active_date, gtfs_end_date) to get the records applicable to a single service date.
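As a rough sketch of working with a downloaded SQLite archive, the snippet below decompresses the gzipped file and applies the same date filter to one table. The function names and local file paths are hypothetical, and the code assumes each table is named after its GTFS file (for example, `feed_info`).

```python
import gzip
import shutil
import sqlite3


def gunzip(src_path: str, dst_path: str) -> None:
    """Decompress a downloaded .gz archive into a plain SQLite file."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def rows_for_service_date(conn: sqlite3.Connection, table: str, service_date: int) -> list:
    """Return the rows of `table` applicable on a YYYYMMDD service date."""
    # Table names cannot be bound as SQL parameters, so quote the identifier by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    query = (
        f"SELECT * FROM {quoted} "
        "WHERE gtfs_active_date <= ? AND gtfs_end_date >= ?"
    )
    cursor = conn.cursor()
    cursor.row_factory = sqlite3.Row  # allows access by column name
    return cursor.execute(query, (service_date, service_date)).fetchall()
```

After `gunzip("archive.db.gz", "archive.db")` on a local download, `rows_for_service_date(sqlite3.connect("archive.db"), "feed_info", 20221225)` would return the matching feed_info rows.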
GTFS records are considered to be applicable on the day after the publish date found in the feed_version column of the feed_info table.
The most recently published GTFS schedule is considered "active" by the (gtfs_active_date, gtfs_end_date) columns for one year past the schedule publish date. This means the most recent parquet files can be queried for future service dates.
Not all current GTFS File/Field Definition files are available for every year. For example, timeframes.txt was introduced in 2023, so timeframes.parquet does not exist for any partition years before 2023.
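The day-after-publish rule can be checked against the example row shown earlier. The sketch below assumes every feed_version string follows the layout of that row (`<rating name>, <ISO 8601 timestamp>, version <letter>`) and that the date is taken from the timestamp as written. Both are assumptions, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def publish_date(feed_version: str):
    """Extract the publish date from a feed_version string.

    Assumed layout: '<rating name>, <ISO 8601 timestamp>, version <letter>'.
    """
    timestamp = feed_version.split(",")[1].strip()
    return datetime.fromisoformat(timestamp).date()


def first_applicable_date(feed_version: str) -> int:
    """Return the day after the publish date as a YYYYMMDD integer."""
    day = publish_date(feed_version) + timedelta(days=1)
    return int(day.strftime("%Y%m%d"))


example = "Winter 2023, 2022-12-22T20:50:07+00:00, version D"
print(first_applicable_date(example))  # 20221223, matching gtfs_active_date in the example row
```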
Performance Data for the MBTA Subway system partitioned by service date.
URL Construction: Replace [YYYY-MM-DD] with the YEAR, MONTH and DAY of the requested service date.
A CSV file of all published service dates and file paths is available here: https://performancedata.mbta.com/lamp/subway-on-time-performance-v1/index.csv
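One way to discover the available files is to read the index with Python's standard library. This page does not state the column names in index.csv, so the sketch below keys each row by whatever header the file provides and assumes only that the first line is a header row. The function names are hypothetical.

```python
import csv
import io
from urllib.request import urlopen

INDEX_URL = "https://performancedata.mbta.com/lamp/subway-on-time-performance-v1/index.csv"


def parse_index(csv_text: str) -> list:
    """Parse index CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def fetch_index(url: str = INDEX_URL) -> list:
    """Download and parse the index (requires network access)."""
    with urlopen(url) as response:
        return parse_index(response.read().decode("utf-8"))
```

Printing the first entry of `fetch_index()` is a quick way to see which columns the index actually provides before relying on any of them.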
LAMP creates several datasets for the MBTA's internal Tableau server. The data analysts in the MBTA departments Operations Analytics and OPMI use these datasets to develop dashboards, metrics, and reports for both MBTA Operations and the public. These datasets are available at the following links.