Configuring transit systems

Transit systems - like the NYC subway or San Francisco BART - are added to Transiter by providing a YAML configuration file that contains information about the transit system. You can find many examples of transit system configurations in the systems subdirectory of the Transiter repository. If the system you're interested in is there, then you don't need to do anything! From the root of the Transiter repository, you can install the system using:

transiter install $SYSTEM_ID

Otherwise, to add a transit system to Transiter you need to write a new YAML file. At the very least, this file contains:

The transit system's name.
The URLs of its GTFS data feeds.

There are also some advanced configurations available, which are discussed below:

Parameters (like API keys) that must be provided with system install requests.
Custom definitions for the system's "service maps".

After writing a YAML file, the transit system is installed in the same way:

transiter install --file $PATH_TO_YAML_FILE

The system ID will be the name of the YAML file (e.g., path/to/my-system.yaml will have ID my-system) but this can be overridden with the --id flag.

The schema for the YAML config is given by the system config type in the API schema. All of the snake_case field names in the proto are in camelCase in the YAML.
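The naming rule above can be sketched in Python. The helper name `to_camel_case` is hypothetical, and the snake_case inputs are inferred from the YAML field names used on this page; the sketch only covers plain lowercase names separated by underscores.

```python
def to_camel_case(snake: str) -> str:
    """Convert a snake_case field name to the camelCase form used in the YAML."""
    first, *rest = snake.split("_")
    # Keep the first word as-is and capitalize the first letter of each later word.
    return first + "".join(part.capitalize() for part in rest)


print(to_camel_case("request_timeout_ms"))    # requestTimeoutMs
print(to_camel_case("required_for_install"))  # requiredForInstall
```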

Basic configuration

The following is a basic example of a transit system configuration. Each section of the configuration is described below.

name: Transit system name

feeds:
  - id: gtfsstatic
    type: GTFS_STATIC
    url: https://www.transitsystem.com/feed_1

  - id: gtfsrealtime
    type: GTFS_REALTIME
    url: https://www.transitsystem.com/feed_2
    # Optional fields for the HTTP request. Generally these don't need to be set.
    headers:
    - X-Extra-Header: "header value"
    requestTimeoutMs: 4000

Basic transit system information

The config begins with the transit system name:

name: Transit System Name

The name can be any string.

Feeds

Next, the config describes the feeds for the system. This is the most important part of the configuration.

feeds:
  - id: gtfsstatic
    # feed configuration...

The feed ID must be unique within the configuration file, and can't contain the / character.
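The two rules for feed IDs can be checked before installing a system. The following Python sketch is a hypothetical pre-flight check, not part of Transiter; it assumes the feed IDs have already been read out of the YAML file into a list.

```python
def validate_feed_ids(feed_ids: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means the IDs are valid."""
    problems = []
    seen = set()
    for feed_id in feed_ids:
        if "/" in feed_id:
            problems.append(f"feed ID {feed_id!r} contains the / character")
        if feed_id in seen:
            problems.append(f"feed ID {feed_id!r} is used more than once")
        seen.add(feed_id)
    return problems


print(validate_feed_ids(["gtfsstatic", "gtfsrealtime"]))  # []
```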

Feed type

The feed type tells Transiter what kind of feed this is.

    type: GTFS_STATIC

There are currently 3 options:

GTFS_STATIC: a GTFS static feed.
GTFS_REALTIME: a GTFS realtime feed.
NYCT_TRIPS_CSV: a special type of feed that is only relevant for the NYC subway.

The Transiter project is open to adding other types of feed, especially to support transit systems that don't provide data in the GTFS format.

Feed URLs

The configuration for the feed includes instructions for how to obtain it over the internet:

    url: https://www.transitsystem.com/feed_2
    headers:
    - X-Extra-Header: "header value"
    requestTimeoutMs: 4000

Transiter will perform a GET request to the given URL with the specified headers and with the provided timeout. If not specified:

no additional headers are sent.
a default timeout of 5 seconds is used.
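To make the request behavior concrete, here is a minimal Python sketch of an equivalent request built with the standard library. It is an illustration only and not Transiter's implementation; the function name `build_feed_request` is hypothetical, and the 5000 ms constant simply mirrors the default described above.

```python
import urllib.request

DEFAULT_TIMEOUT_MS = 5000  # mirrors the 5 second default described above


def build_feed_request(url, headers=None, request_timeout_ms=None):
    """Return a (Request, timeout_in_seconds) pair describing a feed fetch."""
    # With no headers configured, no additional headers are attached.
    request = urllib.request.Request(url, headers=headers or {}, method="GET")
    if request_timeout_ms is None:
        request_timeout_ms = DEFAULT_TIMEOUT_MS
    return request, request_timeout_ms / 1000


request, timeout_s = build_feed_request(
    "https://www.transitsystem.com/feed_2",
    headers={"X-Extra-Header": "header value"},
    request_timeout_ms=4000,
)
# To actually perform the fetch:
#   urllib.request.urlopen(request, timeout=timeout_s).read()
```

The sketch builds the request without sending it, so it can be inspected or tested without network access.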

Advanced: scheduling policy

After a transit system is installed, Transiter periodically performs feed updates for all feeds in the system. A feed update fetches new data from the feed URL and updates the data in Transiter accordingly. By default Transiter uses the following schedule for feed updates:

For GTFS realtime feeds, Transiter performs a feed update every 5 seconds.
For GTFS static and other feeds, Transiter performs a feed update at around 3am in the timezone of the transit system. The timezone is read from the GTFS static data.

This behavior can be overridden by setting the schedulingPolicy field in the feed configuration. For example, to update the feed every 20 seconds:

feeds:
  - id: gtfsstatic
    # other fields...
    schedulingPolicy: PERIODIC
    periodicUpdatePeriodMs: 20000  # 20000 milliseconds = 20 seconds

To update the feed at 5pm every day in the US Eastern timezone:

feeds:
  - id: gtfsstatic
    # other fields...
    schedulingPolicy: DAILY
    dailyUpdateTime: "17:00"
    dailyUpdateTimezone: America/New_York

To stop automatic updates entirely:

feeds:
  - id: gtfsstatic
    # other fields...
    schedulingPolicy: NONE

Note that feed updates can always be triggered manually by using the feed update method in the admin API.
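The three scheduling policies can be pictured as a function that computes the next update time. The Python sketch below is a simplified model under stated assumptions, not Transiter's scheduler: the function name and parameters are hypothetical, and the timezone is passed as a `tzinfo` object (for example `zoneinfo.ZoneInfo("America/New_York")` where the timezone database is available).

```python
from datetime import datetime, timedelta, timezone


def next_update(policy, now, period_ms=None, daily_time=None, tz=None):
    """Return the next update time after `now`, or None if updates are disabled."""
    if policy == "NONE":
        return None
    if policy == "PERIODIC":
        return now + timedelta(milliseconds=period_ms)
    if policy == "DAILY":
        local_now = now.astimezone(tz)
        hour, minute = (int(part) for part in daily_time.split(":"))
        candidate = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if candidate <= local_now:
            # Today's slot has already passed, so use tomorrow's.
            candidate += timedelta(days=1)
        return candidate
    raise ValueError(f"unknown scheduling policy: {policy}")


# A fixed UTC-5 offset stands in for US Eastern time in this example.
eastern = timezone(timedelta(hours=-5))
now = datetime(2024, 1, 15, 18, 0, tzinfo=eastern)
print(next_update("DAILY", now, daily_time="17:00", tz=eastern))
print(next_update("PERIODIC", now, period_ms=20000))
```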

Advanced: required for install

In the process of installing a transit system, Transiter by default performs feed updates for all static feeds in the system. The motivation is that a transit system is not very useful without static data (like the list of all stations), and so to fully install a system the data must already be in place. Transiter does not perform feed updates for realtime feeds during the install process.

This behavior can be overridden using the requiredForInstall field:

feeds:
  - id: gtfsstatic
    # other fields...
    requiredForInstall: false

  - id: gtfsrealtime
    # other fields...
    requiredForInstall: true
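The default and the override can be summarized in a few lines of Python. This is a sketch of the rule as described above, with feeds represented as plain dictionaries; the function name is hypothetical, and treating only GTFS_STATIC as a "static" feed is an assumption.

```python
def updated_during_install(feed: dict) -> bool:
    """Decide whether a feed is updated as part of the install process."""
    explicit = feed.get("requiredForInstall")
    if explicit is not None:
        # An explicit setting always wins over the default.
        return explicit
    # Default: only static feeds are updated during install (assumption:
    # "static" means the GTFS_STATIC feed type).
    return feed["type"] == "GTFS_STATIC"


feeds = [
    {"id": "gtfsstatic", "type": "GTFS_STATIC"},
    {"id": "gtfsrealtime", "type": "GTFS_REALTIME"},
    {"id": "realtime_needed", "type": "GTFS_REALTIME", "requiredForInstall": True},
]
print([feed["id"] for feed in feeds if updated_during_install(feed)])
# prints ['gtfsstatic', 'realtime_needed']
```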

Advanced: GTFS realtime options

For GTFS realtime feeds, additional options can be provided that affect how the feed is parsed. For example, Transiter supports GTFS realtime extensions for the NYC subway. These additional options are set using the gtfsRealtimeOptions field and are described in the API reference.

User-provided parameters

Sometimes the transit system configuration needs to be personalized for each individual installation. For example, if one of the feeds requires an API key, it's best to have that provided by the person installing the transit system rather than hard-coding a specific key into the configuration. That way configurations can be safely shared without also sharing private keys.

To support these situations, Transiter has a way for system configuration files to accept parameters. When installing a system using the CLI, the format is:

transiter install --arg name1=value1 --arg name2=value2 -f $TRANSIT_SYSTEM_ID $PATH_TO_YAML_FILE

When arguments are passed like this, Transiter interprets the configuration file as a Go template. The arguments can be used in the YAML config using the {{ .Args.name1 }} syntax.

The following is a simple example of providing an API key using arguments:

    http:
      url: "https://www.transitsystem.com/feed_1?api_key={{ .Args.api_key }}"

The NYC Subway system configuration is an example of a real configuration that uses arguments.
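To illustrate what argument substitution does to the configuration text, the following Python sketch replaces {{ .Args.NAME }} placeholders using a regular expression. It is a deliberately simplified stand-in and not Go's template engine: it handles only this one placeholder form, and the function name `render_args` is hypothetical.

```python
import re

# Matches placeholders of the form {{ .Args.some_name }}.
ARG_PATTERN = re.compile(r"\{\{\s*\.Args\.(\w+)\s*\}\}")


def render_args(config_text: str, args: dict[str, str]) -> str:
    """Substitute each {{ .Args.NAME }} placeholder with the matching argument."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in args:
            raise KeyError(f"missing argument: {name}")
        return args[name]

    return ARG_PATTERN.sub(substitute, config_text)


line = 'url: "https://www.transitsystem.com/feed_1?api_key={{ .Args.api_key }}"'
print(render_args(line, {"api_key": "abc123"}))
# prints url: "https://www.transitsystem.com/feed_1?api_key=abc123"
```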

Service maps

Service maps are a novel feature of Transiter that provide a rich connection between routes and stops that is missing in the standard GTFS static specification.

When people think of a route, like the L train in New York City, they usually think about the list of stops the route calls at (8th Ave, 6th Ave, Union Sq...). When people think of a stop, like Pico station in Los Angeles, they usually think of the routes available at that stop (the A and E lines).

The GTFS static specification does not contain this data explicitly. Transiter service maps are built on the idea that this data is contained in the static data implicitly. Namely, if you have the complete timetable for a transit system, it is possible to auto-generate the list of stops for each route by merging together the paths taken by each trip in that route. Once you have this list worked out for each route, for a given stop you can then determine which routes call at it by finding the routes that contain the stop in the corresponding list.

Service maps implement this idea. Moreover, service maps work not just for the complete timetable: you can configure a service map for given "slices" of the timetable (for example the weekday day service), as well as for realtime data.
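The merging idea can be sketched in Python with a topological sort. This is one possible way to merge trip paths and is not necessarily the algorithm Transiter uses; the function names are hypothetical, the stop names are illustrative, and the sketch assumes all trips run in the same direction so that the stop orderings never contradict each other.

```python
from graphlib import TopologicalSorter


def merge_trip_paths(trip_paths):
    """Merge per-trip stop sequences into one ordered list of stops for a route."""
    sorter = TopologicalSorter()
    for path in trip_paths:
        for stop in path:
            sorter.add(stop)
        for earlier, later in zip(path, path[1:]):
            # Record that `later` must come after `earlier`.
            sorter.add(later, earlier)
    # Raises graphlib.CycleError if the trips disagree about stop order.
    return list(sorter.static_order())


def routes_at_stop(route_stops, stop_id):
    """Return the routes whose merged stop list contains the given stop."""
    return sorted(route for route, stops in route_stops.items() if stop_id in stops)


l_stops = merge_trip_paths([
    ["8 Av", "6 Av", "Union Sq"],
    ["6 Av", "Union Sq", "3 Av"],
])
print(l_stops)  # ['8 Av', '6 Av', 'Union Sq', '3 Av']
print(routes_at_stop({"L": l_stops}, "Union Sq"))  # ['L']
```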

To see how service maps look in the HTTP API, check out the service_maps data given in these endpoints:

Configuring service maps

Each route in a transit system can have multiple service maps. The desired service maps are defined in the YAML configuration. If no service maps are defined, the default service maps are used.

Here's an example that defines three service maps, alltimes, weekday_day, and realtime:

serviceMaps:
  - id: alltimes
    source: STATIC
    threshold: 0.05

  - id: weekday_day
    source: STATIC
    staticOptions:
      days: ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
      startsLaterThan: 7
      endsEarlierThan: 19

  - id: realtime
    source: REALTIME

Let's step through the options for each one.

The source parameter can be either STATIC (so the map is generated using the timetable data from the GTFS static feeds) or REALTIME (so the map is generated using data from the GTFS realtime feeds).

The threshold parameter is a way of removing one-off trips that may follow a non-standard list of stops. With a threshold of 0.05, Transiter collects all of the trips for a route, groups them by the list of stops they call at, and removes the trips in any group that accounts for less than 5% of the route's trips.
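The threshold rule can be expressed as a short Python sketch. It follows the description above rather than Transiter's source code; the function name is hypothetical and each trip is represented simply as its list of stops.

```python
from collections import Counter


def filter_by_threshold(trip_paths, threshold):
    """Drop trips whose stop list is shared by too small a fraction of trips."""
    # Group trips by the exact sequence of stops they call at.
    counts = Counter(tuple(path) for path in trip_paths)
    total = len(trip_paths)
    return [
        path
        for path in trip_paths
        if counts[tuple(path)] / total >= threshold
    ]


# 39 regular trips and 1 one-off trip that skips a stop (2.5% of all trips).
trips = [["A", "B", "C"]] * 39 + [["A", "C"]]
kept = filter_by_threshold(trips, 0.05)
print(len(kept))  # 39
```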

The staticOptions field makes it possible to build maps from certain portions of the timetable. The alltimes map contains no conditions: it's built using the full timetable. However, the weekday_day map contains three conditions: it only uses the timetable corresponding to trips that:

Run on weekdays (days: ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"])
Start after 7am (startsLaterThan: 7)
End before 7pm (endsEarlierThan: 19)
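The three conditions can be read as a filter over trips. The Python sketch below shows that reading under stated assumptions: the trip representation (a dictionary with `day`, `start_hour`, and `end_hour` keys) and the function name are hypothetical, and treating the hour boundaries as inclusive is a guess rather than documented behavior.

```python
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]


def matches_static_options(trip, days=None, starts_later_than=None, ends_earlier_than=None):
    """Return True if the trip satisfies every condition that is set."""
    if days is not None and trip["day"] not in days:
        return False
    if starts_later_than is not None and trip["start_hour"] < starts_later_than:
        return False
    if ends_earlier_than is not None and trip["end_hour"] > ends_earlier_than:
        return False
    return True


trips = [
    {"id": "t1", "day": "Monday", "start_hour": 8.5, "end_hour": 9.25},
    {"id": "t2", "day": "Saturday", "start_hour": 8.5, "end_hour": 9.25},
    {"id": "t3", "day": "Tuesday", "start_hour": 5.0, "end_hour": 6.0},
]
weekday_day = [
    trip["id"]
    for trip in trips
    if matches_static_options(trip, days=WEEKDAYS, starts_later_than=7, ends_earlier_than=19)
]
print(weekday_day)  # ['t1']
```

With no conditions set, every trip matches, which corresponds to a map built from the full timetable.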