Description
Add Generation Script for londonTubeLines.json
Dataset
The londonTubeLines.json dataset, showcased in this example:
- has a complex lineage
- lacks a generation script
Given the significant community interest in geospatial visualization, maintaining reproducible geographic datasets seems to be a worthwhile priority. Adding a script to the repo that can generate (or update) londonTubeLines.json from its original source, which is believed to be OpenStreetMap, would help secure the dataset's long-term viability. Input from those with geospatial data expertise would be welcome.
Background and Current Status
As I understand it, the londonTubeLines.json dataset is a TopoJSON file representing selected London Underground rail lines. It appears to have been added to the repository in this commit. The dataset's description, sources, and license are currently being expanded in pull request #663.
The commit history and related documentation suggest the following lineage:
- Original Source (Likely OpenStreetMap): The data was likely sourced from OpenStreetMap, although a direct link could not be found.
- Intermediate Source 1 (oobrien/vis): User @oobrien appears to have processed the data into a simplified GeoJSON format, tfl_lines.json. The file can be found in this commit of the oobrien/vis repository, which cites OpenStreetMap. This file represents a simplified view of London transport lines from the original source.
- Intermediate Source 2 (gicentre/litvis): @jwoLondon documented the process of converting tfl_lines.json to a TopoJSON file (similar to londonTubeLines.json) in this tutorial. This involved filtering specific lines and mapping properties using ndjson-cli and topojson. When I attempted to follow the instructions (quoted below), I wasn't quite able to match this repo's version. Also, this code still relies on an intermediate source, not the original source.
topoJSON files are not limited to areal units. Here, for example, we can import a file containing the geographical routes of selected London Underground tube lines. The conversion of tfl_lines.json follows a similar pattern to the conversion of the borough boundary files, but with some minor differences:
- The file is already in unprojected geoJSON format so does not need reprojecting or conversion from a shapefile.
- ndjson-cat converts the original geoJSON file to a single line necessary for further processing.
- The file contains details of more rail lines than we need to map, so ndjson-filter is used with a regular expression to select data for tube and DLR lines only.
- The property we will use for the id (the tube line name) is inside the first element of an array, so we reference it with [0] (where there is more than one element in the array, it indicates that more than one named tube line shares the same physical line).
ndjson-cat < tfl_lines.json \
  | ndjson-split 'd.features' \
  | ndjson-filter 'd.properties.lines[0].name.match("Ci.*|Di.*|No.*|Ce.*|DLR|Ha.*|Ba.*|Ju.*|Me.*|Pi.*|Vi.*|Wa.*")' \
  | ndjson-map 'd.id = d.properties.lines[0].name,delete d.properties,d' \
  | geo2topo -n -q 1e4 line="-" \
  > londonTubeLines.json
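To make the filter and map steps from the tutorial easier to reason about, here is a minimal sketch of the same two expressions in plain Node.js. The sample features are hypothetical and only mimic the `properties.lines[].name` shape the commands rely on; the real tfl_lines.json may carry additional fields. The helper names `keepFeature` and `toIdOnly` are illustrative and are not part of ndjson-cli.

```javascript
// Hypothetical sample features; only the properties.lines[].name shape is
// taken from the tutorial's commands.
const features = [
  { type: "Feature", properties: { lines: [{ name: "Circle" }, { name: "District" }] } },
  { type: "Feature", properties: { lines: [{ name: "London Overground" }] } },
  { type: "Feature", properties: { lines: [{ name: "DLR" }] } },
];

// Same pattern string as the ndjson-filter step. String.prototype.match
// converts a string argument into an unanchored, case-sensitive RegExp.
const pattern = "Ci.*|Di.*|No.*|Ce.*|DLR|Ha.*|Ba.*|Ju.*|Me.*|Pi.*|Vi.*|Wa.*";

// Mirrors the ndjson-filter expression: only the FIRST line name is tested.
function keepFeature(d) {
  return d.properties.lines[0].name.match(pattern) !== null;
}

// Mirrors the ndjson-map expression: set the id, then drop the properties.
function toIdOnly(d) {
  const copy = { ...d, id: d.properties.lines[0].name };
  delete copy.properties;
  return copy;
}

const result = features.filter(keepFeature).map(toIdOnly);
console.log(result.map((d) => d.id)); // [ 'Circle', 'DLR' ]
```

Note that a segment shared by several lines receives only the name of its first listed line as its id, so the order of entries in `lines` affects the output.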
An initial attempt was made to create a generation script using @oobrien's tfl_lines.json as a starting point. The script involved using ndjson-cli, topojson, and d3-geo-centroid, but the output did not perfectly match the existing londonTubeLines.json in vega-datasets.
1. Setup Commands
npm install -g shapefile ndjson-cli topojson d3-geo-centroid
apt-get install gdal-bin
wget https://raw.githubusercontent.com/oobrien/vis/master/tubecreature/data/tfl_lines.json
ndjson-cat tfl_lines.json \
| ndjson-split 'd.features' \
| ndjson-filter 'd.properties.lines.some((l) => l.name == "DLR" || l.name == "Bakerloo" || l.name == "District" || l.name == "Piccadilly" || l.name == "Northern" || l.name == "Hammersmith & City" || l.name == "Jubilee" || l.name == "Circle" || l.name == "Waterloo & City" || l.name == "Victoria" || l.name == "Metropolitan" || l.name == "Central") && !d.properties.lines.some((l) => l.name == "London Overground")' \
| ndjson-map 'd.id = d.properties.lines[0].name + (d.id ? "_" + d.id : ""), d' \
> tfl_lines_filtered.ndjson
geo2topo -n -q 1e4 line=tfl_lines_filtered.ndjson > londonTubeLines.json
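One way to narrow down where the generated file diverges from the version in this repo is to compare how many geometries each file holds per id. The following is a rough sketch under stated assumptions, not a definitive check: the function names are hypothetical, the object name is passed in as a parameter (`line` is what the `geo2topo` command above produces), and it ignores coordinate-level differences such as quantization.

```javascript
// Count geometries per id inside one named object of a TopoJSON topology.
function countGeometriesById(topology, objectName) {
  const object = topology.objects[objectName];
  if (!object) throw new Error(`No object named "${objectName}"`);
  // geo2topo emits a GeometryCollection for multi-feature input; fall back
  // to treating a single geometry as a one-element list.
  const geometries =
    object.type === "GeometryCollection" ? object.geometries : [object];
  const counts = {};
  for (const g of geometries) {
    const id = g.id === undefined ? "(no id)" : String(g.id);
    counts[id] = (counts[id] || 0) + 1;
  }
  return counts;
}

// Report only the ids whose counts differ between the two summaries.
function diffCounts(generated, existing) {
  const diff = {};
  const ids = new Set([...Object.keys(generated), ...Object.keys(existing)]);
  for (const id of ids) {
    const g = generated[id] || 0;
    const e = existing[id] || 0;
    if (g !== e) diff[id] = { generated: g, existing: e };
  }
  return diff;
}

// Example usage (file paths are hypothetical):
//   const fs = require("fs");
//   const load = (p) => JSON.parse(fs.readFileSync(p, "utf8"));
//   console.log(diffCounts(
//     countGeometriesById(load("londonTubeLines.json"), "line"),
//     countGeometriesById(load("data/londonTubeLines.json"), "line")));
```

An empty result would only show that the id counts agree; the arcs themselves could still differ.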