Automatically downloads the latest version of maps from the OS Data Hub and converts a configurable selection of these to GeoJSON format.
- Configurable to work with any OSOpenData product.
- Keeps a versions file to track the product version for all downloaded source data and extracted files.
- If a target file is missing or its version is out of date, its source data is downloaded and the file is extracted.
- Normally `ogr2ogr` is used to extract each map layer.
- Optionally a shell script can be used to perform a custom extract.
- Working files (zips and expanded zips) and target GeoJSON files are stored in separate, configurable locations.
- The project is configured via a flexible JSON file.
- Clips map layers to a geography stored in a GeoJSON file.
This project is intended for a small website that uses OpenLayers or a similar library to display map layers for a specific area of the UK, such as a district or county council area. It keeps these maps up to date automatically. Map loading is expected to be performed off-line, with the map data directory being uploaded at a later date. Typically a cron job would be defined to run this script periodically (for example daily) to look for updates and install them if found.
This project is designed to run in a Unix-like environment such as Linux or macOS. The scripts rely on bash and standard Unix command-line tools.
- Tested Platforms: Ubuntu 24.0 and Raspbian Bookworm.
- macOS: As macOS is a Unix-like operating system, the scripts should work, but this has not yet been formally tested.
- For Windows users, we highly recommend installing and running the scripts from an Ubuntu environment under the Windows Subsystem for Linux (WSL).
Familiarity with the command line is required.
- `ogr2ogr` is used for data processing. The `gdal-bin` package is required.

**On Ubuntu**: The default GDAL packages are often outdated. We recommend using the `ubuntugis` PPA for a more recent version.
```bash
sudo add-apt-repository ppa:ubuntugis/ppa
sudo apt-get update
sudo apt-get install gdal-bin
ogrinfo --version
```
This last step verifies installation.

**On Raspbian**:
```bash
sudo apt update
sudo apt upgrade
sudo apt install libgdal-dev
```
- The project's main logic is a Node.js application. It was developed and tested with version 24.0.1. To check your current version, run `node -v`. If you need to install or update Node.js, we recommend using a version manager like nvm.
- Once Node.js is installed, navigate to the project directory and install the required npm packages by running `npm install`. The package.json file lists the packages, so no further arguments are needed.
This section outlines the steps to adapt the project to your local environment and specific mapping needs.
The project zip should be downloaded from GitHub and unzipped to a directory on your system.
As well as the directory structure you have just downloaded, you will need to create two other directories:
- One for zip files (ZIPBASE) and expanded raw map files downloaded from OS. This directory is likely to get quite large whilst loading maps, but its contents can be deleted afterwards. It is not uncommon for a downloaded zip to approach 1 GB in size, of which you may only want a small part.
- One for maps to use in the final system (PROCESSED_DIR). This will be much smaller as it will only contain maps for your area.
For the purposes of documentation I have defined these as:
PROCESSED_DIR=/var/www/mapdata/
ZIPBASE=/home/myuser/raw_maps/osopendata/
## Script Configuration:
The process is run from the shell script update.sh, which is as follows:
#!/bin/bash
PROCESSED_DIR=/var/www/mapdata/
ZIPBASE=/home/jules/raw_maps/osopendata/
NODE=/home/jules/.nvm/versions/node/v24.0.1/bin/node
SCRIPTS=/home/jules/Projects/maploaders/OpenOSloader/scripts/
${NODE} ${SCRIPTS}update_opendata.js processed=${PROCESSED_DI
Determine the actual location of node using:
```bash
whereis node
```
Note the location returned.
Use an editor such as nano to edit this script so that:
- PROCESSED_DIR and ZIPBASE are set to the directories you created for them.
- NODE is set to the location of node returned from whereis (this helps later with running it as a cron job, where the environment is often restricted).
- SCRIPTS is set to the project's scripts directory.
There is an example_config.json in the project's scripts directory. It is a cut-down version of the live config.json that is sufficient to illustrate the main points.
This file controls which products are downloaded, which types of map are extracted and which areas are covered.
- products
{
"products":{
"BoundaryLine": {
"name": "Boundary line",
"formats": [
"ESRI® Shapefile"
],
"targets": [
{
Each key in the products object is the id of an OS product listed in the products list on the OS Data Hub, so this file is configured to load BoundaryLine and OpenMapLocal. Each product should have:
- name - name of the product
- areas - optional list of areas to select the areas of interest for products that specify this.
- formats - a list of map formats that should be downloaded
targets describes the shp files you want to load and where you want to put the resulting files. In the OpenMapLocal example there are two targets, buildings and natural_environment, showing that a product can be split across different directories by the loader.
"targets": [
{
"directory":"boundaries",
"datasets": [
{
"targets":["Teignbridge.json"],
"source":"Data/GB/district_borough_unitary_region.shp",
"where":"CODE IN ('E07000045')"
},
{
"targets":["TeignbridgeWards.json"],
"source":"Data/GB/district_borough_unitary_electoral_division_region.shp",
"where":"CODE IN (@codes)",
"where_codes": ["E05011905","E05011912","E05011910","E05011895","E05011899","E05011909","E05011908",
"E05011906","E05011901","E05011902","E05011914","E05011913","E05011915","E05011915",
"E05011896","E05011898","E05011897","E05011900","E05011892","E05011893","E05011904",
"E05011903","E05011907","E05011911" ]
}
]
}
Each top-level target has:
- directory - a directory within PROCESSED_DIR which contains the loaded maps.
- datasets - a list of datasets, each of which controls the action of one extract script.
{
"targets":["Teignbridge.json"],
"source":"Data/GB/district_borough_unitary_region.shp",
"where":"CODE IN ('E07000045')"
},
- targets - a list of JSON files output to directory
- source - the path within the unzipped zip file of the file from which the JSON is extracted
- sourcefiles - a list of source files to be loaded from a single zip; only used where there are many sources per zip
- where - a clause to filter features for inclusion in the extracted dataset; @codes in a where clause is replaced by a list
- where_codes - an array of codes to be formatted as a comma-separated string that replaces @codes in the where clause
- script - an optional shell script, within the target's directory within scripts, that performs the extract
In this example, Teignbridge.json is the district council boundary, and district_borough_unitary_region.shp is a shapefile containing all the boundaries for that tier of government within GB. Only one boundary is extracted because the where clause restricts the extract to that district's unique code.
The next example shows the use of a list of codes and @codes to make it a bit easier to enter the list into JSON:
{
"targets":["TeignbridgeWards.json"],
"source":"Data/GB/district_borough_unitary_electoral_division_region.shp",
"where":"CODE IN (@codes)",
"where_codes": ["E05011905","E05011912","E05011910","E05011895","E05011899","E05011909","E05011908",
"E05011906","E05011901","E05011902","E05011914","E05011913","E05011915","E05011915",
"E05011896","E05011898","E05011897","E05011900","E05011892","E05011893","E05011904",
"E05011903","E05011907","E05011911" ]
This shows the use of @codes to put a list of area codes in the where clause for selection.
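The substitution described above can be sketched in a few lines of JavaScript. This is only an illustration, not the project's actual code: the function name `expandWhere` is hypothetical, and the sketch assumes each code is wrapped in single quotes, matching the hand-written `CODE IN ('E07000045')` clause in the first example.

```javascript
// Hypothetical helper, not taken from the project: replaces @codes in a
// where clause with a quoted, comma-separated list built from where_codes.
function expandWhere(where, whereCodes) {
  // Nothing to substitute: return the clause unchanged.
  if (!where || !Array.isArray(whereCodes) || whereCodes.length === 0) {
    return where;
  }
  // Assumption: codes are SQL string literals, so each one is single-quoted
  // and any embedded single quote is doubled.
  const list = whereCodes
    .map((code) => `'${String(code).replace(/'/g, "''")}'`)
    .join(",");
  return where.split("@codes").join(list);
}

const dataset = {
  where: "CODE IN (@codes)",
  where_codes: ["E05011905", "E05011912"],
};

console.log(expandWhere(dataset.where, dataset.where_codes));
// prints: CODE IN ('E05011905','E05011912')
```

A where clause without where_codes passes through unchanged, so the same function could serve both styles of dataset shown above.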
Loading postcodes from Code Point Open is an example of converting from a different source format to a different target format: here, from CSV to a custom store for postcodes that provides a simple natural index based on file name. It uses a custom Node script, postcodes/js/loaders.js, to read the source CSV files and generate the index. This Node script is started by the shell script postcode.sh.
"CodePointOpen": {
"name": "Code Point Open",
"formats": [
"CSV"
],
"script":"postcodes/",
"targets": [
{
"directory":"postcodes",
"datasets":[
{
"targets":["EX1.json"],
"source":"Data/CSV/",
"sourcefiles":["ex.csv","pl.csv","tq.csv"],
"script":"postcode.sh"
}
]
}
]
},
Here the format is CSV and the unzipped source folder contains one CSV file per GB postcode area. We only want to load postcodes starting ex, pl and tq, so we list the required source files in the sourcefiles field and specify the path to these within the zip in source. We need to name at least one target (EX1.json) to ensure that the process detects that something is missing.
The OpenMapLocal example below illustrates some other features:
- The ability to specify an already loaded boundary as a clipping limit for the geometries in a file. In this case we limit buildings and woodland to the Teignbridge boundary that is loaded earlier in the config.
- The ability to pick which OS tiles are extracted from a product. This is important where data is supplied in 100 x 100 km or 10 x 10 km tiles. Here tiles use the OS sheet naming conventions. The list of available downloads is at the product's url given in the products list.
- The ability to specify multiple targets for a product.
"OpenMapLocal": {
"name": "OS Open Map Local",
"script": "openmaplocal/",
"areas": [
"SX"
],
"formats": [
"ESRI® Shapefile"
],
"clip":"/boundaries/Teignbridge.json",
"targets": [
{
"directory":"buildings",
"datasets":[
{
"targets":["building.json"],
"source":"OS OpenMap Local (ESRI Shape File) SX/data/SX_Building.shp"
}
]
},
{
"directory":"natural_environment",
"datasets":[
{
"targets":["woodland.json"],
"source":"OS OpenMap Local (ESRI Shape File) SX/data/SX_Woodland.shp"
}
]
}
]
}
}
}
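As a rough sketch of how a dataset entry like the ones above might be turned into an `ogr2ogr` call, the following JavaScript assembles an argument list. It is not the project's implementation: `buildOgrArgs` and the option names `unzipDir`, `targetDir` and `processedDir` are hypothetical, and it assumes that clip paths are relative to PROCESSED_DIR. The `-f`, `-where` and `-clipsrc` options are standard `ogr2ogr` options.

```javascript
const path = require("path");

// Hypothetical helper: builds the ogr2ogr arguments for one dataset.
// Option names (unzipDir, targetDir, processedDir, clip) are illustrative.
function buildOgrArgs(dataset, opts) {
  const target = path.join(opts.targetDir, dataset.targets[0]);
  const source = path.join(opts.unzipDir, dataset.source);
  const args = ["-f", "GeoJSON", target, source];
  if (dataset.where) {
    // Optional attribute filter.
    args.push("-where", dataset.where);
  }
  const clip = dataset.clip || opts.clip;
  if (clip) {
    // Assumption: the clip field is a path relative to PROCESSED_DIR.
    args.push("-clipsrc", path.join(opts.processedDir, clip));
  }
  return args;
}

const args = buildOgrArgs(
  {
    targets: ["woodland.json"],
    source: "OS OpenMap Local (ESRI Shape File) SX/data/SX_Woodland.shp",
  },
  {
    unzipDir: "/home/myuser/raw_maps/osopendata/OpenMapLocal", // illustrative path
    targetDir: "/var/www/mapdata/natural_environment",
    processedDir: "/var/www/mapdata",
    clip: "/boundaries/Teignbridge.json",
  }
);

console.log(args);
```

Keeping the arguments as an array means they can be passed to `child_process.execFile("ogr2ogr", args)` without shell quoting, which matters here because the OpenMapLocal source paths contain spaces.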
To check that everything works with a minimal set of outputs, swap config.json for example_config.json and then run update.sh from the command line:
./update.sh
If all is set up correctly, this should download the source data and extract some JSON files to PROCESSED_DIR.
The script to load postcodes looks like this:
### postcode.sh
#!/bin/bash
NODE=/home/jules/.nvm/versions/node/v24.0.1/bin/node
${NODE} ./js/loader.js --sourcefiles="${SOURCES}" --source="${SOURCE}" --targetdir="${TARGET_DIR}"
Although now superseded, typical scripts to extract JSON files look like this:
#!/bin/bash
TARGET="${TARGET_DIR}Teignbridge.json"
# Remove any previous output; -f avoids an error if the file does not exist
rm -f "${TARGET}"
ogr2ogr -f GeoJSON "${TARGET}" "${SOURCE}" \
-where "CODE IN ('E07000045')"
The script's environment is set according to the settings in config.json.
The following environment variables are provided:
Variable | Content |
---|---|
PATH | is extended to include the likely installed locations of ogr2ogr |
ZIPBASE | value passed from update.sh |
PROCESSED_DIR | value passed from update.sh |
SOURCE | path of the unzipped file or folder supports substitution of {VERSION} for the product version without embedded |
SOURCEFILES | list of files to be processed |
TARGET_DIR | directory to contain the output data |
TARGET | if there is only one output file for a source, this is its full path |
TARGETS | if there is more than one output file for a source |
CLIPSRC | set to the clip field if set on the product, target or dataset |
WHERE | expanded version of the where clause |
MAPDATA | same as PROCESSED_DIR |
The final command to run the script is ./update.sh.
As well as keeping a record of the current version of each map, the versions files could be used by a remote utility to synchronise maps. There is a versions file in each target directory. The following is an example from the boundaries folder:
{
"Teignbridge.json": {
"version": "2025-05",
"loaded": "2025-08-11T13:55:05.839Z"
},
"Devon.json": {
"version": "2025-05",
"loaded": "2025-08-11T13:55:06.451Z"
},
"DevonConstituencies.json": {
"version": "2025-05",
"loaded": "2025-08-11T13:55:08.113Z"
},
"DevonWards.json": {
"version": "2025-05",
"loaded": "2025-08-11T13:55:09.737Z"
},
"TeignbridgeParishes.json": {
"version": "2025-05",
"loaded": "2025-08-12T17:29:18.793Z"
},
"TeignbridgeWards.json": {
"version": "2025-05",
"loaded": "2025-08-11T18:53:17.309Z"
}
}
This can be achieved using syncmaps.php
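To illustrate how a versions file in the format above could drive an update check or a remote sync, here is a minimal JavaScript sketch. It is an assumption about how such a check might look, not the logic of update_opendata.js or syncmaps.php; the function name `staleTargets` and the version "2025-10" are hypothetical.

```javascript
// Hypothetical helper: returns the target files that are either missing from
// the versions object or recorded at a version other than the latest one.
function staleTargets(versions, targets, latestVersion) {
  return targets.filter((name) => {
    const entry = versions[name];
    return !entry || entry.version !== latestVersion;
  });
}

// A cut-down versions object in the same shape as the example above.
const versions = {
  "Teignbridge.json": { version: "2025-05", loaded: "2025-08-11T13:55:05.839Z" },
  "Devon.json": { version: "2025-05", loaded: "2025-08-11T13:55:06.451Z" },
};

// DevonWards.json has no entry, so it needs loading.
console.log(
  staleTargets(versions, ["Teignbridge.json", "Devon.json", "DevonWards.json"], "2025-05")
);
// prints: [ 'DevonWards.json' ]

// A newer product version makes every recorded file stale.
console.log(staleTargets(versions, ["Teignbridge.json"], "2025-10"));
// prints: [ 'Teignbridge.json' ]
```

The comparison is a plain string inequality rather than a date comparison, so any change of product version, not only a later one, would trigger a reload.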
This section will list any known bugs or limitations.
26/8/2025
- Removed UPRN load as now using ONS derived UPRNs.

25/8/2025
- Functions getjson, getdata, executeScript, executeogr and unzipFile moved to utils.js from update_opendata.js and exported, as they are now used by other scripts. This makes it clearer. Optional headers can also be passed to getjson and getdata so that User-agent and/or Authorization can be set.

20/8/2025
- E04013236 parish code added for Newton Abbot to reflect ONS 2023 change.

17/8/2025
- Changed to using the unzipper library instead of yauzl, as this is more tolerant of a leading / on directory names.

12/8/2025
- Added processing of the OpenUPRN product to a csv and uprn_lookup.json, which also looks up polygon codes such as Parish and Ward. See scripts/uprn/uprn.sh for more.

11/8/2025
- For most layers ogr2ogr is used to generate a single output file from a shp file with an optional where clause and clipping polygon. In these cases no .sh file is necessary.
- Script and config have been implemented to load postcodes from CodePointGB to a custom csv format.
- Shell script for examples like height grids (Panorama), raster data (), csv files. Also to perform bespoke follow on procedures such as setting additional attributes that might be needed by the application.
- GUI front end to set up the config file based on date in the OS Data Hub API.