Commit 2e56986

Merge branch 'master' into v5.0-dev
2 parents 7b3b5fd + 55dd0b4 commit 2e56986

4 files changed

+322
-2
lines changed

bin/download-osm

Lines changed: 2 additions & 2 deletions
@@ -141,8 +141,8 @@ class Catalog:
141141
src.hash = len_to_hash[src.file_len]
142142
sources_by_hash[src.hash].append(src)
143143
else:
144-
print(f"WARN: Source {src} has unrecognized file "
145-
f"length={src.file_len:,}")
144+
print(f"WARN: Source {src} has unrecognized file length=" +
145+
("unknown" if src.file_len is None else f"{src.file_len:,}"))
146146
else:
147147
print(f"Unable to use sources - unable to match 'latest' without date/md5:")
148148
for s in no_hash_sources:

docs/README.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
# OpenMapTiles Tools
2+
3+
* [Setting up PostgreSQL database](database.md)

docs/database.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
1+
## Creating PostgreSQL database
2+
There are several ways to create a PostgreSQL database for OpenStreetMap data. The simplest is to run it in a Docker container on your local machine. A more involved option is to use a cloud provider, such as Google Cloud.
3+
4+
All of the examples here use these environment variables (the values can be changed):
5+
6+
```bash
7+
export POSTGRES_DB=openmaptiles
8+
export POSTGRES_USER=openmaptiles
9+
export POSTGRES_PASSWORD=openmaptiles
10+
```
11+
12+
Once PostgreSQL is running, you can test the connection with:
13+
14+
```bash
15+
PGPASSWORD="$POSTGRES_PASSWORD" psql -h localhost --user "$POSTGRES_USER" "$POSTGRES_DB"
16+
```
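
A freshly started server may refuse connections for a few seconds, so a connection test can fail even when everything is configured correctly. The following is a minimal sketch of a retry helper, not part of this commit; the `retry` function name and its arguments are hypothetical.

```bash
# retry: run a command until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
# (hypothetical helper, not part of the OpenMapTiles tools)
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    # Stop as soon as the command succeeds
    "$@" && return 0
    # Give up once the last allowed attempt has failed
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}
```

For example, assuming the PostgreSQL client tools are installed, `retry 30 2 pg_isready -h localhost -p 5432` would poll the server for up to about a minute before the `psql` command above is run.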
17+
18+
### Docker Image
19+
20+
```bash
21+
export POSTGRES_DB=openmaptiles # Database name, user, and password will be
22+
export POSTGRES_USER=openmaptiles # created automatically based on these values.
23+
export POSTGRES_PASSWORD=openmaptiles
24+
export DATA_DIR=$PWD/pgdata # store data in the current dir/pgdata
25+
26+
mkdir -p $DATA_DIR # Ensure the data dir exists
27+
docker run \
28+
--rm `# Delete this container on exit` \
29+
-it `# Run container with a terminal` \
30+
-p 5432:5432 `# Allow external access to the port 5432` \
31+
-e POSTGRES_DB `# On first run, create this database` \
32+
-e POSTGRES_USER `# On first run, create this user/password` \
33+
-e POSTGRES_PASSWORD \
34+
-v $DATA_DIR:/var/lib/postgresql/data `# Use current directory/pgdata to store PostgreSQL data` \
35+
openmaptiles/postgis:latest `# Use the latest OpenMapTiles postgis image` \
36+
postgres
37+
```
38+
39+
For PostgreSQL 11 and later, use `postgres -c 'jit=off'` as the last line so that PostgreSQL runs with JIT disabled. A JIT bug makes MVT queries run very slowly.
40+
41+
42+
### Google Cloud (GCP)
43+
44+
To run PostgreSQL in Google Cloud, you need to install the `gcloud` utility and log in to your Google account. You will also need to set up a firewall rule to enable inbound access.
45+
46+
Additional notes:
47+
* See [GCP machine types](https://cloud.google.com/compute/docs/machine-types) for the `MACHINE_TYPE` setting. The `n1-highmem-2` uses an older 2-CPU machine with max memory. Full planet should use a bigger machine.
48+
* The `VM_DISK_SIZE` should fit OS + Apps + Data. Use ~700GB for the whole planet.
49+
* See [gcp startup script](gcp_startup.sh)
50+
51+
#### Set required env variables
52+
53+
```bash
54+
export GOOGLE_PROJECT_ID=<my_project> # Set to your GCP project name
55+
export GOOGLE_ZONE_NAME=us-central1-c # Which zone to use for the new VM
56+
export PG_VM_NAME=pg1 # Name of the VM to create
57+
58+
export VM_DISK_SIZE=15GB # VM disk size
59+
export MACHINE_TYPE=n1-standard-1 # Type of GCP VM to create
60+
export PG_VERSION=12 # PostgreSQL version
61+
62+
export POSTGRES_DB=openmaptiles # Database name, user, and password will be
63+
export POSTGRES_USER=openmaptiles # created automatically based on these values.
64+
export POSTGRES_PASSWORD=openmaptiles
65+
```
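
The `gcloud` commands in this document expand these variables unquoted, so an unset variable silently produces a malformed command. As a rough sketch (the `require_env` helper is hypothetical and not part of this commit), the variables could be checked up front:

```bash
# require_env: report every named environment variable that is unset or empty.
# Returns 0 if all are set, 1 otherwise. (hypothetical helper)
require_env() {
  missing=0
  for name in "$@"; do
    # printenv prints nothing for a variable that is not exported
    if [ -z "$(printenv "$name")" ]; then
      echo "Missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible call before creating the VM would be `require_env GOOGLE_PROJECT_ID GOOGLE_ZONE_NAME PG_VM_NAME VM_DISK_SIZE MACHINE_TYPE PG_VERSION POSTGRES_DB POSTGRES_USER POSTGRES_PASSWORD`. Note that it only sees exported variables, which matches the `export` statements above.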
66+
67+
#### Firewall rule
68+
Create a firewall rule for VM instances tagged with "pg" to allow inbound PostgreSQL connections, but only from the local project network (not public).
69+
70+
```bash
71+
gcloud compute firewall-rules create allow-postgres \
72+
--description "Allow private PostgreSQL traffic on TCP port 5432" \
73+
--project $GOOGLE_PROJECT_ID \
74+
--allow tcp:5432 \
75+
--direction INGRESS \
76+
--source-ranges 10.0.0.0/8 \
77+
--target-tags pg
78+
```
79+
80+
#### Create Virtual Machine
81+
Create a new virtual machine and run the startup script on it.
82+
```bash
83+
gcloud compute instances \
84+
create $PG_VM_NAME `# Create new VM with this name` \
85+
--project $GOOGLE_PROJECT_ID `# in this GCP project` \
86+
--zone $GOOGLE_ZONE_NAME `# and in this VM zone` \
87+
--image-family debian-10 `# use latest Debian-10 base image` \
88+
--image-project debian-cloud `# ` \
89+
--boot-disk-size $VM_DISK_SIZE `# Enough to fit OS+apps+data` \
90+
--boot-disk-type pd-ssd `# Use faster SSD disks (more expensive)` \
91+
--machine-type=$MACHINE_TYPE `# Specify machine hardware` \
92+
--tags=pg `# Use firewall rule from above` \
93+
`# Set boot script and required metadata` \
94+
--metadata-from-file startup-script=gcp_startup.sh \
95+
--metadata pg_version=$PG_VERSION,pg_database=$POSTGRES_DB,pg_user=$POSTGRES_USER,pg_password=$POSTGRES_PASSWORD
96+
```
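
The `--metadata` flag takes comma-separated `key=value` pairs, and the startup script inserts the database name and user into SQL unquoted and the password inside single quotes. A comma or single quote in any of these values would therefore likely break either the `gcloud` call or the SQL. The checks below are a sketch under that assumption; both function names are hypothetical and the rules are deliberately stricter than PostgreSQL itself requires.

```bash
# is_simple_identifier: succeeds if the value starts with a lowercase letter
# or underscore and contains only lowercase letters, digits, and underscores.
# The body runs in a subshell so the LC_ALL change does not leak to the caller.
is_simple_identifier() (
  LC_ALL=C
  case "$1" in
    ''|[!a-z_]*|*[!a-z0-9_]*) exit 1 ;;
  esac
  exit 0
)

# is_metadata_safe: succeeds if the value is non-empty and contains
# neither a comma nor a single quote.
is_metadata_safe() {
  case "$1" in
    ''|*,*|*"'"*) return 1 ;;
    *) return 0 ;;
  esac
}
```

For instance, `is_simple_identifier "$POSTGRES_DB" && is_simple_identifier "$POSTGRES_USER" && is_metadata_safe "$POSTGRES_PASSWORD"` could guard the `gcloud compute instances create` call.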
97+
98+
#### Login and Verify
99+
Log in to the newly created VM:
100+
```bash
101+
gcloud compute ssh --project $GOOGLE_PROJECT_ID $PG_VM_NAME --zone=$GOOGLE_ZONE_NAME
102+
```
103+
104+
Observe how the Postgres DB is being initialized by watching the output from the startup script. This command will show the last 1000 lines, and will wait for any new log lines. Use Ctrl+C to stop viewing.
105+
106+
```bash
107+
sudo tail -f -n 1000 /var/log/syslog | grep 'GCEMetadataScripts:'
108+
```
109+
110+
Connect to the newly initialized database using the postgres superuser account (from the VM):
111+
112+
```bash
113+
sudo -u postgres psql openmaptiles
114+
```
115+
116+
You can also connect to the PostgreSQL server remotely from your local machine by using ssh port-forwarding. Run this command instead of (or in addition to) the regular ssh.
117+
118+
```bash
119+
# From your local machine:
120+
gcloud compute ssh --project $GOOGLE_PROJECT_ID $PG_VM_NAME --zone=$GOOGLE_ZONE_NAME -- -L 5432:localhost:5432
121+
122+
# From another terminal window on your local machine.
123+
# Make sure these env vars are set.
124+
PGPASSWORD="$POSTGRES_PASSWORD" psql -h localhost --user "$POSTGRES_USER" "$POSTGRES_DB"
125+
```

docs/gcp_startup.sh

Lines changed: 192 additions & 0 deletions
@@ -0,0 +1,192 @@
1+
#!/usr/bin/env bash
2+
set -euo pipefail
3+
4+
if (( $(id -u) != 0 )); then
5+
echo "***************************************************"
6+
echo "*** FATAL: This script should be run as ROOT ***"
7+
echo "***************************************************"
8+
exit 1
9+
fi
10+
11+
UTF8PROC_TAG=v2.5.0
12+
MAPNIK_GERMAN_L10N_TAG=v2.5.8
13+
PGSQL_GZIP_TAG=v1.0.0
14+
15+
CURL="curl --silent --show-error --location"
16+
17+
# Get configuration metadata (set with --metadata param during VM creation with gcloud)
18+
GET_METADATA="$CURL -H Metadata-Flavor:Google"
19+
PG_VERSION=$($GET_METADATA http://metadata.google.internal/computeMetadata/v1/instance/attributes/pg_version)
20+
OMT_PGDATABASE=$($GET_METADATA http://metadata.google.internal/computeMetadata/v1/instance/attributes/pg_database)
21+
OMT_PGUSER=$($GET_METADATA http://metadata.google.internal/computeMetadata/v1/instance/attributes/pg_user)
22+
OMT_PGPASSWORD=$($GET_METADATA http://metadata.google.internal/computeMetadata/v1/instance/attributes/pg_password)
23+
24+
# PostgreSQL dirs/files updated by this script
25+
# The non-existence of the config file is also used as an indicator
26+
# that this is the first time this script has run.
27+
PG_DIR="/etc/postgresql/${PG_VERSION}/main"
28+
PG_CONFIG_FILE="${PG_DIR}/conf.d/99-custom.conf"
29+
PG_HBA_FILE="${PG_DIR}/pg_hba.conf"
30+
31+
32+
33+
if [[ ! -f "${PG_CONFIG_FILE}" ]]; then
34+
echo "************ First time initialization **************"
35+
36+
# Add PostgreSQL packages
37+
$CURL https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
38+
sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
39+
40+
# Install the PostgreSQL server and postgis extension
41+
DEBIAN_FRONTEND=noninteractive apt-get update -qq
42+
DEBIAN_FRONTEND=noninteractive apt-get install -y "postgresql-${PG_VERSION}" postgis
43+
44+
# Install dependencies required to build extensions
45+
DEBIAN_FRONTEND=noninteractive apt-get install -y "postgresql-server-dev-${PG_VERSION}" build-essential git \
46+
xsltproc pandoc libkakasi2-dev libgdal-dev libprotobuf-dev libprotobuf-c-dev protobuf-c-compiler libxml2-dev \
47+
zlib1g-dev bison flex
48+
49+
50+
# Build and install Postgres extensions
51+
cd /opt
52+
53+
echo "Installing utf8proc"
54+
git clone --branch "$UTF8PROC_TAG" --depth 1 https://github.com/JuliaStrings/utf8proc.git
55+
cd utf8proc
56+
make
57+
make install
58+
ldconfig
59+
cd /opt
60+
rm -rf utf8proc
61+
62+
echo "Installing mapnik-german-l10n"
63+
git clone --branch "$MAPNIK_GERMAN_L10N_TAG" --depth 1 https://github.com/giggls/mapnik-german-l10n.git
64+
cd mapnik-german-l10n
65+
git checkout -q
66+
make
67+
make install
68+
cd /opt
69+
rm -rf mapnik-german-l10n
70+
71+
echo "Installing pgsql-gzip"
72+
git clone --branch "$PGSQL_GZIP_TAG" --depth 1 https://github.com/pramsey/pgsql-gzip.git
73+
cd pgsql-gzip
74+
make
75+
make install
76+
cd /opt
77+
rm -rf pgsql-gzip
78+
79+
# remove build deps we no longer need
80+
DEBIAN_FRONTEND=noninteractive apt-get remove --purge -y "postgresql-server-dev-${PG_VERSION}" build-essential git \
81+
xsltproc pandoc libkakasi2-dev libgdal-dev libprotobuf-dev libprotobuf-c-dev protobuf-c-compiler libxml2-dev \
82+
zlib1g-dev bison flex
83+
84+
# Create database
85+
systemctl restart postgresql
86+
sleep 3
87+
88+
sudo -u postgres \
89+
psql -v ON_ERROR_STOP="1" \
90+
-c "create user $OMT_PGUSER with password '$OMT_PGPASSWORD'" \
91+
-c "create database $OMT_PGDATABASE" \
92+
-c "grant all privileges on database $OMT_PGDATABASE to $OMT_PGUSER" \
93+
-c "\c $OMT_PGDATABASE" \
94+
-c "CREATE EXTENSION hstore" \
95+
-c "CREATE EXTENSION postgis" \
96+
-c "CREATE EXTENSION unaccent" \
97+
-c "CREATE EXTENSION fuzzystrmatch" \
98+
-c "CREATE EXTENSION osml10n" \
99+
-c "CREATE EXTENSION gzip" \
100+
-c "CREATE EXTENSION pg_stat_statements"
101+
102+
103+
# Set the client access rules (pg_hba.conf) to allow inbound connections from 10.0.0.0/8
104+
cat << EOF | tee "$PG_HBA_FILE"
105+
# DO NOT DISABLE!
106+
# If you change this first entry you will need to make sure that the
107+
# database superuser can access the database using some other method.
108+
# Noninteractive access to all databases is required during automatic
109+
# maintenance (custom daily cronjobs, replication, and similar tasks).
110+
#
111+
# Database administrative login by Unix domain socket
112+
local all postgres peer
113+
114+
# TYPE DATABASE USER ADDRESS METHOD
115+
116+
# "local" is for Unix domain socket connections only
117+
local all all peer
118+
# IPv4 local connections:
119+
host all all 127.0.0.1/32 md5
120+
# IPv6 local connections:
121+
host all all ::1/128 md5
122+
# Allow replication connections from localhost, by a user with the
123+
# replication privilege.
124+
local replication all peer
125+
host replication all 127.0.0.1/32 md5
126+
host replication all ::1/128 md5
127+
128+
# Allow external connections.
129+
# Note: add here all the networks you want or need.
130+
# Open to the 10.0.0.0/8 private network by default, with password
131+
host all all 10.0.0.0/8 md5
132+
133+
EOF
134+
135+
fi # end of the code that only runs on the first startup
136+
137+
138+
#
139+
# This code should execute on every server restart.
140+
# Recompute available memory and CPU count in case the server
141+
# hardware changed, and adjust Postgres configuration.
142+
# The settings assume this machine is dedicated to Postgres.
143+
#
144+
145+
146+
# 30% of the RAM - it should be enough for most cases
147+
SHARED_BUFFERS=$(awk '/MemTotal/ { printf "%d", $2/1024 * 0.3 }' /proc/meminfo)
148+
149+
# 30% of RAM is assumed to be disk cache (probably more, but better to be conservative)
150+
CACHE_SIZE=$(awk '/MemTotal/ { printf "%d", $2/1024 * 0.3 }' /proc/meminfo)
151+
152+
# Get the current number of CPUs
153+
CPU_COUNT=$(grep -c ^processor /proc/cpuinfo)
154+
155+
# Create a config file that is read last and overrides all the settings defined before it
156+
# define your own settings and add to the list below
157+
cat << EOF | tee "${PG_CONFIG_FILE}"
158+
#
159+
# THESE VALUES WILL GET AUTO-GENERATED ON EVERY MACHINE RESTART
160+
#
161+
shared_buffers = ${SHARED_BUFFERS}MB
162+
effective_cache_size = ${CACHE_SIZE}MB
163+
164+
# PostgreSQL 11/12 JIT has a bug making large queries execute 100x slower than without JIT
165+
jit = off
166+
167+
# SSD disk has high concurrency
168+
effective_io_concurrency = 300
169+
170+
# if you see "error: too many dynamic shared memory segments", raise this value
171+
max_connections = $(( 10 + CPU_COUNT * 5 ))
172+
173+
work_mem = 128MB
174+
maintenance_work_mem = 256MB
175+
176+
min_wal_size = 256MB
177+
max_wal_size = 50GB
178+
wal_keep_segments = 64
179+
wal_sender_timeout = 300s
180+
max_wal_senders = 20
181+
182+
checkpoint_completion_target = 0.8
183+
random_page_cost = 1.0
184+
185+
# listen on all interfaces
186+
listen_addresses = '*'
187+
188+
EOF
189+
190+
# Set the owner and restart PostgreSQL to pick up the new configuration
191+
chown -R postgres:postgres "$PG_DIR"
192+
systemctl restart postgresql
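
The sizing rules in the script above (30% of `MemTotal` for both `shared_buffers` and `effective_cache_size`, and `10 + CPU_COUNT * 5` for `max_connections`) can be tried on their own without touching a server. The sketch below restates them as two functions; the function names and the optional meminfo-file argument are assumptions added for illustration.

```bash
# pct_of_ram_mb: print the given fraction of MemTotal in whole megabytes.
# $1 = fraction (for example 0.3)
# $2 = meminfo-style file to read (defaults to /proc/meminfo)
pct_of_ram_mb() {
  awk -v frac="$1" '/^MemTotal/ { printf "%d", $2 / 1024 * frac }' "${2:-/proc/meminfo}"
}

# max_connections_for: 10 base connections plus 5 per CPU, as in the script.
# $1 = number of CPUs
max_connections_for() {
  echo $(( 10 + $1 * 5 ))
}
```

With these rules, a machine reporting `MemTotal: 8388608 kB` (8 GiB) gets 2457MB for each of the two memory settings, and a 4-CPU machine gets `max_connections = 30`.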
