Commit 5c4dcc1

Use tags = NULL in middle tables if object doesn't have any tags
This doesn't make much of a difference for the ways and rels tables, but if we store all nodes in the database, it makes a huge difference, because most nodes don't have any tags. For a current planet, disk usage for the nodes table goes from 476 GB down to 409 GB, saving 67 GB or about 14%.

Additionally, it makes use of that table simpler. If you want to run any queries on tags, you need an index on the tags column of the nodes/ways/rels tables like this:

    CREATE INDEX ON planet_osm_ways USING gin (tags);

But that is wasteful because of the empty tags. We probably want to create the index as:

    CREATE INDEX ON planet_osm_ways USING gin (tags) WHERE tags != '{}'::jsonb;

But now all queries on those tables have to include that extra condition so that the query planner will use the index:

    SELECT * FROM planet_osm_ways WHERE tags ? 'highway' AND tags != '{}'::jsonb;

If we use NULLs, the index can be created as:

    CREATE INDEX ON planet_osm_ways USING gin (tags) WHERE tags IS NOT NULL;

And now the query becomes simpler, because the NOT NULL is automatically taken into account by the query planner:

    SELECT * FROM planet_osm_ways WHERE tags ? 'highway';

Note that this is an incompatible change to the new-format middle tables, but they are still marked as experimental, so we can do this.
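To apply the NULL-based index from the message to more than one middle table, the statement can be built from the table name. The following is only a sketch: `tags_index_sql` is a hypothetical helper, not part of osm2pgsql, and it produces the SQL text without touching a database.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an osm2pgsql function): build the partial GIN
// index statement from the commit message for one middle table.
std::string tags_index_sql(std::string const &table)
{
    assert(!table.empty());
    // Rows with tags = NULL are left out of the index entirely.
    return "CREATE INDEX ON " + table +
           " USING gin (tags) WHERE tags IS NOT NULL;";
}
```

Calling it with `"planet_osm_ways"` gives the same statement as in the message; the table names for nodes and relations depend on the configured prefix, so they are left to the caller.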
1 parent 6ceb4f9 commit 5c4dcc1

1 file changed: +11 -3 lines changed

src/middle-pgsql.cpp

Lines changed: 11 additions & 3 deletions
@@ -323,6 +323,10 @@ template <typename T>
 void pgsql_parse_json_tags(char const *string, osmium::memory::Buffer *buffer,
                            T *obuilder)
 {
+    if (*string == '\0') { // NULL
+        return;
+    }
+
     auto const tags = nlohmann::json::parse(string);
     if (!tags.is_object()) {
         throw std::runtime_error{"Database format for tags invalid."};
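The empty-string check in this hunk works as a NULL test if the value comes from libpq's `PQgetvalue`, which returns an empty string for a NULL field. That is an assumption here, since the call site is not part of the diff. A valid JSON document always has at least one character, so an empty string cannot be mistaken for stored tags. A minimal sketch of that convention, with the hypothetical name `tags_json_or_null`:

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical helper illustrating the reader-side convention: an empty
// text value stands for SQL NULL (the object has no tags); anything else
// is handed on as JSON text for a real parser to handle.
std::optional<std::string_view> tags_json_or_null(char const *value)
{
    assert(value != nullptr);
    if (*value == '\0') {
        return std::nullopt; // NULL column: nothing to parse
    }
    return std::string_view{value};
}
```

The early return in the commit plays the role of the `std::nullopt` branch: the builder is left untouched, so the object ends up with no tags.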
@@ -613,6 +617,10 @@ void middle_pgsql_t::copy_attributes(osmium::OSMObject const &obj)
 void middle_pgsql_t::copy_tags(osmium::OSMObject const &obj)
 {
     if (m_store_options.db_format == 2) {
+        if (obj.tags().empty()) {
+            m_db_copy.add_null_column();
+            return;
+        }
         json_writer_t writer;
         tags_to_json(obj.tags(), &writer);
         m_db_copy.add_column(writer.json());
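On the writer side, the new branch emits a NULL column instead of an empty JSON object. PostgreSQL's COPY text format writes NULL as `\N` by default, with columns separated by tabs. The sketch below imitates that with a hypothetical `copy_row_sketch` class; it is not osm2pgsql's `db_copy` implementation, and it assumes column values are already escaped for COPY.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a COPY row buffer (not osm2pgsql's class).
class copy_row_sketch
{
public:
    // Append a value that is assumed to be escaped for COPY text format.
    void add_column(std::string const &value) { append(value); }

    // Append SQL NULL, spelled \N in COPY text format by default.
    void add_null_column() { append("\\N"); }

    // Return the finished row, terminated by a newline.
    std::string finish() const
    {
        assert(m_has_columns);
        return m_row + '\n';
    }

private:
    void append(std::string const &field)
    {
        if (m_has_columns) {
            m_row += '\t'; // tab is the default column delimiter
        }
        m_row += field;
        m_has_columns = true;
    }

    std::string m_row;
    bool m_has_columns = false;
};

// Same decision as the new branch in copy_tags(): no tags means NULL,
// otherwise the serialized JSON goes into the column.
void add_tags_column(copy_row_sketch &row,
                     std::optional<std::string> const &tags_json)
{
    if (!tags_json) {
        row.add_null_column();
        return;
    }
    row.add_column(*tags_json);
}
```

A row with id `123` and no tags would come out as `123`, a tab, and `\N`; the same row with tags carries the JSON text in place of `\N`.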
@@ -1464,7 +1472,7 @@ static table_sql sql_for_nodes_format2(middle_pgsql_options const &options)
         " lat int4 NOT NULL,"
         " lon int4 NOT NULL,"
         "{attribute_columns_definition}"
-        " tags jsonb NOT NULL"
+        " tags jsonb"
         ") {data_tablespace}";
 
     sql.prepare_queries = {
@@ -1530,7 +1538,7 @@ static table_sql sql_for_ways_format2(middle_pgsql_options const &options)
         " id int8 PRIMARY KEY {using_tablespace},"
         "{attribute_columns_definition}"
         " nodes int8[] NOT NULL,"
-        " tags jsonb NOT NULL"
+        " tags jsonb"
         ") {data_tablespace}";
 
     sql.prepare_queries = {"PREPARE get_way(int8) AS"
@@ -1601,7 +1609,7 @@ static table_sql sql_for_relations_format2()
         " id int8 PRIMARY KEY {using_tablespace},"
         "{attribute_columns_definition}"
         " members jsonb NOT NULL,"
-        " tags jsonb NOT NULL"
+        " tags jsonb"
         ") {data_tablespace}";
 
     sql.prepare_queries = {"PREPARE get_rel(int8) AS"
