Skip to content

Conversation

@lonvia
Copy link
Collaborator

@lonvia lonvia commented Jun 9, 2025

This adds callbacks delete_node \ delete_way \ delete_relation. They are called for every object that is explicitly marked as deleted in a change file. The callbacks are not called for objects that are implicitly deleted to make room for a new version of the object.

The callbacks receive an OSM object as a single parameter -- just like the other callbacks --, only the tags, nodes and member attributes as well as all geometry creation functions are disabled. This kind of information is usually not available for deleted objects. You can still access meta information like version, user etc. and use the object to insert something into a table.

This change required some cleanup to the code that handles untagged nodes. The pgsql output has some special handling in that it ignores untagged objects unless attribute handling is requested. We've long since removed the special handling for flex but the code for filtering was still in osmdata. This PR moves the handling into the pgsql output, which is a bit cleaner. This changes some of the outcome in the tests. See first commit for details.

This special handling has been already disabled for flex output
when we introduced the untagged callbacks.
@lonvia lonvia requested a review from joto June 9, 2025 21:07
m_enable_way_area = read_style_file(options.style, &exlist);

m_ignore_untagged_objects = !options.extra_attributes;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be initialized with the other members above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

{
if (m_delete_relation) {
m_relation_cache.init(rel);
select_relation_members();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted relations have no members, so there is no way of selecting any of them for two-stage processing. So this line can be removed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

void output_flex_t::node_delete(osmium::Node const &node)
{
if (m_delete_node) {
m_context_node = &node;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are reusing an existing context. This is fine, but the comments in src/output-flex.hpp for the calling_context enum referring to the process_* callbacks should be changed, too. (We probably already should have changed them when we added the untagged callbacks.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

}
lua_rawset(lua_state, -3);

// Set the metatable of this object
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is weird that the process callbacks get a "real Lua object", and the delete callbacks only get a "Lua table" as parameter. This might have all sorts of consequences later, no sure. It definitely prevents us from ever calling methods on deleted OSM objects.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just an implementation detail. If we need functions later, we can change it back to being a real Lua object (possibly of its own type). The user shouldn't rely on the type and assume duck typing is in place.

Given the input file 'liechtenstein-2013-08-03.osm.pbf'
And the lua style
"""
change = osm2pgsql.define_table{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use local

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

{ column = 'extra', sql_type = 'int' }
}}
function process(object)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confusing that this function is called process like the process callbacks. Also should be local.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

name = 'change',
ids = {type = 'any', id_column = 'osm_id', type_column = 'osm_type'},
columns = {
{ column = 'extra', sql_type = 'int' }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type = 'int' should work fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


Then SELECT osm_type, count(*), sum(extra) FROM change GROUP BY osm_type
| osm_type | count | sum |
| N | 16773 | 16779|
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mode superpicky on Spaces at the end missing in this and following lines.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now prettier.

name = 'change',
ids = {type = 'any', id_column = 'osm_id', type_column = 'osm_type'},
columns = {
{ column = 'extra', sql_type = 'int' }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@lonvia lonvia force-pushed the delete-callback branch from b9d3b28 to 47ee919 Compare June 25, 2025 19:24
@joto joto merged commit 3c7d0e2 into osm2pgsql-dev:master Jun 25, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants