@MarcelloPerathoner
Contributor

Issue

Fixes #7320

On macOS and Windows, osrm-datastore now correctly waits for other processes to detach from shared memory before exiting. Previous behaviour: osrm-datastore slept for 50 ms and then boldly announced: 'All clients switched.'

This caused

  • tests to fail randomly because queries were executed against old data, and
  • tests to run slower than on Linux.

Tasklist

Requirements / Relations

Related to PR #7309

Member

@TheMarex TheMarex left a comment

Looks good, some smaller questions. Thanks for breaking it out!

throw util::exception("shmctl encountered an error: " + errorToMessage(error_code) +
SOURCE_REF);
::shmid_ds xsi_ds;
if (::shmctl(xsi.get_shmid(), IPC_STAT, &xsi_ds) < 0)
Member

If I read this correctly, in the case of an error we swallow that and just return? Previously it would have thrown an exception.

Contributor Author

But we are expecting an error here. The error being that the region is gone.

Member

But can we check for a precise error code in that case? There may be other reasons this call errors.

Contributor Author

If we get there, the only plausible error that can still occur is: "EIDRM shmid points to a removed identifier". And that is exactly what we are waiting for. I can test for this error and throw if any other error happens.

Member

IMHO it makes it a bit more clear that we are waiting for something specific to happen.
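
For illustration only, here is a minimal sketch of the behaviour discussed in this thread: poll `shmctl(..., IPC_STAT, ...)`, treat the "region is gone" error as the expected end condition, and throw on anything else. This is not the code from the PR. The names `RegionState`, `classifyShmctlResult`, `waitForRemoval` and `waitForXsiRegionRemoval` are hypothetical, the 50 ms poll interval is an assumption, and accepting `EINVAL` alongside `EIDRM` is also an assumption (some systems may report an already destroyed identifier as invalid rather than removed).

```cpp
#include <cerrno>
#include <chrono>
#include <functional>
#include <system_error>
#include <thread>

#include <sys/ipc.h>
#include <sys/shm.h>

// Hypothetical helper types and names; none of this is OSRM code.
enum class RegionState
{
    Present,
    Gone
};

// Interpret the result of a shmctl(IPC_STAT) call.
// Only the "identifier removed" case counts as success; any other
// error is reported instead of being silently swallowed.
RegionState classifyShmctlResult(int return_code, int error_code)
{
    if (return_code >= 0)
        return RegionState::Present;
    // Assumption: EINVAL is accepted too, since a fully destroyed
    // identifier may be reported as invalid rather than removed.
    if (error_code == EIDRM || error_code == EINVAL)
        return RegionState::Gone;
    throw std::system_error(
        error_code, std::generic_category(), "shmctl encountered an error");
}

// Poll stat_region until it reports that the region no longer exists.
// stat_region must return a negative value and set errno on failure.
void waitForRemoval(const std::function<int()> &stat_region,
                    std::chrono::milliseconds poll_interval)
{
    for (;;)
    {
        errno = 0;
        const int return_code = stat_region();
        const int error_code = errno; // capture before any other call
        if (classifyShmctlResult(return_code, error_code) == RegionState::Gone)
            return;
        std::this_thread::sleep_for(poll_interval);
    }
}

// Convenience wrapper for a real XSI segment id.
void waitForXsiRegionRemoval(int shmid)
{
    waitForRemoval(
        [shmid] {
            ::shmid_ds ds;
            return ::shmctl(shmid, IPC_STAT, &ds);
        },
        std::chrono::milliseconds(50)); // assumed poll interval
}
```

Passing the status call in as a callable keeps the error classification separate from the system call, so the "expected error versus unexpected error" logic can be exercised without a real shared memory segment.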

#else
// Windows - specific code

// POSIX shared mem
Member

I don't remember exactly why we used XSI over POSIX. Maybe issues with OSX at some point? To be honest we were originally planning to drop the whole hot-swapping thing in favor of mmap. The hotswap/shmem was conceived almost 15 years ago when the typical deployment process was "run it on a big server", not rolling deployments behind a load balancer.

It would have been nice to clean this up for 6.0, but alas.

Development

Successfully merging this pull request may close these issues.

osrm-datastore on macOS and win do not wait for clients to switch

2 participants