Skip to content

Conversation

@nishanthcr7777
Copy link
Contributor

Description

What is this PR

  • Bug fix

Why is this PR needed?

If an atlas installation crashes during extraction, the local atlas directory can be left in a partially extracted state with a missing or corrupted metadata.json.

Currently this causes BrainGlobeAtlas initialization to crash, requiring users to manually delete the atlas directory and reinstall ,which is not user-friendly.

What does this PR do?

  • Adds recovery logic during BrainGlobeAtlas initialization:
    • Detects missing or corrupted metadata.json
    • Removes the partially extracted atlas directory
    • Re-downloads the atlas once and retries initialization
  • Uses a bounded retry loop (no recursion) to avoid infinite retries
  • Leaves behavior unchanged for valid atlas installations

References

Closes #697

How has this PR been tested?

  • Added a unit test that simulates a missing metadata.json by mocking Atlas.__init__
  • Verified the atlas is re-downloaded and initialization succeeds
  • Ran pytest locally and confirmed existing tests pass

Is this a breaking change?

No.
This change only affects recovery from corrupted atlas installs.

Does this PR require an update to the documentation?

No.

Checklist:

  • The code has been tested locally
  • Tests have been added to cover the new recovery behavior
  • The code has been formatted with pre-commit

@nishanthcr7777 nishanthcr7777 marked this pull request as ready for review December 16, 2025 14:41
@adamltyson adamltyson requested a review from a team December 16, 2025 14:55
Copy link
Member

@alessandrofelder alessandrofelder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @nishanthcr7777 - could you help me understand how this PR relates to your other PR #700, please?

It looks to me like both are meant to address #697 and #698 (but #698 is a special case of #697). Do they solve the same problem or just a very similar problem?

Do we need both PRs? And in which order?

@PolarBean
Copy link
Member

I'm not sure that #698 is a special case of #697. in #698 the tar does not finnish downloading for some reason but the api tries to unzip it regardless. I think #700 is trying to fix this but im not sure that it would work, ideally it should catch that its interrupted and raise an error stating that the download was interrupted rather than attempting to decompress.

#697 may be related but it is just that the json is missing for some reason (maybe a corrupted download) and needs to be reinstalled. #721 addresses this.

@nishanthcr7777
Copy link
Contributor Author

nishanthcr7777 commented Dec 17, 2025

The two PRs address different failure stages and are complementary.

#700 handles failures during download/extraction (eg. interrupted or corrupted tarballs causing tarfile.ReadError / EOFError)

#721 handles failures after extraction, where the atlas directory exists but metadata.json is missing or corrupted (maybe due to a crash mid-install)

So, #700 prevents broken archives from being extracted whereas #721 recovers when extraction partially succeeded but initialization fails later.

I believe both are useful and my preference would be to merge them one by one in order,first #700, then #721, but i am ready to rebase this PR after merging #700 or combine them if you prefer a different approach.

Please let me know how to proceed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] crash when metadata.json is missing

3 participants