@nsrawat0333

Problem

Issue #607 reports that users cannot access the datasets for the Learning to Simulate project. The original bash script works on Linux and macOS, but Windows users often run into bash compatibility problems, and the script gives no progress feedback during large downloads.

Solution

Added two new download tools to complement the existing bash script:

📦 Python Script (download_dataset.py)

  • Cross-platform compatibility - works on Windows, Mac, Linux
  • Progress bars with file sizes and download speed
  • Resume capability - can continue interrupted downloads
  • Better error handling with clear error messages
  • Dataset validation - warns about unknown dataset names
  • Clean CLI with help text and examples
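
The resume capability listed above can be sketched as follows. This is a minimal illustration, not the code in download_dataset.py: it swaps the requests and tqdm libraries for the standard library (urllib) and a plain byte counter so it has no dependencies, and the helper names resume_request and download are hypothetical. It assumes the server honours HTTP Range requests.

```python
import os
import urllib.error
import urllib.request


def resume_request(url, dest_path):
    """Return (request, offset), asking only for bytes not yet on disk."""
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        # Ask the server to start at the first byte we do not have yet.
        request.add_header("Range", f"bytes={offset}-")
    return request, offset


def download(url, dest_path, chunk_size=1 << 20):
    """Download url to dest_path, appending to a partial file when possible."""
    request, offset = resume_request(url, dest_path)
    try:
        response = urllib.request.urlopen(request)
    except urllib.error.HTTPError as err:
        if err.code == 416 and offset > 0:
            # Requested range is empty: the local file is probably complete.
            return offset
        raise
    with response:
        # 206 means the Range header was honoured; otherwise start over.
        resuming = offset > 0 and response.status == 206
        done = offset if resuming else 0
        with open(dest_path, "ab" if resuming else "wb") as out:
            while chunk := response.read(chunk_size):
                out.write(chunk)
                done += len(chunk)
                print(f"\r{done / 1e6:.1f} MB downloaded", end="", flush=True)
    print()
    return done
```

Calling download(url, path) a second time after an interruption would continue from the size of the partial file. If the server ignores the Range header and answers 200, the sketch falls back to rewriting the file from the start.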

🪟 PowerShell Script (download_dataset.ps1)

  • Native Windows support for PowerShell users
  • Progress indicators during download
  • Proper error handling with colored output
  • PowerShell-native parameter handling

Changes Made

  • ✅ Created download_dataset.py with requests library and tqdm progress bars
  • ✅ Created download_dataset.ps1 with native PowerShell download functions
  • ✅ Updated README.md with cross-platform usage instructions
  • ✅ Added examples for all three download methods

Usage Examples

# Original bash (Linux/Mac)
bash download_dataset.sh WaterDrop ./datasets

# New Python script (All platforms)
python download_dataset.py WaterDrop ./datasets

# New PowerShell script (Windows)
.\download_dataset.ps1 -DatasetName "WaterDrop" -OutputDir "./datasets"
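
The Python invocation above takes a dataset name and an output directory as positional arguments. A minimal sketch of such a command line with argparse, including the warning for unknown dataset names, might look like this. The names parse_args and KNOWN_DATASETS are hypothetical, and the set contains only the one dataset name that appears in this description; the real script presumably knows more.

```python
import argparse
import sys

# Assumption: only "WaterDrop" is confirmed by the examples in this PR.
KNOWN_DATASETS = {"WaterDrop"}


def parse_args(argv=None):
    """Parse `<dataset_name> <output_dir>` and warn on unknown datasets."""
    parser = argparse.ArgumentParser(
        description="Download a Learning to Simulate dataset.",
        epilog="Example: python download_dataset.py WaterDrop ./datasets",
    )
    parser.add_argument("dataset_name", help="name of the dataset to fetch")
    parser.add_argument("output_dir", help="directory to store the files in")
    args = parser.parse_args(argv)
    if args.dataset_name not in KNOWN_DATASETS:
        # Warn rather than abort, so new dataset names can still be tried.
        print(
            f"Warning: unknown dataset '{args.dataset_name}'",
            file=sys.stderr,
        )
    return args
```

Warning instead of exiting keeps the tool usable if datasets are added upstream before the known-name list is updated.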

- Update aiohttp to address potential security vulnerabilities
- Maintains compatibility with existing codebase
- Addresses dependency security recommendations

- Add comprehensive open-source-resources.md with organized dataset links
- Update README.md with dedicated datasets section for better navigation
- Include external repository links and clear access instructions
- Improve reproducibility and research collaboration

Fixes google-deepmind#608

- Apply edge residual connections before node processing per paper
- Node updates now use updated edge features instead of update deltas
- Ensures proper information flow: e'_ij = e_ij + MLP(e_ij, v_i, v_j)
- Then: v'_i = v_i + MLP(v_i, sum(e'_ij)) using updated edges
- Matches mathematical formulation in MeshGraphNets paper

Fixes google-deepmind#609

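
The update order described in this commit can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: edge_fn and node_fn are hypothetical stand-ins for the learned MLPs, features are plain lists of floats, and it assumes that each edge is aggregated at its receiver node.

```python
def graph_net_block(nodes, edges, senders, receivers, edge_fn, node_fn):
    """One message-passing step with the edge residual applied first.

    nodes: list of node feature vectors v_i (lists of floats)
    edges: non-empty list of edge feature vectors e_ij
    senders, receivers: per-edge node indices
    edge_fn(e, v_sender, v_receiver) and node_fn(v, aggregated) return
    update deltas with the same length as e and v respectively.
    """
    def add(a, b):
        return [x + y for x, y in zip(a, b)]

    # e'_ij = e_ij + MLP(e_ij, v_i, v_j): residual applied to the edges first.
    new_edges = [
        add(e, edge_fn(e, nodes[s], nodes[r]))
        for e, s, r in zip(edges, senders, receivers)
    ]

    # Sum the updated edges e'_ij (not the raw deltas) at each receiver.
    aggregated = [[0.0] * len(edges[0]) for _ in nodes]
    for e, r in zip(new_edges, receivers):
        aggregated[r] = add(aggregated[r], e)

    # v'_i = v_i + MLP(v_i, sum(e'_ij)): node residual on the updated edges.
    new_nodes = [add(v, node_fn(v, a)) for v, a in zip(nodes, aggregated)]
    return new_nodes, new_edges
```

With a single edge of feature 1.0 from node 0 to node 1, an edge_fn that always returns a delta of 1.0, and a node_fn that returns the aggregate unchanged, node 1 receives 2.0 (the updated edge). Aggregating the delta instead would give 1.0, which is the behaviour this commit moves away from.
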
- Add Python script (download_dataset.py) with progress bars and resume capability
- Add PowerShell script (download_dataset.ps1) for Windows native support
- Update README.md with cross-platform download instructions
- Addresses Issue google-deepmind#607: Unable to access datasets for Learning to Simulate Project

The original bash script works but has limitations on Windows systems.
These additional tools provide:
- Cross-platform compatibility (Python works everywhere)
- Progress indicators for large downloads
- Resume capability for interrupted downloads
- Better error handling and user feedback
- Windows PowerShell native support

All tools download from the same Google Cloud Storage URLs and maintain
compatibility with the existing dataset structure.
@polarbe

polarbe commented Aug 10, 2025 via email
