feat: Vendor BIDS loader and lazy loading for complete consumption pipeline #23

@The-Obstacle-Is-The-Way

Summary

This repo currently covers only the production side of the pipeline (BIDS → HuggingFace Hub upload).

It lacks the consumption features (loading local BIDS datasets and memory-efficient lazy loading of NIfTI files) that would make this package a complete solution for neuroimaging researchers.

Installation Note

This package is not published on PyPI; install it as a git dependency:

uv add git+https://github.com/CloseChoice/neuroimaging-go-brrrr.git
# or
pip install git+https://github.com/CloseChoice/neuroimaging-go-brrrr.git

Current State

Feature | Status
Production (BIDS → Hub upload) | ✅ Complete
Hub consumption (load_dataset("hugging-science/...")) | ✅ Works (via upstream datasets)
Local BIDS consumption (load_dataset('bids', data_dir='/path')) | ❌ Missing
Memory-efficient lazy loading for NIfTI | ❌ Missing

Pending Upstream PRs (Unlikely to Merge)

These features exist as open PRs on huggingface/datasets but are unlikely to be prioritized:

  • #7886 - BIDS dataset loader
  • #7887 - Lazy loading for NIfTI (memory fix)
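
To illustrate what "lazy loading" means here, the following is a minimal sketch of the general idea only. It is not the code from PR #7887. The class name `LazyVolume` and the `reader` parameter are hypothetical: the wrapper stores a path and defers the expensive read until the data is first requested, so building a dataset of many volumes does not pull them all into memory.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable, Optional


@dataclass
class LazyVolume:
    """Hypothetical wrapper: keeps only a path until the data is needed."""

    path: Path
    # Any callable that turns a path into loaded data. In a real setup this
    # might wrap a NIfTI reader; that choice is an assumption, not part of
    # the upstream PR.
    reader: Callable[[Path], Any]
    _cache: Optional[Any] = field(default=None, init=False, repr=False)
    _loaded: bool = field(default=False, init=False, repr=False)

    def get(self) -> Any:
        """Read the file on first call, then return the cached result."""
        if not self._loaded:
            self._cache = self.reader(self.path)
            self._loaded = True
        return self._cache
```

Constructing a `LazyVolume` performs no I/O; the file is read once, on the first `get()` call, and later calls reuse the cached result.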

Proposal

Vendor the code from PRs #7886 and #7887 into this repo so that:

  1. Installing this package provides a complete neuroimaging toolkit
  2. Users don't have to wait for upstream merges (which may never happen)
  3. This repo becomes the canonical extension for neuroimaging on HuggingFace

Implementation Plan

  1. Add src/bids_hub/loader/bids.py containing the BIDS loader from PR #7886
  2. Add the lazy loading wrapper from PR #7887
  3. Expose via bids_hub.load_bids() or a similar API
  4. Update ARCHITECTURE.md to document the consumption pipeline
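
As a rough sketch of what the local-consumption side of the plan could involve, the snippet below walks a BIDS directory and yields one record per NIfTI file, with the entities parsed from the filename. It uses only the standard library and is not the loader from PR #7886. The function names `parse_bids_name` and `iter_bids_nifti` are hypothetical, and the parsing covers only the basic `key-value_..._suffix` naming pattern, not the full BIDS specification.

```python
import re
from pathlib import Path
from typing import Dict, Iterator, Union

# Matches one BIDS entity such as "sub-01" or "task-rest".
_ENTITY = re.compile(r"([a-zA-Z0-9]+)-([a-zA-Z0-9]+)")


def parse_bids_name(filename: str) -> Dict[str, str]:
    """Split e.g. 'sub-01_ses-02_T1w.nii.gz' into entities plus suffix."""
    stem = filename.split(".", 1)[0]  # drop '.nii' / '.nii.gz'
    parts = stem.split("_")
    entities: Dict[str, str] = {}
    for part in parts[:-1]:
        match = _ENTITY.fullmatch(part)
        if match:
            entities[match.group(1)] = match.group(2)
    entities["suffix"] = parts[-1]  # last chunk is the suffix, e.g. 'T1w'
    return entities


def iter_bids_nifti(root: Union[str, Path]) -> Iterator[Dict[str, str]]:
    """Yield one metadata record per NIfTI file under sub-*/ directories."""
    root = Path(root)
    for path in sorted(root.glob("sub-*/**/*")):
        if path.is_file() and path.name.endswith((".nii", ".nii.gz")):
            record = parse_bids_name(path.name)
            record["path"] = str(path)  # only the path is kept, not the data
            yield record
```

For example, `parse_bids_name("sub-01_ses-02_T1w.nii.gz")` returns `{"sub": "01", "ses": "02", "suffix": "T1w"}`. Because each record carries only a path, a function such as the proposed `bids_hub.load_bids()` could defer reading the image data until it is actually accessed.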
