Add D4D datasheet: AI-READI Flagship Dataset #60

Closed
d4dassistant wants to merge 1 commit into main from d4d/add-ai-readi-flagship-datasheet

Summary

Created a new D4D datasheet for the AI-READI Flagship Dataset of Type 2 Diabetes, based on the project documentation sources listed in the commit message at the end of this description.

Files Added

  • data/extracted_by_column/AI_READI/ai_readi_flagship_d4d.yaml - D4D YAML datasheet
  • src/html/output/D4D_-_AI-READI_FAIRHub_v3_human_readable.html - HTML preview (updated)

Key Metadata Extracted

Dataset Overview

  • Dataset ID: ai-readi-flagship-v2.0.0
  • Dataset Name: Flagship Dataset of Type 2 Diabetes from the AI-READI Project
  • Publisher: AI-READI Consortium
  • Version: 2.0.0 (released July 31, 2024)
  • DOI: 10.5281/zenodo.10642459

Purpose

Create and share a flagship, ethically sourced dataset of type 2 diabetes optimized for AI/ML analysis. The dataset is intended to support computational study of salutogenic pathways (pathways leading from diabetes back toward health), with an emphasis on ethical data sourcing and equity.

Composition

  • Participants: 1,067 (target: 4,000)
  • Data Modalities: 15+ types including vitals, ECG, ophthalmological measurements, retinal imaging, wearables
  • Dataset Size: 2.01 TB with 165,051 files
  • Formats: DICOM, CSV, Markdown
  • Categories: 4 diabetes severity groups (no diabetes, prediabetes, oral/non-insulin controlled, insulin-controlled)

Collection

  • Sites: UAB, UCSD, University of Washington
  • Protocols: Clinical Dataset Structure (CDS) specification
  • Devices: Dexcom, Topcon Healthcare, Heidelberg Engineering, iCare, Optomed

Distribution

  • Platform: FAIRhub (https://fairhub.io)
  • Access: Controlled access with use restrictions
  • License: AI-READI custom license v1.0
  • Restrictions: Limited to type 2 diabetes-related research
  • Standards: FAIR principles, CDS, Bridge2AI standards

Ethics & Privacy

  • IRB Approval: All three collection sites
  • De-identification: PHI removed, HIPAA Safe Harbor approach
  • Community Engagement: Stakeholder input and American Indian engagement initiatives
  • Ethical Position: Ethically-sourced with emphasis on equity

Funding

  • Agency: National Institutes of Health (NIH)
  • Program: Bridge2AI
  • Award: 1OT2OD032644

References

  • Owsley et al. (2025). BMJ Open. DOI: 10.1136/bmjopen-2024-097449
  • AI-READI Consortium (2024). Nature Metabolism. DOI: 10.1038/s42255-024-01165-x
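
For reviewers who want to spot-check the DOIs cited in this description, the following is a minimal sketch that builds `https://doi.org/` resolver URLs from them. The `doi_url` helper and the loose syntax pattern are illustrative assumptions, not part of this repository; the check is syntactic only and does not confirm that a DOI resolves.

```python
import re

# Loose syntactic check only (assumption): a "10." prefix, a 4-9 digit
# registrant code, a slash, then a non-empty suffix without whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_url(doi: str) -> str:
    """Return the https://doi.org/ resolver URL for a syntactically plausible DOI."""
    doi = doi.strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a plausible DOI: {doi!r}")
    return "https://doi.org/" + doi


# DOIs copied from this PR description.
CITED_DOIS = [
    "10.5281/zenodo.10642459",      # dataset
    "10.1136/bmjopen-2024-097449",  # Owsley et al. (2025)
    "10.1038/s42255-024-01165-x",   # AI-READI Consortium (2024)
]

for doi in CITED_DOIS:
    print(doi_url(doi))
```

Opening each printed URL in a browser shows whether the DOI resolves to the expected record.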

How to Review

  1. View HTML preview: Open src/html/output/D4D_-_AI-READI_FAIRHub_v3_human_readable.html in a browser for human-readable format
  2. Check YAML: Review data/extracted_by_column/AI_READI/ai_readi_flagship_d4d.yaml for completeness and accuracy
  3. Verify sources: Compare against the original documentation URLs listed in the commit message below
  4. Check metadata sections: Ensure key information is captured accurately
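
Steps 2 and 4 above could be partly automated. The sketch below is one hedged way to do it: it reports which of the major D4D sections named in the Notes are absent or empty in a parsed datasheet. The helper names (`load_datasheet`, `missing_sections`) are hypothetical, and because the YAML is free-form, the assumption that each section appears as a top-level key with this exact spelling may not hold for the actual file. Loading relies on PyYAML's `yaml.safe_load`.

```python
from typing import Any, Dict, List

# Section names taken from the Notes in this PR description. The actual
# top-level keys in the YAML file may be spelled differently (assumption).
EXPECTED_SECTIONS = [
    "motivation", "composition", "collection", "preprocessing",
    "distribution", "ethics", "funding", "maintenance",
]


def load_datasheet(path: str) -> Dict[str, Any]:
    """Parse a D4D YAML file. Requires PyYAML (pip install pyyaml)."""
    import yaml  # imported lazily so missing_sections works without PyYAML
    with open(path, encoding="utf-8") as handle:
        data = yaml.safe_load(handle)
    return data or {}


def missing_sections(datasheet: Dict[str, Any],
                     expected: List[str] = EXPECTED_SECTIONS) -> List[str]:
    """Return expected sections that are absent or empty (case-insensitive keys)."""
    present = {
        str(key).lower()
        for key, value in datasheet.items()
        if value not in (None, "", [], {})
    }
    return [name for name in expected if name not in present]
```

A reviewer might call `missing_sections(load_datasheet("data/extracted_by_column/AI_READI/ai_readi_flagship_d4d.yaml"))` and treat a non-empty result as a prompt to inspect the file by hand, not as proof that a section is missing.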

Notes

  • This datasheet follows the same format as other extracted datasheets in the repository (VOICE, CM4AI, CHORUS)
  • Uses free-form YAML structure compatible with the HTML renderer
  • Covers all major D4D sections: motivation, composition, collection, preprocessing, distribution, ethics, funding, maintenance

Related to: #58


🤖 Generated with D4D Assistant

- Extracted metadata from comprehensive documentation sources:
  * https://aireadi.org/ (project overview)
  * https://docs.aireadi.org (technical documentation)
  * https://fairhub.io (distribution platform)
  * https://github.com/AI-READI (code repositories)
- Generated HTML preview for review
- Includes comprehensive metadata sections

Key metadata highlights:
- Dataset ID: ai-readi-flagship-v2.0.0
- 1,067 participants with 15+ data modalities
- 2.01 TB dataset distributed via FAIRhub
- Ethically sourced, FAIR-compliant data
- Controlled access for T2D research

Co-Authored-By: Claude <[email protected]>
This was referenced Nov 6, 2025