FHIRy is a Python package that converts FHIR (Fast Healthcare Interoperability Resources) bundles and NDJSON files into pandas DataFrames for health data analytics, machine learning, and AI applications. It supports FHIR server search, BigQuery integration, and LLM-based natural language queries.
- Python Version: Requires Python 3.10 or higher (tested on 3.10, 3.11, and 3.12)
- Package Manager: Uses `uv` for fast, reliable dependency management
- Setup Commands:
uv sync # Install dependencies from pyproject.toml
src/fhiry/ # Main source code
├── fhiry.py # Core FHIR Bundle processor
├── fhirndjson.py # NDJSON file processor
├── fhirsearch.py # FHIR server search API integration
├── bqsearch.py # BigQuery FHIR dataset queries
├── flattenfhir.py # FHIR resource flattening logic
├── parallel.py # Parallel processing utilities
├── base_fhiry.py # Base class for FHIR processors
└── main.py # CLI entry point
tests/ # Test suite with pytest
docs/ # MkDocs documentation
examples/ # Usage examples
- Formatter: Ruff (enforced via pre-commit hooks)
- Line Length: 120 characters maximum
- Type Hints: Required for all function signatures (enforced by mypy)
- Docstrings: Use Google-style docstrings for classes and public methods
- Import Organization: Handled automatically by ruff (isort-compatible)
- All functions must have type hints
- No implicit optional types
- Check untyped definitions
- Return type annotations are required
- Add `# type: ignore` comments only when necessary, with justification in code comments
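As a hedged illustration of the style and typing rules above, a helper written to these conventions might look like the following sketch. The function `count_resource_types` is hypothetical and is not part of FHIRy; it only shows full type hints, an explicit optional, and a Google-style docstring.

```python
from __future__ import annotations

from collections import Counter
from typing import Any


def count_resource_types(bundle: dict[str, Any], limit: int | None = None) -> dict[str, int]:
    """Count the resources in a parsed FHIR Bundle by resource type.

    Args:
        bundle: A FHIR Bundle already parsed into a dictionary.
        limit: Maximum number of types to return, most common first.
            ``None`` returns every type. The optional is written out
            explicitly because implicit optionals are not allowed.

    Returns:
        A mapping of resource type to the number of entries of that type.
    """
    counts: Counter[str] = Counter(
        entry.get("resource", {}).get("resourceType", "Unknown")
        for entry in bundle.get("entry", [])
    )
    return dict(counts.most_common(limit))
```

Every parameter and the return value are annotated, so mypy can check callers without any `# type: ignore` comment.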
- Framework: pytest with coverage reporting
- Coverage: Tracks coverage for the `src.fhiry` module
- Location: All tests in the `tests/` directory
- Test Resources: Sample FHIR bundles in `tests/resources/`
uv run pytest --cov=src/fhiry tests/ # Run all tests with coverage
uv run pytest tests/ # Run tests without coverage
uv run pytest tests/test_specific.py # Run specific test file
- Test files must start with `test_`
- Test functions must start with `test_`
- Use fixtures from `tests/conftest.py`
- Maintain high test coverage (aim for >70%)
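A minimal sketch of a test module that follows these naming rules is shown below. The file name `tests/test_example.py` and the helper `make_sample_bundle` are assumptions for illustration; in the real suite, shared data would more likely come from a fixture in `tests/conftest.py` or a file in `tests/resources/`.

```python
# tests/test_example.py (hypothetical file name)
from typing import Any


def make_sample_bundle() -> dict[str, Any]:
    """Build a tiny in-memory Bundle, standing in for a file in tests/resources/."""
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [
            {"resource": {"resourceType": "Patient", "id": "p1"}},
            {"resource": {"resourceType": "Observation", "id": "o1", "subject": {"reference": "Patient/p1"}}},
        ],
    }


def test_bundle_has_one_patient() -> None:
    bundle = make_sample_bundle()
    patients = [e["resource"] for e in bundle["entry"] if e["resource"]["resourceType"] == "Patient"]
    assert len(patients) == 1


def test_observation_references_existing_patient() -> None:
    bundle = make_sample_bundle()
    # Build "Type/id" keys so the Observation's subject can be resolved.
    known = {f'{e["resource"]["resourceType"]}/{e["resource"]["id"]}' for e in bundle["entry"]}
    observation = bundle["entry"][1]["resource"]
    assert observation["subject"]["reference"] in known
```

pytest collects both functions because the file name and the function names start with `test_`.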
- FHIR: Fast Healthcare Interoperability Resources (HL7 standard)
- Bundles: Collections of FHIR resources (e.g., Patient, Observation, Condition)
- NDJSON: Newline-delimited JSON format used for bulk FHIR data export
- Resources have nested structures that need flattening for DataFrame conversion
- Resource types include: Patient, Observation, Condition, Medication, Procedure, etc.
- FHIR Search API uses RESTful queries with specific parameters
- BigQuery has native FHIR dataset support
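The difference between a Bundle and an NDJSON export can be sketched with the standard library alone. This is an illustrative sketch, not FHIRy's implementation: the two helper functions are hypothetical, and the sample data is made up.

```python
from __future__ import annotations

import io
import json
from collections.abc import Iterator
from typing import Any, TextIO


def resources_from_bundle(bundle: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield each resource wrapped in a Bundle's ``entry`` array."""
    for entry in bundle.get("entry", []):
        if "resource" in entry:
            yield entry["resource"]


def resources_from_ndjson(stream: TextIO) -> Iterator[dict[str, Any]]:
    """Yield one resource per non-empty line of an NDJSON stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# A Bundle nests resources under entry[].resource ...
bundle = {"resourceType": "Bundle", "entry": [{"resource": {"resourceType": "Patient", "id": "p1"}}]}
# ... while NDJSON stores one complete resource per line.
ndjson = io.StringIO('{"resourceType": "Patient", "id": "p1"}\n{"resourceType": "Patient", "id": "p2"}\n')

print([r["id"] for r in resources_from_bundle(bundle)])
print([r["id"] for r in resources_from_ndjson(ndjson)])
```

Both helpers yield plain resource dictionaries, so the same downstream flattening step can serve either input format.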
- Flatten nested FHIR structures into tabular format
- Extract coding systems (SNOMED, LOINC, ICD-10) from CodeableConcept
- Handle references between resources (e.g., Patient references in Observations)
- Support column filtering and renaming via config JSON
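The flattening, coding extraction, and reference handling described above can be sketched as follows. This is a simplified illustration under stated assumptions: `flatten`, `codings`, and `reference_id` are hypothetical helpers, and the dotted column names are this sketch's choice, which may differ from the columns FHIRy actually produces.

```python
from __future__ import annotations

from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts and lists into a single dict with dotted keys."""
    if isinstance(value, dict):
        pairs = [(str(k), v) for k, v in value.items()]
    elif isinstance(value, list):
        pairs = [(str(i), v) for i, v in enumerate(value)]
    else:
        return {prefix: value}
    flat: dict[str, Any] = {}
    for key, child in pairs:
        flat.update(flatten(child, f"{prefix}.{key}" if prefix else key))
    return flat


def codings(concept: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (system, code) pairs from a CodeableConcept."""
    return [(c.get("system", ""), c.get("code", "")) for c in concept.get("coding", [])]


def reference_id(reference: dict[str, Any], resource_type: str) -> str | None:
    """Return the id from a relative reference such as 'Patient/p1', else None."""
    kind, _, ident = reference.get("reference", "").partition("/")
    return ident if kind == resource_type and ident else None


observation = {
    "resourceType": "Observation",
    "id": "o1",
    "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"}]},
    "subject": {"reference": "Patient/p1"},
    "valueQuantity": {"value": 72, "unit": "beats/minute"},
}

row = flatten(observation)
row["patient_id"] = reference_id(observation["subject"], "Patient")
print(row["code.coding.0.code"], row["valueQuantity.value"], row["patient_id"])
print(codings(observation["code"]))
```

A list of such rows maps directly onto a tabular structure, with one column per dotted key.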
- `pandas`: DataFrame operations
- `google-cloud-bigquery`: BigQuery integration
- `tqdm`: Progress bars for long operations
- `click`: CLI framework
- `numpy`: Numerical operations support
- `timeago`: Timestamp formatting
- `prodict`: Dictionary to object conversion
- `responses`: HTTP request mocking for tests
- `openpyxl`: Excel file support
- `llm` extra: Adds llama-index and langchain for LLM-based queries
- Add to `dependencies` in `pyproject.toml`
- Run `uv sync` to update the lock file
- Check for obsolete deps with `make check` (uses deptry)
- Add processing logic in appropriate module (fhiry.py, fhirsearch.py, etc.)
- Follow existing patterns for resource flattening
- Add type hints for all methods
- Write tests with sample FHIR resources
- Update documentation if adding public API
- Changes to column extraction logic should be in `base_fhiry.py` or the specific processor
- Test with various FHIR resource types
- Verify config JSON filtering still works
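One way to guard config-driven filtering against regressions is a small test like the sketch below. The `REMOVE` and `RENAME` keys are an assumed schema used only for illustration, and `apply_config` is a hypothetical stand-in for the processor's own logic; consult the README for the config format FHIRy actually accepts.

```python
from __future__ import annotations

import json
from typing import Any

# Assumed schema for illustration only: columns to drop and columns to rename.
CONFIG_JSON = '{"REMOVE": ["text.div"], "RENAME": {"id": "patient_id"}}'


def apply_config(row: dict[str, Any], config_json: str) -> dict[str, Any]:
    """Drop the columns listed under REMOVE and rename those listed under RENAME."""
    config = json.loads(config_json)
    removed = set(config.get("REMOVE", []))
    renames: dict[str, str] = config.get("RENAME", {})
    return {renames.get(name, name): value for name, value in row.items() if name not in removed}


def test_config_filters_and_renames() -> None:
    row = {"id": "p1", "gender": "female", "text.div": "<div>...</div>"}
    assert apply_config(row, CONFIG_JSON) == {"patient_id": "p1", "gender": "female"}
```

Running a check of this kind after each change to extraction logic shows whether filtered columns still disappear and renamed columns still carry their new names.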
- Modify `src/fhiry/main.py`
- Use Click decorators for command definition
- Add tests in `tests/test_cli.py`
- `pyproject.toml`: Dependencies, tool configuration, project metadata
- `Makefile`: Build, test, and development commands
- `.pre-commit-config.yaml`: Formatting and linting configuration
- `CONTRIBUTING.md`: Contribution guidelines
- `README.md`: Public API and usage examples
- Always run tests: Run `uv run pytest` before submitting changes
- Respect FHIR standards: Consult the HL7 FHIR specification when handling resources
- Preserve test coverage: Add tests for new functionality
- Use type hints: Required by mypy configuration
- Follow existing patterns: Check similar code before implementing new features
- Target develop branch: Never push directly to main
- Keep dependencies minimal: Only add if absolutely necessary
- Document public APIs: Update docstrings and README for user-facing changes
- Test with real FHIR data: Use samples in `tests/resources/`