
Commit a9360fa

lmz and nick-fournier authored
Add API Documentation with MkDocs (#31)
* read TSVs
* bugfix when log dir is missing
* copy over from 2023 runner
* cache cleanup
* add warning if linked_trip_id exists
* add safeguard to prevent null IDs
* mostly working 2019
* cleanup caching prints
* Remove duplicate logging
* pipeline runs for 2019!
* clear deprecation warning
* bugfix bad if else
* Fix comments and update configuration to make 2019 / 2023 comparison easier
* Fix bug where travel date wasn't getting set
* quick purpose by mode analysis
* added 2019 weights, debug with 0 or nulls
* Add mkdocs-based documentation
* Add MkDocs documentation with GitHub Pages deployment
* Fix backslash
* Allow deployment from mkdocs branch
* Fix url
* Update TOC and add section on generating API Documentation
* Add mkdocs-include-markdown-plugin, to include existing Readmes into mkdocs
* Move read_write documentation into docstrings, rather than maintaining parallel documentation in Readme
* Move link_trips documentation into docstrings
* Improve documentation formatting and move more documentation to docstrings from Readme
* Consolidate multiple versions of extract tours documentation, and add imputation placeholder
* Consolidate and move CTRAMP-formatting related documentation to python docstrings
* Fix pre-commit errors
* Fix case problem for include
* Consolidate and move daysim-formatting documentation to python docstrings
* Update Readmes
* Move final check documentation in python docstring
* Fix spacing

---------

Co-authored-by: nick-fournier <45876721+nick-fournier@users.noreply.github.com>
1 parent beb9283 commit a9360fa


49 files changed: +1853 −1127 lines changed

.github/workflows/docs.yml

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
name: Deploy Documentation

on:
  push:
    branches: [main, mkdocs]
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

# Allow one concurrent deployment
concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  build:
    name: Build Documentation
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v3
        with:
          enable-cache: true

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: uv sync --group dev

      - name: Build documentation
        run: uv run mkdocs build

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: ./site

  deploy:
    name: Deploy to GitHub Pages
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
.gitignore

Lines changed: 3 additions & 0 deletions
@@ -78,6 +78,9 @@ instance/
 # Sphinx documentation
 docs/_build/
 
+# MkDocs documentation
+site/
+
 # PyBuilder
 .pybuilder/
 target/

README.md

Lines changed: 54 additions & 22 deletions
@@ -9,17 +9,17 @@ Tools for processing and validating travel diary survey data into standardized f
 - [Architecture](#architecture)
   - [Conceptual Diagram](#conceptual-diagram)
   - [Pipeline Steps](#pipeline-steps)
-- [Usage](#usage)
-  - [Quick Start](#quick-start)
-    - [1. Installing UV & Virtual Environment Setup](#1-installing-uv--virtual-environment-setup)
-    - [2. Configuration](#2-configuration)
-    - [3. Pipeline Runner](#3-pipeline-runner)
-  - [Data Models and Validation](#data-models-and-validation)
-    - [`step` Decorator and Validation](#step-decorator-and-validation)
-- [Documentation](#documentation)
+- [Quick Start](#quick-start)
+  - [1. Installing UV & Virtual Environment Setup](#1-installing-uv--virtual-environment-setup)
+  - [2. Configuration](#2-configuration)
+  - [3. Pipeline Runner](#3-pipeline-runner)
+- [Data Models and Validation](#data-models-and-validation)
+  - [`step` Decorator and Validation](#step-decorator-and-validation)
+- [Additional Documentation](#additional-documentation)
 - [Work Plan](#work-plan)
 - [Development](#development)
   - [Project Structure](#project-structure)
+  - [Generating API Documentation](#generating-api-documentation)
   - [Running Tests](#running-tests)
   - [Code Quality](#code-quality)
   - [Pre-commit Hooks](#pre-commit-hooks)

@@ -51,6 +51,7 @@ Tools for processing and validating travel diary survey data into standardized f
 The usage pattern for the pipeline is a bit different than the typical numbered scripts you might see elsewhere. *There is no monolithic integrated script*. Instead there is a standardized data processing pipeline that is configurable via YAML files and executed via a runner script.
 
 There are three main components:
+
 * **Setup**
   * This contains the point of entry defined in `project/run.py` and
   * Pipeline configuration defined in `project/config.yaml`

@@ -179,20 +180,20 @@ The data processing pipeline consists of modular steps that transform raw survey
 
 #### Core Processing Steps
 
-1. **[Load Data](src/processing/read_write/README.md)** - Loads canonical survey tables from CSV, Parquet, or geospatial files into memory
-2. **[Cleaning](src/processing/cleaning/README.md)** - Project-specific data cleaning operations (e.g., fixing time/distance errors, adding missing records)
-3. **Imputation** *(placeholder)* - Imputes missing values for key variables (e.g., mode, purpose, locations)
-4. **[Link Trips](src/processing/link_trips/README.md)** - Aggregates individual trip segments into complete journey records by detecting mode changes and transfers
-5. **[Detect Joint Trips](src/processing/joint_trips/README.md)** - Identifies shared household trips using spatial-temporal similarity matching
-6. **[Extract Tours](src/processing/tours/README.md)** - Builds hierarchical tour structures (home-based tours and work-based subtours) from linked trips
-7. **Weighting** *(placeholder)* - Calculates expansion weights to match survey sample to population targets
-8. **[Format Output](src/processing/formatting/daysim/README.md)** - Transforms canonical data to model-specific formats (DaySim, ActivitySim, etc.)
-   - **[DaySim Format](src/processing/formatting/daysim/README.md)** - Formats data for DaySim model input
-   - **[CT-RAMP Format](src/processing/formatting/ctramp/README.md)** - Formats data for CT-RAMP model input
-9. **[Final Check](src/processing/final_check/README.md)** - Validates complete dataset against canonical schemas before export
-10. **[Write Data](src/processing/read_write/README.md)** - Writes processed tables to output files with optional validation
-
-Each step README provides detailed documentation on:
+1. **[Load Data](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/read_write/)** - Loads canonical survey tables from CSV, Parquet, or geospatial files into memory
+2. **[Cleaning](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/cleaning/)** - Project-specific data cleaning operations (e.g., fixing time/distance errors, adding missing records)
+3. **[Imputation](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/imputation/)** *(placeholder)* - Imputes missing values for key variables (e.g., mode, purpose, locations)
+4. **[Link Trips](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/link_trips/)** - Aggregates individual trip segments into complete journey records by detecting mode changes and transfers
+5. **[Detect Joint Trips](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/detect_joint_trips/)** - Identifies shared household trips using spatial-temporal similarity matching
+6. **[Extract Tours](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/extract_tours/)** - Builds hierarchical tour structures (home-based tours and work-based subtours) from linked trips
+7. **[Weighting](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/weighting/)** *(placeholder)* - Calculates expansion weights to match survey sample to population targets
+8. **Format Output** - Transforms canonical data to model-specific formats (DaySim, ActivitySim, etc.)
+   - **[DaySim Format](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/format_output/daysim/)** - Formats data for DaySim model input
+   - **[CT-RAMP Format](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/format_output/ctramp/)** - Formats data for CT-RAMP model input
+9. **[Final Check](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/final_check/)** - Validates complete dataset against canonical schemas before export
+10. **[Write Data](https://bayareametro.github.io/travel-diary-survey-tools/pipeline_steps/read_write/)** - Writes processed tables to output files with optional validation
+
+Each step links to documentation generated by the step's docstring, and provides detailed documentation on:
 - Input/output data requirements
 - Core algorithm and processing logic
 - Configuration parameters

@@ -443,6 +444,7 @@ def new_processing_step(
 
 ## Additional Documentation
 For more details, see:
+* [API Documentation](https://bayareametro.github.io/travel-diary-survey-tools/) - Auto-generated API documentation for data models, pipeline, and processing functions (deployed to GitHub Pages).
 * [Validation Framework Documentation](docs/VALIDATION_README.md) - Which goes into more detail on the validation framework architecture and usage.
 * [Column Requirements Documentation](docs/COLUMN_REQUIREMENTS.md) - Contains auto-generated tables and enums for easy reference on which fields are required for each processing step. Essentially summarizes the data models in a table.
 

@@ -510,6 +512,36 @@ travel-diary-survey-tools/
 └── docs/                  # Documentation
 ```
 
+### Generating API Documentation
+
+The project uses [MkDocs](https://www.mkdocs.org/) with [Material theme](https://squidfunk.github.io/mkdocs-material/) to generate API documentation from docstrings.
+
+**Building locally:**
+```bash
+# Build documentation
+uv run mkdocs build --strict
+
+# Preview with live reload
+uv run mkdocs serve
+# View at http://127.0.0.1:8000
+```
+
+**How it works:**
+- `mkdocstrings[python]` auto-generates docs from Python docstrings and type hints
+- `griffe-pydantic` extension handles Pydantic model documentation
+- `mkdocs-include-markdown-plugin` embeds algorithm documentation from processing module READMEs
+- Documentation structure defined in `mkdocs.yml`
+- Source files in `docs/` directory (markdown files reference Python modules)
+
+**Adding new pages:**
+1. Create markdown file in `docs/`
+2. Add to navigation in `mkdocs.yml`
+3. Reference Python modules using `::: module.path.ClassName` syntax
+
+**Deployment:**
+- Automatic via GitHub Actions on push to `main` branch
+- Published to: https://bayareametro.github.io/travel-diary-survey-tools/
+- Workflow defined in [`.github/workflows/docs.yml`](.github/workflows/docs.yml)
 
 ### Running Tests
 Tests can be run using `pytest` via VSCode extension or command line:
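The `::: module.path.ClassName` directives and the `mkdocs.yml` navigation mentioned in the README section above come together in a single config file. As a hypothetical sketch only (plugin options and nav entries are illustrative, not copied from the repo's actual `mkdocs.yml`), such a config might look like:

```yaml
site_name: Travel Diary Survey Tools
theme:
  name: material

plugins:
  - search
  - include-markdown          # mkdocs-include-markdown-plugin, embeds existing READMEs
  - mkdocstrings:             # renders `::: module.path` directives from docstrings
      handlers:
        python: {}

nav:
  - Home: index.md
  - Codebook: codebook.md
  - Data Models:
      - models/index.md
      - DaySim: models/daysim.md
      - CTRAMP: models/ctramp.md
```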

docs/codebook.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# Codebook

The codebook modules define enumerated value labels and standardized coding schemes used throughout the survey processing pipeline.

## Overview

Codebook enumerations use the `LabeledEnum` pattern to provide both numeric codes and human-readable labels. These are used for:

- Data validation and type checking
- Consistent coding across different survey years
- Output formatting for travel demand models
- Documentation and data dictionaries

## Usage Example

```python
from data_canon.codebook.trips import Mode, Purpose

# Access code and label
mode_code = Mode.WALK_TRANSIT.value   # 11
mode_label = Mode.WALK_TRANSIT.label  # "Walk to transit"

# Validate and look up
purpose = Purpose(4)  # Purpose.SHOPPING_ERRANDS
print(purpose.label)  # "Appointment, shopping, or errands (e.g., gas)"
```

---

::: data_canon.codebook.households

::: data_canon.codebook.vehicles

::: data_canon.codebook.persons

::: data_canon.codebook.trips

::: data_canon.codebook.tours

::: data_canon.codebook.days

## Project/Format-specific

::: data_canon.codebook.daysim

::: data_canon.codebook.ctramp
docs/index.md

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
# Travel Diary Survey Tools

Documentation for travel diary survey data processing tools.

## Overview

This project provides tools to process and analyze travel diary survey data with standardized data models and validation.

## Documentation Structure

### [Codebook](codebook.md)
Enumerated value labels and coding schemes for survey data fields. Includes definitions for:

- Trip purposes, modes, and characteristics
- Person demographics and employment
- Household attributes
- Tour patterns
- Model-specific codes (DaySim, CTRAMP)

### [Data Models](models/index.md)
Pydantic data models for validation and processing:

- Survey data models (households, persons, trips, tours)
- Model-specific output formats (DaySim, CTRAMP)
- Validation rules and constraints

## Quick Links

- [Project README](https://github.com/BATS/travel-diary-survey-tools/blob/main/README.md)
- [Column Requirements](COLUMN_REQUIREMENTS.md)
- [Validation Documentation](VALIDATION_README.md)

docs/models/ctramp.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
# CTRAMP Models

Output file format models for the CT-RAMP (Coordinated Travel-Regional Activity Modeling Platform) travel demand model.

::: data_canon.models.ctramp
    options:
      show_root_heading: true
      members:
        - HouseholdCTRAMPModel
        - PersonCTRAMPModel
        - MandatoryLocationCTRAMPModel
        - IndividualTourCTRAMPModel
        - JointTourCTRAMPModel
        - IndividualTripCTRAMPModel
        - JointTripCTRAMPModel

docs/models/daysim.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
# DaySim Models

Output file format models for the DaySim activity-based travel demand model.

Based on [DaySim Input Data File Documentation](https://github.com/RSGInc/DaySim/wiki/docs/Daysim%20Input%20Data%20File%20Documentation.docx)

::: data_canon.models.daysim
    options:
      show_root_heading: true
      members:
        - HouseholdDaysimModel
        - PersonDaysimModel
        - HouseholdDayDaysimModel
        - PersonDayDaysimModel
        - TourDaysimModel
        - LinkedTripDaysimModel

docs/models/index.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
# Data Models

Pydantic data models provide validation and type checking for survey data processing.

## Overview

Data models represent individual records (rows) and define:

- Required and optional fields
- Field validation rules and constraints
- Foreign key relationships between tables
- Pipeline step requirements

Models use Pydantic's `BaseModel` with custom field validators to ensure data quality throughout the processing pipeline.

## Key Features

### Field Validation
Each field includes validation rules:
```python
age: AgeCategory = step_field(required_in_steps=["extract_tours"])
home_lat: float = step_field(ge=-90, le=90, required_in_steps=["extract_tours"])
```

### Foreign Key Relationships
Models enforce referential integrity:
```python
hh_id: int = step_field(
    ge=1,
    fk_to="households.hh_id",
    required_child=True,
)
```

### Pipeline Step Requirements
Fields specify which processing steps require them:
```python
person_num: int = step_field(ge=1, required_in_steps=["format_ctramp", "format_daysim"])
```

## Usage Example

```python
from data_canon.models.survey import PersonModel

person = PersonModel(
    person_id=1,
    hh_id=100,
    person_num=1,
    age=AgeCategory.AGE_35_64,
    gender=Gender.FEMALE,
    employment=Employment.FULL_TIME,
    student=Student.NOT_STUDENT,
    # ... other fields
)
```

## Survey Data Models

Core data models used in the processing pipeline for households, persons, days, trips, and tours.

::: data_canon.models.survey.HouseholdModel

::: data_canon.models.survey.PersonModel

::: data_canon.models.survey.PersonDayModel

::: data_canon.models.survey.UnlinkedTripModel

::: data_canon.models.survey.LinkedTripModel

::: data_canon.models.survey.TourModel

::: data_canon.models.survey.JointTripModel

## Travel Model-formatted Data Models

### [DaySim Models](daysim.md)
Output file format models for the DaySim activity-based travel demand model.

### [CTRAMP Models](ctramp.md)
Output file format models for the CT-RAMP travel demand model.