[P2-T1] Design & Implement Data Preprocessing Pipeline #10

@edaaydinea

  • Description: Develop a scikit-learn pipeline in src/features/build_features.py. This pipeline should handle all preprocessing steps identified in Phase 1.
    • Imputation: Strategy for missing values (e.g., Mean/Median for numerical, Mode for categorical).
    • Scaling: Standardize or normalize numerical features (e.g., StandardScaler).
    • Encoding: One-hot encode categorical features (e.g., Gender, Schooling).
  • Acceptance Criteria: A reusable preprocessing pipeline object that can be saved and applied consistently to training, validation, and future data.
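For discussion, here is a minimal sketch of what such a pipeline could look like. It is not the agreed implementation: the function name `build_preprocessor`, the `Age` column, the toy data, and the choice of median imputation are all assumptions for illustration (only `Gender` and `Schooling` come from the issue text). The actual column lists and strategies should follow the Phase 1 findings.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(numeric_features, categorical_features):
    """Hypothetical helper: impute + scale numeric, impute + one-hot categorical."""
    numeric_steps = Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="median")),  # assumed; mean is the alternative
        ("scaler", StandardScaler()),
    ])
    categorical_steps = Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="most_frequent")),  # mode
        # Categories unseen during fit are encoded as all zeros instead of raising.
        ("encoder", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer(
        transformers=[
            ("num", numeric_steps, numeric_features),
            ("cat", categorical_steps, categorical_features),
        ],
        sparse_threshold=0,  # always return a dense array
    )


# Toy training data (made up) with one missing value per column.
train = pd.DataFrame({
    "Age": [23.0, np.nan, 31.0, 27.0],
    "Gender": pd.Series(["F", "M", np.nan, "F"], dtype=object),
    "Schooling": pd.Series(["HS", "BSc", "BSc", np.nan], dtype=object),
})

preprocessor = build_preprocessor(["Age"], ["Gender", "Schooling"])
X_train = preprocessor.fit_transform(train)  # fit on training data only

# Save the fitted object and reload it, as the acceptance criteria require.
path = os.path.join(tempfile.mkdtemp(), "preprocessor.joblib")
joblib.dump(preprocessor, path)
restored = joblib.load(path)

# Future data, including a Schooling value not seen during fit.
new = pd.DataFrame({
    "Age": [40.0],
    "Gender": pd.Series(["M"], dtype=object),
    "Schooling": pd.Series(["PhD"], dtype=object),
})
X_new = restored.transform(new)
```

Fitting once on the training split and reusing the saved object for validation and future data keeps the imputation values, scaling statistics, and category lists identical across all datasets. Wrapping everything in a single `ColumnTransformer` also means there is one object to version and persist.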
