CogniLife is an advanced machine learning framework designed to identify dementia risk patterns using non-clinical determinants. By analyzing demographics, social engagement, and functional independence, this system provides a non-invasive screening methodology that bypasses the need for expensive medical imaging or laboratory tests.
The project uses data from the National Alzheimer’s Coordinating Center (NACC) and combines feature engineering with probability calibration, reaching a ROC-AUC of 0.9989 on the NACC validation set.
Dementia screening is traditionally a clinical process. CogniLife explores the "Digital Biomarker" potential of everyday behavioral data, focusing on:
- Functional Autonomy: Tracking abilities in managing finances, shopping, and medication.
- Social Connectivity: Quantifying the impact of social isolation vs. engagement.
- Lifestyle Resilience: Mapping education and occupation history against cognitive longevity.
This tool is designed to serve as a pre-clinical screening layer to help community health workers identify individuals who require formal medical evaluation.
- Advanced Feature Architecture
  - Social Interaction Index: A composite metric built from multi-variable social data.
  - Autonomy Mapping: Analysis of "Activities of Daily Living" (ADLs) to detect subtle cognitive shifts.
  - Target Encoding: High-efficiency handling of categorical variables to improve model convergence.
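
The two feature steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the column names (`visits_per_week`, `calls_per_week`), the z-score averaging used for the composite index, and the `smoothing` parameter are all assumptions made for the example. In practice the encoding should be fitted on training folds only, so that target information does not leak into validation data.

```python
from collections import defaultdict
from statistics import mean, pstdev


def social_interaction_index(rows, columns):
    """Average the z-scores of several social variables into one score per row."""
    stats = {}
    for col in columns:
        values = [row[col] for row in rows]
        # Guard against a constant column, which would have zero spread.
        stats[col] = (mean(values), pstdev(values) or 1.0)
    return [
        mean((row[col] - stats[col][0]) / stats[col][1] for col in columns)
        for row in rows
    ]


def fit_target_encoding(categories, targets, smoothing=10.0):
    """Map each category to a smoothed mean of the binary target."""
    prior = mean(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # Rare categories are pulled toward the global prior.
    encoding = {
        cat: (sums[cat] + smoothing * prior) / (counts[cat] + smoothing)
        for cat in counts
    }
    return encoding, prior


# Hypothetical toy data, not NACC variables.
rows = [
    {"visits_per_week": 0, "calls_per_week": 1},
    {"visits_per_week": 2, "calls_per_week": 3},
    {"visits_per_week": 4, "calls_per_week": 5},
]
index = social_interaction_index(rows, ["visits_per_week", "calls_per_week"])

occupations = ["teacher", "farmer", "teacher", "clerk"]
labels = [0, 1, 0, 1]
encoding, prior = fit_target_encoding(occupations, labels, smoothing=2.0)
# Categories never seen during fitting fall back to the global prior.
unseen = encoding.get("nurse", prior)
```

The smoothing term keeps categories with very few samples from receiving extreme encoded values.
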
- The Modeling Pipeline
  - Benchmarking: Comparative analysis of XGBoost, LightGBM, Random Forest, and SVM.
  - Hyperparameter Tuning: Automated optimization using Optuna.
  - Probability Calibration: Implementation of Platt Scaling to ensure risk percentages are statistically reliable.
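
To make the calibration step concrete, here is a minimal sketch of Platt scaling: a sigmoid `p = 1 / (1 + exp(-(a * s + b)))` is fitted to raw model scores `s` from a held-out set. The toy scores and labels, the function names, and the use of plain gradient descent (Platt's original method uses a Newton-style optimizer with smoothed targets) are assumptions for illustration, not the project's implementation.

```python
import math


def calibrate(score, a, b):
    """Turn a raw score into a probability with the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


def fit_platt(scores, labels, lr=0.1, epochs=5000):
    """Fit the sigmoid parameters (a, b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = calibrate(s, a, b)
            # Gradient of the log loss with respect to a and b.
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical held-out scores (e.g. raw margins) and true labels.
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
calibrated = [calibrate(s, a, b) for s in scores]
```

With scikit-learn, `CalibratedClassifierCV(method="sigmoid")` offers the same technique; whether this project uses that class is not stated here.
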
- Interpretability (XAI)
  - SHAP Integration: Local and global explanations for model decisions to ensure transparency.
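
SHAP explanations are built on Shapley values. The sketch below computes them exactly by brute force for a tiny hand-written model, to show what a local explanation means. The `toy_risk` function, its three features, and the all-zero baseline are hypothetical. The SHAP library uses much faster model-specific algorithms for tree ensembles, since this enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(x)
    features = range(n)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return model(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = value(set(subset) | {i})
                without_i = value(set(subset))
                total += weight * (with_i - without_i)
        phi.append(total)
    return phi


def toy_risk(f):
    """Hypothetical risk score with one interaction term."""
    adl_difficulty, social_contact, education = f
    return (0.5 * adl_difficulty - 0.3 * social_contact
            - 0.1 * education + 0.2 * adl_difficulty * social_contact)


x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_risk, x, baseline)
```

The contributions in `phi` add up to `toy_risk(x) - toy_risk(baseline)`, which is the property that lets per-feature values be read as a decomposition of a single prediction.
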
The final LightGBM-based engine delivers the following results on the NACC validation set:
| Metric | Performance |
|---|---|
| ROC-AUC | 0.9989 |
| F1 Score | 0.9780 |
| Precision | 0.9811 |
| Recall | 0.9748 |
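
As a quick consistency check on the table, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. The helper name below is an assumption for illustration. Because the inputs are already rounded to four decimals, the recomputed value can differ from the reported F1 in the last digit.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


reported_f1 = 0.9780
f1 = f1_from(precision=0.9811, recall=0.9748)
# Difference between the recomputed and reported F1.
gap = abs(f1 - reported_f1)
print(f"recomputed F1 = {f1:.4f}, gap = {gap:.5f}")
```
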
- Python 3.9+
- Access to the NACC dataset (requires a data use agreement)
git clone https://github.com/yourusername/VedaLink.git
cd VedaLink
pip install -r requirements.txt