-
Notifications
You must be signed in to change notification settings - Fork 491
Open
Description
Description
The current machine learning models in this project (specifically the Random Forest implementation) show a significant disparity between Accuracy (~90%) and Recall (~50%). This indicates a class imbalance issue where the model struggles to identify the minority class effectively.
Proposed Improvement
I have implemented a pipeline that uses:
- SMOTE (Synthetic Minority Over-sampling Technique) to balance the training set.
- StandardScaler to normalize feature distributions.
Results
These changes result in a much more balanced and reliable model:
- Recall: Improved from 0.50 to 0.63 (+26% gain)
- F1-Score: Improved from 0.59 to 0.64
- Precision: Maintained at a healthy 0.65
Checklist
- Code follows Python 3.x standards.
- Descriptive comments included for all new logic.
- MIT License added to the top of the file.
I have the code ready and would like to be assigned to this issue to submit a Pull Request!
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels