Add rule GCI108 #411

@@ -0,0 +1,17 @@
{
  "title": "Comparison between XGBoost and RandomForest",
  "type": "CODE_SMELL",
  "status": "ready",
  "remediation": {
    "func": "Constant/Issue",
    "constantCost": "10min"
  },
  "tags": [
    "creedengo",
    "eco-design",
    "performance",
    "memory",
    "ai"
  ],
  "defaultSeverity": "Minor"
}

@@ -0,0 +1,62 @@

= Prefer Efficient Classifiers: XGBoost vs RandomForest

Using resource-heavy classifiers like `RandomForestClassifier` for standard classification tasks can result in longer execution times and higher carbon emissions.

`XGBoost` offers a more optimized and eco-friendly alternative with competitive accuracy.

== Metrics Comparison Table

[cols="1,1,1,1", options="header"]
Review comment: Sorry, but I don't understand how you do your tests. Could you give us more information about your testing context, please?
|===
|Classifier |Accuracy |Time (s) |Carbon Emission (kg CO₂)

|RandomForestClassifier
|0.88
|1.24
|0.00091

|XGBClassifier
|0.89
|0.47
|0.00035
|===

XGBoost not only matches or exceeds the accuracy of RandomForest but also runs significantly faster and emits less CO₂.
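
The exact benchmark setup is not described in this rule. The sketch below shows one way such figures could be reproduced; the synthetic dataset and the use of the `codecarbon` package for emission tracking are illustrative assumptions, not the protocol actually used for the table above.

[source,python]
----
import time

from codecarbon import EmissionsTracker
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Illustrative dataset; the real benchmark data is not part of this rule.
X, y = make_classification(n_samples=50_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for name, model in [
    ("RandomForestClassifier", RandomForestClassifier(n_estimators=100)),
    ("XGBClassifier", XGBClassifier(eval_metric="logloss")),
]:
    tracker = EmissionsTracker()   # reports emissions in kg CO2eq
    tracker.start()
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    emissions = tracker.stop()
    accuracy = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy={accuracy:.2f} time={elapsed:.2f}s CO2={emissions:.5f}kg")
----

Running the same loop over several dataset sizes would also reproduce the dataset-size plots shown in the next section.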

== Visual Comparison
Review comment: In your different graphics you show several frameworks. Could you explain why you chose XGBoost instead of RandomForestClassifier, and not another framework? Is the choice based only on your table of metrics?

=== Accuracy vs Dataset Size

image::accuracy_vs_size.png[Accuracy Comparison]

=== Carbon Emission vs Dataset Size

image::emissions_vs_size.png[Emission Comparison]

=== Execution Time vs Dataset Size

image::execution_time_vs_size.png[Time Comparison]

== Non-Compliant Code Example

[source,python]
----
from sklearn.ensemble import RandomForestClassifier

# Training a full bagged ensemble is comparatively slow and memory-hungry.
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
----

== Compliant Code Example

[source,python]
----
from xgboost import XGBClassifier

# Gradient-boosted trees typically train faster and emit less CO₂ for comparable accuracy.
# Note: use_label_encoder is deprecated in recent XGBoost versions and can be omitted.
model = XGBClassifier(use_label_encoder=False, eval_metric="logloss")
model.fit(X_train, y_train)
----
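
Beyond switching classifiers, XGBoost exposes training options that can further reduce compute. The following sketch is illustrative only: the parameter values, the synthetic dataset, and the validation split are assumptions rather than part of this rule, and it assumes a reasonably recent XGBoost version (1.6+) where `early_stopping_rounds` is a constructor argument.

[source,python]
----
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Illustrative data; replace with the project's own training set.
X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = XGBClassifier(
    tree_method="hist",        # histogram-based split finding: faster, lower memory
    n_estimators=200,
    early_stopping_rounds=10,  # stop adding trees once validation loss stops improving
    eval_metric="logloss",
)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)
----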

== 📓 Article about XGBoost: A Scalable Tree Boosting System

This article explains in detail what XGBoost can do: https://arxiv.org/pdf/1603.02754
Review comment: You could change the title to convey the actual good practice; right now it is a generic title with no good practice to follow. Maybe something like "Use XGBoost instead of RandomForest" ...