This repository was archived by the owner on Jul 20, 2025. It is now read-only.

Commit 7b3c36f

Minor fixing
1 parent 64535b8 commit 7b3c36f

3 files changed: 3 additions and 3 deletions

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 *.py[co]
-data/settings.ini
+data/settings.*
 *DS_Store*
 pip-selfcheck.json
 pip-log.txt

README.md

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ Used descriptor and model details
 
 The term _descriptor_ stands for the compact information-rich representation, allowing the convenient mathematical treatment of the encoded complex data (_i.e._ crystalline structure). Any crystalline structure is populated to a certain relatively big fixed volume of minimum one cubic nanometer. Then the descriptor is constructed using the periodic numbers of atoms and the lengths of their radius-vectors. The details are in the file `mpds_ml_labs/prediction.py`.
 
-As a machine-learning model an ensemble of decision trees ([random forest regressor](http://scikit-learn.org/stable/modules/ensemble.html)) is used, as implemented in [scikit-learn](http://scikit-learn.org) Python machine-learning toolkit. The whole MPDS dataset can be used for training. In order to estimate the prediction quality of the _regressor_ model, the metrics of _mean absolute error_ and _R2 coefficient of determination_ are used. In order to estimate the prediction quality of the _classifier_ model (binary case), the simple error percentage is used (`(false positives + false negatives)/all outcome`). The evaluation process is repeated at least 30 times to achieve a statistical reliability.
+As a machine-learning model an ensemble of decision trees ([random forest regressor](http://scikit-learn.org/stable/modules/ensemble.html)) is used, as implemented in [scikit-learn](http://scikit-learn.org) Python machine-learning toolkit. The whole MPDS dataset can be used for training. In order to estimate the prediction quality of the _regressor_ model, the _mean absolute error_ and _R2 coefficient of determination_ is saved. In order to estimate the prediction quality of the binary _classifier_ model, the _fraction incorrect_ (_i.e._ _error percentage_) is saved. The evaluation process is repeated at least 30 times to achieve a statistical reliability.
 
 API
 ------
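
The three quality metrics named in the README paragraph of this hunk can be written out in a few lines. The following is a minimal, dependency-free sketch for illustration only. It is not the repository's code, which per the README relies on scikit-learn, and the function names are hypothetical.

```python
# Illustrative, pure-Python versions of the metrics named in the README text.
# Function names are hypothetical and not taken from the repository.

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fraction_incorrect(y_true, y_pred):
    """(false positives + false negatives) / all outcomes, for binary labels."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


if __name__ == "__main__":
    print(mean_absolute_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
    print(fraction_incorrect([1, 0, 1, 0], [1, 1, 1, 0]))
```

Averaging these values over the repeated evaluation runs mentioned in the README (at least 30) would give the reported estimates.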

train_regressor.py

Lines changed: 1 addition & 1 deletion
@@ -141,7 +141,7 @@ def tune_model(data_file):
     parameter_a = results[-1][0]
 
     results = []
-    for parameter_b in range(5, 31):
+    for parameter_b in range(10, 101, 2):
         avg_mae, avg_r2 = estimate_regr_quality(get_regr(a=parameter_a, b=parameter_b), X, y)
         results.append([parameter_b, avg_mae, avg_r2])
         print("%s\t\t\t%s\t\t\t%s" % (parameter_b, avg_mae, avg_r2))
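
To make the effect of the changed `range` concrete, here is a small sketch comparing the old and new search grids for `parameter_b` and running the same kind of sweep over the new one. `toy_evaluate` is a hypothetical stand-in for `estimate_regr_quality(get_regr(...), X, y)`, and picking the row with the lowest MAE is an assumption, since the selection logic is outside this hunk.

```python
# Old grid: every integer from 5 to 30. New grid: even integers from 10 to 100.
old_grid = list(range(5, 31))
new_grid = list(range(10, 101, 2))


def sweep(grid, evaluate):
    """Evaluate each candidate value and collect [value, mae, r2] rows."""
    results = []
    for value in grid:
        mae, r2 = evaluate(value)
        results.append([value, mae, r2])
    return results


def toy_evaluate(parameter_b):
    """Hypothetical scoring function with its lowest error at parameter_b = 40."""
    mae = abs(parameter_b - 40) / 100.0 + 0.1
    r2 = 1.0 - mae
    return mae, r2


results = sweep(new_grid, toy_evaluate)
# Assumption: the best candidate is the one with the lowest MAE.
best = min(results, key=lambda row: row[1])

if __name__ == "__main__":
    print(len(old_grid), old_grid[0], old_grid[-1])
    print(len(new_grid), new_grid[0], new_grid[-1])
    print(best)
```

The new grid covers a wider interval at a coarser step: 46 candidates instead of 26, so the sweep runs roughly 1.8 times as many evaluations.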
