
Commit 8f548ad

OkonSamuel and ablaom authored

For a 0.2.0 release (#20)

* fix docstring and build documentation
* fix typos in ci file
* add MLJDecisionTreeInterface to docs/Project.toml file
* update index.md
* Traits (#18)
  * remove dependence of is_wrapper trait on base model.
  * bug fixes and doc build
  * fix typo in docstring
* Fix doctests (#19)
  * remove dependence of is_wrapper trait on base model.
  * bug fixes and doc build
  * fix typo in docstring
  * fix doctests
* update codecov version to avoid github rate limit
* update codecov badge
* bump 0.2.0

Co-authored-by: Anthony Blaom, PhD <[email protected]>

1 parent 5d605c6 commit 8f548ad

File tree

9 files changed: +320 −184 lines changed

.github/workflows/ci.yml

Lines changed: 5 additions & 3 deletions

```diff
@@ -66,9 +66,10 @@ jobs:
       env:
         JULIA_NUM_THREADS: 2
       - uses: julia-actions/julia-processcoverage@v1
-      - uses: codecov/codecov-action@v1
+      - uses: codecov/codecov-action@v4
         with:
           file: lcov.info
+          token: ${{ secrets.CODECOV_TOKEN }}
   docs:
     name: Documentation
     runs-on: ubuntu-latest
@@ -125,9 +126,9 @@ jobs:
         julia --project=docs -e '
           if ENV["BUILD_DOCS"] == "true"
             using Documenter: doctest
-            using MLJBase
+            using FeatureSelection
             @info "attempting to run the doctests"
-            doctest(MLJBase)
+            doctest(FeatureSelection)
           else
             @info "skipping the doctests"
           end'
@@ -142,3 +143,4 @@ jobs:
       env:
         GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}
+
```

Project.toml

Lines changed: 2 additions & 2 deletions

```diff
@@ -1,7 +1,7 @@
 name = "FeatureSelection"
 uuid = "33837fe5-dbff-4c9e-8c2f-c5612fe2b8b6"
 authors = ["Anthony D. Blaom <[email protected]>", "Samuel Okon <[email protected]>"]
-version = "0.1.1"
+version = "0.2.0"

 [deps]
 MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea"
@@ -45,4 +45,4 @@ test = [
     "StableRNGs",
     "StatisticalMeasures",
     "Test"
-]
+]
```

README.md

Lines changed: 20 additions & 3 deletions

```diff
@@ -1,7 +1,24 @@
 # FeatureSelection.jl

-| Linux | Coverage | Code Style
-| :------------ | :------- | :------------- |
-| [![Build Status](https://github.com/JuliaAI/FeatureSelection.jl/workflows/CI/badge.svg)](https://github.com/JuliaAI/FeatureSelection.jl/actions) | [![Coverage](https://codecov.io/gh/JuliaAI/FeatureSelection.jl/branch/master/graph/badge.svg)](https://codecov.io/github/JuliaAI/FeatureSelection.jl?branch=dev) | [![Code Style: Blue](https://img.shields.io/badge/code%20style-blue-4495d1.svg)](https://github.com/invenia/BlueStyle) |
+| Linux | Coverage | Documentation | Code Style
+| :------------ | :------- | :------------- | :------------- |
+| [![Build Status](https://github.com/JuliaAI/FeatureSelection.jl/workflows/CI/badge.svg)](https://github.com/JuliaAI/FeatureSelection.jl/actions) | [![Coverage](https://codecov.io/gh/JuliaAI/FeatureSelection.jl/branch/dev/graph/badge.svg)](https://codecov.io/github/JuliaAI/FeatureSelection.jl?branch=dev) | [![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://juliaai.github.io/FeatureSelection.jl/dev/) | [![Code Style: Blue](https://img.shields.io/badge/code%20style-blue-4495d1.svg)](https://github.com/invenia/BlueStyle) |

 Repository housing feature selection algorithms for use with the machine learning toolbox [MLJ](https://juliaai.github.io/MLJ.jl/dev/).
+
+This package provides a collection of feature selection algorithms designed for use with MLJ, a powerful machine learning toolbox in Julia. It aims to facilitate the process of selecting the most relevant features from your datasets, enhancing the performance and interpretability of your machine learning models.
+
+## Key Features
+- Integration with MLJ: Seamlessly integrates with MLJ's extensive suite of tools and models.
+- Variety of Algorithms: Includes multiple feature selection algorithms to suit different types of data and models.
+- User-friendly: Easy to use with clear documentation and examples.
+
+## Getting Started
+To get started with this package, refer to the documentation for installation instructions, usage guides, and API references.
+
+## Contributing
+Contributions are welcome! Please refer to MLJ contributing [guidelines](https://github.com/JuliaAI/MLJ.jl/blob/dev/CONTRIBUTING.md) for more information.
+
+## License
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
```

docs/Project.toml

Lines changed: 2 additions & 0 deletions

```diff
@@ -1,11 +1,13 @@
 [deps]
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
 MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
+MLJDecisionTreeInterface = "c6f25543-311c-4c74-83dc-3ea6d1015661"
 FeatureSelection = "33837fe5-dbff-4c9e-8c2f-c5612fe2b8b6"
 StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3"

 [compat]
 Documenter = "^1.4"
 MLJ = "^0.20"
+MLJDecisionTreeInterface = "^0.4.2"
 StableRNGs = "^1.0"
 julia = "^1.0"
```

docs/make.jl

Lines changed: 1 addition & 0 deletions

```diff
@@ -30,5 +30,6 @@ makedocs(;
 deploydocs(;
     deploy_config = Documenter.GitHubActions(),
     repo="github.com/JuliaAI/FeatureSelection.jl.git",
+    devbranch="dev",
     push_preview=true
 )
```

docs/src/api.md

Lines changed: 5 additions & 0 deletions

````diff
@@ -6,4 +6,9 @@ CurrentModule = FeatureSelection
 ```@docs
 FeatureSelector
 RecursiveFeatureElimination
+```
+# Internal Utils
+```@docs
+abs_last
+score_features!
 ```
````

docs/src/index.md

Lines changed: 46 additions & 48 deletions

````diff
@@ -20,7 +20,7 @@ recursive feature elimination should return the first columns as important features
 ```@meta
 DocTestSetup = quote
     using MLJ, FeatureSelection, StableRNGs
-    rng = StableRNG(10)
+    rng = StableRNG(123)
     A = rand(rng, 50, 10)
     X = MLJ.table(A) # features
     y = @views(
@@ -52,16 +52,16 @@ end
 ```
 ```@example example1
 using MLJ, FeatureSelection, StableRNGs
-rng = StableRNG(10)
+rng = StableRNG(123)
 A = rand(rng, 50, 10)
 X = MLJ.table(A) # features
-y = @views(
-    10 .* sin.(
-        pi .* A[:, 1] .* A[:, 2]
-    ) .+ 20 .* (A[:, 3] .- 0.5).^ 2 .+ 10 .* A[:, 4] .+ 5 * A[:, 5]
+y = @views(
+    10 .* sin.(
+        pi .* A[:, 1] .* A[:, 2]
+    ) + 20 .* (A[:, 3] .- 0.5).^ 2 .+ 10 .* A[:, 4] .+ 5 * A[:, 5]
 ) # target
 ```
-Now that we have our data we can create our recursive feature elimination model and
+Now that we have our data, we can create our recursive feature elimination model and
 train it on our dataset
 ```@example example1
 RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree
@@ -74,51 +74,49 @@ fit!(mach)
 ```
 We can inspect the feature importances in two ways:
 ```jldoctest
-julia> report(mach).ranking
-10-element Vector{Int64}:
-  1
-  1
-  1
-  1
-  1
-  2
-  3
-  4
-  5
-  6
+julia> report(mach).scores
+Dict{Symbol, Int64} with 10 entries:
+  :x9  => 4
+  :x2  => 6
+  :x5  => 6
+  :x6  => 3
+  :x7  => 2
+  :x3  => 6
+  :x8  => 1
+  :x4  => 6
+  :x10 => 5
+  :x1  => 6

 julia> feature_importances(mach)
 10-element Vector{Pair{Symbol, Int64}}:
- :x1 => 6
- :x2 => 5
- :x3 => 4
- :x4 => 3
- :x5 => 2
- :x6 => 1
- :x7 => 1
+ :x9 => 4
+ :x2 => 6
+ :x5 => 6
+ :x6 => 3
+ :x7 => 2
+ :x3 => 6
  :x8 => 1
- :x9 => 1
- :x10 => 1
+ :x4 => 6
+ :x10 => 5
+ :x1 => 6
 ```
-Note that a variable with lower rank has more significance than a variable with higher rank while a variable with higher feature importance is better than a variable with lower feature importance.
-
 We can view the important features used by our model by inspecting the `fitted_params`
 object.
 ```jldoctest
 julia> p = fitted_params(mach)
-(features_left = [:x1, :x2, :x3, :x4, :x5],
+(features_left = [:x4, :x2, :x1, :x5, :x3],
 model_fitresult = (forest = Ensemble of Decision Trees
 Trees:      100
-Avg Leaves: 25.26
-Avg Depth:  8.36,),)
+Avg Leaves: 25.3
+Avg Depth:  8.01,),)

 julia> p.features_left
 5-element Vector{Symbol}:
- :x1
- :x2
- :x3
  :x4
+ :x2
+ :x1
  :x5
+ :x3
 ```
 We can also call the `predict` method on the fitted machine, to predict using a
 random forest regressor trained using only the important features, or call the `transform`
@@ -149,24 +147,24 @@ As before we can inspect the important features by inspecting the object returned
 ```jldoctest
 julia> fitted_params(self_tuning_rfe_mach).best_fitted_params.features_left
 5-element Vector{Symbol}:
- :x1
- :x2
- :x3
  :x4
+ :x2
+ :x1
  :x5
+ :x3

 julia> feature_importances(self_tuning_rfe_mach)
 10-element Vector{Pair{Symbol, Int64}}:
- :x1 => 6
- :x2 => 5
- :x3 => 4
- :x4 => 3
- :x5 => 2
- :x6 => 1
+ :x9 => 2
+ :x2 => 6
+ :x5 => 6
+ :x6 => 4
  :x7 => 1
- :x8 => 1
- :x9 => 1
- :x10 => 1
+ :x3 => 6
+ :x8 => 5
+ :x4 => 6
+ :x10 => 3
+ :x1 => 6
 ```
 and call `predict` on the tuned model machine as shown below
 ```@example example1
````
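Taken together, the index.md changes above amount to a revised recursive feature elimination walkthrough. As a quick orientation, the workflow the updated docs describe can be assembled from the snippets in the diff into one runnable sketch (hedged: this assumes MLJ, FeatureSelection, MLJDecisionTreeInterface, and StableRNGs are installed, and that the `RecursiveFeatureElimination` constructor accepts its base model via the `model` keyword, as the docs suggest):

```julia
using MLJ, FeatureSelection, StableRNGs

# Synthetic regression data from the docs: only the first five of ten
# columns actually influence the target.
rng = StableRNG(123)
A = rand(rng, 50, 10)
X = MLJ.table(A)  # features
y = @views(
    10 .* sin.(
        pi .* A[:, 1] .* A[:, 2]
    ) + 20 .* (A[:, 3] .- 0.5).^2 .+ 10 .* A[:, 4] .+ 5 * A[:, 5]
)  # target

# Wrap a base model in the recursive feature elimination meta-model.
RandomForestRegressor = @load RandomForestRegressor pkg=DecisionTree
rfe = RecursiveFeatureElimination(model=RandomForestRegressor())
mach = machine(rfe, X, y)
fit!(mach)

# Per-feature scores (post-0.2.0, `report` exposes `scores` rather than
# the old `ranking`; higher score means more important).
report(mach).scores

# The features retained after elimination.
fitted_params(mach).features_left
```

Note how this release flips the reporting convention: the removed prose in the diff explains that a *lower* `ranking` used to mean *more* significant, whereas the new `scores` dictionary and `feature_importances` output follow the usual higher-is-better convention.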
