Open
Changes from 1 commit
41 commits
f51274d 🌟 Initialize docs (EssamWisam, Sep 15, 2024)
1e8a421 🔥 Delete useless logo (EssamWisam, Sep 15, 2024)
924df68 ⭐️ Better documentation (EssamWisam, Sep 29, 2024)
28c1cb9 Merge pull request #29 from JuliaAI/dev (EssamWisam, Jun 18, 2025)
027f620 ✨ Better structure and definitions (EssamWisam, Jun 18, 2025)
03892b7 ✨ Link full list above (EssamWisam, Jun 20, 2025)
0263009 Create readme-reflect.yml (EssamWisam, Jun 24, 2025)
c0bb79d Update readme-reflect.yml (EssamWisam, Jun 24, 2025)
27da502 chore: update README from docs index (invalid-email-address, Jun 24, 2025)
9953cef ✨ Better org (EssamWisam, Jun 24, 2025)
b92eefb chore: update README from docs index (invalid-email-address, Jun 24, 2025)
24e7e0b ✨ Better style (EssamWisam, Jun 25, 2025)
cae2e9c Merge branch 'docs' of https://github.com/JuliaAI/MLJTransforms.jl in… (EssamWisam, Jun 25, 2025)
50e74d1 Update docs/src/index.md (EssamWisam, Jun 29, 2025)
e85e221 chore: update README from docs index (invalid-email-address, Jun 29, 2025)
4bbc4e7 ✨ Improve README, Transformers page and Entity Embedder page (EssamWisam, Jun 29, 2025)
02c8688 Update README.md (EssamWisam, Jun 29, 2025)
bc3b1df ✨ Improving internal documentation (EssamWisam, Jun 29, 2025)
d5af39a 💫 Add contributing and about pages (EssamWisam, Jun 29, 2025)
4270d25 🤩 Add four tutorials (EssamWisam, Aug 18, 2025)
a0244a1 Merge branch 'dev' into docs (EssamWisam, Aug 18, 2025)
d11d892 Update docs/src/index.md (EssamWisam, Aug 18, 2025)
14a8205 ✨ Contributions change (EssamWisam, Aug 24, 2025)
7fff9bf Update make.jl (EssamWisam, Aug 24, 2025)
2832695 Update make.jl (EssamWisam, Aug 24, 2025)
72d6ab2 Enable documentation deployment for docs branch (EssamWisam, Aug 24, 2025)
489dc42 Update documenter.yml (EssamWisam, Aug 24, 2025)
6cfd374 Update MLJFlux to v0.6.6 to include EntityEmbedder support (EssamWisam, Aug 24, 2025)
9f9d28e fix mljflux (EssamWisam, Aug 24, 2025)
09de647 ✨ Add CV analysis (EssamWisam, Aug 25, 2025)
f7ac80e ✨ improve entity embeddings tutorial (EssamWisam, Aug 27, 2025)
8bd0dc9 ✨ Improve standardization (EssamWisam, Aug 27, 2025)
2630f08 ✨ Fix high cardinality dataset (EssamWisam, Aug 28, 2025)
6dca39a fix docs (EssamWisam, Aug 28, 2025)
94f562b Update docs/src/tutorials/adult_example/notebook.md (EssamWisam, Aug 28, 2025)
c89a00d Update docs/src/tutorials/adult_example/notebook.jl (EssamWisam, Aug 28, 2025)
1e78e89 Update docs/src/transformers/all_transformers.md (EssamWisam, Sep 1, 2025)
3d72ff6 ✨ Fix links for contrast encoder (EssamWisam, Sep 1, 2025)
82c1631 👨‍🔧 More doc fixes (EssamWisam, Sep 1, 2025)
0fc26c0 Merge branch 'dev' into docs (EssamWisam, Sep 1, 2025)
c283429 Update all_transformers.md (EssamWisam, Sep 6, 2025)
30 changes: 30 additions & 0 deletions .github/workflows/documenter.yml
@@ -0,0 +1,30 @@
name: Documentation

on:
  push:
    branches:
      - dev
      - instate-docs
    tags: '*'
  pull_request:

jobs:
  build:
    permissions:
      contents: write
      pull-requests: read
      statuses: write
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: julia-actions/setup-julia@v2
        with:
          version: '1.10'
      - uses: julia-actions/cache@v1
      - name: Install dependencies
        run: julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()'
      - name: Build and deploy
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # If authenticating with GitHub Actions token
          DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }} # If authenticating with SSH deploy key
        run: julia --project=docs/ docs/make.jl
1 change: 1 addition & 0 deletions docs/make.jl
@@ -25,6 +25,7 @@ makedocs(
    pages = [
        "Introduction" => "index.md",
        "Transformers" => Any[
            "Numerical Transformers"=>"transformers/numerical.md",
            "Classical Encoders"=>"transformers/classical.md",
            "Neural-based Encoders"=>"transformers/neural.md",
            "Contrast Encoders"=>"transformers/contrast.md",
25 changes: 12 additions & 13 deletions docs/src/index.md
@@ -35,16 +35,15 @@ Xnew = transform(mach, X)
```

## Available Transformers
In `MLJTransforms` we define "encoders" to encompass models that specifically operate by encoding categorical variables; meanwhile, "transformers" refers to models that apply more generic transformations on columns that are not necessarily categorical. We define the following taxonomy for different models founds in `MLJTransforms`:

| Genre | Definition |
|:----------:|:----------:|
| **Classical Encoders** | Well known and commonly used categorical encoders |
| **Neural-based Encoders** | Categorical encoders based on neural networks |
| **Contrast Encoders** | Categorical encoders that could be modeled by a contrast matrix |
| **Utility Encoders** | Categorical encoders meant to be used as preprocessors for other encoders or models |
| **Other Transformers** | More generic transformers that go beyond categorical encoding |




In `MLJTransforms` we denote transformers that operate on columns with `Continuous` and/or `Count` [scientific types](https://juliaai.github.io/ScientificTypes.jl/dev/) as numerical transformers. Meanwhile, categorical transformers operate on `Multiclass` and/or `OrderedFactor` [scientific types](https://juliaai.github.io/ScientificTypes.jl/dev/). Most categorical transformers in this package operate by converting categorical values into numerical values or vectors, and are therefore considered categorical encoders.

Based on this, we categorize the methods as follows, with further distinctions for categorical encoders:

| **Category** | **Description** |
|:---------------------------:|:-------------------------------------------------------------------------------:|
| **Numerical Transformers** | Transformers that operate on `Continuous` or `Count` columns in a given dataset.|
| **Classical Encoders** | Widely recognized and frequently utilized categorical encoders. |
| **Neural-based Encoders** | Categorical encoders based on neural networks. |
| **Contrast Encoders** | Categorical encoders modeled via a contrast matrix. |
| **Utility Encoders** | Categorical encoders meant to be used as preprocessors for other encoders or models.|
| **Other Transformers** | Transformers that fall into other categories. |
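
The numerical/categorical split described above depends on each column's element scientific type. The following is a minimal sketch of that check using ScientificTypes.jl; the helper names `is_numerical` and `is_categorical` are hypothetical illustrations and are not part of `MLJTransforms`.

```julia
using ScientificTypes  # provides `elscitype`, `coerce`, and the scientific types

# Hypothetical helpers (not part of MLJTransforms): classify a column
# by the scientific type of its elements.
is_numerical(v)   = elscitype(v) <: Union{Continuous, Count}
is_categorical(v) = elscitype(v) <: Union{Multiclass, OrderedFactor}

x_cont  = [1.5, 2.0, 3.7]                                  # Continuous
x_count = [1, 2, 3]                                        # Count
x_multi = coerce(["a", "b", "a"], Multiclass)              # Multiclass{2}
x_ord   = coerce(["low", "high", "low"], OrderedFactor)    # OrderedFactor{2}

for (name, col) in [("x_cont", x_cont), ("x_count", x_count),
                    ("x_multi", x_multi), ("x_ord", x_ord)]
    println(name, ": numerical=", is_numerical(col),
            ", categorical=", is_categorical(col))
end
```

Under these assumptions, the first two vectors would be candidates for a numerical transformer, and the last two for a categorical encoder.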
3 changes: 0 additions & 3 deletions docs/src/transformers/contrast.md
@@ -5,10 +5,7 @@ Contrast Encoders include categorical encoders that could be modeled by a contra
| [DummyEncoder](@ref) | Encodes by comparing each level to the reference level, intercept being the cell mean of the reference group |
| [SumEncoder](@ref) | Encodes by comparing each level to the reference level, intercept being the grand mean |
| [HelmertEncoder](@ref) | Encodes by comparing levels of a variable with the mean of the subsequent levels of the variable
|
| [HelmertEncoder](@ref) | Encodes by comparing levels of a variable with the mean of the subsequent levels of the variable |
| [ForwardDifferenceEncoder](@ref) | Encodes by comparing adjacent levels of a variable (each level minus the next level)
|
| [ContrastEncoder](@ref) | Allows defining a custom contrast encoder via a contrast matrix |
| [HypothesisEncoder](@ref) | Allows defining a custom contrast encoder via a hypothesis matrix |

2 changes: 1 addition & 1 deletion docs/src/transformers/neural.md
@@ -1,4 +1,4 @@
Neural-based Encoders include ategorical encoders based on neural networks:
Neural-based Encoders include categorical encoders based on neural networks:

| Transformer | Brief Description |
|:----------:|:----------:|
25 changes: 25 additions & 0 deletions docs/src/transformers/numerical.md
@@ -0,0 +1,25 @@
Other Transformers include more generic transformers that go beyond categorical encoding

| Transformer | Brief Description |
|:----------:|:----------:|
| [Standardizer](@ref) | Transforming columns of numerical features by standardization |
| [BoxCoxTransformer](@ref) | Transforming columns of numerical features by BoxCox transformation |
| [UnivariateBoxCoxTransformer](@ref) | Apply BoxCox transformation given a single vector |
| [InteractionTransformer](@ref) | Transforming columns of numerical features to create new interaction features |
| [UnivariateDiscretizer](@ref) | Discretize a continuous vector into an ordered factor |

```@docs
MLJTransforms.Standardizer
```

```@docs
MLJTransforms.InteractionTransformer
```

```@docs
MLJTransforms.BoxCoxTransformer
```

```@docs
MLJTransforms.UnivariateDiscretizer
```
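
For reference, standardization as listed in the table above rescales a numerical column to zero mean and unit standard deviation. The following is a minimal sketch of that arithmetic in plain Julia; the function name `standardize` is hypothetical, and the sketch does not reproduce the `Standardizer` implementation or its options.

```julia
using Statistics  # mean, std

# Hypothetical helper: z-score a numeric vector (assumes std(x) > 0).
standardize(x) = (x .- mean(x)) ./ std(x)

x = [2.0, 4.0, 6.0, 8.0]
z = standardize(x)

# After standardization the sample mean is ~0 and the sample std is ~1.
println("mean(z) ≈ 0: ", isapprox(mean(z), 0.0; atol = 1e-12))
println("std(z)  ≈ 1: ", isapprox(std(z), 1.0; atol = 1e-12))
```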
25 changes: 2 additions & 23 deletions docs/src/transformers/others.md
@@ -1,35 +1,14 @@
Other Transformers include more generic transformers that go beyond categorical encoding
Transformers that operate on columns with general or specialized scientific types.

| Transformer | Brief Description |
|:----------:|:----------:|
| [FillImputer](@ref) | Fill missing values of any features belonging to any scientific type |
| [Standardizer](@ref) | Transforming columns of numerical features by standardization |
| [BoxCoxTransformer](@ref) | Transforming columns of numerical features by BoxCox transformation |
| [InteractionTransformer](@ref) | Transforming columns of numerical features by creating new interaction features |
| [UnivariateBoxCoxTransformer](@ref) | Apply BoxCox transformation given a single vector |
| [UnivariateDiscretizer](@ref) | Discretize a continuous vector into an ordered factor |
| [FillImputer](@ref) | Fill missing values of features belonging to any scientific type |
| [UnivariateTimeTypeToContinuous](@ref) | Transform a vector of time type into continuous type |

```@docs
MLJTransforms.FillImputer
```

```@docs
MLJTransforms.Standardizer
```

```@docs
MLJTransforms.BoxCoxTransformer
```

```@docs
MLJTransforms.InteractionTransformer
```


```@docs
MLJTransforms.UnivariateDiscretizer
```

```@docs
MLJTransforms.UnivariateTimeTypeToContinuous
4 changes: 2 additions & 2 deletions docs/src/transformers/utility.md
@@ -2,8 +2,8 @@ Utility Encoders include categorical encoders meant to be used as preprocessors

| Transformer | Brief Description |
|:----------:|:----------:|
| [CardinalityReducer](@ref) | Reduce cardinality of high cardinality features by grouping infrequent categories |
| [MissingnessEncoder](@ref) | Encode missing values of categorical columns into new values |
| [CardinalityReducer](@ref) | Reduce cardinality of high cardinality categorical features by grouping infrequent categories |
| [MissingnessEncoder](@ref) | Encode missing values of categorical features into new values |

```@docs
MLJTransforms.CardinalityReducer