FORK.md: 5 additions & 0 deletions
@@ -30,3 +30,8 @@
 * [SPARK-25862](https://issues.apache.org/jira/browse/SPARK-25862) - Removal of `unboundedPreceding`, `unboundedFollowing`, `currentRow`
 * [SPARK-26127](https://issues.apache.org/jira/browse/SPARK-26127) - Removal of deprecated setters from tree regression and classification models
 * [SPARK-25867](https://issues.apache.org/jira/browse/SPARK-25867) - Removal of KMeans computeCost
+
+* e59507243d Robert Kruszewski 14 seconds ago (HEAD -> rk/merge-again) Revert "[SPARK-26216][SQL] Do not use case class as public API (UserDefinedFunction)"
+* 8735a08f1b Robert Kruszewski 68 seconds ago Revert "[SPARK-26216][SQL][FOLLOWUP] use abstract class instead of trait for UserDefinedFunction"
+* 1423024322 Robert Kruszewski 2 minutes ago Revert "[SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any"
+* b0d256d21a Robert Kruszewski 2 minutes ago Revert "[SPARK-26580][SQL] remove Scala 2.11 hack for Scala UDF"
docs/ml-features.md: 15 additions & 9 deletions
@@ -781,37 +781,43 @@ for more details on the API.
 </div>
 </div>
 
-## OneHotEncoder
+## OneHotEncoder (Deprecated since 2.3.0)
+
+Because the existing `OneHotEncoder` is a stateless transformer, it is not usable on new data where the number of categories may differ from the training data. To fix this, a new `OneHotEncoderEstimator` was created that produces a `OneHotEncoderModel` when fitted. For more detail, please see [SPARK-13030](https://issues.apache.org/jira/browse/SPARK-13030).
+
+`OneHotEncoder` has been deprecated in 2.3.0 and will be removed in 3.0.0. Please use [OneHotEncoderEstimator](ml-features.html#onehotencoderestimator) instead.
+
+## OneHotEncoderEstimator
 
 [One-hot encoding](http://en.wikipedia.org/wiki/One-hot) maps a categorical feature, represented as a label index, to a binary vector with at most a single one-value indicating the presence of a specific feature value from among the set of all feature values. This encoding allows algorithms which expect continuous features, such as Logistic Regression, to use categorical features. For string type input data, it is common to encode categorical features using [StringIndexer](ml-features.html#stringindexer) first.
 
-`OneHotEncoder` can transform multiple columns, returning an one-hot-encoded output vector column for each input column. It is common to merge these vectors into a single feature vector using [VectorAssembler](ml-features.html#vectorassembler).
+`OneHotEncoderEstimator` can transform multiple columns, returning a one-hot-encoded output vector column for each input column. It is common to merge these vectors into a single feature vector using [VectorAssembler](ml-features.html#vectorassembler).
 
-`OneHotEncoder` supports the `handleInvalid` parameter to choose how to handle invalid input during transforming data. Available options include 'keep' (any invalid inputs are assigned to an extra categorical index) and 'error' (throw an error).
+`OneHotEncoderEstimator` supports the `handleInvalid` parameter to choose how to handle invalid input when transforming data. Available options include 'keep' (any invalid inputs are assigned to an extra categorical index) and 'error' (throw an error).
 
 **Examples**
 
 <div class="codetabs">
 <div data-lang="scala" markdown="1">
 
-Refer to the [OneHotEncoder Scala docs](api/scala/index.html#org.apache.spark.ml.feature.OneHotEncoder) for more details on the API.
+Refer to the [OneHotEncoderEstimator Scala docs](api/scala/index.html#org.apache.spark.ml.feature.OneHotEncoderEstimator) for more details on the API.
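
Reviewer note: the doc change above motivates the estimator by saying a stateless transformer cannot cope with new data whose category count differs from training. The following is a minimal plain-Python sketch of that idea, not Spark's implementation and not the Spark API. The names `OneHotSketchModel` and `fit` are hypothetical, and the sketch assumes the category count is simply the largest training index plus one. It only models the two `handleInvalid` options described in the diff ('keep' and 'error').

```python
from typing import Iterable, List


class OneHotSketchModel:
    """Hypothetical fitted state: the number of categories seen during fit()."""

    def __init__(self, num_categories: int, handle_invalid: str = "error") -> None:
        if handle_invalid not in ("keep", "error"):
            raise ValueError("handle_invalid must be 'keep' or 'error'")
        self.num_categories = num_categories
        self.handle_invalid = handle_invalid

    def transform(self, index: int) -> List[float]:
        """Encode one label index using the category count fixed at fit time."""
        # 'keep' reserves one extra slot at the end for invalid inputs.
        size = self.num_categories + (1 if self.handle_invalid == "keep" else 0)
        valid = 0 <= index < self.num_categories
        if not valid and self.handle_invalid == "error":
            raise ValueError(f"unseen category index: {index}")
        vector = [0.0] * size
        # Valid inputs set their own slot; invalid ones set the extra slot.
        vector[index if valid else self.num_categories] = 1.0
        return vector


def fit(indices: Iterable[int], handle_invalid: str = "error") -> OneHotSketchModel:
    """Learn the category count from training label indices (assumed non-empty)."""
    return OneHotSketchModel(max(indices) + 1, handle_invalid)


model = fit([0, 1, 2, 1], handle_invalid="keep")
print(model.transform(1))  # [0.0, 1.0, 0.0, 0.0]
print(model.transform(5))  # [0.0, 0.0, 0.0, 1.0]
```

Because the vector size is fixed when `fit` runs, training and later data are encoded with the same width, which is the property the stateless `OneHotEncoder` lacks according to the text above.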