diff --git a/FORK.md b/FORK.md
Lines changed: 5 additions & 0 deletions
diff --git a/docs/ml-features.md b/docs/ml-features.md
Lines changed: 15 additions & 9 deletions
diff --git a/docs/ml-guide.md b/docs/ml-guide.md
Lines changed: 0 additions & 4 deletions
diff --git a/examples/src/main/java/org/apache/spark/examples/ml/JavaOneHotEncoderExample.java b/examples/src/main/java/org/apache/spark/examples/ml/JavaOneHotEncoderEstimatorExample.java
Lines changed: 4 additions & 4 deletions
diff --git a/examples/src/main/python/ml/onehot_encoder_example.py b/examples/src/main/python/ml/onehot_encoder_estimator_example.py
Lines changed: 4 additions & 4 deletions
diff --git a/examples/src/main/scala/org/apache/spark/examples/ml/OneHotEncoderExample.scala b/examples/src/main/scala/org/apache/spark/examples/ml/OneHotEncoderEstimatorExample.scala
Lines changed: 4 additions & 4 deletions
@@ -30,3 +30,8 @@
 * [SPARK-25862](https://issues.apache.org/jira/browse/SPARK-25862) - Removal of `unboundedPreceding`, `unboundedFollowing`, `currentRow`
 * [SPARK-26127](https://issues.apache.org/jira/browse/SPARK-26127) - Removal of deprecated setters from tree regression and classification models
 * [SPARK-25867](https://issues.apache.org/jira/browse/SPARK-25867) - Removal of KMeans computeCost
+
+* e59507243d Robert Kruszewski 14 seconds ago  (HEAD -> rk/merge-again) Revert "[SPARK-26216][SQL] Do not use case class as public API (UserDefinedFunction)"
+* 8735a08f1b Robert Kruszewski 68 seconds ago  Revert "[SPARK-26216][SQL][FOLLOWUP] use abstract class instead of trait for UserDefinedFunction"
+* 1423024322 Robert Kruszewski 2 minutes ago  Revert "[SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any"
+* b0d256d21a Robert Kruszewski 2 minutes ago  Revert "[SPARK-26580][SQL] remove Scala 2.11 hack for Scala UDF"
@@ -781,37 +781,43 @@ for more details on the API.
 </div>
 </div>
 
-## OneHotEncoder
+## OneHotEncoder (Deprecated since 2.3.0)
+
+Because this existing `OneHotEncoder` is a stateless transformer, it is not usable on new data where the number of categories may differ from the training data. In order to fix this, a new `OneHotEncoderEstimator` was created that produces an `OneHotEncoderModel` when fitting. For more detail, please see [SPARK-13030](https://issues.apache.org/jira/browse/SPARK-13030).
+
+`OneHotEncoder` has been deprecated in 2.3.0 and will be removed in 3.0.0. Please use [OneHotEncoderEstimator](ml-features.html#onehotencoderestimator) instead.
+
+## OneHotEncoderEstimator
 
 [One-hot encoding](http://en.wikipedia.org/wiki/One-hot) maps a categorical feature, represented as a label index, to a binary vector with at most a single one-value indicating the presence of a specific feature value from among the set of all feature values. This encoding allows algorithms which expect continuous features, such as Logistic Regression, to use categorical features. For string type input data, it is common to encode categorical features using [StringIndexer](ml-features.html#stringindexer) first.
 
-`OneHotEncoder` can transform multiple columns, returning an one-hot-encoded output vector column for each input column. It is common to merge these vectors into a single feature vector using [VectorAssembler](ml-features.html#vectorassembler).
+`OneHotEncoderEstimator` can transform multiple columns, returning an one-hot-encoded output vector column for each input column. It is common to merge these vectors into a single feature vector using [VectorAssembler](ml-features.html#vectorassembler).
 
-`OneHotEncoder` supports the `handleInvalid` parameter to choose how to handle invalid input during transforming data. Available options include 'keep' (any invalid inputs are assigned to an extra categorical index) and 'error' (throw an error).
+`OneHotEncoderEstimator` supports the `handleInvalid` parameter to choose how to handle invalid input during transforming data. Available options include 'keep' (any invalid inputs are assigned to an extra categorical index) and 'error' (throw an error).
 
 **Examples**
 
 <div class="codetabs">
 <div data-lang="scala" markdown="1">
 
-Refer to the [OneHotEncoder Scala docs](api/scala/index.html#org.apache.spark.ml.feature.OneHotEncoder) for more details on the API.
+Refer to the [OneHotEncoderEstimator Scala docs](api/scala/index.html#org.apache.spark.ml.feature.OneHotEncoderEstimator) for more details on the API.
 
-{% include_example scala/org/apache/spark/examples/ml/OneHotEncoderExample.scala %}
+{% include_example scala/org/apache/spark/examples/ml/OneHotEncoderEstimatorExample.scala %}
 </div>
 
 <div data-lang="java" markdown="1">
 
-Refer to the [OneHotEncoder Java docs](api/java/org/apache/spark/ml/feature/OneHotEncoder.html)
+Refer to the [OneHotEncoderEstimator Java docs](api/java/org/apache/spark/ml/feature/OneHotEncoderEstimator.html)
 for more details on the API.
 
-{% include_example java/org/apache/spark/examples/ml/JavaOneHotEncoderExample.java %}
+{% include_example java/org/apache/spark/examples/ml/JavaOneHotEncoderEstimatorExample.java %}
 </div>
 
 <div data-lang="python" markdown="1">
 
-Refer to the [OneHotEncoder Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.OneHotEncoder) for more details on the API.
+Refer to the [OneHotEncoderEstimator Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.OneHotEncoderEstimator) for more details on the API.
 
-{% include_example python/ml/onehot_encoder_example.py %}
+{% include_example python/ml/onehot_encoder_estimator_example.py %}
 </div>
 </div>
 
 
@@ -106,10 +106,6 @@ and the migration guide below will explain all changes between releases.
 
 ## From 2.4 to 3.0
 
-### Breaking changes
-
-* `OneHotEncoder` which is deprecated in 2.3, is removed in 3.0 and `OneHotEncoderEstimator` is now renamed to `OneHotEncoder`.
-
 ### Changes of behavior
 
 * [SPARK-11215](https://issues.apache.org/jira/browse/SPARK-11215):
 
@@ -23,7 +23,7 @@
 import java.util.Arrays;
 import java.util.List;
 
-import org.apache.spark.ml.feature.OneHotEncoder;
+import org.apache.spark.ml.feature.OneHotEncoderEstimator;
 import org.apache.spark.ml.feature.OneHotEncoderModel;
 import org.apache.spark.sql.Dataset;
 import org.apache.spark.sql.Row;
@@ -34,11 +34,11 @@
 import org.apache.spark.sql.types.StructType;
 // $example off$
 
-public class JavaOneHotEncoderExample {
+public class JavaOneHotEncoderEstimatorExample {
   public static void main(String[] args) {
     SparkSession spark = SparkSession
       .builder()
-      .appName("JavaOneHotEncoderExample")
+      .appName("JavaOneHotEncoderEstimatorExample")
       .getOrCreate();
 
     // Note: categorical features are usually first encoded with StringIndexer
@@ -59,7 +59,7 @@ public static void main(String[] args) {
 
     Dataset<Row> df = spark.createDataFrame(data, schema);
 
-    OneHotEncoder encoder = new OneHotEncoder()
+    OneHotEncoderEstimator encoder = new OneHotEncoderEstimator()
       .setInputCols(new String[] {"categoryIndex1", "categoryIndex2"})
       .setOutputCols(new String[] {"categoryVec1", "categoryVec2"});
 
 
@@ -18,14 +18,14 @@
 from __future__ import print_function
 
 # $example on$
-from pyspark.ml.feature import OneHotEncoder
+from pyspark.ml.feature import OneHotEncoderEstimator
 # $example off$
 from pyspark.sql import SparkSession
 
 if __name__ == "__main__":
     spark = SparkSession\
         .builder\
-        .appName("OneHotEncoderExample")\
+        .appName("OneHotEncoderEstimatorExample")\
         .getOrCreate()
 
     # Note: categorical features are usually first encoded with StringIndexer
@@ -39,8 +39,8 @@
         (2.0, 0.0)
     ], ["categoryIndex1", "categoryIndex2"])
 
-    encoder = OneHotEncoder(inputCols=["categoryIndex1", "categoryIndex2"],
-                            outputCols=["categoryVec1", "categoryVec2"])
+    encoder = OneHotEncoderEstimator(inputCols=["categoryIndex1", "categoryIndex2"],
+                                     outputCols=["categoryVec1", "categoryVec2"])
     model = encoder.fit(df)
     encoded = model.transform(df)
     encoded.show()
 
@@ -19,15 +19,15 @@
 package org.apache.spark.examples.ml
 
 // $example on$
-import org.apache.spark.ml.feature.OneHotEncoder
+import org.apache.spark.ml.feature.OneHotEncoderEstimator
 // $example off$
 import org.apache.spark.sql.SparkSession
 
-object OneHotEncoderExample {
+object OneHotEncoderEstimatorExample {
   def main(args: Array[String]): Unit = {
     val spark = SparkSession
       .builder
-      .appName("OneHotEncoderExample")
+      .appName("OneHotEncoderEstimatorExample")
       .getOrCreate()
 
     // Note: categorical features are usually first encoded with StringIndexer
@@ -41,7 +41,7 @@ object OneHotEncoderExample {
       (2.0, 0.0)
     )).toDF("categoryIndex1", "categoryIndex2")
 
-    val encoder = new OneHotEncoder()
+    val encoder = new OneHotEncoderEstimator()
       .setInputCols(Array("categoryIndex1", "categoryIndex2"))
       .setOutputCols(Array("categoryVec1", "categoryVec2"))
     val model = encoder.fit(df)
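
The docs hunk above says a stateless one-hot transformer is not usable on new data whose number of categories may differ from the training data, and that `handleInvalid` accepts 'keep' (an extra categorical index) or 'error'. The following is a minimal pure-Python sketch of that fit/transform split, not Spark's implementation. The names `OneHotSketchModel` and `fit_one_hot` and the sample rows are hypothetical, and the sketch simplifies by returning dense lists and by not dropping the last category.

```python
class OneHotSketchModel:
    """Hypothetical fitted state: number of categories learned per column."""

    def __init__(self, category_sizes, handle_invalid="error"):
        self.category_sizes = category_sizes
        self.handle_invalid = handle_invalid

    def transform_value(self, col, index):
        """Encode one category index of column `col` as a dense 0/1 list."""
        size = self.category_sizes[col]
        keep = self.handle_invalid == "keep"
        # 'keep' reserves one extra slot for values not seen during fitting.
        vec = [0.0] * (size + 1 if keep else size)
        valid = float(index).is_integer() and 0 <= int(index) < size
        if valid:
            vec[int(index)] = 1.0
        elif keep:
            vec[size] = 1.0
        else:
            raise ValueError(
                "unseen category index %r in column %d" % (index, col))
        return vec


def fit_one_hot(rows, handle_invalid="error"):
    """Learn the category count of each column from the training rows."""
    n_cols = len(rows[0])
    sizes = [int(max(row[c] for row in rows)) + 1 for c in range(n_cols)]
    return OneHotSketchModel(sizes, handle_invalid)


# Made-up training rows of already-indexed categories (illustration only).
train = [(0.0, 1.0), (1.0, 0.0), (2.0, 1.0), (0.0, 2.0)]

model = fit_one_hot(train, handle_invalid="keep")
print(model.category_sizes)            # sizes fixed at fit time
print(model.transform_value(0, 1.0))   # a category seen during fitting
print(model.transform_value(0, 5.0))   # unseen value lands in the extra slot
```

Because the vector width is fixed when the model is fitted, later data with a different set of categories still yields vectors of the same length, which is the property the deprecation note attributes to the estimator-based design.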