# Use Apache Spark MLlib to build a machine learning application and analyze a dataset
Learn how to use Apache Spark [MLlib](https://spark.apache.org/mllib/) to create a machine learning application. The application will do predictive analysis on an open dataset. From Spark's built-in machine learning libraries, this example uses *classification* through logistic regression.
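As a quick orientation, the following is a minimal sketch (not this article's actual notebook code) of what classification with logistic regression looks like in MLlib's DataFrame-based API. The input DataFrame `df` and the column names `feature1`, `feature2`, and `label` are hypothetical placeholders.

```PySpark
# Minimal sketch, not the article's notebook code: train a logistic regression
# classifier with MLlib's DataFrame-based API. The DataFrame `df` and its
# columns `feature1`, `feature2`, and `label` are hypothetical placeholders.
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler

# Combine the numeric input columns into a single feature vector column.
assembler = VectorAssembler(inputCols=["feature1", "feature2"], outputCol="features")
training = assembler.transform(df).select("features", "label")

# Fit the classifier and inspect a few predictions.
lr = LogisticRegression(maxIter=10, regParam=0.01)
model = lr.fit(training)
model.transform(training).select("label", "prediction").show(5)
```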
MLlib is a core Spark library that provides many utilities useful for machine learning tasks, such as:

* Classification
* Regression
* Clustering
* Topic modeling
* Singular value decomposition (SVD) and principal component analysis (PCA)
* Hypothesis testing and calculating sample statistics
Let's start to get a sense of what the dataset contains.
3. You can also use [Matplotlib](https://en.wikipedia.org/wiki/Matplotlib), a library used to construct visualizations of data, to create a plot. Because the plot must be created from the locally persisted **countResultsdf** dataframe, the code snippet must begin with the `%%local` magic. This action ensures that the code is run locally on the Jupyter server.
   ```PySpark
   %%local
   ```
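   The rest of this cell is not shown here. As a minimal sketch, it might continue like the following; the column names `results` and `cnt` are assumptions about the **countResultsdf** schema rather than values taken from the article.

   ```PySpark
   %%local
   %matplotlib inline
   import matplotlib.pyplot as plt

   # Assumed column names; countResultsdf is the pandas dataframe persisted
   # locally on the Jupyter server by an earlier cell (not shown here).
   labels = countResultsdf['results']
   sizes = countResultsdf['cnt']

   # Render a simple pie chart of the result counts.
   plt.pie(sizes, labels=labels, autopct='%1.1f%%')
   plt.axis('equal')
   plt.show()
   ```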
After you have finished running the application, you should shut down the notebook to release the resources.
## Next steps
* [Overview: Apache Spark on Azure HDInsight](apache-spark-overview.md)

### Scenarios

* [Apache Spark with BI: Interactive data analysis using Spark in HDInsight with BI tools](apache-spark-use-bi-tools.md)
* [Apache Spark with Machine Learning: Use Spark in HDInsight for analyzing building temperature using HVAC data](apache-spark-ipython-notebook-machine-learning.md)
* [Microsoft Cognitive Toolkit deep learning model with Azure HDInsight](apache-spark-microsoft-cognitive-toolkit.md)
* [Website log analysis using Apache Spark in HDInsight](apache-spark-custom-library-website-log-analysis.md)

### Create and run applications

* [Create a standalone application using Scala](apache-spark-create-standalone-application.md)
* [Run jobs remotely on an Apache Spark cluster using Apache Livy](apache-spark-livy-rest-interface.md)

### Tools and extensions

* [Use HDInsight Tools Plugin for IntelliJ IDEA to create and submit Spark Scala applications](apache-spark-intellij-tool-plugin.md)
* [Use HDInsight Tools Plugin for IntelliJ IDEA to debug Apache Spark applications remotely](apache-spark-intellij-tool-plugin-debug-jobs-remotely.md)
* [Use Apache Zeppelin notebooks with an Apache Spark cluster on HDInsight](apache-spark-zeppelin-notebook.md)
* [Kernels available for Jupyter notebook in Apache Spark cluster for HDInsight](apache-spark-jupyter-notebook-kernels.md)
* [Use external packages with Jupyter notebooks](apache-spark-jupyter-notebook-use-external-packages.md)
* [Install Jupyter on your computer and connect to an HDInsight Spark cluster](apache-spark-jupyter-notebook-install-locally.md)

### Manage resources

* [Manage resources for the Apache Spark cluster in Azure HDInsight](apache-spark-resource-manager.md)
* [Track and debug jobs running on an Apache Spark cluster in HDInsight](apache-spark-job-debugging.md)