Commit 76577b2

Merge branch 'master' into master

2 parents: 94923d5 + 6c95bf0

145 files changed: +23462 / -160 lines changed


README.md

Lines changed: 12 additions & 12 deletions
````diff
@@ -11,10 +11,10 @@ SynapseML requires Scala 2.12, Spark 3.4+, and Python 3.8+.
 | Topics | Links |
 | :------ | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
 | Build | [![Build Status](https://msdata.visualstudio.com/A365/_apis/build/status/microsoft.SynapseML?branchName=master)](https://msdata.visualstudio.com/A365/_build/latest?definitionId=17563&branchName=master) [![codecov](https://codecov.io/gh/Microsoft/SynapseML/branch/master/graph/badge.svg)](https://codecov.io/gh/Microsoft/SynapseML) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) |
-| Version | [![Version](https://img.shields.io/badge/version-1.0.10-blue)](https://github.com/Microsoft/SynapseML/releases) [![Release Notes](https://img.shields.io/badge/release-notes-blue)](https://github.com/Microsoft/SynapseML/releases) [![Snapshot Version](https://mmlspark.blob.core.windows.net/icons/badges/master_version3.svg)](#sbt) |
-| Docs | [![Website](https://img.shields.io/badge/SynapseML-Website-blue)](https://aka.ms/spark) [![Scala Docs](https://img.shields.io/static/v1?label=api%20docs&message=scala&color=blue&logo=scala)](https://mmlspark.blob.core.windows.net/docs/1.0.10/scala/index.html#package) [![PySpark Docs](https://img.shields.io/static/v1?label=api%20docs&message=python&color=blue&logo=python)](https://mmlspark.blob.core.windows.net/docs/1.0.10/pyspark/index.html) [![Academic Paper](https://img.shields.io/badge/academic-paper-7fdcf7)](https://arxiv.org/abs/1810.08744) |
+| Version | [![Version](https://img.shields.io/badge/version-1.0.11-blue)](https://github.com/Microsoft/SynapseML/releases) [![Release Notes](https://img.shields.io/badge/release-notes-blue)](https://github.com/Microsoft/SynapseML/releases) [![Snapshot Version](https://mmlspark.blob.core.windows.net/icons/badges/master_version3.svg)](#sbt) |
+| Docs | [![Website](https://img.shields.io/badge/SynapseML-Website-blue)](https://aka.ms/spark) [![Scala Docs](https://img.shields.io/static/v1?label=api%20docs&message=scala&color=blue&logo=scala)](https://mmlspark.blob.core.windows.net/docs/1.0.11/scala/index.html#package) [![PySpark Docs](https://img.shields.io/static/v1?label=api%20docs&message=python&color=blue&logo=python)](https://mmlspark.blob.core.windows.net/docs/1.0.11/pyspark/index.html) [![Academic Paper](https://img.shields.io/badge/academic-paper-7fdcf7)](https://arxiv.org/abs/1810.08744) |
 | Support | [![Gitter](https://badges.gitter.im/Microsoft/MMLSpark.svg)](https://gitter.im/Microsoft/MMLSpark?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge) [![Mail](https://img.shields.io/badge/mail-synapseml--support-brightgreen)](mailto:synapseml-support@microsoft.com) |
-| Binder | [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/microsoft/SynapseML/v1.0.10?labpath=notebooks%2Ffeatures) |
+| Binder | [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/microsoft/SynapseML/v1.0.11?labpath=notebooks%2Ffeatures) |
 | Usage | [![Downloads](https://static.pepy.tech/badge/synapseml)](https://pepy.tech/project/synapseml) |
 <!-- markdownlint-disable MD033 -->
 <details open>
@@ -119,7 +119,7 @@ In Azure Synapse notebooks please place the following in the first cell of your
 {
     "name": "synapseml",
     "conf": {
-        "spark.jars.packages": "com.microsoft.azure:synapseml_2.12:1.0.10",
+        "spark.jars.packages": "com.microsoft.azure:synapseml_2.12:1.0.11",
         "spark.jars.repositories": "https://mmlspark.azureedge.net/maven",
         "spark.jars.excludes": "org.scala-lang:scala-reflect,org.apache.spark:spark-tags_2.12,org.scalactic:scalactic_2.12,org.scalatest:scalatest_2.12,com.fasterxml.jackson.core:jackson-databind",
         "spark.yarn.user.classpath.first": "true",
@@ -155,15 +155,15 @@ cloud](http://community.cloud.databricks.com), create a new [library from Maven
 coordinates](https://docs.databricks.com/user-guide/libraries.html#libraries-from-maven-pypi-or-spark-packages)
 in your workspace.

-For the coordinates use: `com.microsoft.azure:synapseml_2.12:1.0.10`
+For the coordinates use: `com.microsoft.azure:synapseml_2.12:1.0.11`
 with the resolver: `https://mmlspark.azureedge.net/maven`. Ensure this library is
 attached to your target cluster(s).

 Finally, ensure that your Spark cluster has at least Spark 3.2 and Scala 2.12. If you encounter Netty dependency issues please use DBR 10.1.

 You can use SynapseML in both your Scala and PySpark notebooks. To get started with our example notebooks import the following databricks archive:

-`https://mmlspark.blob.core.windows.net/dbcs/SynapseMLExamplesv1.0.10.dbc`
+`https://mmlspark.blob.core.windows.net/dbcs/SynapseMLExamplesv1.0.11.dbc`

 ### Python Standalone

@@ -174,7 +174,7 @@ the above example, or from python:
 ```python
 import pyspark
 spark = pyspark.sql.SparkSession.builder.appName("MyApp") \
-    .config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:1.0.10") \
+    .config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:1.0.11") \
     .getOrCreate()
 import synapse.ml
 ```
@@ -185,9 +185,9 @@ SynapseML can be conveniently installed on existing Spark clusters via the
 `--packages` option, examples:

 ```bash
-spark-shell --packages com.microsoft.azure:synapseml_2.12:1.0.10
-pyspark --packages com.microsoft.azure:synapseml_2.12:1.0.10
-spark-submit --packages com.microsoft.azure:synapseml_2.12:1.0.10 MyApp.jar
+spark-shell --packages com.microsoft.azure:synapseml_2.12:1.0.11
+pyspark --packages com.microsoft.azure:synapseml_2.12:1.0.11
+spark-submit --packages com.microsoft.azure:synapseml_2.12:1.0.11 MyApp.jar
 ```

 ### SBT
@@ -196,7 +196,7 @@ If you are building a Spark application in Scala, add the following lines to
 your `build.sbt`:

 ```scala
-libraryDependencies += "com.microsoft.azure" % "synapseml_2.12" % "1.0.10"
+libraryDependencies += "com.microsoft.azure" % "synapseml_2.12" % "1.0.11"
 ```

 ### Apache Livy and HDInsight
@@ -210,7 +210,7 @@ Excluding certain packages from the library may be necessary due to current issu
 {
     "name": "synapseml",
     "conf": {
-        "spark.jars.packages": "com.microsoft.azure:synapseml_2.12:1.0.10",
+        "spark.jars.packages": "com.microsoft.azure:synapseml_2.12:1.0.11",
         "spark.jars.excludes": "org.scala-lang:scala-reflect,org.apache.spark:spark-tags_2.12,org.scalactic:scalactic_2.12,org.scalatest:scalatest_2.12,com.fasterxml.jackson.core:jackson-databind"
     }
 }
````

core/src/main/python/synapse/ml/core/logging/SynapseMLLogger.py

Lines changed: 2 additions & 1 deletion
```diff
@@ -6,6 +6,7 @@
 from synapse.ml.core.platform.Platform import (
     running_on_synapse_internal,
     running_on_synapse,
+    running_on_fabric_python,
 )
 from pyspark.sql.dataframe import DataFrame
 from pyspark import SparkContext
@@ -82,7 +83,7 @@ def get_required_log_fields(

     @classmethod
     def _get_environment_logger(cls, log_level: int) -> logging.Logger:
-        if running_on_synapse_internal():
+        if running_on_synapse_internal() or running_on_fabric_python():
             from synapse.ml.pymds.synapse_logger import get_mds_logger

             return get_mds_logger(
```
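The change above routes Fabric Python environments to the same platform (MDS) logger that internal Synapse already uses. A minimal sketch of this environment-conditional logger selection, with the platform checks passed in as booleans so it can run anywhere; the `"mds"`/`"default"` logger names are illustrative stand-ins, not SynapseML's real names:

```python
import logging

def get_environment_logger(on_internal: bool, on_fabric: bool) -> logging.Logger:
    # Mirrors the diff's branch: internal Synapse OR Fabric Python get the
    # platform logger; everything else falls back to a plain stdlib logger.
    if on_internal or on_fabric:
        return logging.getLogger("mds")  # stand-in for get_mds_logger(...)
    return logging.getLogger("default")

print(get_environment_logger(False, True).name)   # → mds
print(get_environment_logger(False, False).name)  # → default
```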

core/src/main/python/synapse/ml/core/platform/Platform.py

Lines changed: 8 additions & 0 deletions
```diff
@@ -5,11 +5,13 @@

 PLATFORM_SYNAPSE_INTERNAL = "synapse_internal"
 PLATFORM_SYNAPSE = "synapse"
+PLATFORM_FABRIC_PYTHON = "fabric_python_env"
 PLATFORM_BINDER = "binder"
 PLATFORM_DATABRICKS = "databricks"
 PLATFORM_UNKNOWN = "unknown"
 SECRET_STORE = "mmlspark-build-keys"
 SYNAPSE_PROJECT_NAME = "Microsoft.ProjectArcadia"
+FABRIC_URL = "fabric.microsoft.com"


 def current_platform():
@@ -26,6 +28,8 @@ def current_platform():
         return PLATFORM_DATABRICKS
     elif os.environ.get("BINDER_LAUNCH_HOST", None) is not None:
         return PLATFORM_BINDER
+    elif FABRIC_URL in os.environ.get("MSNOTEBOOKUTILS_SPARK_TRIDENT_PBIHOST", ""):
+        return PLATFORM_FABRIC_PYTHON
     else:
         return PLATFORM_UNKNOWN

@@ -46,6 +50,10 @@ def running_on_databricks():
     return current_platform() is PLATFORM_DATABRICKS


+def running_on_fabric_python():
+    return current_platform() is PLATFORM_FABRIC_PYTHON
+
+
 def find_secret(secret_name, keyvault):
     try:
         if running_on_synapse():
```
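The new branch keys Fabric detection off a single environment variable. A standalone sketch of the same check, with the environment passed in explicitly so it can be exercised without a real Fabric session (the variable name and URL come from the diff above; the injectable `environ` parameter is an addition for testability):

```python
import os

FABRIC_URL = "fabric.microsoft.com"

def running_on_fabric_python(environ=None):
    # Fabric Python is detected when the Fabric host URL appears in
    # MSNOTEBOOKUTILS_SPARK_TRIDENT_PBIHOST; the "" default keeps the
    # substring check safe when the variable is unset.
    env = os.environ if environ is None else environ
    return FABRIC_URL in env.get("MSNOTEBOOKUTILS_SPARK_TRIDENT_PBIHOST", "")

print(running_on_fabric_python({"MSNOTEBOOKUTILS_SPARK_TRIDENT_PBIHOST": "https://app.fabric.microsoft.com/x"}))  # → True
print(running_on_fabric_python({}))  # → False
```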

core/src/test/scala/com/microsoft/azure/synapse/ml/nbtest/DatabricksCPUTests.scala

Lines changed: 1 addition & 2 deletions
```diff
@@ -5,14 +5,13 @@ package com.microsoft.azure.synapse.ml.nbtest

 import com.microsoft.azure.synapse.ml.nbtest.DatabricksUtilities._

-import scala.collection.mutable.ListBuffer
 import scala.language.existentials

 class DatabricksCPUTests extends DatabricksTestHelper {

   val clusterId: String = createClusterInPool(ClusterName, AdbRuntime, NumWorkers, PoolId, memory = Some("7g"))

-  databricksTestHelper(clusterId, Libraries, CPUNotebooks)
+  databricksTestHelper(clusterId, Libraries, CPUNotebooks, 5)

   protected override def afterAll(): Unit = {
     afterAllHelper(clusterId, ClusterName)
```

core/src/test/scala/com/microsoft/azure/synapse/ml/nbtest/DatabricksGPUTests.scala

Lines changed: 1 addition & 6 deletions
```diff
@@ -3,18 +3,13 @@

 package com.microsoft.azure.synapse.ml.nbtest

-import com.microsoft.azure.synapse.ml.build.BuildInfo
-import com.microsoft.azure.synapse.ml.core.env.FileUtilities
 import com.microsoft.azure.synapse.ml.nbtest.DatabricksUtilities._

-import java.io.File
-import scala.collection.mutable.ListBuffer
-
 class DatabricksGPUTests extends DatabricksTestHelper {

   val clusterId: String = createClusterInPool(GPUClusterName, AdbGpuRuntime, 2, GpuPoolId)

-  databricksTestHelper(clusterId, GPULibraries, GPUNotebooks, 1)
+  databricksTestHelper(clusterId, GPULibraries, GPUNotebooks, 1, List())

   protected override def afterAll(): Unit = {
     afterAllHelper(clusterId, GPUClusterName)
```

core/src/test/scala/com/microsoft/azure/synapse/ml/nbtest/DatabricksRapidsTests.scala

Lines changed: 1 addition & 8 deletions
```diff
@@ -5,18 +5,11 @@ package com.microsoft.azure.synapse.ml.nbtest

 import com.microsoft.azure.synapse.ml.nbtest.DatabricksUtilities._

-import com.microsoft.azure.synapse.ml.build.BuildInfo
-import com.microsoft.azure.synapse.ml.core.env.FileUtilities
-import com.microsoft.azure.synapse.ml.nbtest.DatabricksUtilities._
-
-import java.io.File
-import scala.collection.mutable.ListBuffer
-
 class DatabricksRapidsTests extends DatabricksTestHelper {

   val clusterId: String = createClusterInPool(GPUClusterName, AdbGpuRuntime, 1, GpuPoolId, RapidsInitScripts)

-  databricksTestHelper(clusterId, GPULibraries, RapidsNotebooks)
+  databricksTestHelper(clusterId, GPULibraries, RapidsNotebooks, 4)

   protected override def afterAll(): Unit = {
     afterAllHelper(clusterId, RapidsClusterName)
```

core/src/test/scala/com/microsoft/azure/synapse/ml/nbtest/DatabricksUtilities.scala

Lines changed: 7 additions & 3 deletions
```diff
@@ -21,11 +21,12 @@ import spray.json.{JsArray, JsObject, JsValue, _}
 import java.io.{File, FileInputStream}
 import java.time.LocalDateTime
 import java.util.concurrent.{Executors, TimeUnit, TimeoutException}
-import scala.collection.immutable.Map
 import scala.collection.mutable
 import scala.concurrent.duration.Duration
 import scala.concurrent.{Await, ExecutionContext, Future, blocking}
 import scala.util.Random
+import com.microsoft.azure.synapse.ml.io.http.RESTHelpers.retry
+

 object DatabricksUtilities {

@@ -434,7 +435,8 @@ abstract class DatabricksTestHelper extends TestBase {
   def databricksTestHelper(clusterId: String,
                            libraries: String,
                            notebooks: Seq[File],
-                           maxConcurrency: Int = 8): Unit = {
+                           maxConcurrency: Int,
+                           retries: List[Int] = List(1000 * 15)): Unit = {

     println("Checking if cluster is active")
     tryWithRetries(Seq.fill(60 * 20)(1000).toArray) { () =>
@@ -455,7 +457,9 @@ abstract class DatabricksTestHelper extends TestBase {

     val futures = notebooks.map { notebook =>
       Future {
-        runNotebook(clusterId, notebook)
+        retry(retries, { () =>
+          runNotebook(clusterId, notebook)
+        })
       }
     }
     futures.zip(notebooks).foreach { case (f, nb) =>
```
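The change above wraps each notebook run in `retry` with a list of delays (here a single 15-second wait), so transient Databricks failures are retried before the test fails. A minimal Python sketch of that delays-as-remaining-attempts pattern; the function shape is assumed from the call site, not taken from `RESTHelpers`:

```python
import time

def retry(delays_ms, func):
    # Each entry in delays_ms buys one more attempt after a failure; when
    # the list is exhausted, the final attempt's exception propagates.
    for delay in delays_ms:
        try:
            return func()
        except Exception as e:
            print(f"attempt failed ({e}); retrying in {delay} ms")
            time.sleep(delay / 1000)
    return func()

# Hypothetical flaky operation: fails once, then succeeds.
calls = {"n": 0}
def flaky_notebook_run():
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient cluster error")
    return "SUCCESS"

print(retry([50], flaky_notebook_run))  # → SUCCESS on the second attempt
```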
