Commit 835b336

Release 0.43.0.
Author: kbuilder user
1 parent 3770c0a, commit 835b336

2 files changed: +57 -24 lines changed

CHANGES.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Release Notes
 
-## Next
+## 0.43.0 - 2025-10-17
 * Added new connector, `spark-4.0-bigquery` aimed to be used in Spark 4.0. Like Spark 4.0, this connector requires at
 least Java 17 runtime. It is currently in preview mode.
 * PR #1367: Query Pushdown is no longer supported.

README.md

Lines changed: 56 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -57,14 +57,14 @@ The latest version of the connector is publicly available in the following links
5757

5858
| version | Link |
5959
|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
60-
| Spark 3.5 | `gs://spark-lib/bigquery/spark-3.5-bigquery-0.42.2.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.5-bigquery-0.42.2.jar)) |
61-
| Spark 3.4 | `gs://spark-lib/bigquery/spark-3.4-bigquery-0.42.2.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.4-bigquery-0.42.2.jar)) |
62-
| Spark 3.3 | `gs://spark-lib/bigquery/spark-3.3-bigquery-0.42.2.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.3-bigquery-0.42.2.jar)) |
63-
| Spark 3.2 | `gs://spark-lib/bigquery/spark-3.2-bigquery-0.42.2.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.2-bigquery-0.42.2.jar)) |
64-
| Spark 3.1 | `gs://spark-lib/bigquery/spark-3.1-bigquery-0.42.2.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.1-bigquery-0.42.2.jar)) |
60+
| Spark 3.5 | `gs://spark-lib/bigquery/spark-3.5-bigquery-0.43.0.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.5-bigquery-0.43.0.jar)) |
61+
| Spark 3.4 | `gs://spark-lib/bigquery/spark-3.4-bigquery-0.43.0.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.4-bigquery-0.43.0.jar)) |
62+
| Spark 3.3 | `gs://spark-lib/bigquery/spark-3.3-bigquery-0.43.0.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.3-bigquery-0.43.0.jar)) |
63+
| Spark 3.2 | `gs://spark-lib/bigquery/spark-3.2-bigquery-0.43.0.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.2-bigquery-0.43.0.jar)) |
64+
| Spark 3.1 | `gs://spark-lib/bigquery/spark-3.1-bigquery-0.43.0.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-3.1-bigquery-0.43.0.jar)) |
6565
| Spark 2.4 | `gs://spark-lib/bigquery/spark-2.4-bigquery-0.37.0.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-2.4-bigquery-0.37.0.jar)) |
66-
| Scala 2.13 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.13-0.42.2.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.13-0.42.2.jar)) |
67-
| Scala 2.12 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.42.2.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.42.2.jar)) |
66+
| Scala 2.13 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.13-0.43.0.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.13-0.43.0.jar)) |
67+
| Scala 2.12 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.43.0.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.43.0.jar)) |
6868
| Scala 2.11 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.29.0.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.29.0.jar)) |
6969

7070
The first six versions are Java based connectors targeting Spark 2.4/3.1/3.2/3.3/3.4/3.5 of all Scala versions built on the new
@@ -107,14 +107,14 @@ repository. It can be used using the `--packages` option or the
 
 | version | Connector Artifact |
 |------------|------------------------------------------------------------------------------------|
-| Spark 3.5 | `com.google.cloud.spark:spark-3.5-bigquery:0.42.2` |
-| Spark 3.4 | `com.google.cloud.spark:spark-3.4-bigquery:0.42.2` |
-| Spark 3.3 | `com.google.cloud.spark:spark-3.3-bigquery:0.42.2` |
-| Spark 3.2 | `com.google.cloud.spark:spark-3.2-bigquery:0.42.2` |
-| Spark 3.1 | `com.google.cloud.spark:spark-3.1-bigquery:0.42.2` |
+| Spark 3.5 | `com.google.cloud.spark:spark-3.5-bigquery:0.43.0` |
+| Spark 3.4 | `com.google.cloud.spark:spark-3.4-bigquery:0.43.0` |
+| Spark 3.3 | `com.google.cloud.spark:spark-3.3-bigquery:0.43.0` |
+| Spark 3.2 | `com.google.cloud.spark:spark-3.2-bigquery:0.43.0` |
+| Spark 3.1 | `com.google.cloud.spark:spark-3.1-bigquery:0.43.0` |
 | Spark 2.4 | `com.google.cloud.spark:spark-2.4-bigquery:0.37.0` |
-| Scala 2.13 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.13:0.42.2` |
-| Scala 2.12 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.42.2` |
+| Scala 2.13 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.13:0.43.0` |
+| Scala 2.12 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.43.0` |
 | Scala 2.11 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.29.0` |
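
The artifact names in the table above follow a regular pattern, so a coordinate can be assembled from the Spark (or Scala) version and the connector version. The following is a minimal sketch; the helper names `spark_artifact` and `scala_artifact` are hypothetical and are not part of the connector.

```python
def spark_artifact(spark_version: str, connector_version: str) -> str:
    """Build the coordinate of a Spark-specific connector (hypothetical helper)."""
    return f"com.google.cloud.spark:spark-{spark_version}-bigquery:{connector_version}"


def scala_artifact(scala_version: str, connector_version: str) -> str:
    """Build the coordinate of a Scala-specific connector (hypothetical helper)."""
    return (
        "com.google.cloud.spark:"
        f"spark-bigquery-with-dependencies_{scala_version}:{connector_version}"
    )


print(spark_artifact("3.5", "0.43.0"))
print(scala_artifact("2.12", "0.43.0"))
```

The resulting string can be passed to `--packages` or to `spark.jars.packages`. Note that the Spark 2.4 and Scala 2.11 rows stay on older releases (0.37.0 and 0.29.0), so the connector version has to be chosen per row.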
 
 ### Specifying the Spark BigQuery connector version in a Dataproc cluster
@@ -124,8 +124,8 @@ Using the standard `--jars` or `--packages` (or alternatively, the `spark.jars`/
 
 To use another version than the built-in one, please do one of the following:
 
-* For Dataproc clusters, using image 2.1 and above, add the following flag on cluster creation to upgrade the version `--metadata SPARK_BQ_CONNECTOR_VERSION=0.42.2`, or `--metadata SPARK_BQ_CONNECTOR_URL=gs://spark-lib/bigquery/spark-3.3-bigquery-0.42.2.jar` to create the cluster with a different jar. The URL can point to any valid connector JAR for the cluster's Spark version.
-* For Dataproc serverless batches, add the following property on batch creation to upgrade the version: `--properties dataproc.sparkBqConnector.version=0.42.2`, or `--properties dataproc.sparkBqConnector.uri=gs://spark-lib/bigquery/spark-3.3-bigquery-0.42.2.jar` to create the batch with a different jar. The URL can point to any valid connector JAR for the runtime's Spark version.
+* For Dataproc clusters, using image 2.1 and above, add the following flag on cluster creation to upgrade the version `--metadata SPARK_BQ_CONNECTOR_VERSION=0.43.0`, or `--metadata SPARK_BQ_CONNECTOR_URL=gs://spark-lib/bigquery/spark-3.3-bigquery-0.43.0.jar` to create the cluster with a different jar. The URL can point to any valid connector JAR for the cluster's Spark version.
+* For Dataproc serverless batches, add the following property on batch creation to upgrade the version: `--properties dataproc.sparkBqConnector.version=0.43.0`, or `--properties dataproc.sparkBqConnector.uri=gs://spark-lib/bigquery/spark-3.3-bigquery-0.43.0.jar` to create the batch with a different jar. The URL can point to any valid connector JAR for the runtime's Spark version.
 
 ## Hello World Example
 
@@ -135,7 +135,7 @@ You can run a simple PySpark wordcount against the API without compilation by ru
 
 ```
 gcloud dataproc jobs submit pyspark --cluster "$MY_CLUSTER" \
---jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.42.2.jar \
+--jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.43.0.jar \
 examples/python/shakespeare.py
 ```
 
@@ -169,6 +169,16 @@ import com.google.cloud.spark.bigquery._
 val df = spark.read.bigquery("bigquery-public-data.samples.shakespeare")
 ```
 
+The connector supports reading from tables that contain spaces in their names.
+
+**Note on ambiguous table names**: If a table name contains both spaces and a SQL keyword (e.g., "from", "where", "join"), it may be misinterpreted as a SQL query. To resolve this ambiguity, quote the table identifier with backticks \`. For example:
+
+```
+df = spark.read \
+.format("bigquery") \
+.load("`my_project.my_dataset.orders from 2023`")
+```
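
The quoting rule described in the note above could be applied by a small helper before the name is passed to `load`. This is a sketch only: `quote_table_id` is a hypothetical name, and quoting every name that contains whitespace (rather than only names that also contain a SQL keyword) is a simplifying assumption that has not been verified against the connector.

```python
def quote_table_id(table_id: str) -> str:
    """Wrap a table identifier in backticks if it contains whitespace.

    Assumption: quoting any whitespace-containing name is acceptable.
    Identifiers that are already backtick-quoted are returned unchanged.
    """
    if table_id.startswith("`") and table_id.endswith("`"):
        return table_id
    if any(ch.isspace() for ch in table_id):
        return f"`{table_id}`"
    return table_id


print(quote_table_id("my_project.my_dataset.orders from 2023"))
print(quote_table_id("bigquery-public-data.samples.shakespeare"))
```

With a Spark session available, the result would be used as `spark.read.format("bigquery").load(quote_table_id(name))`.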
+
 For more information, see additional code samples in
 [Python](examples/python/shakespeare.py),
 [Scala](spark-bigquery-dsv1/src/main/scala/com/google/cloud/spark/bigquery/examples/Shakespeare.scala)
@@ -357,6 +367,19 @@ df.writeStream \
 
 **Important:** The connector does not configure the GCS connector, in order to avoid conflict with another GCS connector, if exists. In order to use the write capabilities of the connector, please configure the GCS connector on your cluster as explained [here](https://github.com/GoogleCloudPlatform/bigdata-interop/tree/master/gcs).
 
+### Running SQL on BigQuery
+
+The connector supports Spark's [SparkSession#executeCommand](https://archive.apache.org/dist/spark/docs/3.0.0/api/java/org/apache/spark/sql/SparkSession.html#executeCommand-java.lang.String-java.lang.String-scala.collection.immutable.Map-)
+with the Spark-X.Y-bigquery connectors. It can be used to run any arbitrary DDL/DML StandardSQL statement on BigQuery as
+a query job. `SELECT` statements are not supported, as those are supported by reading from query as shown above. It can
+be used as follows:
+```
+spark.executeCommand("bigquery", sql, options)
+```
+Notice the following:
+* Notice that apart from the authentication options no other options are supported by this functionality.
+* This API is available only in the Scala/Java API. PySpark does not provide it.
+
 ### Properties
 
 The API Supports a number of options to configure the read
@@ -925,6 +948,16 @@ word-break:break-word
 </td>
 <td>Read/Write</td>
 </tr>
+<tr>
+<td><code>credentialsScopes</code>
+</td>
+<td>Replaces the scopes of the Google Credentials if the credentials type supports that.
+If scope replacement is not supported then it does nothing.
+<br/>The value should be a comma separated list of valid scopes.
+<br/> (Optional)
+</td>
+<td>Read/Write</td>
+</tr>
 </table>
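
The `credentialsScopes` row above asks for a comma separated list of scopes. A minimal sketch of building that value in PySpark code follows; the helper name `credentials_scopes_option` is hypothetical, and the two scope URLs are only illustrative choices.

```python
from typing import Dict, Iterable


def credentials_scopes_option(scopes: Iterable[str]) -> Dict[str, str]:
    """Return a connector option dict with scopes joined by commas (hypothetical helper)."""
    cleaned = [s.strip() for s in scopes if s.strip()]
    if not cleaned:
        raise ValueError("at least one scope is required")
    return {"credentialsScopes": ",".join(cleaned)}


# Illustrative scopes; pick the ones your credentials actually need.
opts = credentials_scopes_option([
    "https://www.googleapis.com/auth/bigquery",
    "https://www.googleapis.com/auth/cloud-platform",
])
print(opts["credentialsScopes"])
```

With a Spark session available, the dict could be passed as `spark.read.format("bigquery").options(**opts).load("dataset.table")`. As the table row states, the option has no effect when the credentials type does not support scope replacement.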
 
 Options can also be set outside of the code, using the `--conf` parameter of `spark-submit` or `--properties` parameter
@@ -1196,7 +1229,7 @@ using the following code:
 ```python
 from pyspark.sql import SparkSession
 spark = SparkSession.builder \
-.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.42.2") \
+.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.43.0") \
 .getOrCreate()
 df = spark.read.format("bigquery") \
 .load("dataset.table")
@@ -1205,15 +1238,15 @@ df = spark.read.format("bigquery") \
 **Scala:**
 ```scala
 val spark = SparkSession.builder
-.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.42.2")
+.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.43.0")
 .getOrCreate()
 val df = spark.read.format("bigquery")
 .load("dataset.table")
 ```
 
 In case Spark cluster is using Scala 2.12 (it's optional for Spark 2.4.x,
 mandatory in 3.0.x), then the relevant package is
-com.google.cloud.spark:spark-bigquery-with-dependencies_**2.12**:0.42.2. In
+com.google.cloud.spark:spark-bigquery-with-dependencies_**2.12**:0.43.0. In
 order to know which Scala version is used, please run the following code:
 
 **Python:**
@@ -1237,14 +1270,14 @@ To include the connector in your project:
 <dependency>
 <groupId>com.google.cloud.spark</groupId>
 <artifactId>spark-bigquery-with-dependencies_${scala.version}</artifactId>
-<version>0.42.2</version>
+<version>0.43.0</version>
 </dependency>
 ```
 
 ### SBT
 
 ```sbt
-libraryDependencies += "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % "0.42.2"
+libraryDependencies += "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % "0.43.0"
 ```
 
 ### Connector metrics and how to view them
@@ -1289,7 +1322,7 @@ word-break:break-word
 </table>
 
 
-**Note:** To use the metrics in the Spark UI page, you need to make sure the `spark-bigquery-metrics-0.42.2.jar` is the class path before starting the history-server and the connector version is `spark-3.2` or above.
+**Note:** To use the metrics in the Spark UI page, you need to make sure the `spark-bigquery-metrics-0.43.0.jar` is the class path before starting the history-server and the connector version is `spark-3.2` or above.
 
 ## FAQ
 