Releases · GoogleCloudDataproc/spark-bigquery-connector

02 Jun 15:23

dataproc-robot

0.31.0

2810182

⚠️ Breaking change: BigNumeric conversion has changed, and BigNumeric is now converted to Spark's Decimal data type. Note that BigNumeric can have a wider precision than Decimal, so additional settings may be needed. See the connector documentation for additional details.
Issue #945: Fixed the inability to add a new column even with the allowFieldAddition option set
PR #965: Fix to reuse the same BigQueryClient for the same BigQueryConfig, rather than creating a new one
PR #950: Added support for service account impersonation
PR #960: Added support for basic configuration of the gRPC channel pool size in the BigQueryReadClient.
PR #973: Added support for writing to CMEK managed tables.
PR #971: Fixing wrong results or schema error when Spark nested schema pruning is on for datasource v2
PR #974: Applying DPP to Hive partitioned BigLake tables (spark-3.2-bigquery and spark-3.3-bigquery only)
PR #986: CVE-2020-8908, CVE-2023-2976: Upgrading Guava to version 32.0-jre
BigQuery API has been upgraded to version 2.26.0
BigQuery Storage API has been upgraded to version 2.36.1
GAX has been upgraded to version 2.26.0
gRPC has been upgraded to version 1.55.1
Netty has been upgraded to version 4.1.92.Final
Protocol Buffers has been upgraded to version 3.23.0
PR #957: Added support for direct write with a subset field list.
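
To illustrate the BigNumeric note in this release: BigQuery's BIGNUMERIC allows up to 38 fractional digits plus roughly 38 integer digits, while Spark's DecimalType is capped at a total precision of 38. The following is a minimal sketch, in plain Python, of a check for whether a value fits a chosen DecimalType(precision, scale) without loss. The helper name `fits_spark_decimal` is hypothetical and not part of the connector; the connector's own settings are described in its documentation.

```python
from decimal import Decimal

SPARK_MAX_PRECISION = 38  # upper bound on precision for Spark's DecimalType


def fits_spark_decimal(value: Decimal, precision: int, scale: int) -> bool:
    """Hypothetical helper: True if `value` fits DecimalType(precision, scale) exactly."""
    if not 0 <= scale <= precision <= SPARK_MAX_PRECISION:
        raise ValueError("need 0 <= scale <= precision <= 38")
    _sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        return False  # NaN and Infinity cannot be stored as a fixed-point decimal
    digits = list(digits)
    # Drop trailing zeros so that 1.500 is treated like 1.5.
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
        exponent += 1
    if digits == [0]:
        return True  # zero fits any precision and scale
    fractional_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    return fractional_digits <= scale and integer_digits <= precision - scale
```

For example, a value with 39 integer digits is a valid BIGNUMERIC but cannot fit any Spark Decimal, so `fits_spark_decimal(Decimal("1" * 39), 38, 0)` returns False, while `fits_spark_decimal(Decimal("123.45"), 38, 9)` returns True.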

11 Apr 16:12

dataproc-robot

0.30.0

1f9ed24

The new connectors are out of preview and are now generally available! spark-2.4-bigquery, spark-3.1-bigquery, spark-3.2-bigquery and spark-3.3-bigquery are all GA and ready to be used in all workloads. Please refer to the compatibility matrix when using them.
The direct write method is out of preview and is now generally available!
spark-bigquery-with-dependencies_2.11 is no longer published. If a recent version of the Scala 2.11 connector is needed, it can be built by checking out the code and running ./mvnw install -Pdsv1_2.11.
Issue #522: Added support for Spark's Map type. Note that there are a few restrictions, as this is not a native BigQuery type.
Added support for reading BigQuery table snapshots.
BigQuery API has been upgraded to version 2.24.4
BigQuery Storage API has been upgraded to version 2.34.2
GAX has been upgraded to version 2.24.0
gRPC has been upgraded to version 1.54.0
Netty has been upgraded to version 4.1.90.Final
PR #944: Added support to set query job priority

03 Mar 19:54

dataproc-robot

0.29.0

21558d5

Added two new connectors, spark-3.2-bigquery and spark-3.3-bigquery, intended for Spark 3.2 and 3.3 respectively. These connectors implement new APIs and capabilities provided by the Spark Data Source V2 API. Both connectors are in preview mode.
Dynamic partition pruning is supported in preview mode by spark-3.2-bigquery and spark-3.3-bigquery.
This is the last version of the Spark BigQuery connector for Scala 2.11. The code will remain in the repository and can be compiled into a connector if needed.
PR #857: Fixed the repackaging of shaded AutoValue classes
BigQuery API has been upgraded to version 2.22.0
BigQuery Storage API has been upgraded to version 2.31.0
GAX has been upgraded to version 2.23.0
gRPC has been upgraded to version 1.53.0
Netty has been upgraded to version 4.1.89.Final

28 Feb 00:46

dataproc-robot

0.28.1

dfa7663

PR #904: Fixed premature client closing in certain cases, which caused a RejectedExecutionException to be thrown

10 Jan 01:15

dataproc-robot

0.28.0

2e162c4

Adding support for the JSON data type.
Thanks to @abhijeet-lele and @jonathan-ostrander for their contributions!
Issue #821: Fixing direct write of empty DataFrames
PR #832: Fixed client closing
Issue #838: Fixing unshaded artifacts
PR #848: Making schema comparison on write less strict
PR #852: Fixed enableListInference usage when using the default intermediate format
Jackson has been upgraded to version 2.14.1, addressing CVE-2022-42003
BigQuery API has been upgraded to version 2.20.0
BigQuery Storage API has been upgraded to version 2.27.0
GAX has been upgraded to version 2.20.1
Guice has been upgraded to version 5.1.0
gRPC has been upgraded to version 1.51.1
Netty has been upgraded to version 4.1.86.Final
Protocol Buffers has been upgraded to version 3.21.12

18 Oct 22:09

dataproc-robot

0.27.1

0b16b89

PR #792: Added ability to set table labels while writing to a BigQuery table
PR #796: Allowing custom BigQuery API endpoints
PR #803: Removed grpc-netty-shaded from the connector jar
Protocol Buffers has been upgraded to version 3.21.7, addressing CVE-2022-3171
BigQuery API has been upgraded to version 2.16.1
BigQuery Storage API has been upgraded to version 2.21.0
gRPC has been upgraded to version 1.49.1
Netty has been upgraded to version 4.1.82.Final

21 Sep 20:08

dataproc-robot

0.27.0

8c0a586

Added a new Scala 2.13 connector, aimed at Spark 3.2 and above
PR #750: Adding support for custom access token creation. See more here.
PR #745: Supporting load from query in spark-3.1-bigquery.
PR #767: Adding the option createReadSessionTimeoutInSeconds, to override the timeout for CreateReadSession.
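
As an illustration of the createReadSessionTimeoutInSeconds option from PR #767, here is a minimal PySpark-style sketch. It assumes `spark` is an existing SparkSession with the connector on its classpath; the function name `read_table` and its parameters are hypothetical and chosen only for this example.

```python
def read_table(spark, table: str, timeout_seconds: int):
    """Read a BigQuery table, overriding the CreateReadSession timeout.

    `spark` is assumed to be a SparkSession with the BigQuery connector available.
    """
    return (
        spark.read.format("bigquery")
        # Option name taken from the release note above.
        .option("createReadSessionTimeoutInSeconds", timeout_seconds)
        .load(table)
    )
```

A call might look like `read_table(spark, "my_dataset.my_table", 600)`, where the table name and timeout value are placeholders.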

18 Jul 17:44

dataproc-robot

0.26.0

4fa0584

All connectors now support the DIRECT write method, which uses the BigQuery Storage Write API and does not first write the data to GCS. The DIRECT write method is in preview mode.
spark-3.1-bigquery has been released in preview mode. This is a Java-only library implementing the Spark 3.1 DataSource v2 APIs.
BigQuery API has been upgraded to version 2.13.8
BigQuery Storage API has been upgraded to version 2.16.0
gRPC has been upgraded to version 1.47.0
Netty has been upgraded to version 4.1.79.Final
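
The DIRECT write method introduced in this release can be sketched as follows in PySpark style. This is a minimal example under stated assumptions: `df` is an existing Spark DataFrame, the connector is on the classpath, and the function name `write_direct` and the table name in the usage note are hypothetical.

```python
def write_direct(df, table: str, mode: str = "append") -> None:
    """Write a DataFrame to BigQuery via the Storage Write API.

    `df` is assumed to be a Spark DataFrame. With the direct method the data
    is not first staged in GCS.
    """
    (
        df.write.format("bigquery")
        .option("writeMethod", "direct")
        .mode(mode)
        .save(table)
    )
```

A call such as `write_direct(df, "my_dataset.my_table")` appends the rows of `df` to the given table.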

23 Jun 00:55

dataproc-robot

0.25.2

fe49790

PR #673: Added integration tests for BigLake external tables.
PR #674: Increasing default maxParallelism to 10K for BigLake external tables

13 Jun 21:33

dataproc-robot

0.25.1

dd885c6

Issue #651: Fixing the write back to BigQuery.
PR #664: Add support for BigLake external tables.
PR #667: Allowing clustering on unpartitioned tables.
PR #668: Using spark default parallelism as default.
