@@ -20,7 +20,7 @@ In Java and Scala applications, you can use different dependency management
tools (e.g., Maven, sbt, or Gradle) to access the
connector `com.google.cloud.spark.bigtable:spark-bigtable_2.13:<version>` or
`com.google.cloud.spark.bigtable:spark-bigtable_2.12:<version>` (the current
- `<version>` is `0.7.2`) and package it inside your application JAR
+ `<version>` is `0.8.0`) and package it inside your application JAR
using libraries such as the Maven Shade Plugin. For PySpark applications, you can
use the `--jars` flag to pass the GCS address of the connector JAR when
submitting your job.
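
For example, a PySpark submission could look like the following sketch. The GCS
path of the released connector JAR is an assumption here (check the connector's
releases for the actual location), and `my_pyspark_job.py` is a hypothetical
script name:

```
# Pass the connector JAR to Spark at submission time (path is illustrative).
spark-submit \
  --jars gs://spark-lib/bigtable/spark-bigtable_2.12-0.8.0.jar \
  my_pyspark_job.py
```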
@@ -32,7 +32,7 @@ For Maven, you can add the following snippet to your `pom.xml` file:
<dependency>
  <groupId>com.google.cloud.spark.bigtable</groupId>
  <artifactId>spark-bigtable_2.13</artifactId>
-   <version>0.7.2</version>
+   <version>0.8.0</version>
</dependency>
```

@@ -41,20 +41,20 @@ For Maven, you can add the following snippet to your `pom.xml` file:
<dependency>
  <groupId>com.google.cloud.spark.bigtable</groupId>
  <artifactId>spark-bigtable_2.12</artifactId>
-   <version>0.7.2</version>
+   <version>0.8.0</version>
</dependency>
```

For sbt, you can add the following to your `build.sbt` file:

```
// for scala 2.13
- libraryDependencies += "com.google.cloud.spark.bigtable" % "spark-bigtable_2.13" % "0.7.2"
+ libraryDependencies += "com.google.cloud.spark.bigtable" % "spark-bigtable_2.13" % "0.8.0"
```

```
// for scala 2.12
- libraryDependencies += "com.google.cloud.spark.bigtable" % "spark-bigtable_2.12" % "0.7.2"
+ libraryDependencies += "com.google.cloud.spark.bigtable" % "spark-bigtable_2.12" % "0.8.0"
```

Finally, you can add the following to your `build.gradle` file when using
@@ -63,14 +63,14 @@ Gradle:
```
// for scala 2.13
dependencies {
-   implementation group: 'com.google.cloud.bigtable', name: 'spark-bigtable_2.13', version: '0.7.2'
+   implementation group: 'com.google.cloud.spark.bigtable', name: 'spark-bigtable_2.13', version: '0.8.0'
}
```

```
// for scala 2.12
dependencies {
-   implementation group: 'com.google.cloud.bigtable', name: 'spark-bigtable_2.12', version: '0.7.2'
+   implementation group: 'com.google.cloud.spark.bigtable', name: 'spark-bigtable_2.12', version: '0.8.0'
}
```

@@ -240,6 +240,32 @@ Dataset<Row> dataFrame = spark
    .load();
```

+ ### Reading from Bigtable with complex filters
+
+ You can apply any supported Bigtable [filter](https://docs.cloud.google.com/bigtable/docs/using-filters) when reading by
+ setting the `spark.bigtable.read.row.filters` option. This option expects a string containing the Base64 encoding of a
+ serialized [Bigtable RowFilter](https://github.com/googleapis/java-bigtable/blob/v2.70.0/proto-google-cloud-bigtable-v2/src/main/java/com/google/bigtable/v2/RowFilter.java)
+ proto. Note that the connector shades the Bigtable client, so the filter classes are imported under its `repackaged` prefix:
+
+ ```scala
+ import com.google.cloud.spark.bigtable.repackaged.com.google.cloud.bigtable.data.v2.models.Filters.FILTERS
+ import com.google.cloud.spark.bigtable.repackaged.com.google.common.io.BaseEncoding
+
+ // Build a chained filter and Base64-encode its serialized proto form.
+ val filters = FILTERS.chain()
+   .filter(FILTERS.family().exactMatch("info"))
+   .filter(FILTERS.qualifier().regex("\\C*"))
+ val filterString = BaseEncoding.base64().encode(filters.toProto.toByteArray)
+
+ // Pass the encoded filter to the reader.
+ val dataFrame = spark.read
+   .format("bigtable")
+   .option("catalog", catalog)
+   .option("spark.bigtable.project.id", projectId)
+   .option("spark.bigtable.instance.id", instanceId)
+   .option("spark.bigtable.read.row.filters", filterString)
+   .load()
+ ```
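+
+ To sanity-check the encoded filter before submitting a job, you can decode the string back into a `RowFilter`
+ proto. This is a minimal sketch, assuming the proto classes are shaded under the same `repackaged` prefix as the
+ model classes above; it reuses `filters` and `filterString` from the previous snippet:
+
+ ```scala
+ import com.google.cloud.spark.bigtable.repackaged.com.google.bigtable.v2.RowFilter
+ import com.google.cloud.spark.bigtable.repackaged.com.google.common.io.BaseEncoding
+
+ // Decode the Base64 string and parse it back into a RowFilter proto to
+ // confirm that the filter round-trips intact.
+ val decoded = RowFilter.parseFrom(BaseEncoding.base64().decode(filterString))
+ assert(decoded == filters.toProto)
+ ```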
+
### Efficient joins with other data sources

If you have a large DataFrame that you want to join with some Bigtable data and