Description
DataProc: 2.1
bigquery-connector: spark-bigquery-with-dependencies_2.12-0.42.2.jar
Running a query in a project that requires a KMS key to be specified fails.
Succeeds
spark_session.conf.set('destinationTableKmsKeyName', destination_table_kms_key)
df = (
    spark_session.read.format('bigquery')
    .option('table', '<project>:<dataset>.my_table')
    .load()
    .select('*')
)
Fails
spark_session.conf.set('viewsEnabled', 'true')
spark_session.conf.set('destinationTableKmsKeyName', destination_table_kms_key)
query='SELECT * FROM `<project>.<dataset>.my_table`'
df = spark_session.read.format('bigquery')\
    .option('query', query).load()
Caused by: com.google.cloud.bigquery.connector.common.BigQueryConnectorException: Error creating destination table using the following query: [SELECT * FROM `<project>.<dataset>.my_table`]
at com.google.cloud.bigquery.connector.common.BigQueryClient.materializeTable(BigQueryClient.java:743)
at com.google.cloud.bigquery.connector.common.BigQueryClient.materializeQueryToTable(BigQueryClient.java:673)
at com.google.cloud.bigquery.connector.common.BigQueryClient.getReadTable(BigQueryClient.java:441)
at com.google.cloud.spark.bigquery.v2.context.BigQueryDataSourceReaderModule.provideDataSourceReaderContext(BigQueryDataSourceReaderModule.java:58)
at com.google.cloud.spark.bigquery.v2.context.BigQueryDataSourceReaderModule$$FastClassByGuice$$3955632.GUICE$TRAMPOLINE(<generated>)
at com.google.cloud.spark.bigquery.v2.context.BigQueryDataSourceReaderModule$$FastClassByGuice$$3955632.apply(<generated>)
at com.google.cloud.spark.bigquery.repackaged.com.google.inject.internal.ProviderMethod$FastClassProviderMethod.doProvision(ProviderMethod.java:260)
at com.google.cloud.spark.bigquery.repackaged.com.google.inject.internal.ProviderMethod.doProvision(ProviderMethod.java:171)
at com.google.cloud.spark.bigquery.repackaged.com.google.inject.internal.InternalProviderInstanceBindingImpl$CyclicFactory.provision(InternalProviderInstanceBindingImpl.java:185)
at com.google.cloud.spark.bigquery.repackaged.com.google.inject.internal.InternalProviderInstanceBindingImpl$CyclicFactory.get(InternalProviderInstanceBindingImpl.java:162)
at com.google.cloud.spark.bigquery.repackaged.com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.cloud.spark.bigquery.repackaged.com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:169)
at com.google.cloud.spark.bigquery.repackaged.com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45)
at com.google.cloud.spark.bigquery.repackaged.com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1101)
... 101 more
Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.UncheckedExecutionException: com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Your administrator requires that you specify an encryption key for queries in project `<project>`. See https://cloud.google.com/bigquery/docs/customer-managed-encryption#services_constraint for more info.
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2086)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache.get(LocalCache.java:4017)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4898)
at com.google.cloud.bigquery.connector.common.BigQueryClient.materializeTable(BigQueryClient.java:732)
... 114 more
Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Your administrator requires that you specify an encryption key for queries in project `<project>`. See https://cloud.google.com/bigquery/docs/customer-managed-encryption#services_constraint for more info.
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:116)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.getQueryResults(HttpBigQueryRpc.java:764)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryImpl$36.call(BigQueryImpl.java:1504)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryImpl$36.call(BigQueryImpl.java:1499)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:102)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryRetryHelper.run(BigQueryRetryHelper.java:86)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryRetryHelper.runWithRetries(BigQueryRetryHelper.java:49)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryImpl.getQueryResults(BigQueryImpl.java:1498)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryImpl.getQueryResults(BigQueryImpl.java:1482)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.Job$1.call(Job.java:390)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.Job$1.call(Job.java:387)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:102)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryRetryHelper.run(BigQueryRetryHelper.java:86)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryRetryHelper.runWithRetries(BigQueryRetryHelper.java:49)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.Job.waitForQueryResults(Job.java:386)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.Job.waitForInternal(Job.java:281)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.Job.waitFor(Job.java:202)
at com.google.cloud.bigquery.connector.common.BigQueryClient.waitForJob(BigQueryClient.java:158)
at com.google.cloud.bigquery.connector.common.BigQueryClient$TempTableBuilder.createTableFromQuery(BigQueryClient.java:1026)
at com.google.cloud.bigquery.connector.common.BigQueryClient$TempTableBuilder.call(BigQueryClient.java:1014)
at com.google.cloud.bigquery.connector.common.BigQueryClient$TempTableBuilder.call(BigQueryClient.java:992)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4903)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3574)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2316)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2190)
at com.google.cloud.spark.bigquery.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2080)
... 117 more
Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.json.GoogleJsonResponseException: 412 Precondition Failed
GET https://bigquery.googleapis.com/bigquery/v2/projects/<project>/queries/6765018c-fefa-47dc-af96-d3dded084881?location=europe-west3&maxResults=0&prettyPrint=false
{
"code": 412,
"errors": [
{
"domain": "global",
"location": "If-Match",
"locationType": "header",
"message": "Your administrator requires that you specify an encryption key for queries in project `<project>`. See https://cloud.google.com/bigquery/docs/customer-managed-encryption#services_constraint for more info.",
"reason": "conditionNotMet"
}
],
"message": "Your administrator requires that you specify an encryption key for queries in project `<project>`. See https://cloud.google.com/bigquery/docs/customer-managed-encryption#services_constraint for more info.",
"status": "FAILED_PRECONDITION"
}
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:145)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest$3.interceptResponse(AbstractGoogleClientRequest.java:479)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:565)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:506)
at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:616)
at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.getQueryResults(HttpBigQueryRpc.java:762)
... 141 more
Looking at the bigquery-connector repo, it seems that destinationTableKmsKeyName is not used for
the query job configuration.
Expectation
Using either of the approaches to read data from BigQuery should succeed when a KMS key is required and provided.
Possible Solution
Use destinationTableKmsKeyName for the query job configuration or add an extra option to specify
a kms key for the query job.
The ability to pass any of the config in QueryJobConfiguration may be useful.
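To make the expectation concrete, here is a rough sketch of the request body a query job would need to carry for the key to be honoured. It assumes the BigQuery REST API field `configuration.query.destinationEncryptionConfiguration.kmsKeyName`. The helper name `build_query_job_body` is hypothetical and is not part of the connector; this only illustrates the shape of the payload, not how the connector builds its jobs.

```python
import json
from typing import Any, Dict, Optional


def build_query_job_body(query: str, kms_key_name: Optional[str] = None) -> Dict[str, Any]:
    """Build a jobs.insert request body for a standard SQL query job.

    Hypothetical helper for illustration only; not connector code.
    """
    query_config: Dict[str, Any] = {"query": query, "useLegacySql": False}
    if kms_key_name:
        # This is the field that appears to be missing when the connector
        # materializes a query into a temporary table.
        query_config["destinationEncryptionConfiguration"] = {
            "kmsKeyName": kms_key_name
        }
    return {"configuration": {"query": query_config}}


if __name__ == "__main__":
    body = build_query_job_body(
        "SELECT * FROM `<project>.<dataset>.my_table`",
        # Placeholder key path; substitute the real resource name.
        kms_key_name="projects/<project>/locations/<location>/keyRings/<ring>/cryptoKeys/<key>",
    )
    print(json.dumps(body, indent=2))
```

If the connector passed destinationTableKmsKeyName through to this part of the query job configuration, the 412 response above would presumably not be triggered.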