
unable to access staging_fs_url using org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider in aws_credentials_provider option #552

@akshay7023

Description


Environment

  • Spark version: 3.3
  • Hadoop version:
  • Vertica version: Vertica 12.0.3-3
  • Vertica Spark Connector version: 3.3.5
  • Java version: jdk 11
  • Additional Environment Information: glue 4.0

Problem Description

  • When the option "aws_credentials_provider" is set to "org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider", the read fails with:
    Caused by:
    java.sql.SQLSyntaxErrorException: [Vertica]VJDBC ERROR: Permission denied for storage location

The following options are used to read from the Vertica table:
host,
user,
password,
staging_fs_url,
db,
dbschema,
table,
aws_credentials_provider
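For context, a minimal sketch of how these options would be wired up in a PySpark/Glue job. All values below are placeholders (hypothetical host, bucket, and table names, not from the issue); the format string `com.vertica.spark.datasource.VerticaSource` is the connector's documented data source name, but verify it against the connector version in use.

```python
# Hypothetical read options for the Vertica Spark connector.
# Option names are the ones listed in this issue; every value is a placeholder.
options = {
    "host": "vertica.example.internal",            # placeholder host
    "user": "dbadmin",                             # placeholder user
    "password": "********",                        # placeholder
    "db": "mydb",                                  # placeholder database
    "dbschema": "public",                          # placeholder schema
    "table": "my_table",                           # placeholder table
    "staging_fs_url": "s3a://my-bucket/staging/",  # placeholder S3 staging path
    "aws_credentials_provider":
        "org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider",
}

# In a Glue/Spark job this would be passed roughly as:
# df = (spark.read
#           .format("com.vertica.spark.datasource.VerticaSource")
#           .options(**options)
#           .load())
```

Note that `aws_credentials_provider` controls how the Spark side authenticates to S3; the "Permission denied for storage location" error in the stack trace below is raised by the Vertica server during EXPORT, which suggests the Vertica nodes' own S3 credentials or storage-location grants may also need checking.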


Spark Connector Logs

    ERROR [Thread-8] v2.VerticaScan (ErrorHandling.scala:logAndThrowError(77)): There was an error when attempting to export from Vertica: connection error with JDBC.
    JDBC Error: A syntax error occurred

Caused by:
java.sql.SQLSyntaxErrorException: [Vertica]VJDBC ERROR: Permission denied for storage location [s3a://##/##/##]
Stack trace:
com.vertica.util.ServerErrorData.buildException(Unknown Source)
com.vertica.dataengine.VResultSet.fetchChunk(Unknown Source)
com.vertica.dataengine.VResultSet.initialize(Unknown Source)
com.vertica.dataengine.VQueryExecutor.readExecuteResponse(Unknown Source)
com.vertica.dataengine.VQueryExecutor.handleExecuteResponse(Unknown Source)
com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
com.vertica.jdbc.common.SStatement.executeNoParams(SStatement.java:3349)
com.vertica.jdbc.common.SStatement.execute(SStatement.java:753)
com.vertica.spark.datasource.jdbc.VerticaJdbcLayer.$anonfun$execute$1(VerticaJdbcLayer.scala:315)
scala.util.Try$.apply(Try.scala:213)
com.vertica.spark.datasource.jdbc.VerticaJdbcLayer.execute(VerticaJdbcLayer.scala:303)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$doPreReadSteps$9(VerticaDistributedFilesystemReadPipe.scala:293)
scala.util.Either.flatMap(Either.scala:341)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$doPreReadSteps$8(VerticaDistributedFilesystemReadPipe.scala:282)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$doPreReadSteps$8$adapted(VerticaDistributedFilesystemReadPipe.scala:280)
scala.util.Either.flatMap(Either.scala:341)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$doPreReadSteps$7(VerticaDistributedFilesystemReadPipe.scala:280)
scala.util.Either.flatMap(Either.scala:341)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$doPreReadSteps$6(VerticaDistributedFilesystemReadPipe.scala:268)
scala.util.Either.flatMap(Either.scala:341)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$doPreReadSteps$4(VerticaDistributedFilesystemReadPipe.scala:261)
scala.util.Either.flatMap(Either.scala:341)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$doPreReadSteps$3(VerticaDistributedFilesystemReadPipe.scala:259)
scala.util.Either.flatMap(Either.scala:341)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$doPreReadSteps$2(VerticaDistributedFilesystemReadPipe.scala:257)
scala.util.Either.flatMap(Either.scala:341)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.exportData$1(VerticaDistributedFilesystemReadPipe.scala:254)
com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.doPreReadSteps(VerticaDistributedFilesystemReadPipe.scala:345)
com.vertica.spark.datasource.core.DSReadConfigSetup.performInitialSetup(DSConfigSetup.scala:697)
com.vertica.spark.datasource.core.DSReadConfigSetup.performInitialSetup(DSConfigSetup.scala:650)
com.vertica.spark.datasource.v2.VerticaScan.planInputPartitions(VerticaDatasourceV2Read.scala:228)
org.apache.spark.sql.execution.datasources.v2.BatchScanExec.inputPartitions$lzycompute(BatchScanExec.scala:54)
org.apache.spark.sql.execution.datasources.v2.BatchScanExec.inputPartitions(BatchScanExec.scala:54)
org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanExecBase.supportsColumnar(DataSourceV2ScanExecBase.scala:142)
org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanExecBase.supportsColumnar$(DataSourceV2ScanExecBase.scala:141)
org.apache.spark.sql.execution.datasources.v2.BatchScanExec.supportsColumnar(BatchScanExec.scala:36)
org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy.apply(DataSourceV2Strategy.scala:143)
org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:72)
org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)
scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:196)
scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:194)
scala.collection.Iterator.foreach(Iterator.scala:943)
scala.collection.Iterator.foreach$(Iterator.scala:943)
scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:199)
scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:192)
scala.collection.AbstractIterator.foldLeft(Iterator.scala:1431)
org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$2(QueryPlanner.scala:75)
scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:72)
org.apache.spark.sql.execution.QueryExecution$.createSparkPlan(QueryExecution.scala:495)
org.apache.spark.sql.execution.QueryExecution.$anonfun$sparkPlan$1(QueryExecution.scala:153)
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:192)
org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:213)
org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:552)
org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:213)
org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:212)
org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:153)
org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:146)
org.apache.spark.sql.execution.QueryExecution.$anonfun$executedPlan$1(QueryExecution.scala:166)
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:192)
org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:213)
org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:552)
org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:213)
org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:212)
org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:163)
org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:159)
org.apache.spark.sql.execution.QueryExecution.$anonfun$writePlans$5(QueryExecution.scala:298)
org.apache.spark.sql.catalyst.plans.QueryPlan$.append(QueryPlan.scala:657)
org.apache.spark.sql.execution.QueryExecution.writePlans(QueryExecution.scala:298)
org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:313)
org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$explainString(QueryExecution.scala:267)
org.apache.spark.sql.execution.QueryExecution.explainString(QueryExecution.scala:246)
org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:107)
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$7(SQLExecution.scala:139)
org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:224)
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:139)
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:245)
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId

Labels: bug (Something isn't working)