Replies: 2 comments
- @tgravescs any thoughts or suggestions on the above? I tried using the latest version, v25.04.0.
- Related to #12891
Describe the bug
Getting a NullPointerException at ai.rapids.cudf.HostColumnVectorCore.getByte.
Steps/Code to reproduce bug
Possibly the columns have nulls, or the join did not find matching keys?
I am using Spark version 3.3.1 and the spark-rapids jar version 22.12.0.
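For reference, a minimal sketch of the kind of query that seems to hit this path. The table and column names here are hypothetical, not from the actual job; the point is a join on a key with a nullable binary column feeding an ObjectHashAggregateExec, the CPU operator in the stack trace below.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("npe-repro-sketch").getOrCreate()
import spark.implicits._

// Two small tables: "payload" is a nullable binary column, and some join
// keys exist on only one side, matching the two suspicions above.
val left  = Seq((1, Array[Byte](1, 2)), (2, null)).toDF("id", "payload")
val right = Seq((2, "b"), (3, "c")).toDF("id", "name")

// collect_list is a TypedImperativeAggregate, so Spark plans it with
// ObjectHashAggregateExec, which reads the upstream columnar batch
// row by row via the host column vector seen in the stack trace.
left.join(right, Seq("id"), "left")
  .groupBy($"name")
  .agg(collect_list($"payload").as("payloads"))
  .show()

Whether this exact snippet triggers the NPE will depend on the plan the plugin produces for it.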
Expected behavior
The Spark code should run without failing.
Environment details
Environment location: YARN
Spark configuration settings related to the issue
"spark.blacklist.decommissioning.timeout": "600s",
"spark.default.parallelism": "50",
"spark.driver.cores": "5",
"spark.driver.extraJavaOptions": "-Dfile.encoding=utf-8",
"spark.driver.memory": "6g",
"spark.driver.memoryOverhead": "1g",
"spark.dynamicAllocation.executorAllocationRatio": "0.6",
"spark.dynamicAllocation.executorIdleTimeout": "300s",
"spark.dynamicAllocation.maxExecutors": "50",
"spark.eventLog.rolling.enabled": "true",
"spark.eventLog.rolling.maxFileSize": "512m",
"spark.executor.cores": "5",
"spark.executor.defaultJavaOptions": "-XX:OnOutOfMemoryError='kill -9 %p' -XX:+UseG1GC -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions",
"spark.executor.extraLibraryPath": "/usr/local/cuda/targets/x86_64-linux/lib:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/compat/lib:/usr/local/cuda/lib:/usr/local/cuda/lib64:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native",
"spark.executor.heartbeatInterval": "60s",
"spark.executor.memory": "6g",
"spark.executor.memoryOverhead": "1g",
"spark.executor.resource.gpu.amount": "1",
"spark.executor.resource.gpu.discoveryScript": "/usr/lib/spark/scripts/gpu/getGpusResources.sh",
"spark.executorEnv.JAVA_HOME": "/usr/lib/jvm/java-11-amazon-corretto.x86_64",
"spark.executorEnv.PKI_KEYSTORE_PATH": "/tmp/client.jks",
"spark.history.fs.cleaner.enabled": "true",
"spark.history.fs.cleaner.interval": "1h",
"spark.history.fs.cleaner.maxAge": "2h",
"spark.history.fs.eventLog.rolling.maxFilesToRetain": "2",
"spark.metrics.conf": "/home/hadoop/spark-metrics.properties",
"spark.network.timeout": "600s",
"spark.plugins": "com.nvidia.spark.SQLPlugin",
"spark.rapids.filecache.enabled": "true",
"spark.rapids.memory.host.spillStorageSize": "16g",
"spark.rapids.memory.pinnedPool.size": "8g",
"spark.rapids.shuffle.multiThreaded.reader.threads": "32",
"spark.rapids.shuffle.multiThreaded.writer.threads": "32",
"spark.rapids.sql.concurrentGpuTasks": "3",
"spark.rapids.sql.multiThreadedRead.numThreads": "100",
"spark.rdd.compress": "true",
"spark.sql.adaptive.coalescePartitions.enabled": "false",
"spark.sql.adaptive.enabled": "true",
"spark.sql.broadcastTimeout": "1200",
"spark.sql.files.maxPartitionBytes": "2g",
"spark.sql.parquet.enableVectorizedReader": "false",
"spark.sql.shuffle.partitions": "50",
"spark.storage.level": "MEMORY_AND_DISK_SER",
"spark.task.resource.gpu.amount": "0.5",
"spark.yarn.appMasterEnv.JAVA_HOME": "/usr/lib/jvm/java-11-amazon-corretto.x86_64",
"spark.yarn.appMasterEnv.PKI_CERTS_DIR_PATH": "/etc/pki_service",
"spark.yarn.maxAppAttempts": "1"
Additional context
Full stack trace below:
Caused by: java.lang.NullPointerException
at ai.rapids.cudf.HostColumnVectorCore.getByte(HostColumnVectorCore.java:239) ~[rapids-4-spark_2.12-22.12.0-amzn-0.jar:?]
at com.nvidia.spark.rapids.RapidsHostColumnVectorCore.getByte(RapidsHostColumnVectorCore.java:99) ~[spark3xx-common/:?]
at org.apache.spark.sql.vectorized.ColumnVector.getBytes(ColumnVector.java:115) ~[spark-catalyst_2.12-3.3.1-amzn-0.1.jar:3.3.1-amzn-0.1]
at org.apache.spark.sql.vectorized.ColumnarArray.toByteArray(ColumnarArray.java:79) ~[spark-catalyst_2.12-3.3.1-amzn-0.1.jar:3.3.1-amzn-0.1]
at com.nvidia.spark.rapids.RapidsHostColumnVectorCore.getBinary(RapidsHostColumnVectorCore.java:196) ~[spark3xx-common/:?]
at org.apache.spark.sql.vectorized.ColumnarBatchRow.getBinary(ColumnarBatchRow.java:131) ~[spark-catalyst_2.12-3.3.1-amzn-0.1.jar:3.3.1-amzn-0.1]
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source) ~[?:?]
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source) ~[?:?]
at scala.collection.Iterator$$anon$10.next(Iterator.scala:461) ~[scala-library-2.12.15.jar:?]
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) ~[?:?]
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:35) ~[spark-sql_2.12-3.3.1-amzn-0.1.jar:3.3.1-amzn-0.1]
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.hasNext(Unknown Source) ~[?:?]
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:955) ~[spark-sql_2.12-3.3.1-amzn-0.1.jar:3.3.1-amzn-0.1]
at org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$1(ObjectHashAggregateExec.scala:88) ~[spark-sql_2.12-3.3.1-amzn-0.1.jar:3.3.1-amzn-0.1]
at org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$1$adapted(ObjectHashAggregateExec.scala:86) ~[spark-sql_2.12-3.3.1-amzn-0.1.jar:3.3.1-amzn-0.1]
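If nulls in the binary column turn out to be the trigger, a quick experiment is to remove or replace them before the join. A sketch, reusing the hypothetical names from the snippet above:

// Drop the rows whose binary column is null ...
val dropped = left.na.drop(Seq("payload"))

// ... or keep them but substitute an empty byte array, so the host
// column vector never has to read a null binary value.
val patched = left.withColumn(
  "payload",
  coalesce($"payload", lit(Array.emptyByteArray)))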
Has anyone faced a similar issue and found a way to resolve it?