This repository was archived by the owner on Aug 13, 2024. It is now read-only.
dynamoDB object is not serializable #31
Open
Description
When I try to write to DynamoDB inside my foreachRDD function, I get the error below. It looks like the DynamoDB class is not serializable. Has anyone run into this and fixed it? Thanks!
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:305)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:295)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:123)
at org.apache.spark.SparkContext.clean(SparkContext.scala:1930)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:862)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:861)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:357)
at org.apache.spark.rdd.RDD.foreach(RDD.scala:861)
at com.snowplowanalytics.spark.streaming.StreamingCounts$$anonfun$execute$2.apply(StreamingCounts.scala:110)
at com.snowplowanalytics.spark.streaming.StreamingCounts$$anonfun$execute$2.apply(StreamingCounts.scala:109)
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:628)
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:628)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:415)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:227)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:227)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:227)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:226)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.NotSerializableException: com.amazonaws.services.dynamodbv2.document.DynamoDB
Serialization stack:
- object not serializable (class: com.amazonaws.services.dynamodbv2.document.DynamoDB, value: com.amazonaws.services.dynamodbv2.document.DynamoDB@2713aa67)
- field (class: com.snowplowanalytics.spark.streaming.StreamingCounts$$anonfun$execute$2, name: dynamoDB$1, type: class com.amazonaws.services.dynamodbv2.document.DynamoDB)
- object (class com.snowplowanalytics.spark.streaming.StreamingCounts$$anonfun$execute$2, <function1>)
- field (class: com.snowplowanalytics.spark.streaming.StreamingCounts$$anonfun$execute$2$$anonfun$apply$1, name: $outer, type: class com.snowplowanalytics.spark.streaming.StreamingCounts$$anonfun$execute$2)
- object (class com.snowplowanalytics.spark.streaming.StreamingCounts$$anonfun$execute$2$$anonfun$apply$1, <function1>)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:302)
... 30 more
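The serialization stack shows that a `com.amazonaws.services.dynamodbv2.document.DynamoDB` instance created on the driver is being captured by the `rdd.foreach` closure and shipped to the executors, and the AWS SDK client is not `Serializable`. The usual pattern (a sketch, not code from this repository) is to construct the client inside `foreachPartition`, so it is built on each executor and never crosses the wire. The stream name `stream`, the table name `"my-table"`, and the write logic are hypothetical placeholders:

```scala
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder
import com.amazonaws.services.dynamodbv2.document.{DynamoDB, Item}

stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // Constructed here, on the executor, so the non-serializable client
    // is never part of the closure that Spark has to serialize.
    val client = AmazonDynamoDBClientBuilder.defaultClient()
    val dynamoDB = new DynamoDB(client)
    val table = dynamoDB.getTable("my-table") // hypothetical table name

    records.foreach { record =>
      // Hypothetical write; adapt the item attributes to your schema.
      table.putItem(new Item().withPrimaryKey("id", record.toString))
    }
  }
}
```

An alternative is a `lazy val` client held in a singleton `object`, which Scala initializes once per JVM (i.e. once per executor) on first use, avoiding per-partition client construction. Either way, the fix is the same: nothing non-serializable may be referenced from inside the closure passed to `foreach`/`foreachRDD`.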