Description
Schema derivation yields "Unknown datum class" w/ nested classes / scalapb enums as defaults
Error
I'm getting an Unknown datum class: ExampleEnumEvent$Action$Undefined$ error when trying to derive a schema for a scalapb-generated enum.
More generally, this can be reproduced with any nested structure (see below).
Similar to #677?
Minimal Protobuf Example
// pb/types.proto
package pb;
message ID {
string id = 1;
}
// example.proto
syntax = "proto3";
import "pb/types.proto";
message ExampleEnumEvent {
pb.ID id = 1;
Action action = 2;
enum Action {
Undefined = 0;
Allow = 1;
Deny = 2;
}
}
Package settings: preserve_unknown_fields: false, lenses: false.
Yields
@SerialVersionUID(0L)
final case class ExampleEnumEvent(
id: _root_.scala.Option[pb.types.ID] = _root_.scala.None,
action: ExampleEnumEvent.Action = ExampleEnumEvent.Action.Undefined
) extends scalapb.GeneratedMessage {
Action is a
sealed abstract class Action(val value: _root_.scala.Int) extends _root_.scalapb.GeneratedEnum
Test
import com.sksamuel.avro4s.{AvroSchema, SchemaFor, ToRecord, Encoder as AvroEncoder}
val e = ExampleEnumEvent(
id = Some(ID("1")),
)
type T = ExampleEnumEvent
val schema = AvroSchema[T]
println(schema.toString(true))
val enc = AvroEncoder[T]
val toRecord: ToRecord[T] = ToRecord[T](schema)(using enc)
val gen = toRecord.to(e)
println(gen)
Which gets us
Unknown datum class: class ExampleEnumEvent$Action$Undefined$
org.apache.avro.AvroRuntimeException: Unknown datum class: class ExampleEnumEvent$Action$Undefined$
at org.apache.avro.util.internal.JacksonUtils.toJson(JacksonUtils.java:96)
at org.apache.avro.util.internal.JacksonUtils.toJsonNode(JacksonUtils.java:53)
at org.apache.avro.Schema$Field.<init>(Schema.java:598)
at com.sksamuel.avro4s.schemas.Records$.buildSchemaField(records.scala:89)
at com.sksamuel.avro4s.schemas.Records$.$anonfun$1(records.scala:31)
at scala.collection.immutable.List.flatMap(List.scala:293)
at com.sksamuel.avro4s.schemas.Records$.schema(records.scala:32)
at com.sksamuel.avro4s.schemas.MagnoliaDerivedSchemas.join(magnolia.scala:14)
at com.sksamuel.avro4s.schemas.MagnoliaDerivedSchemas.join$(magnolia.scala:10)
at com.sksamuel.avro4s.SchemaFor$.join(SchemaFor.scala:55)
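The trace ends in Avro's Schema.Field constructor, which converts the field default to JSON. The following is a minimal sketch, without avro4s or scalapb, of what appears to happen when a raw Scala object is passed as that default. The Action type and the tryField helper are hypothetical stand-ins, and the assumption that avro4s hands the unconverted default to Avro is inferred from the stack trace only.

```scala
import org.apache.avro.{AvroRuntimeException, Schema, SchemaBuilder}
import scala.util.{Failure, Try}

// Hypothetical stand-in for a generated enum value such as
// ExampleEnumEvent.Action.Undefined.
sealed abstract class Action(val value: Int)
object Action:
  case object Undefined extends Action(0)

// Build a string-typed field with the given default value.
// Schema.Field converts the default to a JSON node internally,
// which only works for values Avro knows how to represent as JSON.
def tryField(default: AnyRef): Try[Schema.Field] =
  val fieldSchema = SchemaBuilder.builder().stringType()
  Try(new Schema.Field("action", fieldSchema, "doc", default))

@main def demo(): Unit =
  // A raw Scala object is not a JSON-compatible datum, so Avro rejects it.
  tryField(Action.Undefined) match
    case Failure(e: AvroRuntimeException) => println(s"rejected: ${e.getMessage}")
    case other                            => println(other)
  // A plain string default is accepted for a string-typed field.
  println(tryField("Undefined").map(_.defaultVal()))
```

If this matches what avro4s does, the default would need to be translated into an Avro-compatible value (for example the enum symbol name) before the field is constructed.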
W/o proto
This has functionally the same effect:
final case class Nested(s: String = "foo", n: Nested.Nest = Nested.Undefined())
object Nested {
sealed abstract class Nest(i: Int)
final case class Undefined() extends Nest(-1)
final case class N(i: Int) extends Nest(i)
}
Validation / Workaround
If we set no_default_values_in_constructor (or remove the default), it works and yields:
{
"type" : "record",
"name" : "ExampleEnumEvent",
"namespace" : "test",
"fields" : [ {
"name" : "id",
"type" : [ "null", "string" ]
}, {
"name" : "action",
"type" : [ {
"type" : "record",
"name" : "Allow",
"namespace" : "ExampleEnumEvent.Action",
"fields" : [ ]
}, {
"type" : "record",
"name" : "Deny",
"namespace" : "ExampleEnumEvent.Action",
"fields" : [ ]
}, {
"type" : "enum",
"name" : "Recognized",
"namespace" : "ExampleEnumEvent.Action",
"symbols" : [ "Undefined", "Allow", "Deny" ]
}, {
"type" : "record",
"name" : "Undefined",
"namespace" : "ExampleEnumEvent.Action",
"fields" : [ ]
}, {
"type" : "record",
"name" : "Unrecognized",
"namespace" : "ExampleEnumEvent.Action",
"fields" : [ {
"name" : "unrecognizedValue",
"type" : "int"
} ]
} ]
} ]
}
Alternatively, explicitly setting
val e = ExampleEnumEvent(
id = Some(ID("1")),
action = ExampleEnumEvent.Action.Undefined
)
has the same effect (which of course isn't viable for events that come in from another service).
I can't define given SchemaFor[ExampleEnumEvent.Action] = SchemaFor[ExampleEnumEvent.Action], because that throws a StackOverflowError, presumably from infinite recursion at runtime.
I've also tried tricking avro4s into treating the scalapb.GeneratedEnum as an Enumeration type by defining a trait that extends from Enumeration, but to no avail.
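For context on the StackOverflowError mentioned above, here is a minimal sketch of the self-referential given pattern. It uses a hypothetical Codec type class, not avro4s' SchemaFor. The assumption is that the summoner on the right-hand side resolves to the very given being defined, which would explain the runaway recursion. The non-looping form builds the instance directly.

```scala
// Hypothetical type class standing in for a schema provider;
// this is not avro4s' SchemaFor.
trait Codec[A]:
  def name: String

object Codec:
  // Summoner with the same shape as the SchemaFor[T] call in the report.
  def apply[A](using c: Codec[A]): Codec[A] = c

final class Action

// Looping form (left commented out): the summoner on the right-hand side
// can resolve to the given being defined, so evaluating it never terminates.
// given Codec[Action] = Codec[Action]

// Non-looping form: construct the instance directly instead of summoning it.
given actionCodec: Codec[Action] = new Codec[Action]:
  def name: String = "Action"

@main def showCodec(): Unit =
  println(Codec[Action].name)
```

An avro4s equivalent would need a hand-built Avro schema for the enum in place of the derived one. Whether that also avoids the default-value failure is untested here.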
Other
On a side note, compilation times for scalapb-generated objects that include a val of AvroSchema[A] are very long (up to ~10s/class), presumably because the generated scalapb classes are rather large. As far as I'm aware, Scala 3 doesn't have any good compiler profilers, so I'm not 100% sure where exactly the time is spent.
But I figured I'd mention that here, since I'm not sure if that's expected.
Environment
22.04.1-Ubuntu, avro4s 5.0.9, Scala 3.4.0