[java] Fix avro logical-types conversions for BQ storage#33422
Abacn merged 6 commits into apache:master
static Long convertTimestamp(Object value, boolean micros) {
  if (value instanceof ReadableInstant) {
    return ((ReadableInstant) value).getMillis() * (micros ? 1000 : 1);
This was wrong. BQ always expects epoch microseconds, so the conversion must be applied to the raw value depending on whether that value represents millis or micros.
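The rule described in this comment can be sketched as follows: the output unit is always epoch microseconds, and only the unit of the raw input decides whether a scaling step is needed. This is a minimal sketch using only `java.time`; the class and method names are hypothetical and are not the code from this PR.

```java
import java.time.Instant;

/** Hypothetical helper illustrating the rule; not the PR's implementation. */
final class TimestampMicrosSketch {
  private TimestampMicrosSketch() {}

  // Raw long written with the timestamp-millis logical type: scale up to micros.
  static long fromRawMillis(long epochMillis) {
    return Math.multiplyExact(epochMillis, 1_000L);
  }

  // Raw long written with the timestamp-micros logical type: already in the target unit.
  static long fromRawMicros(long epochMicros) {
    return epochMicros;
  }

  // A converted java.time.Instant carries its own unit, so the schema's
  // millis/micros flag must not influence the result.
  static long fromInstant(Instant instant) {
    long wholeSeconds = Math.multiplyExact(instant.getEpochSecond(), 1_000_000L);
    return Math.addExact(wholeSeconds, instant.getNano() / 1_000L);
  }
}
```

With this shape, an instant object and the raw millis value describing the same moment produce the same number, which is exactly what the multiplication by `(micros ? 1000 : 1)` in the quoted line failed to guarantee.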
.setScale(type.getScale(), RoundingMode.DOWN)
.round(new MathContext(type.getPrecision(), RoundingMode.DOWN));
Does it seem legitimate to round here? We could instead fail if the BigDecimal precision and scale do not match the expected logical type.
I understand that previously it only accepted a ByteBuffer, and now it adds support for java.math.BigDecimal? If it would have failed previously, or if it does not lose precision compared to the existing behavior, I think it's fine.
All other conversions accept the logical type as well as the underlying type.
Supporting both is mandatory: even if the logical type is present in the schema, it can be discarded when the GenericData used for serialization does not have the feature enabled.
Concerning the rounding, the doc states to use BigDecimalByteStringEncoder. BeamRowToStorageApiProto is a copy of that.
However, it does not support parameterized NUMERIC and BIGNUMERIC types.
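The two options discussed in this thread (round down silently, or fail on a mismatch) and the need to accept both the logical and the underlying type can be sketched with `java.math` alone. The class and method names below are hypothetical, and the byte layout assumed for the raw value is the one the Avro specification gives for decimals (big-endian two's-complement unscaled value); this is an illustration, not the PR's code.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.math.RoundingMode;
import java.nio.ByteBuffer;

/** Hypothetical sketch of the decimal handling discussed above. */
final class DecimalSketch {
  private DecimalSketch() {}

  // Accept either the logical type (BigDecimal) or the underlying type (ByteBuffer).
  static BigDecimal toBigDecimal(Object value, int scale) {
    if (value instanceof BigDecimal) {
      return (BigDecimal) value;
    }
    if (value instanceof ByteBuffer) {
      // duplicate() so that reading does not move the caller's buffer position.
      ByteBuffer buffer = ((ByteBuffer) value).duplicate();
      byte[] bytes = new byte[buffer.remaining()];
      buffer.get(bytes);
      return new BigDecimal(new BigInteger(bytes), scale);
    }
    throw new IllegalArgumentException("Unsupported decimal representation: " + value);
  }

  // Lenient variant, mirroring the quoted snippet: extra digits are dropped.
  static BigDecimal truncate(BigDecimal value, int precision, int scale) {
    return value
        .setScale(scale, RoundingMode.DOWN)
        .round(new MathContext(precision, RoundingMode.DOWN));
  }

  // Strict variant: any loss of digits raises ArithmeticException instead.
  static BigDecimal strict(BigDecimal value, int precision, int scale) {
    BigDecimal scaled = value.setScale(scale, RoundingMode.UNNECESSARY);
    if (scaled.precision() > precision) {
      throw new ArithmeticException("Precision " + scaled.precision() + " exceeds " + precision);
    }
    return scaled;
  }
}
```

For example, `truncate(new BigDecimal("1.239"), 5, 2)` yields `1.23`, while `strict` throws for the same input. Which behavior is appropriate is the judgment call raised by the reviewers.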
Most of the Avro logical-type to BQ conversions are broken. Add support for both Joda-Time and java.time to ensure compatibility with older Avro versions.
30d0a2a to f8d575b
Abacn left a comment
Thanks for the change! Had a few questions.
.put(LogicalTypes.timestampMicros().getName(), TableFieldSchema.Type.TIMESTAMP)
.put(LogicalTypes.timestampMillis().getName(), TableFieldSchema.Type.TIMESTAMP)
.put(LogicalTypes.uuid().getName(), TableFieldSchema.Type.STRING)
.put("date", TableFieldSchema.Type.DATE)
What's the reason for changing these to hard-coded names? I understand they should be equivalent. Alternatively, keep getName() and note the resolved name in a comment?
They are equivalent. I changed them because the type name for decimal is not accessible and requires creating a 'fake' logical type:
LogicalTypes.decimal(1).getName()
I can revert to the old style if that's preferred.
Both are fine. If you go with the resolved names, it would be good to add a comment noting that they come from the corresponding Avro LogicalTypes implementations' getName().
This code has been refactored to accommodate conversion from the Avro decimal type to BQ NUMERIC/BIGNUMERIC.
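The reviewer's suggestion (hard-coded names, each annotated with the getName() call it mirrors) might look like the sketch below. To stay self-contained it uses a hypothetical `BqType` enum in place of `TableFieldSchema.Type` and covers only the four entries quoted in this thread. The string literals are the logical type names defined by the Avro specification.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; BqType stands in for TableFieldSchema.Type. */
final class LogicalTypeNamesSketch {
  enum BqType { TIMESTAMP, STRING, DATE }

  static final Map<String, BqType> LOGICAL_TYPES;

  static {
    Map<String, BqType> types = new HashMap<>();
    // Names below are the values returned by the corresponding
    // org.apache.avro.LogicalTypes factories' getName().
    types.put("timestamp-micros", BqType.TIMESTAMP); // LogicalTypes.timestampMicros()
    types.put("timestamp-millis", BqType.TIMESTAMP); // LogicalTypes.timestampMillis()
    types.put("uuid", BqType.STRING); // LogicalTypes.uuid()
    types.put("date", BqType.DATE); // LogicalTypes.date()
    // "decimal" is intentionally absent: per the thread, it is resolved
    // separately to NUMERIC or BIGNUMERIC.
    LOGICAL_TYPES = Collections.unmodifiableMap(types);
  }

  private LogicalTypeNamesSketch() {}
}
```

Keeping the source of each literal in a comment preserves the traceability of the getName() style without having to construct a placeholder decimal type just to read its name.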
...rm/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroGenericRecordToStorageApiProto.java
Abacn left a comment
Thanks, just a nit. Will merge after the tests pass.
* [java] Fix avro logical-types conversions for BQ storage
  Most of the avro logical-type to BQ are broken. Add support for both joda and java time to ensure compatibility with older avro versions
* expected int raw type for time-millis
* Remove unused qualifier
* Fix avro numeric convertion
* Add support for parametrized NUMERIC and BIGNUMERIC