@scovich
Contributor

@scovich scovich commented Jun 25, 2025

Which issue does this PR close?

Rationale for this change

Follow-up to #7670, which accidentally introduced a lossy to-string conversion for variant decimals.

What changes are included in this PR?

Use integer + string operations to convert decimal values to string, instead of floating point, which could lose precision.

Also, the VariantDecimalXX structs now impl Display, which greatly simplifies the to-json path. A new (private) macro encapsulates the to-string logic, since it's syntactically identical for all three decimal sizes.
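As a rough illustration of the integer + string approach described above, here is a minimal sketch for a 4-byte decimal. The struct name `Decimal4Sketch`, its fields, and the choice to strip trailing zeros are assumptions made for this example; it is not the code in this PR, which uses a macro shared by all three decimal sizes.

```rust
use std::fmt;

/// Hypothetical stand-in for a 4-byte variant decimal whose value is
/// `integer * 10^(-scale)`. Not the actual struct from the parquet crate.
struct Decimal4Sketch {
    integer: i32,
    scale: u8, // assumed <= 9 so that 10^scale fits in an i32
}

impl fmt::Display for Decimal4Sketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.scale == 0 {
            return write!(f, "{}", self.integer);
        }
        let divisor = 10_i32.pow(self.scale as u32);
        let quotient = self.integer / divisor;
        let remainder = self.integer % divisor;
        if remainder == 0 {
            // No fractional digits to print.
            return write!(f, "{}", quotient);
        }
        // Track the sign explicitly: the quotient can be zero (e.g. -0.5),
        // in which case it would not carry the minus sign on its own.
        let sign = if self.integer < 0 { "-" } else { "" };
        // Zero-pad the unsigned remainder to `scale` digits, then drop
        // trailing zeros only (an assumption of this sketch).
        let digits = format!(
            "{:0width$}",
            remainder.unsigned_abs(),
            width = self.scale as usize
        );
        write!(
            f,
            "{}{}.{}",
            sign,
            quotient.unsigned_abs(),
            digits.trim_end_matches('0')
        )
    }
}
```

With this sketch, `Decimal4Sketch { integer: 12345, scale: 2 }.to_string()` yields `"123.45"`, and a value such as `-5` with scale `1` keeps its sign even though the integer part is zero.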

Are these changes tested?

Yes, new unit tests added.

Are there any user-facing changes?

The VariantDecimalXX structs now impl Display

@github-actions github-actions bot added the parquet Changes to the parquet crate label Jun 25, 2025
@scovich
Contributor Author

scovich commented Jun 25, 2025

CC @alamb @carpecodeum

Comment on lines -78 to -80
// fall back to floating point
let result = integer as f64 / divisor as f64;
write!(json_buffer, "{}", result)?;
return Ok(());
Contributor Author

@scovich scovich Jun 26, 2025

My original suggestion to use floating point was for the serde_json::Value::Number code path, which can only support i64, u64, and f64 values. Normal to-string can use arbitrary precision -- let the reader be the one to lose information (if it must).
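To make the precision concern concrete, here is a small sketch contrasting the removed floating-point fallback with an integer-only conversion. The function names are hypothetical, and the exact version is restricted to non-negative values with `scale > 0` to keep the example short.

```rust
/// The removed fallback reduced to its essence: divide as f64 and print.
/// An f64 carries only 53 bits of mantissa, so wide decimals get rounded.
fn lossy_to_string(integer: i64, scale: u32) -> String {
    let divisor = 10_i64.pow(scale);
    format!("{}", integer as f64 / divisor as f64)
}

/// Integer-only conversion (hypothetical helper; non-negative input and
/// scale > 0 assumed). Every digit of the original value is preserved.
fn exact_to_string(integer: u64, scale: u32) -> String {
    let divisor = 10_u64.pow(scale);
    format!(
        "{}.{:0width$}",
        integer / divisor,
        integer % divisor,
        width = scale as usize
    )
}
```

For the 19-digit value `1234567890123456789` with scale `2`, `exact_to_string` returns `"12345678901234567.89"`, while `lossy_to_string` cannot reproduce all 19 significant digits.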

Contributor

Got it! This makes more sense, thank you.

Comment on lines +21 to +26
macro_rules! format_decimal {
    ($f:expr, $integer:expr, $scale:expr, $int_type:ty) => {{
        let integer = if $scale == 0 {
            $integer
        } else {
            let divisor = (10 as $int_type).pow($scale as u32);
Contributor Author

I was sorely tempted to just define a helper function that takes i128, since it would also produce correct output for all narrower integer types. But 128-bit division is vastly more expensive than 32- or 64-bit division, and the narrower types usually produce quotient and remainder from a single machine instruction.

If we think that performance difference doesn't matter, then we should use the helper function instead for simplicity.
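For comparison, the helper-function alternative mentioned above could look roughly like the sketch below. The name `write_decimal_i128` and its signature are assumptions for illustration; the PR itself keeps the macro so that each width divides in its native integer type.

```rust
use std::fmt::{self, Write};

/// Hypothetical single helper: every narrower decimal widens to i128 first.
/// Simpler than a macro, but always pays for 128-bit division.
/// `scale` is assumed to be at most 38 so that 10^scale fits in an i128.
fn write_decimal_i128(out: &mut impl Write, integer: i128, scale: u8) -> fmt::Result {
    if scale == 0 {
        return write!(out, "{}", integer);
    }
    let divisor = 10_i128.pow(scale as u32);
    let quotient = integer / divisor;
    let remainder = integer % divisor;
    if remainder == 0 {
        return write!(out, "{}", quotient);
    }
    // The quotient may be zero, so the sign is written separately.
    let sign = if integer < 0 { "-" } else { "" };
    let digits = format!(
        "{:0width$}",
        remainder.unsigned_abs(),
        width = scale as usize
    );
    write!(
        out,
        "{}{}.{}",
        sign,
        quotient.unsigned_abs(),
        digits.trim_end_matches('0')
    )
}
```

A caller holding an `i32` or `i64` would pass `i128::from(value)`. Whether the extra cost of 128-bit division matters in practice is exactly the open question raised in this thread.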

Contributor

Until we have benchmarks I don't think we can tell. This seems like a good improvement to me simply from a reusability perspective (I would have liked to have a Display impl for Variant at other times too).

@carpecodeum
Contributor

This is great! thank you @scovich !!

Contributor

@alamb alamb left a comment

Thank you @scovich and @carpecodeum for the review

I agree this is a nice improvement

@alamb
Contributor

alamb commented Jun 26, 2025

I merged up to resolve a conflict with this PR

@alamb
Contributor

alamb commented Jun 26, 2025

The clippy failures are not related. See

@alamb
Contributor

alamb commented Jun 26, 2025

To keep the code moving here, I pushed a change to this PR that fixes clippy for variant only. Once CI has passed I will merge it in.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jun 26, 2025
@alamb alamb merged commit f8bcc58 into apache:main Jun 26, 2025
26 of 29 checks passed
Labels

arrow Changes to the arrow crate parquet Changes to the parquet crate

3 participants