@andreatgretel andreatgretel commented Jan 6, 2026

During structured generation, quantities such as prices are typically modeled using Decimal (see, e.g., our tutorial number 2).
Some models emit these quantities sometimes quoted and sometimes unquoted. We currently validate these outputs with gsonschema inside the StructuredResponseRecipe, and for Decimal fields Pydantic generates the following schema:

"anyOf": [
  {"type": "number", "minimum": 10.0, "maximum": 1000.0},
  {"type": "string", "pattern": "^(?!^[-+.]*$)[+-]?0*\\d*\\.?\\d{0,2}0*$"}
]

As a result, both strings and floats pass validation, so the same column can end up holding a mix of the two types, which causes an error when calling dataframe.to_parquet. This has been happening often in our tutorial number 2.
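
To illustrate the mismatch, here is a minimal standard-library sketch. The two JSON payloads are made-up examples of model output, not taken from the tutorial, and the snippet only shows how the mixed types arise, not the parquet failure itself.

```python
import json

# Two hypothetical model outputs for the same schema:
# one leaves the price unquoted, the other quotes it.
raw_outputs = [
    '{"name": "Widget", "price": 189.99}',
    '{"name": "Gadget", "price": "24.50"}',
]

records = [json.loads(raw) for raw in raw_outputs]

# Both records satisfy the anyOf schema above, yet "price" now mixes types.
price_types = {type(record["price"]).__name__ for record in records}
print(sorted(price_types))
```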

The proposed solution is to always convert fields detected as Decimal to strings. The conversion happens inside gsonschema.validators.validate, and a small test is added to verify the behavior.
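
As a rough sketch of the idea (not the actual change in gsonschema.validators.validate): the helpers `looks_like_decimal` and `decimals_to_str` below are hypothetical names, and the detection heuristic, a number option next to a string option with a pattern, is an assumption based on the schema shown above.

```python
from typing import Any


def looks_like_decimal(field_schema: dict[str, Any]) -> bool:
    """Heuristic (assumption): anyOf with a number and a patterned string marks a Decimal."""
    options = field_schema.get("anyOf", [])
    types = {option.get("type") for option in options}
    has_pattern = any(
        option.get("type") == "string" and "pattern" in option for option in options
    )
    return {"number", "string"} <= types and has_pattern


def decimals_to_str(record: dict[str, Any], schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `record` with Decimal-like numeric fields rendered as strings."""
    result = dict(record)
    for name, field_schema in schema.get("properties", {}).items():
        value = result.get(name)
        # bool is a subclass of int, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if looks_like_decimal(field_schema) and is_number:
            result[name] = str(value)
    return result


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {
            "anyOf": [
                {"type": "number", "minimum": 10.0, "maximum": 1000.0},
                {"type": "string", "pattern": r"^(?!^[-+.]*$)[+-]?0*\d*\.?\d{0,2}0*$"},
            ]
        },
    },
}

converted = decimals_to_str({"name": "Widget", "price": 189.99}, schema)
print(converted)
```

Values that already arrive as strings are left untouched, so every record ends up with the same type for the field.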

Comment on lines 211 to 212
# Numeric value should be converted to string
result1 = validate({"name": "Widget", "price": 189.99}, schema)
Contributor

Is this definitely the direction it needs to go? I was thinking that the string would need to be converted to a float. I could definitely be wrong, though.

Contributor Author

I think the point of Decimal is to control the number of digits after the decimal point. I believe that is better accomplished with a string? Floats can only approximate many decimal values, so the stored number may differ slightly from what was written.
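
For illustration, a small standard-library sketch of that concern (not code from this PR):

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)  # False

# Decimal built from strings keeps the digits exactly as written.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))  # True

# A string or Decimal also preserves trailing zeros, which a float drops.
kept = str(Decimal("24.50"))
dropped = str(float("24.50"))
print(kept, dropped)  # 24.50 24.5
```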

Contributor Author

Alright, following an offline discussion: I'm now first converting the value to Decimal to ensure the right precision, and then converting it to float as the final format.
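
A minimal sketch of what that two-step conversion could look like. The function name `to_rounded_float`, the default of two decimal places (chosen to match the `\d{0,2}` in the pattern above), and the rounding mode are all assumptions for illustration, not the merged implementation.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_rounded_float(value, decimal_places=2):
    """Parse via Decimal, round to a fixed number of places, then return a float."""
    # Going through str() avoids inheriting a float's binary expansion.
    exact = Decimal(str(value))
    # Decimal(1).scaleb(-2) is Decimal("0.01"), the quantum for two places.
    quantum = Decimal(1).scaleb(-decimal_places)
    return float(exact.quantize(quantum, rounding=ROUND_HALF_UP))


# Quoted and unquoted inputs end up as the same type.
print(to_rounded_float(189.99), to_rounded_float("24.50"))
```

With this shape, every value in the column is a float regardless of how the model quoted it, while the rounding itself is still done in exact decimal arithmetic.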

@andreatgretel andreatgretel force-pushed the andreatgretel/fix/decimals-in-structgen branch from 8756d75 to 8939a3f Compare January 7, 2026 21:20
@andreatgretel andreatgretel merged commit ca1a7b2 into main Jan 7, 2026
13 checks passed

4 participants