I have the following column defined in my data contract YAML (based on T-SQL):
- name: Print Sequence
  logicalType: number
  physicalType: decimal(12,2)
  description: Print Sequence
  required: true
  classification: internal
  examples:
    - 12.20
    - 13.00
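For reference, this is roughly how I invoke the export (written from memory, so the exact argument order may differ; datacontract.yaml stands in for my contract file):

datacontract export --format spark datacontract.yaml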
After running the CLI export to a Spark schema, the chosen Spark data type is always DECIMAL(38,0):
StructField("Print Sequence",
DecimalType(38, 0),
False,
{"comment": "Print Sequence"}
),
which poses two issues:
- DECIMAL(38,0) has a scale of 0, so it is effectively an integer type (no digits after the decimal point) and cannot represent example values such as 12.20.
- It uses the widest decimal Spark supports (precision 38) instead of respecting the precision and scale defined in the contract, decimal(12,2), from the source system (T-SQL); a sketch of the mapping I would expect follows below.
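For contrast, this is a minimal sketch of the mapping I would expect, parsing the precision and scale out of the contract's physicalType string. The parse_decimal helper and the hard-coded values are mine for illustration, not part of the CLI:

import re

from pyspark.sql.types import DecimalType, StructField

def parse_decimal(physical_type: str) -> DecimalType:
    """Map a T-SQL 'decimal(p,s)' string to the matching Spark DecimalType."""
    match = re.fullmatch(r"decimal\((\d+)\s*,\s*(\d+)\)", physical_type.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a decimal type: {physical_type!r}")
    return DecimalType(int(match.group(1)), int(match.group(2)))

# The field I would expect the exporter to emit for this column:
expected = StructField(
    "Print Sequence",
    parse_decimal("decimal(12,2)"),  # DecimalType(12, 2), not DecimalType(38, 0)
    nullable=False,                  # required: true in the contract
    metadata={"comment": "Print Sequence"},
)

With decimal(12,2) preserved, values like 12.20 round-trip without losing the fractional digits, and the column no longer silently widens to precision 38.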