added dremio data_type_list #843
Merged
💡 Verification agent
🧩 Analysis chain
Dremio type list added — verify aliases and TZ types; consider adding REAL/BOOL
Good addition and consistent with the dispatch pattern. A couple of points to tighten correctness vs Dremio’s reported data_type values:
Proposed minimal improvement (adds REAL and BOOL; leave BIT and TZ types pending verification):
 {% macro dremio__data_type_list(data_type) %}
- {% set string_list = ['VARCHAR', 'CHARACTER VARYING'] | list %}
- {% set numeric_list = ['INT','INTEGER','BIGINT','DOUBLE','DECIMAL','FLOAT','SMALLINT','TINYINT'] | list %}
- {% set timestamp_list = ['DATE','TIME','TIMESTAMP', 'TIME WITH TIME ZONE', 'TIMESTAMP WITH TIME ZONE'] | list %}
- {% set boolean_list = ['BOOLEAN', 'BIT'] | list %}
+ {% set string_list = ['VARCHAR', 'CHARACTER VARYING'] | list %}
+ {% set numeric_list = ['INT','INTEGER','BIGINT','DOUBLE','DECIMAL','FLOAT','REAL','SMALLINT','TINYINT'] | list %}
+ {% set timestamp_list = ['DATE','TIME','TIMESTAMP', 'TIME WITH TIME ZONE', 'TIMESTAMP WITH TIME ZONE'] | list %}
+ {% set boolean_list = ['BOOLEAN', 'BOOL', 'BIT'] | list %}

Additionally, if Dremio reports fixed-length CHAR/CHARACTER in INFORMATION_SCHEMA for some sources, consider including:
Please confirm with Dremio docs and/or INFORMATION_SCHEMA outputs before finalizing removals.
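One way to collect those INFORMATION_SCHEMA outputs is a query along the following lines. This is a minimal sketch, not something taken from the PR: it assumes the standard DATA_TYPE and TABLE_SCHEMA columns on INFORMATION_SCHEMA."COLUMNS", and the schema names in the filter are illustrative and may need adjusting for a given environment.

```sql
-- List every distinct type name the catalog reports, exactly as spelled.
-- "COLUMNS" is quoted defensively in case the identifier is treated as reserved.
SELECT DISTINCT DATA_TYPE
FROM INFORMATION_SCHEMA."COLUMNS"
-- Optional filter (assumed schema names): skip system metadata tables.
WHERE TABLE_SCHEMA NOT IN ('INFORMATION_SCHEMA', 'sys')
ORDER BY DATA_TYPE;
```

The strings returned here, rather than the type names used in DDL, are what the macro's lists need to match.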
To validate support and normalize to actual INFORMATION_SCHEMA values, please check Dremio docs for supported data types and the canonical strings reported in INFORMATION_SCHEMA:
🌐 Web query:
💡 Result:
Dremio does not natively support canonical SQL types SMALLINT, TINYINT, TIME WITH TIME ZONE, or TIMESTAMP WITH TIME ZONE; it maps them to other types. The canonical type names returned in INFORMATION_SCHEMA and Dremio metadata for supported types are as follows:

Support for Specific Types & Canonical Type Names

Supported by Dremio
• BOOLEAN [7][1]
• BOOLEAN [7][1]
• FLOAT (single-precision float) [1][3][4][7]
• INTEGER [1][6][7][9]
• INTEGER [1][6][9]
• TIME (without time zone) [1][6][7]
• TIMESTAMP (without time zone) [1][6][7]
• TIMESTAMP or TIMESTAMP (without TZ information; Dremio ignores/strips the time zone part) [6][7]

Not Supported as Canonical Types
• TIME (no TZ) [6][7]
• TIMESTAMP (no TZ) [6][7]
• INTEGER (not as "SMALLINT" or "TINYINT") [1][6][7][9]

Exact Canonical Type Names

The values Dremio reports in metadata (including INFORMATION_SCHEMA.COLUMNS) are the canonical type names listed below (case-sensitive, all uppercase):

These are the types you will observe when querying for column data types in Dremio's metadata tables and via INFORMATION_SCHEMA [1][3][4][6][7].

Mappings from External Sources
• SMALLINT, TINYINT, BIT → INTEGER; BOOL/BIT → BOOLEAN; REAL → FLOAT; no support for TIME WITH TIME ZONE or true TIMESTAMP WITH TIME ZONE (mapped to TIME/TIMESTAMP) [1].
• BOOL/BIT → BOOLEAN; SMALLINT/TINYINT → INTEGER; REAL → FLOAT; TIMESTAMPTZ/TIMETZ → TIMESTAMP/TIME (zone info stripped) [7].
• SMALLINT/TINYINT/BYTEINT → INTEGER; TIME WITH TIMEZONE → TIME; TIMESTAMP WITH TIMEZONE → TIMESTAMP [6].

Summary

If you query INFORMATION_SCHEMA.COLUMNS, these canonical types (INTEGER, FLOAT, BOOLEAN, TIME, TIMESTAMP, etc.) will be returned, not the original SQL type names from the source system [1][3][6][7][9].

If you need to preserve the distinction between, for example, SMALLINT/TINYINT and INTEGER or retain time zone information, Dremio does not natively support this at the type level. You'll need to manage such semantics in your data model or ETL process outside Dremio.
Citations:
Update Dremio data_type_list to use only canonical INFORMATION_SCHEMA types
Dremio only returns the uppercase canonical type names—it does not emit SMALLINT, TINYINT, REAL, BOOL, BIT, or TZ-qualified types in INFORMATION_SCHEMA.COLUMNS. Tighten the macro to match exactly what Dremio reports:
• File: macros/utils/data_types/data_type_list.sql
• Lines: 155–173
 {% macro dremio__data_type_list(data_type) %}
- {% set string_list = ['VARCHAR', 'CHARACTER VARYING'] | list %}
- {% set numeric_list = ['INT','INTEGER','BIGINT','DOUBLE','DECIMAL','FLOAT','SMALLINT','TINYINT'] | list %}
- {% set timestamp_list = ['DATE','TIME','TIMESTAMP', 'TIME WITH TIME ZONE', 'TIMESTAMP WITH TIME ZONE'] | list %}
- {% set boolean_list = ['BOOLEAN', 'BIT'] | list %}
+ {% set string_list = ['VARCHAR'] | list %}
+ {% set numeric_list = ['INTEGER','BIGINT','FLOAT','DOUBLE','DECIMAL'] | list %}
+ {% set timestamp_list = ['DATE','TIME','TIMESTAMP'] | list %}
+ {% set boolean_list = ['BOOLEAN'] | list %}
 {%- if data_type == 'string' %}
 {{ return(string_list) }}
@@ unchanged @@
 {% endmacro %}

This aligns the lists with what Dremio’s INFORMATION_SCHEMA actually returns.
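Before trimming the lists this far, it may be worth checking whether any reported type would fall outside them. The query below is a hedged sketch, assuming the standard DATA_TYPE column on INFORMATION_SCHEMA."COLUMNS"; the literal list simply mirrors the four tightened lists in the suggestion and is not taken from Dremio documentation.

```sql
-- Report any type name that none of the tightened lists would match.
-- An empty result suggests the trimmed lists cover this environment;
-- any rows returned (for example a different spelling of the string type)
-- are candidates to keep in the macro.
SELECT DATA_TYPE, COUNT(*) AS column_count
FROM INFORMATION_SCHEMA."COLUMNS"
WHERE DATA_TYPE NOT IN (
    'VARCHAR',                                          -- string_list
    'INTEGER', 'BIGINT', 'FLOAT', 'DOUBLE', 'DECIMAL',  -- numeric_list
    'DATE', 'TIME', 'TIMESTAMP',                        -- timestamp_list
    'BOOLEAN'                                           -- boolean_list
)
GROUP BY DATA_TYPE
ORDER BY column_count DESC;
```

Types that legitimately belong to no category (binary or nested types, for instance) may also appear here, so the output needs a manual read rather than a pass/fail interpretation.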
📝 Committable suggestion