-
Notifications
You must be signed in to change notification settings - Fork 122
Ele 4931 dremio types mapping #844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds a new macro Changes
Sequence Diagram(s)(omitted — changes are a macro addition and small test adjustments; no new runtime control flow to diagram) Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Possibly related PRs
Suggested reviewers
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
|
👋 @GuyEshdat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
macros/utils/data_types/data_type_list.sql (1)
155-173: Dremio mapping added — consider case coverage and a few type synonyms; verify BIT and “WITH TIME ZONE”Good addition and consistent with the adapter pattern. Two follow-ups to improve robustness across information_schema/introspection outputs:
- Include upper- and lower-case variants to avoid case-sensitivity mismatches (other adapters trend toward matching the adapter’s native casing; Dremio often surfaces uppercase).
- Consider adding common string synonyms CHAR/CHARACTER.
- Verify whether Dremio surfaces BIT as BOOLEAN and whether TIME/TIMESTAMP WITH TIME ZONE are valid type names. If yes, include both cases; if not, drop to reduce false positives.
Proposed update (keeps BIT and WITH TIME ZONE entries but adds uppercase variants and CHAR/CHARACTER):
{% macro dremio__data_type_list(data_type) %} - {% set string_list = ['varchar', 'character varying'] | list %} - {% set numeric_list = ['int','integer','bigint','double','decimal','float','smallint','tinyint'] | list %} - {% set timestamp_list = ['date','time','timestamp', 'time with time zone', 'timestamp with time zone'] | list %} - {% set boolean_list = ['boolean', 'bit'] | list %} + {# Include both lower/upper-case synonyms to match information_schema/introspection outputs #} + {% set string_list = ['varchar','character varying','char','character','VARCHAR','CHARACTER VARYING','CHAR','CHARACTER'] | list %} + {% set numeric_list = ['int','integer','bigint','double','decimal','float','smallint','tinyint','INT','INTEGER','BIGINT','DOUBLE','DECIMAL','FLOAT','SMALLINT','TINYINT'] | list %} + {% set timestamp_list = ['date','time','timestamp','time with time zone','timestamp with time zone','DATE','TIME','TIMESTAMP','TIME WITH TIME ZONE','TIMESTAMP WITH TIME ZONE'] | list %} + {% set boolean_list = ['boolean','bit','BOOLEAN','BIT'] | list %} {%- if data_type == 'string' %} {{ return(string_list) }} {%- elif data_type == 'numeric' %} {{ return(numeric_list) }} {%- elif data_type == 'timestamp' %} {{ return(timestamp_list) }} {%- elif data_type == "boolean" %} {{ return(boolean_list) }} {%- else %} {{ return([]) }} {%- endif %} {% endmacro %}If docs confirm BIT is not a Dremio type, or WITH TIME ZONE variants are invalid, we should remove those entries to avoid misclassification. I can adjust the diff accordingly once confirmed.
Would you like me to check Dremio’s latest type docs and update the lists precisely, or add adapter-level unit tests to validate the classification against a mocked information_schema?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
macros/utils/data_types/data_type_list.sql(2 hunks)
🔇 Additional comments (2)
macros/utils/data_types/data_type_list.sql (2)
132-132: No-op whitespace changeTrivial whitespace-only change. No action needed.
154-154: No-op empty line additionThis is a formatting-only change. No action needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
integration_tests/tests/test_exposure_schema_validity.py (1)
58-62: Rename variable for clarity; confirm “other” is the intended Dremio mappingAdding "dremio" makes the name
explicit_target_for_bigquerymisleading. A more neutral name clarifies intent. Also, please confirm that "other" is the correct canonical value for the referenced string dtype on Dremio in this test context.Apply this small rename for clarity:
- explicit_target_for_bigquery = ( + exposure_string_dtype = ( "other" if dbt_project.dbt_runner.target in ["bigquery", "snowflake", "dremio", ""] else "string" ) ... - "data_type": explicit_target_for_bigquery, + "data_type": exposure_string_dtype,Also applies to: 71-73
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
integration_tests/tests/test_exposure_schema_validity.py(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-10T11:29:19.004Z
Learnt from: GuyEshdat
PR: elementary-data/dbt-data-reliability#838
File: integration_tests/tests/dbt_project.py:191-201
Timestamp: 2025-08-10T11:29:19.004Z
Learning: In the Elementary dbt package integration tests, BigQuery works correctly with the default `("database", "schema")` property mapping in the `get_database_and_schema_properties` function. When using `target.database` and `target.schema` in source definitions, BigQuery's dbt adapter handles these references appropriately without requiring special mapping to `project` and `dataset`.
Applied to files:
integration_tests/tests/test_exposure_schema_validity.py
🧬 Code Graph Analysis (1)
integration_tests/tests/test_exposure_schema_validity.py (1)
integration_tests/tests/conftest.py (2)
dbt_project(88-89)target(93-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: test (latest_official, redshift) / test
- GitHub Check: test (1.8.0, postgres) / test
- GitHub Check: test (latest_official, dremio) / test
- GitHub Check: test (latest_official, snowflake) / test
- GitHub Check: test (latest_official, postgres) / test
- GitHub Check: test (latest_official, athena) / test
- GitHub Check: test (latest_official, bigquery) / test
- GitHub Check: test (latest_official, clickhouse) / test
- GitHub Check: test (latest_pre, postgres) / test
- GitHub Check: test (latest_official, databricks_catalog) / test
- GitHub Check: test (latest_official, trino) / test
🔇 Additional comments (1)
integration_tests/tests/test_exposure_schema_validity.py (1)
125-134: Resolved: Dremio numeric mapping is correctThe
dremio__data_type_listmacro’snumeric_listincludes both"int"and"integer", so mapping Dremio to"int"in the test aligns perfectly with the adapter. No changes required here.
Summary by CodeRabbit
New Features
Tests