[Bug] If the number of objects in the schema = max_results_per_iter - dbt errors with Tables contain columns with the same names, but different types #1827
Open
Labels
triage:product (In Product's queue) · type:bug (Something isn't working as documented)
Description
Is this a new bug?
- I believe this is a new bug
- I have searched the existing issues, and I could not find an existing issue for this bug
Which packages are affected?
- dbt-adapters
- dbt-tests-adapter
- dbt-athena
- dbt-athena-community
- dbt-bigquery
- dbt-postgres
- dbt-redshift
- dbt-snowflake
- dbt-spark
Current Behavior
- dbt's list_relations_without_caching macro paginates once it sees max_results_per_iter (default 1000) objects in a schema.
- When the number of objects in the schema is an exact multiple of the page size, the final page comes back empty and dbt errors with the `Tables contain columns with the same names, but different types` message from the title.
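The off-by-one is visible in the loop's exit condition: it only stops once a page comes back with fewer rows than `max_results_per_iter`, so a schema whose object count is an exact multiple of the page size always triggers one extra `show objects` query that returns an empty page. A minimal Python sketch of that logic (an illustration, not dbt's actual code):

```python
def paginate(objects, page_size):
    """Mimic the macro's pagination: keep querying until a page is short."""
    pages = []
    watermark = None
    while True:
        # Simulate `show objects ... limit <page_size> from '<watermark>'`
        start = objects.index(watermark) + 1 if watermark is not None else 0
        page = objects[start:start + page_size]
        pages.append(page)
        if page:
            watermark = page[-1]
        if len(page) < page_size:  # the only exit condition
            break
    return pages

# 1 object in the schema, page size 1: a second, empty page is fetched,
# mirroring the `SQL status: SUCCESS 0` query in the debug log.
print(paginate(["BAR"], 1))  # → [['BAR'], []]
```

Merging that empty result with the first page then fails, because the empty page's columns are all typed as `Integer` while the first page has `Text`/`DateTime` columns (compare the two printed tables in the debug log).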
Expected Behavior
No error.
Steps To Reproduce
- Project setup:

```yaml
# dbt_project.yml
name: dbt_basic_5
profile: sf
version: "1.0.0"
models:
  dbt_basic_5:
    +materialized: table
```

- Create an empty schema:

```sql
drop schema db.sch;
create schema db.sch;
```

- Modify the list_relations_without_caching macro:
```sql
{% macro snowflake__list_relations_without_caching(schema_relation, max_iter=10000, max_results_per_iter=1) %}
    {%- if schema_relation is string -%}
        {%- set schema = schema_relation -%}
    {%- else -%}
        {%- set schema = schema_relation.include(identifier=False) -%}
    {%- endif -%}

    {%- set max_results_per_iter = adapter.config.flags.get('list_relations_per_page', max_results_per_iter) -%}
    {%- set max_iter = adapter.config.flags.get('list_relations_page_limit', max_iter) -%}

    {%- set too_many_relations_msg -%}
        dbt is currently configured to list a maximum of {{ max_results_per_iter * max_iter }} objects per schema.
        {{ schema }} exceeds this limit. If this is expected, you may configure this limit
        by setting list_relations_per_page and list_relations_page_limit in your project flags.
        It is recommended to start by increasing list_relations_page_limit.
    {%- endset -%}

    {%- set paginated_state = namespace(paginated_results=[], watermark=none) -%}

    {#-
        loop an extra time to catch the breach of max iterations
        Note: while range is 0-based, loop.index starts at 1
    -#}
    {%- for _ in range(max_iter + 1) -%}
        {#- raise a warning and break if we still didn't exit and we're beyond the max iterations limit -#}
        {%- if loop.index == max_iter + 1 -%}
            {%- do exceptions.warn(too_many_relations_msg) -%}
            {%- break -%}
        {%- endif -%}

        {%- set show_objects_sql = snowflake__show_objects_sql(schema, max_results_per_iter, paginated_state.watermark) -%}
        {%- set paginated_result = run_query(show_objects_sql) -%}
        {% do print(schema) %}
        {% do print(paginated_result) %}
        {%- do paginated_state.paginated_results.append(paginated_result) -%}
        {%- set paginated_state.watermark = paginated_result.columns.get('name').values()[-1] -%}

        {#- we got fewer results than max_results_per_iter (including 0), meaning we reached the end -#}
        {%- if (paginated_result | length) < max_results_per_iter -%}
            {%- break -%}
        {%- endif -%}
    {%- endfor -%}

    {#- grab the first table in the paginated results to access the `merge` method -#}
    {%- set agate_table = paginated_state.paginated_results[0] -%}
    {%- do return(agate_table.merge(paginated_state.paginated_results)) -%}
{% endmacro %}
```

^ I'm setting max_results_per_iter to 1, which effectively means: if the schema contains even 1 table, we paginate.
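Given that exit condition, the number of `show objects` queries issued for a schema of `n` objects is always `n // page_size + 1` (ignoring the max_iter cap) — and whenever `n` is an exact multiple of the page size, the final query returns an empty page. A hypothetical helper, not part of dbt, just to make the count concrete:

```python
def queries_issued(n_objects: int, page_size: int) -> int:
    """Queries the macro issues before its exit condition fires
    (ignoring the max_iter cap): the loop stops only after a page
    returns fewer rows than page_size, so an exact multiple of the
    page size always costs one extra, empty query."""
    return n_objects // page_size + 1

print(queries_issued(1, 1))        # repro case: 2 queries, the 2nd is empty
print(queries_issued(1000, 1000))  # default page size: same failure shape at exactly 1000 objects
```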
- Do a double run:

```shell
$ dbt run && dbt run --debug
```

Relevant log output

```text
23:02:19 Running with dbt=1.11.7
23:02:20 Registered adapter: snowflake=1.11.3
23:02:20 Found 1 model, 523 macros
23:02:20
23:02:20 Concurrency: 4 threads (target='ci')
23:02:20
db.sch
| column          | data_type |
| --------------- | --------- |
| created_on      | Integer   |
| name            | Integer   |
| database_name   | Integer   |
| schema_name     | Integer   |
| kind            | Integer   |
| comment         | Integer   |
| cluster_by      | Integer   |
| rows            | Integer   |
| bytes           | Integer   |
| owner           | Integer   |
| retention_time  | Integer   |
| owner_role_type | Integer   |
| is_hybrid       | Integer   |
| is_dynamic      | Integer   |
| is_iceberg      | Integer   |
| is_interactive  | Integer   |
23:02:21 1 of 1 START sql table model sch.bar ........................................... [RUN]
23:02:22 1 of 1 OK created sql table model sch.bar ...................................... [SUCCESS 1 in 1.80s]
23:02:23
23:02:23 Finished running 1 table model in 0 hours 0 minutes and 2.86 seconds (2.86s).
23:02:23
23:02:23 Completed successfully
23:02:23
23:02:23 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 NO-OP=0 TOTAL=1
23:02:24 Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1142bf010>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11429b650>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1142d71d0>]}
23:02:24 Running with dbt=1.11.7
23:02:24 running dbt with arguments {'empty': 'False', 'version_check': 'True', 'no_print': 'None', 'profiles_dir': '/Users/jeremy/.dbt', 'log_format': 'default', 'log_path': '/Users/jeremy/git/dbt-basic/logs', 'warn_error_options': 'WarnErrorOptionsV2(error=[], warn=[], silence=[])', 'log_cache_events': 'False', 'write_json': 'True', 'indirect_selection': 'eager', 'use_colors': 'True', 'static_parser': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'fail_fast': 'False', 'introspect': 'True', 'target_path': 'None', 'warn_error': 'None', 'invocation_command': 'dbt run --debug', 'use_experimental_parser': 'False', 'send_anonymous_usage_stats': 'True', 'debug': 'True', 'quiet': 'False', 'printer_width': '80'}
23:02:24 Snowflake adapter: Setting snowflake.connector to ERROR (file logging only)
23:02:24 Snowflake adapter: Setting botocore to ERROR (file logging only)
23:02:24 Snowflake adapter: Setting boto3 to ERROR (file logging only)
23:02:25 Sending event: {'category': 'dbt', 'action': 'project_id', 'label': 'a5210b3e-e2f0-4fae-9890-da8fe609d703', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1131e4250>]}
23:02:25 Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': 'a5210b3e-e2f0-4fae-9890-da8fe609d703', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11288e1d0>]}
23:02:25 Registered adapter: snowflake=1.11.3
23:02:25 checksum: b80718358155fc17c3559352095564dd56ada1cc1a1c0ef2e62c857e71583775, vars: {}, profile: , target: , version: 1.11.7
23:02:25 Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
23:02:25 Nothing changed, skipping partial parsing.
23:02:25 Partial parsing enabled, no changes found, skipping parsing
23:02:25 Sending event: {'category': 'dbt', 'action': 'load_project', 'label': 'a5210b3e-e2f0-4fae-9890-da8fe609d703', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1142c7890>]}
23:02:25 Wrote artifact WritableManifest to /Users/jeremy/git/dbt-basic/target/manifest.json
23:02:25 Wrote artifact SemanticManifest to /Users/jeremy/git/dbt-basic/target/semantic_manifest.json
23:02:25 Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': 'a5210b3e-e2f0-4fae-9890-da8fe609d703', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x124f44550>]}
23:02:25 Found 1 model, 523 macros
23:02:25 Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': 'a5210b3e-e2f0-4fae-9890-da8fe609d703', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1142cf910>]}
23:02:25
23:02:25 Concurrency: 4 threads (target='ci')
23:02:25
23:02:25 Acquiring new snowflake connection 'master'
23:02:25 Acquiring new snowflake connection 'list_db'
23:02:25 Using snowflake connection "list_db"
23:02:25 On list_db: show terse schemas in database db
    limit 10000
/* {"app": "dbt", "dbt_version": "1.11.7", "profile_name": "sf", "target_name": "ci", "connection_name": "list_db"} */
23:02:25 Opening a new connection, currently in state init
23:02:25 SQL status: SUCCESS 55 in 0.583 seconds
23:02:25 Re-using an available connection from the pool (formerly list_db, now list_db_sch)
23:02:25 Using snowflake connection "list_db_sch"
23:02:25 On list_db_sch: show objects in db.sch
    limit 1
/* {"app": "dbt", "dbt_version": "1.11.7", "profile_name": "sf", "target_name": "ci", "connection_name": "list_db_sch"} */;
23:02:26 SQL status: SUCCESS 1 in 0.152 seconds
db.sch
| column          | data_type |
| --------------- | --------- |
| created_on      | DateTime  |
| name            | Text      |
| database_name   | Text      |
| schema_name     | Text      |
| kind            | Text      |
| comment         | Text      |
| cluster_by      | Text      |
| rows            | Integer   |
| bytes           | Integer   |
| owner           | Text      |
| retention_time  | Text      |
| owner_role_type | Text      |
| is_hybrid       | Text      |
| is_dynamic      | Text      |
| is_iceberg      | Text      |
| is_interactive  | Text      |
23:02:26 Using snowflake connection "list_db_sch"
23:02:26 On list_db_sch: show objects in db.sch
    limit 1
    from 'BAR'
/* {"app": "dbt", "dbt_version": "1.11.7", "profile_name": "sf", "target_name": "ci", "connection_name": "list_db_sch"} */;
23:02:26 SQL status: SUCCESS 0 in 0.148 seconds
db.sch
| column          | data_type |
| --------------- | --------- |
| created_on      | Integer   |
| name            | Integer   |
| database_name   | Integer   |
| schema_name     | Integer   |
| kind            | Integer   |
| comment         | Integer   |
| cluster_by      | Integer   |
| rows            | Integer   |
| bytes           | Integer   |
| owner           | Integer   |
| retention_time  | Integer   |
| owner_role_type | Integer   |
| is_hybrid       | Integer   |
| is_dynamic      | Integer   |
| is_iceberg      | Integer   |
| is_interactive  | Integer   |
23:02:26 Snowflake adapter: Error running SQL: macro list_relations_without_caching
23:02:26 Snowflake adapter: Rolling back transaction.
23:02:26 Connection 'master' was properly closed.
23:02:26 Connection 'list_db_sch' was left open.
23:02:26 On list_db_sch: Close
23:02:26
23:02:26 Finished running in 0 hours 0 minutes and 1.13 seconds (1.13s).
23:02:26 Encountered an error:
Compilation Error
  Tables contain columns with the same names, but different types.

  > in macro list_relations_without_caching (macros/adapters/metadata.sql)
  > called by <Unknown>
23:02:26 Resource report: {"command_name": "run", "command_success": false, "command_wall_clock_time": 1.8282759, "process_in_blocks": "0", "process_kernel_time": 0.235137, "process_mem_max_rss": "227311616", "process_out_blocks": "0", "process_user_time": 1.308286}
23:02:26 Command `dbt run` failed at 16:02:26.352880 after 1.83 seconds
23:02:26 Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10516e4d0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1265e9610>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1142a37d0>]}
23:02:26 Flushing usage events
23:02:26 An error was encountered while trying to flush usage events
```
Environment
- OS: macOS
- Python: 3.11
- dbt-adapters: 1.22.9
- dbt-snowflake: 1.11.3

Additional Context
No response