-
Notifications
You must be signed in to change notification settings - Fork 204
Add Databricks 1.10.2 compatibility patch #1937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
""" WalkthroughThe Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CommandLineDbtRunner
participant DatabricksPatch
participant dbt.adapters.databricks
User->>CommandLineDbtRunner: Instantiate
CommandLineDbtRunner->>CommandLineDbtRunner: __init__()
CommandLineDbtRunner->>DatabricksPatch: apply_databricks_patch()
DatabricksPatch->>dbt.adapters.databricks: Import parse_model module
DatabricksPatch->>dbt.adapters.databricks: Monkey-patch functions with safe wrappers
dbt.adapters.databricks-->>DatabricksPatch: Patch success/failure
DatabricksPatch-->>CommandLineDbtRunner: Return patch result
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
👋 @arbiv |
62d7976 to
b666dbe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
elementary/clients/dbt/command_line_dbt_runner.py(2 hunks)
🧰 Additional context used
🪛 Ruff (0.11.9)
elementary/clients/dbt/command_line_dbt_runner.py
330-330: logging imported but unused; consider using importlib.util.find_spec to test for availability
(F401)
340-340: f-string without any placeholders
Remove extraneous f prefix
(F541)
🪛 GitHub Actions: Run pre-commit hooks
elementary/clients/dbt/command_line_dbt_runner.py
[error] 329-329: flake8: 'logging' imported but unused (F401)
[error] 342-342: flake8: f-string is missing placeholders (F541)
[error] 331-331: mypy: Cannot find implementation or library stub for module named "dbt.adapters.databricks" [import-not-found]
[error] pre-commit hook 'black' reformatted the file. Run 'black --write' to fix code style issues.
[error] pre-commit hook 'isort' modified the file to fix import sorting.
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: test / test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
elementary/clients/dbt/databricks_patch.py (2)
13-31: Consider making the default catalog configurable.Hard-coding "unity" as the default catalog might not be appropriate for all Databricks environments, as some users might be using different catalog names.
-def safe_catalog_name(model: Any) -> str: +def safe_catalog_name(model: Any, default_catalog: str = "unity") -> str: try: if is_unsupported_object(model): logger.debug( "Received unsupported object type for catalog_name, using unity as default" ) - return "unity" + return default_catalog # Handle RelationConfig objects if hasattr(model, "config") and model.config and hasattr(model.config, "get"): catalog = model.config.get("catalog") if catalog: return catalog # Fallback to unity catalog - return "unity" + return default_catalog except Exception as e: logger.debug( f"Failed to parse catalog name from model: {e}. Using unity as default." ) - return "unity" + return default_catalog
1-124: Consider the maintainability implications of monkey patching.While this patch addresses immediate compatibility issues, monkey patching can make debugging difficult and may mask underlying problems. Consider documenting the specific issues this patch addresses and establishing a plan for its removal once the upstream issues are resolved.
Add comprehensive documentation about:
- The specific dbt-databricks 1.10.2 issues this patch addresses
- Expected timeline for upstream fixes
- Instructions for testing whether the patch is still needed
- Monitoring to detect if the patch causes unintended side effects
Consider implementing a feature flag or configuration option to enable/disable the patch for easier testing and rollback if needed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
elementary/clients/dbt/command_line_dbt_runner.py(3 hunks)elementary/clients/dbt/databricks_patch.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- elementary/clients/dbt/command_line_dbt_runner.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: code-quality
| def is_unsupported_object(model: Any) -> bool: | ||
| """Check if the object is a Macro or other unsupported type""" | ||
| return hasattr(model, "__class__") and "Macro" in str(model.__class__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Use more robust type checking instead of string-based detection.
The current approach of checking for "Macro" in the string representation of the class is fragile and could break with different Python versions or object representations.
def is_unsupported_object(model: Any) -> bool:
- """Check if the object is a Macro or other unsupported type"""
- return hasattr(model, "__class__") and "Macro" in str(model.__class__)
+ """Check if the object is a Macro or other unsupported type"""
+ try:
+ # Check for specific dbt Macro types
+ return (hasattr(model, "__class__") and
+ (model.__class__.__name__ == "Macro" or
+ "Macro" in [cls.__name__ for cls in model.__class__.__mro__]))
+ except (AttributeError, TypeError):
+ return False📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def is_unsupported_object(model: Any) -> bool: | |
| """Check if the object is a Macro or other unsupported type""" | |
| return hasattr(model, "__class__") and "Macro" in str(model.__class__) | |
| def is_unsupported_object(model: Any) -> bool: | |
| """Check if the object is a Macro or other unsupported type""" | |
| try: | |
| # Check for specific dbt Macro types | |
| return (hasattr(model, "__class__") and | |
| (model.__class__.__name__ == "Macro" or | |
| "Macro" in [cls.__name__ for cls in model.__class__.__mro__])) | |
| except (AttributeError, TypeError): | |
| return False |
🤖 Prompt for AI Agents
In elementary/clients/dbt/databricks_patch.py around lines 8 to 10, replace the
fragile string-based check for "Macro" in the class name with a more robust type
checking method. Use isinstance or check the class directly against the known
Macro type or its base class to determine if the object is a Macro, ensuring
compatibility across Python versions and avoiding reliance on string
representations.
| def safe_get( | ||
| model: Any, setting: str, case_sensitive: Union[bool, None] = False | ||
| ) -> Union[str, None]: | ||
| try: | ||
| if is_unsupported_object(model): | ||
| return None | ||
| # Check if model has config attribute | ||
| if not hasattr(model, "config") or not model.config: | ||
| return None | ||
| # Check if config has get method | ||
| if not hasattr(model.config, "get"): | ||
| return None | ||
| value = model.config.get(setting) | ||
| if value: | ||
| return value if case_sensitive else value.lower() | ||
| return None | ||
| except Exception as e: | ||
| logger.debug(f"Failed to get {setting} from model config: {e}") | ||
| return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix inconsistent parameter usage in safe_get function.
The case_sensitive parameter defaults to False but is used in a way that suggests it should default to True or None. When case_sensitive=False, the function calls .lower() on the value, but this could cause issues if the value is not a string.
def safe_get(
- model: Any, setting: str, case_sensitive: Union[bool, None] = False
+ model: Any, setting: str, case_sensitive: bool = True
) -> Union[str, None]:
try:
if is_unsupported_object(model):
return None
# Check if model has config attribute
if not hasattr(model, "config") or not model.config:
return None
# Check if config has get method
if not hasattr(model.config, "get"):
return None
value = model.config.get(setting)
if value:
- return value if case_sensitive else value.lower()
+ if isinstance(value, str) and not case_sensitive:
+ return value.lower()
+ return str(value) if value is not None else None
return None
except Exception as e:
logger.debug(f"Failed to get {setting} from model config: {e}")
return None📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def safe_get( | |
| model: Any, setting: str, case_sensitive: Union[bool, None] = False | |
| ) -> Union[str, None]: | |
| try: | |
| if is_unsupported_object(model): | |
| return None | |
| # Check if model has config attribute | |
| if not hasattr(model, "config") or not model.config: | |
| return None | |
| # Check if config has get method | |
| if not hasattr(model.config, "get"): | |
| return None | |
| value = model.config.get(setting) | |
| if value: | |
| return value if case_sensitive else value.lower() | |
| return None | |
| except Exception as e: | |
| logger.debug(f"Failed to get {setting} from model config: {e}") | |
| return None | |
| def safe_get( | |
| model: Any, setting: str, case_sensitive: bool = True | |
| ) -> Union[str, None]: | |
| try: | |
| if is_unsupported_object(model): | |
| return None | |
| # Check if model has config attribute | |
| if not hasattr(model, "config") or not model.config: | |
| return None | |
| # Check if config has get method | |
| if not hasattr(model.config, "get"): | |
| return None | |
| value = model.config.get(setting) | |
| if value: | |
| if isinstance(value, str) and not case_sensitive: | |
| return value.lower() | |
| return str(value) if value is not None else None | |
| return None | |
| except Exception as e: | |
| logger.debug(f"Failed to get {setting} from model config: {e}") | |
| return None |
🤖 Prompt for AI Agents
In elementary/clients/dbt/databricks_patch.py between lines 78 and 96, the
safe_get function uses the case_sensitive parameter inconsistently by defaulting
it to False and then calling .lower() on the value, which may not be a string.
To fix this, change the default of case_sensitive to True or None to reflect
that values should be returned as-is by default, and add a type check before
calling .lower() to ensure the value is a string when case_sensitive is False.
|
@arbiv are there any reasons why this PR is not yet marked as ready for review? Is there any help needed? We are critically blocked by that elementary downside, as we need to use >=1.10.5 of dbt-databricks. |
|
Is there a timeline of when this PR is going to be updated and merged? |
|
Hey all, we were a bit cautious about this fix because it patches internal |
null
Summary by CodeRabbit