Bug: composing duplicate queries throws an error #620

Merged
kbolashev merged 1 commit into main from bug/duplicate-tree-composition on Jul 20, 2025

@kbolashev
Member

To reproduce:

# Assuming there's a ds already
ds2 = ds["x"] > 5
ds3 = (ds2["y"] < 10) | (ds2["y"] > 20)

Before this PR, composing queries like this threw a DuplicateKeyError, because both operands of the final | share the ds2 subquery.
The fix is to deep-copy each operand's tree every time two query trees are composed.
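
A minimal sketch of why the composition fails, assuming the query tree keeps all of its nodes in one registry keyed by node identifier. `ToyTree`, `compose`, the node ids, and the local `DuplicateKeyError` class are hypothetical stand-ins written for this illustration, not the code in `dagshub/data_engine/model/query.py`:

```python
# Toy model only: none of these names come from the dagshub codebase.
class DuplicateKeyError(Exception):
    """Stand-in for the error named in the PR description."""


class ToyTree:
    """A tree that keeps every node in one dict keyed by node id."""

    def __init__(self):
        self.nodes = {}      # node id -> label
        self.children = {}   # node id -> list of child ids
        self.root = None

    def add_node(self, node_id, label, parent=None):
        if node_id in self.nodes:
            raise DuplicateKeyError(f"node id already in tree: {node_id}")
        self.nodes[node_id] = label
        self.children[node_id] = []
        if parent is None:
            self.root = node_id
        else:
            self.children[parent].append(node_id)

    def paste(self, parent, other):
        """Attach every node of `other` under `parent`, keeping its ids."""
        def walk(node_id, new_parent):
            self.add_node(node_id, other.nodes[node_id], parent=new_parent)
            for child in other.children[node_id]:
                walk(child, node_id)
        walk(other.root, parent)


def compose(op, left, right):
    """Build a new tree rooted at `op` with both operand trees pasted below."""
    combined = ToyTree()
    combined.add_node(op, op)
    combined.paste(op, left)
    combined.paste(op, right)
    return combined


def leaf(node_id, label):
    tree = ToyTree()
    tree.add_node(node_id, label)
    return tree


# Models ds2 = ds["x"] > 5
x_gt_5 = leaf("x>5", "x > 5")

# Models ds2["y"] < 10 and ds2["y"] > 20: both reuse the same "x>5" subtree
left = compose("and_left", x_gt_5, leaf("y<10", "y < 10"))
right = compose("and_right", x_gt_5, leaf("y>20", "y > 20"))

# Models the final "|": the second paste hits the id "x>5" again
try:
    compose("or", left, right)
    collided = False
except DuplicateKeyError:
    collided = True

print(collided)
```

In this toy model the collision comes from reusing the ids of the shared subtree, so any fix has to hand each pasted copy its own identifiers.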

@kbolashev kbolashev requested a review from Copilot July 20, 2025 11:47
@kbolashev kbolashev self-assigned this Jul 20, 2025
@kbolashev kbolashev added the bug Something isn't working label Jul 20, 2025
@dagshub

dagshub bot commented Jul 20, 2025

@kbolashev kbolashev changed the title from "Bug: composing duplicate query throws an error" to "Bug: composing duplicate queries throws an error" Jul 20, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR fixes a bug where composing queries with duplicate subqueries would throw a DuplicateKeyError due to node identifier conflicts in the query tree structure. The solution implements deep cloning of operand trees with unique identifiers to prevent conflicts when the same subquery is reused.

Key changes:

  • Added deep cloning functionality to prevent node identifier conflicts in query composition
  • Added a test case that reproduces the duplicate subquery scenario and validates the fix
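
The cloning idea described above can be sketched as follows. This is an illustration under stated assumptions, not the merged implementation: `Node`, `clone_with_fresh_ids`, `compose`, and `all_ids` are hypothetical names, and the only library calls used are `uuid.uuid4` and `dataclasses` from the standard library.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical query-tree node; each instance gets a random UUID as its id."""
    label: str
    node_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    children: List["Node"] = field(default_factory=list)


def clone_with_fresh_ids(node: Node) -> Node:
    """Copy the subtree under `node`, keeping labels and shape but not ids."""
    return Node(
        label=node.label,
        children=[clone_with_fresh_ids(child) for child in node.children],
    )


def compose(op: str, left: Node, right: Node) -> Node:
    """Combine two operand trees, cloning both so no node id is reused."""
    return Node(
        label=op,
        children=[clone_with_fresh_ids(left), clone_with_fresh_ids(right)],
    )


def all_ids(node: Node) -> List[str]:
    """Collect the ids of every node in the subtree, root first."""
    ids = [node.node_id]
    for child in node.children:
        ids.extend(all_ids(child))
    return ids


# The same subquery is reused on both sides, as in the reproduction case
shared = Node("x > 5")
left = compose("and", shared, Node("y < 10"))
right = compose("and", shared, Node("y > 20"))
both = compose("or", left, right)

ids = all_ids(both)
print(len(ids) == len(set(ids)))
```

Cloning on every composition costs one extra copy of each operand tree, but it means callers can reuse a subquery any number of times without tracking which identifiers are already in use.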

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| dagshub/data_engine/model/query.py | Adds deep cloning logic with UUID-based node identifiers to resolve duplicate key conflicts |
| tests/data_engine/test_querying.py | Adds test case reproducing and validating the fix for duplicate subquery composition |

@kbolashev kbolashev merged commit 725b19a into main Jul 20, 2025
8 checks passed
@kbolashev kbolashev deleted the bug/duplicate-tree-composition branch July 20, 2025 11:55