Skip to content

SNOW-1843881: Change StructType columns to return Row objects#2820

Merged
sfc-gh-jrose merged 27 commits intomainfrom
jrose_snow_1843881_struct_return_row
Jan 8, 2025
Merged

SNOW-1843881: Change StructType columns to return Row objects#2820
sfc-gh-jrose merged 27 commits intomainfrom
jrose_snow_1843881_struct_return_row

Conversation

@sfc-gh-jrose
Copy link
Contributor

@sfc-gh-jrose sfc-gh-jrose commented Jan 3, 2025

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-1843881

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
    • I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
  3. Please describe how your code solves the related issue.

    In pyspark struct type objects are collected as Rows instead of dicts. Another recent change allowed creating struct type columns by creating a dataframe from nested Row objects. These two changes together allow round trip structured objects without manually specifying the schema.

sfc-gh-jrose and others added 26 commits December 6, 2024 13:44
)

1. Which Jira issue is this PR addressing? Make sure that there is an
accompanying issue to your PR.

   Fixes SNOW-1852779

2. Fill out the following pre-review checklist:

- [ ] I am adding a new automated test(s) to verify correctness of my
new code
- [ ] If this test skips Local Testing mode, I'm requesting review from
@snowflakedb/local-testing
   - [ ] I am adding new logging messages
   - [ ] I am adding a new telemetry message
   - [ ] I am adding new credentials
   - [ ] I am adding a new dependency
- [ ] If this is a new feature/behavior, I'm adding the Local Testing
parity changes.
- [x] I acknowledge that I have ensured my changes to be thread-safe.
Follow the link for more information: [Thread-safe Developer
Guidelines](https://github.com/snowflakedb/snowpark-python/blob/main/CONTRIBUTING.md#thread-safe-development)

3. Please describe how your code solves the related issue.

Fixed AST encoding for Column `in_`, `asc`, and `desc`. Removed an
unused entity and renamed `sp_column_in__seq` to `sp_column_in`. Changed
the `nulls_first` optional boolean param to be an enum.
@sfc-gh-jrose sfc-gh-jrose added the NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md label Jan 3, 2025
Comment on lines +705 to +706
) -> Tuple[Optional[List[str]], Optional[List[Callable]]]:
if not result_meta:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a small description of what this returns.

@sfc-gh-jrose sfc-gh-jrose marked this pull request as ready for review January 4, 2025 00:06
@sfc-gh-jrose sfc-gh-jrose requested review from a team as code owners January 4, 2025 00:06
@sfc-gh-jrose sfc-gh-jrose requested review from sfc-gh-aling, sfc-gh-jdu and sfc-gh-smirzaei and removed request for a team January 4, 2025 00:06
Comment on lines -879 to +880
({"x": 1}, {"A": "a", "b": 1}, [1, 1, 1]),
({"x": 2}, {"A": "b", "b": 2}, [2, 2, 2]),
({"x": 1}, Row(A="a", b=1), [1, 1, 1]),
({"x": 2}, Row(A="b", b=2), [2, 2, 2]),
Copy link
Contributor

@sfc-gh-aling sfc-gh-aling Jan 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this considered a breaking change or this feature is not rolled out yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a breaking change. It only gets applied when the feature flag in snowflake.snowpark.context is set though so won't break current users until that flag is rolled out. The structured type tests have the flag enabled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that flag is _should_use_structured_type_semantics.
just for my understanding, we will do a BCR release when rolling out this parameter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the correct flag. I believe we will release the change in a BCR release, but for now this unblocks sas and customers that want to enabled it if we do PrPr

Base automatically changed from jrose_snow_1865926_create_dataframe_default_structured to main January 7, 2025 18:22
@sfc-gh-jrose sfc-gh-jrose merged commit 98330fa into main Jan 8, 2025
38 checks passed
@sfc-gh-jrose sfc-gh-jrose deleted the jrose_snow_1843881_struct_return_row branch January 8, 2025 23:41
@github-actions github-actions bot locked and limited conversation to collaborators Jan 8, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants