May I ask why Presto on Spark needs to perform an RDD collect? #23830

![101e6ce847a6c737767f3b485](https://github.com/user-attachments/assets/37857181-4373-4ec4-a3eb-98dc93cb56b5)
I want to know why an RDD collect is needed here. In Spark, this kind of operation puts a lot of pressure on the driver, and during testing I found that the driver often reports OOM.
Looking at the source code, I see that the Spark driver performs the commit operation for stages of the coordinator_only type. My understanding is that a commit only needs to submit metadata, so why does the data need to be submitted along with it?
Is the coordinator implemented the same way in native Presto?