Introduce option to limit the heap capacity for a dataflow #31246
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2580,6 +2580,7 @@ impl Coordinator { | |
| None, | ||
| ExplainContext::Pushdown, | ||
| Some(ctx.session().vars().max_query_result_size()), | ||
| ctx.session().vars().max_query_heap_size(), | ||
| ), | ||
| ctx | ||
| ); | ||
|
|
@@ -3004,6 +3005,7 @@ impl Coordinator { | |
| }, | ||
| TargetCluster::Active, | ||
| None, | ||
| None, | ||
|
Contributor
Why are we not enforcing limits here? Afaict this is the code path for
Contributor
+1 What's tricky here is the already existing variable Some more context is here but because the results of this peek are not generally user facing (except in the case where the user specifies a |
||
| ) | ||
| .await; | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -107,6 +107,7 @@ message ProtoSubscribeBatch { | |
| message ProtoStatusResponse { | ||
| oneof kind { | ||
| ProtoOperatorHydrationStatus operator_hydration = 1; | ||
| ProtoDataflowLimitStatus dataflow_limit_status = 2; | ||
|
Contributor
I think this field should be called |
||
| } | ||
| } | ||
|
|
||
|
|
@@ -116,3 +117,7 @@ message ProtoOperatorHydrationStatus { | |
| uint64 worker_id = 3; | ||
| bool hydrated = 4; | ||
| } | ||
|
|
||
| message ProtoDataflowLimitStatus { | ||
| mz_repr.global_id.ProtoGlobalId collection_id = 1; | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -589,6 +589,8 @@ impl Arbitrary for SubscribeBatch<mz_repr::Timestamp> { | |
| pub enum StatusResponse { | ||
| /// Reports the hydration status of dataflow operators. | ||
| OperatorHydration(OperatorHydrationStatus), | ||
| /// Reports limit violations for dataflows. | ||
| DataflowLimitExceeded(DataflowLimitStatus), | ||
| } | ||
|
|
||
| impl RustType<ProtoStatusResponse> for StatusResponse { | ||
|
|
@@ -597,6 +599,7 @@ impl RustType<ProtoStatusResponse> for StatusResponse { | |
|
|
||
| let kind = match self { | ||
| Self::OperatorHydration(status) => Kind::OperatorHydration(status.into_proto()), | ||
| Self::DataflowLimitExceeded(status) => Kind::DataflowLimitStatus(status.into_proto()), | ||
| }; | ||
| ProtoStatusResponse { kind: Some(kind) } | ||
| } | ||
|
|
@@ -608,6 +611,9 @@ impl RustType<ProtoStatusResponse> for StatusResponse { | |
| Some(Kind::OperatorHydration(status)) => { | ||
| Ok(Self::OperatorHydration(status.into_rust()?)) | ||
| } | ||
| Some(Kind::DataflowLimitStatus(status)) => { | ||
| Ok(Self::DataflowLimitExceeded(status.into_rust()?)) | ||
| } | ||
| None => Err(TryFromProtoError::missing_field( | ||
| "ProtoStatusResponse::kind", | ||
| )), | ||
|
|
@@ -650,6 +656,29 @@ impl RustType<ProtoOperatorHydrationStatus> for OperatorHydrationStatus { | |
| } | ||
| } | ||
|
|
||
| /// A dataflow exceeded some limit. | ||
| #[derive(Debug, Clone, Eq, PartialEq, Serialize, Deserialize, Arbitrary)] | ||
| pub struct DataflowLimitStatus { | ||
| /// The ID of the compute collection exported by the dataflow. | ||
| pub collection_id: GlobalId, | ||
| } | ||
|
Comment on lines +659 to +664
Contributor
Mentioned above, but we should also report which limit! |
||
|
|
||
| impl RustType<ProtoDataflowLimitStatus> for DataflowLimitStatus { | ||
| fn into_proto(&self) -> ProtoDataflowLimitStatus { | ||
| ProtoDataflowLimitStatus { | ||
| collection_id: Some(self.collection_id.into_proto()), | ||
| } | ||
| } | ||
|
|
||
| fn from_proto(proto: ProtoDataflowLimitStatus) -> Result<Self, TryFromProtoError> { | ||
| Ok(Self { | ||
| collection_id: proto | ||
| .collection_id | ||
| .into_rust_if_some("ProtoDataflowLimitStatus::collection_id")?, | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { | ||
| use mz_ore::assert_ok; | ||
|
|
||
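One way the "report which limit" suggestion above could look, as a minimal sketch rather than the PR's actual change. `DataflowLimitKind`, the `limit` field, and `describe` are hypothetical names, and `CollectionId` is a plain integer standing in for `GlobalId` so the snippet compiles on its own:

```rust
/// Hypothetical: identifies which limit a dataflow exceeded.
/// Only a heap limit is discussed in this PR; the enum leaves room for more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataflowLimitKind {
    /// The dataflow's heap usage exceeded its configured capacity.
    HeapSize,
}

/// Stand-in for `GlobalId`, reduced to an integer for this sketch.
pub type CollectionId = u64;

/// A dataflow exceeded some limit, and we say which one.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DataflowLimitStatus {
    /// The ID of the compute collection exported by the dataflow.
    pub collection_id: CollectionId,
    /// The limit that was exceeded (assumed field, not in the PR).
    pub limit: DataflowLimitKind,
}

impl DataflowLimitStatus {
    /// Renders a human-readable description of the violation.
    pub fn describe(&self) -> String {
        match self.limit {
            DataflowLimitKind::HeapSize => format!(
                "dataflow exporting collection {} exceeded its heap size limit",
                self.collection_id
            ),
        }
    }
}
```

With an enum rather than a free-form string, the receiver can match on the limit kind and choose an appropriate error for the user. The proto message would presumably need a matching field.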
Seems like a good idea to bundle these into a `QueryResourceLimits` or similar, to make passing them around less boilerplate-y.
I think there's maybe a larger refactoring around here (the first ~900 lines of `coord.rs` are types without documentation, other than some field-level doc comments). Imo, maybe this PR isn't the place to do that, but rather a more holistic sweep?
Having said that, the next several pages of diffs do seem to be more boilerplate, pairing up the new parameter with the prior one. I may be coming around!
Should have called this a nit! Imo it's good to follow the boy scout rule when it's not too much of a hassle, because we might never get to that holistic refactor. But nothing I'd block the PR on.
+1 to bundling into a `QueryResourceLimits`, but also not blocking.
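For reference, a rough sketch of what such a bundle might look like. Only the name `QueryResourceLimits` comes from this thread. The field names, the `u64` byte counts, and the helper methods are assumptions for illustration, not the types used in `coord.rs`:

```rust
/// Hypothetical bundle of the per-query limits that are currently passed
/// as two separate positional arguments.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct QueryResourceLimits {
    /// Maximum result size in bytes; `None` means no limit (assumed type).
    pub max_result_size: Option<u64>,
    /// Maximum heap size in bytes; `None` means no limit (assumed type).
    pub max_heap_size: Option<u64>,
}

impl QueryResourceLimits {
    /// No limits at all, replacing call sites that pass `None, None`.
    pub const NONE: Self = Self {
        max_result_size: None,
        max_heap_size: None,
    };

    /// Builds the bundle from the two individual limits.
    pub fn new(max_result_size: Option<u64>, max_heap_size: Option<u64>) -> Self {
        Self {
            max_result_size,
            max_heap_size,
        }
    }

    /// Returns true if `used_bytes` is strictly above the heap limit.
    /// With no heap limit configured, nothing is ever exceeded.
    pub fn heap_exceeded(&self, used_bytes: u64) -> bool {
        self.max_heap_size.map_or(false, |limit| used_bytes > limit)
    }
}
```

Call sites would then take one `QueryResourceLimits` argument instead of two adjacent `Option`s, which also removes the risk of swapping them by position.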