@MisterArdavan
Collaborator

Implement the ResourceHoarding class, which enables analysis of jobs and users that make resources inaccessible to others by requesting a disproportionate number of CPU cores or amount of RAM.

Also adds a demo notebook to show the functionality of the class.

Other files were changed where necessary to implement this feature.
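As a rough illustration of the kind of check described above, here is a minimal sketch. It assumes that a request counts as disproportionate when a job's share of a node's RAM is far larger than its share of the node's cores, or the reverse. The `JobRequest` record, the `hoarding_ratio` and `flag_hoarders` helpers, and the threshold of 4.0 are all hypothetical and are not taken from the ResourceHoarding class in this PR.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """Hypothetical record of what a job requested on a single node."""
    job_id: str
    user: str
    cpu_cores: int
    ram_gib: float


def hoarding_ratio(job: JobRequest, node_cores: int, node_ram_gib: float) -> float:
    """Return the larger resource share divided by the smaller one.

    A value near 1.0 means the request is balanced; a large value means the
    job takes a much bigger fraction of one resource than of the other.
    """
    core_share = job.cpu_cores / node_cores
    ram_share = job.ram_gib / node_ram_gib
    low, high = sorted((core_share, ram_share))
    if low == 0:
        return float("inf")
    return high / low


def flag_hoarders(jobs, node_cores, node_ram_gib, threshold=4.0):
    """Return the IDs of jobs whose ratio exceeds the (assumed) threshold."""
    return [
        j.job_id
        for j in jobs
        if hoarding_ratio(j, node_cores, node_ram_gib) > threshold
    ]


jobs = [
    # Balanced request on an assumed 32-core, 256 GiB node.
    JobRequest("a", "alice", cpu_cores=8, ram_gib=64.0),
    # One core but most of the node's RAM, leaving the other cores hard to use.
    JobRequest("b", "bob", cpu_cores=1, ram_gib=200.0),
]
flagged = flag_hoarders(jobs, node_cores=32, node_ram_gib=256.0)
```

With these made-up numbers only job `b` is flagged. The actual class may define hoarding differently; the point is only that comparing a job's share of each resource exposes lopsided requests.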

@MisterArdavan MisterArdavan marked this pull request as draft August 13, 2025 16:27
@MisterArdavan MisterArdavan marked this pull request as ready for review August 13, 2025 16:29
  def sort_and_filter_records_with_metrics(
      self,
-     metrics_df_name_enum: MetricsDataFrameNameEnum,
+     metrics_df_name_enum: MetricsDFNameEnumT,
Collaborator

Do we need to add this to every method? I'm not sure why anyone would pass in anything other than jobs here, and the same goes for the other methods. I think the earlier implementation was fine. This would be worth having if the function behaved differently depending on which type of DF was passed, but since it only runs on jobs, we should just use the jobs DF (raise an error and calculate the metrics if that doesn't work). Merging this would also break the reports and any other pieces such as frequency analysis, a100, and ROC, unless they are all changed. In my opinion it isn't feasible to do this right now, and it isn't needed unless the function handles different DFs.
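A minimal sketch of the jobs-only alternative suggested above, assuming the function always operates on job records and fails loudly when the requested metric has not been calculated. A list of dicts stands in for the jobs DataFrame; the helper name `sort_and_filter_jobs`, its parameters, and the `wasted_hours` metric are hypothetical and do not come from this PR.

```python
def sort_and_filter_jobs(jobs, metric, min_value=0.0):
    """Keep jobs whose metric is at least min_value, sorted high to low.

    `jobs` is a list of dicts standing in for the jobs DataFrame.
    """
    missing = [j for j in jobs if metric not in j]
    if missing:
        # No DataFrame-type argument to dispatch on: the input is always
        # jobs, so a missing metric is reported as an error.
        raise ValueError(
            f"{len(missing)} job record(s) lack '{metric}'; "
            "calculate the metrics first."
        )
    kept = [j for j in jobs if j[metric] >= min_value]
    return sorted(kept, key=lambda j: j[metric], reverse=True)


jobs = [
    {"job_id": 1, "wasted_hours": 2.0},
    {"job_id": 2, "wasted_hours": 9.5},
    {"job_id": 3, "wasted_hours": 0.1},
]
top = sort_and_filter_jobs(jobs, "wasted_hours", min_value=1.0)
```

Because the signature has no extra enum parameter, existing callers such as the reports would not need to change under this approach.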

with open(self.local_path, "w") as f:
    json.dump(remote_info, f, indent=2)
print(f"Fetched and saved {self.local_path.name} from remote URL.")
if os.getenv("OUTPUT_MODE") == "VERBOSE":
Collaborator

How do we enable this so that the message is printed? Do we have to set the OUTPUT_MODE environment variable manually? If so, we should add a note to the documentation that says so.
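For reference, a small sketch of how a check like the one in the diff above behaves. `os.getenv` reads the process environment, so the variable has to be set before the check runs, either in the shell that launches Python or through `os.environ` inside the process. The `is_verbose` and `log` helpers are hypothetical and are not part of this PR.

```python
import os


def is_verbose() -> bool:
    """True only when OUTPUT_MODE is exactly "VERBOSE", as in the diff."""
    return os.getenv("OUTPUT_MODE") == "VERBOSE"


def log(message: str) -> bool:
    """Print only in verbose mode; return whether anything was printed."""
    if is_verbose():
        print(message)
        return True
    return False


# Variable unset: the message is suppressed.
os.environ.pop("OUTPUT_MODE", None)
printed_default = log("Fetched and saved file from remote URL.")

# Variable set for this process only: the message is printed.
os.environ["OUTPUT_MODE"] = "VERBOSE"
printed_verbose = log("Fetched and saved file from remote URL.")
```

From a POSIX shell, prefixing a command with `OUTPUT_MODE=VERBOSE` sets the variable for that single run. In a notebook, assigning to `os.environ["OUTPUT_MODE"]` before the call has the same effect for the running kernel.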

@bpachev bpachev merged commit 4e45a93 into main Sep 3, 2025
6 checks passed
@bpachev bpachev deleted the feature/high-cpu-mem-analysis branch September 3, 2025 21:32

5 participants