[Draft] Add enroot as potential option for deployment#224

Closed
kjain14 wants to merge 7 commits into SWE-agent:main from kjain14:main

kjain14 commented Jul 28, 2025

Goal: add enroot as a potential deployment option.

Allows for SWE-ReX to be run on HPC clusters.

klieret requested a review from Copilot July 28, 2025 23:34
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds Enroot as a new deployment option for SWE-ReX, enabling deployment on HPC clusters using Slurm and Pyxis. The implementation follows the existing deployment pattern and integrates with the SWE-ReX runtime system.

  • Implements EnrootDeployment class with Slurm job management capabilities
  • Adds configuration support for Enroot-specific parameters like sbatch args, pyxis args, and cluster resources
  • Includes robust job lifecycle management with cleanup handlers and signal handling
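The cleanup behaviour named in the last bullet can be illustrated with a minimal sketch. This is not the code from the PR: the class name `SlurmJobGuard` and its methods are hypothetical, and the only external command assumed is Slurm's standard `scancel`. The runner is injectable so the logic can be exercised on a machine without Slurm.

```python
import atexit
import signal
import subprocess
from typing import Callable, List


class SlurmJobGuard:
    """Hypothetical helper: cancels one Slurm job at most once."""

    def __init__(
        self,
        job_id: str,
        run: Callable[[List[str]], object] = subprocess.run,
    ) -> None:
        self.job_id = job_id
        self._run = run  # injectable so the sketch works without Slurm
        self._cancelled = False

    def cancel(self) -> None:
        # Idempotent: the atexit hook and a signal handler may both fire.
        if self._cancelled:
            return
        self._cancelled = True
        self._run(["scancel", self.job_id])

    def install(self) -> None:
        # Cancel on normal interpreter exit and on SIGINT/SIGTERM.
        # signal.signal must be called from the main thread.
        atexit.register(self.cancel)
        for sig in (signal.SIGINT, signal.SIGTERM):
            signal.signal(sig, self._on_signal)

    def _on_signal(self, signum, frame) -> None:
        self.cancel()
        # Exit with the conventional 128 + signal number status.
        raise SystemExit(128 + signum)
```

A deployment would call `install()` right after job submission returns a job ID, so that an interrupted run does not leave an allocation behind.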

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/swerex/deployment/enroot.py | New deployment implementation with Slurm job submission, container management, and runtime coordination |
| src/swerex/deployment/config.py | Configuration class for Enroot deployment parameters and integration with deployment config union |

kjain14 and others added 5 commits July 28, 2025 16:36
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Member

klieret commented Jul 28, 2025

Hi @kjain14, thank you very much for this, looks like quite a bit of work went into it! Absolutely open to having more backends in swe-rex, especially if they don't interfere with any of the other deployments.

Since this is going to be hard to test end-to-end (because it needs SLURM etc.), I'm curious if you've tested this on your cluster?

Member

klieret commented Jul 28, 2025

Also, I guess this is specific in that it uses both SLURM and enroot? Totally open to expanding in this direction, I just need a bit more context.

Author

kjain14 commented Jul 28, 2025

Hi @klieret, yes, the idea was to get this to work on SLURM clusters. Enroot is the default container solution for SLURM clusters (it comes bundled with SLURM as far as I can tell): https://github.com/NVIDIA/pyxis. Running Docker on SLURM is a big challenge because it does not interact well with the SLURM scheduler, so supporting an Enroot backend seems to be the only way to get SWE-ReX/SWE-agent to run on these clusters.

We have tested it on our clusters. The only part that I am not sure how to solve is the image caching (on our cluster we have a hardcoded path that I just changed to ./images). We probably want to be able to pass in the path somehow.
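One way the image cache path could be passed in rather than hardcoded is sketched below. Everything here is an assumption for illustration: the environment variable name `SWEREX_ENROOT_IMAGE_CACHE`, the function names, and the `.sqsh` naming scheme are hypothetical and are not taken from the PR. Only the `./images` fallback comes from the comment above.

```python
import os
import re
from pathlib import Path
from typing import Optional

# Fallback mentioned in the discussion; the env var name is hypothetical.
DEFAULT_CACHE_DIR = Path("./images")
CACHE_ENV_VAR = "SWEREX_ENROOT_IMAGE_CACHE"


def resolve_cache_dir(explicit: Optional[str] = None) -> Path:
    """Pick the cache dir: explicit argument, then env var, then default."""
    if explicit:
        return Path(explicit)
    from_env = os.environ.get(CACHE_ENV_VAR)
    if from_env:
        return Path(from_env)
    return DEFAULT_CACHE_DIR


def cached_image_path(image: str, cache_dir: Path) -> Path:
    """Map an image reference to a filesystem-safe squashfs file name."""
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", image)
    return cache_dir / f"{safe}.sqsh"
```

The precedence order (argument, environment, default) lets a cluster admin set a shared cache location once while still allowing per-run overrides.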

Author

kjain14 commented Aug 1, 2025

Going to close this for now, until it is a bit more stable on our end

kjain14 closed this Aug 1, 2025
Member

klieret commented Aug 1, 2025

Hi @kjain14 ! We can also leave it open as a draft if you want! Totally fine with me.

I think having some more context of how it's being used etc. would be great before merging it in (especially because it's somewhat specific, I guess).

The alternative that I'm also open to would be to add a page in the docs/readme where we link to notable forks/blog posts/etc.

What do you think?

Author

kjain14 commented Aug 1, 2025

Sure, I think either of these options works for us. Getting it merged eventually would be nice, but we are still testing it extensively on our end and wanted it to reach a more stable state first (feel free to make it a draft PR).

kjain14 reopened this Aug 1, 2025
kjain14 marked this pull request as draft August 1, 2025 23:10
klieret changed the title from "Add enroot as potential option for deployment" to "[Draft] Add enroot as potential option for deployment" Aug 5, 2025
kjain14 closed this Sep 29, 2025