Skip to content

feat: use cgroups to limit cpu and memory usage#273

Merged
FedericoPonzi merged 10 commits intoFedericoPonzi:masterfrom
kemingy:cgroups
Mar 26, 2025
Merged

feat: use cgroups to limit cpu and memory usage#273
FedericoPonzi merged 10 commits intoFedericoPonzi:masterfrom
kemingy:cgroups

Conversation

@kemingy
Copy link
Collaborator

@kemingy kemingy commented Mar 8, 2025

Motivation and Context

Description

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.

Signed-off-by: Keming <kemingy94@gmail.com>
kemingy added 2 commits March 9, 2025 01:03
Signed-off-by: Keming <kemingy94@gmail.com>
Signed-off-by: Keming <kemingy94@gmail.com>
Copy link
Owner

@FedericoPonzi FedericoPonzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for working on this! I think we need to add some integration testing for the new property, we could gate it to skip them if user is not root. Also, I'm trying to understand if it would be possible to have an option in which horust doesn't run as root. For example, if root has setup a cgroup for the current user beforehand, horust could use it throughout the execution (I think?).
This is the sample script:

#!/bin/bash
set -e

USER_NAME=fponzi
USER_UID=1000

CGROUP_PATH="/sys/fs/cgroup/horust_${USER_UID}"


echo "Setting up cgroup v2 for user: $USER_NAME"

# Check if running as root
if [ "$(id -u)" -ne 0 ]; then
    echo "This script must be run as root!"
    exit 1
fi

# Ensure cgroup v2 is mounted
if ! mount | grep -q "cgroup2 on /sys/fs/cgroup"; then
    echo "Mounting cgroup v2..."
    mount -t cgroup2 none /sys/fs/cgroup
fi

# Create the cgroup if it doesn’t exist
if [ ! -d "$CGROUP_PATH" ]; then
    echo "Creating cgroup at $CGROUP_PATH"
    mkdir "$CGROUP_PATH"
fi

# Enable controllers for delegation
echo "+pids +memory +cpu" > /sys/fs/cgroup/cgroup.subtree_control

# Change ownership to the user so they can manage it
chown "$USER_NAME:$USER_NAME" -R "$CGROUP_PATH"
chmod 755 "$CGROUP_PATH"

# Allow user to modify pids, memory, and cpu settings
for ctrl in cgroup.procs cgroup.controllers cgroup.subtree_control; do
    chown "$USER_NAME:$USER_NAME" "$CGROUP_PATH/$ctrl"
    chmod 644 "$CGROUP_PATH/$ctrl"
done

echo "Cgroup v2 setup complete. You can now run your tests as $USER_NAME"

then we could create the cgroup under horust_uid/servicename. Again, I'm not 100% sure it will work, but I'm sharing an update just to push the outstanding comments and what I'm looking into.

Signed-off-by: Keming <kemingy94@gmail.com>
@kemingy
Copy link
Collaborator Author

kemingy commented Mar 11, 2025

Hi @FedericoPonzi, thanks for your review.

  1. I found it hard to detect if the current user has the required permissions, so I only check the Result and throw warnings on Err. If you have a better better way, please let me know.
  2. Running the cgroups related tests would require permissions. Do you think we should run those cgroups tests inside the docker containers? Otherwise, there will be a sudo to set up the test environments.

@FedericoPonzi
Copy link
Owner

Sorry for the delay!

  1. Let's merge what you have and we can see if we can find a way to avoid root user later.
  2. The GitHub ci has passwordless sudo so we could set something up (maybe in another PR). I was thinking we could still include some tests and then run them manually but maybe it's not a good idea to keep skipped tests around. Instead, can you please do a one off test with root user to verify it actually works as we expect it to? It looks good to me but we still need to test to ensure it works. If you can then update the PR description with the testing procedure and give me a ping I think we can merge it after.

@kemingy
Copy link
Collaborator Author

kemingy commented Mar 17, 2025

Hi, I think I need some time to figure out what kind of permissions that the cgroups need.
I tested on my WSL and a container, both failed to create the cgroup even with sudo. I have been busy these days. Will give you some feedback later.

Copy link
Owner

@FedericoPonzi FedericoPonzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this self contained script will create a testing env in /tmp/horusttest. I've tested it from my ubuntu machine and found that the current docker build is broken (I will need to bump the debian version) and tldr; the test is failing because of "unknown error":

#!/bin/bash

# Step 1: Generate Dockerfile
cat << 'EOF' > /tmp/horusttest/Dockerfile
FROM federicoponzi/horust:v0.2.0

# Copy Horust service definition into the container
COPY horust_service.toml /etc/horust/services/

# Copy the script that attempts to exceed PID limits
COPY exceed_pids.sh /usr/local/bin/

# Make the script executable
RUN chmod +x /usr/local/bin/exceed_pids.sh

# Set entrypoint to Horust
ENTRYPOINT ["/sbin/horust"]
EOF

# Step 2: Create Horust service definition
cat << 'EOF' > /tmp/horusttest/horust_service.toml
name = "exceed_pids"
command = "/usr/local/bin/exceed_pids.sh"

[resource-limit]
pids-max = 3
EOF

# Step 3: Create script to exceed PID limits
cat << 'EOF' > /tmp/horusttest/exceed_pids.sh
#!/bin/bash
# This script attempts to create 5 child processes, exceeding the pids-max limit of 3

echo "Starting 5 child processes to test PID limit enforcement..."

for i in {1..5}; do
    sleep 10 &
done

# Wait for all background processes to finish
wait
EOF

# Step 4: Create run.sh script to build and run the Docker container
cat << 'EOF' > /tmp/horusttest/run.sh
#!/bin/bash

# Navigate to the temporary directory
cd /tmp/horusttest

# Build the Docker image
docker build -t horust_pid_limit_test .

# Run the Docker container
docker run --privileged --rm --name horust_pid_limit_test_container horust_pid_limit_test
EOF

# Make the run.sh script executable
chmod +x /tmp/horusttest/run.sh

echo "Scripts have been generated in the /tmp/horusttest directory."
echo "Navigate to /tmp/horusttest and execute ./run.sh to build and run the Docker container."

As a prerequisite, I've checkedout the pr's code (gh pr checkout 273) then run make build to build the base docker image (after updating the base debian from buster-slim to the latest bookworm-slim).
Additionally I've updated the code to print out the error:

[2025-03-18T17:29:50Z WARN  horust::horust::supervisor::process_spawner] Failed to create the cgroup for service: exceed_pids, please check if the current user(0) has the required permissions. Error: Some(Failed to create the cgroup
    
    Caused by:
        an unknown error)

honestly the cgroup-rs library doesn't seem to be well maintained, I've also considered forking from it. In any case, it will need more investigating to understand what's going on, unfortunately.

@FedericoPonzi FedericoPonzi mentioned this pull request Mar 18, 2025
6 tasks
Signed-off-by: Keming <kemingy94@gmail.com>
Signed-off-by: Keming <kemingy94@gmail.com>
@kemingy
Copy link
Collaborator Author

kemingy commented Mar 22, 2025

Hi @FedericoPonzi, I changed to libcgroups which is used by youki. This works if I run with sudo locally. However, the upstream is still using nix 0.28, which cannot accept the Pid in nix 0.29 (used by horust). I have to fork the repo. If it's all right, I will try to create a PR to the upstream to see if they accept that.

I also create a PR to bump the version there: youki-dev/youki#3123

If that can be merged, will switch to the upstream version instead of a fork.

Signed-off-by: Keming <kemingy94@gmail.com>
@kemingy
Copy link
Collaborator Author

kemingy commented Mar 24, 2025

Hi @FedericoPonzi, this upstream has accepted the PR, but they will need some time for the next release. So I pined the rev to the current commit SHA. Please take a look. Let me know if you have any other concerns.

@FedericoPonzi
Copy link
Owner

FedericoPonzi commented Mar 24, 2025

I think it's fine to merge with sha-1 and update as soon as there is a new release (Thanks for taking the time to contributed to youki as well!!). That being said I've tested the branch with the script I've posted above (I've rebased your branch on top of master as I've included some fixes for docker)
When I run it I'm getting:

[2025-03-24T21:19:26Z WARN  horust::horust::supervisor::process_spawner] Failed to add the resource limit to exceed_pids: io error: failed to open /sys/fs/cgroup/horust_exceed_pids/cpu.max: No such file or directory (os error 2)

I've started an interactive session:

docker run --privileged -it --rm --name horust_pid_limit_test_container horust_pid_limit_test horust

then I've manually run horust:

root@b2b5c13af9f6:/# horust --services-path /etc/horust/services/

and checked the folder:

root@b2b5c13af9f6:/# ls /sys/fs/cgroup/horust_exceed_pids/        
cgroup.controllers  cgroup.max.descendants  cgroup.threads  io.pressure
cgroup.events	    cgroup.pressure	    cgroup.type     memory.pressure
cgroup.freeze	    cgroup.procs	    cpu.pressure
cgroup.kill	    cgroup.stat		    cpu.stat
cgroup.max.depth    cgroup.subtree_control  cpu.stat.local

while the folder is present, for some reason cpu.max is not there, I'm not sure why I haven't reviewed the code yet either.

Copy link
Owner

@FedericoPonzi FedericoPonzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some high level comments

kemingy added 2 commits March 25, 2025 12:10
Signed-off-by: Keming <kemingy94@gmail.com>
@kemingy
Copy link
Collaborator Author

kemingy commented Mar 25, 2025

Hi @FedericoPonzi, I have searched for some documents but didn't found an elegant way to test them. The current solution is docker run with --privileged --cgroupns=host. At least this works as expected. I have fixed the previous error that cpu.max may not exist.

@FedericoPonzi
Copy link
Owner

One last thing, can you update the documentation with additional information you posted in the message above? (tested with docker and link to the containerd and podman unmask).

Hi @FedericoPonzi, I have searched for some documents but didn't found an elegant way to test them. The current solution is docker run with --privileged --cgroupns=host. At least this works as expected. I have fixed the previous error that cpu.max may not exist.

* I haven't tried containerd: [Enable Writable cgroups for unprivileged containers containerd/containerd#10924](https://github.com/containerd/containerd/issues/10924)

* I didn't make it work with podman unmask: [Enable cgroupsv2 rw mount via security-opt unmask containers/podman#9536](https://github.com/containers/podman/pull/9536)

Then I think it's good to be merged! Thank you!

Signed-off-by: Keming <kemingy94@gmail.com>
@kemingy
Copy link
Collaborator Author

kemingy commented Mar 26, 2025

Other untested solutions have been added to the doc.

@FedericoPonzi FedericoPonzi merged commit be2069e into FedericoPonzi:master Mar 26, 2025
7 checks passed
@kemingy kemingy deleted the cgroups branch March 26, 2025 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Per-service resource limits

2 participants