Arbiter 3 was created for the Rocky 8 and 9 interactive systems at the Center for High Performance Computing, University of Utah. It can run in other environments, provided the interactive systems are managed by systemd. In general, the software can be installed by:
- Installing the Arbiter Django Application
- Installing a Prometheus Instance
- Installing the cgroup-warden on each interactive node
The core arbiter service is a Django application. This needs to be installed on a machine with secure network access to Prometheus and the desired login nodes.
Python 3.11+ is required for the arbiter service. Your package manager likely has Python 3.11 available as a package. On Rocky 9.2+, it can be installed with:

```shell
sudo dnf install python3.11
```

pip is required as well:

```shell
python3.11 -m ensurepip --default-pip
```

Start by navigating to the directory where you want arbiter's configuration placed.

```shell
cd /path/to/config_dst
```

Create a virtual environment and then install the arbiter3 PyPI package.

```shell
python3.11 -m venv venv
source venv/bin/activate
pip install arbiter3
```

This will install the arbiter modules and their dependencies in venv/lib/python3.11/site-packages/, as well as the setup command arbiter-init.
First, obtain the source code via git:

```shell
cd /path/to/install
git clone https://github.com/chpc-uofu/arbiter
cd arbiter
```

Create a virtual environment and install arbiter with pip.

```shell
python3.11 -m venv venv
source venv/bin/activate
pip install .
```

This will install the arbiter modules and their dependencies in venv/lib/python3.11/site-packages/, as well as the setup command arbiter-init.
This method is for when you already have an internal Django server and want to add arbiter to it as an app. The other two installation methods are recommended for most circumstances.

First, pip install arbiter3 in whatever Python environment you have set up:

```shell
pip install arbiter3
```

From here, add arbiter3.arbiter to the installed apps in your settings file.

settings.py
```python
INSTALLED_APPS = [
    ...
    "arbiter3.arbiter",
    ...
]
```

Then add arbiter's URLs in urls.py:

```python
urlpatterns = [
    ...
    path("arbiter/", include("arbiter3.arbiter.urls")),
    ...
]
```

From here, you need to ensure all arbiter settings are configured in your settings file. We recommend copying them from the template settings located at arbiter3/portal/settings.py, then updating their values accordingly.
Lastly, migrate the new tables:

```shell
./manage.py migrate
```

Initialize the default configuration files by running the respective command for your installation method, as the user you wish to run arbiter as, in your config directory.

```shell
# pip installation
arbiter-init

# git installation
venv/bin/python3.11 arbiter3/scripts/initialize.py
```

This will generate the following files:
- arbiter.py - The entry point for running the arbiter evaluation loop, the webserver, or database management commands.
- settings.py - The main configuration file for arbiter.
- arbiter-web.service - A starting point for the service that runs the webserver.
- arbiter-eval.service - A starting point for the service that runs the evaluation loop. You may want to adjust how often it evaluates; by default, it evaluates usage every 30s.
The settings for arbiter are configured in the settings.py file, and must be set before arbiter can run. See settings.md for details.
Apply the database migrations:

```shell
venv/bin/python3.11 arbiter.py migrate
```

This will create all the initial database tables. If using the default database of SQLite, this will create a db.sqlite3 file.
The arbiter service has two components, the web server and the core evaluation loop.
The arbiter web service can be run in a testing capacity with the following command:

```shell
venv/bin/python3.11 arbiter.py runserver
```

which will listen on localhost:8000. For production, Arbiter should be run with Gunicorn. For example:

```shell
gunicorn arbiter3.portal.wsgi --bind 0.0.0.0:8000
```

Preferably, this will be set up behind a reverse proxy such as NGINX.
An example unit file for running this as a service was generated by arbiter-init, located in arbiter-web.service. Update it to suit your needs.
The arbiter evaluation loop can be run with:

```shell
./arbiter.py evaluate
```
To run it in a loop, you can pass the `--seconds`, `--minutes`, or `--hours` flags.
Additionally, you may pass the `--refresh-interval` flag, in the format 1h15m5s, to set the interval at which arbiter ensures reported limits are accurate. The default is 10m.
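To make the 1h15m5s duration format concrete, here is an illustrative Python sketch of how such a string maps to a total number of seconds. This is not arbiter's actual parser, just a demonstration of the format:

```python
import re

def parse_duration(s: str) -> int:
    """Convert a duration string like '1h15m5s' into total seconds.

    Each component (hours, minutes, seconds) is optional, but at
    least one must be present.
    """
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", s)
    if not match or not any(match.groups()):
        raise ValueError(f"invalid duration: {s!r}")
    hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds

print(parse_duration("1h15m5s"))  # 4505
print(parse_duration("10m"))      # 600
```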
This should also be set up to run as a service, see arbiter-eval.service.
See the cgroup-warden installation guide.
It is highly recommended to communicate with cgroup-wardens in secure mode.
Each warden must be configured to use TLS and bearer token authentication, and arbiter must be configured to match by setting `verify_ssl`, `use_tls`, and `bearer` in its configuration.
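As a sketch, assuming these appear as top-level values in arbiter's settings.py (consult settings.md for the exact names, types, and location), the secure-mode settings might look like:

```python
# settings.py (sketch only; verify names and placement against settings.md)
use_tls = True       # talk to cgroup-wardens over HTTPS
verify_ssl = True    # verify the wardens' TLS certificates
bearer = "change-me" # bearer token shared with the wardens (placeholder value)
```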
See the Prometheus installation guide. For general configuration, see here.
Each cgroup-warden instance needs to be scraped. The job looks like:
```yaml
scrape_configs:
  - job_name: 'cgroup-warden'
    scrape_interval: 30s
    static_configs:
      - targets:
        - login1.yoursite.edu:2112
        - login2.yoursite.edu:2112
        - login3.yoursite.edu:2112
        - login4.yoursite.edu:2112
    scheme: https
    # recommended but optional, strip port from instance
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        regex: '^(.*):[0-9]+$'
        replacement: '${1}'
```

- The recommended value of `scrape_interval` is 30s.
- The job name must be `cgroup-warden` or have it as a prefix.
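The relabel rule above copies `__address__` into the `instance` label with the port stripped. Prometheus applies this internally (using RE2 and `${1}` replacement syntax); the following Python sketch just illustrates what the regex does to each target address:

```python
import re

# same pattern as in relabel_configs; Python uses \1 where Prometheus uses ${1}
pattern = re.compile(r'^(.*):[0-9]+$')

for address in ["login1.yoursite.edu:2112", "login2.yoursite.edu:2112"]:
    instance = pattern.sub(r'\1', address)
    print(address, "->", instance)  # e.g. login1.yoursite.edu:2112 -> login1.yoursite.edu
```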