|
1 | | -# PowerAPI Benchmark Procedure |
| 1 | +# PowerAPI Framework Benchmarks |
2 | 2 |
|
3 | | -This project benchmarks **PowerAPI** measurements and **RAPL** (using performance events) to evaluate variability. The procedures are designed for reproducibility, ensuring consistent results across future versions of PowerAPI, RAPL, and related dependencies. |
| 3 | +Benchmarks for measuring the variability introduced by the **PowerAPI** framework on the **Grid5000** testbed.
4 | 4 |
|
5 | | -## Overview |
| 5 | +--- |
6 | 6 |
|
7 | | -The benchmark involves automating several tasks, including inventory management, job creation and submission, monitoring, and results aggregation. The process is run across the G5K nodes and consists of a series of sequential steps: |
| 7 | +## What It Does |
8 | 8 |
|
9 | | -1. **Inventory Update**: Create or update a G5K node inventory with metadata (accessible via API), which is subsequently used in further steps. |
10 | | -2. **Job Generation and Submission**: For each node, generate a job submission and save it in JSON format. Each job file includes: |
11 | | - - Paths to generated bash script for the specific node |
12 | | - - Metadata file path |
13 | | - - Result directory path |
14 | | - - The `OAR_JOB_ID` of the submitted job and its state. |
15 | | - - The job's site information. |
16 | | -3. **Job Monitoring**: |
17 | | - - Loop until all jobs reach a “terminal state” ([Finishing | Terminated | Failed]). |
18 | | - - For jobs still running or waiting ([Waiting, Launching, Running]), check their status via `oarstat` and update accordingly. |
19 | | -4. **Results Aggregation** (To Be Implemented): Aggregate raw results into a centralized location. |
20 | | -5. **Report Generation** (To Be Implemented): Create a summary report from aggregated results. |
| 9 | +This repository contains the source code for generating and running benchmarks for the **PowerAPI Framework**. These benchmarks are designed to adapt to the configuration and architecture of the underlying nodes in the **Grid5000** infrastructure.
21 | 10 |
|
22 | | -Steps 4 and 5 are planned but not yet implemented. |
| 11 | +### Key Processes
23 | 12 |
|
24 | | ---- |
| 13 | +1. **Gather Node Information**: The benchmarks start by querying the Grid5000 API to collect details about all available nodes.
| 14 | +2. **Reuse Existing Job List (Optional)**: If a `jobs.yaml` file exists, the tool can leverage it to initialize the job list. |
| 15 | +3. **Generate Bash Scripts**: For each filtered node, a custom bash script is generated using templates located in the `/templates` directory. |
| 16 | +4. **Submit Jobs via OAR**: The generated scripts are submitted to the corresponding nodes over SSH using **OAR**, ensuring that no more than `N` jobs are active at the same time.
| 17 | +5. **Monitor and Collect Results**: |
| 18 | + - The status of each submitted job is tracked until it completes (either successfully or in a failed state). |
| 19 | + - Upon completion, **rsync** retrieves the result files locally. If the retrieval fails, the job’s state is marked as `UnknownState` for manual review.
| 20 | +6. **Store Results**: Once all filtered nodes have completed their benchmark jobs, the benchmarking process concludes, and all result files are stored in the `/results.d` directory. |
25 | 21 |
|
26 | | -## Benchmark Execution Details |
| 22 | +Together, these steps automate a complete benchmark campaign across the selected Grid5000 nodes; the sketch below shows roughly what they translate to at the command level.
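| | +
| | +For illustration only, here is a hand-written approximation of one node's benchmark cycle. It is not code from this repository: the site, cluster, node, script names, paths and walltime are placeholders, and the Grid5000 reference API endpoint and OAR options should be checked against their official documentation.
| | +
| | +```bash
| | +#!/usr/bin/env bash
| | +# Hypothetical per-node cycle (placeholders: rennes, paravance, paravance-1, bench_paravance-1.sh).
| | +set -euo pipefail
| | +
| | +SITE="rennes"
| | +CLUSTER="paravance"
| | +NODE="paravance-1"
| | +
| | +# 1. Query the Grid5000 reference API for the node's description (CPU model, core count, ...).
| | +#    From outside Grid5000 the API requires your credentials (add: -u YOUR_G5K_LOGIN).
| | +curl -s "https://api.grid5000.fr/stable/sites/${SITE}/clusters/${CLUSTER}/nodes/${NODE}" > "${NODE}.json"
| | +
| | +# 2. Submit the generated script on that specific node through the site's frontend.
| | +JOB_ID=$(ssh "${SITE}" "oarsub -l host=1,walltime=2:00:00 -p \"host='${NODE}.${SITE}.grid5000.fr'\" ./bench_${NODE}.sh" \
| | +         | awk -F= '/OAR_JOB_ID/ {print $2}')
| | +
| | +# 3. Poll the job until it reaches a terminal state.
| | +while true; do
| | +    STATE=$(ssh "${SITE}" "oarstat -s -j ${JOB_ID}" | cut -d: -f2 | tr -d ' ')
| | +    case "${STATE}" in
| | +        Terminated|Error|Finishing) break ;;
| | +    esac
| | +    sleep 60
| | +done
| | +
| | +# 4. Pull the result files produced by the job back to the local machine.
| | +mkdir -p "results.d/${NODE}"
| | +rsync -avz "${SITE}:benchmarks/results/${NODE}/" "results.d/${NODE}/"
| | +```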
27 | 23 |
|
28 | | -The benchmark approach is designed to maximize efficiency and resource utilization by reserving each node only once per benchmark run. The generated scripts handle all necessary steps for each measurement. |
29 | 24 |
|
30 | | -### Measurement Collection Workflow |
| 25 | +## Why It Exists
31 | 26 |
|
32 | | -1. **Performance Event Measurements**: |
33 | | - - Execute `perf` events for `NB_ITER` iterations with: `perf event -a -o /tmp/perf_${i}.stat -e ${PROCESSOR_CORRESPONDING_EVENTS} stress-ng --cpu ${NB_CPU} --cpu-ops ${NB_CPU_OPS}` |
34 | | - - **PROCESSOR_CORRESPONDING_EVENTS** are selected based on a hardcoded mapping. |
35 | | - - **NB_CPU** iterates through a list from 1 to the maximum CPU count. |
36 | | - - **NB_CPU_OPS** is processed to meet two conditions: |
37 | | - - Cumulative `stress-ng` run times stay below the reservation time. |
38 | | - - Each measurement uses a consistent operation count. |
| 27 | +These benchmarks aim to measure the variability introduced by the PowerAPI framework over the RAPL interface.
39 | 28 |
|
40 | | -2. **Aggregation of `perf` Results**: |
41 | | - - Once `${NB_ITER}` iterations are complete, aggregate `perf_${i}.stat` files into a single `perf_${NB_CPU}_${NB_CPU_OPS}.csv` stored on NFS. |
| 29 | +## Who It’s For
| 30 | +For now, this work remains internal to the PowerAPI staff, who use it to study the framework’s variability throughout its development.
42 | 31 |
|
43 | | -3. **HWPC Sensor Measurements**: |
44 | | - - Execute HWPC measurements, storing the data in CSV format similar to `perf`. |
45 | 32 |
|
46 | | -4. **SmartWatts Post-Mortem Processing**: |
47 | | - - Generate PowerReports from HWPC data in post-mortem mode. |
| 33 | +## Installation |
48 | 34 |
|
49 | | -5. **Final Aggregation**: |
50 | | - - Consolidate `[HWPC|SMARTWATTS|PERF]_[NB_CPU]_[NB_CPU_LOAD].csv` files on the NFS storage. |
| 35 | +To use this repository, you need to clone it locally, ensure you have Cargo installed, and then compile and run the project. Follow the steps below: |
51 | 36 |
|
52 | | -### Key Considerations |
| 37 | +### Prerequisites |
53 | 38 |
|
54 | | -- Using **SmartWatts in post-mortem mode** aligns with the variability measurement goals. |
55 | | -- **Storage Constraints**: NFS storage limits are set at 25GB per site. With uniform node distribution, this equates to an approximate maximum file size of 3.32 MB per aggregated result file, which is manageable. |
56 | | -- **Run Repetition**: Each `stress-ng` run will be executed **30 times** to establish statistical robustness. |
| 39 | +Before proceeding, make sure your system meets the following requirements: |
| 40 | + |
| 41 | +- **Rust and Cargo**: Install Rust (which includes Cargo, Rust’s package manager and build system). If Rust is not installed, follow the instructions below: |
| 42 | + 1. Download and install Rust by running: |
| 43 | + ```bash |
| 44 | + curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh |
| 45 | + ``` |
| 46 | + 2. Follow the on-screen instructions to complete the installation. |
| 47 | + 3. Add Cargo to your PATH (usually done automatically by the Rust installer). Restart your terminal if necessary.
| 48 | + |
| 49 | +- **Dependencies**: |
| 50 | +  - Access to **OAR**, the batch scheduler available on the Grid5000 frontends.
| 51 | +  - SSH access configured for the Grid5000 sites: for example, running `ssh rennes` locally should connect you to the Rennes frontend (a possible configuration is sketched below).
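| | +
| | +The `ssh rennes` shortcut above assumes an SSH configuration that jumps through the Grid5000 access machine. Below is only a minimal sketch: the username is a placeholder, and the authoritative setup is described in the Grid5000 SSH documentation.
| | +
| | +```bash
| | +# Sketch: append Grid5000 host aliases to your SSH configuration (adapt before use).
| | +# Replace YOUR_G5K_LOGIN with your Grid5000 username; "rennes" can be any Grid5000 site.
| | +cat >> ~/.ssh/config <<'EOF'
| | +Host g5k
| | +    HostName access.grid5000.fr
| | +    User YOUR_G5K_LOGIN
| | +
| | +Host rennes
| | +    HostName rennes
| | +    User YOUR_G5K_LOGIN
| | +    ProxyJump g5k
| | +EOF
| | +```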
| 52 | +
|
| 53 | +### Clone the Repository |
| 54 | +
|
| 55 | +Clone this repository to your local machine: |
| 56 | +
|
| 57 | +```bash |
| 58 | +git clone https://github.com/powerapi-ng/benchmarking.git |
| 59 | +cd benchmarking |
| 60 | +``` |
| 61 | +
|
| 62 | +### Build the Project |
| 63 | +
|
| 64 | +Compile the project using Cargo: |
| 65 | +
|
| 66 | +```bash |
| 67 | +cargo build --release |
| 68 | +``` |
| 69 | +
|
| 70 | +This will produce an optimized executable located in the `target/release/` directory. |
| 71 | +
|
| 72 | +### Run the Project |
| 73 | +
|
| 74 | +Execute the compiled program: |
| 75 | +
|
| 76 | +```bash |
| 77 | +./target/release/benchmarking |
| 78 | +``` |
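| | +
| | +While a campaign is running, it can be handy to cross-check job states directly on a site frontend (e.g. after `ssh rennes`), independently of the tool's own tracking. The commands below are standard OAR commands; the job id is a placeholder.
| | +
| | +```bash
| | +# List your own OAR jobs and their current states (Waiting, Running, Terminated, ...).
| | +oarstat -u "$USER"
| | +
| | +# Show the full record of one job (replace 1234567 with a real OAR_JOB_ID).
| | +oarstat -f -j 1234567
| | +
| | +# Cancel a job that should no longer run.
| | +oardel 1234567
| | +```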
57 | 79 |
|
58 | | ---- |
59 | 80 |
|
60 | 81 | # Tips G5k |
61 | 82 |
|
|