Welcome to the repository for my internship project on analyzing energy consumption during input/output (IO) operations. It contains a set of scripts and tools to calculate, visualize, and compare energy consumption using different methods. The main goal is to determine the most accurate approach for measuring the energy consumed by IO operations, focusing on the projection and average methods and on combining them with IOR.
- Project Overview
- Project Structure
- Installation
- Usage
- Scripts Explanation
- Guide to Installing and Using IOR for I/O Performance Analysis
This project was developed during my internship and aims to analyze energy consumption during IO operations on storage devices. The focus is on calculating the exact energy consumed using wattmeter readings and comparing these calculations using different methods such as projection and average energy methods. The scripts automate the entire process from data collection to analysis and visualization.
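As a rough illustration of the kind of comparison described above, the sketch below computes energy from timestamped wattmeter samples in two ways: trapezoidal integration of power over time, and mean power multiplied by the run duration. The sample values, the function names, and the reading of the "average" method as mean power times duration are assumptions made for illustration; they are not taken from the project's scripts.

```python
def energy_trapezoid(timestamps, watts):
    """Integrate power (W) over time (s) with the trapezoidal rule; returns joules."""
    total = 0.0
    for i in range(1, len(timestamps)):
        dt = timestamps[i] - timestamps[i - 1]
        total += 0.5 * (watts[i] + watts[i - 1]) * dt
    return total


def energy_mean_power(timestamps, watts):
    """Approximate energy as mean power times total duration; returns joules."""
    duration = timestamps[-1] - timestamps[0]
    return sum(watts) / len(watts) * duration


# Hypothetical wattmeter samples: one reading per second.
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
pw = [100.0, 110.0, 120.0, 110.0, 100.0]

print(energy_trapezoid(ts, pw))   # integration over the sampled curve
print(energy_mean_power(ts, pw))  # mean power times duration
```

The two estimates differ whenever power is not constant over the run, which is why comparing methods against the raw wattmeter trace is useful.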
The project is organized as follows:
├── Makefile         # Automation script for various tasks
├── benchmark.sh      # Script for running IO benchmarks
├── format.sh        # Script for formatting and processing raw data
├── plotting.sh       # Script for plotting raw data
├── logs/          # Directory containing raw and processed data files
├── scripts/        # Directory containing all Python and Shell scripts
│  ├── plot/        # Directory containing scripts for plotting
│    └── ...
│  ├── format/       # Directory containing scripts for formatting
│    └── ...
│  ├── math/        # Directory containing scripts for mathematical computations
│    ├── normalize.py  # Script to calculate normalized offsets
│    ├── frequency.py  # Script to display frequency distributions
│    └── ...
│  └── IOR/        # Directory containing IOR-related tools
│    ├── iortest1.c   # C program based on iotest with IOR trace functionality
│    └── filter_traces.c # C program for filtering IOR traces
├── README.md
The iotest.c file is a C program used to simulate IO operations. The associated header file, iotest.h, contains the definitions and functions used within the iotest.c file.
gcc -g iotest.c -o iotest -lm
This command compiles the C code and creates an executable named iotest.
The benchmark.sh script is used to run IO operations and measure energy consumption. Here's an example of how to run the benchmark:
sudo-g5k ./benchmark.sh READ RAND HDD
This script runs the IO benchmark with the specified parameters (mode: READ or WRITE; access pattern: RAND for random or SEQ for sequential; storage type: HDD or SSD) and stores the results in the logs/ directory.
The ior_bench.sh script is a specialized benchmark for measuring energy consumption using iortest1.c. It's designed to test different read/write ratios and file sizes, and it captures energy data directly via the Grid5000 API.
This script takes one argument: the storage type (HDD or SSD).
./ior_bench.sh <storage_type>
For example, to run the benchmark on an HDD:
./ior_bench.sh HDD
The plotting.sh script is used to generate various plots from the benchmark results. It supports different types of plots, such as baseline plots, boxplots, and IO energy consumption plots.
./plotting.sh <log_dir> <type> [<optional_arg>]
- log_dir: The directory containing log files (e.g., logs/HDD/READ/RAND/).
- type: The type of plot to generate (e.g., baseline or sz_bloc).
- optional_arg: An optional argument to specify additional options (e.g., nb_run).
For example, to generate all plots for all runs per iteration:
./plotting.sh logs/HDD/READ/RAND/ plot_all nb_run
- iotest.c: This C program simulates IO operations by reading and writing data to a storage device.
- iotest.h: The header file contains function prototypes, macros, and structure definitions used in iotest.c.
- benchmark.sh: Automates the process of running IO benchmarks on different storage devices.
- ior_bench.sh: Automates running IO benchmarks on different storage devices using the IOR library.
- plotting.sh: Generates visual plots of the energy consumption data collected during the benchmarks.
- format.sh: Organizes and formats raw data collected during IO tests.
./format.sh <directory_to_move>
- iortest1.c: Modified iotest.c including IOR trace functionality.
- filter_traces.c: Parses raw trace data and formats it for replay.
- normalize.py: Calculates normalized offsets for IO operations.
- frequency.py: Displays the frequency distribution of IO requests.
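To make the last two entries concrete, here is a minimal sketch of what "normalized offsets" and a "frequency distribution of IO requests" could look like. It assumes that normalization means dividing each byte offset by the file size and that the distribution is counted over request sizes; the function names and sample data are hypothetical and are not taken from normalize.py or frequency.py.

```python
from collections import Counter


def normalize_offsets(offsets, file_size):
    """Map byte offsets to the range [0, 1] by dividing by the file size."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    return [off / file_size for off in offsets]


def size_frequency(request_sizes):
    """Count how often each request size occurs, ordered by size."""
    return dict(sorted(Counter(request_sizes).items()))


# Hypothetical IO requests against a 4096-byte file.
offsets = [0, 512, 1024, 2048]
sizes = [512, 512, 4096, 512]

print(normalize_offsets(offsets, 4096))
print(size_frequency(sizes))
```

Normalizing offsets this way makes access patterns comparable across files of different sizes.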
Interactive node reservation using OAR:
oarsub -I -l host=1,walltime=7:00 -q default -p taurus -t deploy
kadeploy3 debian11-big
apt-get update
apt-get install -y build-essential openmpi-bin
apt-get install -y libnetcdf-dev libhdf5-dev libpnetcdf-dev
apt-get install -y m4 make gcc texinfo
wget http://ftp.gnu.org/gnu/autoconf/autoconf-2.71.tar.gz
tar xf autoconf-2.71.tar.gz
cd autoconf-2.71
./configure --prefix=/usr/local
make && sudo make install
cd ..
git clone https://github.com/hpc/ior.git
cd ior
autoconf --version
autoreconf -fi
export CPPFLAGS="$(pkg-config --cflags hdf5)"
export LDFLAGS="$(pkg-config --libs hdf5)"
./bootstrap
MPICC=mpicc ./configure --with-hdf5=/usr --with-ncmpi
make
export OMPI_ALLOW_RUN_AS_ROOT=1
export OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1
mpirun -np 1 ~/ior/src/ior -a POSIX -b 256m -s 1 -t 512 -w -i 1 -o ior_256M_testfile -k
strace -yy -f -e trace=read,write,lseek,open,openat,creat,close,unlink mpirun -np 1 ~/ior/src/ior -a POSIX -b 256m -s 1 -t 512 -r -i 1 -o ior_256M_testfile -k -z > trace.log 2>&1
gcc -o filter_trace filter_trace.c
./filter_trace trace.log > filtered_trace.txt
./iortest_replay_fixed --mode replay --trace-file filtered_trace.txt --data-file /path/to/datafile
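The trace-filtering step above reduces the strace output to the operations on the test file. The Python sketch below shows one way such a filter could work; it is not the logic of the C filter in this repository. It assumes the line layout that strace produces with -f and -yy (an optional [pid N] prefix and file descriptors annotated as fd</path>); the sample lines and the output tuple format are hypothetical.

```python
import re

# Matches completed calls such as:
#   [pid  1234] read(3</tmp/file>, "..."..., 512) = 512
#   lseek(3</tmp/file>, 1024, SEEK_SET) = 1024
LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+)?"
    r"(read|write|lseek)\(\d+<([^>]*)>,.*\)\s+=\s+(-?\d+)"
)


def filter_trace(lines, target_name):
    """Keep successful read/write/lseek calls on files ending with target_name."""
    ops = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # other syscalls, unfinished calls, program output
        syscall, path, result = m.group(1), m.group(2), int(m.group(3))
        if result < 0 or not path.endswith(target_name):
            continue  # failed call or unrelated file
        ops.append((syscall, result))
    return ops


# Hypothetical trace lines for illustration.
sample = [
    '[pid  1234] openat(AT_FDCWD, "ior_256M_testfile", O_RDONLY) = 3</tmp/ior_256M_testfile>',
    '[pid  1234] lseek(3</tmp/ior_256M_testfile>, 1024, SEEK_SET) = 1024',
    '[pid  1234] read(3</tmp/ior_256M_testfile>, "\\0\\0\\0"..., 512) = 512',
    '[pid  1234] read(4</etc/hosts>, "127.0.0.1"..., 4096) = 221',
]

for op in filter_trace(sample, "ior_256M_testfile"):
    print(op)
```

Filtering on the annotated path rather than on the descriptor number avoids confusing the test file with unrelated files that reuse the same descriptor.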
This project includes a Makefile to automate common tasks such as compiling, running benchmarks, plotting results, and managing the environment on Grid5000. Using make simplifies the workflow and ensures consistency.
- make compile_iotest: Compiles the C program iotest.c to create an executable.
  compile_iotest: gcc -g iotest.c -o iotest -lm
- make run_iotest: Runs a single instance of the iotest benchmark with a specific configuration (READ, 10 runs, 1 block, 1MB block size, 256MB file size).
  run_iotest: sudo-g5k ./iotest --mode read --nb_run 10 --nb_bloc 1 --sz_bloc 1M --filesize 256M
These targets run the main benchmark.sh script with different configurations for HDD and SSD, for both sequential and random access patterns, and for both read and write operations.
- make benchmark_taurus_HDD_RAND_READ: Runs a random read benchmark on an HDD.
- make benchmark_taurus_HDD_SEQ_READ: Runs a sequential read benchmark on an HDD.
- make benchmark_gros_SSD_RAND_READ: Runs a random read benchmark on an SSD.
- make benchmark_gros_SSD_SEQ_READ: Runs a sequential read benchmark on an SSD.
- make benchmark_taurus_HDD_RAND_WRITE: Runs a random write benchmark on an HDD.
- make benchmark_taurus_HDD_SEQ_WRITE: Runs a sequential write benchmark on an HDD.
- make benchmark_gros_SSD_RAND_WRITE: Runs a random write benchmark on an SSD.
- make benchmark_gros_SSD_SEQ_WRITE: Runs a sequential write benchmark on an SSD.
These targets simplify the process of reserving nodes on Grid5000.
- make reservation_taurus_short: Reserves 2 Taurus hosts for 2 hours with power monitoring enabled.
- make reservation_taurus_long: Reserves 4 Taurus hosts for 12 hours with power monitoring enabled.
- make ior_bench_reservation_taurus: Reserves a single Taurus host for 1 hour and 45 minutes for IOR benchmarks.
These targets automate the plotting of benchmark results. They generate different types of plots (baseline, all, boxplots) to visualize the data.
- make plotting_HDD_READ_RAND: Generates various plots for the random read HDD benchmark.
- make plotting_HDD_READ_SEQ: Generates various plots for the sequential read HDD benchmark.
- make plotting_HDD_WRITE_RAND: Generates various plots for the random write HDD benchmark.
- make plotting_HDD_WRITE_SEQ: Generates various plots for the sequential write HDD benchmark.
- make format_HDD: Formats the raw log data for the HDD.
- make plot_deltas_READ: Plots the power delta (change in power consumption) for all read tests.
- make plot_deltas_WRITE: Plots the power delta for all write tests.
- make apply_means_READ: Calculates the mean power for all read tests.
- make apply_means_WRITE: Calculates the mean power for all write tests.
- make calcul_HDD_READ: Calculates energy consumption for the HDD read benchmarks using calcul_hdd.py.
- make calcul_HDD_WRITE: Calculates energy consumption for the HDD write benchmarks.
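The power delta and mean power mentioned in the targets above can be pictured with a small sketch. It assumes the delta is the mean power measured during IO minus the mean idle (baseline) power, and that the energy attributed to IO is this delta times the run duration. The function names and sample readings are hypothetical, and the project's calcul_hdd.py may compute these values differently.

```python
from statistics import mean


def power_delta(io_watts, baseline_watts):
    """Mean power during IO minus mean idle (baseline) power, in watts."""
    return mean(io_watts) - mean(baseline_watts)


def io_energy(io_watts, baseline_watts, duration_s):
    """Energy attributed to IO: power delta times run duration, in joules."""
    return power_delta(io_watts, baseline_watts) * duration_s


# Hypothetical wattmeter readings (W) for an idle node and for a run.
baseline = [95.0, 96.0, 94.0, 95.0]
during_io = [101.0, 103.0, 102.0, 102.0]

print(power_delta(during_io, baseline))
print(io_energy(during_io, baseline, 10.0))
```

Subtracting the baseline isolates the share of node power that can reasonably be attributed to the IO workload rather than to idle consumption.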