2 changes: 2 additions & 0 deletions .azuredevops/rocm-ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ trigger:
batch: true
branches:
include:
- develop
- amd-staging
- amd-mainline
paths:
Expand All @@ -29,6 +30,7 @@ pr:
autoCancel: true
branches:
include:
- develop
- amd-staging
- amd-mainline
paths:
Expand Down
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@

Full documentation for ROCm Compute Profiler is available at [https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/).

## Unreleased

* Add Docker files to package the application and its dependencies into a single portable standalone binary

## (Unreleased) ROCm Compute Profiler 3.1.0 for ROCm 6.4.0

### Added
Expand Down
24 changes: 24 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -372,6 +372,30 @@ add_custom_target(
"src/${PACKAGE_NAME},cmake/Dockerfile,cmake/rocm_install.sh,docker/docker-entrypoint.sh,src/rocprof_compute_analyze/convertor/mongodb/convert"
)

# Standalone binary creation
add_custom_target(
standalonebinary
# Change working directory to src
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/src
# Check nuitka
COMMAND ${Python3_EXECUTABLE} -m pip list | grep -i nuitka > /dev/null 2>&1
# Check patchelf
COMMAND ${Python3_EXECUTABLE} -m pip list | grep -i patchelf > /dev/null 2>&1
# Create VERSION.sha file
COMMAND git -C ${PROJECT_SOURCE_DIR} rev-parse HEAD > VERSION.sha
# Build standalone binary
COMMAND
${Python3_EXECUTABLE} -m nuitka --mode=onefile
--include-data-files=${PROJECT_SOURCE_DIR}/VERSION*=./ --enable-plugin=no-qt
--include-package-data=dash_svg --include-package=dash_bootstrap_components
--include-package=plotly --include-package-data=kaleido
--include-package=rocprof_compute_soc --include-package-data=rocprof_compute_soc
--include-package-data=utils rocprof-compute
# Remove library rpath from executable
COMMAND patchelf --remove-rpath rocprof-compute.bin
# Move to build directory
COMMAND mv rocprof-compute.bin ${CMAKE_BINARY_DIR})
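
The first two commands of the `standalonebinary` target only verify that `nuitka` and `patchelf` show up in `pip list`. As a rough illustration of the same precondition, here is a minimal Python sketch that looks the distributions up with `importlib.metadata`. The helper name `missing_distributions` is hypothetical and is not part of this PR; the sketch assumes Python 3.8 or newer.

```python
from importlib import metadata


def missing_distributions(names):
    """Return the names that are not installed for the current interpreter."""
    missing = []
    for name in names:
        try:
            # Raises PackageNotFoundError when the distribution is absent.
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def check_build_dependencies():
    """Abort with a readable message if a build-time dependency is absent."""
    # "nuitka" and "patchelf" are the two packages the CMake target greps for.
    missing = missing_distributions(["nuitka", "patchelf"])
    if missing:
        raise SystemExit("Missing build dependencies: " + ", ".join(missing))
```

Unlike the `grep` pipeline, this reports which package is missing instead of failing silently with a non-zero exit code.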

install(
FILES ${PROJECT_SOURCE_DIR}/LICENSE
DESTINATION ${CMAKE_INSTALL_DOCDIR}
Expand Down
19 changes: 19 additions & 0 deletions LICENSE
Original file line number Diff line number Diff line change
Expand Up @@ -19,3 +19,22 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

This application uses the following dependencies; their use is governed by their respective licenses:
Python 3 standard library: PSFL
Nuitka specific runtime code: Apache 2.0 license
astunparse python library: PSFL
colorlover python library: MIT
dash python library: MIT
matplotlib python library: PSFL
numpy python library: BSD
pandas python library: BSD
pymongo python library: Apache 2.0 license
pyyaml python library: MIT
tabulate python library: MIT
tqdm python library: MIT
dash-svg python library: MIT
dash-bootstrap-components python library: MIT
kaleido python library: MIT
setuptools python library: MIT
plotille python library: MIT
25 changes: 23 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,8 +37,8 @@ Users may checkout `amd-staging` to preview upcoming features.
## Testing

To quickly get the environment (bash shell) for building and testing, run the following commands:
* `cd utils/docker_env`
* `docker compose run app`
* `cd docker`
* `docker compose -f docker-compose.test.yml run test`

Inside the docker container, clean, build and install the project with tests enabled:
```
Expand All @@ -56,6 +56,27 @@ For manual testing, you can find the executable at `install/bin/rocprof-compute`

NOTE: This Dockerfile uses `rocm/dev-ubuntu-22.04` as the base image

## Standalone binary

To create a standalone binary, run the following commands:
* `cd docker`
* `docker compose -f docker-compose.standalone.yml run standalone`

The `rocprof-compute.bin` standalone binary is written to the `build` folder in the root directory of the project.

The build performs these steps:
* Use the RHEL 8 image that is used to build ROCm as the base image
* Install Python 3.8
* Install the runtime dependencies and the dependencies needed to create the standalone binary
* Call the make target, which uses Nuitka to build the standalone binary

NOTE: Since RHEL 8 ships with glibc version 2.28, this standalone binary can only be run in environments with glibc version 2.28 or later.
The glibc version can be checked with the `ldd --version` command.
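
The glibc requirement can also be checked from Python before launching the binary. The following is a minimal sketch, not part of the project: the helper names are hypothetical, and it assumes that version 2.28 itself is acceptable because that is the version the binary is built against.

```python
import platform

# glibc version shipped with RHEL 8, per the note above.
MINIMUM_GLIBC = (2, 28)


def parse_version(text):
    """Turn a dotted version such as '2.28' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_is_supported(version_text, minimum=MINIMUM_GLIBC):
    """Compare numerically, so that '2.4' is correctly older than '2.28'."""
    return parse_version(version_text) >= minimum


def host_glibc_version():
    """Return the host glibc version string, or None if it cannot be determined."""
    lib, version = platform.libc_ver()
    return version if lib == "glibc" and version else None
```

A caller could combine the two, for example `glibc_is_supported(host_glibc_version())` after checking that the version is not `None`.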

NOTE: The `libnss3.so` shared library is required when using the `--roof-only` option, which generates roofline data in PDF format.

To test the standalone binary, pass the `--call-binary` option to pytest.

## How to Cite

This software can be cited using a Zenodo
Expand Down
22 changes: 22 additions & 0 deletions docker/Dockerfile.standalone
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
FROM redhat/ubi8:8.10-1184

WORKDIR /app

RUN yum install -y curl gcc cmake git

# Allows running git commands in /app
RUN git config --global --add safe.directory /app

RUN yum install -y python38 python38-devel && \
yum clean all && \
rm -rf /var/cache/yum && \
curl -sS https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
python3.8 get-pip.py

CMD ["/bin/bash", "-c", "\
python3.8 -m pip install -r requirements.txt \
&& python3.8 -m pip install nuitka patchelf \
&& rm -rf build \
&& cmake -B build -S . \
&& make -C build standalonebinary \
"]
17 changes: 9 additions & 8 deletions utils/docker_env/Dockerfile → docker/Dockerfile.test
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,13 @@ WORKDIR /app

# Update package list and install prerequisites
RUN apt-get update && apt-get install -y \
software-properties-common cmake locales \
software-properties-common cmake locales git \
&& add-apt-repository ppa:deadsnakes/ppa \
&& apt-get update

# Allows running git commands in /app
RUN git config --global --add safe.directory /app

# Generate the desired locale
RUN locale-gen en_US.UTF-8

Expand All @@ -19,11 +22,9 @@ RUN apt-get install -y python3.10 python3.10-venv python3.10-dev python3-pip
# Set Python 3.10 as the default python3
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1

# Copy your application code to the container
COPY . .

# Install any dependencies specified in requirements.txt
RUN pip3 install --no-cache-dir -r requirements.txt -r requirements-test.txt

# Command to run your application
CMD ["/bin/bash"]
# Run interactive bash shell
CMD ["/bin/bash", "-c", "\
python3.10 -m pip install -r requirements.txt -r requirements-test.txt \
&& exec /bin/bash \
"]
12 changes: 12 additions & 0 deletions docker/docker-compose.standalone.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
services:
standalone:
build:
context: ../
dockerfile: docker/Dockerfile.standalone
devices:
- /dev/kfd
- /dev/dri
security_opt:
- seccomp:unconfined
volumes:
- ../:/app
12 changes: 12 additions & 0 deletions docker/docker-compose.test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
services:
test:
build:
context: ../
dockerfile: docker/Dockerfile.test
devices:
- /dev/kfd
- /dev/dri
security_opt:
- seccomp:unconfined
volumes:
- ../:/app
24 changes: 23 additions & 1 deletion src/rocprof_compute_profile/profiler_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
import logging
import os
import re
import shutil
import sys
import time
from abc import ABC, abstractmethod
Expand Down Expand Up @@ -77,6 +78,26 @@ def join_prof(self, out=None):
out = self.__args.path + "/pmc_perf.csv"
files = glob.glob(self.__args.path + "/" + "pmc_perf_*.csv")
files.extend(glob.glob(self.__args.path + "/" + "SQ_*.csv"))

if self.get_args().hip_trace:
# remove hip api trace outputs from this list
files = [
f
for f in files
if not re.compile(r"^.*_hip_api_trace\.csv$").match(
os.path.basename(f)
)
]

if self.get_args().kokkos_trace:
# remove marker api trace outputs from this list
files = [
f
for f in files
if not re.compile(r"^.*_marker_api_trace\.csv$").match(
os.path.basename(f)
)
]
elif type(self.__args.path) == list:
files = self.__args.path
else:
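
The two filters added in this hunk call `re.compile` once per file inside the list comprehension. As a possible tidy-up, here is a minimal standalone sketch of the same filtering with the patterns compiled once. The function name `drop_trace_outputs` and its keyword arguments are hypothetical stand-ins for `self.get_args().hip_trace` and `self.get_args().kokkos_trace`; this is not the code in the PR.

```python
import os
import re

# Compiled once at import time instead of once per file.
HIP_API_TRACE = re.compile(r"^.*_hip_api_trace\.csv$")
MARKER_API_TRACE = re.compile(r"^.*_marker_api_trace\.csv$")


def drop_trace_outputs(files, hip_trace=False, kokkos_trace=False):
    """Remove API trace CSVs so that only counter outputs are joined."""
    patterns = []
    if hip_trace:
        patterns.append(HIP_API_TRACE)
    if kokkos_trace:
        patterns.append(MARKER_API_TRACE)
    # Match against the basename only, as the hunk above does.
    return [
        f
        for f in files
        if not any(p.match(os.path.basename(f)) for p in patterns)
    ]
```

When neither flag is set, the pattern list is empty and the input list is returned unchanged, which matches the behaviour of the two `if` blocks.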
Expand Down Expand Up @@ -266,7 +287,8 @@ def pre_processing(self):
# verify correct formatting for application binary
self.__args.remaining = self.__args.remaining[1:]
if self.__args.remaining:
if not Path(self.__args.remaining[0]).is_file():
# Ensure that command points to an executable
if not shutil.which(self.__args.remaining[0]):
console_error(
"Your command %s doesn't point to an executable. Please verify."
% self.__args.remaining[0]
Expand Down
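
The switch from `Path(...).is_file()` to `shutil.which(...)` changes what counts as a valid command: `shutil.which` also resolves bare command names through `PATH` and requires the target to be executable. A minimal sketch of the difference follows; the helper names are hypothetical and are not taken from the PR.

```python
import shutil
import sys
from pathlib import Path


def is_regular_file(command):
    """Old check: true only if `command` is a path to an existing file."""
    return Path(command).is_file()


def is_runnable(command):
    """New check: true if `command` resolves to an executable.

    Accepts both explicit paths and bare names that are looked up on PATH.
    """
    return shutil.which(command) is not None


def describe(command):
    """Summarise both checks for one command string."""
    return {"is_file": is_regular_file(command), "runnable": is_runnable(command)}
```

For a bare name such as `ls`, the old check fails unless a file called `ls` exists in the current directory, while the new check succeeds whenever `ls` is found on `PATH`. `describe(sys.executable)` reports both checks for the running interpreter.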
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,12 @@ Panel Config:
peak: (((($max_sclk * $cu_per_gpu) * 64) * 2) / 1000)
pop: None # No perf counter
tips:
MFMA FLOPs (F8):
value: None # No HW module
unit: GFLOP
peak: None # No HW module
pop: None # No HW module
tips:
MFMA FLOPs (BF16):
value: None # No perf counter
unit: GFLOPs
Expand Down
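
For reference, the `peak` expression shown in the context line of this hunk can be evaluated by hand. The sketch below only mirrors the arithmetic of `(((($max_sclk * $cu_per_gpu) * 64) * 2) / 1000)`. The input values are made up for illustration, and the reading that `$max_sclk` is in MHz (so that the result is in GFLOPs) is an assumption, not something the diff states.

```python
def peak_from_panel_formula(max_sclk, cu_per_gpu):
    """Evaluate (((max_sclk * cu_per_gpu) * 64) * 2) / 1000 as written in the panel config."""
    return (((max_sclk * cu_per_gpu) * 64) * 2) / 1000


# Hypothetical device: 1000 (assumed MHz) clock and 100 compute units.
example_peak = peak_from_panel_formula(1000, 100)
print(example_peak)  # 12800.0
```

The new `MFMA FLOPs (F8)` entries set `peak` to `None`, so no such expression is evaluated for them.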
Original file line number Diff line number Diff line change
Expand Up @@ -241,6 +241,12 @@ Panel Config:
max: None # No HW module
unit: (instr + $normUnit)
tips:
MFMA-F8:
avg: None # No HW module
min: None # No HW module
max: None # No HW module
unit: (instr + $normUnit)
tips:
MFMA-F16:
avg: None # No HW module
min: None # No HW module
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,16 +21,22 @@ Panel Config:
metric:
VALU FLOPs:
value: None # No perf counter
Unit: None
unit: None
peak: None
pop: None
tips:
VALU IOPs:
value: None # No perf counter
Unit: None
unit: None
peak: None
pop: None
tips:
MFMA FLOPs (F8):
value: None # No perf counter
unit: GFLOP
peak: None # No perf counter
pop: None # No perf counter
tips:
MFMA FLOPs (BF16):
value: None # No perf counter
Unit: None
Expand All @@ -39,25 +45,25 @@ Panel Config:
tips:
MFMA FLOPs (F16):
value: None # No perf counter
Unit: None
unit: None
peak: None
pop: None
tips:
MFMA FLOPs (F32):
value: None # No perf counter
Unit: None
unit: None
peak: None
pop: None
tips:
MFMA FLOPs (F64):
value: None # No perf counter
Unit: None
unit: None
peak: None
pop: None
tips:
MFMA IOPs (INT8):
value: None # No perf counter
Unit: None
unit: None
peak: None
pop: None
tips:
Expand Down Expand Up @@ -174,6 +180,12 @@ Panel Config:
max: None # No perf counter
unit: (OPs + $normUnit)
tips:
F8 OPs:
avg: None # No HW module
min: None # No HW module
max: None # No HW module
unit: (OPs + $normUnit)
tips:
F16 OPs:
avg: None # No perf counter
min: None # No perf counter
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,12 @@ Panel Config:
peak: (((($max_sclk * $cu_per_gpu) * 64) * 2) / 1000)
pop: None # No perf counter
tips:
MFMA FLOPs (F8):
value: None # No HW module
unit: GFLOP
peak: None # No HW module
pop: None # No HW module
tips:
MFMA FLOPs (BF16):
value: None # No perf counter
unit: GFLOPs
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -73,13 +73,13 @@ Panel Config:
unit: Unit
tips: Tips
metric:
INT-32:
INT32:
avg: None # No perf counter
min: None # No perf counter
max: None # No perf counter
unit: (instr + $normUnit)
tips:
INT-64:
INT64:
avg: None # No perf counter
min: None # No perf counter
max: None # No perf counter
Expand Down Expand Up @@ -241,6 +241,12 @@ Panel Config:
max: None # No HW module
unit: (instr + $normUnit)
tips:
MFMA-F8:
avg: None # No HW module
min: None # No HW module
max: None # No HW module
unit: (instr + $normUnit)
tips:
MFMA-F16:
avg: None # No HW module
min: None # No HW module
Expand Down