Open
41 commits
5672e09
Add files via upload
mattwill-amd Jul 8, 2025
7975be4
Uploading dummy config files
mattwill-amd Jul 9, 2025
b049331
Merge branch 'amd-staging' into config
mattwill-amd Jul 14, 2025
bf7b824
AQLprofile onboarding docs review
matwilli_amdeng Jul 14, 2025
24e4c04
Merge branch 'config' of https://github.com/AMD-ROCm-Internal/aqlprof…
matwilli_amdeng Jul 14, 2025
5c908cf
Update docs/install/aqlprofile-install.rst
mattwill-amd Jul 14, 2025
3a4b6e7
Deleting dummy files
matwilli_amdeng Jul 14, 2025
83d242a
Merge branch 'config' of https://github.com/AMD-ROCm-Internal/aqlprof…
matwilli_amdeng Jul 14, 2025
f727839
Convert MDs to RST
matwilli_amdeng Jul 15, 2025
2686c69
Updating conf
matwilli_amdeng Jul 15, 2025
cedddf9
Merge branch 'amd-staging' into config
mattwill-amd Jul 18, 2025
c65cc77
Update CODEOWNERS.txt
mattwill-amd Jul 18, 2025
910f7fa
Update index.rst
mattwill-amd Jul 18, 2025
e235faa
Apply suggestions from code review
mattwill-amd Jul 18, 2025
4758357
Initial meeting feedback
matwilli_amdeng Jul 24, 2025
0b83864
Merge branch 'config' of https://github.com/AMD-ROCm-Internal/aqlprof…
matwilli_amdeng Jul 24, 2025
77ec2dd
publish branch
Jul 28, 2025
0e16673
update description
Jul 28, 2025
e65ec11
Update docs/examples/sqtt-workflow.rst
mattwill-amd Jul 29, 2025
bcd6992
Format updates
Jul 30, 2025
46b4b9d
Final edits
Jul 30, 2025
354fcee
updating readme
Jul 30, 2025
307bd74
Leo feedback
Jul 30, 2025
1a37c44
Hardware names
Jul 31, 2025
5f08780
Table fix
Jul 31, 2025
36bcc9a
Review comments
Aug 1, 2025
7671342
Merge commit '36bcc9a5b6c6c0bf0203fd46f56e02d47c7fd4f4' into import/d…
systems-assistant[bot] Aug 7, 2025
eaac4af
Update conf.py
mattwill-amd Aug 18, 2025
0db2f7c
Apply suggestions from code review
mattwill-amd Jul 18, 2025
2b3aa97
publish branch
Jul 28, 2025
1501f07
update description
Jul 28, 2025
25fd92e
Format updates
Jul 30, 2025
1a32266
Final edits
Jul 30, 2025
4b697f4
updating readme
Jul 30, 2025
10e2aa6
Leo feedback
Jul 30, 2025
83d8855
Hardware names
Jul 31, 2025
6827640
Table fix
Jul 31, 2025
0f04055
Review comments
Aug 1, 2025
f8c812c
Rebase fixes
Aug 20, 2025
95b34be
Fixing readme
Aug 20, 2025
8083907
Merge branch 'develop' into import/develop/ROCm_aqlprofile/config
mattwill-amd Aug 20, 2025
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -105,4 +105,4 @@ This super-repo contains multiple subprojects, each of which retains the license
- 💬 [Start a discussion](https://github.com/ROCm/rocm-systems/discussions)
- 🐞 [Open an issue](https://github.com/ROCm/rocm-systems/issues)

We're happy to help!
We're happy to help!
6 changes: 6 additions & 0 deletions projects/aqlprofile/.github/CODEOWNERS.txt
@@ -0,0 +1,6 @@
* @gbaraldi_amdeng @bingma12_amdeng @chunyang_amdeng @sauverma_amdeng @bewelton_amdeng
# Documentation files
docs/ @ROCm/rocm-documentation
*.md @ROCm/rocm-documentation
*.rst @ROCm/rocm-documentation
.readthedocs.yaml @ROCm/rocm-documentation
8 changes: 4 additions & 4 deletions projects/aqlprofile/CMakeLists.txt
@@ -55,7 +55,7 @@ if ( "${CMAKE_BUILD_TYPE}" STREQUAL release )
endif ()

# Enable/disable test
option(AQLPROFILE_BUILD_TESTS "Build tests for AQLProfile" OFF)
option(AQLPROFILE_BUILD_TESTS "Build tests for AQLprofile" OFF)

## Build tests
if(AQLPROFILE_BUILD_TESTS)
@@ -197,18 +197,18 @@ include ( CPack )
cpack_add_component(
runtime
DISPLAY_NAME "Runtime"
DESCRIPTION "Dynamic libraries for the AQLProfile")
DESCRIPTION "Dynamic libraries for the AQLprofile")

cpack_add_component(
asan
DISPLAY_NAME "ASAN"
DESCRIPTION "ASAN libraries for the AQLProfile"
DESCRIPTION "ASAN libraries for the AQLprofile"
DEPENDS asan)

if(AQLPROFILE_BUILD_TESTS)
cpack_add_component(
tests
DISPLAY_NAME "Tests"
DESCRIPTION "Tests for the AQLProfile"
DESCRIPTION "Tests for the AQLprofile"
DEPENDS runtime)
endif()
3 changes: 3 additions & 0 deletions projects/aqlprofile/README.md
@@ -2,6 +2,9 @@

AQLprofile is an open source library that enables advanced GPU profiling and tracing on AMD platforms. It works in conjunction with [rocprofiler-sdk](https://github.com/ROCm/rocprofiler-sdk) to support profiling methods such as performance counters (PMC) and SQ thread trace (SQTT). AQLprofile provides the foundational mechanisms for constructing AQL packets and managing profiling operations across multiple AMD GPU architecture families.

> [!NOTE]
> The published documentation is available at [AQLprofile documentation] in an organized, easy-to-read format, with search and a table of contents. The documentation source files reside in the `aqlprofile/docs` folder of this repository. As with all ROCm projects, the documentation is open source. For more information on contributing to the documentation, see [Contribute to ROCm documentation](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).

### Background

AQLprofile builds on concepts from the Heterogeneous System Architecture (HSA) and Architected Queuing Language (AQL), which define the foundations for GPU command processing and profiling on AMD platforms. For further reading:
2 changes: 1 addition & 1 deletion projects/aqlprofile/build.sh
@@ -8,7 +8,7 @@ ROCM_PATH="${ROCM_PATH:=/opt/rocm}"
LD_RUNPATH_FLAG=" -Wl,--enable-new-dtags -Wl,--rpath,$ROCM_PATH/lib:$ROCM_PATH/lib64"

usage() {
echo -e "AQLProfile Build Script Usage:"
echo -e "AQLprofile Build Script Usage:"
echo -e "\nTo run ./build.sh PARAMs, PARAMs can be the following:"
echo -e "-h | --help For showing this message"
echo -e "-b | --build For compiling"
61 changes: 61 additions & 0 deletions projects/aqlprofile/docs/conf.py
@@ -0,0 +1,61 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

import re

'''
html_theme is usually unchanged (rocm_docs_theme).
flavor defines the site header display, select the flavor for the corresponding portals
flavor options: rocm, rocm-docs-home, rocm-blogs, rocm-ds, instinct, ai-developer-hub, local, generic
'''
html_theme = "rocm_docs_theme"
html_theme_options = {"flavor": "rocm-docs-home"}


# This section turns on/off article info
setting_all_article_info = True
all_article_info_os = ["linux"]
all_article_info_author = ""

# Dynamically extract component version
with open('../CMakeLists.txt', encoding='utf-8') as f:
pattern = r'.*\brocm_setup_version\(VERSION\s+([0-9.]+)[^0-9.]+' # Update according to each component's CMakeLists.txt
match = re.search(pattern,
f.read())
if not match:
raise ValueError("VERSION not found!")
version_number = "1.0.0"

# for PDF output on Read the Docs
project = "AQLprofile"
author = "Advanced Micro Devices, Inc."
copyright = "Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved."
version = version_number
release = version_number

external_toc_path = "./sphinx/_toc.yml" # Defines Table of Content structure definition path

'''
Doxygen Settings
Ensure Doxyfile is located at docs/doxygen.
If the component does not need doxygen, delete this section for optimal build time
'''
#doxygen_root = "doxygen"
#doxysphinx_enabled = False
# doxygen_project = {
# "name": "doxygen",
# "path": "doxygen/xml",
#}

# Add additional packages as needed
extensions = [
"rocm_docs",
# "rocm_docs.doxygen",
]

html_title = f"{project} {version_number} documentation"

external_projects_current_project = "AQLprofile"
109 changes: 109 additions & 0 deletions projects/aqlprofile/docs/examples/pmc-workflow.rst
@@ -0,0 +1,109 @@
.. meta::
:description: A typical workflow for collecting PMC data
:keywords: AQLprofile, ROCm, API, how-to, PMC

**********************************************************
Performance Monitor Control (PMC) workflow with AQLprofile
**********************************************************

This page describes a typical workflow for collecting PMC data using AQLprofile (as integrated in `ROCprofiler-SDK <https://github.com/ROCm/rocprofiler-sdk>`__).
This workflow relies on creating a profile object, generating command packets, and iterating over output buffers:

1. **Intercept kernel dispatch**: The SDK intercepts kernel dispatch packets submitted to the GPU queue.
2. **Create a profile object**: A profile/session object is created, specifying the agent (GPU), events (counters), and output buffers.
3. **Generate command packets**: Start, stop, and read command packets are generated and injected into the queue around the kernel dispatch.
4. **Submit packets and run the kernel**: The kernel and profiling packets are submitted to the GPU queue for execution.
5. **Collect the output buffer**: After execution, the output buffer is read back from the GPU.
6. **Iterate and extract the results**: The SDK iterates over the output buffer to extract and report counter results.

The SDK abstracts queue interception and packet management so tool developers can focus on results.
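
The packet ordering described in steps 3 and 4 can be sketched as follows. This is a minimal, self-contained illustration rather than SDK code: ``PacketKind``, ``profiled_dispatch_order``, and ``packet_name`` are hypothetical names that do not exist in AQLprofile or ROCprofiler-SDK.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a queue packet; the real workflow uses AQL/PM4 packets.
enum class PacketKind { Start, Kernel, Stop, Read };

// Returns the submission order for one profiled dispatch:
// start -> kernel -> stop -> read.
std::vector<PacketKind> profiled_dispatch_order() {
    return {PacketKind::Start, PacketKind::Kernel, PacketKind::Stop, PacketKind::Read};
}

// Human-readable name, useful when logging the injected sequence.
std::string packet_name(PacketKind kind) {
    switch (kind) {
        case PacketKind::Start:  return "start";
        case PacketKind::Kernel: return "kernel";
        case PacketKind::Stop:   return "stop";
        case PacketKind::Read:   return "read";
    }
    return "unknown";
}
```

The point of the sketch is only the ordering: the counters are started before the kernel runs, stopped after it, and read back last.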

Key API code snippets
=====================

These API snippets use the legacy interfaces from ``hsa_ven_amd_aqlprofile.h``. They are provided for illustration only.
For new development, refer to the updated APIs in ``aql_profile_v2.h``.

.. note::

The ROCprofiler-SDK is migrating to the newer interfaces in ``aql_profile_v2.h``. Use the APIs in ``aql_profile_v2.h`` to stay up to date.

Define the events and profile
-----------------------------

.. code:: cpp

// Select events (counters) to collect
hsa_ven_amd_aqlprofile_event_t events[] = {
{ HSA_VEN_AMD_AQLPROFILE_BLOCK_NAME_SQ, 0, 2 }, // Example: SQ block, instance 0, counter 2
{ HSA_VEN_AMD_AQLPROFILE_BLOCK_NAME_SQ, 0, 3 }
};

// Create profile object
hsa_ven_amd_aqlprofile_profile_t profile = {
.agent = agent, // hsa_agent_t
.type = HSA_VEN_AMD_AQLPROFILE_EVENT_TYPE_PMC,
.events = events,
.event_count = sizeof(events)/sizeof(events[0]),
.parameters = nullptr,
.parameter_count = 0,
.output_buffer = {output_ptr, output_size},
.command_buffer = {cmd_ptr, cmd_size}
};


Validate events
---------------

.. code:: cpp

bool valid = false;
hsa_ven_amd_aqlprofile_validate_event(agent, &events[0], &valid);
if (!valid) {
// Handle invalid event
}
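
The snippet above validates only the first event. A tool would typically check every requested event before building the profile. The following sketch shows one way to do that, assuming a caller-supplied predicate: ``Event``, ``find_invalid_events``, and ``is_valid`` are hypothetical names, and in a real tool the predicate would wrap the library's validation call.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for an event descriptor (block, instance, counter).
struct Event {
    std::uint32_t block;
    std::uint32_t block_index;
    std::uint32_t counter_id;
};

// Returns the indices of all events rejected by `is_valid`.
// An empty result means every event can be collected.
std::vector<std::size_t> find_invalid_events(
    const std::vector<Event>& events,
    const std::function<bool(const Event&)>& is_valid) {
    std::vector<std::size_t> rejected;
    for (std::size_t i = 0; i < events.size(); ++i) {
        if (!is_valid(events[i])) {
            rejected.push_back(i);
        }
    }
    return rejected;
}
```

Returning the rejected indices, rather than a single boolean, lets the caller report exactly which counters are unsupported on the target agent.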


Generate command packets
-------------------------

.. code:: cpp

hsa_ext_amd_aql_pm4_packet_t start_pkt, stop_pkt, read_pkt;
hsa_ven_amd_aqlprofile_start(&profile, &start_pkt);
hsa_ven_amd_aqlprofile_stop(&profile, &stop_pkt);
hsa_ven_amd_aqlprofile_read(&profile, &read_pkt);
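
The snippet above ignores any status reported by the packet-generation calls. Assuming these calls return a status code, as the iteration callback shown later on this page does, a small helper can turn failures into exceptions. ``Status`` and ``check`` are hypothetical names used only for this sketch.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical status type standing in for the library's status codes.
enum class Status { Success, Error };

// Throws std::runtime_error if `status` is not Success,
// naming the workflow step that failed.
void check(Status status, const std::string& step) {
    if (status != Status::Success) {
        throw std::runtime_error("profiling step failed: " + step);
    }
}
```

With such a helper, each packet-generation call can be wrapped, for example ``check(status_of_start, "start")``, so a failed step stops the workflow before any packet is submitted.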


Submit packets and run the kernel
---------------------------------

.. code:: cpp

// Pseudocode: inject packets into HSA queue
queue->Submit(&start_pkt);
queue->Submit(&kernel_pkt);
queue->Submit(&stop_pkt);
queue->Submit(&read_pkt);


Iterate and extract results
----------------------------

.. code:: cpp

hsa_ven_amd_aqlprofile_iterate_data(
&profile,
[](hsa_ven_amd_aqlprofile_info_type_t info_type,
hsa_ven_amd_aqlprofile_info_data_t* info_data,
void* user_data) -> hsa_status_t {
if (info_type == HSA_VEN_AMD_AQLPROFILE_INFO_PMC_DATA) {
printf("Event: block %d, id %u, value: %llu\n",
info_data->pmc_data.event.block_name,
info_data->pmc_data.event.counter_id,
static_cast<unsigned long long>(info_data->pmc_data.result));
}
return HSA_STATUS_SUCCESS;
},
nullptr
);
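
The example above passes ``nullptr`` as the user data and prints each result. To keep the results instead, the callback can receive a container through the user-data pointer. The sketch below illustrates that pattern with a mock iterator: ``CounterSample``, ``SampleCallback``, ``iterate_samples``, and ``collect_sample`` are hypothetical stand-ins, not AQLprofile types or functions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the library's record and callback types.
struct CounterSample {
    std::uint32_t counter_id;
    std::uint64_t value;
};
using SampleCallback = bool (*)(const CounterSample& sample, void* user_data);

// Mock iterator: invokes `callback` once per sample and stops early
// if the callback returns false.
void iterate_samples(const std::vector<CounterSample>& samples,
                     SampleCallback callback, void* user_data) {
    for (const CounterSample& sample : samples) {
        if (!callback(sample, user_data)) {
            break;
        }
    }
}

// Callback that appends each sample to the std::vector passed via user_data.
bool collect_sample(const CounterSample& sample, void* user_data) {
    auto* out = static_cast<std::vector<CounterSample>*>(user_data);
    out->push_back(sample);
    return true;
}
```

A plain function (or a capture-less lambda) is used because a C-style callback cannot carry captured state; the user-data pointer is the only channel back to the caller.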
93 changes: 93 additions & 0 deletions projects/aqlprofile/docs/examples/sqtt-workflow.rst
@@ -0,0 +1,93 @@
.. meta::
:description: A typical workflow for collecting detailed instruction-level traces
:keywords: AQLprofile, ROCm, API, how-to, SQTT

***********************************************
SQ Thread Trace (SQTT) workflow with AQLprofile
***********************************************

The SQ Thread Trace workflow focuses on collecting detailed instruction-level traces.
This workflow relies on creating a profile object, generating command packets, and iterating over output buffers:

1. **Intercept the kernel dispatch**: The SDK intercepts the kernel dispatch.
2. **Create a SQTT profile object**: A profile object is created for SQTT, specifying trace parameters and output buffers.
3. **Generate SQTT command packets**: Start, stop, and read packets for SQTT are generated and injected into the queue.
4. **Submit packets and run the kernel**: The kernel and SQTT packets are submitted for execution.
5. **Collect the trace buffer**: The trace output buffer is collected after execution.
6. **Iterate and decode trace data**: The SDK iterates over the trace buffer and decodes the SQTT data for analysis.

The SDK abstracts queue interception and packet management so tool developers can focus on results.

Key API code snippets
=====================

These API snippets use the legacy interfaces from ``hsa_ven_amd_aqlprofile.h``. They are provided for illustration only.
For new development, refer to the updated APIs in ``aql_profile_v2.h``.

In the `ROCprofiler-SDK <https://github.com/ROCm/rocprofiler-sdk>`__ codebase, these APIs are wrapped and orchestrated in the ``aql``, ``hsa``, and ``thread_trace`` folders for queue interception, packet construction, and result iteration.

.. note::

The ROCprofiler-SDK is migrating to the newer interfaces in ``aql_profile_v2.h``. Use the APIs in ``aql_profile_v2.h`` to stay up to date.

Define parameters and profile
------------------------------

.. code:: cpp

hsa_ven_amd_aqlprofile_parameter_t params[] = {
{ HSA_VEN_AMD_AQLPROFILE_PARAMETER_NAME_ATT_BUFFER_SIZE, 0x1000000} // 16 MB buffer
};

hsa_ven_amd_aqlprofile_profile_t profile = {
.agent = agent,
.type = HSA_VEN_AMD_AQLPROFILE_EVENT_TYPE_TRACE,
.events = nullptr,
.event_count = 0,
.parameters = params,
.parameter_count = sizeof(params)/sizeof(params[0]),
.output_buffer = {trace_ptr, trace_size},
.command_buffer = {cmd_ptr, cmd_size}
};
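
The buffer size ``0x1000000`` in the parameter list is 16 × 1024 × 1024 = 16,777,216 bytes. A compile-time check like the one below makes that relationship explicit; ``mib_to_bytes`` is a hypothetical helper for this sketch, not part of the AQLprofile API.

```cpp
#include <cstdint>

// Converts a size in mebibytes to bytes at compile time.
constexpr std::uint64_t mib_to_bytes(std::uint64_t mib) {
    return mib * 1024ull * 1024ull;
}

// 0x1000000 is 16 * 1024 * 1024 = 16,777,216 bytes.
static_assert(mib_to_bytes(16) == 0x1000000,
              "16 MiB should equal 0x1000000 bytes");
```

Expressing the size through a named helper avoids miscounting hexadecimal digits when the buffer size is changed.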


Generate SQTT start/stop packets
---------------------------------

.. code:: cpp

hsa_ext_amd_aql_pm4_packet_t sqtt_start_pkt, sqtt_stop_pkt;
hsa_ven_amd_aqlprofile_start(&profile, &sqtt_start_pkt);
hsa_ven_amd_aqlprofile_stop(&profile, &sqtt_stop_pkt);


Submit packets and run the kernel
---------------------------------

.. code:: cpp

// Pseudocode: inject packets into HSA queue
queue->Submit(&sqtt_start_pkt);
queue->Submit(&kernel_pkt);
queue->Submit(&sqtt_stop_pkt);


Iterate and decode trace data
-----------------------------

.. code:: cpp

hsa_ven_amd_aqlprofile_iterate_data(
&profile,
[](hsa_ven_amd_aqlprofile_info_type_t info_type,
hsa_ven_amd_aqlprofile_info_data_t* info_data,
void* user_data) -> hsa_status_t {
if (info_type == HSA_VEN_AMD_AQLPROFILE_INFO_TRACE_DATA) {
// info_data->trace_data.ptr, info_data->trace_data.size
decode_trace(info_data->trace_data.ptr, info_data->trace_data.size);
}
return HSA_STATUS_SUCCESS;
},
nullptr
);


44 changes: 44 additions & 0 deletions projects/aqlprofile/docs/index.rst
@@ -0,0 +1,44 @@
.. meta::
:description: AQLprofile is an open source library that enables advanced GPU profiling and tracing on AMD platforms.
:keywords: AQLprofile, ROCm, tool, Instinct, accelerator, AMD

.. _index:

************************
AQLprofile documentation
************************

The Architected Queuing Language profiling library (AQLprofile) is an
open source library that enables advanced GPU profiling and tracing on
AMD platforms.

This documentation provides a comprehensive overview of the AQLprofile library.

If you're new to AQLprofile, see :doc:`What is AQLprofile? <what-is-aqlprofile>`.

AQLprofile is open source and hosted at `AQLprofile on GitHub <https://github.com/ROCm/aqlprofile>`_.

.. grid:: 2
:gutter: 3

.. grid-item-card:: Install

* :doc:`Install AQLprofile <install/aqlprofile-install>`

.. grid-item-card:: Examples

* :doc:`Performance Monitor Control (PMC) workflow <examples/pmc-workflow>`
* :doc:`SQ Thread Trace (SQTT) workflow <examples/sqtt-workflow>`

.. grid-item-card:: Reference

* :doc:`Terms <reference/terms>`
* :doc:`APIs <reference/api-list>`
* :doc:`Supported architectures <reference/supported-architectures>`


To contribute to the documentation, refer to
`Contributing to ROCm <https://rocm.docs.amd.com/en/latest/contribute/contributing.html>`_.

You can find licensing information on the
`Licensing <https://rocm.docs.amd.com/en/latest/about/license.html>`_ page.