
[rocprofiler-sdk] Add Duration_ns field to kernel trace CSV output#1841

Open
ihhethan wants to merge 10 commits into develop from user/ihhethan/SWDEV-561821

Conversation

@ihhethan
Contributor

@ihhethan ihhethan commented Nov 13, 2025

Motivation

Adds the missing Duration_ns field to kernel trace CSV output, addressing a regression from rocprof to rocprofv3. This is critical for analyzing applications where the same kernel has different durations per invocation.

Technical Details

Modified Files:

  • rocpd/csv.py

    • Added Duration column to kernel trace CSV query
  • conftest.py

    • Defines pytest fixtures for JSON and rocpd DB inputs
  • validate.py

    • Implements validation logic:
      • Extract kernel dispatch records from JSON
      • Query kernel data via rocpd Python API
      • Verify:
        • Duration == End - Start
        • Exact match between JSON and DB
        • All dispatch IDs align
  • CMakeLists.txt

    • Adds generate + validation test pipeline
    • Configures environment with ROCPROF_OUTPUT_FORMAT=json,rocpd

Change:

Before: "...End_Timestamp","LDS_Block_Size"...
After: "...End_Timestamp","Duration_ns","LDS_Block_Size"...

Test Plan

Tested locally with daxpy_5 on MI300X using ROCm 6.4.1. Verified that the Duration_ns field appears in column 12 and that its values match the end - start calculation.
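The "values match end - start" check from the test plan can be sketched with the standard library alone; the two synthetic rows below reuse the durations reported in the test results, and the real CSV has many more columns:

```python
import csv
from io import StringIO

# Synthetic two-row kernel trace containing only the columns relevant here.
sample = StringIO(
    "Start_Timestamp,End_Timestamp,Duration_ns\n"
    "1000,81360,80360\n"
    "2000,65895,63895\n"
)

for row in csv.DictReader(sample):
    start = int(row["Start_Timestamp"])
    end = int(row["End_Timestamp"])
    # The new column must equal the timestamp difference exactly.
    assert int(row["Duration_ns"]) == end - start
```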

Test Result

All tests passed locally on my machine
Duration_ns in column 12
Values verified: 80360 ns, 63895 ns, 54680 ns
Backward compatible

Submission Checklist

Contributor

@SrirakshaNag SrirakshaNag left a comment

How are we validating this new column in CSV?

"Correlation_Id",
"Start_Timestamp",
"End_Timestamp",
"Duration_ns",
Contributor

The naming here needs to be consistent. Can it be Duration_NS?

Contributor Author

Sure

record.correlation_id.internal,
record.start_timestamp,
record.end_timestamp,
(record.end_timestamp - record.start_timestamp),
Contributor

Would it be possible to add a test case for this?

Contributor Author

I’ve added a new integration test under tests/rocprofv3/kernel-duration-ns:

  • rocprofv3-test-kernel-duration-ns-execute

    • runs rocprofv3 --kernel-trace -f csv on simple-transpose
    • generates out_kernel_trace.csv
  • rocprofv3-test-kernel-duration-ns-validation

    • runs validate.py --csv-input <.../out_kernel_trace.csv>
    • checks that the CSV header contains Start_Timestamp, End_Timestamp, and Duration_NS
    • verifies that Duration_NS == End_Timestamp - Start_Timestamp for multiple rows

Both tests pass locally:
ctest -R "kernel-duration-ns" -V

@ihhethan ihhethan force-pushed the user/ihhethan/SWDEV-561821 branch 3 times, most recently from 6ec210d to 5241dbf on December 3, 2025 at 20:23
Fixes: SWDEV-561821

This change adds the missing Duration_ns field to kernel trace CSV output,
addressing a regression from rocprof to rocprofv3.

Problem:
rocprofv3 --kernel-trace only provides timestamps for each kernel invocation,
unlike the CSV output from rocprof --stats which contained durations.
This is critical for applications like HemeLB where the same hotspot kernel
is called twice per iteration with different durations.

Solution:
Added Duration_ns field (calculated as end - start) to all trace CSV outputs.

Changes:
1. Python (rocpd/csv.py):
   - Added (end - start) AS Duration_ns to 6 trace functions:
     * write_kernel_csv
     * write_memory_copy_csv
     * write_memory_allocation_csv
     * write_counters_csv
     * write_scratch_memory_csv
     * write_region_csv
   - Ensured the column name uses the consistent "Duration_NS" spelling.

2. C++ (generateCSV.cpp):
   - Added "Duration_ns" to CSV header
   - Added (record.end_timestamp - record.start_timestamp) as the value
     for the Duration_NS column in kernel trace rows.

3. C++ (csv.hpp):
   - Updated kernel_trace_with_stream_csv_encoder from csv_encoder<22>
     to csv_encoder<23> to account for the new Duration_NS column.

4. Tests (rocprofv3 integration):
   - Added a new test under tests/rocprofv3/kernel-duration-ns:
     * rocprofv3-test-kernel-duration-ns-execute
       - Runs `rocprofv3 --kernel-trace -f csv` on the simple-transpose
         test application to generate out_kernel_trace.csv.
     * rocprofv3-test-kernel-duration-ns-validation
       - Runs validate.py with --csv-input=<out_kernel_trace.csv> and:
         - Asserts that Start_Timestamp, End_Timestamp, and Duration_NS
           columns exist in the header.
         - Verifies that Duration_NS == End_Timestamp - Start_Timestamp
           for multiple rows.

Testing:
- Environment: MI300X, ROCm 6.4.1, Docker container
- Test application: HPCTrainingExamples/daxpy_5
- Verified Duration_ns field appears in column 12
- Verified Duration_ns = End_Timestamp - Start_Timestamp
- Sample values: 80360 ns, 63895 ns, 54680 ns
- New tests:
  * ctest -R "kernel-duration-ns" -V
    - rocprofv3-test-kernel-duration-ns-execute
    - rocprofv3-test-kernel-duration-ns-validation
  * Both tests pass locally.

Impact:
- Restores functionality from legacy rocprof tool
- Enables efficient per-invocation kernel duration analysis
- No breaking changes to existing CSV format
- Backward compatible
Comment on lines +42 to +64
@pytest.fixture
def csv_path(request) -> Path:
    """Return the path to the kernel trace CSV file passed via --csv-input."""
    filename = request.config.getoption("--csv-input")
    if not filename:
        raise RuntimeError("--csv-input option is required for this test")
    path = Path(filename)
    if not path.is_file():
        raise FileNotFoundError(f"{path} does not exist")
    return path


@pytest.fixture
def csv_rows(csv_path: Path):
    """Return all rows from the kernel trace CSV as a list of dicts."""
    rows = []
    with csv_path.open("r", newline="") as inp:
        reader = csv.DictReader(inp)
        for row in reader:
            rows.append(row)
    if not rows:
        raise RuntimeError(f"No data rows found in {csv_path}")
    return rows
Contributor

Can we move the validation of data to the validate.py file instead of validating it here?
Also, it might be better to have a single fixture in conftest.py, "csv_data", and follow the existing format present in other tests. Please refer to this

Contributor Author

Done. Validation logic moved to validate.py and conftest.py now has single csv_data fixture following existing format. All tests passing.

Comment on lines +1 to +72
#!/usr/bin/env python3

import argparse
import csv
from pathlib import Path


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--csv-input",
        type=Path,
        required=True,
        help="Path to the generated kernel_trace CSV",
    )
    return parser.parse_args()


def main():
    args = parse_args()
    csv_path: Path = args.csv_input

    if not csv_path.is_file():
        raise FileNotFoundError(f"CSV file not found: {csv_path}")

    with csv_path.open("r", newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            raise RuntimeError("CSV has no header row")

        # Build a mapping from column name to index
        col_idx = {name: i for i, name in enumerate(header)}

        required_cols = ["Start_Timestamp", "End_Timestamp", "Duration_NS"]
        missing = [c for c in required_cols if c not in col_idx]
        if missing:
            raise RuntimeError(f"Missing required columns: {missing}")

        s_idx = col_idx["Start_Timestamp"]
        e_idx = col_idx["End_Timestamp"]
        d_idx = col_idx["Duration_NS"]

        # Only check the first few rows to avoid processing very large files
        checked_rows = 0
        for row in reader:
            if not row:
                continue

            start = int(row[s_idx])
            end = int(row[e_idx])
            duration = int(row[d_idx])
            calc = end - start

            if duration != calc:
                raise RuntimeError(
                    f"Duration_NS mismatch: got {duration}, expected {calc} "
                    f"(start={start}, end={end})"
                )

            checked_rows += 1
            if checked_rows >= 5:
                break

    if checked_rows == 0:
        raise RuntimeError("No data rows found to validate Duration_NS")

    print(f"[OK] Validated Duration_NS for {checked_rows} row(s) in {csv_path}")


if __name__ == "__main__":
    main()
Contributor

Please refer to this format and update accordingly.

ammallya pushed a commit that referenced this pull request Jan 22, 2026
…edback

- Consolidate to single csv_data fixture in conftest.py
- Move validation logic to validate.py
- Replace print statements with logging
- Maintain Duration_NS validation (Duration_NS = End_Timestamp - Start_Timestamp)
@ihhethan ihhethan requested review from a team as code owners February 4, 2026 01:14
Comment on lines +89 to +93
if duration != calc:
    raise RuntimeError(
        f"Row {i}: Duration_NS mismatch: got {duration}, expected {calc} "
        f"(start={start}, end={end})"
    )
Contributor

@ihhethan can you also add a check to verify whether the start and end timestamps are > 0 and replace this with an assert instead of logging?

tool::csv::kernel_trace_with_stream_csv_encoder{},
{"Kind", "Agent_Id", "Queue_Id", "Stream_Id",
"Thread_Id", "Dispatch_Id", "Kernel_Id", "Kernel_Name",
"Correlation_Id", "Start_Timestamp", "End_Timestamp", "Duration_NS",
Contributor

We are not updating direct writing of CSV anymore. Please remove.

@@ -235,6 +236,7 @@ def write_kernel_csv(importData, config) -> None:
"stack_id AS Correlation_Id",
"start AS Start_Timestamp",
"end AS End_Timestamp",
"(end - start) AS Duration_NS",
Contributor

Suggested change
"(end - start) AS Duration_NS",
"(end - start) AS Duration",

Neither Start_Timestamp nor End_Timestamp has an _NS suffix, even though they are also in nanoseconds. Be consistent.

NAME rocprofv3-test-kernel-duration-ns-execute
COMMAND
$<TARGET_FILE:rocprofiler-sdk::rocprofv3> -d
${CMAKE_CURRENT_BINARY_DIR}/kernel-duration-ns -o out --output-format csv
Contributor

Replace this test with one that generates JSON and rocpd output. We are not updating the legacy direct CSV output

Comment on lines +41 to +59
add_test(
    NAME rocprofv3-test-kernel-duration-ns-validation
    COMMAND ${Python3_EXECUTABLE} ${CMAKE_CURRENT_BINARY_DIR}/validate.py --csv-input
            ${CMAKE_CURRENT_BINARY_DIR}/kernel-duration-ns/out_kernel_trace.csv)

set_tests_properties(
    rocprofv3-test-kernel-duration-ns-validation
    PROPERTIES TIMEOUT 120
               LABELS "integration-tests;kernel-duration-ns"
               ENVIRONMENT "${rocprofv3-env}"
               DEPENDS "rocprofv3-test-kernel-duration-ns-execute"
               FAIL_REGULAR_EXPRESSION "${ROCPROFILER_DEFAULT_FAIL_REGEX}"
               FIXTURES_REQUIRED rocprofv3-test-kernel-duration-ns-run)
Contributor

Use rocpd to convert to CSV and then compare with JSON output.

ihhethan and others added 3 commits February 24, 2026 13:09
## Summary
Added a 'Duration' field to the kernel trace CSV output from rocpd to address a
regression from legacy rocprof to rocprofv3. The Duration column provides
the kernel execution time in nanoseconds, calculated as (End_Timestamp - Start_Timestamp).

## Changes
- source/lib/output/csv.hpp: Added duration field to kernel dispatch CSV schema
- source/lib/output/generateCSV.cpp: Implemented duration calculation and output
- source/lib/python/rocpd/csv.py: Added Duration column to Python CSV generator
- tests/rocprofv3/kernel-duration-ns/*: Added comprehensive test suite

## Testing
- Verified Duration column exists in CSV output
- Validated Duration = End_Timestamp - Start_Timestamp
- Confirmed backward compatibility with existing CSV fields
- All integration tests passing

## Addresses
- SWDEV-561822: Request to add durations in --kernel-trace output
- Restores functionality from legacy rocprof --stats output
@ihhethan ihhethan force-pushed the user/ihhethan/SWDEV-561821 branch 4 times, most recently from 9b05cd4 to e9709ae on March 3, 2026 at 23:41
- Generate JSON and rocpd output from same execution using ROCPROF_OUTPUT_FORMAT
- Convert rocpd database to CSV and validate Duration column
- Compare CSV Duration with JSON-derived duration (zero tolerance)
- Verify Duration = End_Timestamp - Start_Timestamp

Test validates:
- Duration column exists in CSV output
- Duration values exactly match between JSON and CSV
- Timestamps are identical (same execution source)
- Internal consistency of Duration calculation

Addresses: SWDEV-561822
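The zero-tolerance JSON-vs-CSV comparison described above can be sketched with the standard library; the flat JSON keys and durations below are illustrative stand-ins, since real rocprofv3 JSON uses a nested record schema:

```python
import csv
import json
from io import StringIO

# Stand-ins for the two outputs of one run; the flat JSON keys here are
# illustrative only -- real rocprofv3 JSON uses a nested record schema.
json_text = json.dumps([
    {"dispatch_id": 1, "start": 1000, "end": 81360},
    {"dispatch_id": 2, "start": 2000, "end": 65895},
])
csv_text = (
    "Dispatch_Id,Start_Timestamp,End_Timestamp,Duration\n"
    "1,1000,81360,80360\n"
    "2,2000,65895,63895\n"
)

# Durations derived from the JSON side, keyed by dispatch id.
json_durations = {r["dispatch_id"]: r["end"] - r["start"] for r in json.loads(json_text)}

for row in csv.DictReader(StringIO(csv_text)):
    # Zero tolerance: both outputs come from the same execution.
    assert int(row["Duration"]) == json_durations[int(row["Dispatch_Id"])]
```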
@ihhethan ihhethan force-pushed the user/ihhethan/SWDEV-561821 branch from 37f41fc to 999ab09 on March 4, 2026 at 00:18
Comment on lines +33 to +38
def run_rocpd_convert(db_path, out_dir):
"""Convert rocpd database to CSV format."""
os.makedirs(out_dir, exist_ok=True)
cmd = [sys.executable, "-m", "rocpd", "convert", "-i", db_path, "--output-format", "csv", "-d", out_dir]
res = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
assert res.returncode == 0, f"rocpd convert failed\ncmd={' '.join(cmd)}\nstdout={res.stdout}\nstderr={res.stderr}"
Contributor

rocpd has a Python interface. Why would you run the command line in a subprocess when you can just call the API?

Comment on lines +41 to +46
def find_kernel_trace_csv(out_dir):
    """Locate kernel_trace CSV file in output directory."""
    for fn in os.listdir(out_dir):
        if fn.endswith("kernel_trace.csv"):
            return os.path.join(out_dir, fn)
    assert False, f"kernel trace CSV not found in {out_dir}"
Contributor

If you use the Python interface, you control where the file is written. The above approach is very error-prone.

Comment on lines +49 to +56
def load_csv_rows(path):
    """Load CSV file and return rows as list of dicts."""
    assert os.path.isfile(path), f"missing CSV: {path}"
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
    assert len(rows) > 0, f"empty CSV: {path}"
    return rows
Contributor

Do you even need to write the CSV? I'd check with @yhuiYH ... you might just be able to call the CSV submodule and get the query that produces the data written to the CSV.

Contributor

Unfortunately, CSV needs to be refactored to return the query. As it is now, it will always write out a CSV file, you can't just get the query returned yet.

Contributor

You could probably just update the new field to be checked in the tests.integration.validate.rocprofv3-test-rocpd.test_csv_data test : https://github.com/ROCm/rocm-systems/blob/develop/projects/rocprofiler-sdk/tests/rocprofv3/rocpd/CMakeLists.txt#L220

But I guess this test_csv_data() is called by both the rocpd-csv & generateCSV.cpp (legacy code) so it'll fail since the C++ code isn't going to add this duration field.
https://github.com/ROCm/rocm-systems/blob/develop/projects/rocprofiler-sdk/tests/pytest-packages/tests/rocprofv3.py#L356

assert end >= start, f"end before start: start={start} end={end}"

duration = end - start
m[str(dispatch_id)] = (start, end, duration)
Contributor

Why str(dispatch_id)?

Add Duration column computed as (end - start) in get_kernel_csv_query().
Add kernel-duration-ns integration test validating Duration against JSON output using rocpd Python API.

Validation ensures:
- Duration == End_Timestamp - Start_Timestamp
- rocpd DB values match JSON kernel records for the same execution.
@ihhethan ihhethan force-pushed the user/ihhethan/SWDEV-561821 branch from acadc99 to b071cab on March 16, 2026 at 19:05
Contributor

@yhuiYH yhuiYH left a comment

  1. Please fix all failing linting tests
  2. Please change PR title to start with [rocprofiler-sdk]
  3. Please update title of PR to be relevant to current change (drop the _ns)

-- $<TARGET_FILE:simple-transpose>)

set_tests_properties(
    rocprofv3-test-kernel-trace-duration-generate
Contributor

I would prefer the test folder match the test name. Please change your folder from kernel-duration-ns to kernel-trace-duration

@@ -48,8 +48,8 @@ add_subdirectory(conversion-script)
add_subdirectory(python-bindings)
add_subdirectory(rocpd)
add_subdirectory(rocpd-kernel-rename)
add_subdirectory(kernel-duration-ns)
Contributor

Suggested change
add_subdirectory(kernel-duration-ns)
add_subdirectory(kernel-trace-duration)

@ihhethan ihhethan changed the title Add Duration_ns field to kernel trace CSV output [rocprofiler-sdk] Add Duration_ns field to kernel trace CSV output Mar 19, 2026
@ihhethan ihhethan force-pushed the user/ihhethan/SWDEV-561821 branch 3 times, most recently from d050dba to 0225e05 on March 19, 2026 at 16:19
@ihhethan ihhethan force-pushed the user/ihhethan/SWDEV-561821 branch from 0225e05 to ac003af on March 19, 2026 at 16:25
Contributor Author

@ihhethan ihhethan left a comment

Hi, I’ve addressed all the review comments and pushed the updates.

Would you mind taking another look when you have time? Thanks!


Comment on lines 142 to 165
Contributor

Why do you pass in repeated strings into _get_first_present()? Seems a bit silly and error prone.

Comment on lines 63 to 67
Contributor

Can't you just do a key.lower() to avoid passing in both strings?
I suggest you optimize this a bit.
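The reviewer's suggestion, normalizing once with `key.lower()` so callers never pass both spellings, could be sketched like this; `get_column` and the sample row are hypothetical names for illustration:

```python
def get_column(row: dict, name: str):
    """Case-insensitive column lookup, so callers pass a single spelling."""
    # Normalize every key once instead of enumerating spelling variants.
    lowered = {k.lower(): v for k, v in row.items()}
    return lowered[name.lower()]

row = {"Duration_NS": "80360", "Start_Timestamp": "1000"}
assert get_column(row, "duration_ns") == "80360"
assert get_column(row, "Start_timestamp") == "1000"
```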

