[rocprofiler-sdk] Fix categories filtering in rocpd analysis tool#4161

Open
mcao59 wants to merge 9 commits into develop from users/mcao/fix_rocpdsummary

Conversation

mcao59 (Member) commented Mar 17, 2026

Motivation

Currently, passing --region-categories to rocpd2summary does not filter the output as expected.
For example, with --region-categories KERNEL the memory summaries should not be displayed, but they are:

$ rocpd2summary  -i out_results.db  --region-categories KERNEL

KERNELS_SUMMARY:
                                                                                                                     Name  Calls  DURATION (nsec)  AVERAGE (nsec)  PERCENT (INC)  MIN (nsec)  MAX (nsec)     STD_DEV
kernel_5(HIP_vector_type<double, 2u>*, HIP_vector_type<double, 2u>*, HIP_vector_type<double, 2u>*, double, unsigned long)     26          1592042    61232.384615          100.0       52678       69263 2891.971979

MEMORY_COPIES_SUMMARY:
                      Name  Calls  DURATION (nsec)  AVERAGE (nsec)  PERCENT (INC)  MIN (nsec)  MAX (nsec)       STD_DEV
MEMORY_COPY_HOST_TO_DEVICE      6          2890969   481828.166667      62.786672      263351      591477 169201.157209
MEMORY_COPY_DEVICE_TO_HOST      3          1713462   571154.000000      37.213328      311382      701120 224969.165416

MEMORY_ALLOCATIONS_SUMMARY:
 Name  Calls  DURATION (nsec)  AVERAGE (nsec)  PERCENT (INC)  MIN (nsec)  MAX (nsec)      STD_DEV
ALLOC      6          8244389    1.374065e+06          100.0       29043     7953357 3.223399e+06

The expected behavior is:

  • A single category or multiple categories can be selected as a filter
  • --region-categories NONE excludes all region domain categories, leaving only the summaries for the views (kernels, memory allocations, scratch_memory, etc.)

Output with this fix:

$ rocpd2summary  -i out_results.db  --region-categories KERNEL

KERNELS_SUMMARY:
                                                                                                                     Name  Calls  DURATION (nsec)  AVERAGE (nsec)  PERCENT (INC)  MIN (nsec)  MAX (nsec)     STD_DEV
kernel_5(HIP_vector_type<double, 2u>*, HIP_vector_type<double, 2u>*, HIP_vector_type<double, 2u>*, double, unsigned long)     26          1592042    61232.384615          100.0       52678       69263 2891.971979

$ rocpd2summary  -i out_results.db  --region-categories KERNEL HIP

KERNELS_SUMMARY:
                                                                                                                     Name  Calls  DURATION (nsec)  AVERAGE (nsec)  PERCENT (INC)  MIN (nsec)  MAX (nsec)     STD_DEV
kernel_5(HIP_vector_type<double, 2u>*, HIP_vector_type<double, 2u>*, HIP_vector_type<double, 2u>*, double, unsigned long)     26          1592042    61232.384615          100.0       52678       69263 2891.971979

HIP_SUMMARY:
                       Name  Calls  DURATION (nsec)  AVERAGE (nsec)  PERCENT (INC)  MIN (nsec)  MAX (nsec)      STD_DEV
hipGetDevicePropertiesR0600      1        392605894    3.926059e+08      75.806535   392605894   392605894          NaN
                  hipMemcpy      3        122668921    4.088964e+07      23.685599     2481724    96256861 4.913408e+07
            hipLaunchKernel     26          1422671    5.471812e+04       0.274697        6109      824694 1.589685e+05
       hipDeviceSynchronize      1           945934    9.459340e+05       0.182646      945934      945934          NaN
                  hipMalloc      3           261659    8.721967e+04       0.050523       33310      177874 7.897668e+04

$ rocpd2summary  -i out_results.db  --region-categories  NONE

KERNELS_SUMMARY:
                                                                                                                     Name  Calls  DURATION (nsec)  AVERAGE (nsec)  PERCENT (INC)  MIN (nsec)  MAX (nsec)     STD_DEV
kernel_5(HIP_vector_type<double, 2u>*, HIP_vector_type<double, 2u>*, HIP_vector_type<double, 2u>*, double, unsigned long)     26          1592042    61232.384615          100.0       52678       69263 2891.971979

MEMORY_COPIES_SUMMARY:
                      Name  Calls  DURATION (nsec)  AVERAGE (nsec)  PERCENT (INC)  MIN (nsec)  MAX (nsec)       STD_DEV
MEMORY_COPY_HOST_TO_DEVICE      6          2890969   481828.166667      62.786672      263351      591477 169201.157209
MEMORY_COPY_DEVICE_TO_HOST      3          1713462   571154.000000      37.213328      311382      701120 224969.165416

MEMORY_ALLOCATIONS_SUMMARY:
 Name  Calls  DURATION (nsec)  AVERAGE (nsec)  PERCENT (INC)  MIN (nsec)  MAX (nsec)      STD_DEV
ALLOC      6          8244389    1.374065e+06          100.0       29043     7953357 3.223399e+06
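
The expected behavior described above can be sketched in a few lines of Python. This is only an illustration of the selection rules, not the rocpd implementation: the function name `select_summaries`, the two lookup tables, and the category names `MEMORY_COPY` and `MEMORY_ALLOCATION` are hypothetical (only KERNEL, HIP and NONE appear in this PR), and showing everything when no filter is given is an assumption.

```python
# Hypothetical sketch of the selection rules; names are NOT taken from rocpd.
# Views (kernels, memory copies, ...) and region domains (HIP, ...) are kept
# in separate tables so that NONE can drop the regions but keep the views.
VIEW_SUMMARIES = {
    "KERNEL": "KERNELS_SUMMARY",
    "MEMORY_COPY": "MEMORY_COPIES_SUMMARY",            # assumed category name
    "MEMORY_ALLOCATION": "MEMORY_ALLOCATIONS_SUMMARY",  # assumed category name
}
REGION_SUMMARIES = {
    "HIP": "HIP_SUMMARY",
}


def select_summaries(categories=None):
    """Return the summary sections to print for a --region-categories value."""
    all_views = list(VIEW_SUMMARIES.values())
    all_regions = list(REGION_SUMMARIES.values())

    # Assumption: without a filter, every summary is shown.
    if not categories:
        return all_views + all_regions

    requested = [c.upper() for c in categories]

    # NONE excludes every region domain category and keeps all views.
    if requested == ["NONE"]:
        return all_views

    # Otherwise keep only the requested categories, in the order given.
    known = {**VIEW_SUMMARIES, **REGION_SUMMARIES}
    return [known[c] for c in requested if c in known]


print(select_summaries(["KERNEL"]))
print(select_summaries(["KERNEL", "HIP"]))
print(select_summaries(["NONE"]))
```

Keeping views and region domains in separate tables is one way to make the NONE case a simple lookup; the real tool may structure this differently.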

Technical Details

  • Fix categories filtering
  • Add tests
    • New tests to validate category filtering, including
      • a single category (one test with KERNEL, one with HIP)
      • multiple categories
      • "NONE"

JIRA ID

Partially resolves AIPROFSDK-14

Test Plan

Added new tests for --region-categories
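
A check of this kind could look roughly like the sketch below, which extracts the section headers from the text printed by rocpd2summary and compares them with the sections expected for a given filter. It is not the test code added in this PR: the helper `summary_sections` is hypothetical, and `SAMPLE_STDOUT` is an abbreviated copy of the KERNEL HIP output shown in the description.

```python
import re

# Abbreviated sample of the `--region-categories KERNEL HIP` output above.
SAMPLE_STDOUT = """\
KERNELS_SUMMARY:
    Name  Calls
kernel_5     26

HIP_SUMMARY:
     Name  Calls
hipMalloc      3
"""


def summary_sections(stdout: str) -> list[str]:
    """Return the names of the *_SUMMARY sections, in order of appearance.

    A section header is assumed to be a line consisting only of an
    upper-case name ending in _SUMMARY, followed by a colon.
    """
    return re.findall(r"^([A-Z0-9_]+_SUMMARY):[ \t]*$", stdout, flags=re.MULTILINE)


print(summary_sections(SAMPLE_STDOUT))  # ['KERNELS_SUMMARY', 'HIP_SUMMARY']
```

A validation test would then assert that the returned list equals the sections expected for the filter, for example only `KERNELS_SUMMARY` for `--region-categories KERNEL`.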

Test Result

New tests pass.

Submission Checklist

mcao59 force-pushed the users/mcao/fix_rocpdsummary branch from 0b79065 to db12fcf March 18, 2026 16:47

yhuiYH (Contributor) left a comment

When running ctest -R rocpd, I'm seeing it tried adding 4 additional tests that fail but I don't think you intended those to be added. Can you fix pls?

        623 - tests.integration.validate.rocprofv3-test-rocpd-region-category-validation.test_perfetto_data (Failed) integration-tests validation
        624 - tests.integration.validate.rocprofv3-test-rocpd-region-category-validation.test_otf2_data (Failed) integration-tests validation
        625 - tests.integration.validate.rocprofv3-test-rocpd-region-category-validation.test_otf2_system_tree_node_data (Failed) integration-tests validation
        626 - tests.integration.validate.rocprofv3-test-rocpd-region-category-validation.test_csv_data (Failed) integration-tests validation

mcao59 and others added 2 commits March 18, 2026 21:18
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
mcao59 (Member, Author) commented Mar 19, 2026

When running ctest -R rocpd, I'm seeing it tried adding 4 additional tests that fail but I don't think you intended those to be added. Can you fix pls?

@yhuiYH Good catch. I was reusing validate.py, so the target also registered the existing generic validation tests (otf2, csv, etc.) in that file. I moved the region-category checks into a separate validate_summary.py so that the target registers only the 4 intended tests, similar to what I did in #3854 (where I created validate_annotations.py).
Please see ca37bc6. I'm open to other naming schemes for these dedicated validation Python files.
ctest -R rocpd now shows all tests passing, and ctest -R region runs only the expected tests.
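
The effect of moving the checks into a dedicated file can be illustrated with a small sketch. It assumes the test target registers every `test_*` function found in the validation file it is given, which matches pytest's default collection by function name; the helper `discover_tests` and the `test_region_category_*` names are hypothetical, while the three generic test names are taken from the failing tests listed above.

```python
import ast


def discover_tests(source: str) -> list[str]:
    """List the top-level test_* functions defined in a Python source string."""
    tree = ast.parse(source)
    return [
        node.name
        for node in tree.body
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_")
    ]


# Stand-in for a shared validate.py: the generic tests live next to the new one,
# so a target pointed at this file picks up all of them.
SHARED_FILE = '''
def test_perfetto_data(): ...
def test_otf2_data(): ...
def test_csv_data(): ...
def test_region_category_kernel(): ...  # hypothetical name
'''

# Stand-in for a dedicated validate_summary.py: only the intended tests exist.
DEDICATED_FILE = '''
def test_region_category_kernel(): ...  # hypothetical name
def test_region_category_none(): ...    # hypothetical name
'''

print(discover_tests(SHARED_FILE))
print(discover_tests(DEDICATED_FILE))
```

With the shared file, the generic tests are discovered alongside the region-category one and fail because this target does not produce their inputs; with the dedicated file, only the intended tests are found.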

mcao59 requested a review from yhuiYH March 19, 2026 16:06

yhuiYH (Contributor) left a comment

LGTM now.
FYI, --region-categories was originally meant to filter only the API regions, which is why it ignored the views. But I'm OK with this PR overloading the filter functionality for the users.

Was thinking of renaming the flag --region-categories to a generic --filter-categories, but I think we can just leave it as is for now, to minimize changes.

Jonathan may mention we were planning on adding generic filtering to all the rocpd modules. However, that implementation is based on the newer rocpd_info_category table, so we will move there with the future schema changes: 654e02b
