Skip to content

[rocprofiler-sdk] Add SPM experimental public API header#3681

Open
SrirakshaNag wants to merge 3 commits intodevelopfrom
users/SrirakshaNag/spm-patch-chain/spm_api
Open

[rocprofiler-sdk] Add SPM experimental public API header#3681
SrirakshaNag wants to merge 3 commits intodevelopfrom
users/SrirakshaNag/spm-patch-chain/spm_api

Conversation

@SrirakshaNag
Copy link
Contributor

@SrirakshaNag SrirakshaNag commented Mar 3, 2026

Motivation

Add SPM experimental public API header

Technical Details

The new header defines:

  1. Config types: rocprofiler_spm_counter_config_id_t, rocprofiler_spm_configuration_t (frequency, buffer_size, timeout)
  2. Record types: rocprofiler_spm_record_flag_t, rocprofiler_spm_dispatch_counting_service_data_t, rocprofiler_spm_counter_record_t
  3. Callbacks: rocprofiler_spm_dispatch_counting_record_cb_t, rocprofiler_spm_dispatch_counting_service_cb_t
  4. API functions: rocprofiler_spm_create_counter_config, rocprofiler_spm_destroy_counter_config, rocprofiler_iterate_spm_supported_counters, rocprofiler_configure_callback_spm_dispatch_service, rocprofiler_configure_buffer_spm_dispatch_service

JIRA ID

Test Plan

Existing tests should pass

Test Result

Submission Checklist

* @retval ROCPROFILER_STATUS_ERROR_AGENT_ARCH_NOT_SUPPORTED agent has no supported SPM counter
*/
ROCPROFILER_SDK_EXPERIMENTAL rocprofiler_status_t
rocprofiler_iterate_spm_supported_counters(rocprofiler_agent_id_t agent_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for this call, i remember there was a discussion about just using the counter collection supported counters call. What was the reasoning behind using this instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that needs a unified API.

ROCPROFILER_SDK_EXPERIMENTAL rocprofiler_status_t
rocprofiler_iterate_spm_supported_counters(rocprofiler_agent_id_t agent_id,
rocprofiler_available_counters_cb_t cb,
void* user_data) ROCPROFILER_API ROCPROFILER_NONNULL(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this covering user_data? couldn't this be null? or is the non-null on cb?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-null is on cb

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bwelton NONNULL starts at 1 for the arguments. The index of 0 is for the return type I believe

@jrmadsen
Copy link
Contributor

jrmadsen commented Mar 5, 2026

Instead of removing <rocprofiler-sdk/spm.h>, use a #warning that the experimental header should be included for now and then just include the experimental header. Protect the #warning with a #ifdef that can be used to suppress the #warning message

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File level: What happened to the device-based API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added it back, but the file just includes experimental/spm.h now

size_t record_count,
int flags,
rocprofiler_user_data_t userdata,
void* record_callback_args);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal, but I don't think this is necessary.
If the user wants to get their record_callback_args, they can link userdata to it from the dispatch_callback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The record callback args is used by the user to send user data from configure function to this callback and userdata is the data passed from dispatch callback to this callback. I don't think we can remove record_callback_args.

Copy link
Contributor

@ApoKalipse-V ApoKalipse-V Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say user wants to pass a value (my_args) from configure() to dispatch_counting_record_cb_t. Call:

rocprofiler_configure_callback_spm_dispatch_service(..., my_args);

Then in rocprofiler_spm_dispatch_counting_service_cb_t(...):

user_data->ptr = callback_data_args;

Then in rocprofiler_spm_dispatch_counting_record_cb_t():

void* my_args = userdata.ptr;

Not saying it's good, just saying it's possible. It's how data was passed in SQTT. If we are going to have it this way in SPM, maybe we should change the SQTT API for standardization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made API consistent with counter collection

@SrirakshaNag SrirakshaNag force-pushed the users/SrirakshaNag/spm-patch-chain/spm_api branch from 3fe8fd2 to 71b724f Compare March 10, 2026 23:13
@SrirakshaNag SrirakshaNag force-pushed the users/SrirakshaNag/spm-patch-chain/spm_api branch from 71b724f to c2ab2ff Compare March 10, 2026 23:44
ROCPROFILER_SPM_RECORD_FLAG_DATA_LOST = 0, ///< records with data loss
ROCPROFILER_SPM_RECORD_FLAG_DATA, ///< records with data
ROCPROFILER_SPM_RECORD_FLAG_END, ///< End of agent service
ROCPROFILER_SPM_RECORD_FLAG_DATA_NONE, //< flag value none
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does flag_none mean here? Is this new or have I just forgot?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NONE is usually first... LOST should be something > 0

ROCPROFILER_SDK_EXPERIMENTAL rocprofiler_status_t
rocprofiler_iterate_spm_supported_counters(rocprofiler_agent_id_t agent_id,
rocprofiler_available_counters_cb_t cb,
void* user_data) ROCPROFILER_API ROCPROFILER_NONNULL(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bwelton NONNULL starts at 1 for the arguments. The index of 0 is for the return type I believe

///< to determine sample interval.
uint64_t buffer_size; ///< Buffer size of user mode buffer in KB
uint64_t timeout; ///< Timeout for the user mode buffer in ms

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

size_t counters_count,
rocprofiler_spm_configuration_t* parameters,
rocprofiler_spm_counter_config_id_t* config_id)
ROCPROFILER_API ROCPROFILER_NONNULL(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ROCPROFILER_API ROCPROFILER_NONNULL(2);
ROCPROFILER_API ROCPROFILER_NONNULL(2, 4);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SPM parameters can be null, there are set to default values if not provided

rocprofiler_counter_id_t* counters_list,
size_t counters_count,
rocprofiler_spm_configuration_t* parameters,
rocprofiler_spm_counter_config_id_t* config_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are re-using rocprofiler_counter_id_t so why not re-use rocprofiler_spm_counter_config_id_t?

* @param [in] config_id
* @return ::rocprofiler_status_t
* @retval ROCPROFILER_STATUS_SUCCESS if config destroyed
* @retval ROCPROFILER_STATUS_ERROR if config could not be destroyed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like there should be a return value of ROCPROFILER_STATUS_ERROR_PROFILE_NOT_FOUND.

In another PR, we should alias ROCPROFILER_STATUS_ERROR_PROFILE_NOT_FOUND to ROCPROFILER_STATUS_ERROR_CONFIG_NOT_FOUND... e.g.:

    ...
    ROCPROFILER_STATUS_ERROR_CONFIG_NOT_FOUND, // replace the original PROFILE
    ...
    ROCPROFILER_STATUS_ERROR_PROFILE_NOT_FOUND = ROCPROFILER_STATUS_ERROR_CONFIG_NOT_FOUND,  // add this to end with comment that it is deprecated.

Also, the rocprofiler_get_status_name should be updated to return ROCPROFILER_STATUS_ERROR_CONFIG_NOT_FOUND and rocprofiler_get_status_string should be updated (if necessary) to use "config" instead of "profile"

ROCPROFILER_SPM_RECORD_FLAG_DATA_LOST = 0, ///< records with data loss
ROCPROFILER_SPM_RECORD_FLAG_DATA, ///< records with data
ROCPROFILER_SPM_RECORD_FLAG_END, ///< End of agent service
ROCPROFILER_SPM_RECORD_FLAG_DATA_NONE, //< flag value none
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NONE is usually first... LOST should be something > 0

ROCPROFILER_SPM_RECORD_FLAG_DATA_LOST = 0, ///< records with data loss
ROCPROFILER_SPM_RECORD_FLAG_DATA, ///< records with data
ROCPROFILER_SPM_RECORD_FLAG_END, ///< End of agent service
ROCPROFILER_SPM_RECORD_FLAG_DATA_NONE, //< flag value none
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ROCPROFILER_SPM_RECORD_FLAG_DATA_NONE, //< flag value none
ROCPROFILER_SPM_RECORD_FLAG_DATA_NONE, ///< flag value none

{
uint64_t size; ///< Size of this struct
rocprofiler_async_correlation_id_t correlation_id; ///< Correlation ID for this dispatch
rocprofiler_kernel_dispatch_info_t dispatch_info; ///< Dispatch info
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the timestamp like in rocprofiler_dispatch_counting_service_data_t?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The start and end timestamp of kernel is not available in SPM due to it be streaming

} rocprofiler_spm_counter_record_t;

/**
* @brief (experimental) Callback to receive SPM data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

receive SPM data from? Callback? Buffer records?

@jrmadsen jrmadsen changed the title Add SPM experimental public API header [rocprofiler-sdk] Add SPM experimental public API header Mar 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants