@mawad-amd
Collaborator

@mawad-amd mawad-amd commented Apr 30, 2025

  • WIP experimentation with JIT and Trace.
  • CoreFunction is a wrapper around the C++ core code (ignore for now)

Notes from the discussion with Jack:

  • enable_trace needs to work out internally that the trace buffer is the last argument. At the moment it assumes the trace buffer is the fifth argument, which is why the kernel launch passes a dummy argument.
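
To make the note above concrete, here is a toy model of the two conventions. It is only a sketch: `args_with_fixed_slot` and `args_with_trace_last` are hypothetical helpers written for this comment, not functions from IRON or the runtime, and the argument names are placeholders.

```python
FIFTH_ARG_INDEX = 4  # 0-based position of the "fifth argument"


def args_with_fixed_slot(kernel_args, trace):
    """Model of the current assumption: the trace buffer must sit in slot 4.

    Hypothetical helper. Pads with dummy (None) arguments until the
    trace buffer lands in the fifth position.
    """
    if len(kernel_args) > FIFTH_ARG_INDEX:
        raise ValueError("too many kernel arguments for a fixed trace slot")
    padded = list(kernel_args)
    while len(padded) < FIFTH_ARG_INDEX:
        padded.append(None)  # dummy argument, only there to shift the trace buffer
    padded.append(trace)
    return padded, FIFTH_ARG_INDEX


def args_with_trace_last(kernel_args, trace):
    """Model of the desired behaviour: the trace buffer is whatever comes last."""
    full = list(kernel_args) + [trace]
    return full, len(full) - 1


# Three real operands (A, F, C) plus a trace buffer.
fixed_args, fixed_idx = args_with_fixed_slot(["A", "F", "C"], "trace")
last_args, last_idx = args_with_trace_last(["A", "F", "C"], "trace")
```

With three operands, the fixed-slot model needs one dummy entry, while the last-argument model needs none and works for any number of operands.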

Run with:

python programming_examples/basic/vector_scalar_mul/vector_scalar_mul.py

rt = Runtime()
with rt.sequence(tensor_ty, scalar_ty, tensor_ty) as (A, F, C):
    rt.enable_trace(trace_size)
    rt.enable_trace(trace.numel() * np.dtype(trace.dtype).itemsize)
Collaborator Author

@mawad-amd mawad-amd Apr 30, 2025

@jackl-xilinx Does enable_trace take the number of bytes?
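
For reference, the arithmetic in the changed line is element count times bytes per element. The sketch below shows it with a plain NumPy array standing in for the trace tensor (so `.size` replaces `.numel()`); `buffer_size_bytes` is a hypothetical helper, and the 8192-element `uint32` buffer is an assumed example size. Whether `enable_trace` actually expects bytes is still the open question here.

```python
import numpy as np


def buffer_size_bytes(num_elements, dtype):
    """Size in bytes of a buffer with `num_elements` items of `dtype`."""
    return int(num_elements) * np.dtype(dtype).itemsize


# Stand-in for the trace tensor: 8192 elements of 4 bytes each (assumed size).
trace = np.zeros(8192, dtype=np.uint32)
trace_bytes = buffer_size_bytes(trace.size, trace.dtype)

# NumPy exposes the same quantity directly as `nbytes`.
assert trace_bytes == trace.nbytes
```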

@mawad-amd
Collaborator Author

@jackl-xilinx, whenever you have time, could you please try running this? Here is the trace I get as well: trace.txt. After that, we can look at how to get rid of the dummy argument.

@mawad-amd
Collaborator Author

Looking at this again, I think we might as well completely hide the tracing. Consider something like:

iron.enable_tracing()

# Magically insert the trace at kernel launch and into the RT sequence
vector_scalar_mul(input, factor, output)

iron.stop_tracing("trace.bin")

@mawad-amd
Collaborator Author

A better approach is implemented in #2541. Closing this one.

@mawad-amd mawad-amd closed this Aug 28, 2025