[AArch64] A simple tool for generating a scheduling model draft from a SWOG #131525
I am very much interested in autogenerating (most of the) schedmodels. I haven't looked too deeply into this change and its output, but I think it would be good to first have a discussion on how complete and useful this is. That is, can we compare the output of the tool to a well-established existing scheduling model (e.g. the Neoverse V2, which I am most familiar with)? So I think we first need to establish how useful this is, what is lacking, and what the plan is to address the gaps (if any). I came across this old review for an X86 CPU: https://reviews.llvm.org/D130897. I have not yet studied that in detail either, but there seems to be a lot more going on in that patch. |
The tool in this PR is much simpler, since it only generates a draft rather than a working sched model. It does that based on a simple rule: for each row in the instruction tables, match the throughput assuming all utilized units are fully utilized. Take the following row in the Neoverse N3 instruction tables as an example: The throughput is 2, which means 2 instructions are executed per cycle. Pipelines B and S each have 2 units, which means 2 B uops and 2 S uops are executed per cycle. So each instruction has 1 B uop and 1 S uop. For any instructions the above rule does not apply to, we need to modify their descriptions manually. The tool does not map instruction names in the SWOG to names in LLVM. It does not define any forwarding rules. |
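To make the rule above concrete, here is a minimal Python sketch of the arithmetic, not the code from this PR. The function name `uops_per_instruction` and the dictionary layout are hypothetical and chosen only for illustration. It assumes every listed pipeline is fully utilized, so the uops one instruction issues to a pipeline are that pipeline's unit count divided by the throughput.

```python
from fractions import Fraction


def uops_per_instruction(throughput, units_by_pipeline):
    """Return uops issued per instruction for each pipeline.

    Hypothetical helper, not part of the PR. Assumes all listed
    pipelines are fully utilized: a pipeline with N units retires N uops
    per cycle, shared among `throughput` instructions per cycle.
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    t = Fraction(throughput)
    return {pipe: Fraction(units) / t
            for pipe, units in units_by_pipeline.items()}


# The example from the comment above: throughput 2, two units each in
# the two utilized pipelines (B and S).
example = uops_per_instruction(2, {"B": 2, "S": 2})
```

For the row discussed above this gives one uop on each of the two pipelines per instruction. A non-integer result would be a hint that the row does not fit the rule and needs a manual description.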
|
Nice tool. I'm impressed that it manages to parse the tables as cleanly as it does.
That sounds like it might be the hard bit; at least it looks like it would require some manual effort. There has always been the question of which is better - trying to collect the data from a known good source or trying to measure it directly on real hardware. Both have advantages and disadvantages, and at the end of the day they come from the same source (the SWOGs just have someone who knows what the right answer should be looking over the results after they are measured). From looking at https://reviews.llvm.org/D144388#4149183, and what I've seen of the difficulties in measuring some values reliably, perhaps we will end up needing a mixture of both approaches, with one checking the results of the other. |
We have a utility tool that takes an instruction name and maps it to LLVM opcodes. For example, given Maybe it could be useful to extend this script? |
If the utility can get LLVM opcodes from |
|
Maybe the file name should be changed to arm_sched_model_gen_from_swog.py. Otherwise, it might be mistaken for a general-purpose tool. |
Refer to comments in the source code for how this tool is used and how it works.
Below is an example output:
Using this tool rather than writing a scheduling model from scratch should save some effort.