Team sync‐ups

27-may-25:

Exegesis:
- CMake for building benchmarks on Simulator (ticket)
- Waiting for the Green team to implement Konata outputting
MCA:
- FU number ticket
- Need to widen the support for MOPs (in this ticket)
Exams:
- 29, 05, ~10 -- Arseny, Dmitry Zubakhin
- 27, 02, 05 -- Denis
- 29, ~10, 16 -- Sergey
- 27, 05, ~10 -- Dmitry Sokolov

21-may-25 (Syntacore & ITMO & the Green team):

Exegesis:
- Generated vadd benchmark, need to pass vmerge to Green team
- Also waiting for the actual Konata log to be created by the Green team simulator
MCA:
- Prototyped MOP adding for several vector instructions, obtained "Konata log" (-timeline option in LLVM-MCA)
- Feedback from Syntacore: need to merge instructions + MOPs logs
- LLVM Tablegen backend not changed, though the right way is to patch it to automate MOP generation

14-may-25 (Syntacore & ITMO):

Exegesis:
- We now need to go to the Green team and compare their Konata results on Latency/Throughput with results from Exegesis
- They already have automated binary runs, however, for full Konata results they need one more week
MCA:
- The LMUL should be extracted anyway for PreProcessRegion
- Instruction multiplication can be merged with the "Vector" pipeline & other things
- MOP counts are strictly CPU-specific. For example, for LMUL=8 the MOPNumber can be 4, which means that each MOP works on two registers at a time. So ideally we should have a map in the TableGen table: MOPNumber[LMUL]=N. However, this can be reduced to LMUL for now
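
A minimal Python sketch of the per-LMUL map mentioned above, only to make the idea concrete (the real table would live in TableGen). The names `MOP_NUMBER` and `mop_count` are hypothetical, and every table entry except LMUL=8 -> 4 is an illustrative assumption, not a measured value:

```python
# Hypothetical per-CPU table: LMUL -> number of MOPs.
# Only the LMUL=8 -> 4 entry comes from the notes; the other
# entries are assumptions for a core that handles two registers per MOP.
MOP_NUMBER = {1: 1, 2: 1, 4: 2, 8: 4}


def mop_count(lmul, table=None):
    """Return the number of MOPs for an integer LMUL.

    Without a table this falls back to the simplified model
    where the MOP count is just LMUL.
    """
    if lmul not in (1, 2, 4, 8):
        raise ValueError(f"unsupported integer LMUL: {lmul}")
    if table is None:
        return lmul
    return table[lmul]


def registers_per_mop(lmul, table):
    """How many vector registers each MOP covers under the given table."""
    return lmul / table[lmul]
```

With the table, `mop_count(8, MOP_NUMBER)` gives 4 MOPs covering two registers each; without it, the fallback gives one MOP per register. Fractional LMUL is left out of this sketch.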
Next meeting will be with the Green team & Syntacore simultaneously: Wednesday 1:00 PM

11-may-25:

Exegesis:
- SiFive's patch now works, including some changes in ProcResourcePressure measurements
- We now need to use it to get Lat/Thpt/FUnum parameters from the Green team emulator
MCA:
- We have a PreProcessRegion method now; however, it only works well for a known constant. Getting LMUL from there can be a problem, so instead we can use the information from the table
- I also need to merge this instruction multiplication patch with the MOP & vector pipeline POC patch

28-apr-25:

The Green team has only brought up Chipyard so far (not Ara/Saturn)
- We know how to run it; however, the purpose of the artifacts is unknown (probably a Konata log?)
- It is possible to get the output from the binary
We probably can share resources with the Green team? We'll discuss this once they bring up everything
Merging the SiFive patch is in its final stage. We also need to apply this patch for libpfm so that Exegesis links properly

23-apr-25 (Syntacore & ITMO):

MCA:
- Instruction copying (LMUL times) is almost done
- Our custom "Vector" pipeline should be implemented. It should have its own units, separate from the OoO pipeline, etc.
- We should think about how to add MOPs info to specific targets (not mandatory for now)
Exegesis:
- LLVM should be patched additionally for Exegesis with the new patch to run (there are some libpfm problems)
- Some conflicts with the SiFive patch were discovered
  - Resolved successfully, at least for now

21-apr-25:

MCA:
- Implemented POC with another pipeline
- The code is to be cleaned-up & refactored, non-MOP instructions should be filtered
- Next up: add MOPs to all other instructions
Exegesis:
- SiFive patch merge continues

16-apr-25 (Syntacore):

Exegesis:
- We should start by running instructions with different opcodes, without validating them right away. For this purpose, use the emulator from the Green team
- So far there are no conflicts when applying the SiFive patch on current main
MCA:
- Decided to try to add a separate "vector" pipeline
- We're on our way to implementing custom pseudo instruction
- For the purpose of adding FU count parameter we can try to add a RISC-V version of postProcessInstruction

13-apr-25:

We need to add meeting with Syntacore to our calendar
Exegesis:
- It is possible to use the following algorithm to validate Exegesis-generated code snippets:
  1. Get the exact tested code snippet (or don't cut anything and just jump straight to the tested code; this may be considerably slower, though)
  2. QEMU + debugger / Use emulator from the Green team
- Here's the branch for applying SiFive's patch to our main
- We should consider resolving the issues from the original MR
  - Not before we resolve all the conflicts
MCA:
- It is possible to copy instructions using the Region instead of MCAInst builder
- Branch with prototype:
- Comment with further plans for Denis
- We should take a look at how MCInsts are initialized in terms of FU count: it doesn't have such a field, while from a SchedModel we should be able to tell the FU count. MCAInsts also have this field
- We should investigate disabling RISCVCustomBehaviour: what happens with the numbers when the original instructions are used instead of their Pseudo-... versions

09-apr-25 (Syntacore + ITMO):

Transferred current context with LLVM-MCA
The Green team is in the process of bringing up the simulator itself, so soon we can come to them and try some of our Exegesis-generated snippets
A useful input for validating Exegesis-generated code snippets: we can use the simulator from the Green team

06-apr-25:

ITMO sprint took place:
- The picture from the Syntacore presentation should be added to the one for ITMO
- The Exegesis tasks should be clarified a bit IMO (what is the open-source patch exactly?)
MCA:
- The proposal for SchedModel widening was written. The main idea was to add additional resources for vector instructions' MOPs. For the feedback see below.
- The MCA doesn't go to TableGen at all. The codegen feeds it with the MCInsts. So basically the proposal in its current state is not applicable.
- The MCA is able to replace one instruction with the other: once it gets MCInst from codegen, it creates MCAInst based on it. For RVV it creates proper version of instruction, based on the LMUL
- Current plan is the following:
  - To understand how the instructions are chosen
  - To teach MCA to create multiple MCAInsts based on LMUL
  - To support it inside SchedModel (fix numbers in subtargets a bit)
Exegesis:
- Sergey and Dmitry Sokolov started to apply the patch to our current main

30-mar-25:

MCA:
- MCA mostly interacts with common LLVM classes, not TableGen directly
- The only suitable exception is RISCVCustomBehaviour, it should be possible to extract there (see the opcodeHasEEWAndEMULInfo as an example)
- SchedModel widening: added custom class for MOP resources occupied by every vector instruction. The subtargets can then define them accordingly
Exegesis:
- Sergey and Dmitry Sokolov will take the patch and try to apply it to current main branch
- Dmitry Zubakhin will investigate current snippets generation for different instructions

28-mar-25 (ITMO):

Transferred context to ITMO curator
We should think about doing one project at a time (Exegesis or MCA), not two tools simultaneously

26-mar-25 (Syntacore):

The working plan and decomposition for LLVM-Exegesis is fine
The job which should be done for MCA tool is not the one we thought:
- We need to emulate MOPs as they are in Kanata:

One instruction corresponds to LMUL MOPs, which are essentially LMUL identical copies of the instruction issued one after another
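
The expansion described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MCA code: the `Mop` type and `expand_to_mops` function are hypothetical names, and only integer LMUL is handled:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mop:
    """One micro-operation produced from a vector instruction (hypothetical type)."""
    mnemonic: str  # same mnemonic as the parent instruction
    index: int     # position of this MOP within the parent instruction


def expand_to_mops(mnemonic, lmul):
    """Expand one vector instruction into LMUL identical MOPs, in issue order."""
    if lmul < 1:
        raise ValueError("this sketch only handles integer LMUL >= 1")
    return [Mop(mnemonic, i) for i in range(lmul)]


# Example: a vadd.vv at LMUL=4 becomes four back-to-back MOPs.
for mop in expand_to_mops("vadd.vv", 4):
    print(mop.index, mop.mnemonic)
```

Each MOP keeps the parent's mnemonic and differs only in its index, which is what a per-MOP timeline row would need.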

The interactions with the Green team are close...

23-mar-25 (MCA):

Some info on Sched Models: https://github.com/LLVM-Exegesis-MCA-RVV/llvm-project/issues/2
Denis is looking into the whole MCA pipeline for a single instruction
- Looks like TableGen-generated code can call some C++ code? Curious
Next steps are:
- Try to understand how uOps can help us with vector instructions in particular (ask Anton)
- Understand how a single instruction is processed
- Next, try to connect our approach with resource division on pipeline stages with some simple instruction

22-mar-25 (Exegesis):

Article about RVV: https://fprox.substack.com/p/risc-v-vector-extension-in-a-nutshell-part-1
Patch from SiFive:
- Problem is: same instructions produce different results based on vsetvl
- Current solution is:
  - Generate all possible vector unit programming cases in snippets for benchmarking using multiple passes:
    - RISCVInsertVSETVIPass
    - RISCVInsertWriteVXRMPass
    - Post-processing pass
  - Patch introduced opcodes in MCInstr
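
To give a feel for the size of the configuration space, here is a minimal Python sketch that enumerates SEW/LMUL combinations and prints the matching `vsetvli` lines. It is not how the SiFive patch works (that uses the passes listed above). The function names, the fixed `ta, ma` policy, the `t0`/`a0` registers and ELEN=64 are all assumptions made for illustration:

```python
from fractions import Fraction
from itertools import product

ELEN = 64  # assumption: maximum element width of the target
SEWS = (8, 16, 32, 64)
LMULS = {
    "mf8": Fraction(1, 8), "mf4": Fraction(1, 4), "mf2": Fraction(1, 2),
    "m1": Fraction(1), "m2": Fraction(2), "m4": Fraction(4), "m8": Fraction(8),
}


def vtype_configs(elen=ELEN):
    """List (SEW, LMUL name) pairs, keeping fractional LMUL only where SEW <= LMUL * ELEN."""
    configs = []
    for sew, (name, lmul) in product(SEWS, LMULS.items()):
        if sew <= lmul * elen:
            configs.append((sew, name))
    return configs


def vsetvli_line(sew, lmul_name):
    """Format one vsetvli instruction with a fixed tail/mask policy."""
    return f"vsetvli t0, a0, e{sew}, {lmul_name}, ta, ma"


for sew, lmul_name in vtype_configs():
    print(vsetvli_line(sew, lmul_name))
```

Adding the tail and mask policies as extra dimensions would multiply the number of cases by four.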
Dmitry Zubakhin will check some "unsupported" instructions
Arseny will reach out to one of Winter School curators to ask what was done

16-mar-25 (Exegesis):

Three patches for RVV for Exegesis:
- Initial support (merged): https://github.com/llvm/llvm-project/pull/128767
- Improvements (opened, no activity): https://github.com/llvm/llvm-project/pull/114149
- Winter school (not finished): https://github.com/e1turin/llvm-project/pull/1
We need to understand what was already done: so a to-do is to check which instructions are currently supported
A good path would be to trace what happens in the current implementation for some known-to-work instruction (like vmul)
CI will wait (a month approximately)

16-mar-25 (MCA):

Read: https://habr.com/ru/articles/474460/
grep how MCA uses SchedModel (I guess we should not change any interfaces)
scalable vectorization in other CPUs in LLVM: What's number of microops?
take a look at MCA for SiFive CPUs
Need to add FU count in a sched model
RISCVSchedSiFiveP400.td for reference
- Try to feed to MCA, how it handles RVV

12-mar-25 (Syntacore):

More interactions with green team
performance tools (llvm-based):
- llvm-mca: code analysis (any random code snippet)
- llvm-exegesis: hardware analysis
We need to try to combine them in some way
Exegesis doesn't use LLVM's SchedModel as fully as possible. Initial RVV is supported
Traditional LLVM's SchedModel doesn't suit RVV well: we need to widen it (e.g. functional units number)
MCA RVV: no rvv at all
Decomp: mca rvv , schedmodel for exegesis , widening schedmodel for RVV, bring-up (saturn-vectors, real hardware)
Need to do decomposition
Org
- Biweekly meetings
- Monthly -- both teams
- Next meeting: 19-mar-25
Need to get familiar with both tools
