
Team sync‐ups

Arseny Bochkarev edited this page May 27, 2025 · 39 revisions

27-may-25:

  • Exegesis:
    • CMake for building benchmarks on Simulator (ticket)
    • Waiting for the Green team to implement Konata output
  • MCA:
    • FU number ticket
    • Need to widen the support for MOPs (in this ticket)
  • Exams:
    • 29, 05, ~10 -- Arseny, Dmitry Zubakhin
    • 27, 02, 05 -- Denis
    • 29, ~10, 16 -- Sergey
    • 27, 05, ~10 -- Dmitry Sokolov

21-may-25 (Syntacore & ITMO & the Green team):

  • Exegesis:
    • Generated vadd benchmark, need to pass vmerge to Green team
    • Also waiting for the actual Konata log to be created by the Green team simulator
  • MCA:
    • Prototyped MOP adding for several vector instructions, obtained "Konata log" (-timeline option in LLVM-MCA)
    • Feedback from Syntacore: need to merge instructions + MOPs logs
    • LLVM Tablegen backend not changed, though the right way is to patch it to automate MOP generation

14-may-25 (Syntacore & ITMO):

  • Exegesis:
    • We now need to go to the Green team and compare their Konata results on Latency/Throughput with results from Exegesis
    • They already have automated binary runs; however, they need one more week for full Konata results
  • MCA:
    • The LMUL should be extracted anyway for PreProcessRegion
    • Instruction multiplication can be merged with the "Vector" pipeline & other things
    • MOP counts are strictly CPU-specific. For example, for LMUL=8 the MOPNumber can be 4, which means each MOP works with two registers at a time. So ideally we should have a map in the TableGen table: MOPNumber[LMUL]=N. However, this can be reduced to MOPNumber=LMUL for now
  • Next meeting will be with the Green team & Syntacore simultaneously: Wednesday 1:00 PM
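
The MOPNumber[LMUL]=N idea from the MCA notes above can be sketched as a small lookup. This is only an illustration in Python, not the TableGen change itself: the table name `EXAMPLE_CPU_MOPS`, the helper names, and every table entry except LMUL=8 -> 4 are assumptions made up for the example.

```python
from fractions import Fraction

# Hypothetical per-CPU table: LMUL -> number of MOPs.
# Only the LMUL=8 -> 4 entry comes from the notes; the rest is assumed.
EXAMPLE_CPU_MOPS = {1: 1, 2: 1, 4: 2, 8: 4}


def mop_count(lmul, table=None):
    """Return the MOP count for a given LMUL.

    With a CPU-specific table, look the value up. Without one, fall back
    to the simplification from the notes: MOPNumber = LMUL.
    Fractional LMUL is assumed (not stated in the notes) to take one MOP.
    """
    if table is not None and lmul in table:
        return table[lmul]
    return max(1, int(lmul))


def registers_per_mop(lmul, table=None):
    """How many vector registers of the group each MOP covers."""
    return Fraction(lmul) / mop_count(lmul, table)
```

With the example table, `mop_count(8, EXAMPLE_CPU_MOPS)` gives 4 and `registers_per_mop(8, EXAMPLE_CPU_MOPS)` gives 2, matching the "two registers at a time" case; without a table, `mop_count(8)` falls back to 8.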

11-may-25:

  • Exegesis:
    • SiFive's patch now works, including some changes in ProcResourcePressure measurements
    • We now need to use it to get Lat/Thpt/FUnum parameters from the Green team emulator
  • MCA:
    • We have a PreProcess region method now; however, it works well only for a known constant. Getting LMUL from there can be a problem, so instead we can use information from the table
    • I also need to merge this instruction multiplication patch with the MOP & vector pipeline POC patch

28-apr-25:

  • The Green team has only brought up Chipyard (not Ara/Saturn)
    • We know how to run it; however, the purpose of the artifacts is unknown (probably a Konata log?)
    • It is possible to get the output from the binary
  • We probably can share resources with the Green team? We'll discuss this once they bring up everything
  • Merging the SiFive patch is in its final stage. We also need to apply this patch for libpfm so that Exegesis links properly

23-apr-25 (Syntacore & ITMO):

  • MCA:
    • Instruction copying (LMUL times) is almost done
    • Our custom "Vector" pipeline should be implemented. It should have its own units, separate from the OoO pipeline, etc.
    • We should think about how to add MOP info to specific targets (not mandatory for now)
  • Exegesis:
    • LLVM needs an additional patch for Exegesis to run with the new patch (there are some libpfm problems)
    • Some conflicts with the SiFive patch were discovered
      • Resolved successfully, at least for now

21-apr-25:

  • MCA:
    • Implemented POC with another pipeline
    • The code needs to be cleaned up & refactored; non-MOP instructions should be filtered out
    • Next up: add MOPs to all other instructions
  • Exegesis:
    • SiFive patch merge continues

16-apr-25 (Syntacore):

  • Exegesis:
    • We should start by running instructions with different opcodes, without validating them right away. For this purpose, use the emulator from the Green team
    • So far, no conflicts when applying the SiFive patch to current main
  • MCA:
    • Decided to try to add a separate "vector" pipeline
    • We're on our way to implementing a custom pseudo instruction
    • To add an FU count parameter, we can try to add a RISC-V version of postProcessInstruction

13-apr-25:

  • We need to add a meeting with Syntacore to our calendar
  • Exegesis:
    • It is possible to use the following algorithm to validate Exegesis-generated code snippets:
      1. Get the exact tested code snippet (or don't cut anything and just jump straight to the tested code; this may be considerably slower, though)
      2. Run it under QEMU + a debugger, or use the emulator from the Green team
    • Here's the branch for applying SiFive's patch to our main
    • We should consider resolving the issues from the original MR
      • Not before we resolve all the conflicts
  • MCA:
    • It is possible to copy instructions using the Region instead of the MCAInst builder
    • Branch with prototype:
    • Comment with further plans for Denis
    • We should take a look at how MCInsts are initialized in terms of FU count: an MCInst doesn't have such a field, while from a SchedModel we should be able to tell the FU count. MCAInsts also have this field
    • We should investigate disabling RISCVCustomBehaviour: what happens to the numbers when the original instructions are used instead of their Pseudo-... versions

09-apr-25 (Syntacore + ITMO):

  • Transferred current context with LLVM-MCA
  • The Green team is on its way to bringing up the simulator itself, so soon we can come to them and try some of our Exegesis-generated snippets
  • Good news for validating Exegesis-generated code snippets: we can use the simulator from the Green team

06-apr-25:

  • ITMO sprint took place:
    • The picture from the Syntacore presentation should be added to the one for ITMO
    • The Exegesis tasks should be clarified a bit IMO (what is the open-source patch exactly?)
  • MCA:
    • The proposal for SchedModel widening was written. The main idea was to add additional resources for vector instructions' MOPs. For the feedback see below.
    • The MCA doesn't go to TableGen at all. The codegen feeds it with the MCInsts. So basically the proposal in its current state is not applicable.
    • The MCA is able to replace one instruction with another: once it gets an MCInst from codegen, it creates an MCAInst based on it. For RVV it creates the proper version of the instruction, based on the LMUL
    • Current plan is the following:
      • To understand how the instructions are chosen
      • To teach MCA to create multiple MCAInsts based on LMUL
      • To support it inside SchedModel (fix numbers in subtargets a bit)
  • Exegesis:
    • Sergey and Dmitry Sokolov started to apply the patch to our current main

30-mar-25:

  • MCA:
    • MCA mostly interacts with common LLVM classes, not TableGen directly
    • The only suitable exception is RISCVCustomBehaviour; it should be possible to do the extraction there (see opcodeHasEEWAndEMULInfo as an example)
    • SchedModel widening: added custom class for MOP resources occupied by every vector instruction. The subtargets can then define them accordingly
  • Exegesis:
    • Sergey and Dmitry Sokolov will take the patch and try to apply it to current main branch
    • Dmitry Zubakhin will investigate current snippets generation for different instructions

28-mar-25 (ITMO):

  • Transferred context to ITMO curator
  • We should think about doing one project at a time (Exegesis or MCA), not two tools simultaneously

26-mar-25 (Syntacore):

  • The working plan and decomposition for LLVM-Exegesis is fine
  • The work to be done for the MCA tool is not what we thought:
    • We need to emulate MOPs as they are in Kanata:

One instruction corresponds to LMUL MOPs, which are essentially just LMUL identical instructions issued one after another

  • The interactions with the Green team are close...
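
The MOP emulation described in this entry (one vector instruction becoming LMUL identical MOPs, one after another) can be sketched as follows. This is a minimal Python illustration, not LLVM-MCA code: the `Mop` class and `expand_to_mops` are hypothetical names, and treating fractional LMUL as a single MOP is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mop:
    opcode: str  # same opcode as the parent instruction
    index: int   # position of this MOP within the parent instruction
    total: int   # number of MOPs the parent instruction was split into


def expand_to_mops(opcode, lmul):
    """Split one vector instruction into LMUL identical MOPs, in order.

    Assumption: any LMUL below 1 still produces exactly one MOP.
    """
    count = max(1, int(lmul))
    return [Mop(opcode, i, count) for i in range(count)]


def expand_program(instructions):
    """Expand a list of (opcode, lmul) pairs into a flat MOP stream."""
    stream = []
    for opcode, lmul in instructions:
        stream.extend(expand_to_mops(opcode, lmul))
    return stream
```

For example, `expand_program([("vadd.vv", 4), ("vmerge.vvm", 2)])` yields six MOPs: four for `vadd.vv` followed by two for `vmerge.vvm`.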

23-mar-25 (MCA):

  • Some info on Sched Models: https://github.com/LLVM-Exegesis-MCA-RVV/llvm-project/issues/2
  • Denis is looking through the whole MCA pipeline for a single instruction
    • Looks like TableGen-generated code can call some C++ code? Curious
  • Next steps are:
    • Try to understand how uOps can help us with vector instructions in particular (ask Anton)
    • Understand how a single instruction is processed
    • Next, try to connect our approach with resource division on pipeline stages with some simple instruction

22-mar-25 (Exegesis):

  • Article about RVV: https://fprox.substack.com/p/risc-v-vector-extension-in-a-nutshell-part-1
  • Patch from SiFive:
    • The problem: the same instructions produce different results depending on vsetvl
    • The current solution:
      • Generate all possible vector unit programming cases in snippets for benchmarking using multiple passes:
        • RISCVInsertVSETVIPass
        • RISCVInsertWriteVXRMPass
        • Post-processing pass
      • Patch introduced opcodes in MCInstr
  • Dmitry Zubakhin will check some "unsupported" instructions
  • Arseny will reach out to one of the Winter School curators to ask what was done
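
To give a feel for how many "vector unit programming cases" there are per instruction, here is a rough Python sketch that enumerates vsetvli settings (SEW, LMUL, tail and mask policy). It is not the SiFive patch's logic, which these notes don't show; the function names are hypothetical, and the filter keeps only combinations the RVV 1.0 spec requires every implementation with a given ELEN to support (SEW <= ELEN and SEW <= LMUL * ELEN).

```python
from fractions import Fraction
from itertools import product

SEWS = (8, 16, 32, 64)
LMULS = tuple(Fraction(n, d) for n, d in
              ((1, 8), (1, 4), (1, 2), (1, 1), (2, 1), (4, 1), (8, 1)))


def lmul_name(lmul):
    """Assembly spelling of LMUL: m1/m2/m4/m8 or mf2/mf4/mf8."""
    return f"m{lmul.numerator}" if lmul >= 1 else f"mf{lmul.denominator}"


def vtype_configs(elen=64):
    """List vsetvli operand strings for all guaranteed-legal settings."""
    configs = []
    for sew, lmul, ta, ma in product(SEWS, LMULS, ("ta", "tu"), ("ma", "mu")):
        # Fractional LMUL narrows the largest SEW that must be supported.
        if sew <= elen and sew <= lmul * elen:
            configs.append(f"e{sew}, {lmul_name(lmul)}, {ta}, {ma}")
    return configs
```

For ELEN=64 this gives 22 SEW/LMUL pairs times 4 policy combinations, i.e. 88 settings per instruction before vxrm or vl are even considered, which is why the snippets are generated by passes rather than by hand.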

16-mar-25 (Exegesis):

16-mar-25 (MCA):

  • Read: https://habr.com/ru/articles/474460/
  • grep how MCA uses SchedModel (I guess we should not change any interfaces)
  • scalable vectorization in other CPUs in LLVM: what's the number of micro-ops?
  • take a look at MCA for SiFive CPUs
  • Need to add FU count in a sched model
  • RISCVSchedSiFiveP400.td for reference
    • Try feeding it to MCA to see how it handles RVV

12-mar-25 (Syntacore):

  • More interactions with the Green team
  • performance tools (llvm-based):
    • llvm-mca: code analysis (any random code snippet)
    • llvm-exegesis: hardware analysis
  • We need to try to combine them in some way
  • Exegesis doesn't use LLVM's SchedModel as fully as possible. Initial RVV support is present
  • Traditional LLVM's SchedModel doesn't suit RVV well: we need to widen it (e.g. number of functional units)
  • MCA RVV: no RVV support at all
  • Decomp: MCA RVV, SchedModel for Exegesis, widening SchedModel for RVV, bring-up (saturn-vectors, real hardware)
  • Need to do decomposition
  • Org
    • Biweekly meetings
    • Monthly -- both teams
    • Next meeting: 19-mar-25
  • Need to get familiar with both tools