Commit a20a6c4

Clearly specify FENCE.I ordering requirements (riscv#2167)
* Clearly specify FENCE.I ordering requirements

The ratified FENCE.I definition specifies a sufficient code sequence for multiprocessor instruction-memory modification (on the producer, store to instruction memory, fence, then communicate with the consumer; on the consumer, observe communication, fence.i, then execute the new code). However, the spec does not explain _why_ this sequence suffices.

The explanation is that FENCE.I orders older loads on the same thread before younger instruction fetches. (Otherwise, nothing would force the consumer's observation of the communication to occur before the next fetch.)

Combined with the existing statement in the spec that older stores on the same hart are ordered before younger fetches, the overall implied rule is easy to state: FENCE.I orders all older explicit memory accesses before all younger fetches.

* Change register allocation for clarity
1 parent 06cf0ef commit a20a6c4

1 file changed: +29 -0 lines changed

src/zifencei.adoc

Lines changed: 29 additions & 0 deletions
@@ -78,6 +78,35 @@ instruction memory visible to all RISC-V harts, the writing hart also
 has to execute a data FENCE before requesting that all remote RISC-V
 harts execute a FENCE.I.
 
+A FENCE.I instruction orders all explicit memory accesses that precede the
+FENCE.I in program order before all instruction fetches that follow the
+FENCE.I in program order.
+
+[NOTE]
+====
+In the following litmus test, for example, the outcome `a0`=1, `a1`=0 on
+the consumer hart is forbidden, assuming little-endian RV32IC harts:
+
+```
+Initially, flag = 0.
+
+Producer hart:                       Consumer hart:
+
+la t0, patch_me                      la t2, flag
+li t1, 0x4585                        lw a0, (t2)
+sh t1, (t0) # patch_me := c.li a1, 1 fence.i
+fence w, w  # order flag write       patch_me:
+la t0, flag                          c.li a1, 0
+li t1, 1
+sw t1, (t0) # flag := 1
+```
+
+Note that this example is only meant to illustrate the aforementioned ordering
+property.
+In a realistic producer-consumer code-generation scheme, the consumer would loop
+until `flag` becomes 1 before executing the FENCE.I instruction.
+====
+
 The unused fields in the FENCE.I instruction, _funct12_, _rs1_, and
 _rd_, are reserved for finer-grain fences in future extensions. For
 forward compatibility, base implementations shall ignore these fields,
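
Reader's note, not part of the commit: the litmus test depends on `0x4585` being the 16-bit encoding of `c.li a1, 1`. The sketch below checks that constant under stated assumptions. The helper name `encode_c_li` is hypothetical, and the bit layout is the CI format of the compressed extension as I understand it (funct3 `010`, imm[5], rd, imm[4:0], opcode quadrant `01`).

```python
def encode_c_li(rd: int, imm: int) -> int:
    """Encode `c.li rd, imm` as a 16-bit integer (hypothetical helper).

    Assumed CI-format layout, from bit 15 down to bit 0:
    funct3 (3 bits) | imm[5] | rd (5 bits) | imm[4:0] | op (2 bits)
    """
    if not 1 <= rd <= 31:
        raise ValueError("rd must be x1..x31 (rd = x0 is not a valid c.li)")
    if not -32 <= imm <= 31:
        raise ValueError("imm must fit in 6 signed bits")
    imm6 = imm & 0x3F  # two's-complement view of the 6-bit immediate
    return (
        (0b010 << 13)            # funct3 for c.li
        | ((imm6 >> 5) << 12)    # imm[5]
        | (rd << 7)              # destination register
        | ((imm6 & 0x1F) << 2)   # imm[4:0]
        | 0b01                   # opcode quadrant 1
    )


A1 = 11  # a1 is register x11 in the standard ABI naming

patched = encode_c_li(A1, 1)    # instruction the producer writes
original = encode_c_li(A1, 0)   # instruction initially at patch_me

print(hex(patched), hex(original))  # prints: 0x4585 0x4581

# Bytes that `sh` places at patch_me on a little-endian hart.
patched_bytes = patched.to_bytes(2, "little")
```

The two encodings differ only in the immediate field. The byte order in `patched_bytes` is likely why the note assumes little-endian harts: `sh` must lay the halfword out in the same order that instruction fetch reads it.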
