This repository was archived by the owner on Aug 26, 2025. It is now read-only.

SECOND QUESTION: cache index CMOs, e.g. (set,way) vs "microarchitecture index range" #10

@AndyGlew

Just like an earlier issue discusses address range CMOs vs per-cache-line CMOs... but this time for operations that are typically used for things like "flush the entire I$ or D$".

Such "cache microarchitecture dependent CMOs" have been done a cache line at a time in some earlier processors, but this is less well established than per-cache-line-address-at-a-time operation. Quite a few RISC processors have "full cache flushes", etc.

First, if operating a cache line at a time, there must be a way of indicating which cache line is involved. Typically this is (set,way), but not all caches have sets and ways - indeed, it is not really clear what the set and ways are for something like a skewed associative cache.

But that's okay, we can abstract that as a "cache entry index number", which might be Set*Nways+Way for a traditional set associative cache, or whatever is appropriate.
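
As a minimal sketch of that abstraction for a traditional set associative cache, the mapping and its inverse could look like the following in C. The `cache_geometry` type and the function names are hypothetical, introduced only for illustration; a skewed associative or other cache would define its own mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry descriptor; not taken from any specification. */
typedef struct {
    uint32_t nsets;
    uint32_t nways;
} cache_geometry;

/* Flatten (set, way) into one "cache entry index number": Set*Nways+Way. */
static uint32_t entry_index(cache_geometry g, uint32_t set, uint32_t way) {
    assert(set < g.nsets && way < g.nways);
    return set * g.nways + way;
}

/* Recover the set from a flat entry index. */
static uint32_t entry_set(cache_geometry g, uint32_t index) {
    return index / g.nways;
}

/* Recover the way from a flat entry index. */
static uint32_t entry_way(cache_geometry g, uint32_t index) {
    return index % g.nways;
}
```

With 4 ways, for example, (set 2, way 3) maps to entry index 2*4+3 = 11, and index 11 maps back to set 2, way 3.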

Then, a per-cache-index loop typically looks like

FOR i from 0 to  #cache_entries-1 DO
     CMO.cache_index  i

or

FOR s from 0 to  Nsets-1 DO
     FOR w from 0 to Nways-1 DO
          CMO.by_set_way  s,w

That's the traditional approach.

The draft proposal (by me, Andy Glew, TBD link here3) defines "microarchitecture range CMOs" that look like

        x1 := 0
loop:
        x1 := CMO.UR x1
        BNEZ x1, loop

which looks remarkably like the per-cache-index loop, except that, like in the CMO.AR proposal, the next cache index is returned by the CMO.UR instruction.

This allows several implementations:

(1) per (set,way) cache line at a time - traditional

(2) trap to M-mode and emulate efficiently, with less overhead than trapping once per cache line

(3) state machines that iterate over the entire cache, e.g. for EVICT, to write out dirty data

also (3.1) non-state machine implementations, as in bulk invalidations that set all valid bits to 0 as a single operation.
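
To illustrate why one software loop can cover all of these, here is a toy C model, a sketch under stated assumptions rather than the proposed semantics. The names `toy_cache`, `cmo_ur`, `invalidate_all`, `fill`, and the `step` field are all hypothetical, and the "returns 0 when done" convention is only inferred from the BNEZ loop above. `step` stands for how many entries the implementation chooses to handle per CMO.UR execution.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_ENTRIES 16u   /* hypothetical cache size, in entries */

typedef struct {
    bool     valid[NUM_ENTRIES];
    uint32_t step;        /* entries handled per CMO.UR; chosen by the implementation */
} toy_cache;

/* Toy model of an invalidating CMO.UR: handle up to c->step entries starting
 * at `start`, then return the next index to process, or 0 when finished.
 * The "0 means done" convention is an assumption inferred from the loop. */
static uint32_t cmo_ur(toy_cache *c, uint32_t start) {
    assert(c->step >= 1 && start < NUM_ENTRIES);
    uint32_t remaining = NUM_ENTRIES - start;
    uint32_t n = c->step < remaining ? c->step : remaining;
    for (uint32_t i = start; i < start + n; i++)
        c->valid[i] = false;
    return (start + n == NUM_ENTRIES) ? 0u : start + n;
}

/* The software loop from the text; returns how many times CMO.UR executed. */
static uint32_t invalidate_all(toy_cache *c) {
    uint32_t x1 = 0, executed = 0;
    do {
        x1 = cmo_ur(c, x1);
        executed++;
    } while (x1 != 0);
    return executed;
}

/* Helper for experiments: mark every entry valid and pick a step size. */
static void fill(toy_cache *c, uint32_t step) {
    for (uint32_t i = 0; i < NUM_ENTRIES; i++)
        c->valid[i] = true;
    c->step = step;
}
```

In this model, `step = 1` corresponds to implementation (1) and executes CMO.UR 16 times, an intermediate value such as `step = 4` resembles a trap handler or state machine working in chunks and executes it 4 times, and `step = NUM_ENTRIES` resembles the bulk invalidation of (3.1), finishing in a single execution. The calling loop is identical in all three cases.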


I mark this as a SECONDARY QUESTION in the title:

because I want it to be blaringly obvious

also because I am in a hurry, and will apply this issue tracker's priority scheme later

but mainly because I think there will be less discussion about this CMO.UR cache index range than there will be for the CMO.AR address range instruction.

since there are already quite a few implementations that are "full cache invalidations", and we want RISC-V to support such hardware when it is available.

--

Again, this issue is not for the details of the CMO.UR. It is mostly for the idea of a microarchitecture or cache index range.
