Implement vset* and vector CSRs #467
Changes from 23 commits
a4c0deb
c922f47
536ad31
3c1878e
5211e8a
7a29980
c833986
35b70f7
35c2da9
9e7ea5b
a713489
e7b2d04
fe42aad
f1f3cfc
31b051f
b4fd2b8
cb984f9
46c52cd
b474edf
d143311
712c3ac
855a04a
7fe516b
879a548
a88e079
fe0a0da
c0b3b3c
34e2bb0
94daf9d
2921e24
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| # Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. | ||
| # SPDX-License-Identifier: BSD-3-Clause-Clear | ||
|
|
||
| # yaml-language-server: $schema=../../schemas/csr_schema.json | ||
|
|
||
| $schema: "csr_schema.json#" | ||
| kind: csr | ||
| name: vcsr | ||
| long_name: Vector Control and Status Register | ||
| address: 0x00F | ||
| writable: true | ||
| priv_mode: U | ||
| length: MXLEN | ||
| description: Allows access to vxrm and vxsat CSRs | ||
jmawet marked this conversation as resolved.
Outdated
|
||
| definedBy: V | ||
| fields: | ||
| VXRM: | ||
| location: 2-1 | ||
| description: See vxrm. | ||
| type: RW-RH | ||
| sw_write(csr_value): | | ||
| return csr_value.VXRM; | ||
jmawet marked this conversation as resolved.
|
||
| reset_value: UNDEFINED_LEGAL | ||
| VXSAT: | ||
| location: 0 | ||
| description: See vxsat. | ||
| type: RW-RH | ||
| sw_write(csr_value): | | ||
| return csr_value.VXSAT; | ||
jmawet marked this conversation as resolved.
|
||
| reset_value: UNDEFINED_LEGAL | ||
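
As an illustration of the `vcsr` layout defined above (VXRM in bits 2-1, VXSAT in bit 0), the following is a minimal Python sketch of packing and unpacking the two fields. The helper names `pack_vcsr` and `unpack_vcsr` are hypothetical and not part of this PR; only the bit positions come from the YAML.

```python
# Bit positions taken from the vcsr field definitions; names are illustrative only.
VXSAT_SHIFT = 0   # vcsr[0]
VXRM_SHIFT = 1    # vcsr[2:1]
VXRM_MASK = 0b11  # VXRM is two bits wide


def pack_vcsr(vxrm, vxsat):
    """Combine VXRM and VXSAT into a vcsr value; all upper bits stay zero."""
    return ((vxrm & VXRM_MASK) << VXRM_SHIFT) | ((vxsat & 1) << VXSAT_SHIFT)


def unpack_vcsr(vcsr):
    """Return (vxrm, vxsat) extracted from a vcsr value."""
    return (vcsr >> VXRM_SHIFT) & VXRM_MASK, (vcsr >> VXSAT_SHIFT) & 1
```

For instance, `pack_vcsr(0b10, 1)` places the rounding mode in bits 2-1 and the saturation flag in bit 0.
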
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| # Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. | ||
| # SPDX-License-Identifier: BSD-3-Clause-Clear | ||
|
|
||
| # yaml-language-server: $schema=../../schemas/csr_schema.json | ||
|
|
||
| $schema: "csr_schema.json#" | ||
| kind: csr | ||
| name: vl | ||
| long_name: Vector Length | ||
| address: 0xC20 | ||
| writable: false | ||
| priv_mode: U | ||
| length: MXLEN | ||
| description: Holds an unsigned integer specifying number of elements to be updated with results from a vector instruction. | ||
| definedBy: V | ||
| fields: | ||
| VALUE: | ||
| location_rv32: 31-0 | ||
| location_rv64: 63-0 | ||
| description: | | ||
| The vl register holds an unsigned integer specifying the number of elements to be updated with | ||
| results from a vector instruction, as further detailed in Section 31.5.4. | ||
| [NOTE] | ||
| The number of bits implemented in vl depends on the implementation's maximum vector | ||
| length of the smallest supported type. The smallest vector implementation with VLEN=32 | ||
| and supporting SEW=8 would need at least six bits in vl to hold the values 0-32 | ||
| (VLEN=32, with LMUL=8 and SEW=8, yields VLMAX=32). | ||
| type: RO-H | ||
| reset_value: 0 | ||
jmawet marked this conversation as resolved.
Outdated
|
||
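
The note in the `vl` description can be checked with a short calculation. This is a sketch only: it assumes the relation VLMAX = LMUL * VLEN / SEW from the vector specification, and the function names `vlmax` and `vl_bits` are hypothetical.

```python
from fractions import Fraction


def vlmax(vlen, sew, lmul):
    """VLMAX = LMUL * VLEN / SEW; lmul may be an int or a Fraction."""
    return int(Fraction(lmul) * vlen / sew)


def vl_bits(vlen, min_sew=8, max_lmul=8):
    """Bits needed to hold every vl value from 0 through VLMAX inclusive."""
    return vlmax(vlen, min_sew, max_lmul).bit_length()
```

With VLEN=32, SEW=8 and LMUL=8, `vlmax` gives 32 and `vl_bits(32)` gives 6, which matches the "at least six bits" statement in the note.
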
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| # Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. | ||
| # SPDX-License-Identifier: BSD-3-Clause-Clear | ||
|
|
||
| # yaml-language-server: $schema=../../schemas/csr_schema.json | ||
|
|
||
| $schema: "csr_schema.json#" | ||
| kind: csr | ||
| name: vlenb | ||
| long_name: Vector Byte Length | ||
ThinkOpenly marked this conversation as resolved.
|
||
| address: 0xC22 | ||
| writable: false | ||
| priv_mode: U | ||
| length: MXLEN | ||
| description: Holds the value VLEN/8, the vector register length in bytes. | ||
| definedBy: V | ||
| fields: | ||
| VALUE: | ||
| location_rv32: 31-0 | ||
| location_rv64: 63-0 | ||
| description: | | ||
| The value in vlenb is a design-time constant in any implementation. | ||
| Without this CSR, several instructions are needed to calculate VLEN in bytes, and the code | ||
| has to disturb the current vl and vtype settings, which then have to be saved and restored. | ||
| type: RO | ||
| reset_value(): return VLEN / 8; | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,74 @@ | ||
| # Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. | ||
| # SPDX-License-Identifier: BSD-3-Clause-Clear | ||
|
|
||
| # yaml-language-server: $schema=../../schemas/csr_schema.json | ||
|
|
||
| $schema: "csr_schema.json#" | ||
| kind: csr | ||
| name: vstart | ||
| long_name: Vector Start Index | ||
| address: 0x008 | ||
| writable: true | ||
| priv_mode: U | ||
| length: MXLEN | ||
| description: Specifies the index of the first element to be executed by a vector instruction. | ||
| definedBy: V | ||
| fields: | ||
| VALUE: | ||
| location_rv32: 31-0 | ||
| location_rv64: 63-0 | ||
| description: | | ||
| Normally, vstart is only written by hardware on a trap on a vector instruction, with the vstart value | ||
| representing the element on which the trap was taken (either a synchronous exception or an | ||
| asynchronous interrupt), and at which execution should resume after a resumable trap is handled. | ||
| All vector instructions are defined to begin execution with the element number given in the vstart | ||
| CSR, leaving earlier elements in the destination vector undisturbed, and to reset the vstart CSR to | ||
| zero at the end of execution. | ||
| [NOTE] | ||
| All vector instructions, including vset{i}vl{i}, reset the vstart CSR to zero. | ||
| vstart is not modified by vector instructions that raise illegal-instruction exceptions. | ||
| The vstart CSR is defined to have only enough writable bits to hold the largest element index (one | ||
| less than the maximum VLMAX). | ||
| [NOTE] | ||
| The maximum vector length is obtained with the largest LMUL setting (8) and the smallest | ||
| SEW setting (8), so VLMAX_max = 8*VLEN/8 = VLEN. For example, for VLEN=256, | ||
| vstart would have 8 bits to represent indices from 0 through 255. | ||
| The use of vstart values greater than the largest element index for the current vtype setting is | ||
| reserved. | ||
| [NOTE] | ||
| It is recommended that implementations trap if vstart is out of bounds. It is not required | ||
| to trap, as a possible future use of upper vstart bits is to store imprecise trap | ||
| information. | ||
| The vstart CSR is writable by unprivileged code, but non-zero vstart values may cause vector | ||
| instructions to run substantially slower on some implementations, so vstart should not be used by | ||
| application programmers. A few vector instructions cannot be executed with a non-zero vstart value | ||
| and will raise an illegal instruction exception as defined below. | ||
| [NOTE] | ||
| Making vstart visible to unprivileged code supports user-level threading libraries. | ||
| Implementations are permitted to raise illegal instruction exceptions when attempting to execute a | ||
| vector instruction with a value of vstart that the implementation can never produce when executing | ||
| that same instruction with the same vtype setting. | ||
| [NOTE] | ||
| For example, some implementations will never take interrupts during execution of a vector | ||
| arithmetic instruction, instead waiting until the instruction completes to take the | ||
| interrupt. Such implementations are permitted to raise an illegal instruction exception | ||
| when attempting to execute a vector arithmetic instruction when vstart is nonzero. | ||
| [NOTE] | ||
| When migrating a software thread between two harts with different microarchitectures, | ||
| the vstart value might not be supported by the new hart microarchitecture. The runtime | ||
| on the receiving hart might then have to emulate instruction execution up to the next | ||
| supported vstart element position. Alternatively, migration events can be constrained to | ||
| only occur at mutually supported vstart locations. | ||
| sw_write(csr_value): | | ||
| return csr_value.VALUE & (VLEN - 1); | ||
|
||
| type: RW-RH | ||
| reset_value: UNDEFINED_LEGAL | ||
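
The `sw_write` mask for `vstart` follows from the width rule in the description: the largest element index is VLEN-1, so for a power-of-two VLEN the writable bits are exactly `VLEN - 1`. A minimal Python model of that behaviour is sketched below; `vstart_mask` and `write_vstart` are hypothetical names, and the power-of-two check is an assumption made explicit here.

```python
def vstart_mask(vlen):
    """Mask of writable vstart bits; assumes VLEN is a power of two."""
    if vlen <= 0 or (vlen & (vlen - 1)) != 0:
        raise ValueError("VLEN must be a positive power of two")
    return vlen - 1


def write_vstart(value, vlen):
    """Model of sw_write: bits above the largest element index are dropped."""
    return value & vstart_mask(vlen)
```

For VLEN=256 the mask is 255, i.e. 8 writable bits covering indices 0 through 255, as in the note above.
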
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,118 @@ | ||
| # Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. | ||
| # SPDX-License-Identifier: BSD-3-Clause-Clear | ||
|
|
||
| # yaml-language-server: $schema=../../schemas/csr_schema.json | ||
|
|
||
| $schema: "csr_schema.json#" | ||
| kind: csr | ||
| name: vtype | ||
| long_name: Vector Type | ||
| address: 0xC21 | ||
| writable: false | ||
| priv_mode: U | ||
| length: MXLEN | ||
| description: Provides the default type used to interpret the contents of the vector register file. | ||
| definedBy: V | ||
| fields: | ||
| VILL: | ||
| location_rv32: 31 | ||
| location_rv64: 63 | ||
| description: | | ||
| The vill bit is used to encode that a previous vset{i}vl{i} instruction attempted to write an | ||
| unsupported value to vtype. | ||
| [NOTE] | ||
| The vill bit is held in bit XLEN-1 of the CSR to support checking for illegal values with a | ||
| branch on the sign bit. | ||
| If the vill bit is set, then any attempt to execute a vector instruction that depends upon vtype will | ||
| raise an illegal-instruction exception. | ||
| When the vill bit is set, the other XLEN-1 bits in vtype shall be zero. | ||
| It is recommended that at reset, vill is set. | ||
| type: RO-H | ||
| reset_value: 1 | ||
jmawet marked this conversation as resolved.
Outdated
|
||
| VMA: | ||
| location: 7 | ||
| description: | | ||
| Vector mask agnostic bit. Modifies the behavior of destination inactive masked-off elements during the | ||
| execution of vector instructions. | ||
| A value of 0 means inactive elements are undisturbed, meaning the corresponding set of destination elements | ||
| in a vector register group retain the value they previously held. | ||
| A value of 1 means inactive elements are agnostic, meaning the corresponding set of destination elements | ||
| in any vector destination operand can either retain the value they previously held or be overwritten with 1s. | ||
| Within a single vector instruction, each destination element can be either left undisturbed or overwritten | ||
| with 1s, in any combination, and the pattern of undisturbed or overwritten with 1s is not required to be | ||
| deterministic when the instruction is executed with the same inputs. | ||
| It is recommended that at reset, vill is set, and the remaining bits in vtype are zero. | ||
| type: RO-H | ||
| reset_value: 0 | ||
| VTA: | ||
| location: 6 | ||
| description: | | ||
| Vector tail agnostic bit. Modifies the behavior of destination tail elements during the execution of vector | ||
| instructions. | ||
| A value of 0 means tail elements are undisturbed, meaning the corresponding set of destination elements | ||
| in a vector register group retain the value they previously held. | ||
| A value of 1 means tail elements are agnostic, meaning the corresponding set of destination elements | ||
| in any vector destination operand can either retain the value they previously held or be overwritten with 1s. | ||
| Within a single vector instruction, each destination element can be either left undisturbed or overwritten | ||
| with 1s, in any combination, and the pattern of undisturbed or overwritten with 1s is not required to be | ||
| deterministic when the instruction is executed with the same inputs. | ||
| It is recommended that at reset, vill is set, and the remaining bits in vtype are zero. | ||
| type: RO-H | ||
| reset_value: 0 | ||
| VSEW: | ||
| location: 5-3 | ||
| description: | | ||
| The value in vsew sets the dynamic selected element width (SEW). | ||
| [separator="!"] | ||
| !=== | ||
| ! vsew[2:0] ! SEW ! Elements per vector register (for VLEN=128) | ||
| ! 000 ! 8 ! 16 | ||
| ! 001 ! 16 ! 8 | ||
| ! 010 ! 32 ! 4 | ||
| ! 011 ! 64 ! 2 | ||
| ! 1XX ! Reserved ! Reserved | ||
| !=== | ||
| It is recommended that at reset, vill is set, and the remaining bits in vtype are zero. | ||
| type: RO-H | ||
| reset_value: 0 | ||
| VLMUL: | ||
| location: 2-0 | ||
| description: | | ||
| Vector register group multiplier. | ||
| Multiple vector registers can be grouped together, so that a single vector instruction can operate on | ||
| multiple vector registers. The term vector register group is used herein to refer to one or more vector | ||
| registers used as a single operand to a vector instruction. Vector register groups can be used to provide | ||
| greater execution efficiency for longer application vectors, but the main reason for their inclusion is to | ||
| allow double-width or larger elements to be operated on with the same vector length as single-width | ||
| elements. The vector length multiplier, LMUL, when greater than 1, represents the default number of | ||
| vector registers that are combined to form a vector register group. Implementations must support | ||
| LMUL integer values of 1, 2, 4, and 8. | ||
| [NOTE] | ||
| The vector architecture includes instructions that take multiple source and destination | ||
| vector operands with different element widths, but the same number of elements. The | ||
| effective LMUL (EMUL) of each vector operand is determined by the number of registers | ||
| required to hold the elements. For example, for a widening add operation, such as add | ||
| 32-bit values to produce 64-bit results, a double-width result requires twice the LMUL of the | ||
| single-width inputs. | ||
| LMUL can also be a fractional value, reducing the number of bits used in a single vector register. | ||
| Fractional LMUL is used to increase the number of effective usable vector register groups when | ||
| operating on mixed-width values. | ||
| It is recommended that at reset, vill is set, and the remaining bits in vtype are zero. | ||
| type: RO-H | ||
| reset_value: 0 | ||
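
To make the `vtype` field layout concrete, here is a hedged Python sketch that decodes a raw `vtype` value using the bit positions from the YAML above (vill at XLEN-1, vma at 7, vta at 6, vsew at 5-3, vlmul at 2-0). The `decode_vtype` helper is hypothetical. The `vlmul` encoding table, including the fractional values, is not shown in this diff; it is taken from the ratified vector specification and should be checked against it.

```python
from fractions import Fraction

# vsew[2:0] -> SEW, as in the table in the VSEW description; 1XX is reserved.
SEW_BY_VSEW = {0b000: 8, 0b001: 16, 0b010: 32, 0b011: 64}

# vlmul[2:0] -> LMUL per the vector specification (assumption: not shown in this diff).
LMUL_BY_VLMUL = {
    0b000: Fraction(1), 0b001: Fraction(2), 0b010: Fraction(4), 0b011: Fraction(8),
    0b101: Fraction(1, 8), 0b110: Fraction(1, 4), 0b111: Fraction(1, 2),
}


def decode_vtype(vtype, xlen=64):
    """Split a raw vtype value into its fields; reserved encodings map to None."""
    return {
        "vill": bool((vtype >> (xlen - 1)) & 1),
        "vma": bool((vtype >> 7) & 1),
        "vta": bool((vtype >> 6) & 1),
        "sew": SEW_BY_VSEW.get((vtype >> 3) & 0b111),
        "lmul": LMUL_BY_VLMUL.get(vtype & 0b111),
    }
```

Because vill sits in the top bit, a model like this only needs `xlen` to locate it; the remaining fields are at fixed positions for both RV32 and RV64.
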
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| # Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. | ||
| # SPDX-License-Identifier: BSD-3-Clause-Clear | ||
|
|
||
| # yaml-language-server: $schema=../../schemas/csr_schema.json | ||
|
|
||
| $schema: "csr_schema.json#" | ||
| kind: csr | ||
| name: vxrm | ||
| long_name: Vector Fixed-Point Rounding Mode | ||
| address: 0x00A | ||
| writable: true | ||
| priv_mode: U | ||
| length: MXLEN | ||
| description: Holds a 2-bit read-write rounding-mode field in the least-significant bits | ||
| definedBy: V | ||
| sw_read(): | | ||
| return CSR[vcsr].VXRM; | ||
jmawet marked this conversation as resolved.
|
||
| fields: | ||
| VALUE: | ||
| alias: vcsr.VXRM | ||
| location_rv32: 31-0 | ||
| location_rv64: 63-0 | ||
| description: | | ||
| The vector fixed-point rounding-mode register holds a two-bit read-write rounding-mode field in the | ||
| least-significant bits (vxrm[1:0]). The upper bits, vxrm[XLEN-1:2], should be written as zeros. | ||
| The vector fixed-point rounding-mode is given a separate CSR address to allow independent access, | ||
| but is also reflected as a field in vcsr. | ||
| [NOTE] | ||
| A new rounding mode can be set while saving the original rounding mode using a single csrwi instruction. | ||
| The fixed-point rounding algorithm is specified as follows. Suppose the pre-rounding result is v, and d | ||
| bits of that result are to be rounded off. Then the rounded result is (v >> d) + r, where r depends on | ||
| the rounding mode as specified in the following table of vxrm[1:0] values. | ||
| [separator="!"] | ||
| !=== | ||
| ! vxrm[1:0] ! Abbreviation ! Rounding Mode ! Rounding increment, r | ||
| ! 00 ! rnu ! round-to-nearest-up (add +0.5 LSB) ! v[d-1] | ||
| ! 01 ! rne ! round-to-nearest-even ! v[d-1] & (v[d-2:0]\!=0 | v[d]) | ||
| ! 10 ! rdn ! round-down (truncate) ! 0 | ||
| ! 11 ! rod ! round-to-odd (OR bits into LSB, aka "jam") ! \!v[d] & v[d-1:0]\!=0 | ||
| !=== | ||
| sw_write(csr_value): | | ||
| return csr_value.VALUE & 3; | ||
jmawet marked this conversation as resolved.
Outdated
|
||
| type: RW-H | ||
| reset_value: UNDEFINED_LEGAL | ||
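
The rounding rule in the `vxrm` description, `(v >> d) + r`, can be sketched directly from the table of rounding increments. The Python below is an illustrative model, not code from this PR: `round_increment` and `roundoff` are hypothetical names, and returning an increment of 0 when `d` is 0 is an assumption, since `v[d-1]` does not exist in that case.

```python
def round_increment(v, d, vxrm):
    """Rounding increment r for pre-rounding value v, d discarded bits, mode vxrm[1:0]."""
    if d == 0:
        return 0  # assumption: nothing is rounded off, so no increment

    def bit(i):
        return (v >> i) & 1

    def low_nonzero(n):
        # True when v[n-1:0] != 0; an empty range (n == 0) counts as zero
        return int((v & ((1 << n) - 1)) != 0)

    if vxrm == 0b00:  # rnu: round-to-nearest-up
        return bit(d - 1)
    if vxrm == 0b01:  # rne: round-to-nearest-even
        return bit(d - 1) & (low_nonzero(d - 1) | bit(d))
    if vxrm == 0b10:  # rdn: round-down (truncate)
        return 0
    if vxrm == 0b11:  # rod: round-to-odd ("jam")
        return (bit(d) ^ 1) & low_nonzero(d)
    raise ValueError("vxrm must be in the range 0..3")


def roundoff(v, d, vxrm):
    """Rounded result (v >> d) + r as described for vxrm."""
    return (v >> d) + round_increment(v, d, vxrm)
```

As a worked case, rounding `v = 0b1010` by `d = 2` bits (2.5 in units of the result LSB) gives 3 under rnu, 2 under rne, 2 under rdn, and 3 under rod.
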
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,31 @@ | ||||||
| # Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. | ||||||
| # SPDX-License-Identifier: BSD-3-Clause-Clear | ||||||
|
|
||||||
| # yaml-language-server: $schema=../../schemas/csr_schema.json | ||||||
|
|
||||||
| $schema: "csr_schema.json#" | ||||||
| kind: csr | ||||||
| name: vxsat | ||||||
| long_name: Vector Fixed-Point Saturate Flag | ||||||
|
Collaborator
Suggested change
|
||||||
| address: 0x009 | ||||||
| writable: true | ||||||
| priv_mode: U | ||||||
| length: MXLEN | ||||||
| description: Indicates if a fixed-point instruction has had to saturate an output value to fit into a destination format | ||||||
| definedBy: V | ||||||
| sw_read(): | | ||||||
| return CSR[vcsr].VXSAT; | ||||||
jmawet marked this conversation as resolved.
|
||||||
| fields: | ||||||
| VALUE: | ||||||
| location_rv32: 31-0 | ||||||
| location_rv64: 63-0 | ||||||
| description: | | ||||||
| The vxsat CSR has a single read-write least-significant bit (vxsat[0]) that indicates if a fixed-point | ||||||
| instruction has had to saturate an output value to fit into a destination format. Bits vxsat[XLEN-1:1] | ||||||
| should be written as zeros. | ||||||
| The vxsat bit is mirrored in vcsr. | ||||||
| sw_write(csr_value): | | ||||||
| return csr_value.VALUE & 1; | ||||||
jmawet marked this conversation as resolved.
Outdated
|
||||||
| type: RW-H | ||||||
| reset_value: UNDEFINED_LEGAL | ||||||
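
To illustrate what "had to saturate an output value to fit into a destination format" means for `vxsat`, here is a small Python sketch of a signed saturating add that reports whether clamping occurred. The function name `sat_add_signed` is hypothetical, and treating the flag as sticky (OR-ed in, never cleared by the operation) is an assumption based on the vector specification rather than on anything in this diff.

```python
def sat_add_signed(a, b, sew=8):
    """Signed saturating add for SEW-bit elements; returns (result, saturated)."""
    lo = -(1 << (sew - 1))
    hi = (1 << (sew - 1)) - 1
    total = a + b
    clamped = max(lo, min(hi, total))
    return clamped, int(clamped != total)


# Model of the flag update: vxsat accumulates saturation events.
vxsat = 0
result, saturated = sat_add_signed(100, 100)
vxsat |= saturated
```

Here 100 + 100 does not fit in a signed 8-bit element, so the result is clamped to 127 and the modelled `vxsat` bit becomes 1.
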
I'm basing this on the spec's convention to call the section for the register "[long_name] (mnemonic) Register", like:
and so on.
vcsr is listed thus: