Open
Labels
performance (This affects protocol performance)
Description
`intermediates_to_table_indices` works as follows:
- It calls `bits_to_table_indices`, which takes three `u128`s, each containing the value of one of three intermediates for 128 multiplications, and returns four `u128`s containing a table index in each nibble.
- It then reorders those nibbles into bytes as its output. (Originally, the table lookup was done here, but additional optimization moved the table lookup elsewhere.)
It appears that `bits_to_table_indices` compiles to fewer than 200 instructions (fully unrolled, with no loops or branches), while the rearranging of nibbles compiles to more than 1000 instructions (again fully unrolled, with no loops or branches). Implementing a single transpose-like operation covering both steps would probably be more efficient.
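To make the idea concrete, here is a rough sketch of what a single-pass version could look like. It is not the existing code. The function names `indices_naive`, `indices_fused`, and `spread_bits` are made up for illustration, and the index layout (`a_i | b_i << 1 | c_i << 2`, one byte per multiplication, bit `i` mapping to output byte `i`) is an assumption that may not match what `bits_to_table_indices` actually produces.

```rust
/// Reference version: build the index for multiplication `i` one bit at a time.
/// Assumed layout (hypothetical, not taken from the codebase):
/// index = a_i | (b_i << 1) | (c_i << 2).
fn indices_naive(a: u128, b: u128, c: u128) -> [u8; 128] {
    let mut out = [0u8; 128];
    for i in 0..128 {
        let bit = |x: u128| ((x >> i) & 1) as u8;
        out[i] = bit(a) | (bit(b) << 1) | (bit(c) << 2);
    }
    out
}

/// Spread the 8 bits of `byte` into the 8 byte lanes of a u64,
/// so that lane k holds bit k as 0 or 1.
fn spread_bits(byte: u8) -> u64 {
    const REPLICATE: u64 = 0x0101_0101_0101_0101;
    const SELECT: u64 = 0x8040_2010_0804_0201;
    // Copy the byte into every lane, then keep only bit k in lane k.
    let lanes = (byte as u64).wrapping_mul(REPLICATE) & SELECT;
    // Map each nonzero lane to exactly 1 (no carry crosses a lane boundary).
    ((lanes + 0x7f7f_7f7f_7f7f_7f7f) >> 7) & REPLICATE
}

/// Single-pass version: goes from the three inputs straight to one index
/// per output byte, without an intermediate nibble-packed form.
fn indices_fused(a: u128, b: u128, c: u128) -> [u8; 128] {
    let (a, b, c) = (a.to_le_bytes(), b.to_le_bytes(), c.to_le_bytes());
    let mut out = [0u8; 128];
    for j in 0..16 {
        // Each input byte covers 8 multiplications and yields 8 output bytes.
        let word = spread_bits(a[j]) | (spread_bits(b[j]) << 1) | (spread_bits(c[j]) << 2);
        out[8 * j..8 * j + 8].copy_from_slice(&word.to_le_bytes());
    }
    out
}
```

Whether something like this actually beats the current two-step code would need to be checked by inspecting the generated assembly and benchmarking; a SIMD shuffle-based transpose may well do better than this scalar multiply-and-mask trick.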