From 885873c2e42755b24ab2d80349328c14612f65ec Mon Sep 17 00:00:00 2001
From: zohassadar
Date: Wed, 26 Feb 2025 16:20:31 +0000
Subject: [PATCH 1/2] Reduce cycles and byte usage

1. Orientation table split & reduced

Previously the orientation table consisted of 12-byte runs, one for each
orientation. The 12 bytes consisted of the pattern Y offset, tile index,
X offset. This group of 3 was repeated for each of the 4 minos in a piece.

The X and Y offsets now each have their own table consisting of 4-byte
runs, one run for each orientation. The 4 bytes are for each of the 4
minos in a piece. This means the multiplication by 12 that was previously
needed in tileModifierForCurrentPiece, isPositionValid and
playState_lockTetrimino has been reduced to a multiplication by 4, a much
simpler operation.

The tiles are also in their own table, but the data has been reduced to
1/4 of its previous size. The currentPiece value is used as-is for lookup
into this table and stored in an aliased temp variable generalCounter5.

2. Use table for Y * 10

Previously the tetriminoY value (and its offsets) would be multiplied by
10 with shifting/adding. Using multBy10Table was not an option as the Y
value could be either -1 or -2. By pinning multBy10Table to the beginning
of a page and the two negative values at the end of the page (indexes $FE
and $FF), the table can be used for all potential values of Y (-2..=20).

Assert statements have been added to ensure that the tables remain where
they should be. The orientation tables and multiplication tables have
been placed in between to make use of nearly the entire page. The page
alignment has the additional benefit of reducing cycle count, as lookups
no longer cross a page boundary.

3. Recycle harddrop's ghost piece sprite staging

The game checks for hard drop prior to the normal shift/rotate/drop. If
the inputs are present for either a hard or sonic drop, the piece will
end up where the ghost piece was staged in the previous frame.
Saving this calculated Y value and reusing it in harddrop_tetrimino
eliminates the redundant repeated calls to isPositionValid.

4. Take advantage of negative bit in Tiles

The minos in use by the game are all positive (7B,7C,7D,7E) while an
empty tile is negative (EF). The negative flag can be used to quickly
determine if there's a mino or not in the playfield.

5. Count down instead of up

A few operations in the hard drop routine handle one row at a time,
either moving it or checking all values. Whether the row is read left to
right or right to left does not matter. By counting down instead of up,
the negative flag can be used to signal the end of the loop instead of a
comparison.

These changes reduce the maximum measured hard drop cycles from 21,876 to
12,877, and the maximum measured ghost piece staging cycles from 9,653 to
7,456. Unused ROM space has been increased from 9,148 bytes to 9,216.
---
 src/data/mult.asm        |   9 ----
 src/data/mult_orient.asm | 105 +++++++++++++++++++++++++++++++++++++++
 src/data/orientation.asm |  39 ---------------
 src/main.asm             |   6 +--
 src/playstate/active.asm |  25 ++++------
 src/playstate/lock.asm   |  37 +++++---------
 src/playstate/util.asm   |  39 +++++----------
 src/ram.asm              |   3 +-
 src/sprites/piece.asm    |  32 ++++++------
 tests/src/harddrop.rs    |   1 -
 10 files changed, 158 insertions(+), 138 deletions(-)
 delete mode 100644 src/data/mult.asm
 create mode 100644 src/data/mult_orient.asm
 delete mode 100644 src/data/orientation.asm

diff --git a/src/data/mult.asm b/src/data/mult.asm
deleted file mode 100644
index 82e4f9b6..00000000
--- a/src/data/mult.asm
+++ /dev/null
@@ -1,9 +0,0 @@
-multBy10Table:
-    .byte $00,$0A,$14,$1E,$28,$32,$3C,$46
-    .byte $50,$5A,$64,$6E,$78,$82,$8C,$96
-    .byte $A0,$AA,$B4,$BE
-multBy32Table:
-    .byte 0,32,64,96,128,160,192,224
-multBy100Table:
-    .byte $0, $64, $c8, $2c, $90
-    .byte $f4, $58, $bc, $20, $84
diff --git a/src/data/mult_orient.asm b/src/data/mult_orient.asm
new file mode 100644
index 00000000..a1bcd73f
--- /dev/null
+++ b/src/data/mult_orient.asm
@@ -0,0 +1,105 @@
+; multiplication and orientation tables
+; Combined to share a common page. Crossing page boundaries in table lookups
+; costs an extra cycle and causes timing variance that can add up. One advantage
+; of these tables taking nearly a full page is being able to use the end of the
+; page for lookups to the multBy10Table. The game logic multiplies tetriminoY (or
+; offsets determined by the orientation table) by 10 frequently. During these
+; calculations, this value is never less than -2 and never greater than 20. A 256
+; byte lookup table would be mostly wasteful except the first 20 and last 2 bytes.
+; The repeated calls to isPositionValid used for harddrop and 0 arr see a massive
+; impact from these optimisations.
+
+; mult10Tail at end of this page allows table lookup for all possible values of tetriminoY (-2..=20)
+multOrientBegin:
+.assert