This project provides two versions of an assembly program that implements the 3n + 1 algorithm (Collatz Conjecture) on a DLX architecture. The code calculates a sequence starting from a given integer and generates key statistics on the sequence, including its maximum, mean, and derived values stored in a list. The project includes an optimized and a non-optimized version of the program, demonstrating various assembly-level optimization techniques.
- OPTIMIZADO.S: Optimized assembly version of the 3n + 1 algorithm. This version includes optimizations for reduced execution time, lower instruction count, and minimized stalls.
- SINOPTIMIZAR.S: Non-optimized assembly version of the 3n + 1 algorithm. This version executes the algorithm in a straightforward manner, without optimizations for speed or efficiency.
- utils: Contains a shell script to compile and execute the code in the DLX architecture simulator.
- WINDLX.BAT: A batch file in the WINDLX directory to run the DLX architecture simulator on Windows.
In the utils folder, there is a shell script that compiles the code and runs it in the DLX architecture simulator. For Windows users, the WINDLX.BAT file in the WINDLX directory launches the simulator directly.
This project uses a simulated DLX CPU with the following hardware specifications:
-
DLX CPU with a 5-stage pipelined architecture (image available in PIPELINE).
-
Registers: 32 general-purpose registers (GPRs) and 32 floating-point registers (FPRs), each 32 bits. FPRs can be paired for 64-bit double-precision values.
-
Memory: Byte-addressable, big-endian format with 32-bit addresses.
-
Instructions: All instructions are 32-bit and must be aligned in memory.
-
Special Registers: A few special registers can be transferred to/from the GPRs.
The program computes the following sequence statistics:
- Maximum Value: The largest value encountered in the sequence.
- Mean Value: The average of the sequence values.
- Sequence Length: The total number of terms in the sequence.
- List Calculations: Various derived values based on the initial, mean, maximum values, and sequence length.
Both versions of the algorithm follow similar structures with variations in efficiency and instruction count:
- Sequence Generation: For a given integer
n, the sequence is generated by the rules:- If
nis even, divide it by 2. - If
nis odd, multiply it by 3 and add 1.
- If
- Termination Condition: The sequence generation continues until the value
1is reached.
- Maximum and Mean Calculations: The program tracks the maximum sequence value and the sum of terms to calculate the mean.
- List Calculations: Various products and ratios of the initial, mean, maximum values, and sequence length are computed and stored in a list, including
(Initial/Maximum) * Lengthand(Mean/Initial) * Length.
The optimized version applies several techniques to improve performance:
- Reduced Instruction Count: Redundant instructions are minimized, using fewer registers and combining operations.
- Loop Unrolling: The main sequence loop is unrolled to reduce branch instructions and increase parallelism.
- Multiplication and Division Reductions: Integer and floating-point calculations are simplified by replacing divisions with multiplications where possible (e.g., multiplying by
0.1111111instead of dividing by 9). - Stall Minimization: Dependencies between instructions are reduced by reordering calculations, improving pipeline efficiency and reducing RAW stalls.
- Conditional Branch Reduction: Only one conditional branch is used per loop iteration.
The optimized version demonstrates significant improvements in execution cycles and stall reduction:
- Execution Cycles: The optimized program executes fewer cycles due to minimized instruction count.
- Instruction Count: Reduced through efficient loop unrolling and instruction reordering.
- Stalls: Fewer RAW stalls and control stalls, as the optimized program incurs only 17 RAW stalls versus 702 in the non-optimized version.
- Compile and Run: Compile the
.Sfiles using an assembler compatible with DLX architecture and load them into a DLX simulator. - Input Configuration: Set
valor_inicialto the desired starting integer. The program initializes this value and computes the sequence and statistics automatically. - Results: Upon execution, the sequence statistics are saved in memory locations defined in the
.datasection, includingsecuencia_valor_medio(mean of the sequence),secuencia_maximo(maximum of the sequence), andlista_valor_medio(mean of derived list values).
