Skip to content

McDaMastR/CollatzConjectureSimulator

Repository files navigation

Collatz Conjecture Simulator

A program to efficiently determine the total stopping time of the Collatz sequence of any 128-bit integer, and find all starting values $n$ whose total stopping time is the greatest of all integers in the interval $[1,\,n]$.

Mathematical Notation

This document assumes familiarity with mathematical concepts including sequences, functions, recursion, and modular arithmetic. Additionally, the following conventions and notation will be used.

  • $0\notin\mathbb N$.
  • $\mathbb N^*=\mathbb N\cup\{0,\infty\}$.
  • $f^0(x)=x$ and $f^n(x)=(f\circ f^{n-1})(x)$ for any function $f$, real $x$, and natural $n$.

The Collatz Conjecture

The Collatz Conjecture is an iconic unsolved problem in number theory. It regards the Collatz function $f:\mathbb N\to\mathbb N$, commonly defined as

$$f(x)= \begin{cases} \frac12x &\text{if }x\text{ is even,}\\\ 3x+1 &\text{if }x\text{ is odd.} \end{cases}$$

The Collatz sequence of some $n\in\mathbb N$ is the infinite sequence $(f^i(n))_{i=0}^\infty$ in which the element at index $i$ is given by the $i$-th recursive application of the Collatz function evaluated at the starting value $n$. A cycle is a finite contiguous subsequence $(a_i)_{i=0}^k$ of a Collatz sequence in which every element is distinct and $f(a_k)=a_0$. The trivial cycle is the cycle $(1,4,2)$.

If the Collatz sequence of $n$ contains some element less than $n$, then the index of the first such element is the stopping time of $n$. So $k$ is the stopping time of $n$ iff $f^k(n)<n$ and $f^i(n)\ge n$ for all $i<k$. Similarly, if the Collatz sequence of $n$ contains the value $1$, then the index of the first element of value $1$ is the total stopping time of $n$. So $k$ is the total stopping time of $n$ iff $f^k(n)=1$ and $f^i(n)>1$ for all $i<k$.

The following additional notation and terminology will be used. A step is a single application of the Collatz function. The step count function $s:\mathbb N\to\mathbb N^*$ outputs the total stopping time of the argument, if it exists. Otherwise, it outputs positive infinity.

The Collatz Conjecture posits that every natural number has a finite total stopping time. This means

$$\forall n\in\mathbb N\,,\;\exists k\in\mathbb N\,,\;f^k(n)=1\,.$$

The Simulation

Collatz Conjecture Simulator aims to find the starting values with the greatest total stopping times. That is, positive integers $n$ such that $s(n)>s(m)$ for all $m<n$. Total stopping times are calculated by iterating through Collatz sequences and counting each step until a value of $1$ is found. Because each such iteration is computationally independent, the program uses the GPU to iterate through multiple Collatz sequences in parallel. The GPU is accessed via the Vulkan API and uses compute shaders to perform the iterations. Collatz Conjecture Simulator is primarily written in C, however the shaders are written in GLSL and compiled to SPIR-V.

Program Requirements

The general environment and system requirements that must be met for Collatz Conjecture Simulator to build and run correctly are listed below. The full requirements of the GPU are given in device_requirements.md.

  • CMake 3.24.
  • pthreads.
  • glslang.
  • SPIR-V Tools.
    • spirv-link
    • spirv-opt
    • spirv-dis
  • C11.
    • _Atomic (Optional C11 feature)
    • __int128 (GNU C extension)
  • Vulkan 1.1.
    • VK_KHR_copy_commands2
    • VK_KHR_map_memory2
    • VK_KHR_synchronization2
    • VK_KHR_timeline_semaphore

Building and Running

Collatz Conjecture Simulator is built via CMake. Comprehensive documentation regarding usage of CMake can be found at the CMake website. To generate the build system, navigate the terminal to the project directory and execute the following command.

cmake -S . -B build

Several options can be specified to customise the build system by appending -D OPTION=CONFIG to the above command.

  • CMAKE_BUILD_TYPE specifies the build variant and can be set to Debug, Release, MinSizeRel, or RelWithDebInfo. If not set, it defaults to Debug.
  • EXCESS_WARNINGS specifies whether to compile the program with a potentially excessive amount of warnings, and defaults to OFF.
  • STATIC_ANALYSIS specifies whether to statically analyse the program during compilation if compiling with GCC, and defaults to OFF.
  • DEBUG_SHADERS specifies whether to include debug information in generated SPIR-V, and defaults to OFF.
  • OPTIMISE_SHADERS specifies whether to optimise generated SPIR-V using spirv-opt, and defaults to ON.
  • DISASSEMBLE_SHADERS specifies whether to disassemble generated SPIR-V using spirv-dis, and defaults to OFF.

Once the above command has finished, a build directory will have been created containing the build system. To now build Collatz Conjecture Simulator, execute the following command.

cmake --build build

By default, only the executable will be built. To instead build the SPIR-V, add --target spirv. To build both, also add --target cltz. To specify the build configuration, add --config CONFIG (only applies for multi-config generators).

The above command will create a bin directory containing the SPIR-V and executable. If built in debug, the executable will be named cltz-dbg. Otherwise, it will be named cltz. The executable must be run from within the bin directory, else it will be unable to locate the generated SPIR-V.

The executable provides a command line interface and uses the initial command line parameters to specify the operation of the program. Parameters beginning with a double hyphen (--) reference options. Some options themselves accept a parameter, which must be given immediately following the option as the next CLI parameter. To view a comprehensive list of options, use the -h or --help option.

In most cases, the executable will initiate the program's main loop. If during this process the Enter or Return keys are pressed, the program will break from the main loop and begin to exit. Each iteration of the main loop will output information regarding the computations performed, most prominently including benchmarking data for various subprocesses.

Common Problems

If running the program results in a VK_ERROR_DEVICE_LOST error message, it may be due to the compute shaders taking too long to execute. On many operating systems and graphics drivers, if the GPU spends too much time processing a single request, the operation can timeout to prevent the GPU from freezing. Such a scenario is explicitly mentioned in the Vulkan specification (version 1.4.326, section 5.2.3 Lost Device).

Typical reasons for device loss will include things like execution timing out (to prevent denial of service), power management events, platform resource management, implementation errors.

In this case, the error can be fixed by running the program with a lower proportion of accessible GPU memory, resulting in less computation time per compute dispatch. This is done by adding the --max-memory option, followed by the proportion of available GPU memory that will be accessible to the program. For example, --max-memory 0.2 means the program will use at most 20% of the available GPU memory.

Inout-buffers

To facilitate this use of the GPU, inout-buffers are used. Inout-buffers are ranges of GPU memory within VkBuffer objects and consist of an in-buffer and out-buffer. In-buffers are shader storage buffer objects (SSBOs) and contain an array of 128-bit unsigned integer starting values. Out-buffers are also SSBOs and contain an array of 16-bit unsigned integer total stopping times.

The main loop consists of the CPU writing starting values to in-buffers; the GPU reading starting values from in-buffers, iterating through Collatz sequences, and writing total stopping times to out-buffers; and the CPU reading total stopping times from out-buffers. The number of inout-buffers is dependent on the system's specifications. There are one or more inout-buffers per VkBuffer object, one VkBuffer object per VkDeviceMemory object, and two or more VkDeviceMemory objects.

Collatz Conjecture Simulator attempts to minimise the time spent idle by the CPU and GPU due to one waiting for the other. Such as the GPU waiting for starting values, or the CPU waiting for total stopping times. This is done by having an even number of VkDeviceMemory objects, where half contain memory close to the GPU (device local memory) and half contain memory visible to both the CPU and GPU (host visible memory). There are thus four types of memory ranges: host visible in-buffers (HV-in), host visible out-buffers (HV-out), device local in-buffers (DL-in), and device local out-buffers (DL-out).

Rather than the CPU and GPU taking turns executing, both processors spend time running in parallel. The CPU reads from and writes to host visible inout-buffers, and the GPU reads from and writes to device local inout-buffers, simultaneously. Starting values are written to HV-in, copied from HV-in to DL-in, and read from DL-in. Total stopping times are written to DL-out, copied from DL-out to HV-out, and read from HV-out.

flowchart LR
  CPU-->In-buffer
  In-buffer-->GPU
  GPU-->Out-buffer
  Out-buffer-->CPU
  subgraph In-buffer
    direction LR
    HV-in-->DL-in
  end
  subgraph Out-buffer
    direction BT
    DL-out-->HV-out
  end
Loading

Starting Value Selection

A property of the step count function is that if a starting value $n$ has a Collatz sequence containing the starting value $m$ at index $k$, then $s(n)$ can be expressed as the sum of $s(m)$ and $k$. That is, $f^k(n)=m$ implies $s(n)=s(m)+k$. It can be mathematically demonstrated that particular starting values will always have Collatz sequences with finite stopping times. So for such starting values $n$, there must be some index $k$ such that $f^k(n)=m$ for some $m<n$. As such, by knowing the values of $s(m)$ and $k$, it is possible to calculate the total stopping time of $n$ as their sum rather than by iterating through the Collatz sequence of $n$.

This can be advantageous since a single addition operation is significantly less computationally intensive than iterating through a Collatz sequence. It can also allow for greater parallelism if such calculations are performed by the CPU, as this would have both the CPU and GPU calculating total stopping times and thereby more evenly splitting the workload between the two processors. For these reasons, Collatz Conjecture Simulator partitions the workload for calculating total stopping times between the CPU and GPU according to this logic. In particular, two sets of starting values are allocated to the CPU.

The first set is the even starting values; all $n\in\mathbb N$ such that $n\equiv0\pmod2$ and hence $n=2m$ for some $m\in\mathbb N$. A single step from $n$ results in $f(2m)=m$, meaning $s(2m)=s(m)+1$.

The second set is the starting values $n$ such that $n\equiv1\pmod4$ and hence $n=4m+1$ for some $m\in\mathbb N$. Stepping three times from $n$ results in $f^3(4m+1)=f^2(12m+4)=f(6m+2)=3m+1$, meaning $s(4m+1)=s(3m+1)+3$.

The union of these two sets excludes only starting values $n$ such that $n\equiv3\pmod4$, whose total stopping times are calculated by the GPU. The remaining starting values' total stopping times are calculated by the CPU.

Licensing

Collatz Conjecture Simulator is licensed under version 3 of the GNU General Public License (GPLv3). A copy of this licence should be included with Collatz Conjecture Simulator. Otherwise, the licence may be viewed at the GNU website.

The building of Collatz Conjecture Simulator is dependent on a number of external works, each with their own licence or licences. The below table lists each such work with its corresponding licence or licences.

Work Licences
volk MIT
Vulkan-Headers MIT, Apache-2.0
Vulkan-Utility-Libraries Apache-2.0

Artificial Intelligence

The author of Collatz Conjecture Simulator is not a lawyer, but strongly believes the usage of GPLv3-licensed works in the training and development of proprietary AI is necessarily violating of said licence. However, in the event the GPLv3 does not in itself encompass the above restriction, the following shall apply.

Collatz Conjecture Simulator includes in its terms and conditions regarding copying, distribution, and modification, in addition to those provided by version 3 of the GNU General Public Licence, the strict prohibition of its usage by artificial intelligence software not licensed in their entirety, as a whole, under version 3 or later of the GNU General Public Licence, including but not limited to the training, prompting, or generation of such artificial intelligence models or algorithms.