Commit 415abbf

add two_pow member function to MontgomeryForm, and improve asm
1 parent 1def335 commit 415abbf

47 files changed: 7343 additions & 1501 deletions

CMakeLists.txt

Lines changed: 6 additions & 19 deletions
@@ -8,7 +8,8 @@ if(TARGET hurchalla_modular_arithmetic)
     return()
 endif()
 
-cmake_minimum_required(VERSION 3.14)
+# later versions are probably fine, but are untested
+cmake_minimum_required(VERSION 3.14...4.03)
 
 project(hurchalla_modular_arithmetic VERSION 1.0.0 LANGUAGES CXX)
 
@@ -19,30 +20,18 @@ if(CMAKE_CURRENT_SOURCE_DIR STREQUAL CMAKE_SOURCE_DIR)
 endif()
 
 
-# TODO: this section seems slightly messy for detecting/setting up testing
-# --------------------
-option(TEST_HURCHALLA_LIBS
-       "Build the tests for all Hurchalla library projects."
-       OFF)
-
 # if this is the top level CMakeLists.txt, add testing options, and enable
 # testing when testing options have been set to ON.
 if(CMAKE_CURRENT_SOURCE_DIR STREQUAL CMAKE_SOURCE_DIR)
     option(TEST_HURCHALLA_MODULAR_ARITHMETIC
            "Build the tests for the Hurchalla modular arithmetic library project."
            ON)
-    if(TEST_HURCHALLA_MODULAR_ARITHMETIC OR TEST_HURCHALLA_LIBS)
+    option(FORCE_TEST_HURCHALLA_CPP11_STANDARD
+           "If testing this library, ensure we build googletest and tests using -std=c++11")
+    if(TEST_HURCHALLA_MODULAR_ARITHMETIC)
         enable_testing()
         # include(CTest)
     endif()
-elseif(TEST_HURCHALLA_LIBS)
-    # If TEST_HURCHALLA_LIBS is set to ON, enable_testing() should have been
-    # called either directly or indirectly by the top level project. (Note that
-    # if a project calls include(CTest), the included CTest.cmake defines a
-    # BUILT_TESTING option and calls enable_testing if BUILD_TESTING is ON.)
-    if (NOT CMAKE_TESTING_ENABLED)
-        message(FATAL_ERROR "Fatal error: TEST_HURCHALLA_LIBS was set, but enable_testing() was never called")
-    endif()
 endif()
 
 
@@ -92,9 +81,7 @@ target_include_directories(hurchalla_modular_arithmetic
 
 # if this is the top level CMakeLists.txt
 if(CMAKE_CURRENT_SOURCE_DIR STREQUAL CMAKE_SOURCE_DIR)
-    if(TEST_HURCHALLA_MODULAR_ARITHMETIC OR TEST_HURCHALLA_LIBS)
+    if(TEST_HURCHALLA_MODULAR_ARITHMETIC)
         add_subdirectory(test)
     endif()
-elseif(TEST_HURCHALLA_LIBS)
-    add_subdirectory(test)
 endif()

README.md

Lines changed: 16 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,22 @@
1-
# The "Clockwork" Modular Arithmetic Library
1+
# The Clockwork Modular Arithmetic library
22

33
![Alt text](images/clockxtrasmall_border2.jpg?raw=true "Clock Gears, photo by Krzysztof Golik, licensed CC BY-SA 4.0")
44

5-
Clockwork is a high performance, easy to use Modular Arithmetic library for C++ provided as a "header-only" library, for up to 128 bit integer types, with extensive support for Montgomery arithmetic. If you want or need Montgomery arithmetic in this range, or general modular arithmetic functions, Clockwork is almost certainly the fastest and easiest library you could use. For best performance make sure you define the standard C++ macro NDEBUG.
5+
Clockwork is a high performance, easy to use Modular Arithmetic library for C++ provided as a "header-only" library, supporting up to 128 bit integer types, and providing extensive support for Montgomery arithmetic. If you want or need Montgomery arithmetic in this range, or general modular arithmetic functions, Clockwork is almost certainly the fastest and easiest library you could use. For best performance make sure you define the standard C++ macro NDEBUG.
6+
7+
The library requires only C++11, and works with all higher versions of the C++ standard.
68

79
## Design goals
810

9-
Clockwork is designed to be a flexible and bulletproof library with the best performance achievable for modular arithmetic of native (on the CPU) integer types. For integer types that are double the native bit width (e.g. 128 bit), performance should still be close to ideal, though not as completely optimized as for native types. Larger than 128 bit types are permissible by [specialization](https://github.com/hurchalla/util/blob/master/include/hurchalla/util/traits/ut_numeric_limits.h); however a library like [GMP](https://gmplib.org/) is likely to be a better choice for such sizes.
11+
Clockwork is designed to be a flexible and bulletproof library with the best performance achievable for modular arithmetic using standard C++ language integer types (e.g. uint32_t or uint64_t) and the language extension types \_\_uint128_t and \_\_int128_t. Larger than 128 bit types are permissible by [specialization](https://github.com/hurchalla/util/blob/master/include/hurchalla/util/traits/ut_numeric_limits.h); however a library like [GMP](https://gmplib.org/) is likely to be a better choice for such sizes.
12+
13+
## Requirements
14+
15+
The Clockwork library requires only compiler support for C++11, which is essentially supported universally at this point.
16+
17+
For good performance you *must* ensure that the standard macro NDEBUG (see <cassert>) is defined when compiling.
18+
19+
Compilers that are confirmed to build this library without warnings or errors on Ubuntu linux (x64) include clang6, clang10, clang18, gcc7, gcc10, gcc13, and intel compiler 19. On Windows, Microsoft Visual C++ 2017, 2019, 2022 are all confirmed to build the library without warnings or errors. On MacOS, clang16 and gcc14 are confirmed to build without warnings or errors. The library is intended for use on all architectures (e.g. x86/64, ARM, RISC-V), but has so far been tested only with x86, x64 (Windows and Ubuntu), and ARM64 (MacOS).
1020

1121
## Status
1222

@@ -36,7 +46,7 @@ It may help to see a simple [example project with CMake](examples/example_with_c
 
 ### Without CMake
 
-If you're not using CMake for your project, you'll need to install/copy Clockwork's modular arithmetic headers and dependencies to some directory in order to use them. To do this, first clone this git repository onto your system. You'll need CMake on your system (at least temporarily), so install CMake if you don't have it. Then from your shell run the following commands:
+If you're not using CMake for your project, you'll need to install Clockwork's modular arithmetic headers and its dependencies to some directory in order to use them. To do this, first clone this git repository onto your system. You'll need to have CMake (at least temporarily) on your system, so install CMake if you don't have it. Then from your shell run the following commands:
 
 >cd *path_of_the_cloned_modular_arithmetic_repository*
 >mkdir tmp
@@ -63,9 +73,9 @@ From the modular_arithmetic group, the files *absolute_value_difference.h*, *mod
 *hurchalla::modular_addition_prereduced_inputs(T a, T b, T modulus)*. Returns (a+b)%modulus, performed as if a and b have infinite precision and thus as if (a+b) is never subject to integer overflow.
 *hurchalla::modular_multiplication_prereduced_inputs(T a, T b, T modulus)*. Returns (a\*b)%modulus, performed as if a and b have infinite precision.
 *hurchalla::modular_multiplicative_inverse(T a, T modulus)*. Returns the multiplicative inverse of a if it exists, and otherwise returns 0.
-*hurchalla::modular_pow(T base, T exponent, T modulus)*. Returns the modular exponentiation of base^exponent (mod modulus).
+*hurchalla::modular_pow(T base, T exponent, T modulus)*. Returns the modular exponentiation of base to the exponent (mod modulus).
 
-From the montgomery_arithmetic group, the file *MontgomeryForm.h* provides the easy to use (and zero cost abstraction) class *hurchalla::MontgomeryForm*, which has member functions for effortlessly performing operations in the Montgomery domain. These operations include converting to/from Montgomery domain, add, subtract, multiply, square, [fused-multiply-add/sub](https://jeffhurchalla.com/2022/05/01/the-montgomery-multiply-accumulate), pow, gcd, and more. For improved performance in some situations, the file *montgomery_form_aliases.h* provides simple aliases for faster (with limitations on allowed modulus) instantiations of the class MontgomeryForm.
+From the montgomery_arithmetic group, the file *MontgomeryForm.h* provides the easy to use (and zero cost abstraction) class *hurchalla::MontgomeryForm*, which has simple member functions for performing operations in the Montgomery domain. These operations include converting to/from Montgomery domain, add, subtract, multiply, square, [fused-multiply-add/sub](https://jeffhurchalla.com/2022/05/01/the-montgomery-multiply-accumulate), pow, gcd, and more. For improved performance, if you can guarantee your modulus will be under half or under a quarter of the maximum value of your integer type T, the file *montgomery_form_aliases.h* provides aliases of the class MontgomeryForm which typically run ~5-10% faster.
 
 For an easy demonstration of MontgomeryForm, you can see one of the [examples](examples/example_without_cmake).
 
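The README text in this diff describes the "prereduced inputs" functions as returning (a+b)%modulus and (a\*b)%modulus as if the operands had infinite precision. As a rough illustration of what that contract means, here is a minimal sketch. The names `sketch_mod_add` and `sketch_mod_mul` are hypothetical helpers written for this note; they are not the library's functions and are not taken from its implementation. The sketch assumes both inputs are already reduced (less than the modulus), and the multiply handles only 32 bit operands by widening to 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the library's code): computes (a + b) % m for
// inputs that are already reduced (a < m and b < m), without ever forming
// a sum that could overflow uint64_t.
inline std::uint64_t sketch_mod_add(std::uint64_t a, std::uint64_t b,
                                    std::uint64_t m)
{
    assert(m > 0 && a < m && b < m);
    // m - b cannot underflow because b < m.
    std::uint64_t gap = m - b;
    // If a >= gap then a + b >= m, so the reduced sum is a - gap.
    // Otherwise a + b < m, so the plain sum cannot overflow.
    return (a >= gap) ? (a - gap) : (a + b);
}

// Hypothetical helper (not the library's code): computes (a * b) % m for
// 32 bit reduced inputs by widening the product to 64 bits, so the
// intermediate product is exact.
inline std::uint32_t sketch_mod_mul(std::uint32_t a, std::uint32_t b,
                                    std::uint32_t m)
{
    assert(m > 0 && a < m && b < m);
    std::uint64_t product = static_cast<std::uint64_t>(a) * b;
    return static_cast<std::uint32_t>(product % m);
}
```

The addition avoids overflow by comparing against `m - b` rather than computing `a + b` first, which matters when the modulus is close to the maximum value of the type. Widening, as in the multiply, only works while a larger native type exists.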

build_tests.sh

Lines changed: 21 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -526,12 +526,29 @@ exit_on_failure () {
 script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
 
 
+# We don't usually want to force c++11 standard, since that requires that we use
+# an older version of googletest that was the final version to support C++11.
+# That googletest version's CMakeLists.txt isn't updated for more recent CMake
+# versions, and so we have to work around CMake deprecation warnings and errors
+# (you can see how this is done in FetchGoogleTest.cmake), which isn't a good
+# normal practice to do. However, it is good to prove that our library is C++11
+# compatible from time to time. To force c+11, change the line below to true.
+force_cpp11=false
+
+
+if [ "$force_cpp11" = true ]; then
+    force_cpp11_testing="ON"
+else
+    force_cpp11_testing="OFF"
+fi
+
 
 if [ "${mode,,}" = "release" ]; then
     pushd script_dir > /dev/null 2>&1
     build_dir=build/release_$compiler_name$compiler_version
     mkdir -p $build_dir
-    cmake -S. -B./$build_dir -DTEST_HURCHALLA_LIBS=ON \
+    cmake -S. -B./$build_dir -DTEST_HURCHALLA_MODULAR_ARITHMETIC=ON \
+            -DFORCE_TEST_HURCHALLA_CPP11_STANDARD=$force_cpp11_testing \
             -DCMAKE_BUILD_TYPE=Release \
             -DCMAKE_CXX_FLAGS="$cpp_standard $cpp_stdlib \
             $test_avoid_cselect $test_heavyweight \
@@ -546,7 +563,8 @@ elif [ "${mode,,}" = "debug" ]; then
     pushd script_dir > /dev/null 2>&1
     build_dir=build/debug_$compiler_name$compiler_version
     mkdir -p $build_dir
-    cmake -S. -B./$build_dir -DTEST_HURCHALLA_LIBS=ON \
+    cmake -S. -B./$build_dir -DTEST_HURCHALLA_MODULAR_ARITHMETIC=ON \
+            -DFORCE_TEST_HURCHALLA_CPP11_STANDARD=$force_cpp11_testing \
             -DCMAKE_BUILD_TYPE=Debug \
             -DCMAKE_EXE_LINKER_FLAGS="$clang_ubsan_link_flags" \
             -DCMAKE_CXX_FLAGS="$cpp_standard $cpp_stdlib \
@@ -566,17 +584,11 @@ fi
 
 
 # -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON
-# cmake -S. -B./build_tmp -DCMAKE_CXX_FLAGS="-std=c++17" -DTEST_HURCHALLA_LIBS=ON -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_COMPILER=icpc -DCMAKE_C_COMPILER=icc
+# cmake -S. -B./build_tmp -DCMAKE_CXX_FLAGS="-std=c++17" -DTEST_HURCHALLA_MODULAR_ARITHMETIC=ON -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_COMPILER=icpc -DCMAKE_C_COMPILER=icc
 # cmake --build ./build_tmp --config Debug
 
 
 if [ "$run_tests" = true ]; then
-#    ./$build_dir/test_ndebug_programming_by_contract --gtest_break_on_failure
-#    exit_on_failure
-#    ./$build_dir/test_programming_by_contract --gtest_break_on_failure
-#    exit_on_failure
-    ./$build_dir/test_hurchalla_util --gtest_break_on_failure
-    exit_on_failure
     ./$build_dir/test_hurchalla_modular_arithmetic --gtest_break_on_failure
     exit_on_failure
 fi

modular_arithmetic/CMakeLists.txt

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,8 @@ if(TARGET hurchalla_basic_modular_arithmetic)
88
return()
99
endif()
1010

11-
cmake_minimum_required(VERSION 3.14)
11+
# later versions are probably fine, but are untested
12+
cmake_minimum_required(VERSION 3.14...4.03)
1213

1314
project(hurchalla_basic_modular_arithmetic VERSION 1.0.0 LANGUAGES CXX)
1415

@@ -73,7 +74,7 @@ include(FetchContent)
 FetchContent_Declare(
     hurchalla_util
     GIT_REPOSITORY https://github.com/hurchalla/util.git
-    GIT_TAG        master
+    GIT_TAG        9163344ee69f21a21cb8928dd30fc2ff15e94f0c
 )
 FetchContent_MakeAvailable(hurchalla_util)
 

modular_arithmetic/include/hurchalla/modular_arithmetic/detail/impl_modular_pow.h

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -22,11 +22,13 @@ namespace hurchalla { namespace detail {
2222
// For details, see http://en.wikipedia.org/wiki/Modular_exponentiation
2323
// note: uses a static member function to disallow ADL.
2424
struct impl_modular_pow {
25-
template <typename T>
26-
HURCHALLA_FORCE_INLINE static T call(T base, T exponent, T modulus)
25+
template <typename T, typename U>
26+
HURCHALLA_FORCE_INLINE static T call(T base, U exponent, T modulus)
2727
{
2828
static_assert(ut_numeric_limits<T>::is_integer, "");
2929
static_assert(!(ut_numeric_limits<T>::is_signed), "");
30+
static_assert(ut_numeric_limits<U>::is_integer, "");
31+
static_assert(!(ut_numeric_limits<U>::is_signed), "");
3032
HPBC_PRECONDITION2(modulus > 1);
3133

3234
namespace hc = ::hurchalla;
@@ -51,7 +53,7 @@ struct impl_modular_pow {
     T result = hc::conditional_select((exponent & 1u), base, static_cast<T>(1));
     while (exponent > 1)
     {
-      exponent = static_cast<T>(exponent >> 1);
+      exponent = static_cast<U>(exponent >> 1);
       base = hc::modular_multiplication_prereduced_inputs(base, base, modulus);
       if (exponent & 1u) {
         result = hc::modular_multiplication_prereduced_inputs(
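
The change to impl_modular_pow.h gives the exponent its own unsigned type U, so the exponent may be wider (or narrower) than the base and modulus type T. The following is a minimal standalone sketch of the same right-to-left binary exponentiation loop with that two-type signature. The name `sketch_modular_pow` is hypothetical, the standard `<type_traits>` checks and `assert` stand in for the library's `ut_numeric_limits` and `HPBC_PRECONDITION2`, and the multiply is a simple 64 bit widening that limits T to at most 32 bits; none of this is the library's actual implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical sketch (not the library's code): modular exponentiation where
// the exponent type U is independent of the base/modulus type T.
template <typename T, typename U>
T sketch_modular_pow(T base, U exponent, T modulus)
{
    static_assert(std::is_unsigned<T>::value, "T must be an unsigned integer");
    static_assert(std::is_unsigned<U>::value, "U must be an unsigned integer");
    // Assumption of this sketch only: products are widened to 64 bits.
    static_assert(sizeof(T) <= 4, "sketch supports T of at most 32 bits");
    assert(modulus > 1);

    // Exact modular multiply of two reduced values via a 64 bit product.
    auto mulmod = [modulus](T x, T y) -> T {
        return static_cast<T>((static_cast<std::uint64_t>(x) * y) % modulus);
    };

    base = static_cast<T>(base % modulus);
    // Bit 0 of the exponent selects base or 1 as the starting result.
    T result = (exponent & 1u) ? base : static_cast<T>(1);
    while (exponent > 1) {
        // Shift in the exponent's own type U, mirroring the fix in the diff.
        exponent = static_cast<U>(exponent >> 1);
        base = mulmod(base, base);
        if (exponent & 1u)
            result = mulmod(result, base);
    }
    return result;
}
```

With this signature, a call such as `sketch_modular_pow(std::uint32_t(2), std::uint64_t(8589934580), std::uint32_t(4294967291))` can pass an exponent that does not fit in T at all, which a single-type `call(T, T, T)` signature could not express.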