Objective
This PR serves as the foundation for both orbit and state arrays. It focuses on the functionality provided by the `core` module and its invocation. All relevant `core` functions are designed to work equally on CPUs and GPUs as either universal functions or generalized universal functions. As a side effect, all relevant `core` functions allow parallel operation with full broadcasting semantics.

Summary
All dependencies of `Orbit` and state classes on `core` are refactored as follows:

- Functions decorated by `numba.vectorize` and `numba.guvectorize` serve as the only interface between regular uncompiled Python code and `core`.
- Functions decorated by `numba.vectorize` and `numba.guvectorize` only call functions decorated by `numba.jit`/`numba.cuda.jit`.
- Functions decorated by `numba.jit`/`numba.cuda.jit` can only call each other.
- Functions decorated by `numba.vectorize`, `numba.guvectorize` and `numba.jit`/`numba.cuda.jit` may use the `math` module, but not `numpy`, except for certain details like enforcing floating point precision.

The above-mentioned "hierarchy" of decorators is imposed by CUDA compatibility. While functions decorated by
`numba.jit` (targets `cpu` and `parallel`) can be called from uncompiled Python code, functions decorated by `numba.cuda.jit` (target `cuda`) are considered "device functions" and cannot be called by uncompiled Python code directly. They are supposed to be called only by CUDA kernels or other device functions (this slightly simplifies the actual situation as implemented by `numba`). If the target is set to `cuda`, functions decorated by `numba.vectorize` and `numba.guvectorize` become CUDA kernels.

Eliminating
`numpy` as a dependency serves two purposes: it contributes to CUDA compatibility, and it additionally makes the code significantly faster on CPUs.

New decorators are introduced, wrapping
`numba.jit`, `numba.cuda.jit`, `numba.vectorize` and `numba.guvectorize`, centralizing compiler options and target switching (`cpu`, `parallel` or `cuda`) as well as simplifying typing:

- `vjit`: Wraps `numba.vectorize`. Functions decorated by it carry the suffix `_vf`.
- `gjit`: Wraps `numba.guvectorize`. Functions decorated by it carry the suffix `_gf`.
- `hjit`: Wraps `numba.jit` or `numba.cuda.jit`, depending on compiler target. Functions decorated by it carry the suffix `_hf`.
- `djit`: Variation of `hjit` with fixed function signature for user-provided functions used by `Cowell`.

All mentioned wrappers are found in
`core.jit`.

As a result of the name suffixes, a number of `core` module functions have been renamed, making the package intentionally backwards-incompatible. Functions not yet using the new infrastructure can be recognized by their lack of a suffix. `core` functions dynamically generating (and compiling) other functions carry `_hb`, `_vb` and `_gb` suffixes.

Math
The former `_math` module has become a first-class citizen as `core.math`, fully compiled by the above-mentioned infrastructure.

All compiled code now enforces a single floating point precision level, which can be configured by users. The default is FP64 / double precision. For simplicity, the type shortcut is
`f`. Additional infrastructure can be found in `core.math.ieee754`.

`core.math` contains a number of replacements for `numpy` operations, mostly found in `core.math.linalg`. None of those functions allocate memory, and all are free of side effects, i.e. they do not modify their parameters. 3D vectors are expressed as tuples (type shortcut `V`, replacing `Tuple([f,f,f])`). Matrices are expressed as tuples of tuples (type shortcut `M`, replacing `Tuple([V,V,V])`).

`core.math` also replaces (some) required `scipy` functions:

- scipy.interpolate.interp1d is replaced by `core.math.interpolate.interp_hb`. It custom-compiles 1D linear interpolators, embedding data statically into the compiled functions.
- scipy.integrate.solve_ivp, scipy.integrate.DOP853 and scipy.optimize.brentq are replaced by `core.math.ivp`.

Style
`core` modules now explicitly export APIs via `__all__`.

Settings
This PR introduces a new `settings` module. It currently serves to control compiler options, e.g. the compiler target (`cpu`, `parallel` or `cuda`). Settings can be changed by either setting environment variables or importing the `settings` module before any other (sub-)module is imported.

Logging
This PR introduces basic logging functionality. The logger name equals the package name. The logger can also be imported from the new `debug` module. At the moment, it only logs compiler status messages and issues.

Blocking `numba` issues for CUDA-compatibility
- `math.nextafter`: `nextafter` (via both `math` and `numpy`) missing for CUDA numba/numba#9435
- `guvectorize`: Allow multiple outputs for `guvectorize` on CUDA target numba/numba#8303
- `__name__` attribute for 'CUDAUFuncDispatcher': 'CUDAUFuncDispatcher' object has no attribute '__name__' numba/numba#8272

Non-blocking `numba` issues for CUDA-compatibility with workaround present
- `numba.cuda.jit`: numba/numba#7870

Non-blocking `numba` issues unrelated to CUDA with workaround present
- `() -> (n)` in output argument: @guvectorize not accepting new size variable (i.e. () -> (n)) in output argument numba/numba#2797

TODO
- `core.math.ivp`

📚 Documentation preview 📚: https://hapsira--7.org.readthedocs.build/en/7/
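
As an illustration of the tuple-based vector and matrix convention described under Math, here is a minimal sketch in plain Python. It is not the code from `core.math.linalg`: the function names `cross_VV`, `norm_V` and `matmul_MV` are hypothetical, the aliases `V` and `M` are plain `typing` stand-ins for the type shortcuts of the same name, and the `hjit` decorator is deliberately omitted so that the sketch runs without `numba`. It only shows the style the PR describes: tuples in, tuples out, only the `math` module, and no modification of the arguments.

```python
import math
from typing import Tuple

# Stand-ins for the PR's type shortcuts (assumption: plain typing aliases,
# not the numba types used by the real package).
V = Tuple[float, float, float]  # 3D vector
M = Tuple[V, V, V]              # 3x3 matrix as a tuple of row vectors


def cross_VV(a: V, b: V) -> V:
    """Cross product of two 3D vectors; returns a new tuple."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def norm_V(a: V) -> float:
    """Euclidean norm of a 3D vector, using only the math module."""
    return math.sqrt(a[0] * a[0] + a[1] * a[1] + a[2] * a[2])


def matmul_MV(m: M, v: V) -> V:
    """Matrix-vector product; each output component is one row dotted with v."""
    return (
        m[0][0] * v[0] + m[0][1] * v[1] + m[0][2] * v[2],
        m[1][0] * v[0] + m[1][1] * v[1] + m[1][2] * v[2],
        m[2][0] * v[0] + m[2][1] * v[1] + m[2][2] * v[2],
    )


if __name__ == "__main__":
    x = (1.0, 0.0, 0.0)
    y = (0.0, 1.0, 0.0)
    # Rotation by 90 degrees about the z axis.
    rot_z = ((0.0, -1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
    print(cross_VV(x, y))
    print(norm_V((3.0, 4.0, 0.0)))
    print(matmul_MV(rot_z, x))
```

Because tuples are immutable, functions written this way cannot change their inputs, which matches the "free of side effects" property stated above. Fixed-size tuples of scalars are also a shape that compiled code can typically handle without array allocation.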