Update ASTRA integration #703
This includes supporting CuPy inputs (via zero-copy GPU data exchange).
ci: add ASTRA to dev-gpu requirements
Thanks @askorikov! Let me reply to the questions below and then I will do a more general review of the code.
It is kind of expected that the code becomes a bit more complicated for operators having to handle multiple backends when it's not a pure NumPy/CuPy switch, so in principle I have no problem with it. So far PyLops' philosophy has been that we don't want operators to be magic black boxes, so we want users to always provide NumPy arrays when the operator works on CPU and CuPy (or JAX) arrays for GPU. In special cases where you cannot solve the entire problem on the GPU, we have an auxiliary operator called
Well, in general we strive for forward-adjoint pairs that pass the dot test: for fp64 quite tightly (say atol=1e-6), and for fp32 of course much less strictly. If you know that your operator isn't perfectly matched, we can still add it along with a test using the threshold you expect. I would prefer that to skipping the test (or, at the very least, if you do skip it I would like the tests to run both the forward and adjoint somehow, to make sure they at least run 😄)
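For readers unfamiliar with the dot test mentioned here: it checks that a forward/adjoint pair satisfies ⟨Ax, y⟩ ≈ ⟨x, Aᴴy⟩ for random vectors. PyLops ships its own utility for this (`pylops.utils.dottest`); the snippet below is a self-contained NumPy sketch of the idea, not the PyLops implementation.

```python
import numpy as np

def dot_test(matvec, rmatvec, n, m, rtol=1e-6, seed=0):
    """Check <A x, y> ~= <x, A^T y> for random real x (size m), y (size n)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(m)
    y = rng.standard_normal(n)
    lhs = np.dot(matvec(x), y)   # <A x, y>
    rhs = np.dot(x, rmatvec(y))  # <x, A^T y>
    return abs(lhs - rhs) <= rtol * max(abs(lhs), abs(rhs), 1.0)

# An explicit matrix is an exactly matched forward/adjoint pair
A = np.random.default_rng(1).standard_normal((5, 7))
assert dot_test(lambda x: A @ x, lambda y: A.T @ y, n=5, m=7)
```

An unmatched adjoint (as can happen with ASTRA's CUDA projectors, where forward and backprojection use different discretizations) would fail this check at tight tolerances, which is why a looser threshold or a run-only test is proposed above.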
See above; in general we do not assert, but we have this as a convention. In other words, if one is expected to pass a CuPy array, the other input parameters of the operator should also be passed as CuPy arrays - though I don't think this applies to
We do not aim at 100% coverage with JAX - see this table: https://pylops.readthedocs.io/en/latest/gpu.html#supported-operators. So if you support it, great; if not, we just need to make sure the row for CT2D is updated accordingly.
Mmh I guess this was me making this choice... I trust your judgement of what you think is best. Maybe we can just follow the ASTRA default?
I think this is fine. I would suggest we do something like that, so that when we chain operators and one wants to use fp64, we don't break the chain by all of a sudden passing out fp32. We can add a note to the docs saying that even if you pass fp64, the internal operations are done in fp32.
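The suggestion above can be sketched as a thin wrapper: cast to fp32 for the backend, then cast the result back to the caller's dtype so downstream operators in a chain keep seeing what they passed in. Here `backend` is a hypothetical stand-in for an ASTRA projection call that only accepts and returns float32.

```python
import numpy as np

def apply_fp32_backend(x, backend):
    """Run a float32-only backend while preserving the caller's dtype.

    `backend` is a placeholder for a routine (e.g. an ASTRA projection)
    that internally works in float32 only.
    """
    in_dtype = x.dtype
    y32 = backend(np.asarray(x, dtype=np.float32))
    # Cast back so chained operators keep seeing the original dtype,
    # even though precision beyond fp32 was lost internally.
    return y32.astype(in_dtype, copy=False)

x = np.ones(4, dtype=np.float64)
y = apply_fp32_backend(x, lambda a: 2.0 * a)
assert y.dtype == np.float64
```

Note that the round trip preserves dtype but not precision, which is exactly what the proposed documentation note would warn about.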
Very nice addition!
Overall the code changes look great, just left a few minor suggestions 😄
* Separate basic functionality and adjointness tests; disable adjointness tests for the CUDA backend of ASTRA
* Add tests for data that is not natively compatible with ASTRA
* Remove projector_type tests (too messy for the purpose, and the CUDA backend doesn't support them at all)
* Use fixture pattern
* Probably related to float32 dtype used by ASTRA
2.2+ is needed for NumPy 2 support, 2.3+ for GPU data exchange support
A couple more questions:
But in practice we are not so strict, in the sense that the code below does not really respect the dtype of the operator... I have always been tempted to make this stricter, but i) it would require a major version bump, and ii) it would require some conventions to be set (i.e., if the input and operator do not match in dtype, which one wins), or being very strict and raising errors every time there is no match... so I'm not sure at this point 😉
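The looseness described here can be illustrated with plain NumPy type promotion, which is effectively what happens when an operator's matvec boils down to a matrix product: the input's dtype, not the operator's declared dtype, determines the output.

```python
import numpy as np

# "Operator" declared as float32...
A = np.eye(3, dtype=np.float32)
# ...but the user passes a float64 input
x = np.ones(3, dtype=np.float64)

y = A @ x
# NumPy promotion wins: the output is float64, not the operator's float32
assert y.dtype == np.float64
```

Making this strict would mean either casting inputs to the operator's dtype (one convention "wins") or raising on any mismatch, which is the trade-off discussed above.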
Hope this makes sense?
This PR adds support for the CUDA backend of ASTRA, including support for CuPy inputs (partially addressing #701). As you can see, the code becomes quite a bit messier, but that is to cover several scenarios at the same time:
(set `engine` or `projector_type` to `cuda`)

With this, I have the following observations/questions:

… on CPU for the `cpu` engine and on GPU for the `cuda` engine? The `projector_type` now (`strip`) favors accuracy at the expense of performance. For experimental data the accuracy difference is usually not noticeable, so the faster `linear` projector is more commonly used. It's also the only projector available in the CUDA backend. Let me know what you think fits the expected audience better.