Required prerequisites
- Search the issue tracker to check if your feature has already been mentioned or rejected in other issues.
Describe the feature
GPU sampling ignores explicit measurement order unless explicit_measurements=True
Description
In my CUDA-Q program, I need the output bitstring ordering to follow the order of the measurement statements.
We are running a noisy simulation of a large circuit for many shots on the nvidia target, with a noise_model passed to sample(). Below is a simplified example of the issue:
import cudaq

@cudaq.kernel
def circuit():
    q = cudaq.qvector(4)
    x(q[2])  # Apply X gate to the 3rd qubit (index 2)
    mz(q[2])
    mz(q[0])
    mz(q[1])
    mz(q[3])

if __name__ == "__main__":
    cudaq.set_target('nvidia')
    result = cudaq.sample(circuit)
    result_explicit = cudaq.sample(circuit, explicit_measurements=True)

For this circuit:
- cudaq.sample(circuit) returns 0010
- cudaq.sample(circuit, explicit_measurements=True) returns 1000
The 1000 result is the behavior I want, since the measured 1 comes from q[2], which is measured first and should therefore appear in the first bit position of the returned bitstring.
Issue
Without explicit_measurements=True, CUDA-Q appears to ignore the explicit order of the mz(...) statements when constructing output bitstrings.
Using explicit_measurements=True fixes the ordering, but in my case it appears to no longer use the GPU effectively. Performance drops dramatically and GPU utilization falls significantly compared to the default sampling path.
Request
It would be very helpful if CUDA-Q could support explicit measurement ordering while still using the fast GPU sampling path.
If this is not currently supported, it would also be helpful to clarify whether explicit_measurements=True disables GPU-accelerated sampling, and whether there is another recommended way to preserve measurement order while retaining GPU performance.
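In case it helps frame the request, here is a minimal sketch of the stopgap I could apply on my side: sample on the default path and permute the bits afterwards. It assumes the default path returns bits in qubit-index order (which matches the 0010 above), that every qubit is measured exactly once at the end of the circuit, and that the counts have been copied into a plain dict. The helper name reorder_counts and the measurement_order list are my own, not part of CUDA-Q, and the count of 1000 is only illustrative.

```python
from collections import Counter


def reorder_counts(counts, measurement_order):
    """Permute bitstrings from qubit-index order into measurement order.

    counts: dict mapping bitstring -> shot count (assumed qubit-index order).
    measurement_order: qubit indices in the order the mz(...) calls appear.
    Hypothetical helper, not a CUDA-Q API.
    """
    reordered = Counter()
    for bits, count in counts.items():
        if len(bits) != len(measurement_order):
            raise ValueError("bitstring length does not match measurement order")
        # Bit k of the output is the outcome of the k-th measured qubit.
        new_bits = "".join(bits[q] for q in measurement_order)
        reordered[new_bits] += count
    return dict(reordered)


# Order of the mz(...) statements in the example kernel: q[2], q[0], q[1], q[3].
measurement_order = [2, 0, 1, 3]
default_counts = {"0010": 1000}  # illustrative counts from the default path
print(reorder_counts(default_counts, measurement_order))  # {'1000': 1000}
```

This only covers the simple case of terminal measurements, so built-in support on the fast GPU path would still be preferable.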