| 1 | +# Parallel Blocks and Optimization |
| 2 | + |
| 3 | +This guide explains the `Parallel` block construct in PECOS's SLR (Structured Language Representation) and the optimization transformations available for parallel quantum operations. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The `Parallel` block is a semantic construct indicating that the operations within it can be executed simultaneously on quantum hardware. Standard quantum circuit representations list gates sequentially, but real quantum hardware often supports parallel gate execution on disjoint qubits. |
| 8 | + |
| 9 | +When using `SlrConverter`, parallel optimization is enabled by default. This means operations within `Parallel` blocks will be automatically reordered to maximize parallelism while respecting quantum gate dependencies. Programs without `Parallel` blocks are unaffected. |
| 10 | + |
| 11 | +## Basic Usage |
| 12 | + |
| 13 | +### Creating Parallel Blocks |
| 14 | + |
| 15 | +```python |
| 16 | +from pecos.slr import Main, Parallel, QReg |
| 17 | +from pecos.qeclib import qubit as qb |
| 18 | + |
| 19 | +prog = Main( |
| 20 | +    q := QReg("q", 4), |
| 21 | +    Parallel( |
| 22 | +        qb.H(q[0]), |
| 23 | +        qb.H(q[1]), |
| 24 | +        qb.X(q[2]), |
| 25 | +        qb.Y(q[3]), |
| 26 | +    ), |
| 27 | +) |
| 28 | +``` |
| 29 | + |
| 30 | +### Nested Structures |
| 31 | + |
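| 32 | +`Parallel` blocks can contain other blocks for logical grouping. The snippet below reuses the imports from the previous example and additionally assumes `Block` has been imported from `pecos.slr`: |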
| 33 | + |
| 34 | +```python |
| 35 | +prog = Main( |
| 36 | +    q := QReg("q", 6), |
| 37 | +    Parallel( |
| 38 | +        Block(  # Bell pair 1 |
| 39 | +            qb.H(q[0]), |
| 40 | +            qb.CX(q[0], q[1]), |
| 41 | +        ), |
| 42 | +        Block(  # Bell pair 2 |
| 43 | +            qb.H(q[2]), |
| 44 | +            qb.CX(q[2], q[3]), |
| 45 | +        ), |
| 46 | +    ), |
| 47 | +) |
| 48 | +``` |
| 49 | + |
| 50 | +## Parallel Optimization |
| 51 | + |
| 52 | +The `ParallelOptimizer` transformation pass analyzes operations within `Parallel` blocks and reorders them to maximize parallelism while respecting quantum gate dependencies. |
| 53 | + |
| 54 | +### How It Works |
| 55 | + |
| 56 | +1. **Dependency Analysis**: The optimizer tracks which qubits each operation acts on |
| 57 | +2. **Operation Grouping**: Operations on disjoint qubits are grouped by gate type |
| 58 | +3. **Reordering**: Groups are arranged to maximize parallel execution opportunities |
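
The three steps above can be illustrated with a small standalone sketch. It does not use the PECOS API: operations are modelled as hypothetical `(name, qubits)` tuples, and `layer_and_group` is an illustrative helper, not the actual `ParallelOptimizer` implementation. Each operation is placed in the earliest layer its qubits allow, and operations within a layer are then grouped by gate name.

```python
from collections import defaultdict

# Hypothetical stand-in for a gate: (name, qubits). Not the PECOS operation type.
Op = tuple[str, tuple[int, ...]]


def layer_and_group(ops: list[Op]) -> list[list[Op]]:
    """Place each op in the earliest layer its qubits allow, then group by gate name."""
    next_free: dict[int, int] = defaultdict(int)  # qubit -> first layer it is free in
    layers: dict[int, dict[str, list[Op]]] = defaultdict(lambda: defaultdict(list))
    for name, qubits in ops:
        # Step 1: an op must come after every earlier op touching one of its qubits.
        layer = max(next_free[q] for q in qubits)
        for q in qubits:
            next_free[q] = layer + 1
        # Step 2: ops sharing a layer act on disjoint qubits; group them by gate type.
        layers[layer][name].append((name, qubits))
    # Step 3: emit the groups layer by layer.
    return [group for i in sorted(layers) for group in layers[i].values()]


ops = [
    ("H", (0,)), ("CX", (0, 1)),
    ("H", (2,)), ("CX", (2, 3)),
    ("H", (4,)), ("CX", (4, 5)),
]
groups = layer_and_group(ops)
for group in groups:
    print(group)
```

For these six operations the sketch produces two groups, the three `H` gates followed by the three `CX` gates, which is the shape of the transformation described in the next section.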
| 59 | + |
| 60 | +### Example Transformation |
| 61 | + |
| 62 | +**Before optimization:** |
| 63 | +```python |
| 64 | +Parallel( |
| 65 | +    Block(H(q[0]), CX(q[0], q[1])), |
| 66 | +    Block(H(q[2]), CX(q[2], q[3])), |
| 67 | +    Block(H(q[4]), CX(q[4], q[5])), |
| 68 | +) |
| 69 | +``` |
| 70 | + |
| 71 | +**After optimization:** |
| 72 | +```python |
| 73 | +Block( |
| 74 | +    Parallel(H(q[0]), H(q[2]), H(q[4])),  # All H gates |
| 75 | +    Parallel(CX(q[0], q[1]), CX(q[2], q[3]), CX(q[4], q[5])),  # All CX gates |
| 76 | +) |
| 77 | +``` |
| 78 | + |
| 79 | +### Using the Optimizer |
| 80 | + |
| 81 | +#### With SlrConverter |
| 82 | + |
| 83 | +The simplest way to use the optimizer is through `SlrConverter`: |
| 84 | + |
| 85 | +```python |
| 86 | +from pecos.slr import SlrConverter |
| 87 | + |
| 88 | +# With optimization (default) |
| 89 | +qasm = SlrConverter(prog).qasm() |
| 90 | + |
| 91 | +# Without optimization |
| 92 | +qasm_unoptimized = SlrConverter(prog, optimize_parallel=False).qasm() |
| 93 | +``` |
| 94 | + |
| 95 | +#### Direct Usage |
| 96 | + |
| 97 | +For more control, use the optimizer directly: |
| 98 | + |
| 99 | +```python |
| 100 | +from pecos.slr.transforms import ParallelOptimizer |
| 101 | + |
| 102 | +optimizer = ParallelOptimizer() |
| 103 | +optimized_prog = optimizer.transform(prog) |
| 104 | +``` |
| 105 | + |
| 106 | +### QASM Output Comparison |
| 107 | + |
| 108 | +Given the three-Bell-pair program from the "Example Transformation" section above: |
| 109 | + |
| 110 | +**Without optimization:** |
| 111 | +```qasm |
| 112 | +h q[0]; |
| 113 | +cx q[0], q[1]; |
| 114 | +h q[2]; |
| 115 | +cx q[2], q[3]; |
| 116 | +h q[4]; |
| 117 | +cx q[4], q[5]; |
| 118 | +``` |
| 119 | + |
| 120 | +**With optimization:** |
| 121 | +```qasm |
| 122 | +h q[0]; |
| 123 | +h q[2]; |
| 124 | +h q[4]; |
| 125 | +cx q[0], q[1]; |
| 126 | +cx q[2], q[3]; |
| 127 | +cx q[4], q[5]; |
| 128 | +``` |
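
One way to check that a reordering like the one above is safe is to confirm that every qubit still sees the same gates in the same order. The sketch below does this for the two listings. It is a standalone illustration, not part of PECOS: `per_qubit_sequences` is a hypothetical helper, and the parsing assumes a single register named `q` with one statement per line.

```python
import re
from collections import defaultdict


def per_qubit_sequences(qasm: str) -> dict[int, list[str]]:
    """Return, for each qubit index, the ordered list of statements that touch it."""
    seqs: dict[int, list[str]] = defaultdict(list)
    for line in qasm.strip().splitlines():
        stmt = line.strip().rstrip(";")
        if not stmt:
            continue
        # Assumption: all operands belong to one register called `q`.
        for index in re.findall(r"q\[(\d+)\]", stmt):
            seqs[int(index)].append(stmt)
    return dict(seqs)


unoptimized = """
h q[0];
cx q[0], q[1];
h q[2];
cx q[2], q[3];
h q[4];
cx q[4], q[5];
"""

optimized = """
h q[0];
h q[2];
h q[4];
cx q[0], q[1];
cx q[2], q[3];
cx q[4], q[5];
"""

same_order = per_qubit_sequences(unoptimized) == per_qubit_sequences(optimized)
print(same_order)
```

Equal per-qubit sequences mean that only operations on disjoint qubits were moved past each other, so the two listings describe the same circuit.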
| 129 | + |
| 130 | +## Limitations and Conservative Behavior |
| 131 | + |
| 132 | +The optimizer is conservative to ensure correctness: |
| 133 | + |
| 134 | +### Control Flow |
| 135 | + |
| 136 | +Parallel blocks containing control flow (`If`, `Repeat`) are not optimized: |
| 137 | + |
| 138 | +```python |
| 139 | +Parallel( |
| 140 | +    qb.H(q[0]), |
| 141 | +    If(c[0] == 1).Then(qb.X(q[1])),  # Control flow prevents optimization |
| 142 | +    qb.H(q[2]), |
| 143 | +) |
| 144 | +``` |
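
A conservative check of this kind could look like the following sketch. The classes `Gate`, `Group`, and `Conditional` are hypothetical stand-ins for gate operations, `Block`/`Parallel`, and `If`/`Repeat`; they are not the PECOS types, and the real optimizer may detect control flow differently.

```python
from dataclasses import dataclass, field


@dataclass
class Gate:  # hypothetical stand-in for a gate operation
    name: str


@dataclass
class Group:  # hypothetical stand-in for Block / Parallel
    ops: list = field(default_factory=list)


@dataclass
class Conditional:  # hypothetical stand-in for If / Repeat
    body: list = field(default_factory=list)


def contains_control_flow(node) -> bool:
    """Return True if node is, or transitively contains, a control-flow node."""
    if isinstance(node, Conditional):
        return True
    if isinstance(node, Group):
        return any(contains_control_flow(child) for child in node.ops)
    return False


plain = Group([Gate("H"), Group([Gate("H"), Gate("CX")])])
guarded = Group([Gate("H"), Conditional([Gate("X")]), Gate("H")])

print(contains_control_flow(plain), contains_control_flow(guarded))
```

A pass built this way would leave `guarded` untouched and only consider `plain` for reordering.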
| 145 | + |
| 146 | +### Dependencies |
| 147 | + |
| 148 | +Operations with qubit dependencies maintain their order: |
| 149 | + |
| 150 | +```python |
| 151 | +Parallel( |
| 152 | +    qb.H(q[0]), |
| 153 | +    qb.CX(q[0], q[1]),  # Depends on H(q[0]) |
| 154 | +    qb.X(q[1]),  # Depends on CX |
| 155 | +) |
| 156 | +# These operations cannot be reordered |
| 157 | +``` |
| 158 | + |
| 159 | +## Implementation Details |
| 160 | + |
| 161 | +### Transformation Process |
| 162 | + |
| 163 | +1. **Bottom-up traversal**: Inner blocks are transformed first |
| 164 | +2. **Conservative checking**: Blocks with control flow are skipped |
| 165 | +3. **Dependency graph**: Built based on qubit usage |
| 166 | +4. **Topological sorting**: Ensures dependency preservation |
| 167 | +5. **Type-based grouping**: Operations grouped by gate type |
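
Steps 3 and 4 can be sketched with the standard-library `graphlib` module. This is an illustration under stated assumptions, not the code in `parallel_optimizer.py`: operations are hypothetical `(name, qubits)` tuples, and `dependency_graph` is an illustrative helper that links each operation to the most recent earlier operation on each of its qubits.

```python
from graphlib import TopologicalSorter

# The dependent chain from the "Dependencies" example, plus one independent gate.
ops = [("H", (0,)), ("CX", (0, 1)), ("X", (1,)), ("H", (2,))]


def dependency_graph(ops):
    """Map each op index to the set of earlier op indices it depends on."""
    last_on_qubit = {}  # qubit -> index of the latest op that touched it
    graph = {}
    for i, (_, qubits) in enumerate(ops):
        graph[i] = {last_on_qubit[q] for q in qubits if q in last_on_qubit}
        for q in qubits:
            last_on_qubit[q] = i
    return graph


graph = dependency_graph(ops)
# TopologicalSorter expects a mapping of node -> predecessors, which is what we built.
order = list(TopologicalSorter(graph).static_order())
print(graph)
print(order)
```

Any order returned keeps `H(q[0])` before `CX(q[0], q[1])` before `X(q[1])`, while the independent `H(q[2])` may appear anywhere in the sequence.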
| 168 | + |
| 169 | +### Code Structure |
| 170 | + |
| 171 | +- `pecos/slr/misc.py` - Contains the `Parallel` class definition |
| 172 | +- `pecos/slr/transforms/parallel_optimizer.py` - Optimization implementation |
| 173 | +- `pecos/slr/gen_codes/gen_qasm.py` - QASM generation (treats `Parallel` as `Block`) |
| 174 | +- `pecos/slr/gen_codes/gen_qir.py` - QIR generation (treats `Parallel` as `Block`) |
| 175 | + |
| 176 | +## Future Enhancements |
| 177 | + |
| 178 | +Potential improvements for the Parallel block system: |
| 179 | + |
| 180 | +1. **Barrier semantics**: Use `Barrier` statements as optimization boundaries |
| 181 | +2. **Classical operation handling**: Special treatment for measurements and classical ops |
| 182 | +3. **Hardware-aware optimization**: Consider specific hardware connectivity |
| 183 | +4. **Scheduling hints**: Allow users to specify scheduling preferences |
| 184 | +5. **Performance metrics**: Report estimated parallelism improvements |
| 185 | + |
| 186 | +## Testing |
| 187 | + |
| 188 | +Comprehensive tests are available in: |
| 189 | +- `python/tests/pecos/unit/test_parallel_optimizer.py` - Core functionality tests |
| 190 | +- `python/tests/pecos/unit/test_parallel_optimizer_verification.py` - Transformation verification |
| 191 | +- `python/tests/pecos/regression/test_qasm/random_cases/test_control_flow.py` - QASM generation tests |
| 192 | + |
| 193 | +## Best Practices |
| 194 | + |
| 195 | +1. **Use Parallel blocks for independent operations**: Only wrap operations that can truly execute in parallel |
| 196 | +2. **Group related operations**: Use nested blocks for logical grouping (e.g., Bell pairs) |
| 197 | +3. **Optimization is on by default**: Use `optimize_parallel=False` to disable when needed |
| 198 | +4. **Verify transformations**: Check generated QASM/QIR to ensure desired optimization |
| 199 | +5. **Consider hardware constraints**: Real devices have limited parallelism capabilities |
| 200 | + |
| 201 | +## Example: Quantum Fourier Transform |
| 202 | + |
| 203 | +Here's a more complex example showing parallel phase gates: |
| 204 | + |
| 205 | +```python |
| 206 | +from pecos.slr import Main, Parallel, QReg |
| 207 | +from pecos.qeclib import qubit as qb |
| 208 | + |
| 209 | +def qft_layer(q, n, k): |
| 210 | + """Generate parallel controlled rotations for QFT layer k""" |
| 211 | + operations = [] |
| 212 | + for j in range(k+1, n): |
| 213 | + angle = np.pi / (2 ** (j - k)) |
| 214 | + operations.append(qb.CRZ[angle](q[j], q[k])) |
| 215 | + return Parallel(*operations) if len(operations) > 1 else operations[0] |
| 216 | + |
| 217 | +# QFT with parallel phase gates |
| 218 | +prog = Main( |
| 219 | +    q := QReg("q", 4), |
| 220 | +    qb.H(q[0]), |
| 221 | +    qft_layer(q, 4, 0), |
| 222 | +    qb.H(q[1]), |
| 223 | +    qft_layer(q, 4, 1), |
| 224 | +    qb.H(q[2]), |
| 225 | +    qft_layer(q, 4, 2), |
| 226 | +    qb.H(q[3]), |
| 227 | +) |
| 228 | +``` |
| 229 | + |
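| 230 | +This structure makes explicit that the controlled rotations within each QFT layer can be applied in any order. Note that the rotations in one layer all share qubit `q[k]`: they commute, since each is diagonal in the computational basis, but they do not act on disjoint qubits, so under the dependency rule described above the optimizer is expected to preserve their relative order rather than regroup them. |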