The high-level model should support multiple (16) reads/writes in parallel, just like the real system does. To implement this, we need to add more parallelism to the model so that each request tracks its own status, while accesses to the card DDR are serialised and the AXI requests on the DDR bus are processed one after another.
Estimated implementation time: 4 PWs.
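
The scheme described above (per-request status, serialised DDR access) could look roughly like the following. This is only a minimal sketch, not the actual model: it assumes Python threads as the concurrency mechanism, and all names (`Request`, `Status`, `DdrModel`, `run_requests`, `MAX_OUTSTANDING`) are hypothetical. A single lock stands in for the DDR bus, so AXI transactions are handled one at a time even though up to 16 requests are outstanding.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from enum import Enum

MAX_OUTSTANDING = 16  # number of parallel reads/writes (from the text)


class Status(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    DONE = "done"


@dataclass
class Request:
    """Hypothetical request record; each request carries its own status."""
    req_id: int
    is_write: bool
    addr: int
    data: int = 0
    status: Status = Status.PENDING


class DdrModel:
    """Hypothetical card DDR: one lock serialises all bus transactions."""

    def __init__(self):
        self._mem = {}
        self._bus_lock = threading.Lock()
        self.bus_log = []  # order in which requests reached the bus

    def access(self, req):
        with self._bus_lock:  # only one transaction on the bus at a time
            req.status = Status.IN_PROGRESS
            self.bus_log.append(req.req_id)
            if req.is_write:
                self._mem[req.addr] = req.data
            else:
                req.data = self._mem.get(req.addr, 0)
        req.status = Status.DONE
        return req


def run_requests(ddr, requests):
    """Issue up to MAX_OUTSTANDING requests concurrently and wait for all."""
    with ThreadPoolExecutor(max_workers=MAX_OUTSTANDING) as pool:
        # pool.map returns results in the order the requests were submitted
        return list(pool.map(ddr.access, requests))
```

In this sketch a caller would build a list of `Request` objects, pass them to `run_requests`, and then inspect each request's `status` and `data`. The order in which requests actually reach the bus is recorded in `bus_log` and is not guaranteed to match the submission order.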