Something is fishy with FBL after the one-kernel rewrite. Commit e56e9f5 works deterministically, but the most recent master as of today (with the single kernel) gives banded errors on my laptop for "large" domain sizes. I uncovered this by running the testCasesDemos/NumericalOrder.ipynb notebook.
The symptoms are "random" values in horizontal bands, perhaps (likely?) related to the block size. Perhaps the last row of shared memory for one of the variables is not initialized properly? Initial superficial debugging led to nothing, so I suggest running cuda-memcheck to uncover the issue.
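A cheap way to test the block-size hypothesis is to diff the output of e56e9f5 against master and check where the bad rows fall relative to the block height. Below is a minimal sketch in plain Python. The helper names `band_rows` and `row_offsets_in_block` and the value `block_height = 8` are hypothetical, and the arrays are synthetic stand-ins rather than real FBL output.

```python
from collections import Counter


def band_rows(reference, candidate, tol=1e-6):
    """Return indices of rows where candidate deviates from reference.

    A row counts as bad if any element differs by more than tol.
    The comparison is written so that NaN values also count as bad.
    """
    bad = []
    for j, (ref_row, cand_row) in enumerate(zip(reference, candidate)):
        if any(not (abs(r - c) <= tol) for r, c in zip(ref_row, cand_row)):
            bad.append(j)
    return bad


def row_offsets_in_block(bad_rows, block_height):
    """Histogram of bad row indices modulo the (assumed) block height."""
    return Counter(j % block_height for j in bad_rows)


# Synthetic stand-in data: 32 rows by 8 columns, all ones.
block_height = 8  # hypothetical value, use the real CUDA block height here
reference = [[1.0] * 8 for _ in range(32)]
candidate = [row[:] for row in reference]

# Emulate the suspected bug: garbage in the last row of every block.
for j in range(len(candidate)):
    if j % block_height == block_height - 1:
        candidate[j][3] = 123.456

bad = band_rows(reference, candidate)
offsets = row_offsets_in_block(bad, block_height)
print("bad rows:", bad)
print("offsets within block:", dict(offsets))
```

If the offsets from real data pile up on a single value (for example the last row of each block), that would be consistent with a missing shared-memory row. If they are spread evenly, the bands are probably not tied to the block size.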