Currently, MLK_CONFIG_SERIAL_FIPS202_ONLY only disables using the 4-way batched shake128 for rejection sampling. However, we still use shake256x4 for sampling error/secret vectors.
While this is not a big problem (as shake256x4 can easily be mapped to serial computation by including a custom fips202x4.h), it still forces a consumer requiring MLK_CONFIG_SERIAL_FIPS202_ONLY to provide a custom fips202x4.h header (see, e.g., the current OpenTitan integration).
It would be more convenient to not include fips202x4 at all if MLK_CONFIG_SERIAL_FIPS202_ONLY is set, and to handle the serial fallback internally instead.
I implemented that in mldsa-native (as all fips202x4 APIs there are incompatible with MLK_CONFIG_SERIAL_FIPS202_ONLY anyway) here: pq-code-package/mldsa-native#558
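For illustration, here is a rough sketch of what handling this internally could look like. It is not the actual mlkem-native or mldsa-native code: `toy_prf_serial` and `toy_prf_x4` are hypothetical names, and `toy_prf_serial` is a placeholder mixing function standing in for the serial SHAKE256-based PRF so that the snippet compiles without a Keccak implementation. The only point is the shape of the dispatch: when MLK_CONFIG_SERIAL_FIPS202_ONLY is set, the 4-way call becomes four serial calls and fips202x4.h is never included.

```c
#include <stddef.h>
#include <stdint.h>

/* Normally set by the consumer's build configuration; defined here only
 * so that the sketch is self-contained. */
#define MLK_CONFIG_SERIAL_FIPS202_ONLY

/* Hypothetical stand-in for the serial SHAKE256-based PRF.
 * NOT a real XOF: it just produces deterministic bytes from the input. */
static void toy_prf_serial(uint8_t *out, size_t outlen,
                           const uint8_t *in, size_t inlen)
{
  uint8_t state = 0x5a;
  size_t i;
  for (i = 0; i < inlen; i++)
  {
    state = (uint8_t)(state * 31u + in[i]);
  }
  for (i = 0; i < outlen; i++)
  {
    state = (uint8_t)(state * 167u + 13u);
    out[i] = state;
  }
}

#if defined(MLK_CONFIG_SERIAL_FIPS202_ONLY)
/* Serial-only build: no fips202x4.h is included. The 4-way interface is
 * kept for the callers, but it is implemented as four independent serial
 * calls. */
static void toy_prf_x4(uint8_t *out0, uint8_t *out1,
                       uint8_t *out2, uint8_t *out3, size_t outlen,
                       const uint8_t *in0, const uint8_t *in1,
                       const uint8_t *in2, const uint8_t *in3,
                       size_t inlen)
{
  toy_prf_serial(out0, outlen, in0, inlen);
  toy_prf_serial(out1, outlen, in1, inlen);
  toy_prf_serial(out2, outlen, in2, inlen);
  toy_prf_serial(out3, outlen, in3, inlen);
}
#else
/* Batched build: this is where fips202x4.h would be included and the
 * batched shake256x4 routines called. Omitted from this sketch. */
#endif
```

With a wrapper of this shape, the code that samples the error/secret vectors would not need to change, and a consumer setting MLK_CONFIG_SERIAL_FIPS202_ONLY would no longer have to supply a custom fips202x4.h.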