First, thanks for the hard work on this package. It looks like a great way to capture higher-order interactions and potentially improve on the standard FM models/packages.
It looks like the constant offsets/intercepts are not learned. Is this a to-do item, or is it something that's easy to work around, for example by subtracting the global mean of the training targets y_train before fitting (and adding it back at prediction time) in the case of regression? What about classification? Does the missing intercept matter at all in that case?
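To make the regression workaround concrete, here is a rough sketch of what I mean. It does not use this package's API: `NoInterceptLinear` is a hypothetical stand-in for any estimator that fits no intercept, and `CenteredRegressor` is a name I made up for the wrapper. Both only assume a scikit-learn-style `fit`/`predict` interface.

```python
import numpy as np


class NoInterceptLinear:
    """Hypothetical stand-in for any estimator that learns no intercept."""

    def fit(self, X, y):
        # Least squares through the origin: no bias term is estimated.
        self.coef_, *_ = np.linalg.lstsq(X, y, rcond=None)
        return self

    def predict(self, X):
        return X @ self.coef_


class CenteredRegressor:
    """Subtract the mean of y_train before fitting, add it back when predicting."""

    def __init__(self, base):
        self.base = base

    def fit(self, X, y):
        self.offset_ = float(np.mean(y))    # global mean of the training targets
        self.base.fit(X, y - self.offset_)  # base model sees demeaned targets
        return self

    def predict(self, X):
        return self.base.predict(X) + self.offset_


# Synthetic data with a true intercept of 10 (made up for illustration).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
y_train = X_train @ np.array([1.0, -2.0, 0.5]) + 10.0

plain = NoInterceptLinear().fit(X_train, y_train)
centered = CenteredRegressor(NoInterceptLinear()).fit(X_train, y_train)

mse_plain = float(np.mean((plain.predict(X_train) - y_train) ** 2))
mse_centered = float(np.mean((centered.predict(X_train) - y_train) ** 2))
print(f"no intercept: {mse_plain:.3f}  demeaned targets: {mse_centered:.3f}")
```

On this toy data the demeaned version should fit far better, since the stand-in model cannot represent the offset on its own. This only covers regression; I don't know whether something similar makes sense for classification, hence the question.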