Straightforward implementation of `__add__`, `__mul__`, `__sub__`, etc. for the `KernelMatrix` class, with some optimizations where possible.
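For readers unfamiliar with the pattern, here is a minimal sketch of what such operator overloads can look like. It is not the actual `KernelMatrix` code: the list-of-rows storage, the `_elementwise` helper, the element-wise meaning of `*`, and the identity shortcuts (`+ 0`, `- 0`, `* 1` return the same object) are all assumptions made for illustration.

```python
from numbers import Number


class KernelMatrix:
    """Hypothetical stand-in: a dense matrix stored as a list of rows."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    @property
    def shape(self):
        return (len(self.rows), len(self.rows[0]) if self.rows else 0)

    def _elementwise(self, other, op):
        # Hypothetical helper: apply op entry by entry against a scalar
        # or against another matrix of the same shape.
        if isinstance(other, Number):
            return KernelMatrix([[op(a, other) for a in row] for row in self.rows])
        if isinstance(other, KernelMatrix):
            if self.shape != other.shape:
                raise ValueError("shape mismatch")
            return KernelMatrix(
                [[op(a, b) for a, b in zip(r1, r2)]
                 for r1, r2 in zip(self.rows, other.rows)]
            )
        # Let Python try the reflected method on the other operand.
        return NotImplemented

    def __add__(self, other):
        # Assumed shortcut: adding scalar zero changes nothing, so skip the copy.
        if isinstance(other, Number) and other == 0:
            return self
        return self._elementwise(other, lambda a, b: a + b)

    __radd__ = __add__  # makes 0 + K and sum([K1, K2]) work

    def __sub__(self, other):
        # Assumed shortcut: subtracting scalar zero changes nothing.
        if isinstance(other, Number) and other == 0:
            return self
        return self._elementwise(other, lambda a, b: a - b)

    def __mul__(self, other):
        # Assumed shortcut: multiplying by scalar one changes nothing.
        # Assumption: `*` is element-wise here, not a matrix product.
        if isinstance(other, Number) and other == 1:
            return self
        return self._elementwise(other, lambda a, b: a * b)

    __rmul__ = __mul__  # makes 2 * K work


A = KernelMatrix([[1.0, 0.5], [0.5, 1.0]])
B = KernelMatrix([[2.0, 0.0], [0.0, 2.0]])
total = sum([A, B])        # relies on __radd__ for the initial 0 + A
unchanged = (A * 1) is A   # the shortcut returns the same object
```

Returning `NotImplemented` for unsupported operand types, rather than raising, lets Python fall back to the other operand's reflected method before reporting a `TypeError`.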