Description
As noted by @barucden in #88, SVMs with user-defined/callable kernels are generally not (de-)serializable. Since the issue has recently come up again in conjunction with downstream changes in JuliaAI/MLJLIBSVMInterface.jl#13, it is probably worth having an issue to reference for tracking the problem and collecting discussion.
Current situation:
An SVM with a user-defined/callable kernel can be serialized and deserialized without problems as long as the kernel function is still defined in the session:
using LIBSVM
using Serialization
X = [-2 -1 -1 1 1 2;
-1 -1 -2 1 2 1]
y = [1, 1, 1, 2, 2, 2]
kernel(x1, x2) = x1' * x2
model = svmtrain(X, y, kernel=kernel)
ỹ, _ = svmpredict(model, X)
print(y == ỹ) #true
serialize("serialized_svm.jls", model)
model = deserialize("serialized_svm.jls")
T = [-1 2 3;
-1 2 2]
ŷ, _ = svmpredict(model, T)
print([1, 2, 2] == ŷ) # true
After exiting and re-entering the REPL, kernel is undefined:
using LIBSVM
using Serialization
model = deserialize("serialized_svm.jls") # error
Execution fails with:
model = deserialize("serialized_svm.jls")
ERROR: UndefVarError: #kernel not defined
Stacktrace:
[1] open(f::typeof(deserialize), args::String; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Base ./io.jl:330
[2] open
@ ./io.jl:328 [inlined]
[3] deserialize(filename::String)
@ Serialization /usr/share/julia/stdlib/v1.7/Serialization/src/Serialization.jl:798
[4] top-level scope
@ REPL[3]:1
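The `#kernel` in the error message suggests that the serialized file refers to the kernel by the name of its type rather than containing its code. The following is a minimal sketch of that behaviour using only Serialization, without any SVM involved. It is an interpretation of the error above, not a statement about how LIBSVM.jl stores its models.

```julia
using Serialization

# Same kernel as in the example above.
kernel(x1, x2) = x1' * x2

# Round-trip the function through an in-memory buffer within one session.
io = IOBuffer()
serialize(io, kernel)
seekstart(io)
f = deserialize(io)

# The lookup succeeds here only because `kernel` is still defined in this
# session. In a fresh session, the same deserialize call has nothing to
# resolve the name against.
println(f === kernel)
println(f([1, 2], [3, 4]))
```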
If kernel is defined at the time deserialize is called, the code works:
using LIBSVM
using Serialization
kernel(x1, x2) = x1' * x2
model = deserialize("serialized_svm.jls")
T = [-1 2 3;
-1 2 2]
ỹ, _ = svmpredict(model, T)
print([1, 2, 2] == ỹ) # true
In contrast, serialization using built-in kernels works without a problem:
using LIBSVM
using Serialization
X = [-2 -1 -1 1 1 2;
-1 -1 -2 1 2 1]
y = [1, 1, 1, 2, 2, 2]
model = svmtrain(X, y, kernel=Kernel.Linear)
ỹ, _ = svmpredict(model, X)
print(y == ỹ) #true
serialize("serialized_svm.jls", model)
After exiting and re-entering the REPL:
using LIBSVM
using Serialization
model = deserialize("serialized_svm.jls")
T = [-1 2 3;
-1 2 2]
ỹ, _ = svmpredict(model, T)
print([1, 2, 2] == ỹ) # true
Possible Courses
I don't have too much experience with Julia and Serialization.jl in particular, but I see a few ways of tackling this issue:
- Leaving the current state, since there is no "misleading" behaviour. The error message seems pretty clear, at least to me.
- Additionally, adding a note in the README mentioning that serialization doesn't work for user-defined/callable kernels unless the kernel is defined again before deserializing (I haven't gotten around to Doc: add callable kernel example in README #89, will try to work on that over Easter or slightly after)
- Maybe it is possible to provide custom serialization strategies that allow us to properly serialize trained models with custom/user-defined kernels. This is probably not possible using Serialization.jl, since its functionality seems to be rather restricted, but JLD.jl can do it, I think?
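Independently of which option is chosen, a user-side guard can be sketched today. The helper below is hypothetical (`load_model` is not part of LIBSVM.jl or Serialization.jl): it wraps `deserialize` and, assuming the failure surfaces as an `UndefVarError` as in the stack trace above, rethrows it with a hint about the missing kernel.

```julia
using Serialization

# Hypothetical helper: deserialize a model and turn a missing-definition
# error into a more explicit message.
function load_model(path::AbstractString)
    try
        return deserialize(path)
    catch err
        if err isa UndefVarError
            # err.var holds the missing name, e.g. Symbol("#kernel")
            error("Could not deserialize $(path): $(err.var) is not defined. " *
                  "If the model was trained with a user-defined kernel, " *
                  "define that kernel before loading the model.")
        end
        rethrow()
    end
end
```

With this in place, `model = load_model("serialized_svm.jls")` behaves like `deserialize` when the kernel is defined and fails with the hint otherwise.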