LIBSVM call on remote workers requires waiting #107

@aquaresima

Hello,

I have been going a bit crazy over this problem (now solved).

I am running an SVM classification on remote workers (it doesn't matter whether on SLURM or my local machine). LIBSVM is imported under the hood
of another module, let’s call it MyClassifier, so that it can preprocess my raw data.

The code looks something like this:

using Distributed
addprocs(2)
@everywhere module MyClassifier 
    using Distributed
    using MLJ
    using LIBSVM
    # Define a function to run on the remote node
    function run_svm()
        # Your LIBSVM code here
        SVMClassifier = MLJ.@load SVC pkg=LIBSVM verbosity=0
        svm = SVMClassifier(kernel=LIBSVM.Kernel.Linear)
        return svm
    end
    export run_svm
end

# Run the function on a remote node
future = @spawnat 2 MyClassifier.run_svm()
model = fetch(future) # the error is raised here

Running this code gives me this error:

ERROR: MethodError: no method matching MLJLIBSVMInterface.SVC(; kernel::LIBSVM.Kernel.KERNEL)
The type `MLJLIBSVMInterface.SVC` exists, but no method is defined for this combination of argument types when trying to construct it.

Closest candidates are:
  MLJLIBSVMInterface.SVC(::Any, ::Float64, ::Float64, ::Float64, ::Int32, ::Float64, ::Float64, ::Bool) got unsupported keyword argument "kernel" (method too new to be called from this world context.)
   @ MLJLIBSVMInterface ~/.julia/packages/MLJLIBSVMInterface/zUY3E/src/MLJLIBSVMInterface.jl:56
  MLJLIBSVMInterface.SVC(::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any) got unsupported keyword argument "kernel" (method too new to be called from this world context.)
   @ MLJLIBSVMInterface ~/.julia/packages/MLJLIBSVMInterface/zUY3E/src/MLJLIBSVMInterface.jl:56
  MLJLIBSVMInterface.SVC(; kernel, gamma, cost, cachesize, degree, coef0, tolerance, shrinking) (method too new to be called from this world context.)
   @ MLJLIBSVMInterface ~/.julia/packages/MLJLIBSVMInterface/zUY3E/src/MLJLIBSVMInterface.jl:66

However, if I wait long enough, say 10 seconds, and rerun the code, it works. So I believe it is some kind of precompilation problem: I am calling the function while an independent process is still compiling it.
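
A note on the trace: "method too new to be called from this world context" is the message Julia prints for a world-age error, where a running function calls a method that was defined after that function started. This would point at `MLJ.@load` defining the constructor while `run_svm` is already running, rather than at a timing race, though that reading is an assumption and is not confirmed here. The sketch below reproduces the same kind of error in plain Julia. `late_method` and `define_then_call` are hypothetical names, and `@eval` is only a stand-in for whatever the macro does at run time.

```julia
# Declare the function up front with no methods (hypothetical name).
function late_method end

# Adds a method at run time, then calls it from the same running function.
function define_then_call()
    @eval late_method(x::Int) = x + 1   # method is born in a newer world
    return late_method(1)               # caller still runs in the older world
end

# Capture the exception instead of letting it abort the script.
err = try
    define_then_call()
catch e
    e
end

@assert err isa MethodError      # "method too new to be called from this world context"
@assert late_method(1) == 2      # a fresh top-level call sees the new method
```

If the same mechanism applies to `@load`, the rerun would succeed because the second call starts after the methods already exist, not because of the 10 second wait.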

My simple solution was to do this:

ready = false
while !ready
    test_workers = []
    @sync for p in workers()
        k = @spawnat p begin
            try
                MyClassifier.run_svm()
                @info "LIBSVM loaded on worker $(getpid())"
                return true
            catch e
                return false
            end
        end
        push!(test_workers, k)
    end
    ready = all([fetch(k) for k in test_workers])
    if !ready
        @info "Workers not ready yet"
        sleep(5)
    end
end

But it is not very elegant, and it is annoying to rewrite it every time I want to run an SVM classification on some data. How could I solve this problem in a few lines, possibly from within the MyClassifier module?
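
One possible direction, offered as a sketch and not a confirmed fix: if the failure is a world-age error (as the "method too new" wording in the trace suggests), `Base.invokelatest` calls a function in the latest world and sidesteps it. The example uses plain Julia only. `make_model` and `build_model` are hypothetical stand-ins for the loaded constructor and for `run_svm`, and `@eval` stands in for the run-time loading step.

```julia
# Hypothetical stand-in for a constructor whose methods appear at run time.
function make_model end

# Hypothetical stand-in for run_svm: defines the method, then calls it.
function build_model()
    # Stand-in for the loading step that defines new methods while we run.
    @eval make_model(; kernel = :rbf) = (kernel = kernel,)
    # invokelatest dispatches in the newest world, so the new method is visible.
    return Base.invokelatest(make_model; kernel = :linear)
end

model = build_model()
@assert model.kernel == :linear
```

Applied to the module above, this would mean calling `Base.invokelatest(SVMClassifier; kernel=LIBSVM.Kernel.Linear)` inside `run_svm`, or moving the `MLJ.@load` line to the top level of `MyClassifier` so the constructor exists before `run_svm` is ever called. Neither variant has been tested against MLJ here.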
