Dump the Single <: KindOfProxy subtypes #53

@ablaom


We propose an API simplification, which is not limiting in any way, but is technically breaking.

Context

For an explanation of target proxies, and the associated abstract type KindOfProxy, see here.

Learners may implement predict (or transform) with no data argument. That is, a call looks like predict(model[, kind_of_proxy]).
The main use case is density estimation.

At present, the KindOfProxy types for such a learner are limited to subtypes of Single, instead of the more usual IID subtypes; see here. The reasoning for this was that the contract looks a little different: in regular IID learners, predict returns as many observations as in the data argument, whereas in the Single case, there is no data argument and exactly one observation is returned by predict.

However, the Single types are so far just replicates of existing IID types, and I don't really see why this pattern should not continue.

Proposal

  1. We dump the Single types and tweak the IID contract to be more inclusive (see below).
  2. We add a new trait to flag learners where predict consumes no data (and for which LearnAPI.features(learner, data) == nothing). What to call the trait is not obvious to me. Perhaps LearnAPI.consumes_features(learner) = true, overridden to false for such learners?
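
To make item 2 concrete, here is a minimal sketch of how such a trait could look. It is self-contained and does not load LearnAPI.jl: the abstract type and the two learner structs are hypothetical stand-ins, and `consumes_features` is only the tentative name floated above, not an existing function.

```julia
# Hypothetical stand-ins for illustration; these are not LearnAPI.jl types.
abstract type AbstractLearner end

struct RidgeLike <: AbstractLearner end         # ordinary learner: predict takes data
struct DensityEstimator <: AbstractLearner end  # predict takes no data argument

# Tentative trait: the fallback is `true` ...
consumes_features(::AbstractLearner) = true

# ... and it is overridden to `false` for learners whose predict consumes no data.
consumes_features(::DensityEstimator) = false
```

With a `true` fallback, only the unusual learners (density estimators and the like) need to add an overload.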

Contract change for IID

Existing contract

LearnAPI.IID <: LearnAPI.KindOfProxy

Abstract subtype of LearnAPI.KindOfProxy. If kind_of_proxy is an instance of
LearnAPI.IID then, given data consisting of n observations, the
following must hold:

  • ŷ = LearnAPI.predict(model, kind_of_proxy, data) is
    data also consisting of n observations.

  • ŷ = LearnAPI.predict(model, kind_of_proxy) is
    data also consisting of 1 observation, in the case LearnAPI.consumes_features(learner) == false

  • The jth observation of ŷ, for any j, depends only on the jth
    observation of the provided data (no correlation between observations).

New contract

Make the addition:

Alternatively, in the case LearnAPI.consumes_features(learner) == false (predict consumes no data, and fit consumes no features), we require only that:

  • LearnAPI.predict(model, kind_of_proxy) consists of a single observation.
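
As a rough illustration of the two branches of the amended contract, the sketch below checks only the observation counts. Everything in it is a hypothetical stand-in rather than the LearnAPI.jl API: the toy learners, the named tuples used as "models", the `Point` and `Distribution` proxy types, and the choice to represent the single observation as a one-element vector so that `length` can count it. The independence condition between observations is not checked.

```julia
# Hypothetical stand-ins; nothing here is loaded from LearnAPI.jl.
abstract type KindOfProxy end
abstract type IID <: KindOfProxy end
struct Point <: IID end
struct Distribution <: IID end

struct MeanRegressor end   # toy learner whose predict consumes data
struct NormalFit end       # toy density estimator: predict consumes no data

consumes_features(::Any) = true
consumes_features(::NormalFit) = false

# Toy "models" are named tuples of learned parameters.
predict(model::NamedTuple, ::Point, data::AbstractVector) =
    fill(model.mean, length(data))
predict(model::NamedTuple, ::Distribution) =
    [(mean = model.mean, std = model.std)]   # exactly one observation

# Check the observation-count part of the amended IID contract.
function satisfies_iid_contract(learner, model, kind::IID, data = nothing)
    if consumes_features(learner)
        # Regular case: as many predictions as observations in `data`.
        return length(predict(model, kind, data)) == length(data)
    else
        # Data-free case: predict returns a single observation.
        return length(predict(model, kind)) == 1
    end
end
```

For example, `satisfies_iid_contract(NormalFit(), (mean = 0.0, std = 1.0), Distribution())` exercises the data-free branch, while passing a data vector with `MeanRegressor()` and `Point()` exercises the regular one.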

Dealing with breakage

As a practical matter, we aren't going to break anything, as I assume (and can check in the General registry) that there are no 3rd party implementations of LearnAPI.jl yet. So we could just tag this as minor or patch and call it a bug fix. Or, we make the doc changes, add the new trait, and have the constructors SingleDistribution() etc. emit deprecation warnings. The latter would be semver compliant, but seems like overkill.
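
For reference, the deprecation route could be sketched as follows, using `Base.depwarn` from Julia's standard tooling. The types are hypothetical stand-ins defined locally, and the mapping of `SingleDistribution()` to a `Distribution` IID type is an assumption based on the remark above that the Single types duplicate existing IID types.

```julia
# Hypothetical stand-ins; the SingleDistribution -> Distribution mapping is assumed.
abstract type KindOfProxy end
abstract type IID <: KindOfProxy end
struct Distribution <: IID end

# The old constructor keeps working, but warns and returns the IID replacement.
function SingleDistribution()
    Base.depwarn(
        "`SingleDistribution()` is deprecated; use `Distribution()` instead.",
        :SingleDistribution,
    )
    return Distribution()
end
```

Note that Julia only prints such warnings when started with `--depwarn=yes` (or when running package tests), so existing callers would not see any noise by default.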
