Description
We propose an API simplification, which is not limiting in any way, but is technically breaking.
Context
For an explanation of target proxies, and the associated abstract type `KindOfProxy`, see here.
Learners may implement `predict` (or `transform`) with no `data` argument. That is, a call looks like `predict(model[, kind_of_proxy])`.
The main use case is density estimation.
At present, the `KindOfProxy` types for such a learner are limited to subtypes of `Single`, instead of the more usual `IID` subtypes; see here. The reasoning for this was that the contract looks a little different: in regular `IID` learners, `predict` returns as many observations as in the `data` argument, whereas in the `Single` case, there is no `data` argument and exactly one observation is returned by `predict`.
However, the `Single` types are so far just replicates of existing `IID` types, and I don't really see why this pattern should not continue.
Proposal
- We dump the `Single` types and tweak the `IID` contract to be more inclusive (see below).
- We add a new trait to flag models where `predict` consumes no `data` (and `LearnAPI.features(learner, data) == nothing`). What to call the trait is not obvious to me. `LearnAPI.consumes_features(learner) = true`, overridden to `false` for such learners??
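To make the trait idea concrete, here is a minimal sketch of how the fallback and an override might look. It does not use LearnAPI.jl itself: `MockLearnAPI`, `NormalEstimator`, `NormalEstimatorFitted`, `OrdinaryRegressor` and the toy `fit`/`predict` methods are all hypothetical stand-ins, and `consumes_features` is only the tentative name floated above.

```julia
# Hypothetical stand-in for LearnAPI; names mirror the proposal, but nothing
# here is the package's actual code.
module MockLearnAPI

abstract type KindOfProxy end
abstract type IID <: KindOfProxy end
struct Distribution <: IID end

# Proposed trait (name not settled): the fallback is `true`.
consumes_features(learner) = true

end # module

# A toy density estimator: `fit` sees only targets, `predict` takes no data.
struct NormalEstimator end

struct NormalEstimatorFitted
    learner::NormalEstimator
    mean::Float64
    std::Float64
end

# Opt out of the fallback for this learner only.
MockLearnAPI.consumes_features(::NormalEstimator) = false

# A learner that keeps the fallback, for comparison.
struct OrdinaryRegressor end

# Fit a normal distribution to the targets (sample mean and sample std).
function fit(learner::NormalEstimator, y::AbstractVector{<:Real})
    n = length(y)
    μ = sum(y) / n
    σ = sqrt(sum(abs2, y .- μ) / (n - 1))
    return NormalEstimatorFitted(learner, μ, σ)
end

# No `data` argument: exactly one observation comes back. A named tuple
# stands in for a proper distribution object.
predict(model::NormalEstimatorFitted, ::MockLearnAPI.Distribution) =
    (mean = model.mean, std = model.std)
```

With this in place, `MockLearnAPI.consumes_features(OrdinaryRegressor())` falls through to `true`, while the density estimator reports `false`, which is the flag downstream code would branch on.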
Contract change for IID
Existing contract
`LearnAPI.IID <: LearnAPI.KindOfProxy`
Abstract subtype of `LearnAPI.KindOfProxy`. If `kind_of_proxy` is an instance of `LearnAPI.IID` then, given `data` consisting of `n` observations, the following must hold:
- `ŷ = LearnAPI.predict(model, kind_of_proxy, data)` is data also consisting of `n` observations.
- `ŷ = LearnAPI.predict(model, kind_of_proxy)` is data also consisting of `1` observation, in the case `LearnAPI.consumes_features(learner) == false`.
- The `j`th observation of `ŷ`, for any `j`, depends only on the `j`th observation of the provided `data` (no correlation between observations).
New contract
Make the addition:
Alternatively, in the case `LearnAPI.consumes_features(learner) == false` (`predict` consumes no data, and `fit` consumes no features), we require only that: `LearnAPI.predict(model, kind_of_proxy)` consists of a single observation.
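The observation-count part of the amended contract could be checked roughly as follows. This is a sketch under stated assumptions, not LearnAPI.jl code: every type and function below is a hypothetical stand-in, observations are assumed to be the elements of a vector, and the independence clause of the `IID` contract is not tested.

```julia
# Hypothetical stand-ins; none of this is LearnAPI.jl's actual code.
abstract type KindOfProxy end
abstract type IID <: KindOfProxy end
struct Point <: IID end

# Proposed trait (tentative name), with `true` as the fallback.
consumes_features(learner) = true

# A learner whose `predict` takes no data: it always returns one stored value.
struct ConstantLearner end
struct ConstantModel
    learner::ConstantLearner
    value::Float64
end
consumes_features(::ConstantLearner) = false
predict(model::ConstantModel, ::Point) = [model.value]  # one observation

# An ordinary learner: one prediction per input observation.
struct ScaleLearner end
struct ScaleModel
    learner::ScaleLearner
    factor::Float64
end
predict(model::ScaleModel, ::Point, data) = model.factor .* data

# Check only the observation counts required by the amended contract.
function satisfies_count_contract(model, kind::IID, data=nothing)
    if consumes_features(model.learner)
        # regular case: as many observations out as in
        return length(predict(model, kind, data)) == length(data)
    else
        # data-free case: exactly one observation
        return length(predict(model, kind)) == 1
    end
end
```

Both branches hang off the same `IID` subtype (`Point` here), which is the point of the proposal: the trait, not a parallel `Single` hierarchy, selects which clause applies.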
Dealing with breakage
As a practical matter, we aren't going to break anything, as I assume (and can check at General) that there are no third-party implementations of `LearnAPI.jl` yet. So we could just tag this as minor or patch and call it a bug fix. Or, we make the doc changes, add the new trait, and have the constructors `SingleDistribution()` etc. emit deprecation warnings. The latter would be semver compliant, but seems like overkill.
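For reference, the deprecation route might look something like the sketch below. It assumes, hypothetically, that `SingleDistribution()` would forward to an `IID` counterpart named `Distribution`; the types are local stand-ins rather than LearnAPI.jl's definitions. `Base.depwarn` is the standard Julia mechanism, and it stays silent unless Julia is started with `--depwarn=yes`.

```julia
# Local stand-ins for the relevant types; not LearnAPI.jl's definitions.
abstract type KindOfProxy end
abstract type IID <: KindOfProxy end
struct Distribution <: IID end

# Deprecated constructor: warn (when depwarns are enabled), then forward to
# the assumed `IID` replacement.
function SingleDistribution()
    Base.depwarn(
        "`SingleDistribution()` is deprecated; use `Distribution()` instead.",
        :SingleDistribution,
    )
    return Distribution()
end
```

Existing calls such as `predict(model, SingleDistribution())` would then keep working for a release cycle while dispatching on the `IID` type.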