@@ -27,11 +27,12 @@ with automatic code loading" below.
 
 The extent of testing is controlled by `level`:
 
-|`level`         | description                      | tests (full list below) |
-|:----------------|:---------------------------------|:------------------------|
-| 1               | test code loading                | `:model_type`           |
-| 2 (default)     | basic test of model interface    | first four tests        |
-| 3               | comprehensive                    | all applicable tests    |
+|`level`         | description                       | tests (full list below) |
+|:----------------|:----------------------------------|:------------------------|
+| 1               | test code loading                 | `:model_type`           |
+| 2 (default)     | basic test of model interface     | first four tests        |
+| 3               | comprehensive CPU1()              | all CPU1() tests        |
+| 4               | comprehensive CPU1()/CPUThreads() | all tests               |
 
 By default, exceptions caught in tests are not thrown. If
 `throw=true`, testing will terminate at the first execption
@@ -131,6 +132,10 @@ These additional tests are applied to `Supervised` models:
   (metric), evaluate the performance of the model using `evaluate!`
   and a `Holdout` set.
 
+- `:accelerated_evaluation`: Assuming the model appears to make
+  repeatable predictions on retraining, repeat the `:evaluation` test
+  using `CPUThreads()` acceleration and check agreement with `CPU1()` case.
+
 - `:tuned_pipe_evaluation`: Repeat the `:evauation` test but first
   insert model in a pipeline with a trivial pre-processing step
   (applies the identity transformation) and wrap in `TunedModel` (only
@@ -156,11 +161,12 @@ function test(model_proxies, data...; mod=Main, level=2, throw=false, verbosity=
         :fitted_machine,
         :operations,
         :evaluation,
+        :accelerated_evaluation,
         :tuned_pipe_evaluation,
         :threshold_prediction,
         :ensemble_prediction,
         :iteration_prediction
-    ), NTuple{11, String}}}(undef, nproxies)
+    ), NTuple{12, String}}}(undef, nproxies)
 
     # summary table row corresponding to all tests skipped:
     row0 = (
@@ -171,6 +177,7 @@ function test(model_proxies, data...; mod=Main, level=2, throw=false, verbosity=
         fitted_machine = "-",
         operations = "-",
         evaluation = "-",
+        accelerated_evaluation = "-",
         tuned_pipe_evaluation = "-",
         threshold_prediction = "-",
         ensemble_prediction = "-",
@@ -269,10 +276,56 @@ function test(model_proxies, data...; mod=Main, level=2, throw=false, verbosity=
 
         # evaluation:
         evaluation, outcome =
-            MLJTestIntegration.evaluation(measure, model_instance, data...; throw, verbosity)
+            MLJTestIntegration.evaluation(
+                measure,
+                model_instance,
+                [CPU1(),],
+                data...;
+                throw,
+                verbosity,
+            )
         row = update(row, i, :evaluation, evaluation, outcome)
         outcome == "×" && continue
 
+        # determine computational resources to test; we only test more
+        # than CPU1() if model evaluations are independent of training
+        # run (assuming this means models are "deterministic", ie,
+        # RNGs):
+        resources = MLJ.AbstractResource[] # fallback
+        if level > 3
+            per_fold = evaluation.per_fold[1]
+            per_folds = map(1:(N_MODELS_FOR_REPEATABILITY_TEST - 1)) do _
+                e, o = MLJTestIntegration.evaluation(
+                    measure,
+                    model_instance,
+                    [CPU1(),],
+                    data...;
+                    throw=false,
+                    verbosity,
+                )
+                o == "✓" || return nothing
+                e.per_fold[1]
+            end
+            if all(≈(per_fold), per_folds)
+                resources = RESOURCES
+            end
+        end
+
+        if length(resources) > 1
+            # accelerated_evaluation:
+            evaluation, outcome =
+                MLJTestIntegration.evaluation(
+                    measure,
+                    model_instance,
+                    resources,
+                    data...;
+                    throw,
+                    verbosity,
+                )
+            row = update(row, i, :accelerated_evaluation, evaluation, outcome)
+            outcome == "×" && continue
+        end
+
         # tuned_pipe_evaluation:
         tuned_pipe_evaluation, outcome =
             MLJTestIntegration.tuned_pipe_evaluation(
@@ -287,15 +340,26 @@ function test(model_proxies, data...; mod=Main, level=2, throw=false, verbosity=
 
         # ensemble_prediction:
         ensemble_prediction, outcome =
-            MLJTestIntegration.ensemble_prediction(model_instance, data...; throw, verbosity)
+            MLJTestIntegration.ensemble_prediction(
+                model_instance,
+                data...;
+                throw,
+                verbosity,
+            )
         row = update(row, i, :ensemble_prediction, ensemble_prediction, outcome)
         outcome == "×" && continue
 
         isnothing(iteration_parameter(model_instance)) && continue
 
         # iteration prediction:
         iteration_prediction, outcome =
-            MLJTestIntegration.iteration_prediction(measure, model_instance, data...; throw, verbosity)
+            MLJTestIntegration.iteration_prediction(
+                measure,
+                model_instance,
+                data...;
+                throw,
+                verbosity,
+            )
         row = update(row, i, :iteration_prediction, iteration_prediction, outcome)
         outcome == "×" && continue
     end