|
16 | 16 | * [Analyzer with CUDA input](#analyzer-with-cuda-input) |
17 | 17 | * [Configuration](#configuration) |
18 | 18 | * [GPU-only configuration](#gpu-only-configuration) |
19 | | - * [Automatic switching between CPU and GPU modules](#automatic-switching-between-cpu-and-gpu-modules) |
20 | 19 | * [More details](#more-details) |
21 | 20 | * [Device choice](#device-choice) |
22 | 21 | * [Data model](#data-model) |
@@ -355,10 +354,9 @@ void ProducerInputOutputCUDA::produce(edm::StreamID streamID, edm::Event& iEvent |
355 | 354 |
|
356 | 355 | Analyzer with CUDA input is similar to [producer with CUDA |
357 | 356 | input](#producer-with-cuda-input). Note that currently we do not have |
358 | | -a mechanism for portable configurations with analyzers (like |
359 | | -[`SwitchProducer`](#automatic-switching-between-cpu-and-gpu-modules) |
360 | | -for producers). This means that a configuration with a CUDA analyzer |
361 | | -can only run on a machine with CUDA device(s). |
| 357 | +a mechanism for portable configurations with analyzers. This means |
| 358 | +that a configuration with a CUDA analyzer can only run on a machine |
| 359 | +with CUDA device(s). |
362 | 360 |
|
363 | 361 | ```cpp |
364 | 362 | class AnalyzerInputCUDA: public edm::global::EDAnalyzer<> { |
@@ -408,54 +406,10 @@ void AnalyzerInputCUDA::analyze(edm::Event const& iEvent, edm::EventSetup& iSetu |
408 | 406 | For a GPU-only configuration nothing special is needed: simply
409 | 407 | construct the Paths/Sequences/Tasks from the GPU modules.
410 | 408 |
|
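As a concrete illustration, a minimal GPU-only configuration might look like the sketch below. The module labels and producer class names (`FooProducerCUDA`, `FooProducerFromCUDA`) are hypothetical placeholders borrowed from the examples elsewhere in this document, not real CMSSW plugins; the point is only the Path/Task wiring.

```python
# Minimal sketch of a GPU-only configuration.
# "FooProducerCUDA" and "FooProducerFromCUDA" are hypothetical plugin names.
import FWCore.ParameterSet.Config as cms

process = cms.Process("GPUTEST")

# GPU producer working on the SoA data format on the device
process.fooCUDA = cms.EDProducer("FooProducerCUDA")

# Producer transferring/converting the GPU product for downstream consumers
process.fooFromCUDA = cms.EDProducer("FooProducerFromCUDA",
    src = cms.InputTag("fooCUDA")
)

# GPU modules go into a Task so the framework runs them on demand,
# i.e. only when their output is consumed by a scheduled module.
process.fooTaskCUDA = cms.Task(process.fooCUDA)
process.fooPath = cms.Path(process.fooFromCUDA, process.fooTaskCUDA)
```

Placing the GPU producers in a `cms.Task` (rather than directly in the `cms.Path`) lets the framework schedule them lazily, which matches how the chains of GPU modules are organized in the examples above.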
411 | | -#### Automatic switching between CPU and GPU modules |
412 | | - |
413 | | -The `SwitchProducer` mechanism can be used to switch automatically |
414 | | -between CPU and GPU modules based on the availability of GPUs on the |
415 | | -machine where the configuration is done. Framework decides at the |
416 | | -beginning of the job which of the modules to run for a given module |
417 | | -label. |
418 | | - |
419 | | -Framework requires that the modules in the switch must produce the |
420 | | -same types of output products (the closer the actual results are the |
421 | | -better, but the framework can not enforce that). This means that for a |
422 | | -chain of GPU modules, it is the module that transforms the SoA data |
423 | | -format back to the legacy data formats (possibly, but not necessarily, |
424 | | -transferring the SoA data from GPU to CPU) that should be switched |
425 | | -between the legacy CPU module. The rest of the GPU modules should be |
426 | | -placed to a `Task`, in which case framework runs them only if their |
427 | | -output is needed by another module. |
428 | | - |
429 | | -```python |
430 | | -from HeterogeneousCore.CUDACore.SwitchProducerCUDA import SwitchProducerCUDA |
431 | | -process.foo = SwitchProducerCUDA( |
432 | | - cpu = cms.EDProducer("FooProducer"), # legacy CPU |
433 | | - cuda = cms.EDProducer("FooProducerFromCUDA", |
434 | | - src="fooCUDA" |
435 | | - ) |
436 | | -) |
437 | | -process.fooCUDA = cms.EDProducer("FooProducerCUDA") |
438 | | - |
439 | | -process.fooTaskCUDA = cms.Task(process.fooCUDA) |
440 | | -process.fooTask = cms.Task( |
441 | | - process.foo, |
442 | | - process.fooTaskCUDA |
443 | | -) |
444 | | -``` |
445 | | - |
446 | | -For a more complete example, see [here](../CUDATest/test/testCUDASwitch_cfg.py). |
447 | | - |
448 | | - |
449 | | - |
450 | | - |
451 | | - |
452 | 409 | ## More details |
453 | 410 |
|
454 | 411 | ### Device choice |
455 | 412 |
|
456 | | -As discussed above, with `SwitchProducer` the choice between CPU and |
457 | | -GPU modules is done at the beginning of the job. |
458 | | - |
459 | 413 | For a multi-GPU setup the device is chosen in the first CUDA module in a
460 | 414 | chain of modules by one of the constructors of |
461 | 415 | `cms::cuda::ScopedContextAcquire`/`cms::cuda::ScopedContextProduce` |
|