Unexpected GPU Behavior in set_hidden_parameters due to populateFrom Call #746

@hgeace

Description

I've encountered an issue where the behavior of analog_module.set_hidden_parameters differs between CPU and GPU environments when used to modify hidden parameters during a training run.

On the CPU, the method works as expected, modifying only the target hidden parameter (e.g., resetting one reference array). On the GPU, however, it has the side effect of resetting the entire internal state of the RPU device, including internal counters used by the chopper mechanism.

This seems to be caused by a difference between the CPU and GPU implementations of
setDeviceParameter().
The GPU version calls populateFrom() inside RPUCudaPulsed<T>::setDeviceParameter,
which appears to reinitialize the entire device.
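
In code, the operation in question is roughly the following. This is a minimal sketch: analog_module stands for the analog tile from the report, and the key name "fast_A" is a placeholder, since the actual hidden-parameter names depend on the device configuration.

# Minimal sketch of the triggering call pattern (assumes an aihwkit analog
# tile; get_hidden_parameters() returns an OrderedDict of tensors).
import torch

def reset_one_hidden_parameter(analog_module, name="fast_A"):
    hidden = analog_module.get_hidden_parameters()
    hidden[name] = torch.zeros_like(hidden[name])  # zero a single array
    analog_module.set_hidden_parameters(hidden)
    # CPU: only `name` is modified. GPU: per this report, the whole device
    # state (e.g., chopper counters) is re-initialized as a side effect.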

How to reproduce

example.ipynb

  1. Run the attached minimal example (example.ipynb); a sketch approximating it is shown after this list.
  2. On GPU:
    • Call analog_module.set_hidden_parameters() during training.
    • Observe that all internal device states are reset (e.g., weight dynamics, chopper counters).
  3. On CPU:
    • The same code modifies only the specified hidden parameter as expected.
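
The sketch below approximates the notebook's call sequence. Assumptions: a plain ConstantStepDevice stands in for the actual device config used in example.ipynb, and tiles are accessed via analog_tiles() as in recent aihwkit versions; only the train / write-back / keep-training sequence matters here.

# Approximation of example.ipynb: train, write the hidden parameters back
# unchanged mid-run, and keep training. On CPU the write-back is a no-op;
# on GPU (per this report) it resets the internal device state.
import torch
from aihwkit.nn import AnalogLinear
from aihwkit.optim import AnalogSGD
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.configs.devices import ConstantStepDevice

model = AnalogLinear(
    4, 2, rpu_config=SingleRPUConfig(device=ConstantStepDevice())).cuda()
opt = AnalogSGD(model.parameters(), lr=0.1)
opt.regroup_param_groups(model)

x, y = torch.rand(8, 4).cuda(), torch.rand(8, 2).cuda()
for step in range(200):
    if step == 100:  # mid-run hidden-parameter write-back
        for tile in model.analog_tiles():
            tile.set_hidden_parameters(tile.get_hidden_parameters())
    opt.zero_grad()
    torch.nn.functional.mse_loss(model(x), y).backward()
    opt.step()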

Expected behavior

The set_hidden_parameters() method should update only the specified hidden parameter tensor
without reinitializing the entire device state, consistent across both CPU and GPU backends.

Actual behavior

When running this notebook on the GPU, calling analog_module.set_hidden_parameters() during training resets the entire internal RPU device state (e.g., internal counters for the chopper mechanism).
The same code works properly on the CPU, modifying only the intended hidden parameter.

Other information

Detailed Analysis

The root cause appears to be a difference in the CUDA implementation of setDeviceParameter().

  1. The Python method set_hidden_parameters calls the C++ setDeviceParameter function via the Pybind11 binding located in rpu_base_tiles.cpp.
// in rpu_base_tiles.cpp

.def("set_hidden_parameters", [](Class &self, const torch::Tensor &hidden_parameters_) {
    // ... (tensor preparation) ...
    self.setDeviceParameter(data_ptrs);
}, ... )
  2. The GPU implementation of RPUCudaPulsed<T>::setDeviceParameter in rpucuda_pulsed.cu subsequently calls populateFrom(). This call appears to re-initialize the entire GPU device state from the CPU-side object, thus overwriting any state that had accumulated on the GPU.
// in rpucuda_pulsed.cu
template <typename T>
void RPUCudaPulsed<T>::setDeviceParameter(const std::vector<T *> &data_ptrs) {
    // ...
    rpu_device_->setDeviceParameter(this->getWeightsPtr(), data_ptrs);
    rpucuda_device_->populateFrom(*rpu_device_); // <<< Suspected cause of the full state reset on GPU.
    // ...
}

My goal is to modify a specific hidden parameter (e.g., reset a fast array to zero) during training without affecting other learned states (like internal counters).
Is there an existing lightweight method to modify a single hidden parameter tensor on the GPU without triggering a full populateFrom?
If not, what would be the recommended approach to implement such a feature?
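
For concreteness, the entry point being asked about might look like the stub below. This is purely hypothetical: the method name and signature are invented and nothing like it exists in aihwkit as of 0.9.2; it only spells out the requested contract.

# HYPOTHETICAL sketch, not an existing aihwkit API. Requested contract:
# copy one named tensor into the device-side hidden-parameter storage,
# leaving all other GPU state (e.g., chopper counters) untouched, i.e.
# without the full populateFrom() re-initialization.
def set_hidden_parameter_inplace(analog_module, name, tensor):
    raise NotImplementedError(
        "proposed feature: partial in-place hidden-parameter update "
        "that bypasses populateFrom() on GPU tiles")
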
Thank you for your work on this excellent library. I look forward to your feedback.

  • PyTorch version: 1.13.1+cu117
  • Package version: 0.9.2
  • OS: Ubuntu 22.04 (kernel 5.15.0-1042-oracle)
  • Python version: 3.10.17
  • Conda version (or N/A): 24.11.1
