Inconsistent device in regressor.py #4

@lucasresck

Dear authors,

Thank you for releasing fast_l1 code together with datamodels.

While running the linear regression step of datamodels, I ran into an issue with tensors not being on the same device.

After running

python -m datamodels.regression.compute_datamodels \
    -C regression_config.yaml \
    --data.data_path "$tmp_dir/reg_data.beton" \
    --cfg.out_dir "$tmp_dir/reg_results"

I got an error similar to

  File "/path_to_python3.9/site-packages/fast_l1-0.0.1-py3.9.egg/fast_l1/regressor.py", line 221, in train_saga
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

or

  File "/path_to_python3.9/site-packages/fast_l1-0.0.1-py3.9.egg/fast_l1/regressor.py", line 341, in train_saga
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

This happens because lines 221 and 341 of regressor.py index CPU tensors with other tensors that live on the GPU, in this case idx and still_opt_outer:

a_prev[:, :num_keep].copy_(a_table[idx, :num_keep],
                           non_blocking=True)

inds_to_swap = inds_to_swap[still_opt_outer[inds_to_swap]]
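
One possible workaround, shown here only as a minimal sketch and not as the authors' fix, is to move the index (or boolean mask) to the device of the tensor being indexed before indexing. The tensors below are small hypothetical stand-ins that reuse the names a_table, idx, inds_to_swap and still_opt_outer; their shapes and values are made up, and the helper index_on_same_device is not part of fast_l1.

```python
import torch


def index_on_same_device(table: torch.Tensor, index: torch.Tensor) -> torch.Tensor:
    """Index `table` with `index` after moving `index` to `table`'s device."""
    return table[index.to(table.device)]


# Use the GPU for the index tensors when one is available; otherwise the
# sketch still runs on CPU (without reproducing the error).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the situation at line 221: a CPU table indexed by idx.
a_table = torch.arange(12.0).reshape(4, 3)   # stays on the CPU
idx = torch.tensor([2, 0], device=device)    # on the GPU if available
rows = index_on_same_device(a_table, idx)

# Hypothetical stand-in for the situation at line 341: a CPU index tensor
# filtered by a mask derived from a tensor that may live on the GPU.
inds_to_swap = torch.tensor([0, 1, 3])       # stays on the CPU
still_opt_outer = torch.tensor([True, False, True, True], device=device)
mask = still_opt_outer[inds_to_swap]         # a CPU index into a GPU tensor is allowed
kept = index_on_same_device(inds_to_swap, mask)
```

Note that .to(table.device) forces a device-to-host copy whenever the index is on the GPU, so it may cost some speed compared with keeping everything on one device.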

In turn, idx and still_opt_outer are on the GPU because weight and train_loader in datamodels/datamodels/regression/compute_datamodels.py are on the GPU when train_saga is called:

        regressor.train_saga(weight,
                             bias,
                             train_loader,
                             val_loader,
                             lr=lr,
                             start_lams=max_lam,
                             update_bias=(use_bias > 0),
                             lam_decay=np.exp(np.log(eps)/k),
                             num_lambdas=k,
                             early_stop_freq=early_stop_freq,
                             early_stop_eps=early_stop_eps,
                             logdir=str(log_path))
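
To confirm where each tensor lives before a call like the one above, a small check such as the following might help. It is only a sketch: describe_devices is a hypothetical helper, not part of fast_l1 or datamodels, and the weight and bias below are made-up stand-ins for the real ones.

```python
import torch


def describe_devices(**tensors: torch.Tensor) -> dict:
    """Return a mapping from each tensor's name to its device as a string."""
    return {name: str(t.device) for name, t in tensors.items()}


# Hypothetical stand-ins for the weight and bias passed to train_saga.
weight = torch.zeros(5, 3)
bias = torch.zeros(3)
devices = describe_devices(weight=weight, bias=bias)
print(devices)
```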
