
[BUG] RuntimeError: index -9223372036854775808 is out of bounds for dimension 1 with size 1. ProbabilisticActor with return_log_prob=True fails on torchrl 0.5.0; switching back to 0.4.0 resolves the issue. #3011

@Sui-Xing

Description

Add Link

https://pytorch.org/rl/stable/reference/generated/torchrl.modules.tensordict_module.ProbabilisticActor.html?highlight=probabilisticactor#torchrl.modules.tensordict_module.ProbabilisticActor

Describe the bug

CODE

# Imports for the library classes used below; net_policy, Net_Value, num_cells,
# device, init_weights, hidden, shared_module and td are defined elsewhere in
# the original script.
from torch import distributions as d
from tensordict.nn import CompositeDistribution, TensorDictModule
from torchrl.modules import ActorCriticOperator, ProbabilisticActor, ValueOperator

# Policy head: maps the shared "hidden" features to one set of logits per action.
policy_module = TensorDictModule(
    net_policy,
    in_keys=["hidden"],
    out_keys=[
        ("params", "action1", "logits"),
        ("params", "action2", "logits"),
        ("params", "action3", "logits"),
        ("params", "action4", "logits"),
        ("params", "action5", "logits"),
    ],
)

# Composite categorical distribution over the five discrete action heads.
actor = ProbabilisticActor(
    module=policy_module,
    in_keys=["params"],
    distribution_class=CompositeDistribution,
    distribution_kwargs={
        "distribution_map": {
            "action1": d.Categorical,
            "action2": d.Categorical,
            "action3": d.Categorical,
            "action4": d.Categorical,
            "action5": d.Categorical,
        },
    },
    return_log_prob=True,
)

# Value head.
net_value = Net_Value(num_cells, device=device)
net_value.apply(init_weights)
net_value(hidden)

value_module = ValueOperator(
    module=net_value,
    in_keys=["hidden"],
    out_keys=["state_action_value"],
)

a_c_model = ActorCriticOperator(shared_module, actor, value_module)

test_td = a_c_model.get_policy_operator()(td)
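
The snippet above depends on project-specific modules (net_policy, Net_Value, shared_module, td) that are not included. Below is a self-contained sketch that mirrors the same ProbabilisticActor + CompositeDistribution + return_log_prob=True configuration with a placeholder linear layer and a single action head; names and shapes are illustrative only, and it omits the shared module and value head, so it may not follow the exact code path of the full ActorCriticOperator pipeline.

import torch
from torch import distributions as d
from tensordict import TensorDict
from tensordict.nn import CompositeDistribution, TensorDictModule
from torchrl.modules import ProbabilisticActor

hidden = torch.randn(4, 8)            # placeholder batch of shared features
policy_head = torch.nn.Linear(8, 3)   # stand-in for net_policy: 3 logits per sample

policy_module = TensorDictModule(
    policy_head,
    in_keys=["hidden"],
    out_keys=[("params", "action1", "logits")],  # single head for brevity
)
actor = ProbabilisticActor(
    module=policy_module,
    in_keys=["params"],
    distribution_class=CompositeDistribution,
    distribution_kwargs={"distribution_map": {"action1": d.Categorical}},
    return_log_prob=True,             # the flag that triggers the failure on 0.5.0
)

td = TensorDict({"hidden": hidden}, batch_size=[4])
print(actor(td))                      # expected to raise the RuntimeError below on torchrl 0.5.0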

ERROR MESSAGE

Traceback (most recent call last):
  File "********", line 159, in <module>
    test_td = a_c_model.get_policy_operator()(td)
  File "E:\tools\miniconda\envs\***\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\tensordict\nn\common.py", line 297, in wrapper
    return func(_self, tensordict, *args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\tensordict\_contextlib.py", line 127, in decorate_context
    return func(*args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\tensordict\nn\utils.py", line 293, in wrapper
    return func(_self, tensordict, *args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\tensordict\nn\probabilistic.py", line 655, in forward
    return self.module[-1](tensordict_out, _requires_sample=self._requires_sample)
  File "E:\tools\miniconda\envs\***\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\tensordict\nn\common.py", line 297, in wrapper
    return func(_self, tensordict, *args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\tensordict\_contextlib.py", line 127, in decorate_context
    return func(*args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\tensordict\nn\utils.py", line 293, in wrapper
    return func(_self, tensordict, *args, **kwargs)
  File "E:\tools\miniconda\envs\***\lib\site-packages\tensordict\nn\probabilistic.py", line 439, in forward
    tensordict_out = dist.log_prob(tensordict_out)
  File "E:\tools\miniconda\envs\***\lib\site-packages\tensordict\nn\distributions\composite.py", line 150, in log_prob
    d[_add_suffix(name, "_log_prob")] = lp = dist.log_prob(sample.get(name))
  File "E:\tools\miniconda\envs\***\lib\site-packages\torch\distributions\categorical.py", line 142, in log_prob
    return log_pmf.gather(-1, value).squeeze(-1)
RuntimeError: index -9223372036854775808 is out of bounds for dimension 1 with size 1

With return_log_prob=True, ProbabilisticActor raises the error above on torchrl 0.5.0; switching back to version 0.4.0 resolves the issue.
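
For what it's worth, the failing index is exactly the smallest representable int64 value, which often shows up when an index tensor comes from uninitialized memory or from a NaN being cast to an integer dtype. That is an assumption about the root cause, not a confirmed diagnosis; the check below only shows the equality.

import torch

# The out-of-bounds index in the traceback equals torch.iinfo(torch.int64).min.
print(torch.iinfo(torch.int64).min)   # -9223372036854775808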

Describe your environment

Windows 11 / Windows Server 2022
Python 3.10
CPU or CUDA 11.8
torch==2.4.0
torchrl==0.5.0
tensordict==0.5.0
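
A quick way to confirm the installed versions match the report:

import tensordict
import torch
import torchrl

# Expected output for this report: 2.4.0 0.5.0 0.5.0
print(torch.__version__, torchrl.__version__, tensordict.__version__)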

cc @vmoens @nairbv
