[Bug Report] During training the models disappear from the scene #2551
Replies: 20 comments
-
Update:
-
Thank you for posting this. We can't access the video in the link you posted. Could you post here a portion of it that can be uploaded with the issue? Thanks.
-
isaaclab_short.mp4
-
I've made some progress on the issue. At a certain point during training the PPO action values become NaN. Why?
-
I have encountered something similar before. In my experience, objects or robots disappearing is usually caused by physics instability: the solver diverges and values explode. You can try a smaller dt or more solver iterations; for me, one of those usually solves the issue.
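For context on what "smaller dt" costs you: in the usual Isaac Lab env-config convention, the physics `dt` and the `decimation` factor together set the control rate, so halving the physics dt while doubling the decimation keeps the policy's action rate unchanged. A minimal sketch of that arithmetic (the names follow the common convention; treat this purely as an illustration):

```python
# Sketch: relation between physics dt, decimation, and the policy's control dt.
# Names follow the usual Isaac Lab env-config convention; values are examples.

def control_dt(physics_dt: float, decimation: int) -> float:
    """The policy acts once every `decimation` physics steps."""
    return physics_dt * decimation

# With dt = 0.005 (as in this thread) and decimation = 4, the policy runs at 50 Hz.
print(control_dt(0.005, 4))   # 0.02 -> 50 Hz control rate

# Halving the physics dt while doubling decimation keeps the same control rate,
# at the cost of twice the physics steps per policy action:
print(control_dt(0.0025, 8))  # still 0.02
```

So tightening the physics step need not change what the policy sees, only how much simulation work happens between actions.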
-
Thank you @zoctipus for your suggestion. My dt is 0.005 (I can try a smaller one). If I decrease the PPO learning rate the problem is much less frequent, but still present (I don't know if this is a correct way to handle it). The first things to go NaN are the action values from the PPO. Is it correct to clip those values before applying them to the model?
-
Good job finding out that the learning rate seems to help! A physics dt of 0.005 is usually an OK value, though I have seen some of my environments produce a very poor policy at 0.005 and a much better one at 0.002 or 0.001.
That said, the NaN can come either from the environment or from your policy network, and you need to find out which. If the NaN comes from the network, that usually indicates unnormalized observations, extreme rewards, too high a learning rate, too aggressive exploration, etc.: the values that matter for learning become unstable and cause the network weights to explode. A smaller learning rate can help, but the defaults are usually pretty good; if you need a much smaller learning rate than the default to make it work, that may point to other problems. If the NaN comes from the environment (like the robot disappearing), that means the PhysX solution diverged. The action may be too aggressive, or the update rate too large, so that the solver diverges, the state readings become NaN, and every observation and reward that depends on those states becomes NaN too.
These issues are hard to debug by nature, and forming an educated guess is crucial in diagnosing the problem. Your observation that a smaller learning rate results in fewer NaN issues makes me wonder: do you have a reward that is very large? Are your observations normalized? If the action-rate penalty goes very negative because the actions explode, maybe try clipping that reward term so that the extreme values never reach PPO.
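A cheap way to answer the "where does the NaN come from" question is to check each tensor in the order data flows through the step loop and report the first offender. A minimal NumPy sketch (the function name is made up for illustration; in practice you would call it on your env's observations, the policy's actions, and the rewards each step):

```python
import numpy as np

def first_nan_source(named_arrays):
    """Return the name of the first array containing a NaN/Inf, or None.

    `named_arrays` is an ordered list of (name, array) pairs, checked in
    the order data flows: observations -> actions -> rewards.
    """
    for name, arr in named_arrays:
        if not np.all(np.isfinite(arr)):
            return name
    return None

# Example: a NaN sneaks into the actions but not the observations,
# pointing at the policy network rather than the physics state.
obs = np.zeros((4, 12))
act = np.array([0.1, float("nan"), 0.3])
rew = np.array([1.0, -2.0, 0.5])
print(first_nan_source([("obs", obs), ("actions", act), ("rewards", rew)]))
# -> actions
```

If "obs" fires first, suspect the physics; if "actions" fires first while the observations are still finite, suspect the network.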
-
@zoctipus It seems my network is generating the NaN. The rewards don't seem too high (I took the values from the Cassie task): ` lin_vel_z_weight = -2.0
The PPO configuration doesn't seem too extreme.
Another problem is that the model doesn't seem to be learning, so maybe there is something here that I'm not able to see (the PPO keeps taking the same actions and isn't able to improve). Screencast.webm
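On the normalization question raised above: a running mean/std normalizer is the usual fix when raw observations have very different scales (libraries such as skrl and rsl_rl ship their own versions, so this NumPy sketch is purely an illustration of the idea, using a Welford-style batched update):

```python
import numpy as np

class RunningNorm:
    """Running mean/std observation normalizer (Welford-style batched update)."""

    def __init__(self, shape, eps=1e-8):
        self.mean = np.zeros(shape)
        self.var = np.ones(shape)
        self.count = 0
        self.eps = eps

    def update(self, batch):
        """Fold a batch of observations into the running statistics."""
        batch = np.asarray(batch, dtype=float)
        b_mean = batch.mean(axis=0)
        b_var = batch.var(axis=0)
        b_count = batch.shape[0]
        delta = b_mean - self.mean
        total = self.count + b_count
        self.mean = self.mean + delta * b_count / total
        m_a = self.var * self.count
        m_b = b_var * b_count
        self.var = (m_a + m_b + delta**2 * self.count * b_count / total) / total
        self.count = total

    def __call__(self, obs):
        """Normalize observations with the statistics gathered so far."""
        return (obs - self.mean) / np.sqrt(self.var + self.eps)

# Two observation channels on wildly different scales end up comparable:
norm = RunningNorm(shape=(2,))
norm.update(np.array([[0.0, 100.0], [2.0, 300.0]]))
print(norm(np.array([3.0, 400.0])))  # both channels now ~2 std above the mean
```

With normalized observations, no single channel (e.g. a raw height in meters next to a joint velocity in rad/s) can dominate the network's inputs.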
-
You can try clipping the actions explicitly or narrowing the output range (e.g.: using
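The two common ways to bound actions are a hard clip and a tanh squash; a minimal NumPy sketch of both, with one caveat relevant to this thread (neither removes NaN: once the network emits NaN, both clip and tanh propagate it):

```python
import numpy as np

raw = np.array([-5.0, 0.2, 7.0])

# 1) Hard clip into [-1, 1]: simple, but gradients vanish outside the range.
clipped = np.clip(raw, -1.0, 1.0)

# 2) Squash with tanh: smooth, keeps gradients, output always in (-1, 1).
squashed = np.tanh(raw)

# Caveat: clipping a NaN is still NaN, so bounding the actions alone
# cannot fix NaN actions -- the NaN has to be caught upstream.
still_nan = np.clip(np.array([np.nan]), -1.0, 1.0)
print(clipped, squashed, still_nan)
```

So bounding the action range protects the physics from extreme but finite actions; it does nothing against an already-NaN policy output.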
-
Hi @Toni-SM I tried to clip the PPO with the parameter "clip_actions" but it doesn't seem to work (I keep getting NaN values).
Is it correct if I clip this way?
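Since clipping cannot remove NaN (a clipped NaN is still NaN), one pragmatic guard while debugging is to sanitize the actions before applying them and log whenever it triggers. A sketch with NumPy (the function name is made up; this masks the symptom so you can keep training long enough to inspect it, it does not fix the underlying divergence):

```python
import numpy as np

def sanitize_actions(actions, low=-1.0, high=1.0):
    """Replace NaN/Inf actions with 0 and clip the rest into [low, high].

    Debugging aid only: if this ever triggers, the real bug is upstream
    (exploding network weights or diverging physics), and silently masking
    it would hide the problem rather than fix it -- hence the warning.
    """
    actions = np.asarray(actions, dtype=float)
    bad = ~np.isfinite(actions)
    if bad.any():
        print(f"warning: {bad.sum()} non-finite action(s) replaced with 0")
        actions = np.where(bad, 0.0, actions)
    return np.clip(actions, low, high)

safe = sanitize_actions([0.5, float("nan"), 3.0])
print(safe)  # the NaN becomes 0.0 and the 3.0 is clipped to 1.0
```

The printed warning also gives you the exact step at which the policy first went non-finite, which is useful for correlating with the reward and observation logs.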
-
Did you also reset your robot's velocity? Make sure your reset function is correct; at least in my case a proper reset function was the solution. Bad collisions can also cause robots to disappear.
-
@celestialdr4g0n my reset function is this: I reset the pose and also the velocities.
I've noticed one strange thing during training: Screencast.from.30-04-2025.16_45_47.mp4 After a while of training it does this: Screencast.from.30-04-2025.16_48_47.mp4 I don't know if it's just my impression, but it seems the gravity is not correct. Am I wrong?
-
Yes, there it is. I mean zero the velocity; I don't know why using the default velocity doesn't work. Below is my code.
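The point being made here is that a reset must write back the velocities (typically zeros), not just the pose, or the stale momentum from the previous episode carries over. A toy plain-Python sketch of that idea (the `RobotState` class and function names are made up for illustration; in Isaac Lab you would do the equivalent through the articulation's state-writing methods):

```python
import numpy as np

class RobotState:
    """Toy stand-in for a batch of simulated robots' states (illustration only)."""
    def __init__(self, num_envs, num_joints):
        rng = np.random.default_rng(0)
        self.root_pos = rng.normal(size=(num_envs, 3))
        self.root_vel = rng.normal(size=(num_envs, 6))    # linear + angular
        self.joint_pos = rng.normal(size=(num_envs, num_joints))
        self.joint_vel = rng.normal(size=(num_envs, num_joints))

def reset_envs(state, env_ids, default_root_pos, default_joint_pos):
    """Reset selected envs: restore the default pose AND zero all velocities."""
    state.root_pos[env_ids] = default_root_pos
    state.joint_pos[env_ids] = default_joint_pos
    # The step that is easy to forget -- without it, the robot re-spawns
    # at the default pose but keeps the velocity it died with:
    state.root_vel[env_ids] = 0.0
    state.joint_vel[env_ids] = 0.0

state = RobotState(num_envs=4, num_joints=12)
reset_envs(state, env_ids=[0, 2],
           default_root_pos=np.array([0.0, 0.0, 0.5]),
           default_joint_pos=np.zeros(12))
print(np.abs(state.root_vel[[0, 2]]).max())  # 0.0 after the reset
```

Envs 1 and 3 are untouched, which mirrors the partial-reset pattern used in vectorized RL environments.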
-
Thank you @celestialdr4g0n, I changed the reset part as you mentioned and it seems better, but I still have that strange movement. Screencast.from.05-05-2025.10.01.16.webm
-
I have no idea why it behaves this way. But you are working with your own robot, so make sure your functions do their jobs. Can you really control each joint of your robot as you want? (Make sure the output of the policy "goes into" your robot properly.) In addition, 40 envs is a small number; if you trained with a larger number of envs, just ignore this suggestion.
-
I'm trying with a larger number of envs (100/200). I see the PPO is moving all the joints, but sometimes it seems to move the joints once and then wait for the robot to die. The next step may repeat the same movements.
-
I've made some steps forward, so now at least it seems to be starting to work... I think there are still some problems (the FPS are really low, with 400 envs). Screencast.from.09-05-2025.17.03.57.webm
-
Hi @AndreaRossetto, I'm facing the exact same issue but with DDPG from SKRL (for training a quadruped).
-
@pietrodardano yes, we can have a chat or a call, whatever you prefer.
-
Thank you for following up on this issue. I will move the post to our Discussions for now.
-
Describe the bug
My custom robot model disappears after some time of training, without any error. What can I do to solve the problem?
Here is the link to a video of the issue.
Steps to reproduce
System Info
Describe the characteristic of your environment:
Additional context
Add any other context about the problem here.
Checklist
Acceptance Criteria
Add the criteria for which this task is considered done. If not known at issue creation time, you can add this once the issue is assigned.