Strange behavior during forward pass, with custom dynamics #335
Hi everyone,

I am using the proxddp solver with my own discrete dynamics, defined similarly to the example in https://github.com/Simple-Robotics/aligator/blob/main/tests/python/test_custom_pyfunctions.py

Here is my discrete dynamics class: I defined the forward function to calculate the next state from x using control u, and the dforward function to calculate the Jacobians with respect to the states and controls.

When I set up my problem with my discrete dynamics and run the solver, I observe the following: the solver calls the forward function N-1 times (where N is the horizon) and then the dforward function N-1 times. This must be the start of the backward pass in the algorithm and, if I understand correctly, these calculations are needed for the Q-function. After this, the solver calls the forward function another N-1 times. This must be the forward pass of the algorithm. Of course, the incoming x and u values have changed, since the solver calculated new control values in the backward pass.

Here comes the part that I do not understand. If this is the forward pass, then every calculated xnext value should be the x value in the next forward function call. For me, however, the x values differ from the previous xnext values. This was the first strange behavior.

The second strange behavior is that, in the solver's result, the xs variable is the sequence of incoming x values in the forward pass and not the sequence of calculated xnext values.

I tested this behavior with the linked test_custom_pyfunctions.py class too. In that case, the incoming x values in the forward pass were equal to the xnext values from the previous stage, so the xs solution was also consistent with the xnext values. However, that example is simple, so differences from my application can occur.

Can someone help me explain this behavior?

Thank you in advance,
Replies: 2 comments
This is the correct behaviour. Why should it be the
Thanks, this explained my problem!
This is how the algorithm works: it is a multiple-shooting algorithm. The value of `xnext` doesn't have to be the value of `x` at the next timestep's function call, only at convergence (within a given feasibility tolerance). This behaviour depends on the rollout type and the initial inverse penalty parameter `mu_init`. If you switch to a nonlinear rollout with a lower `mu_init`, the states will become dynamically consistent more quickly (possibly at the price of a higher cost at convergence).
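To make the gap between `xnext` and the next `x` concrete, here is a minimal NumPy sketch. It is not aligator's implementation and does not use its API: the dynamics `f`, the helpers `defects` and `nonlinear_rollout`, and all numbers are made up for illustration. It only shows that when the states are independent decision variables (multiple shooting), `f(x_t, u_t)` generally differs from the stored `x_{t+1}`, whereas a trajectory built by forward simulation has zero gaps by construction.

```python
import numpy as np


def f(x, u, dt=0.1):
    """Hypothetical pendulum-like discrete dynamics (illustration only)."""
    theta, omega = x
    return np.array([theta + dt * omega, omega + dt * (-np.sin(theta) + u[0])])


def defects(xs, us):
    """Gap f(x_t, u_t) - x_{t+1} at every stage of the trajectory."""
    return np.array([f(xs[t], us[t]) - xs[t + 1] for t in range(len(us))])


def nonlinear_rollout(x0, us):
    """Forward-simulate the dynamics so that every x_{t+1} equals f(x_t, u_t)."""
    xs = [np.asarray(x0, dtype=float)]
    for u in us:
        xs.append(f(xs[-1], u))
    return np.array(xs)


N = 5                          # number of stages (assumed)
x0 = np.array([1.0, 0.0])      # initial state (assumed)
us = np.full((N, 1), 0.3)      # constant control guess (assumed)

# Multiple-shooting style iterate: the states are a free guess, not a rollout.
xs_guess = np.tile(x0, (N + 1, 1))
gaps_guess = defects(xs_guess, us)

# Single-shooting style iterate: the states come from simulating the dynamics.
xs_rollout = nonlinear_rollout(x0, us)
gaps_rollout = defects(xs_rollout, us)

print("max gap, free state guess:", np.abs(gaps_guess).max())
print("max gap, nonlinear rollout:", np.abs(gaps_rollout).max())
```

In a multiple-shooting solver, these gaps are driven towards zero over the iterations, which is why the `x` passed to the next call only has to match the previous `xnext` at convergence.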