@@ -86,7 +86,7 @@ In this course, we are interested in problems with the following structure:
 \phantom{\substack{(\mathbf u_1,\mathbf x_1)\\\mathrm{s.t.}}}%
 \!\!\!\!\!\!\!\!\!\! (\mathbf u_1,\mathbf x_1)\in\mathcal X_1(\mathbf x_0)%
 }{%
- \!\!\!\! c(\mathbf x_1,\mathbf y_1)%
+ \!\!\!\! c(\mathbf x_1,\mathbf u_1)%
 }
 +\mathbb{E}_1\Bigl[
 \quad \cdots
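For readability: the fragments above belong to the first stage of the nested formulation. Stripped of the `\phantom` and `\!` spacing macros, the corrected structure reads (a reconstruction from this hunk, not the verbatim source):

```math
\min_{\substack{(\mathbf u_1,\mathbf x_1)\\ \mathrm{s.t.}\;(\mathbf u_1,\mathbf x_1)\in\mathcal X_1(\mathbf x_0)}}
c(\mathbf x_1,\mathbf u_1)
+\mathbb{E}_1\Bigl[\;\cdots\;\Bigr]
```

i.e., the fix replaces the undefined $\mathbf y_1$ with the control $\mathbf u_1$ in the stage cost.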
@@ -123,7 +123,7 @@ constraints can be generally posed as:
 &\mathcal{X}_t(\mathbf{x}_{t-1}, w_t)=
 \begin{cases}
 f(\mathbf{x}_{t-1}, w_t, \mathbf{u}_t) = \mathbf{x}_t \\
- h(\mathbf{x}_t, \mathbf{y}_t) \geq 0
+ h(\mathbf{x}_t, \mathbf{u}_t) \geq 0
 \end{cases}
 \end{align}
 ```
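In code, the corrected constraint set $\mathcal{X}_t(\mathbf{x}_{t-1}, w_t)$ pairs a transition equation with a stage inequality. A minimal sketch, assuming scalar states and controls and a hypothetical linear transition (all names here are chosen for illustration, not taken from the course):

```python
import numpy as np

def f(x_prev, w, u):
    # Hypothetical linear transition kernel: outgoing state from
    # incoming state, realized uncertainty, and control.
    return x_prev + w - u

def h(x, u):
    # Hypothetical stage constraint h(x_t, u_t) >= 0:
    # both the state and the control must stay nonnegative.
    return np.array([x, u])

def feasible(x_prev, w, u):
    """Membership test for (u_t, x_t) in X_t(x_{t-1}, w_t):
    the equality f(...) = x_t pins down the outgoing state,
    and h(x_t, u_t) >= 0 must hold elementwise."""
    x = f(x_prev, w, u)
    return bool(np.all(h(x, u) >= 0)), x
```

Note that, exactly as in the corrected hunk, $h$ couples the outgoing state with the control $\mathbf{u}_t$, not with an undefined $\mathbf{y}_t$.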
@@ -135,7 +135,7 @@ where the outgoing state of the system $\mathbf{x}_t$ is a
 transformation based on the incoming state, the realized uncertainty,
 and the control variables. In the Markov Decision Process (MDP) framework, we refer to $f$ as the "transition kernel" of the system. State and
 control variables are restricted further by additional constraints
-captured by $h(\mathbf{x}_t, \mathbf{y}_t) \geq 0$. We
+captured by $h(\mathbf{x}_t, \mathbf{u}_t) \geq 0$. We
 consider policies that map the past information into decisions: $\pi_t : (\mathbf{x}_{t-1}, w_t) \rightarrow \mathbf{x}_t$. In
 period $t$, an optimal policy is given by the solution of the dynamic
 equations:
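The hunk ends just before the dynamic equations are displayed. As a rough illustration of what solving them entails, here is a toy backward-induction sketch over a discretized state grid (the grids, noise distribution, and stage cost `c` are all assumptions for illustration, not the course's implementation):

```python
import numpy as np

def f(x_prev, w, u):
    # Hypothetical linear transition kernel (same toy model as above).
    return x_prev + w - u

def h(x, u):
    # Hypothetical stage constraint: state and control nonnegative.
    return np.array([x, u])

def c(x, u):
    # Hypothetical convex stage cost c(x_t, u_t).
    return x**2 + u**2

STATES = np.linspace(0.0, 2.0, 9)    # grid for x_t
CONTROLS = np.linspace(0.0, 2.0, 9)  # grid for u_t
NOISE = [(0.5, 0.0), (0.5, 0.5)]     # (probability, realization w_t)
T = 3                                # number of stages

def snap(x):
    # Project f's output back onto the state grid.
    return STATES[np.abs(STATES - x).argmin()]

V = {x: 0.0 for x in STATES}  # terminal value function
for t in range(T, 0, -1):     # dynamic equations, backward in time
    V = {
        x_prev: sum(
            p * min(
                c(x_t, u) + V[x_t]         # stage cost + cost-to-go
                for u in CONTROLS
                for x_t in [snap(f(x_prev, w, u))]
                if np.all(h(x_t, u) >= 0)  # (u_t, x_t) in X_t
            )
            for p, w in NOISE              # expectation E_t[...]
        )
        for x_prev in STATES
    }
```

After the loop, `V[x0]` approximates the optimal expected cost from an initial grid state, and the minimizing pair $(\mathbf{u}_t, \mathbf{x}_t)$ at each stage induces the policy $\pi_t$ described above.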