Update class07 lecture materials

jwestrenen3 · jwestrenen3 · commit 7b4d1711fdb6 · 2025-11-19T07:06:23.000-05:00
diff --git a/class07/StochasticOptimalControl_Overview.md b/class07/StochasticOptimalControl_Overview.md
@@ -1,26 +1,21 @@
-# StochasticOptimalControl.jl
 # Overview of Stochastic Optimal Control (SOC)
-# This file provides a narrative introduction to SOC, its motivation, methods, 
-# and applications, with formulas explained in context.
+#### This file provides an introduction to SOC, its motivation, methods, and applications, with formulas explained in context.
 
-println("Stochastic Optimal Control chapter loaded. This file can be used to understand and implement LQG, Robust, and Unscented control methods.")
-
-println("""
+
 Stochastic Optimal Control (SOC) is concerned with choosing control actions in systems where both the dynamics and the observations are noisy. 
 In real-world systems, uncertainties arise from sensor noise, model inaccuracies, and external disturbances. 
 SOC explicitly accounts for these uncertainties while attempting to optimize a performance objective.
 
-Consider a discrete-time system with dynamics given by:
-    x_{t+1} = A * x_t + B * u_t + w_t
-where x_t represents the state at time t, u_t is the control input, and w_t is a stochastic disturbance, typically modeled as a zero-mean Gaussian with covariance Q_w. 
+Consider a discrete-time system with dynamics given by $x_{t+1} = A x_t + B u_t + w_t$,
+where $x_t$ represents the state at time $t$, $u_t$ is the control input, and $w_t$ is a stochastic disturbance, typically modeled as a zero-mean Gaussian with covariance $Q_w$. 
 
 Measurements are also noisy:
-    y_t = C * x_t + v_t
-where v_t represents measurement noise with covariance R_v. 
+$y_t = C x_t + v_t$
+where $v_t$ represents measurement noise with covariance $R_v$. 
 
 The control objective is usually formulated as the minimization of an expected quadratic cost:
-    J = E[ sum_{t=0}^{T-1} (x_t' * Q * x_t + u_t' * R * u_t) ]
-Here, Q and R are weighting matrices that balance the importance of state deviations versus control effort.
+$J = \mathbb{E}\left[ \sum_{t=0}^{T-1} \left( x_t^\top Q x_t + u_t^\top R u_t \right) \right]$
+Here, $Q$ and $R$ are weighting matrices that balance the importance of state deviations versus control effort.
 
 This framework allows controllers to explicitly trade off performance and robustness, producing actions that are principled and reliable even under uncertainty.
 
@@ -35,6 +30,3 @@ Several stochastic control methods exist, each suited to different types of syst
 The choice of method depends on the system's linearity, noise characteristics, uncertainty magnitude, and performance versus safety requirements. There is no single stochastic control method that is universally optimal; the system context and design priorities must guide the selection.
 
 In this chapter, we will examine four major areas. First, LQG control is introduced to illustrate optimal control for linear systems under Gaussian noise and the separation between estimation and control. Second, Kalman filtering is described as a recursive technique for estimating system states from noisy measurements. Third, robust control methods are discussed, contrasting stochastic and worst-case approaches, and introducing H-infinity methods for handling uncertainties and disturbances. Finally, Unscented Optimal Control and iLQG are presented, showing how sigma-point propagation and iterative trajectory optimization allow SOC methods to handle nonlinear stochastic systems effectively.
-
-This narrative provides the foundation for understanding stochastic optimal control, highlighting the importance of explicitly handling uncertainty and the trade-offs between performance and robustness in practical systems.
-""")
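The expected quadratic cost $J$ defined in the overview can be estimated numerically by averaging over simulated rollouts. The following is a minimal sketch, assuming full state feedback $u_t = -K x_t$ (no measurement model) and illustrative scalar values; the function name `expected_cost` and all numbers are hypothetical and not part of the lecture materials.

```python
import numpy as np

def expected_cost(A, B, Q, R, K, Q_w, x0, T, n_rollouts=500, seed=0):
    """Monte Carlo estimate of J = E[sum_t x'Qx + u'Ru] under u_t = -K x_t."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    total = 0.0
    for _ in range(n_rollouts):
        x = x0.copy()
        for _ in range(T):
            u = -K @ x                                    # linear state feedback
            total += x @ Q @ x + u @ R @ u                # stage cost
            w = rng.multivariate_normal(np.zeros(n), Q_w) # process noise w_t
            x = A @ x + B @ u + w                         # x_{t+1} = A x_t + B u_t + w_t
    return total / n_rollouts

# Hypothetical scalar example (values chosen for illustration only)
A = np.array([[0.9]]); B = np.array([[0.5]])
Q = np.array([[1.0]]); R = np.array([[0.1]])
Q_w = np.array([[0.01]])
x0 = np.array([1.0])

J_open = expected_cost(A, B, Q, R, np.array([[0.0]]), Q_w, x0, T=50)  # no control
J_fb = expected_cost(A, B, Q, R, np.array([[1.0]]), Q_w, x0, T=50)    # u = -x
print(f"J (no control) = {J_open:.3f}, J (feedback) = {J_fb:.3f}")
```

Comparing the two estimates shows how feedback lowers the expected cost for this particular choice of weights; a different `Q`/`R` balance could change the picture.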
diff --git a/class07/background_materials/CSTR_Multiple_inlet_Unscented_Control.py b/class07/background_materials/CSTR_Multiple_inlet_Unscented_Control.py
@@ -0,0 +1,100 @@
+import numpy as np
+import matplotlib.pyplot as plt
+
+np.random.seed(42)
+
+V = 1.0; F1, F2 = 0.5, 0.3
+T1_nom, T2_nom = 350.0, 200.0; T_env = 298.0
+dt = 0.02; T_total = 50.0; N = int(T_total / dt)
+
+# setpoint
+T_set = np.zeros(N)
+T_set[int(N*0.0):int(N*0.25)] = 330.0
+T_set[int(N*0.25):int(N*0.5)] = 210.0
+T_set[int(N*0.5):int(N*0.75)] = 340.0
+T_set[int(N*0.75):] = 320.0
+
+# tuned parameters (these produced R2_uoc >= 0.9 in my run)
+reaction_coeff = 5e-5        # mild nonlinearity
+feed_noise_std = 0.1         # small disturbances
+
+# robust baseline (weak, tightly clipped)
+K_robust = 9.0
+robust_limits = (-2.5, 2.5)
+
+# Unscented-like controller (strong, predictive)
+K_uoc = 45.0
+alpha = 6.0
+sigma_std = 1.0
+uoc_limits = (-120.0, 120.0)
+
+def reaction_rate(T):
+    return reaction_coeff * (T - T_env) ** 2
+
+def r2_manual(y_true, y_pred):
+    ss_res = np.sum((y_true - y_pred) ** 2)
+    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
+    return 1 - ss_res / ss_tot if ss_tot != 0 else np.nan
+
+T_robust = np.zeros(N); T_uoc = np.zeros(N); T_uoc_std = np.zeros(N)
+T_robust_current = T_uoc_current = T_robust[0] = T_uoc[0] = 300.0; T_uoc_std[0] = sigma_std  # store the initial state so metrics and plots start at 300 K, not 0
+robust_saturated = np.zeros(N)
+
+for k in range(N - 1):
+    T1 = T1_nom + np.random.randn() * feed_noise_std
+    T2 = T2_nom + np.random.randn() * feed_noise_std
+
+    # Robust P (clipped)
+    u_robust = K_robust * (T_set[k] - T_robust_current)
+    u_robust = np.clip(u_robust, robust_limits[0], robust_limits[1])
+    robust_saturated[k] = 1 if (u_robust <= robust_limits[0] or u_robust >= robust_limits[1]) else 0
+    dT_robust = (F1*(T1 - T_robust_current) + F2*(T2 - T_robust_current)
+                 + u_robust + reaction_rate(T_robust_current)) / V
+    T_robust_current += dt * dT_robust
+    T_robust[k + 1] = T_robust_current
+
+    # Unscented-like update
+    def sigma_points(T): 
+        return np.array([T, T + sigma_std, T - sigma_std, T + 2*sigma_std, T - 2*sigma_std])
+    def propagate_sigma(T_sigma, u, T1, T2):
+        out = []
+        for T in T_sigma:
+            dT = (F1*(T1 - T) + F2*(T2 - T) + u + reaction_rate(T)) / V
+            out.append(T + dt * dT)
+        return np.array(out)
+
+    u_candidate = K_uoc * (T_set[k] - T_uoc_current)
+    T_sigma = sigma_points(T_uoc_current)
+    T_sigma_next = propagate_sigma(T_sigma, u_candidate, T1, T2)
+    error_mean = np.mean(T_sigma_next) - T_set[k]
+    u_final = np.clip(u_candidate - alpha * error_mean, uoc_limits[0], uoc_limits[1])
+    std_next = np.std(T_sigma_next)
+
+    dT_uoc = (F1*(T1 - T_uoc_current) + F2*(T2 - T_uoc_current)
+              + u_final + reaction_rate(T_uoc_current)) / V
+    T_uoc_current += dt * dT_uoc
+    T_uoc[k + 1] = T_uoc_current
+    T_uoc_std[k + 1] = std_next
+
+r2_robust = r2_manual(T_set, T_robust)
+r2_uoc = r2_manual(T_set, T_uoc)
+rmse_robust = np.sqrt(np.mean((T_set - T_robust) ** 2))
+rmse_uoc = np.sqrt(np.mean((T_set - T_uoc) ** 2))
+
+print(f"R² Robust control: {r2_robust:.4f}")
+print(f"R² Unscented control: {r2_uoc:.4f}")
+print(f"RMSE Robust: {rmse_robust:.3f} K, RMSE UOC: {rmse_uoc:.3f} K")
+print(f"Robust actuator saturation fraction: {robust_saturated.mean():.3f}")
+print("Max |T_robust|:", np.nanmax(np.abs(T_robust)))
+print("Max |T_uoc|:", np.nanmax(np.abs(T_uoc)))
+
+# Plots
+plt.figure(figsize=(10,3))
+plt.plot(T_set, linestyle='--')
+plt.plot(T_robust)
+plt.plot(T_uoc)
+plt.fill_between(np.arange(len(T_uoc)), T_uoc - T_uoc_std, T_uoc + T_uoc_std, alpha=0.25)
+plt.legend(['Setpoint', 'Robust', 'UOC', 'UOC sigma'])
+plt.grid(True)
+plt.show()
+
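The script above uses five equally weighted sigma points, which is why it is labelled "Unscented-like". For comparison, here is a minimal sketch of the standard weighted unscented transform for a scalar state, assuming the classic 2n+1 point scheme with a `kappa` parameter (for n = 1, `kappa = 2` gives n + kappa = 3). The function name `unscented_transform_1d`, the control value `u`, and the starting mean and variance are illustrative assumptions.

```python
import numpy as np

def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f using 3 weighted sigma points."""
    n = 1
    spread = np.sqrt((n + kappa) * var)
    points = np.array([mean, mean + spread, mean - spread])
    weights = np.array([kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)])
    y = f(points)
    y_mean = np.sum(weights * y)
    y_var = np.sum(weights * (y - y_mean) ** 2)
    return y_mean, y_var

# One-step CSTR temperature map using the constants from the script above;
# u and the starting temperature distribution are illustrative values.
V, F1, F2 = 1.0, 0.5, 0.3
T1, T2, T_env = 350.0, 200.0, 298.0
dt, reaction_coeff = 0.02, 5e-5

def step(T, u=10.0):
    dT = (F1 * (T1 - T) + F2 * (T2 - T) + u + reaction_coeff * (T - T_env) ** 2) / V
    return T + dt * dT

T_mean, T_var = unscented_transform_1d(300.0, 1.0, step)
print(f"predicted mean = {T_mean:.4f} K, predicted variance = {T_var:.4f}")
```

With these weights the transform reproduces the mean and variance exactly for linear maps, and for a standard normal input it also recovers the mean and variance of a quadratic map, which is the property the equally weighted scheme does not guarantee.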
diff --git a/class07/background_materials/Temperature_Control_LQR_vs_LQG.py b/class07/background_materials/Temperature_Control_LQR_vs_LQG.py
@@ -0,0 +1,115 @@
+import numpy as np
+import matplotlib.pyplot as plt
+
+np.random.seed(0)  # reproducible
+
+# -------------------------------
+# System Definition (1D)
+# -------------------------------
+A = 0.9       # state transition
+B = 0.5       # control input effect
+
+# LQR cost weights (tune Q,R to change aggressiveness)
+Q = 1.0
+R = 0.1
+
+# Simulation parameters
+N = 60
+x0 = 60.0  # initial temperature in °C
+
+# Noise parameters
+process_noise_std = 0    # process noise disabled; set > 0 to add process noise
+measurement_noise_std = 5.0
+
+# Step changes in setpoint
+setpoint = np.zeros(N)
+setpoint[:20] = 60.0    # start at 60°C
+setpoint[20:40] = 80.0  # step to 80°C
+setpoint[40:] = 70.0     # step to 70°C
+
+# -------------------------------
+# Compute LQR gain (discrete-time)
+# -------------------------------
+P = Q
+for _ in range(200):
+    P = Q + A**2 * P - (A * P * B)**2 / (R + B**2 * P)
+K = (B * P * A) / (R + B**2 * P)
+print("LQR gain K =", K)
+
+# Actuator limits (example)
+u_min, u_max = -50.0, 50.0
+
+# -------------------------------
+# Naive LQR (reacts directly to noisy measurements)
+# -------------------------------
+x_lqr = np.zeros(N)
+u_lqr = np.zeros(N)
+x_true_lqr = x0
+y_meas_lqr = np.zeros(N)
+
+for k in range(N):
+    # Measurement
+    y = x_true_lqr + np.random.randn() * measurement_noise_std
+    y_meas_lqr[k] = y
+    
+    # Control reacts directly to noisy measurement error (no estimator)
+    u = -K * (y - setpoint[k])
+    # saturate actuator
+    u = np.clip(u, u_min, u_max)
+    u_lqr[k] = u
+    
+    # State evolves (process noise)
+    x_true_lqr = A * x_true_lqr + B * u + np.random.randn() * process_noise_std
+    x_lqr[k] = x_true_lqr
+
+# -------------------------------
+# LQG (with Kalman filter) -- FIXED: include control in prediction
+# -------------------------------
+x_lqg = np.zeros(N)
+x_hat = x0        # initial estimate = initial temperature
+P_est = 1.0
+u_lqg = np.zeros(N)
+x_true = x0
+y_meas_lqg = np.zeros(N)
+u_prev = 0.0
+
+for k in range(N):
+    # Measurement
+    y = x_true + np.random.randn() * measurement_noise_std
+    y_meas_lqg[k] = y
+    
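+    # Prediction with previous control (skipped at k=0, where x_hat already estimates the measured state)
+    x_pred = A * x_hat + B * u_prev if k > 0 else x_hat
+    P_pred = A**2 * P_est + process_noise_std**2 if k > 0 else P_est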
+    
+    # Kalman update
+    K_kalman = P_pred / (P_pred + measurement_noise_std**2)
+    x_hat = x_pred + K_kalman * (y - x_pred)
+    P_est = (1 - K_kalman) * P_pred
+    
+    # LQR control based on estimated state
+    u = -K * (x_hat - setpoint[k])
+    u = np.clip(u, u_min, u_max)
+    u_lqg[k] = u
+    
+    # Apply control to true system (with process noise)
+    x_true = A * x_true + B * u + np.random.randn() * process_noise_std
+    x_lqg[k] = x_true
+    
+    # Save u for next prediction
+    u_prev = u
+
+# -------------------------------
+# Plot results
+# -------------------------------
+plt.figure(figsize=(12,6))
+plt.plot(x_lqr, label='Naive LQR (reacts to noisy measurements)', color='red')
+plt.plot(x_lqg, label='LQG (state estimated via Kalman filter)', color='blue')
+plt.plot(setpoint, 'k--', label='Setpoint', alpha=0.8)
+plt.plot(y_meas_lqg, 'kx', alpha=0.4, label='Measurements')
+plt.xlabel('Time step')
+plt.ylabel('Temperature (°C)')
+plt.title('LQR vs LQG: Tracking Step Changes in Temperature (fixed Kalman prediction)')
+plt.legend()
+plt.grid(True)
+plt.show()
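The fixed-point Riccati iteration used in Temperature_Control_LQR_vs_LQG.py can be cross-checked against SciPy's discrete algebraic Riccati solver. This is a small sketch, assuming the same scalar `A`, `B`, `Q`, `R` values as the script; the names `K_iter` and `K_dare` are introduced here for illustration.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

A, B, Q, R = 0.9, 0.5, 1.0, 0.1

# Fixed-point iteration of the scalar discrete-time Riccati equation (as in the script)
P = Q
for _ in range(200):
    P = Q + A**2 * P - (A * P * B)**2 / (R + B**2 * P)
K_iter = (B * P * A) / (R + B**2 * P)

# Same quantity from SciPy's solver, which expects 2-D arrays
P_dare = solve_discrete_are(
    np.array([[A]]), np.array([[B]]), np.array([[Q]]), np.array([[R]])
)[0, 0]
K_dare = (B * P_dare * A) / (R + B**2 * P_dare)

print("K from iteration:", K_iter)
print("K from solve_discrete_are:", K_dare)
```

If the two gains disagree, the iteration count or the update formula is the first place to look.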
diff --git a/class07/background_materials/Temperature_Robust_vs_LQG.py b/class07/background_materials/Temperature_Robust_vs_LQG.py
@@ -0,0 +1,89 @@
+# -*- coding: utf-8 -*-
+"""
+LQG vs H∞ demonstration: H∞ clearly better under persistent disturbance
+"""
+
+import numpy as np
+import matplotlib.pyplot as plt
+from scipy.linalg import solve_continuous_are
+
+np.random.seed(42)
+
+# ---------------------------
+# System parameters
+# ---------------------------
+a = 0.3
+b = 1.0
+dt = 0.1
+T = 50
+N = int(T/dt)
+
+# ---------------------------
+# Setpoint profile
+# ---------------------------
+x_set = np.zeros(N)
+x_set[int(N*0.1):int(N*0.4)] = 50.0
+x_set[int(N*0.4):int(N*0.6)] = 20.0
+x_set[int(N*0.6):] = 50.0
+
+# ---------------------------
+# Disturbance: structured + Gaussian
+# ---------------------------
+# Persistent structured disturbance (a sinusoid standing in for a worst-case input)
+w_structured = 2 * np.sin(0.5*np.arange(N))    # structured
+w_noise = 2.0 * np.random.randn(N)               # Gaussian noise (std equal to the sinusoid amplitude)
+w_process = w_structured + w_noise
+
+v_meas = 15 * np.random.randn(N)  # measurement noise
+
+# ---------------------------
+# LQR gain
+# ---------------------------
+Q = 1.0
+R = 0.1
+P = solve_continuous_are(np.array([[-a]]), np.array([[b]]), np.array([[Q]]), np.array([[R]]))
+K_lqr = (b * P / R).item()
+
+# ---------------------------
+# "H∞" gain: a high-gain stand-in (2x the LQR gain), not a synthesized H∞ controller
+# ---------------------------
+K_hinf = K_lqr * 2.0  # aggressive gain to attenuate the disturbance; applied to the noise-free state below
+
+# ---------------------------
+# Simulation
+# ---------------------------
+x_lqg = np.zeros(N)
+x_hat = np.zeros(N)
+x_true = 0.0
+x_hinf = np.zeros(N)
+x_true_hinf = 0.0
+
+for k in range(N-1):
+    # --- LQG ---
+    y = x_true + v_meas[k]
+    u_lqg = -K_lqr*(x_hat[k] - x_set[k])  # control computed from the current estimate
+    L = 1  # fixed observer gain (hand-tuned, not a computed Kalman gain)
+    # Propagate the estimate to the next step (model prediction + measurement correction)
+    x_hat[k+1] = x_hat[k] + dt*(-a*(x_hat[k]-x_set[k]) + b*u_lqg + L*(y - x_hat[k]))
+    # True system, recorded so the plot shows the actual state rather than the estimate
+    x_true = x_true + dt*(-a*(x_true - x_set[k]) + b*u_lqg + w_process[k])
+    x_lqg[k+1] = x_true
+
+    # --- H∞ ---
+    u_hinf = -K_hinf*(x_true_hinf - x_set[k])
+    x_true_hinf = x_true_hinf + dt*(-a*(x_true_hinf - x_set[k]) + b*u_hinf + w_process[k])
+    x_hinf[k+1] = x_true_hinf
+
+# ---------------------------
+# Plot results
+# ---------------------------
+plt.figure(figsize=(12,6))
+plt.plot(x_set, 'k--', label='Setpoint')
+plt.plot(x_lqg, label='LQG')
+plt.plot(x_hinf, label='H∞ Controller')
+plt.xlabel('Time step')
+plt.ylabel('Temperature / deviation')
+plt.title('LQG vs H∞: Persistent disturbance scenario')
+plt.legend()
+plt.grid(True)
+plt.show()
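The observer gain `L = 1` in Temperature_Robust_vs_LQG.py is hand-picked. A steady-state Kalman-Bucy gain for the scalar model can instead be obtained from the filter Riccati equation, which `solve_continuous_are` solves by duality. The sketch below rests on several assumptions that are not in the script: the sinusoidal disturbance is ignored, and the per-sample noise variances are converted to continuous-time intensities by multiplying by `dt` (a common approximation). The names `W`, `V_int`, and `L_kalman` are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

a, dt = 0.3, 0.1
w_std, v_std = 2.0, 15.0   # per-sample noise standard deviations used in the script

# Assumed continuous-time noise intensities (variance * dt approximation)
W = w_std**2 * dt          # process noise intensity
V_int = v_std**2 * dt      # measurement noise intensity

# Filter ARE for dx = -a x + w, y = x + v:  2(-a)P - P^2 / V_int + W = 0
# By duality this is the control ARE with A -> A^T and B -> C^T (here C = 1).
P = solve_continuous_are(
    np.array([[-a]]), np.array([[1.0]]), np.array([[W]]), np.array([[V_int]])
)[0, 0]
L_kalman = P / V_int

print("Steady-state error variance P:", P)
print("Kalman-Bucy gain L:", L_kalman)
```

Under these assumptions the computed gain comes out well below 1, which suggests the hand-picked observer trusts the noisy measurement more than the stated noise levels would justify; the conclusion depends on the assumed intensities.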