lectures/mccall_q.md
5 additions & 5 deletions
@@ -85,7 +85,7 @@ key = jax.random.PRNGKey(123)
 jax.config.update('jax_platform_name', 'cpu')
 ```
 
-## Review of McCall Model
+## Review of McCall model
 
 We begin by reviewing the McCall model described in {doc}`this quantecon lecture <mccall_model>`.
 
@@ -239,7 +239,7 @@ We'll use this value function as a benchmark later after we have done some Q-lea
 print(valfunc_VFI)
 ```
 
-## Implied Quality Function $Q$
+## Implied quality function $Q$
 
 
 A **quality function** $Q$ map state-action pairs into optimal values.
@@ -313,7 +313,7 @@ $$
 
 +++
 
-## From Probabilities to Samples
+## From probabilities to samples
 
 We noted above that the optimal Q function for our McCall worker satisfies the Bellman equations
 
@@ -735,7 +735,7 @@ The above graphs indicates that
 
 * the quality of approximation to the "true" value function computed by value function iteration improves for longer epochs
 
-## Employed Worker Can't Quit
+## Employed worker can't quit
 
 
 The preceding version of temporal difference Q-learning described in equation system {eq}`eq:old4` lets an employed worker quit, i.e., reject her wage as an incumbent and instead receive unemployment compensation this period
@@ -775,7 +775,7 @@ We illustrate these possibilities with the following code and graph.