
Commit a59b26b

Committed Oct 7, 2024
Tom's Oct 7 edits of calvo_ML lecture
1 parent fc15b0e commit a59b26b

File tree

2 files changed: +76 -59 lines changed
 

‎lectures/_static/quant-econ.bib

+8

@@ -3,6 +3,14 @@
Note: Extended Information (like abstracts, doi, url's etc.) can be found in quant-econ-extendedinfo.bib file in _static/
###

+@article{sargent1973stability,
+ title={The stability of models of money and growth with perfect foresight},
+ author={Sargent, Thomas J and Wallace, Neil},
+ journal={Econometrica: Journal of the Econometric Society},
+ pages={1043--1048},
+ year={1973},
+ publisher={JSTOR}
+}


@book{Shannon_1949,

‎lectures/calvo_machine_learn.md

+68 -59
@@ -15,82 +15,82 @@ kernelspec:

## Introduction

-This lecture studies a problem that we study from another angle in this quantecon lecture
-{doc}`calvo`.
-
-Both lectures compute a Ramsey plan for a version of a model of Calvo {cite}`Calvo1978`.
+This lecture uses what we call a ``machine learning`` approach to
+compute a Ramsey plan for a version of a model of Calvo {cite}`Calvo1978`.

+We use another approach to compute a Ramsey plan for Calvo's model in another quantecon lecture
+{doc}`calvo`.

The {doc}`calvo` lecture uses an analytic approach based on ``dynamic programming squared`` to guide computations.

Dynamic programming squared provides information about the structure of mathematical objects in terms of which a Ramsey plan can be represented recursively.

-That paves the way to computing a Ramsey plan efficiently.
+Using that information paves the way to computing a Ramsey plan efficiently.

-Included in the structural information that dynamic programming squared provides in quantecon lecture {doc}`calvo` are descriptions of
+Included in the structural information that dynamic programming squared provides in quantecon lecture {doc}`calvo` are

* a **state** variable that confronts a continuation Ramsey planner, and
* two **Bellman equations**
  * one that describes the behavior of the representative agent
  * another that describes decision problems of a Ramsey planner and of a continuation Ramsey planner

-In this lecture, we approach the Ramsey planner in a less sophisticated way.
-
-We proceed without knowing the mathematical structure imparted by dynamic programming squared.
+In this lecture, we approach the Ramsey planner in a less sophisticated way that proceeds without knowing the mathematical structure imparted by dynamic programming squared.

-Instead, we use a brute force approach that simply chooses a pair of infinite sequences of real numbers that maximizes a Ramsey planner's objective function.
+We simply choose a pair of infinite sequences of real numbers that maximizes a Ramsey planner's objective function.

The pair consists of

* a sequence $\vec \theta$ of inflation rates
* a sequence $\vec \mu$ of money growth rates

-Because it fails to take advantage of the structure recognized by dynamic programming squared and instead proliferates parameters, we take the liberty of calling this a **machine learning** approach.
+Because it fails to take advantage of the structure recognized by dynamic programming squared and, relative to the dynamic programming squared approach, proliferates parameters, we take the liberty of calling this a **machine learning** approach.

This is similar to what other machine learning algorithms do.

Comparing the calculations in this lecture with those in our sister lecture {doc}`calvo` provides us
with a laboratory that can help us appreciate promises and limits of machine learning approaches
more generally.

-We'll actually deploy two machine learning approaches.
+In this lecture, we'll actually deploy two machine learning approaches.

* the first is really lazy
-  * it just writes a Python function to computes the Ramsey planner's objective as a function of a money growth rate sequence and then hands it over to a gradient descent optimizer
+  * it writes a Python function that computes the Ramsey planner's objective as a function of a money growth rate sequence and hands it over to a ``gradient descent`` optimizer
* the second is less lazy
-  * it exerts the effort required to express the Ramsey planner's objective as an affine quadratic form in $\vec \mu$, computes first-order conditions for an optimum, arranges them into a system of simultaneous linear equations for $\vec \mu$ and then $\vec \theta$, then solves them.
+  * it exerts the mental effort required to express the Ramsey planner's objective as an affine quadratic form in $\vec \mu$, computes first-order conditions for an optimum, arranges them into a system of simultaneous linear equations for $\vec \mu$ and then $\vec \theta$, then solves them.

-While both of these machine learning (ML) approaches succeed in recovering the Ramsey plan that we also compute in quantecon lecture {doc}`calvo` by using dynamic programming squared, they don't reveal the recursive structure of the Ramsey plan described in that lecture.
+Each of these machine learning (ML) approaches recovers the same Ramsey plan that we shall compute in quantecon lecture {doc}`calvo` by using dynamic programming squared.

-That recursive structure lies hidden within some of the objects calculated by our ML approach.
+However, they conceal the recursive structure of the Ramsey plan.

-We can ferret out some of that structure if we ask the right questions.
+That recursive structure lies hidden within some of the objects calculated by our ML approaches.

-At the end of this lecture we describe some of those questions are and how they can be answered by running particular linear regressions on components of
-$\vec \mu, \vec \theta$.
+Nevertheless, we can ferret out some of that structure by asking the right questions.

-Human intelligence, not the artificial intelligence deployed in our machine learning approach, is a key input into choosing which regressions to run.
+We pose those questions at the end of this lecture and answer them by running some linear regressions on components of $\vec \mu, \vec \theta$.
+
+Human intelligence, not the ``artificial intelligence`` deployed in our machine learning approach, is a key input into choosing which regressions to run.

## The Model

We study a linear-quadratic version of a model that Guillermo Calvo {cite}`Calvo1978` used to illustrate the **time inconsistency** of optimal government plans.

-The model focuses attention on intertemporal tradeoffs between
+The model focuses on intertemporal tradeoffs between

-- utility that a representative agent's anticipations of future deflation generate by lowering the costs of holding real money balances and thereby increasing the agent's *liquidity*, as measured by holdings of real money balances, and
-- social costs associated with the distorting taxes that a government levies to acquire the paper money that it destroys in order to generate anticipated deflation
+- utility that a representative agent's anticipations of future deflation deliver by lowering the agent's cost of holding real money balances and thereby increasing the agent's *liquidity*, as ultimately measured by the agent's holdings of real money balances, and
+- social costs associated with the distorting taxes that a government levies to acquire the paper money that it destroys in order to generate prospective deflation

The model features

- rational expectations
- costly government actions at all dates $t \geq 1$ that increase household utilities at dates before $t$

-The model combines ideas from papers by Cagan {cite}`Cagan` and Calvo {cite}`Calvo1978`.
+The model combines ideas from papers by Cagan {cite}`Cagan`, Sargent and Wallace {cite}`sargent1973stability`, and Calvo {cite}`Calvo1978`.

@@ -190,7 +190,7 @@ it is $-\frac{u_1}{u_2 \alpha}$.

Via equation {eq}`eq_grad_old3`, a government plan
$\vec \mu = \{\mu_t \}_{t=0}^\infty$ leads to a
-sequence of inflation outcomes
+sequence of inflation rates
$\vec \theta = \{ \theta_t \}_{t=0}^\infty$.

We assume that the government incurs social costs $\frac{c}{2} \mu_t^2$ at
@@ -215,7 +215,27 @@ where $\beta \in (0,1)$ is a discount factor.

The Ramsey planner chooses
a vector of money growth rates $\vec \mu$
-to maximize criterion {eq}`eq:RamseyV` subject to equations {eq}`eq_grad_old3`.
+to maximize criterion {eq}`eq:RamseyV` subject to equations {eq}`eq_grad_old3` and a restriction
+requiring that
+
+$$
+\vec \theta \in L^2
+$$ (eq:thetainL2)
+
+Notice that equations {eq}`eq_grad_old3` and {eq}`eq:thetainL2` imply that $\vec \theta$ is a function
+of $\vec \mu$.
+
+In particular, the inflation rate $\theta_t$ satisfies
+
+$$
+\theta_t = (1-\lambda) \sum_{j=0}^\infty \lambda^j \mu_{t+j}, \quad t \geq 0
+$$ (eq:inflation101)
+
+where
+
+$$
+\lambda = \frac{\alpha}{1+\alpha} .
+$$
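
Formula {eq}`eq:inflation101` is easy to evaluate in code once $\vec \mu$ is constant at some $\bar \mu$ beyond a truncation date $T$, because the infinite geometric sum then has a closed tail. The sketch below is our illustration; the function `theta_path` and its truncation convention are not the lecture's own code.

```python
import numpy as np

def theta_path(μ, μ_bar, α=1.0):
    """θ_t = (1-λ) Σ_j λ^j μ_{t+j} for a plan with μ_t = μ_bar from date T on."""
    λ = α / (1 + α)
    T = len(μ)
    θ = np.empty(T)
    for t in range(T):
        # finite head of the geometric sum ...
        head = (1 - λ) * sum(λ**j * μ[t + j] for j in range(T - t))
        # ... plus the tail, which collapses to λ^(T-t) μ_bar
        θ[t] = head + λ ** (T - t) * μ_bar
    return θ

# sanity check: a constant plan μ_t = 0.1 implies θ_t = 0.1 at every date
print(theta_path(np.full(5, 0.1), μ_bar=0.1))
```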
@@ -226,7 +246,7 @@ to maximize criterion {eq}`eq:RamseyV` subject to equations {eq}`eq_grad_old3`.

## Parameters and Variables

-**Parameters** are
+**Parameters:**

* Demand for money parameter is $\alpha > 0$; we set its default value $\alpha = 1$

@@ -241,7 +261,7 @@ to maximize criterion {eq}`eq:RamseyV` subject to equations {eq}`eq_grad_old3`.

-**Variables** are
+**Variables:**

* $\theta_t = p_{t+1} - p_t$ where $p_t$ is log of price level
@@ -289,20 +309,9 @@ h_2 & = - \frac{u_2 \alpha^2}{2}
\end{aligned}
$$

-The inflation rate $\theta_t$ satisfies
-
-$$
-\theta_t = (1-\lambda) \sum_{j=0}^\infty \lambda^j \mu_{t+j}, \quad t \geq 0
-$$ (eq:inflation101)
-
-where
-
-$$
-\lambda = \frac{\alpha}{1+\alpha}
-$$

A Ramsey planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue`
-subject to equation {eq}`eq:inflation101`.
+subject to equations {eq}`eq:inflation101`.

A solution $\vec \mu$ of this problem is called a **Ramsey plan**.
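
For the reader's convenience, here is a compact restatement of the Ramsey problem implied by the objects above; the per-period payoff matches the formula for $\tilde V$ shown below, with $h_0, h_1, h_2$ as just defined:

$$
\max_{\vec \mu} \sum_{t=0}^\infty \beta^t \left( h_0 + h_1 \theta_t + h_2 \theta_t^2 - \frac{c}{2} \mu_t^2 \right)
\quad \text{subject to} \quad
\theta_t = (1-\lambda) \sum_{j=0}^\infty \lambda^j \mu_{t+j} .
$$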
@@ -361,8 +370,8 @@ for $t=0, 1, \ldots, T-1$ and $\bar \theta = \bar \mu$.

**Formula for $V$**

-Having computed the truncated vectors $\tilde \mu$ and $\tilde \theta$
-as described above, we want to write a function that computes
+Having specified a truncated vector $\tilde \mu$ and having computed $\tilde \theta$
+by using formula {eq}`eq:thetaformula102`, we want to write a Python function that computes

$$
\tilde V = \sum_{t=0}^\infty \beta^t (
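
One way to implement such a function is sketched below; the truncation convention (the last entry of $\tilde \mu$ holds forever from date $T$ on) and the expressions for $h_0, h_1, h_2$ are our reading of the formulas above, and the lecture's own `compute_V` may differ in details.

```python
import jax.numpy as jnp

def compute_V_sketch(μ, β=0.85, c=2, u0=1, u1=0.5, u2=3, α=1):
    """Criterion Ṽ for a plan whose entries cover dates 0,...,T-1 and
    whose last entry μ[-1] = μ̄ is assumed to hold forever after."""
    λ = α / (1 + α)
    h0, h1, h2 = u0, -u1 * α, -u2 * α**2 / 2   # from substituting m - p = -α θ
    T, μ_bar = len(μ), μ[-1]

    def payoff(θ, m):
        return h0 + h1 * θ + h2 * θ**2 - (c / 2) * m**2

    # θ_t from the truncated version of (eq:inflation101), with θ̄ = μ̄
    θ = jnp.array([(1 - λ) * sum(λ**j * μ[t + j] for j in range(T - t))
                   + λ ** (T - t) * μ_bar for t in range(T)])
    head = jnp.sum(β ** jnp.arange(T) * payoff(θ, μ))
    tail = β**T / (1 - β) * payoff(μ_bar, μ_bar)   # constant payoff from T on
    return head + tail
```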
@@ -381,7 +390,7 @@ where $\tilde \theta_t, \ t = 0, 1, \ldots , T-1$ satisfies formula (1).

## A Gradient Descent Algorithm

-We now describe code that maximizes the criterion function {eq}`eq:Ramseyvalue` by choice of the truncated vector $\tilde \mu$.
+We now describe code that maximizes the criterion function {eq}`eq:Ramseyvalue` subject to equations {eq}`eq:inflation101` by choice of the truncated vector $\tilde \mu$.

We use a brute force or ``machine learning`` approach that just hands our problem off to code that minimizes $-V$ with respect to the components of $\tilde \mu$ by using gradient descent.
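
The `adam_optimizer` helper that appears later in this diff lives in a hidden cell. A minimal stand-in built on `jax.grad` and `optax` could look like the sketch below; the use of `optax` is our assumption, chosen because the surrounding code already runs on JAX.

```python
import jax
import jax.numpy as jnp
import optax

def adam_optimizer_sketch(grad_fn, μ_init, lr=0.1, steps=500):
    """Minimal Adam loop; grad_fn should return ∂(-V)/∂μ, so each
    step descends -V, i.e. ascends V."""
    opt = optax.adam(lr)
    opt_state = opt.init(μ_init)
    μ = μ_init
    for _ in range(steps):
        updates, opt_state = opt.update(grad_fn(μ), opt_state)
        μ = optax.apply_updates(μ, updates)
    return μ

# usage with the compute_V sketch from above:
# grad_V = jax.grad(lambda μ: -compute_V_sketch(μ, β=0.85, c=2))
# μ_star = adam_optimizer_sketch(grad_V, jnp.zeros(40))
```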
@@ -413,7 +422,7 @@ import matplotlib.pyplot as plt

We'll eventually want to compare the results we obtain here to those obtained in this quantecon lecture {doc}`calvo`.

-To enable us to do that, we copy the class `ChangLQ` that we used in that lecture.
+To enable us to do that, we copy the class `ChangLQ` used in that lecture.

We hide the cell that copies the class, but readers can find details of the class in this quantecon lecture {doc}`calvo`.
@@ -680,9 +689,9 @@ compute_V(clq.μ_series, β=0.85, c=2)

We take a brief detour to solve a restricted version of the Ramsey problem defined above.

-First, recall that a Ramsey planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue`subject to equation {eq}`eq:inflation101`.
+First, recall that a Ramsey planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue` subject to equations {eq}`eq:inflation101`.

-We now define a distinct problem in which the planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue`subject to equation {eq}`eq:inflation101` and
+We now define a distinct problem in which the planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue` subject to equation {eq}`eq:inflation101` and
the additional restriction that $\mu_t = \bar \mu$ for all $t$.

The solution of this problem is a time-invariant $\mu_t$ that this quantecon lecture {doc}`calvo` calls $\mu^{CR}$.
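
Before handing this restricted problem to a numerical optimizer, it is worth noting that it has a closed form against which a numerical answer can be checked; the derivation below is ours, using the payoff coefficients $h_0, h_1, h_2$ defined earlier. Setting $\mu_t = \bar \mu$ for all $t$ makes $\theta_t = \bar \mu$ by {eq}`eq:inflation101`, so that

$$
V(\bar \mu) = \frac{1}{1-\beta} \left( h_0 + h_1 \bar \mu + \left( h_2 - \frac{c}{2} \right) \bar \mu^2 \right),
\qquad
\mu^{CR} = \frac{h_1}{c - 2 h_2} .
$$

With the default parameters $u_1 = 0.5$, $u_2 = 3$, $\alpha = 1$, $c = 2$, this gives $\mu^{CR} = -0.1$, a number the optimizer output below can be checked against.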
@@ -701,7 +710,7 @@ optimized_μ_CR = adam_optimizer(grad_V, μ_init)
print(f"optimized μ = \n{optimized_μ_CR}")
```

-Compare it to $\mu^{CR}$ in {doc}`calvo`, we again obtained very close answers.
+Comparing it to $\mu^{CR}$ in {doc}`calvo`, we again obtained very close answers.

```{code-cell} ipython3
np.linalg.norm(clq.μ_CR - optimized_μ_CR)
@@ -718,7 +727,7 @@ compute_V(jnp.array([clq.μ_CR]), β=0.85, c=2)

## A More Structured ML Algorithm

-By thinking a little harder about the mathematical structure of the Ramsey problem and using some linear algebra, we can simplify the problem that we hand over to a ``machine learning`` algorithm.
+By thinking about the mathematical structure of the Ramsey problem and using some linear algebra, we can simplify the problem that we hand over to a ``machine learning`` algorithm.

We start by recalling the Ramsey problem that chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue` subject to equation {eq}`eq:inflation101`.
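
The simplification mentioned here can be sketched directly: stack the truncated version of {eq}`eq:inflation101` into a matrix equation $\vec \theta = B \vec \mu$, write the criterion as an affine quadratic form in $\vec \mu$, and solve the linear first-order conditions. The construction below is our illustration of that idea, not the lecture's hidden code:

```python
import numpy as np

def solve_ramsey_sketch(T=40, β=0.85, c=2, u0=1, u1=0.5, u2=3, α=1):
    λ = α / (1 + α)
    h1, h2 = -u1 * α, -u2 * α**2 / 2           # payoff coefficients from above
    # B maps μ into θ; its last column absorbs the tail where μ_t stays constant
    B = np.zeros((T, T))
    for t in range(T):
        for j in range(t, T - 1):
            B[t, j] = (1 - λ) * λ ** (j - t)
        B[t, T - 1] = λ ** (T - 1 - t)
    # discount weights; the last date stands in for the whole infinite tail
    w = β ** np.arange(T, dtype=float)
    w[-1] /= 1 - β
    W = np.diag(w)
    # V(μ) = const + g·μ + (1/2) μ'Gμ, so the FOC is the linear system Gμ = -g
    G = 2 * (h2 * B.T @ W @ B - (c / 2) * W)   # negative definite, so a maximum
    g = h1 * B.T @ w
    μ = np.linalg.solve(G, -g)
    return μ, B @ μ                            # Ramsey μ and the implied θ
```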
@@ -1027,9 +1036,9 @@ print(f'deviation = {np.linalg.norm(closed_grad - (- grad_J(jnp.ones(T))))}')

## Some Exploratory Regressions

-To help us learn something about the structure of the Ramsey plan, we compute some least squares linear regressions of some components of $\vec \theta$ and $\vec \mu$ on others.
+We compute some least squares linear regressions of some components of $\vec \theta$ and $\vec \mu$ on others.

-Our hope is that these regressions will reveal structure hidden within the $\vec \mu^R, \vec \theta^R$ sequences associated with a Ramsey plan.
+We hope that these regressions will reveal structure hidden within the $\vec \mu^R, \vec \theta^R$ sequences associated with a Ramsey plan.

It is worth pausing to think about roles being played here by **human** intelligence and **artificial** intelligence.
@@ -1066,8 +1075,8 @@ plt.legend()
plt.show()
```

-Note that $\theta_t$ is less than $\mu_t$ for low $t$'s, but that it eventually converges to
-the same limit $\bar \mu$ that $\mu_t$ does.
+Note that while $\theta_t$ is less than $\mu_t$ for low $t$'s, it eventually converges to
+the limit $\bar \mu$ of $\mu_t$ as $t \rightarrow +\infty$.

This pattern reflects how formula {eq}`eq_grad_old3` makes $\theta_t$ be a weighted average of future $\mu_t$'s.
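
To see the weighted-average property concretely, we can reuse the illustrative `theta_path` function sketched earlier: a monotonically decreasing $\vec \mu$ yields $\theta_t < \mu_t$ at every date, because each $\theta_t$ averages in the lower future values.

```python
μ_decreasing = np.linspace(0.2, 0.01, 20)   # falls toward μ̄ = 0
θ = theta_path(μ_decreasing, μ_bar=0.0)
print(np.all(θ < μ_decreasing))             # True at every date
```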
@@ -1088,13 +1097,13 @@ print("Regression of μ_t on a constant and θ_t:")
print(results1.summary(slim=True))
```

-Our regression tells us that along the Ramsey outcome $\vec \mu, \vec \theta$ the linear function
+Our regression tells us that the affine function

$$
\mu_t = .0645 + 1.5995 \theta_t
$$

-fits perfectly.
+fits perfectly along the Ramsey outcome $\vec \mu, \vec \theta$.

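One way to confirm what a perfect fit means here is to rebuild it by hand. In the sketch below, `θs` and `μs` stand for the Ramsey outcome arrays computed earlier in the lecture (the names are our assumption); an $R^2$ of unity says that the residuals vanish.

```python
import numpy as np
import statsmodels.api as sm

results1 = sm.OLS(μs, sm.add_constant(θs)).fit()   # μ_t on a constant and θ_t
b0, b1 = results1.params
print(results1.rsquared)                # ≈ 1.0
print(np.allclose(μs, b0 + b1 * θs))    # True: μ_t = b0 + b1 θ_t exactly
```
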
```{note}
@@ -1160,7 +1169,7 @@ $\bar \mu, \bar \mu$.

### Continuation Values

-Next, we'll compute a sequence $\{v_t\}_{t=0}^T$ of what we'll call "continuation values" along a Ramsey plan.
+Next, we'll compute a sequence $\{v_t\}_{t=0}^T$ of what we'll call ``continuation values`` along a Ramsey plan.

To do so, we'll start at date $T$ and compute
@@ -1206,7 +1215,7 @@ def compute_vt(μ, β, c, u0=1, u1=0.5, u2=3, α=1):
v_t = compute_vt(μs, β=0.85, c=2)
```

-The initial continuation value $v_0$ should equals the optimized value of the Ramsey planner's criterion $V$ defined
+The initial continuation value $v_0$ should equal the optimized value of the Ramsey planner's criterion $V$ defined
in equation {eq}`eq:RamseyV`.
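
The backward recursion that defines these continuation values is short enough to sketch in full. The version below is our illustration, using the same payoff convention as the earlier `compute_V` sketch; it is not a copy of the lecture's `compute_vt`.

```python
import numpy as np

def continuation_values_sketch(μ, θ, β=0.85, c=2, u0=1, u1=0.5, u2=3, α=1):
    """v_t = payoff_t + β v_{t+1}, seeded at date T with the stationary value."""
    h0, h1, h2 = u0, -u1 * α, -u2 * α**2 / 2
    payoffs = h0 + h1 * θ + h2 * θ**2 - (c / 2) * μ**2
    T = len(μ) - 1
    v = np.empty(T + 1)
    v[T] = payoffs[T] / (1 - β)       # value of staying at (μ̄, θ̄) forever
    for t in range(T - 1, -1, -1):    # work backward from date T
        v[t] = payoffs[t] + β * v[t + 1]
    return v                          # v[0] should match the optimized V
```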
12121221
@@ -1244,7 +1253,7 @@ plt.tight_layout()
plt.show()
```

-Figure {numref}`continuation_values` shows several interesting patterns:
+Figure {numref}`continuation_values` shows interesting patterns:

* The sequence of continuation values $\{v_t\}_{t=0}^T$ is monotonically decreasing
* Evidently, $v_0 > V^{CR} > v_T$ so that
@@ -1372,9 +1381,9 @@

We discovered these relationships by running some carefully chosen regressions and staring at the results, noticing that the $R^2$'s of unity tell us that the fits are perfect.

-We have learned something about the structure of the Ramsey problem.
+We have learned much about the structure of the Ramsey problem.

-However, it is challenging to say more just by using the methods and ideas that we have deployed in this lecture.
+However, by using the methods and ideas that we have deployed in this lecture, it is challenging to say more.

There are many other linear regressions among components of $\vec \mu^R, \vec \theta^R$ that would also have given us perfect fits.
13801389
