lectures/calvo_machine_learn.md (+68 −59)
@@ -15,82 +15,82 @@ kernelspec:
## Introduction

-This lecture studies a problem that we study from another angle in this quantecon lecture
-{doc}`calvo`.
-
-Both lectures compute a Ramsey plan for a version of a model of Calvo {cite}`Calvo1978`.
+This lecture uses what we call a ``machine learning`` approach to
+compute a Ramsey plan for a version of a model of Calvo {cite}`Calvo1978`.

+We use another approach to compute a Ramsey plan for Calvo's model in another quantecon lecture
+{doc}`calvo`.

The {doc}`calvo` lecture uses an analytic approach based on ``dynamic programming squared`` to guide computations.

Dynamic programming squared provides information about the structure of mathematical objects in terms of which a Ramsey plan can be represented recursively.

-That paves the way to computing a Ramsey plan efficiently.
+Using that information paves the way to computing a Ramsey plan efficiently.

-Included in the structural information that dynamic programming squared provides in quantecon lecture {doc}`calvo` are descriptions of
+Included in the structural information that dynamic programming squared provides in quantecon lecture {doc}`calvo` are

* a **state** variable that confronts a continuation Ramsey planner, and
* two **Bellman equations**
  * one that describes the behavior of the representative agent
  * another that describes decision problems of a Ramsey planner and of a continuation Ramsey planner

-In this lecture, we approach the Ramsey planner in a less sophisticated way.
-
-We proceed without knowing the mathematical structure imparted by dynamic programming squared.
+In this lecture, we approach the Ramsey planner in a less sophisticated way that proceeds without knowing the mathematical structure imparted by dynamic programming squared.

-Instead, we use a brute force approach that simply chooses a pair of infinite sequences of real numbers that maximizes a Ramsey planner's objective function.
+We simply choose a pair of infinite sequences of real numbers that maximizes a Ramsey planner's objective function.

The pair consists of

* a sequence $\vec \theta$ of inflation rates
* a sequence $\vec \mu$ of money growth rates

-Because it fails to take advantage of the structure recognized by dynamic programming squared and instead proliferates parameters, we take the liberty of calling this a **machine learning** approach.
+Because it fails to take advantage of the structure recognized by dynamic programming squared and, relative to the dynamic programming squared approach, proliferates parameters, we take the liberty of calling this a **machine learning** approach.

This is similar to what other machine learning algorithms also do.

Comparing the calculations in this lecture with those in our sister lecture {doc}`calvo` provides us
with a laboratory that can help us appreciate promises and limits of machine learning approaches
more generally.

-We'll actually deploy two machine learning approaches.
+In this lecture, we'll actually deploy two machine learning approaches.

* the first is really lazy
-  * it just writes a Python function to computes the Ramsey planner's objective as a function of a money growth rate sequence and then hands it over to a gradient descent optimizer
+  * it writes a Python function that computes the Ramsey planner's objective as a function of a money growth rate sequence and hands it over to a ``gradient descent`` optimizer
* the second is less lazy
-  * it exerts the effort required to express the Ramsey planner's objective as an affine quadratic form in $\vec \mu$, computes first-order conditions for an optimum, arranges them into a system of simultaneous linear equations for $\vec \mu$ and then $\vec \theta$, then solves them.
+  * it exerts the mental effort required to express the Ramsey planner's objective as an affine quadratic form in $\vec \mu$, computes first-order conditions for an optimum, arranges them into a system of simultaneous linear equations for $\vec \mu$ and then $\vec \theta$, then solves them.

-While both of these machine learning (ML) approaches succeed in recovering the Ramsey plan that we also compute in quantecon lecture {doc}`calvo` by using dynamic programming squared, they don't reveal the recursive structure of the Ramsey plan described in that lecture.
+Each of these machine learning (ML) approaches recovers the same Ramsey plan that we shall compute in quantecon lecture {doc}`calvo` by using dynamic programming squared.

-That recursive structure lies hidden within some of the objects calculated by our ML approach.
+However, they conceal the recursive structure of the Ramsey plan.

-We can ferret out some of that structure if we ask the right questions.
+That recursive structure lies hidden within some of the objects calculated by our ML approaches.

-At the end of this lecture we describe some of those questions are and how they can be answered by running particular linear regressions on components of
-$\vec \mu, \vec \theta$.
+Nevertheless, we can ferret out some of that structure by asking the right questions.

-Human intelligence, not the artificial intelligence deployed in our machine learning approach, is a key input into choosing which regressions to run.
+We pose those questions at the end of this lecture and answer them by running some particular linear regressions on components of $\vec \mu, \vec \theta$.
+
+Human intelligence, not the ``artificial intelligence`` deployed in our machine learning approach, is a key input into choosing which regressions to run.

## The Model

We study a linear-quadratic version of a model that Guillermo Calvo {cite}`Calvo1978` used to illustrate the **time inconsistency** of optimal government plans.

-The model focuses attention on intertemporal tradeoffs between
+The model focuses on intertemporal tradeoffs between

-- utility that a representative agent's anticipations of future deflation generate by lowering the costs of holding real money balances and thereby increasing the agent's *liquidity*, as measured by holdings of real money balances, and
-- social costs associated with the distorting taxes that a government levies to acquire the paper money that it destroys in order to generate anticipated deflation
+- utility that a representative agent's anticipations of future deflation deliver by lowering the agent's cost of holding real money balances and thereby increasing the agent's *liquidity*, as ultimately measured by the agent's holdings of real money balances, and
+- social costs associated with the distorting taxes that a government levies to acquire the paper money that it destroys in order to generate prospective deflation

The model features

- rational expectations
- costly government actions at all dates $t \geq 1$ that increase household utilities at dates before $t$

-The model combines ideas from papers by Cagan {cite}`Cagan` and Calvo {cite}`Calvo1978`.
+The model combines ideas from papers by Cagan {cite}`Cagan`, {cite}`sargent1973stability`, and Calvo {cite}`Calvo1978`.
@@ -190,7 +190,7 @@ it is $-\frac{u_1}{u_2 \alpha}$.
Via equation {eq}`eq_grad_old3`, a government plan
$\vec \mu = \{\mu_t \}_{t=0}^\infty$ leads to a
-sequence of inflation outcomes
+sequence of inflation rates
$\vec \theta = \{ \theta_t \}_{t=0}^\infty$.

We assume that the government incurs social costs $\frac{c}{2} \mu_t^2$ at
@@ -215,7 +215,27 @@ where $\beta \in (0,1)$ is a discount factor.
The Ramsey planner chooses
a vector of money growth rates $\vec \mu$
-to maximize criterion {eq}`eq:RamseyV` subject to equations {eq}`eq_grad_old3`.
+to maximize criterion {eq}`eq:RamseyV` subject to equations {eq}`eq_grad_old3` and a restriction
+requiring that
+
+$$
+\vec \theta \in L^2
+$$ (eq:thetainL2)
+
+Notice equations {eq}`eq_grad_old3` and {eq}`eq:thetainL2` imply that $\vec \theta$ is a function
+of $\vec \mu$.
+
+In particular, the inflation rate $\theta_t$ satisfies
+
+$$
+\theta_t = (1-\lambda) \sum_{j=0}^\infty \lambda^j \mu_{t+j}, \quad t \geq 0
+$$ (eq:inflation101)
+
+where
+
+$$
+\lambda = \frac{\alpha}{1+\alpha} .
+$$
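As a quick illustration of formula {eq}`eq:inflation101`, the following is a minimal sketch of the inflation sequence implied by a money growth path that is truncated at horizon $T$ and constant at $\bar \mu$ afterwards (as in the truncated formula the lecture later labels {eq}`eq:thetaformula102`); the function name and signature here are illustrative rather than taken from the lecture's code.

```python
import numpy as np

def compute_theta(mu_tilde, mu_bar, alpha=1.0):
    """theta_t = (1 - lam) * sum_{j>=0} lam**j * mu_{t+j}, lam = alpha/(1+alpha),
    for a path equal to mu_tilde for t < T and constant at mu_bar for t >= T."""
    lam = alpha / (1 + alpha)
    T = len(mu_tilde)
    theta = np.empty(T)
    for t in range(T):
        j = np.arange(T - t)
        head = (1 - lam) * np.sum(lam**j * mu_tilde[t:])   # finite part of the sum
        tail = lam**(T - t) * mu_bar                       # constant tail, summed in closed form
        theta[t] = head + tail
    return theta
```

Passing a constant path `mu_tilde = np.full(T, mu_bar)` returns a constant inflation sequence equal to $\bar \mu$, consistent with $\bar \theta = \bar \mu$ below.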
@@ -226,7 +246,7 @@ to maximize criterion {eq}`eq:RamseyV` subject to equations {eq}`eq_grad_old3`.
## Parameters and Variables

-**Parameters** are
+**Parameters:**

* Demand for money parameter is $\alpha > 0$; we set its default value $\alpha = 1$
@@ -241,7 +261,7 @@ to maximize criterion {eq}`eq:RamseyV` subject to equations {eq}`eq_grad_old3`.
-**Variables** are
+**Variables:**

* $\theta_t = p_{t+1} - p_t$ where $p_t$ is log of price level

-\theta_t = (1-\lambda) \sum_{j=0}^\infty \lambda^j \mu_{t+j}, \quad t \geq 0
-$$ (eq:inflation101)
-
-where
-
-$$
-\lambda = \frac{\alpha}{1+\alpha}
-$$

A Ramsey planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue`
-subject to equation {eq}`eq:inflation101`.
+subject to equations {eq}`eq:inflation101`.

A solution $\vec \mu$ of this problem is called a **Ramsey plan**.
@@ -361,8 +370,8 @@ for $t=0, 1, \ldots, T-1$ and $\bar \theta = \bar \mu$.
**Formula for $V$**

-Having computed the truncated vectors $\tilde \mu$ and $\tilde \theta$
-as described above, we want to write a function that computes
+Having specified a truncated vector $\tilde \mu$ and having computed $\tilde \theta$
+by using formula {eq}`eq:thetaformula102`, we want to write a Python function that computes

$$
\tilde V = \sum_{t=0}^\infty \beta^t (
@@ -381,7 +390,7 @@ where $\tilde \theta_t, \ t = 0, 1, \ldots , T-1$ satisfies formula (1).
## A Gradient Descent Algorithm

-We now describe code that maximizes the criterion function {eq}`eq:Ramseyvalue` by choice of the truncated vector $\tilde \mu$.
+We now describe code that maximizes the criterion function {eq}`eq:Ramseyvalue` subject to equations {eq}`eq:inflation101` by choice of the truncated vector $\tilde \mu$.

We use a brute force or ``machine learning`` approach that just hands our problem off to code that minimizes $V$ with respect to the components of $\tilde \mu$ by using gradient descent.
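A minimal sketch of that lazy approach, assuming a per-period payoff of the linear-quadratic form $h_0 + h_1 \theta_t + h_2 \theta_t^2 - \frac{c}{2}\mu_t^2$: the coefficient values, the choice of JAX, and the plain gradient-ascent loop below are illustrative placeholders, not the lecture's actual implementation.

```python
import jax.numpy as jnp
from jax import grad

# Illustrative placeholder values; the lecture derives the payoff coefficients
# from the model's utility parameters and the cost parameter c.
h0, h1, h2, c, beta, lam = 0.0, 1.0, -1.0, 2.0, 0.95, 0.5

def compute_V(mu, mu_bar):
    """Discounted objective sum_t beta**t * (h0 + h1*theta_t + h2*theta_t**2 - 0.5*c*mu_t**2)
    for a truncated plan, valuing the constant continuation in closed form."""
    T = mu.shape[0]
    # inflation implied by the truncated path, as in the sketch after (eq:inflation101)
    theta = jnp.stack([(1 - lam) * jnp.sum(lam**jnp.arange(T - t) * mu[t:])
                       + lam**(T - t) * mu_bar for t in range(T)])
    payoff = h0 + h1 * theta + h2 * theta**2 - 0.5 * c * mu**2
    head = jnp.sum(beta**jnp.arange(T) * payoff)
    tail = beta**T / (1 - beta) * (h0 + h1 * mu_bar + h2 * mu_bar**2 - 0.5 * c * mu_bar**2)
    return head + tail

# Plain gradient ascent on V (equivalently, gradient descent on -V)
mu_bar, learning_rate = 0.0, 0.01
mu = jnp.zeros(40)
dV = grad(compute_V)            # gradient with respect to the vector mu
for _ in range(2_000):
    mu = mu + learning_rate * dV(mu, mu_bar)
```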
@@ -413,7 +422,7 @@ import matplotlib.pyplot as plt
We'll eventually want to compare the results we obtain here to those obtained in this quantecon lecture {doc}`calvo`.

-To enable us to do that, we copy the class `ChangLQ` that we used in that lecture.
+To enable us to do that, we copy the class `ChangLQ` used in that lecture.

We hide the cell that copies the class, but readers can find details of the class in this quantecon lecture {doc}`calvo`.
We take a brief detour to solve a restricted version of the Ramsey problem defined above.

-First, recall that a Ramsey planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue`subject to equation {eq}`eq:inflation101`.
+First, recall that a Ramsey planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue` subject to equations {eq}`eq:inflation101`.

-We now define a distinct problem in which the planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue`subject to equation {eq}`eq:inflation101` and
+We now define a distinct problem in which the planner chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue` subject to equation {eq}`eq:inflation101` and
the additional restriction that $\mu_t = \bar \mu$ for all $t$.

The solution of this problem is a time-invariant $\mu_t$ that this quantecon lecture {doc}`calvo` calls $\mu^{CR}$.
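A sketch of how this one-parameter restricted problem could be handed to a scalar optimizer; `compute_V`, the horizon `T`, and the bounds are assumptions carried over from the sketch above, not the lecture's code.

```python
import numpy as np
from scipy.optimize import minimize_scalar

T = 40
# Value of a constant plan mu_t = mu_bar for all t; its maximizer is the
# object the lecture calls mu^{CR}.
res = minimize_scalar(lambda mu_bar: -compute_V(np.full(T, mu_bar), mu_bar),
                      bounds=(-1.0, 1.0), method='bounded')
mu_CR = res.x
```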
-By thinking a little harder about the mathematical structure of the Ramsey problem and using some linear algebra, we can simplify the problem that we hand over to a ``machine learning`` algorithm.
+By thinking about the mathematical structure of the Ramsey problem and using some linear algebra, we can simplify the problem that we hand over to a ``machine learning`` algorithm.

We start by recalling the Ramsey problem that chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue` subject to equation {eq}`eq:inflation101`.
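One way to see the simplification is that, for a truncated plan whose last entry stands in for the constant continuation rate, equation {eq}`eq:inflation101` makes $\tilde \theta$ a linear function $\tilde \theta = B \tilde \mu$. Below is a sketch of one possible construction of such a $B$ (not necessarily the lecture's own).

```python
import numpy as np

def theta_matrix(T, alpha=1.0):
    """Build B with theta = B @ mu for a length-T plan whose last entry
    is treated as the constant continuation money growth rate."""
    lam = alpha / (1 + alpha)
    B = np.zeros((T, T))
    for t in range(T):
        for j in range(t, T - 1):
            B[t, j] = (1 - lam) * lam**(j - t)
        # last column absorbs the geometric tail: (1 - lam) * sum_{k >= T-1-t} lam**k
        B[t, T - 1] = lam**(T - 1 - t)
    return B
```

With $\tilde \theta = B \tilde \mu$, the discounted quadratic objective becomes an affine quadratic form in $\tilde \mu$ alone, so its first-order conditions form a linear system that can be solved directly — the "less lazy" approach advertised in the introduction.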
-To help us learn something about the structure of the Ramsey plan, we compute some least squares linear regressions of some components of $\vec \theta$ and $\vec \mu$ on others.
+We compute some least squares linear regressions of some components of $\vec \theta$ and $\vec \mu$ on others.

-Our hope is that these regressions will reveal structure hidden within the $\vec \mu^R, \vec \theta^R$ sequences associated with a Ramsey plan.
+We hope that these regressions will reveal structure hidden within the $\vec \mu^R, \vec \theta^R$ sequences associated with a Ramsey plan.

It is worth pausing to think about roles being played here by **human** intelligence and **artificial** intelligence.
@@ -1066,8 +1075,8 @@ plt.legend()
plt.show()
```

-Note that $\theta_t$ is less than $\mu_t$for low $t$'s, but that it eventually converges to
-the same limit $\bar \mu$ that $\mu_t$ does.
+Note that while $\theta_t$ is less than $\mu_t$ for low $t$'s, it eventually converges to
+the limit $\bar \mu$ of $\mu_t$ as $t \rightarrow +\infty$.

This pattern reflects how formula {eq}`eq_grad_old3` makes $\theta_t$ be a weighted average of future $\mu_t$'s.
@@ -1088,13 +1097,13 @@ print("Regression of μ_t on a constant and θ_t:")
print(results1.summary(slim=True))
```

-Our regression tells us that along the Ramsey outcome $\vec \mu, \vec \theta$ the linear function
+Our regression tells us that the affine function

$$
\mu_t = .0645 + 1.5995 \theta_t
$$

-fits perfectly.
+fits perfectly along the Ramsey outcome $\vec \mu, \vec \theta$.
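That regression was presumably set up along the following lines; the names `theta_R` and `mu_R` for the simulated Ramsey sequences are illustrative guesses, while `results1` and the `slim=True` summary call do appear in the lecture's code.

```python
import statsmodels.api as sm

# Regress mu_t on a constant and theta_t along the simulated Ramsey outcome.
X = sm.add_constant(theta_R)        # theta_R: Ramsey inflation sequence (assumed name)
results1 = sm.OLS(mu_R, X).fit()    # mu_R: Ramsey money growth sequence (assumed name)
print(results1.summary(slim=True))  # an R^2 of 1 indicates the affine fit is exact
```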
```{note}
@@ -1160,7 +1169,7 @@ $\bar \mu, \bar \mu$.
### Continuation Values

-Next, we'll compute a sequence $\{v_t\}_{t=0}^T$ of what we'll call "continuation values" along a Ramsey plan.
+Next, we'll compute a sequence $\{v_t\}_{t=0}^T$ of what we'll call ``continuation values`` along a Ramsey plan.

* The sequence of continuation values $\{v_t\}_{t=0}^T$ is monotonically decreasing
* Evidently, $v_0 > V^{CR} > v_T$ so that
@@ -1372,9 +1381,9 @@ $$
We discovered these relationships by running some carefully chosen regressions and staring at the results, noticing that the $R^2$'s of unity tell us that the fits are perfect.

-We have learned something about the structure of the Ramsey problem.
+We have learned much about the structure of the Ramsey problem.

-However, it is challenging to say more just by using the methods and ideas that we have deployed in this lecture.
+However, by using the methods and ideas that we have deployed in this lecture, it is challenging to say more.

There are many other linear regressions among components of $\vec \mu^R, \vec \theta^R$ that would also have given us perfect fits.