Commit b4d4371

HumphreyYang and Thomas Sargent authored
[calvo_ml] Updates on Visualizations and Regressions (#170)
* Adding a new regression v_t on theta and theta^2
* Tom's Aug 4 edits of Calvo_machine_learning lecture
* update graph for v_t and regression data
* use V^R to distinguish with the criterion V
* Tom's Aug 5 edits of calvo_machine_learning lecture
* update graph with V^CR
* Tom's second Aug 5 edits of calvo_machine_learning lecture

---------

Co-authored-by: Thomas Sargent <thomassargent@pop-os.localdomain>
1 parent f41653d commit b4d4371

File tree: 1 file changed

lectures/calvo_machine_learn.md (+222, -35 lines changed)
@@ -15,7 +15,7 @@ kernelspec:

## Introduction

-This lecture studies a problem that we also study in another quantecon lecture
+This lecture studies a problem that we shall study from another angle in another quantecon lecture
{doc}`calvo`.

That lecture used an analytic approach based on ``dynamic programming squared`` to guide computation of a Ramsey plan in a version of a model of Calvo {cite}`Calvo1978`.
@@ -172,7 +172,7 @@ U(m_t - p_t) = u_0 + u_1 (m_t - p_t) - \frac{u_2}{2} (m_t - p_t)^2, \quad u_0 >
The money demand function {eq}`eq_grad_old1` and the utility function {eq}`eq_grad_old5` imply that

$$
-U(-\alpha \theta_t) = u_1 + u_2 (-\alpha \theta_t) -\frac{u_2}{2}(-\alpha \theta_t)^2 .
+U(-\alpha \theta_t) = u_0 + u_1 (-\alpha \theta_t) -\frac{u_2}{2}(-\alpha \theta_t)^2 .
$$ (eq_grad_old5a)

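As a quick check of {eq}`eq_grad_old5a`, one can substitute $x = -\alpha \theta_t$ into $U(x) = u_0 + u_1 x - \frac{u_2}{2} x^2$ directly; a minimal sketch, assuming the default parameter values ($u_0=1, u_1=0.5, u_2=3, \alpha=1$) that the `compute_vt` cell added later in this diff uses:

```python
# Minimal sketch: check eq_grad_old5a by direct substitution.
# Parameter values match the defaults of compute_vt later in this diff; θ is arbitrary.
u0, u1, u2, α = 1.0, 0.5, 3.0, 1.0
U = lambda x: u0 + u1 * x - (u2 / 2) * x**2

θ = 0.2
lhs = U(-α * θ)
rhs = u0 + u1 * (-α * θ) - (u2 / 2) * (-α * θ)**2   # corrected right-hand side
print(lhs, rhs, abs(lhs - rhs) < 1e-12)
```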
@@ -697,7 +697,8 @@ np.linalg.norm(clq.μ_CR - optimized_μ_CR)
```

```{code-cell} ipython3
-compute_V(optimized_μ_CR, β=0.85, c=2)
+V_CR = compute_V(optimized_μ_CR, β=0.85, c=2)
+V_CR
```

```{code-cell} ipython3
@@ -708,7 +709,14 @@ compute_V(jnp.array([clq.μ_CR]), β=0.85, c=2)

By thinking a little harder about the mathematical structure of the Ramsey problem and using some linear algebra, we can simplify the problem that we hand over to a ``machine learning`` algorithm.

-The idea here is that the Ramsey problem that chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue`subject to equation {eq}`eq:inflation101` is actually a quadratic optimum problem whose solution is characterized by a set of simultaneous linear equations in $\vec \mu$.
+We start by recalling that the Ramsey problem chooses $\vec \mu$ to maximize the government's value function {eq}`eq:Ramseyvalue` subject to equation {eq}`eq:inflation101`.
+
+This is actually an optimization problem with a quadratic objective function and linear constraints.
+
+First-order conditions for this problem are a set of simultaneous linear equations in $\vec \mu$.
+
+If we trust that the second-order conditions for a maximum are also satisfied (they are in our problem),
+we can compute the Ramsey plan by solving these equations for $\vec \mu$.

We'll apply this approach here and compare answers with what we obtained above with the gradient descent approach.

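To see the kind of computation this implies, here is a minimal sketch with a generic quadratic objective; the matrices `Q` and `b` below are randomly generated stand-ins, not the lecture's actual Ramsey criterion:

```python
import numpy as np

# Sketch: maximize V(μ) = -0.5 * μ' Q μ + b' μ with Q symmetric positive definite.
# Q and b are random stand-ins, not the lecture's actual objective.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
Q = A @ A.T + 5 * np.eye(5)      # symmetric positive definite by construction
b = rng.normal(size=5)

# First-order conditions: Q μ = b, a set of simultaneous linear equations in μ.
μ_star = np.linalg.solve(Q, b)

# Second-order conditions hold because -Q is negative definite,
# so μ_star is the unique maximizer; the gradient vanishes there.
print(np.allclose(Q @ μ_star, b))
```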
@@ -933,7 +941,8 @@ print(f'deviation = {np.linalg.norm(optimized_μ - clq.μ_series)}')
```

```{code-cell} ipython3
-compute_V(optimized_μ, β=0.85, c=2)
+V_R = compute_V(optimized_μ, β=0.85, c=2)
+V_R
```

We find that by exploiting more knowledge about the structure of the problem, we can significantly speed up our computation.
@@ -1005,32 +1014,31 @@ closed_grad
print(f'deviation = {np.linalg.norm(closed_grad - (- grad_J(jnp.ones(T))))}')
```

-## Some Regressions
+## Some Exploratory Regressions

To help us learn about the structure of the Ramsey plan, we shall compute some least squares linear regressions of particular components of $\vec \theta$ and $\vec \mu$ on others.

Our hope is that these regressions will reveal structure hidden within the $\vec \mu^R, \vec \theta^R$ sequences associated with a Ramsey plan.

-It is worth pausing here to think about roles played by **human** intelligence and **artificial** intelligence here.
+It is worth pausing here to think about roles being played by **human** intelligence and **artificial** intelligence.

-Artificial intelligence (AI a.k.a. ML) is running the regressions.
+Artificial intelligence, i.e., some Python code and a computer, is running the regressions for us.

-But you can regress anything on anything else.
+But we are free to regress anything on anything else.

-Human intelligence tell us which regressions to run.
+Human intelligence tells us what regressions to run.

-Even more human intelligence is required fully to appreciate what they reveal about the structure of the Ramsey plan.
+Additional inputs of human intelligence will be required fully to appreciate what those regressions reveal about the structure of a Ramsey plan.

```{note}
-At this point, it is worthwhile to read how Chang {cite}`chang1998credible` chose
+When we eventually get around to trying to understand the regressions below, it will be worthwhile to study the reasoning that led Chang {cite}`chang1998credible` to choose
$\theta_t$ as his key state variable.
```


We'll begin by simply plotting the Ramsey plan's $\mu_t$ and $\theta_t$ for $t =0, \ldots, T$ against $t$ in a graph with $t$ on the abscissa axis.

-These are the data that we'll be running some linear least squares regressions on.
-
+These are the data that we'll be running some linear least squares regressions on.

```{code-cell} ipython3
# Compute θ using optimized_μ
@@ -1040,23 +1048,23 @@ These are the data that we'll be running some linear least squares regressions o
# Plot the two sequences
Ts = np.arange(T)

-plt.plot(Ts, μs, label=r'$\mu_t$')
-plt.plot(Ts, θs, label=r'$\theta_t$')
+plt.scatter(Ts, μs, label=r'$\mu_t$', alpha=0.7)
+plt.scatter(Ts, θs, label=r'$\theta_t$', alpha=0.7)
plt.xlabel(r'$t$')
plt.legend()
plt.show()
```

We notice that $\theta_t$ is less than $\mu_t$ for low $t$'s but that it eventually converges to
-the same limit that $\mu_t$ does.
+the same limit $\bar \mu$ that $\mu_t$ does.

-This pattern reflects how formula {eq}`eq_grad_old3` for low $t$'s makes $\theta_t$ makes a weighted average of future $\mu_t$'s.
+This pattern reflects how formula {eq}`eq_grad_old3` makes $\theta_t$ be a weighted average of future $\mu_t$'s.

We begin by regressing $\mu_t$ on a constant and $\theta_t$.

-This might seem strange because, first of all, equation {eq}`eq_grad_old3` asserts that inflation at time $t$ is determined $\{\mu_s\}_{s=t}^\infty$
+This might seem strange because, after all, equation {eq}`eq_grad_old3` asserts that inflation at time $t$ is determined by $\{\mu_s\}_{s=t}^\infty$.

-Nevertheless, we'll run this regression anyway and provide a justification later.
+Nevertheless, we'll run this regression anyway.

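The regression itself lives in a cell that this commit leaves unchanged, so only its first line appears below. A minimal standalone sketch of that computation, assuming `statsmodels` is imported as `sm` (as in the cells added later in this diff) and using a toy $\theta$ sequence in place of the Ramsey one:

```python
import numpy as np
import statsmodels.api as sm

# Sketch of the first regression: μ_t on a constant and θ_t.
# θs and μs below are toy stand-ins for the Ramsey sequences computed in the lecture.
θs = np.linspace(0.05, -0.10, 40)          # placeholder for the Ramsey θ sequence
μs = 0.0645 + 1.5995 * θs                  # placeholder consistent with the fitted line

X1_θ = np.column_stack((np.ones(len(θs)), θs))   # regressors: a constant and θ_t
model1 = sm.OLS(μs, X1_θ)
results1 = model1.fit()
print(results1.params)    # ≈ [.0645, 1.5995], the fitted line reported in the lecture
```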
```{code-cell} ipython3
# First regression: μ_t on a constant and θ_t
@@ -1077,21 +1085,25 @@ $$

fits perfectly.

-Let's plot this function and the points $(\theta_t, \mu_t)$ that lie on it for $t=0, \ldots, T$.

+```{note}
+Of course, this means that a regression of $\theta_t$ on $\mu_t$ and a constant would also fit perfectly.
+```

+Let's plot the regression line $\mu_t = .0645 + 1.5995 \theta_t$ and the points $(\theta_t, \mu_t)$ that lie on it for $t=0, \ldots, T$.

```{code-cell} ipython3
-plt.scatter(θs, μs)
-plt.plot(θs, results1.predict(X1_θ), 'C1', label='$\hat \mu_t$', linestyle='--')
+plt.scatter(θs, μs, label=r'$\mu_t$')
+plt.plot(θs, results1.predict(X1_θ), 'grey', label='$\hat \mu_t$', linestyle='--')
plt.xlabel(r'$\theta_t$')
plt.ylabel(r'$\mu_t$')
plt.legend()
plt.show()
```

-The time $0$ pair $\theta_0, \mu_0$ appears as the point on the upper right.
+The time $0$ pair $(\theta_0, \mu_0)$ appears as the point on the upper right.

-Points for succeeding times appear further and further to the lower left and eventually converge to
-$\bar \mu, \bar \mu$.
+Points $(\theta_t, \mu_t)$ for succeeding times appear further and further to the lower left and eventually converge to $(\bar \mu, \bar \mu)$.


Next, we'll run a linear regression of $\theta_{t+1}$ against $\theta_t$.
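This second regression also sits in an unchanged cell that the diff does not show; the plotting cell changed below refers to `θ_t`, `θ_t1`, `X2_θ`, and `results2`, which that cell presumably builds along the following lines (a sketch, again with a toy $\theta$ sequence standing in for the Ramsey one):

```python
import numpy as np
import statsmodels.api as sm

# Sketch of the second regression: θ_{t+1} on a constant and θ_t.
# A toy geometric sequence stands in for the Ramsey θ sequence computed in the lecture.
θs = 0.05 * 0.4 ** np.arange(40) - 0.10           # placeholder, not the lecture's data

θ_t  = θs[:-1]                                    # θ_t for t = 0, ..., T-2
θ_t1 = θs[1:]                                     # θ_{t+1}

X2_θ = np.column_stack((np.ones(len(θ_t)), θ_t))  # regressors: a constant and θ_t
model2 = sm.OLS(θ_t1, X2_θ)
results2 = model2.fit()
print(results2.params)    # on the lecture's data these are ≈ [-.0645, .4005]
```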
@@ -1122,8 +1134,8 @@ that prevails along the Ramsey outcome for inflation.
Let's plot $\theta_t$ for $t =0, 1, \ldots, T$ along the line.

```{code-cell} ipython3
-plt.scatter(θ_t, θ_t1)
-plt.plot(θ_t, results2.predict(X2_θ), color='C1', label='$\hat θ_t$', linestyle='--')
+plt.scatter(θ_t, θ_t1, label=r'$\theta_{t+1}$')
+plt.plot(θ_t, results2.predict(X2_θ), color='grey', label='$\hat θ_{t+1}$', linestyle='--')
plt.xlabel(r'$\theta_t$')
plt.ylabel(r'$\theta_{t+1}$')
plt.legend()
@@ -1135,8 +1147,174 @@ plt.show()

Points for succeeding times appear further and further to the lower left and eventually converge to
$\bar \mu, \bar \mu$.
+### Continuation Values

+Next, we'll compute a sequence $\{v_t\}_{t=0}^T$ of what we'll call "continuation values" along a Ramsey plan.

+To do so, we'll start at date $T$, by which time $\mu_t$ and $\theta_t$ have essentially converged to $\bar \mu$, and compute

-### What has machine learning taught us?
+$$
+v_T = \frac{1}{1-\beta} s(\bar \mu, \bar \mu).
+$$

+Then starting from $t=T-1$, we'll iterate backwards on the recursion

+$$
+v_t = s(\theta_t, \mu_t) + \beta v_{t+1}
+$$

+for $t= T-1, T-2, \ldots, 0.$

+```{code-cell} ipython3
+# Define function for s and U in section 41.3
+def s(θ, μ, u0, u1, u2, α, c):
+    U = lambda x: u0 + u1 * x - (u2 / 2) * x**2
+    return U(-α*θ) - (c / 2) * μ**2

+# Calculate v_t sequence backward
+def compute_vt(μ, β, c, u0=1, u1=0.5, u2=3, α=1):
+    T = len(μ)
+    θ = compute_θ(μ, α)

+    v_t = np.zeros(T)
+    μ_bar = μ[-1]

+    # Reduce parameters
+    s_p = lambda θ, μ: s(θ, μ,
+                         u0=u0, u1=u1, u2=u2, α=α, c=c)

+    # Define v_T
+    v_t[T-1] = (1 / (1 - β)) * s_p(μ_bar, μ_bar)

+    # Backward iteration
+    for t in reversed(range(T-1)):
+        v_t[t] = s_p(θ[t], μ[t]) + β * v_t[t+1]

+    return v_t

+v_t = compute_vt(μs, β=0.85, c=2)
+```

+The initial continuation value $v_0$ should equal the optimized value of the Ramsey planner's criterion $V$ defined
+in equation {eq}`eq:RamseyV`.

+Indeed, we find that the deviation is very small:

+```{code-cell} ipython3
+print(f'deviation = {np.linalg.norm(v_t[0] - V_R)}')
+```

+We can also verify approximate equality by inspecting a graph of $v_t$ against $t$ for $t=0, \ldots, T$ along with the value attained by a restricted Ramsey planner $V^{CR}$ and the optimized value of the ordinary Ramsey planner $V^R$.

+```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Continuation values
+    name: continuation_values
+---
+# Plot the scatter plot
+plt.scatter(Ts, v_t, label='$v_t$')

+# Plot horizontal lines
+plt.axhline(V_CR, color='C1', alpha=0.5)
+plt.axhline(V_R, color='C2', alpha=0.5)

+# Add labels
+plt.text(max(Ts) + max(Ts)*0.07, V_CR, '$V^{CR}$', color='C1',
+         va='center', clip_on=False, fontsize=15)
+plt.text(max(Ts) + max(Ts)*0.07, V_R, '$V^R$', color='C2',
+         va='center', clip_on=False, fontsize=15)
+plt.xlabel(r'$t$')
+plt.ylabel(r'$v_t$')

+plt.tight_layout()
+plt.show()
+```

+Figure {numref}`continuation_values` shows several striking patterns:

+* The sequence of continuation values $\{v_t\}_{t=0}^T$ is monotonically decreasing
+* Evidently, $v_0 > V^{CR} > v_T$ so that
+  * the value $v_0$ of the ordinary Ramsey plan exceeds the value $V^{CR}$ of the special Ramsey plan in which the planner is constrained to set $\mu_t = \mu^{CR}$ for all $t$.
+  * the continuation value $v_T$ of the ordinary Ramsey plan for $t \geq T$ is constant and is less than the value $V^{CR}$ of the special Ramsey plan in which the planner is constrained to set $\mu_t = \mu^{CR}$ for all $t$

+```{note}
+The continuation value $v_T$ is what some researchers call the "value of a Ramsey plan under a
+timeless perspective." A more descriptive phrase is "the value of the worst continuation Ramsey plan."
+```

+Next we ask Python to regress $v_t$ against a constant, $\theta_t$, and $\theta_t^2$.

+$$
+v_t = g_0 + g_1 \theta_t + g_2 \theta_t^2 .
+$$

+```{code-cell} ipython3
+# Third regression: v_t on a constant, θ_t and θ^2_t
+X3_θ = np.column_stack((np.ones(T), θs, θs**2))
+model3 = sm.OLS(v_t, X3_θ)
+results3 = model3.fit()

+# Print regression summary
+print("\nRegression of v_t on a constant, θ_t and θ^2_t:")
+print(results3.summary(slim=True))
+```

+The regression has an $R^2$ equal to $1$ and so fits perfectly.

+However, notice the warning about the high condition number.

+As indicated in the printout, this is a consequence of
+$\theta_t$ and $\theta_t^2$ being highly correlated along the Ramsey plan.

+```{code-cell} ipython3
+np.corrcoef(θs, θs**2)
+```

+Let's plot $v_t$ against $\theta_t$ along with the nonlinear regression line.

+```{code-cell} ipython3
+θ_grid = np.linspace(min(θs), max(θs), 100)
+X3_grid = np.column_stack((np.ones(len(θ_grid)), θ_grid, θ_grid**2))

+plt.scatter(θs, v_t)
+plt.plot(θ_grid, results3.predict(X3_grid), color='grey',
+         label='$\hat v_t$', linestyle='--')
+plt.axhline(V_CR, color='C1', alpha=0.5)

+plt.text(max(θ_grid) - max(θ_grid)*0.025, V_CR, '$V^{CR}$', color='C1',
+         va='center', clip_on=False, fontsize=15)

+plt.xlabel(r'$\theta_{t}$')
+plt.ylabel(r'$v_t$')
+plt.legend()

+plt.tight_layout()
+plt.show()
+```

+The highest continuation value $v_0$ at $t=0$ appears at the peak of the quadratic function
+$g_0 + g_1 \theta_t + g_2 \theta_t^2$.

+Subsequent values of $v_t$ for $t \geq 1$ appear to the lower left of the pair $(\theta_0, v_0)$ and converge monotonically from above to $v_T$ at time $T$.

+The value $V^{CR}$ attained by the Ramsey plan that is restricted to be a constant $\mu_t = \mu^{CR}$ sequence appears as a horizontal line.

+Evidently, continuation values $v_t > V^{CR}$ for $t=0, 1, 2$ while $v_t < V^{CR}$ for $t \geq 3$.


+## What has Machine Learning Taught Us?

Our regressions tell us that along the Ramsey outcome $\vec \mu^R, \vec \theta^R$, the linear function
@@ -1145,10 +1323,14 @@
\mu_t = .0645 + 1.5995 \theta_t
$$

-fits perfectly and that so does the regression line
+fits perfectly and that so do the regression lines

+$$
+\theta_{t+1} = - .0645 + .4005 \theta_t
+$$

$$
-\theta_{t+1} = - .0645 + .4005 \theta_t .
+v_t = 6.8052 - .7580 \theta_t - 4.6991 \theta_t^2.
$$

@@ -1170,12 +1352,18 @@ that along a Ramsey plan, the following relationships prevail:

where the initial value $\theta_0^R$ was computed along with other components of $\vec \mu^R, \vec \theta^R$ when we computed the Ramsey plan, and where $b_0, b_1, d_0, d_1$ are parameters whose values we estimated with our regressions.

+In addition, we learned that continuation values are described by the quadratic function

+$$
+v_t = g_0 + g_1 \theta_t + g_2 \theta_t^2
+$$

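Putting the three fitted relationships together gives a compact way to reproduce paths like the Ramsey ones: iterate $\theta_{t+1} = d_0 + d_1 \theta_t$ from a starting value and read off $\mu_t$ and $v_t$ from the fitted functions. In the sketch below the coefficients are the regression estimates reported above, while the starting value `θ0` is an illustrative placeholder for $\theta_0^R$, which comes out of the Ramsey computation itself:

```python
import numpy as np

# Regression coefficients quoted in the text.
b0, b1 = 0.0645, 1.5995                  # μ_t     = b0 + b1 θ_t
d0, d1 = -0.0645, 0.4005                 # θ_{t+1} = d0 + d1 θ_t
g0, g1, g2 = 6.8052, -0.7580, -4.6991    # v_t     = g0 + g1 θ_t + g2 θ_t^2

def simulate_fitted_paths(θ0, T=40):
    """Iterate the fitted θ recursion and read μ_t and v_t off the fitted functions."""
    θ = np.empty(T)
    θ[0] = θ0
    for t in range(T - 1):
        θ[t + 1] = d0 + d1 * θ[t]
    μ = b0 + b1 * θ
    v = g0 + g1 * θ + g2 * θ**2
    return θ, μ, v

# θ0 = 0.05 is a placeholder; in the lecture it would be θ_0^R from the Ramsey plan.
θ, μ, v = simulate_fitted_paths(θ0=0.05)
print(θ[-1], μ[-1])   # both paths settle down to the same limit, consistent with θ̄ = μ̄
```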

-We discovered this representation by running some carefully chosen regressions and staring at the results, noticing that the $R^2$ of unity tell us that the fits are perfect.
+We discovered these relationships by running some carefully chosen regressions and staring at the results, noticing that the $R^2$'s of unity tell us that the fits are perfect.

We have learned something about the structure of the Ramsey problem.

-But it is challenging to say more just by using the methods and ideas that we have deployed in this lecture.
+However, it is challenging to say more just by using the methods and ideas that we have deployed in this lecture.

There are many other linear regressions among components of $\vec \mu^R, \vec \theta^R$ that would also have given us perfect fits.

@@ -1187,7 +1375,7 @@ After all, the Ramsey planner chooses $\vec \mu$, while $\vec \theta$ is an o

Isn't it more natural then to expect that we'd learn more about the structure of the Ramsey problem from a regression of components of $\vec \theta$ on components of $\vec \mu$?

-To answer such questions, we'll have to deploy more economic theory.
+To answer these questions, we'll have to deploy more economic theory.

We do that in this quantecon lecture {doc}`calvo`.

@@ -1204,7 +1392,6 @@ and the parameters $d_0, d_1$ in the updating rule for $\theta_{t+1}$ in represe

First, we'll again use ``ChangLQ`` to compute these objects (along with a number of others).

-
```{code-cell} ipython3
clq = ChangLQ(β=0.85, c=2, T=T)
```
