Housekeeping...
# Classic libraries
from __future__ import division
import numpy as np
import sympy as sym
# Plotting libraries
import matplotlib.pyplot as plt
import seaborn as sn # library for pretty plots. See https://stanford.edu/~mwaskom/software/seaborn/
sn.set_style('whitegrid')
cp = sn.xkcd_rgb["pinkish"]
cb = "#3498db"
cr = "#1fa774"
# Specialist functions
from scipy.optimize import minimize, root, fsolve
from scipy.optimize import leastsq, least_squares
from scipy.interpolate import interp1d
from scipy.special import erf
# Core functions - solve the problem and plot the results
import capital
# Other extensions
%matplotlib inline
%load_ext autoreload
%autoreload 2
Loading data from Adamopoulos and Restuccia (2014)
CalUS = np.loadtxt('model_data.csv', delimiter=',').T
CalP = np.loadtxt('model_data_p.csv', delimiter=',').T
y_grid, y_prob, farm_grid, k_grid, output_m = CalUS
y_grid_p, y_prob_p, farm_grid_p, k_grid_p, output_m_p = CalP
The original A&R paper's production function is:
$$ p_aA\kappa(\eta(k)^{\rho} + (1 -\eta)(yl))^{\rho})^{\frac{\gamma}{\rho}} $$Where:
The rest are parameters that are calibrated to match the farm size distribution of the US and the capital to land ratio.
E&K introduce two more variables:
Adding heterogenity in land quality ($x$) to the A&R model requires a different solution method.
The optimal choice of $k$ can be calculated as a function of the other variables so that the problem can be solved as an initial value problem (IVP) in the standard E&K way. That is, finding $k^*$ such that $$k^*(x,y,l,r) = \mathrm{argmax}_{k}F(x,y,l,r,k)-rRk$$
There is a number of ways of introducing these $x$ and $r$, of which 5 were considered. Below is the list of all tehm, with their accompaning first order condiction (FOC) for solving for $k^*(x,y,l,r)$.
# | name | function | FOC |
---|---|---|---|
0 | $\color{red}{x}$ with $l/r$ | $ rp_aA\kappa(\eta(k)^{1/4} + (1 -\eta)(y(l/r)^{\color{red}{x}})^{1/4})^{2} - rRk$ | $ K^{3} + -\frac{Ap_a\kappa\eta^{2}}{2R} K + \frac{Ap_a\kappa\eta(1 -\eta)(y(\frac{l}{r})^{x})^{1/4}}{2R} = 0$ |
1 | $\color{red}{x}$ with $A$ | $ rp_a\color{red}{x}A\kappa [\eta (k)^{1/4} + (1-\eta)(y(l/r))^{1/4} ]^2 - rRk $ | $K^3 - \frac{xAp_a\kappa\eta^2}{2R}K - \frac{xAp_a\kappa\eta(1-\eta)(y\frac{l}{r})^{1/4}}{2R}= 0 $ |
2 | $\color{red}{x}$ with $k$ | $ rp_aA\kappa [\eta (k\color{red}{x})^{1/4} + (1-\eta)(y(l/r))^{1/4} ]^2 - rRk \color{red}{x} $ | $K^3 - \frac{Ap_a\kappa\eta^2 x^{-1/2}}{2R}K - \frac{Ap_a\kappa\eta(1-\eta)(y\frac{l}{r})^{1/4} x^{-3/4}}{2R}= 0 $ |
3 | $\color{red}{x}$ with $k$ (out of the bracket) | $ rp_aA\kappa [\eta \color{red}{x} k^{1/4} + (1-\eta)(y(l/r))^{1/4} ]^2 - rRk \color{red}{x} $ | $K^3 - \frac{Ap_a\kappa\eta^2 x^2}{2R}K - \frac{Ap_a\kappa\eta(1-\eta)(y\frac{l}{r})^{1/4} x}{2R}= 0 $ |
4 | $k^{\color{red}{x}}$ | $ rp_aA\kappa(\eta(k)^{\color{red}{x}/4} + (1 -\eta)(y(l/r))^{1/4})^{2} - rRk$ | None |
Note: $K = k^{1/4}$
For the case of $k^{\color{red}{x}}$, there is no closed-form solution and $k^*$ has to be solved numerically. Solving the model with this especification renders the shooting solver I use unstable, therefore I do not use it. However, capital.py
includes a function to carry this approximation out.
For the rest of the specifications, $k^*$ is found by solving a cubic root. This is done in section 2. capital.py
approximates the solution with cubic or quadratic "tensor" polynomials - that is, polynomials with cross-products. This solves a multiple root problem and gives an easily differentiable expression (tht would make the solver work faster).
Once $k^*$ has been found, I proceed to solve for the baseline case where $x=1$ (section 3) and then increase the spread up to $x\in[0.75,1.25]$ (section 4). I use a shooting algorithm with logaritmic steps.
I chose to give $x$ a lognormal distribution (with mean 1) because is the one that makes the results move closest to the data - that is, small farms increasing size, but not much. I have left the code here to try other distributions and spreads.
Finally, in section 5 I provide the symolic derivatives to the production function to show that with A&R parameters, PAM holds for any combination of $x,y,l,r$ and $k$.
In the past, I calculated the optimal choice of $k$ for different functional forms - that's why I creaded dictionaries with the keys being the different functional forms the production function could take. I have keep it that way to bear in mind that I chose function (2), but you can solve for $k$ for any of the four different production functions.
I also have different $k^*$ approximations depending on the maximum spread of $x$ (here [0.90,1.10] and [0.75,1.25]). Shorter spreads lead to more accurate results.
Rich country (US calibration):
k_stars25 = {}
k_stars10 = {}
k_stars25[2] = capital.get_k("R", ftype=2,fchoice=1,plot=False, save_in="Rkpams25_{}".format(2), x_range=[0.75,1.25])
k_stars10[2] = capital.get_k("R", ftype=2,fchoice=0,plot=False, save_in="Rkpams{}".format(2), x_range=[0.9,1.1])
Poor country:
k_starsP25 = {}
k_starsP10 = {}
k_starsP25[2] = capital.get_k("P", ftype=2,fchoice=1,plot=False, save_in="Pkpams25_{}".format(2), x_range=[0.75,1.25])
k_starsP10[2] = capital.get_k("P", ftype=2,fchoice=0,plot=False, save_in="Pkpams{}".format(2), x_range=[0.9,1.1])
These $k$s will be feed to the solver.
capital.py
solves for no spread of $x$ first - the default values of the spread of $x$ being $(0.99999,1.00001)$.
sol_R1 = capital.solve("R", k_stars10[2], ftype=2,assort='positive',scaling_x=0.90607341901223637)
sol_P1 = capital.solve("P", k_starsP10[2], ftype=2,assort='positive',scaling_x=0.90607341901223637)
Note: In order for the solver to match the lowest values of $y$ (0.000003) capital.solve
uses an equivalent production function
This does not alter the results, but requires some care when dealing with the results: farm size ($\theta = \frac{l}{r}$) coming from the solver would be the inverse of model $\theta$, and in the code below you'll find xs
refering to farmer skill ($y$ in the model).
Histogram comparing with the original paper (Figure 7 in the appendix):
capital.plot_histogram(sol_R1,sol_P1,CalUS,CalP,save_in='Figure7.pdf')
For comparison later on, I solve for increasing the spread evenly (uniform distribution) for different ranges of $x$:
sol_R10 = {}
sol_P10 = {}
x_ranges = [(0.975,1.025),(0.95,1.05),(0.925,1.075),(0.9,1.1)]
for i in range(len(x_ranges)):
sol_R10[i] = capital.solve("R", k_stars10[2], ftype=2,assort='positive',x_range=x_ranges[i],scaling_x=0.90630885374008696)
sol_P10[i] = capital.solve("P", k_starsP10[2], ftype=2,assort='positive',x_range=x_ranges[i],scaling_x=0.90613690136320324)
print "Done with range {}".format(x_ranges[i])
_modules = [{'ImmutableMatrix': np.array, 'erf': erf, 'sqrt': np.sqrt}, 'numpy']
x, loc2, mu2, sigma2 = sym.var('x, loc2, mu2, sigma2')
productivity_cdf = (0.5 + 0.5 * sym.erf((sym.log(x - loc2) - mu2) / sym.sqrt(2 * sigma2**2)))
productivity_params62 = {'loc2': 0.0, 'mu2': 0.00, 'sigma2': 0.20}
pdf62 = productivity_cdf.diff(x).subs(productivity_params62)
pdf_exe62 = sym.lambdify(x, pdf62, 'numpy')
cdf_exe62 = sym.lambdify(x, productivity_cdf.subs(productivity_params62), _modules)
Bimodal
sig1, sig2, x, mu1, mu2= sym.var("sig1, sig2, x, mu1, mu2")
c1 = 0.5 * (1 + sym.erf((x-mu1)/ sym.sqrt(2 * sig1**2)))
c2 = 0.5 * (1 + sym.erf((x-mu2)/ sym.sqrt(2 * sig2**2)))
cdf = c1*0.5+c2*0.5
params72 = {'mu1': 0.93333, 'mu2': 1.06667, 'sig1': 0.033,'sig2': 0.033}
pdf72 = cdf.diff(x).subs(params72)
cdf72 = cdf.subs(params72)
pdf_exe72 = sym.lambdify(x, pdf72, _modules)
cdf_exe72 = sym.lambdify(x, cdf72, _modules)
Normal
params92 = {'mu1': 1.0, 'sig1': 0.06666 }
pdf92 = c1.diff(x).subs(params92)
cdf92 = c1.subs(params92)
pdf_exe92 = sym.lambdify(x, pdf92, _modules)
cdf_exe92 = sym.lambdify(x, cdf92, _modules)
Normalising constants - so total ammount of land integrates to 1.
norm72 = 1/(cdf_exe72(1.1)-cdf_exe72(0.9))
norm92 = 1/(cdf_exe92(1.1)-cdf_exe92(0.9))
norm62 = 1/(cdf_exe62(1.1)-cdf_exe62(0.9))
c72 = cdf_exe72(1.1)*norm72-1
c92= cdf_exe92(1.1)*norm92-1
c62= cdf_exe92(1.1)*norm62-1
Plots of the different distributions
xss = np.linspace(0.9,1.1,6000)
plt.figure(figsize=(6,6))
sn.set_palette('colorblind', 5)
plt.axvline(1.0,label="x=1", c= 'black')
plt.plot(xss,5*np.ones(len(xss)), label="Uniform")
plt.plot(xss,pdf_exe72(xss)*norm72, label='Bimodal')
plt.plot(xss,pdf_exe62(xss)*norm62, label="Lognormal")
plt.xlim(0.9,1.1)
plt.title("Pdf of land quality", fontsize=16)
plt.plot(xss,pdf_exe92(xss)*norm92, label='Normal')
plt.tight_layout()
plt.legend(loc='best')
plt.show()
The distribution that used in the paper is the Lognormal (code 26xx). This distribution delivers the results closer to the data provided in Adamopoulos and Restuccia (2014).
# Execute this cell to store results
sol_log = {}
sol_logP = {}
Lognormal
sol_log[2605] = capital.solve("R", k_stars10[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.95,1.05],scaling_x=norm62*1.7611607186330467, normconst=c62)
sol_logP[2605]= capital.solve("P", k_starsP10[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.95,1.05],scaling_x=norm62*1.7611607186330467, normconst=c62)
sol_log[2610] = capital.solve("R", k_stars10[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.9,1.1],scaling_x=norm62*0.9061110680375417, normconst=c62)
sol_logP[2610]= capital.solve("P", k_starsP10[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.9,1.1],scaling_x=norm62*0.9061110680375417, normconst=c62)
sol_log[2615] = capital.solve("R", k_stars25[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.85,1.15],scaling_x=1.6494247873450758, normconst=c62)
sol_logP[2615]= capital.solve("P", k_starsP25[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.85,1.15],scaling_x=1.6494247873450758, normconst=c62)
sol_log[2620] = capital.solve("R", k_stars25[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.8,1.20],scaling_x=1.3196670116025528, normconst=c62)
sol_logP[2620]= capital.solve("P", k_starsP25[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.8,1.20],scaling_x=1.3196670116025528, normconst=c62)
sol_log[2625] = capital.solve("R", k_stars25[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.75,1.25],scaling_x=1.1434977494042775, normconst=c62)
sol_logP[2625]= capital.solve("P", k_starsP25[2], ftype=2,assort='positive',spread='lognorm',dispams=[0.00,0.2],
x_range=[0.75,1.25],scaling_x=1.1434977494042775, normconst=c62)
solsR = (sol_R1,sol_log[2605],sol_log[2610],sol_log[2615],sol_log[2620],sol_log[2625])
solsP = (sol_P1,sol_logP[2605],sol_logP[2610],sol_logP[2615],sol_logP[2620],sol_logP[2625])
labs = ('$x=0$','$x\in[0.95,1.05]$ ','$x\in[0.90,1.10]$','$x\in[0.85,1.15]$','$x\in[0.80,1.20]$','$x\in[0.75,1.25]$')
In logs:
capital.plot_land_choice(solsR,labs,solsP,labs, color=True, logs=True)
In levels:
capital.plot_land_choice(solsR,labs,solsP,labs, color=True, logs=False)
Focusing on the poor country (Figure 1 in the paper):
capital.plot_land_choice(solsP,labs,None,None, color=True, logs=False,save_in='Figure1.pdf')
$k/l$ ratios
These show the $k/l$ ratio normalized by the largest farm. That is, is $k/l=100$ for the largest farm, the $k/l$ ratio of the rest is divided by 100.
Rinch country on the left, poor country on the right.
capital.plot_kl_ratio(solsR,labs,solsP,labs,save_in=None,color=True)
Distribution plots
Rich country:
capital.plot_dischange(sol_R1,sol_log[2620],extras=[sol_log[2610],(0.9,1.1)],xrange=(0.8,1.2),
save_in=None,allplots=False,color=True, uc=cr)
Poor country:
capital.plot_dischange(sol_P1,sol_logP[2620],extras=[sol_logP[2610],(0.9,1.1)],xrange=(0.8,1.2),
save_in=None, allplots=False,color=True,uc=cp)
Does the PAM condition hold? Here I check for the PAM condition, so if lhs-rhs > 0, then PAM holds.
First without substituiting parameters:
x, y, k = sym.var('x, y, k')
R, l, r, A, kappa, p_a, rho, gamma, eta = sym.var('R, l, r, A, kappa, p_a, rho, gamma, eta')
sym.init_printing()
F = r*p_a*A*kappa*(eta*(x*k)**rho + (1- eta)*((l/r)*y)**rho)**(gamma/rho)
lhs = F.diff(x,y)*F.diff(l,r)*F.diff(k,k) - F.diff(x,y)*F.diff(l,k)*F.diff(r,k) - F.diff(x,k)*F.diff(y,k)*F.diff(l,r)
rhs = F.diff(x,r)*F.diff(y,l)*F.diff(k,k) - F.diff(x,r)*F.diff(y,k)*F.diff(l,k) - F.diff(x,k)*F.diff(y,l)*F.diff(r,k)
sym.simplify(lhs-rhs)
Then substituiting all the parameters fixed across countries:
eta = 0.89
rho = 0.25
gamma = 0.5
kappa = 1
F = r*p_a*A*kappa*(eta*(x*k)**rho + (1- eta)*((l/r)*y)**rho)**(gamma/rho)
lhs = F.diff(x,y)*F.diff(l,r)*F.diff(k,k) - F.diff(x,y)*F.diff(l,k)*F.diff(r,k) - F.diff(x,k)*F.diff(y,k)*F.diff(l,r)
rhs = F.diff(x,r)*F.diff(y,l)*F.diff(k,k) - F.diff(x,r)*F.diff(y,k)*F.diff(l,k) - F.diff(x,k)*F.diff(y,l)*F.diff(r,k)
sym.simplify(lhs-rhs)