From f88db7e34aa253ff5c4fa32c98a7d25a5cee5886 Mon Sep 17 00:00:00 2001
From: John Stachurski <john.stachurski@gmail.com>
Date: Wed, 27 Nov 2019 10:48:13 +1100
Subject: [PATCH 1/3] getting started

---
 source/_static/quant-econ.bib    |   7 +
 source/rst/index_intro_dynam.rst |   1 +
 source/rst/kesten_processes.rst  | 353 +++++++++++++++++++++++++++++++
 3 files changed, 361 insertions(+)
 create mode 100644 source/rst/kesten_processes.rst
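
Note on Exercise 1: the parameter values in the exercise are still
placeholders, so what follows is only a minimal sketch of how the
GARCH(1, 1) recursion in garch11v and garch11r might be simulated. The
function name ``garch_returns`` and the values chosen for ``alpha_0``,
``alpha_1`` and ``beta`` are hypothetical and serve only as an illustration;
they are not the values intended for the exercise.

```python
import numpy as np

def garch_returns(alpha_0, alpha_1, beta, sigma_0=0.0,
                  ts_length=15 * 250, seed=None):
    """Simulate returns from a GARCH(1, 1) model (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(ts_length)     # shocks driving volatility
    zeta = rng.standard_normal(ts_length)   # shocks driving returns
    sigma2 = sigma_0**2                     # current value of sigma_t^2
    r = np.empty(ts_length)
    for t in range(ts_length):
        # r_t = sigma_t * zeta_{t+1}
        r[t] = np.sqrt(sigma2) * zeta[t]
        # sigma_{t+1}^2 = alpha_0 + sigma_t^2 * (alpha_1 * xi_{t+1}^2 + beta)
        sigma2 = alpha_0 + sigma2 * (alpha_1 * xi[t]**2 + beta)
    return r

# Hypothetical parameter values, chosen only for illustration
returns = garch_returns(alpha_0=1e-5, alpha_1=0.1, beta=0.85, seed=0)
```

Plotting ``returns`` with ``ax.plot(returns, alpha=0.7)`` gives a series that
can be compared visually with the Nasdaq plot in the lecture.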

diff --git a/source/_static/quant-econ.bib b/source/_static/quant-econ.bib
index 22b09b19..408c647a 100644
--- a/source/_static/quant-econ.bib
+++ b/source/_static/quant-econ.bib
@@ -2,6 +2,13 @@
 QuantEcon Bibliography File used in conjuction with sphinxcontrib-bibtex package
 Note: Extended Information (like abstracts, doi, url's etc.) can be found in quant-econ-extendedinfo.bib file in _static/
 ###
+@book{buraczewski2016stochastic,
+    title={Stochastic models with power-law tails},
+    author={Buraczewski, Dariusz and Damek, Ewa and Mikosch, Thomas and others},
+    year={2016},
+    publisher={Springer}
+}
+
 
 @inproceedings{nishiyama2004estimation,
     title={Estimation and testing for rank size rule regression under pareto
diff --git a/source/rst/index_intro_dynam.rst b/source/rst/index_intro_dynam.rst
index d005726b..c72aa192 100644
--- a/source/rst/index_intro_dynam.rst
+++ b/source/rst/index_intro_dynam.rst
@@ -25,6 +25,7 @@ agents as given.  Later we will look at full equilibrium problems.
     finite_markov
     linear_models
     samuelson
+    kesten_processes
     stationary_densities
     cass_koopmans
     kalman
diff --git a/source/rst/kesten_processes.rst b/source/rst/kesten_processes.rst
new file mode 100644
index 00000000..bf786dc8
--- /dev/null
+++ b/source/rst/kesten_processes.rst
@@ -0,0 +1,353 @@
+.. include:: /_static/includes/header.raw
+
+.. highlight:: python3
+
+**********************************
+Kesten Processes and Firm Dynamics
+**********************************
+
+.. index::
+    single: Linear State Space Models
+
+.. contents:: :depth: 2
+
+In addition to what's in Anaconda, this lecture will need the following libraries:
+
+.. code-block:: ipython
+  :class: hide-output
+
+  !pip install --upgrade quantecon
+  !pip install --upgrade yfinance
+
+
+Overview
+========
+
+:doc:`Previously <linear_models>` we learned about linear stochastic processes
+(linear state space models).
+
+Now we turn to a class of models that linear apart from the fact that the
+multiplicative component is stochastic.  
+
+Such processes are known as Kesten processes after German--American mathematician Harry Kesten (1931--2019)
+
+Although simple to write down, Kesten processes are important for two reasons.
+
+1. A number of significant economic processes are or can be described as Kesten processes.
+
+2. Kesten processes generate interesting dynamics, including, in some cases, heavy-tailed cross-sectional distributions.
+
+
+Let's start with some imports:
+
+.. code-block:: ipython
+
+    import numpy as np
+    import matplotlib.pyplot as plt
+    %matplotlib inline
+
+
+Kesten Processes
+=================
+
+.. index::
+    single: Kesten processes; heavy tails
+
+
+Kesten processes are stochastic processes of the form
+
+.. math::
+    :label: kesproc
+
+    x_{t+1} = a_{t+1} x_t + \eta_{t+1}
+    \quad \text{with } x_0 \text{ given}
+
+We will focus on the nonnegative scalar case, where :math:`x_t` takes values
+in :math:`\mathbb R_+`
+
+In particular, we will assume that 
+
+* the initial condition :math:`x_0` is nonnegative,
+
+* :math:`\{a_t\}_{t \geq 1}` is a nonnegative iid stochastic process and
+
+* :math:`\{\eta_t\}_{t \geq 1}` is another nonnegative iid stochastic process, independent of the first.
+
+
+
+
+Example: GARCH Volatility
+-------------------------
+
+
+The GARCH model is common in financial settings, where time series such as asset returns exhibit time varying volatility.
+
+For example, consider the following plot of daily returns on the Nasdaq
+Composite Index for the period 1st January 2006 to 1st November 2019. 
+
+.. _ndcode:
+
+.. code-block:: python3
+
+    import yfinance as yf
+    import pandas as pd
+
+    s = yf.download('^IXIC', '2006-1-1', '2019-11-1')['Adj Close']
+    
+    r = s.pct_change()
+    
+    fig, ax = plt.subplots()
+    
+    ax.plot(r, alpha=0.7, ms=4)
+    
+    ax.set_ylabel('returns', fontsize=12)
+    ax.set_xlabel('date', fontsize=12)
+    
+    plt.show()
+
+
+Notice how the series exhibits bursts of volatility (high variance) and then
+settles down again.
+
+GARCH models can replicate this feature.
+
+The GARCH(1, 1) volatility process takes the form
+
+.. math::
+    :label: garch11v
+
+    \sigma_{t+1}^2 = \alpha_0 + \sigma_t^2 (\alpha_1 \xi_{t+1}^2 + \beta)
+    
+where :math:`\{\xi_t\}` is iid with :math:`\mathbb E \xi_t^2 = 1` and all parameters are positive.  
+
+Returns on a given asset are then modeled as
+
+.. math::
+    :label: garch11r
+
+    r_t = \sigma_t \zeta_{t+1}
+
+where :math:`\{\zeta_t\}` is again iid and independent of :math:`\{\xi_t\}`.
+
+Notice that the squared volatility sequence :math:`\{\sigma_t^2\}`, which drives the dynamics, is a Kesten process.
+
+
+Example: Wealth Dynamics
+------------------------
+
+Suppose that a given household saves a fixed fraction :math:`s` of its current wealth in every period.
+
+The household earns labor income :math:`y_t` at the start of time :math:`t`.
+
+Wealth then evolves according to 
+
+.. math::
+    :label: wealth_dynam
+
+    w_{t+1} = R_{t+1} s w_t  + y_{t+1}
+    
+where :math:`\{R_t\}` is the gross rate of return on assets.
+
+If :math:`\{R_t\}` and :math:`\{y_t\}` are both iid, then :eq:`wealth_dynam`
+is a Kesten process.
+
+
+Stationarity
+------------
+
+In earlier lectures on :doc:`Markov chains <finite_markov>` and :doc:`linear state space models <linear_models>`, we introduced the notion of a stationary distribution.
+
+In the present context, we can define a stationary distribution as follows:
+
+The distribution :math:`F^*` on :math:`\mathbb R` is called **stationary** for the
+Kesten process :eq:`kesproc` if
+
+.. math::
+    :label: kp_stationary0
+
+    x_t \sim F^* 
+    \quad \implies \quad 
+    a_{t+1} x_t + \eta_{t+1} \sim F^*
+
+In other words, if the current state :math:`x_t` has distribution :math:`F^*`,
+then so does the next period state :math:`x_{t+1}`.
+
+We can write this alternatively as
+
+.. math::
+    :label: kp_stationary
+
+    F^*(y) = \int \mathbb P\{ a_{t+1} x + \eta_{t+1} \leq y\} F^*(dx)
+    \quad \text{for all } y \geq 0.
+
+The right hand side is the distribution of the next period state when the
+current state is drawn from :math:`F^*`.
+
+The equality in :eq:`kp_stationary` states that this distribution is unchanged.
+
+
+Cross-Sectional Interpretation
+------------------------------
+
+There is an important cross-sectional interpretation of stationary distributions, discussed previously but worth repeating here.
+
+Suppose, for example, that we are interested in the wealth distribution --- that is, the current distribution of wealth across households in a given country.
+
+Suppose further that 
+
+* the wealth of each household evolves independently according to
+  :eq:`wealth_dynam`,
+
+* :math:`F^*` is a stationary distribution for this stochastic process and
+
+* there are many households.
+
+Then :math:`F^*` is a steady state for the wealth distribution.
+
+To see this, suppose that :math:`F^*` is the current wealth distribution.
+
+What is the fraction of households with wealth less than :math:`y` next
+period?
+
+To obtain this, we sum the probability that wealth is less than :math:`y` tomorrow given that current wealth is :math:`w`, weighted by the fraction of households with wealth :math:`w`.  
+
+.. If we randomly select a household and update it via :eq:`wealth_dynam`, then, by the definition of stationarity, we draw its new wealth according to :math:`F^*`
+
+Noting that the fraction of households with wealth in interval :math:`dw` is :math:`F^*(dw)`, we get
+
+
+.. math::
+
+    \int \mathbb P\{ R_{t+1} s w  + y_{t+1} \leq y\} F^*(dw)
+
+Since :math:`F^*` is stationary for the wealth process, this is just
+:math:`F^*(y)`.
+
+Hence the fraction of households with wealth in :math:`[0, y]` is the same
+next period as it is this period.
+
+Since :math:`y` was chosen arbitrarily, the distribution is unchanged.
+
+
+Conditions for Stationarity
+---------------------------
+
+The Kesten process :math:`x_{t+1} = a_{t+1} x_t + \eta_{t+1}` does not always
+have a stationary distribution.
+
+For example, if :math:`a_t \equiv \eta_t \equiv 1` for all :math:`t`, then
+:math:`x_t = x_0 + t`, which diverges to infinity.
+
+To prevent this kind of divergence, we require that :math:`\{a_t\}` is
+strictly less than 1 most of the time.
+
+In particular, if 
+
+.. math::
+    :label: kp_stat_cond
+
+    \mathbb E \ln a_t < 0
+    \quad \text{and} \quad
+    \mathbb E \eta_t < \infty
+
+then a unique stationary distribution exists on :math:`\mathbb R_+`.
+
+* See, for example, theorem 2.1.3 of :cite:`buraczewski2016stochastic`, which provides slightly weaker conditions.
+
+As one application of this result, we see that the wealth process
+:eq:`wealth_dynam` will have a unique stationary distribution whenever
+:math:`\mathbb E \ln R_t  + \ln s < 0`.
+
+
+Heavy Tails
+===========
+
+Under certain conditions, the stationary distribution of a Kesten process has
+a Pareto tail.
+
+(See our :doc:`earlier lecture <heavy_tails>`  on heavy-tailed distributions for a discussion of Pareto tails.)
+
+This fact is highly significant for economics because of the prevalence of
+Pareto-tailed distributions.
+
+To state the conditions, we recall that a random variable is called
+**nonarithmetic** if its distribution is not concentrated on :math:`t \mathbb
+Z` for any :math:`t \geq 0`.
+
+For example, any random variable with a density is nonarithmetic.
+
+The famous Kesten--Goldie Theorem (see, e.g.,
+:cite:`buraczewski2016stochastic`, theorem 2.4.4) states that if
+
+1. The stationarity conditions in :eq:`kp_stat_cond` hold,
+
+2. The random variable $a_t$ is positive with probability one and nonarithmetic.
+
+3. :math:`\mathbb P\{a_t x + \eta_t = x\} < 1` for all :math:`x \in \mathbb R_+`.
+
+4. There exists a positive constant :math:`\alpha` such that
+
+.. math::
+
+    \mathbb E a_t^\alpha = 1,
+        \quad
+    \mathbb E \eta_t^\alpha < \infty,
+        \quad \text{and} \quad
+    \mathbb E [a_t^{\alpha+1} ] < \infty
+
+then the stationary distribution of the Kesten process has a Pareto tail with
+tail index :math:`\alpha`.
+
+More precisely, if :math:`F^*` is the unique stationary distribution and :math:`x^* \sim F^*`, then 
+
+.. math::
+
+    \lim_{x \to \infty} x^\alpha \mathbb P\{x^* > x\} = c
+    
+for some positive constant :math:`c`.
+
+
+
+
+Application: Firm Dynamics
+==========================
+
+
+
+.. math::
+    :label: firm_dynam
+
+    s_{t+1} = a_{t+1} s_t + b_{t+1}
+    
+
+If :math:`\{a_t\}` and :math:`\{b_t\}` are both iid, then :eq:`firm_dynam`
+is a Kesten process.
+
+
+
+
+Exercises
+=========
+
+Exercise 1
+----------
+
+Simulate and plot 15 years of daily returns (consider each year as having 250
+working days) from the GARCH(1, 1) process in :eq:`garch11v`--:eq:`garch11r`.
+
+Take :math:`\xi_t` and :math:`\zeta_t` to be independent and standard normal.
+
+Set :math:`\alpha_0 = XX, \alpha_1 = XX, \beta = XX` and :math:`\sigma_0 = 0`.
+    
+Compare visually with the Nasdaq Composite Index returns :ref:`shown above <ndcode>`.
+
+While the time path differs, you should see bursts of high volatility.
+
+
+Solutions
+=========
+
+Exercise 1
+----------
+
+Foobar

From aac85c7109690263d3810237cf61fdc617ca54f8 Mon Sep 17 00:00:00 2001
From: John Stachurski <john.stachurski@gmail.com>
Date: Wed, 27 Nov 2019 15:38:56 +1100
Subject: [PATCH 2/3] misc edits

---
 source/rst/kesten_processes.rst | 98 +++++++++++++++++++++++++++++++--
 1 file changed, 93 insertions(+), 5 deletions(-)
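
Note on the firm dynamics section: under firm_dynam the growth factor
s_{t+1} / s_t equals a_{t+1} + b_{t+1} / s_t, so both its mean and its
dispersion fall as current size s_t rises, whereas under Gibrat's law in
firm_dynam_gb neither depends on size. The sketch below checks this by
simulation. The function name ``growth_stats``, the lognormal shocks and all
parameter values are assumptions made for illustration only.

```python
import numpy as np

def growth_stats(s, n=100_000, mu_a=-0.5, sigma_a=1.0, seed=0):
    """Mean and std of s_{t+1} / s_t given s_t = s (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    a = np.exp(mu_a + sigma_a * rng.standard_normal(n))  # assumed lognormal a
    b = np.exp(rng.standard_normal(n))                   # assumed lognormal b
    growth = (a * s + b) / s     # equals a + b / s
    return growth.mean(), growth.std()

small_mean, small_std = growth_stats(s=1.0)      # a small firm
large_mean, large_std = growth_stats(s=1000.0)   # a large firm
```

In this sketch the small firm has both the higher average growth factor and
the more volatile one, while for the large firm the growth factor is close to
``a`` alone, which is what Gibrat's law predicts.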

diff --git a/source/rst/kesten_processes.rst b/source/rst/kesten_processes.rst
index bf786dc8..1917f23b 100644
--- a/source/rst/kesten_processes.rst
+++ b/source/rst/kesten_processes.rst
@@ -26,8 +26,7 @@ Overview
 :doc:`Previously <linear_models>` we learned about linear stochastic processes
 (linear state space models).
 
-Now we turn to a class of models that linear apart from the fact that the
-multiplicative component is stochastic.  
+Now we generalize these linear models slightly by allowing the multiplicative component to be stochastic.  
 
 Such processes are known as Kesten processes after German--American mathematician Harry Kesten (1931--2019)
 
@@ -38,6 +37,8 @@ Although simple to write down, Kesten processes are important for two reasons.
 2. Kesten processes generate interesting dynamics, including, in some cases, heavy-tailed cross-sectional distributions.
 
 
+We will discuss these issues as we go along.
+
 Let's start with some imports:
 
 .. code-block:: ipython
@@ -54,7 +55,7 @@ Kesten Processes
     single: Kesten processes; heavy tails
 
 
-Kesten processes are stochastic processes of the form
+A **Kesten process** is a stochastic process of the form
 
 .. math::
     :label: kesproc
@@ -62,6 +63,9 @@ Kesten processes are stochastic processes of the form
     x_{t+1} = a_{t+1} x_t + \eta_{t+1}
     \quad \text{with } x_0 \text{ given}
 
+where :math:`\{a_t\}_{t \geq 1}` and :math:`\{\eta_t\}_{t \geq 1}` are iid
+sequences.
+
 We will focus on the nonnegative scalar case, where :math:`x_t` takes values
 in :math:`\mathbb R_+`
 
@@ -270,6 +274,9 @@ a Pareto tail.
 This fact is highly significant for economics because of the prevalence of
 Pareto-tailed distributions.
 
+The Kesten-Goldie Theorem
+-------------------------
+
 To state the conditions, we recall that a random variable is called
 **nonarithmetic** if its distribution is not concentrated on :math:`t \mathbb
 Z` for any :math:`t \geq 0`.
@@ -307,12 +314,53 @@ More precisely, if :math:`F^*` is the unique stationary distribution and :math:`
 for some positive constant :math:`c`.
 
 
+An illustration Using Rank-Size Plots
+--------------------------------------
+
+To be added.
+
+
 
 
 Application: Firm Dynamics
 ==========================
 
 
+As noted in our :doc:`lecture on heavy tails`, for common measures of firm size such as revenue or employment,
+the US firm size distribution exhibits a Pareto tail (see, e.g., :cite:`axtell2001zipf`, :cite:`gabaix2016power`}).
+
+We can explain this rather striking fact using the Kesten--Goldie Theorem.
+
+For starts, it was postulated many years ago that firm size evolves according to
+Gibrat's law of proportional growth (see, e.g., XXX ).
+
+This can be expressed by stating that, for some suitable measure of firm size :math:`s_t`, we have
+
+.. math::
+    :label: firm_dynam_gb
+
+    \frac{s_{t+1}}{s_t} = a_{t+1} 
+    
+where :math:`\{a_t\}` is some positive iid sequence.
+
+In particular, Gibrat's law asserts that the growth rate of individual firms does not depend on firm size.
+
+However, over the last few decades, research contradicting Gibrat's law has
+accumulated in the literature.
+
+For example, it is commonly found that, on average,
+
+1. small firms grow faster than large firms and
+
+2. the growth rate of small firms is more volatile than that of large firms.
+
+See, for example. XXXX.
+
+On the other hand, Gibrat's law is generally agreed to be a reasonable
+approximation for large firms (see, e.g., XXXX).
+
+We can accommodate these empirical findings by modifying :eq:`firm_dynam_gb`
+to
 
 .. math::
     :label: firm_dynam
@@ -320,8 +368,18 @@ Application: Firm Dynamics
     s_{t+1} = a_{t+1} s_t + b_{t+1}
     
 
-If :math:`\{a_t\}` and :math:`\{b_t\}` are both iid, then :eq:`firm_dynam`
-is a Kesten process.
+where :math:`\{a_t\}` and :math:`\{b_t\}` are both iid and independent of each
+other.
+
+In the exercises below you are asked to show that :eq:`firm_dynam` is more
+consistent with the empirical findings than Gibrat's law in
+:eq:`firm_dynam_gb`.
+
+So what has this to do with heavy tails?
+
+The answer is that :eq:`firm_dynam` is a Kesten process.
+
+XX add discussion of conditions, refer to exercise below XX
 
 
 
@@ -344,6 +402,30 @@ Compare visually with the Nasdaq Composite Index returns :ref:`shown above <ndco
 While the time path differs, you should see bursts of high volatility.
 
 
+Exercise 2
+----------
+
+Forecast present discounted value of 10 years of corporate tax revenue, tax
+rate of 15%, 1e6 firms, revenue distribution predicted to follow either
+lognormal or Pareto.  
+
+Lognormal is matched to Pareto by matching the mean and the median.
+
+
+Exercise 3
+----------
+
+
+.. math::
+    :label: firm_dynam_ee
+
+    s_{t+1} = e_{t+1} \mathbb 1 \{s_t < \bar s\}
+        + (a_{t+1} s_t + b_{t+1}) \mathbb 1 \{s_t \geq \bar s\}
+
+Generate rank-size plot.
+
+
+
 Solutions
 =========
 
@@ -351,3 +433,9 @@ Exercise 1
 ----------
 
 Foobar
+
+
+Exercise 2
+----------
+
+Foobar

From 3460df770a96ec796a588c4c4d6558e1373341f5 Mon Sep 17 00:00:00 2001
From: John Stachurski <john.stachurski@gmail.com>
Date: Thu, 28 Nov 2019 11:13:17 +1100
Subject: [PATCH 3/3] various edits

---
 source/_static/quant-econ.bib   |  70 +++++++++++
 source/rst/heavy_tails.rst      | 177 +++++++++++++++++++++++++++-
 source/rst/kesten_processes.rst | 202 +++++++++++++++++++++++++-------
 3 files changed, 402 insertions(+), 47 deletions(-)
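
Note on the Intuition section: when a_t is lognormal, say a_t = exp(mu +
sigma Z) with Z standard normal, the moment E a_t^alpha has the closed form
exp(alpha mu + alpha^2 sigma^2 / 2), so the condition E a_t^alpha = 1 in the
Kesten--Goldie Theorem can be solved by hand to give alpha = -2 mu / sigma^2.
The sketch below applies this to the values mu = -0.5 and sigma = 1.0 used in
the simulation. The function names ``lognormal_moment`` and ``tail_index``
are hypothetical helpers, not part of the lecture code.

```python
import numpy as np

def lognormal_moment(alpha, mu, sigma):
    """E[a^alpha] when a = exp(mu + sigma * Z) and Z is standard normal."""
    return np.exp(alpha * mu + 0.5 * (alpha * sigma)**2)

def tail_index(mu, sigma):
    """Positive solution alpha of E[a^alpha] = 1 for lognormal a."""
    if mu >= 0:
        # E ln a = mu must be negative for a positive solution to exist
        raise ValueError("mu must be negative")
    return -2 * mu / sigma**2

mu, sigma = -0.5, 1.0          # values used in the lecture's simulation
alpha = tail_index(mu, sigma)  # equals 1.0 for these values
```

Since E ln a_t = mu < 0 and every moment of a lognormal random variable is
finite, the remaining moment conditions of the theorem also hold in this
example when b_t is lognormal.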

diff --git a/source/_static/quant-econ.bib b/source/_static/quant-econ.bib
index 408c647a..84ef6ce3 100644
--- a/source/_static/quant-econ.bib
+++ b/source/_static/quant-econ.bib
@@ -2,6 +2,76 @@
 QuantEcon Bibliography File used in conjuction with sphinxcontrib-bibtex package
 Note: Extended Information (like abstracts, doi, url's etc.) can be found in quant-econ-extendedinfo.bib file in _static/
 ###
+
+@techreport{kondo2018us,
+    title={On the US Firm and Establishment Size Distributions},
+    author={Kondo, Illenin and Lewis, Logan T and Stella, Andrea},
+    year={2018},
+    institution={SSRN}
+}
+
+@article{schluter2019size,
+    title={Size distributions reconsidered},
+    author={Schluter, Christian and Trede, Mark},
+    journal={Econometric Reviews},
+    volume={38},
+    number={6},
+    pages={695--710},
+    year={2019},
+    publisher={Taylor \& Francis}
+}
+
+@article{fujiwara2004pareto,
+    title={Do Pareto--Zipf and Gibrat laws hold true? An analysis with
+    European firms},
+    author={Fujiwara, Yoshi and Di Guilmi, Corrado and Aoyama, Hideaki and
+    Gallegati, Mauro and Souma, Wataru},
+    journal={Physica A: Statistical Mechanics and its Applications},
+    volume={335},
+    number={1-2},
+    pages={197--216},
+    year={2004},
+    publisher={Elsevier}
+}
+
+@article{dunne1989growth,
+    title={The growth and failure of US manufacturing plants},
+    author={Dunne, Timothy and Roberts, Mark J and Samuelson, Larry},
+    journal={The Quarterly Journal of Economics},
+    volume={104},
+    number={4},
+    pages={671--698},
+    year={1989},
+    publisher={MIT Press}
+}
+
+@article{hall1987relationship,
+    title={The Relationship Between Firm Size and Firm Growth in the US
+    Manufacturing Sector},
+    author={Hall, Bronwyn H},
+    journal={The Journal of Industrial Economics},
+    pages={583--606},
+    year={1987},
+    publisher={JSTOR}
+}
+
+@article{evans1987relationship,
+    title={The relationship between firm growth, size, and age: Estimates for 100 manufacturing industries},
+    author={Evans, David S},
+    journal={The Journal of Industrial Economics},
+    pages={567--581},
+    year={1987},
+    publisher={JSTOR}
+}
+
+@phdthesis{gibrat1931inegalites,
+    title={Les in{\'e}galit{\'e}s {\'e}conomiques: Applications d'une loi
+           nouvelle, la loi de l'effet proportionnel},
+    author={Gibrat, Robert},
+    year={1931},
+    school={Recueil Sirey}
+}
+
 @book{buraczewski2016stochastic,
     title={Stochastic models with power-law tails},
     author={Buraczewski, Dariusz and Damek, Ewa and Mikosch, Thomas and others},
diff --git a/source/rst/heavy_tails.rst b/source/rst/heavy_tails.rst
index cfd79a01..576fd346 100644
--- a/source/rst/heavy_tails.rst
+++ b/source/rst/heavy_tails.rst
@@ -55,7 +55,7 @@ settings include
 
 * the distribution of city sizes (:cite:`rozenfeld2011area`, :cite:`gabaix2016power`).
 
-These heavy tails turn out to be important for our understanding economic outcomes and their impact.
+These heavy tails turn out to be important for our understanding of economic outcomes.
 
 As one example, the heaviness of the tail in the wealth distribution is one
 natural measure of inequality. 
@@ -341,7 +341,7 @@ for some positive constants :math:`\bar x` and :math:`\alpha`.
 
 It is easy to see that if :math:`X \sim F`, then :math:`\mathbb P\{X > x\}` satisfies :eq:`plrt`.  
 
-Thus, in line with the terminology, a Pareto distributed random variables have a Pareto tail.
+Thus, in line with the terminology, Pareto distributed random variables have a Pareto tail.
 
 
 Rank-Size Plots
@@ -362,11 +362,11 @@ A discussion of why this occurs can be found in :cite:`nishiyama2004estimation`.
 
 The figure below provides one example, using simulated data.
 
-The rank-size plots shows draws from three different distributions: folded normal, chi squared with 1 degree of freedom and Pareto.  
+The rank-size plot shows draws from three different distributions: folded normal, chi-squared with 1 degree of freedom and Pareto.  
 
 In each case, the largest 5\% of 1,000 draws are shown.  
 
-The Pareto sample produces a straight line, while the line produced by the other samples is concave.  
+The Pareto sample produces a straight line, while the lines produced by the other samples are concave.  
 
 .. _rank_size_fig1:
 
@@ -413,6 +413,66 @@ Replicate the rank-size plot figure :ref:`presented above <rank_size_fig1>`.
 Use ``np.random.seed(13)`` to set the seed.
 
 
+Exercise 5
+----------
+
+There is an ongoing argument about whether the firm size distribution should
+be modeled as a Pareto distribution or a lognormal distribution (see, e.g.,
+:cite:`fujiwara2004pareto`, :cite:`kondo2018us` or :cite:`schluter2019size`).
+
+This sounds esoteric but has real implications for a variety of economic
+phenomena.
+
+To illustrate this fact in a simple way, let us consider an economy with
+100,000 firms, an interest rate of ``r = 0.05`` and a corporate tax rate of
+15%.
+
+Your task is to estimate the present discounted value of projected corporate
+tax revenue over the next 10 years.
+
+Because we are forecasting, we need a model.
+
+We will suppose that 
+
+1. the number of firms and the firm size distribution (measured in profits) remain fixed and
+
+2. the firm size distribution is either lognormal or Pareto.
+
+Present discounted value of tax revenue will be estimated by 
+
+1. generating 100,000 draws of firm profit from the firm size distribution, 
+
+2. multiplying by the tax rate, and 
+
+3. summing the results with discounting to obtain present value.
+
+The Pareto distribution is assumed to take the form :eq:`pareto` with :math:`\bar x = 1` and :math:`\alpha = 1.05`.
+
+(The value of the tail index :math:`\alpha` is plausible given the data :cite:`gabaix2016power`.)
+
+To make the lognormal option as similar as possible to the Pareto option,
+choose its parameters such that the mean and median of both distributions are
+the same.
+
+Note that, for each distribution, your estimate of tax revenue will be random
+because it is based on a finite number of draws.
+
+To take this into account, generate 100 draws in each case and compare the two
+samples by
+
+* producing a `violin plot <https://en.wikipedia.org/wiki/Violin_plot>`__ visualizing the two samples side-by-side and
+
+* printing the mean and standard deviation of both samples.
+
+For the seed use ``np.random.seed(1234)``.
+
+What differences do you observe?
+
+(Note: a better approach to this problem would be to model firm dynamics and
+try to track individual firms given the current distribution.  We will discuss
+firm dynamics in later lectures.)
+
+
 
 Solutions
 =========
@@ -556,3 +616,112 @@ First we will create a function and then generate the plot
     plt.show()
 
 
+Exercise 5
+----------
+
+To do the exercise, we need to choose the parameters :math:`\mu`
+and :math:`\sigma` of the lognormal distribution to match the mean and median
+of the Pareto distribution.
+
+Here we understand the lognormal distribution as that of the random variable
+:math:`\exp(\mu + \sigma Z)` when :math:`Z` is standard normal.
+
+The mean and median of the Pareto distribution :eq:`pareto` with
+:math:`\bar x = 1` are
+
+.. math::
+
+    \text{mean } = \frac{\alpha}{\alpha - 1}
+    \quad \text{and} \quad
+    \text{median } = 2^{1/\alpha}
+
+Using the corresponding expressions for the lognormal distribution leads us to
+the equations
+
+.. math::
+    \frac{\alpha}{\alpha - 1} = \exp(\mu + \sigma^2/2)
+    \quad \text{and} \quad
+    2^{1/\alpha} = \exp(\mu)
+
+which we solve for :math:`\mu` and :math:`\sigma` given :math:`\alpha = 1.05`.
+    
+Here is code that generates the two samples, produces the violin plot and
+prints the mean and standard deviation of the two samples.
+
+
+.. code:: ipython3
+
+    num_firms = 100_000
+    num_years = 10
+    tax_rate = 0.15
+    r = 0.05
+
+    β = 1 / (1 + r)    # discount factor
+
+    x_bar = 1.0
+    α = 1.05
+    
+    def pareto_rvs(n):
+        "Uses a standard method to generate Pareto draws."
+        u = np.random.uniform(size=n)
+        y = x_bar / (u**(1/α))
+        return y
+
+Let's compute the lognormal parameters:
+
+.. code:: ipython3
+
+    μ = np.log(2) / α
+    σ_sq = 2 * (np.log(α/(α - 1)) - np.log(2)/α)
+    σ = np.sqrt(σ_sq)
+
+Here's a function to compute a single estimate of tax revenue for a particular
+choice of distribution ``dist``.
+
+.. code:: ipython3
+
+    def tax_rev(dist):
+        tax_raised = 0
+        for t in range(num_years):
+            if dist == 'pareto':
+                π = pareto_rvs(num_firms)
+            else:
+                π = np.exp(μ + σ * np.random.randn(num_firms))
+            tax_raised += β**t * np.sum(π * tax_rate)
+        return tax_raised
+
+Now let's generate the violin plot.
+
+.. code:: ipython3
+
+    num_reps = 100
+    np.random.seed(1234)
+    
+    tax_rev_lognorm = np.empty(num_reps)
+    tax_rev_pareto = np.empty(num_reps)
+    
+    for i in range(num_reps):
+        tax_rev_pareto[i] = tax_rev('pareto')
+        tax_rev_lognorm[i] = tax_rev('lognorm')
+
+    fig, ax = plt.subplots()
+    
+    data = tax_rev_pareto, tax_rev_lognorm
+    
+    ax.violinplot(data)
+    
+    plt.show()
+
+Finally, let's print the means and standard deviations.
+
+.. code:: ipython3
+
+    tax_rev_pareto.mean(), tax_rev_pareto.std()
+
+.. code:: ipython3
+
+    tax_rev_lognorm.mean(), tax_rev_lognorm.std()
+
+
+Looking at the output of the code, our main conclusion is that the Pareto
+assumption leads to a lower mean and greater dispersion.
diff --git a/source/rst/kesten_processes.rst b/source/rst/kesten_processes.rst
index 1917f23b..2b8e447e 100644
--- a/source/rst/kesten_processes.rst
+++ b/source/rst/kesten_processes.rst
@@ -30,7 +30,7 @@ Now we generalize these linear models slightly by allowing the multiplicative co
 
 Such processes are known as Kesten processes after German--American mathematician Harry Kesten (1931--2019)
 
-Although simple to write down, Kesten processes are important for two reasons.
+Although simple to write down, Kesten processes are interesting for at least two reasons:
 
 1. A number of significant economic processes are or can be described as Kesten processes.
 
@@ -67,7 +67,7 @@ where :math:`\{a_t\}_{t \geq 1}` and :math:`\{\eta_t\}_{t \geq 1}` are iid
 sequences.
 
 We will focus on the nonnegative scalar case, where :math:`x_t` takes values
-in :math:`\mathbb R_+`
+in :math:`\mathbb R_+`.
 
 In particular, we will assume that 
 
@@ -102,7 +102,7 @@ Composite Index for the period 1st January 2006 to 1st November 2019.
     
     fig, ax = plt.subplots()
     
-    ax.plot(r, alpha=0.7, ms=4)
+    ax.plot(r, alpha=0.7)
     
     ax.set_ylabel('returns', fontsize=12)
     ax.set_xlabel('date', fontsize=12)
@@ -277,18 +277,15 @@ Pareto-tailed distributions.
 The Kesten-Goldie Theorem
 -------------------------
 
-To state the conditions, we recall that a random variable is called
-**nonarithmetic** if its distribution is not concentrated on :math:`t \mathbb
-Z` for any :math:`t \geq 0`.
+To state the conditions under which the stationary distribution of a Kesten process has a Pareto tail, we first recall that a random variable is called **nonarithmetic** if its distribution is not concentrated on :math:`t \mathbb Z` for any :math:`t \geq 0`.
 
 For example, any random variable with a density is nonarithmetic.
 
-The famous Kesten--Goldie Theorem (see, e.g.,
-:cite:`buraczewski2016stochastic`, theorem 2.4.4) states that if
+The famous Kesten--Goldie Theorem (see, e.g., :cite:`buraczewski2016stochastic`, theorem 2.4.4) states that if
 
 1. The stationarity conditions in :eq:`kp_stat_cond` hold,
 
-2. The random variable $a_t$ is positive with probability one and nonarithmetic.
+2. The random variable :math:`a_t` is positive with probability one and nonarithmetic.
 
 3. :math:`\mathbb P\{a_t x + \eta_t = x\} < 1` for all :math:`x \in \mathbb R_+`.
 
@@ -314,11 +311,55 @@ More precisely, if :math:`F^*` is the unique stationary distribution and :math:`
 for some positive constant :math:`c`.
 
 
-An illustration Using Rank-Size Plots
---------------------------------------
+Intuition
+---------
 
-To be added.
+Later we will illustrate the Kesten--Goldie Theorem using rank-size plots.
 
+Prior to doing so, we can give the following intuition for the conditions.
+
+Two important conditions are that :math:`\mathbb E \ln a_t < 0`, so the model
+is stationary, and :math:`\mathbb E a_t^\alpha = 1` for some :math:`\alpha >
+0`.
+
+The first condition implies that the distribution of :math:`a_t` has a large amount of probability mass below 1.
+
+The second condition implies that the distribution of :math:`a_t` has at least some probability mass at or above 1.
+
+The first condition gives us existence of the stationary distribution.
+
+The second condition means that the current state can be expanded by :math:`a_t`.
+
+If this occurs for several consecutive periods, the effects compound each other, since :math:`a_t` is multiplicative.
+
+This leads to spikes in the time series, which fill out the extreme right hand tail of the distribution.
+
+The spikes in the time series are visible in the following simulation, which generates 10 paths when :math:`a_t` and :math:`b_t` are lognormal.
+
+
+.. code:: ipython3
+
+    μ = -0.5
+    σ = 1.0
+    
+    def kesten_ts(ts_length=100):
+        x = np.zeros(ts_length)
+        for t in range(ts_length-1):
+            a = np.exp(μ + σ * np.random.randn())
+            b = np.exp(np.random.randn())
+            x[t+1] = a * x[t] + b
+        return x
+    
+    fig, ax = plt.subplots()
+    
+    num_paths = 10
+    np.random.seed(12)
+    
+    for i in range(num_paths):
+        ax.plot(kesten_ts())
+        
+    ax.set(xlabel='time', ylabel='$x_t$')    
+    plt.show()
 
 
 
@@ -326,38 +367,42 @@ Application: Firm Dynamics
 ==========================
 
 
-As noted in our :doc:`lecture on heavy tails`, for common measures of firm size such as revenue or employment,
-the US firm size distribution exhibits a Pareto tail (see, e.g., :cite:`axtell2001zipf`, :cite:`gabaix2016power`}).
+As noted in our :doc:`lecture on heavy tails <heavy_tails>`, for common measures of firm size such as revenue or employment, the US firm size distribution exhibits a Pareto tail (see, e.g., :cite:`axtell2001zipf`, :cite:`gabaix2016power`).
 
-We can explain this rather striking fact using the Kesten--Goldie Theorem.
+Let us try to explain this rather striking fact using the Kesten--Goldie Theorem.
+
+Gibrat's Law
+------------
 
-For starts, it was postulated many years ago that firm size evolves according to
-Gibrat's law of proportional growth (see, e.g., XXX ).
+It was postulated many years ago by Robert Gibrat :cite:`gibrat1931inegalites` that firm size evolves according to a simple rule whereby size next period is proportional to current size.
 
-This can be expressed by stating that, for some suitable measure of firm size :math:`s_t`, we have
+This is now known as Gibrat's law of proportional growth.
+
+We can express this idea by stating that a suitably defined measure 
+:math:`s_t` of firm size obeys
 
 .. math::
     :label: firm_dynam_gb
 
     \frac{s_{t+1}}{s_t} = a_{t+1} 
     
-where :math:`\{a_t\}` is some positive iid sequence.
+for some positive iid sequence :math:`\{a_t\}`.
 
-In particular, Gibrat's law asserts that the growth rate of individual firms does not depend on firm size.
+One implication of Gibrat's law is that the growth rate of individual firms
+does not depend on their size.
 
 However, over the last few decades, research contradicting Gibrat's law has
 accumulated in the literature.
 
 For example, it is commonly found that, on average,
 
-1. small firms grow faster than large firms and
+1. small firms grow faster than large firms (see, e.g., :cite:`evans1987relationship` and :cite:`hall1987relationship`) and
 
-2. the growth rate of small firms is more volatile than that of large firms.
+2. the growth rate of small firms is more volatile than that of large firms :cite:`dunne1989growth`.
 
-See, for example. XXXX.
+On the other hand, Gibrat's law is generally found to be a reasonable
+approximation for large firms :cite:`evans1987relationship`.
 
-On the other hand, Gibrat's law is generally agreed to be a reasonable
-approximation for large firms (see, e.g., XXXX).
 
 We can accommodate these empirical findings by modifying :eq:`firm_dynam_gb`
 to
@@ -371,17 +416,27 @@ to
 where :math:`\{a_t\}` and :math:`\{b_t\}` are both iid and independent of each
 other.
 
-In the exercises below you are asked to show that :eq:`firm_dynam` is more
-consistent with the empirical findings than Gibrat's law in
+In the exercises you are asked to show that :eq:`firm_dynam` is more
+consistent with the empirical findings presented above than Gibrat's law in
 :eq:`firm_dynam_gb`.
 
+
+Heavy Tails
+-----------
+
 So what has this to do with heavy tails?
 
 The answer is that :eq:`firm_dynam` is a Kesten process.
 
-XX add discussion of conditions, refer to exercise below XX
+If the conditions of the Kesten--Goldie Theorem are satisfied, then the firm
+size distribution is predicted to have heavy tails --- which is exactly what
+we see in the data.
 
+In the exercises below we explore this idea further, generalizing the firm
+size dynamics and examining the corresponding rank-size plots.
 
+We also try to illustrate why the Pareto tail finding is significant for
+quantitative analysis.
 
 
 Exercises
@@ -395,7 +450,7 @@ working days) the GARCH(1, 1) process in :eq:`garch11v`--:eq:`garch11r`.
 
 Take :math:`\xi_t` and :math:`\zeta_t` to be independent and standard normal.
 
-Set :math:`\alpha_0 = XX, \alpha_1 = XX, \beta = XX` and :math:`\sigma_0 = 0`.
+Set :math:`\alpha_0 = 0.00001, \alpha_1 = 0.1, \beta = 0.9` and :math:`\sigma_0 = 0`.
     
 Compare visually with the Nasdaq Composite Index returns :ref:`shown above <ndcode>`.
 
@@ -405,37 +460,98 @@ While the time path differs, you should see bursts of high volatility.
 Exercise 2
 ----------
 
-Forecast present discounted value of 10 years of corporate tax revenue, tax
-rate of 15%, 1e6 firms, revenue distribution predicted to follow either
-lognormal or Pareto.  
 
-Lognormal is matched to Pareto by maching the mean and the median.
+In our discussion of firm dynamics, it was claimed that :eq:`firm_dynam` is more consistent with the empirical literature than Gibrat's law in :eq:`firm_dynam_gb`.
 
+(The empirical literature was reviewed immediately above :eq:`firm_dynam`.) 
 
-Exercise 3
-----------
+In what sense is this true (or false)?
 
+.. Exercise 3
 
-.. math::
-    :label: firm_dynam_ee
+    .. math::
+        :label: firm_dynam_ee
 
-    s_{t+1} = e_{t+1} \mathbb 1 \{s_t < \bar s\}
-        + (a_{t+1} s_t + b_{t+1}) \mathbb 1 \{s_t \geq \bar s\}
+        s_{t+1} = e_{t+1} \mathbb 1 \{s_t < \bar s\}
+            + (a_{t+1} s_t + b_{t+1}) \mathbb 1 \{s_t \geq \bar s\}
 
-Generate rank-size plot.
+    Generate rank-size plot.
 
 
 
 Solutions
 =========
 
+
 Exercise 1
 ----------
 
-Foobar
+Here is one solution:
 
 
-Exercise 1
+.. code:: ipython3
+
+    α_0 = 1e-5
+    α_1 = 0.1
+    β = 0.9
+    
+    years = 15
+    days = years * 250
+    
+    def garch_ts(ts_length=days):
+        σ2 = 0
+        r = np.zeros(ts_length)
+        for t in range(ts_length):
+            ξ = np.random.randn()
+            σ2 = α_0 + σ2 * (α_1 * ξ**2 + β)
+            r[t] = np.sqrt(σ2) * np.random.randn()
+        return r
+    
+    fig, ax = plt.subplots()
+    
+    np.random.seed(12)
+    
+    ax.plot(garch_ts(), alpha=0.7)
+        
+    ax.set(xlabel='time', ylabel='$r_t$')
+    plt.show()
+
+
+Exercise 2
 ----------
 
-Foobar
+The empirical findings are that
+
+
+1. small firms grow faster than large firms  and
+
+2. the growth rate of small firms is more volatile than that of large firms.
+
+Also, Gibrat's law is generally found to be a more reasonable approximation
+for large firms than for small firms.
+
+The claim is that the dynamics in :eq:`firm_dynam` are more consistent with
+points 1-2 than Gibrat's law.
+
+To see why, we rewrite :eq:`firm_dynam` in terms of growth dynamics:
+
+.. math::
+    :label: firm_dynam_2
+
+    \frac{s_{t+1}}{s_t} = a_{t+1} + \frac{b_{t+1}}{s_t}
+
+Taking :math:`s_t = s` as given, the mean and variance of firm growth are
+
+.. math::
+
+    \mathbb E a
+    + \frac{\mathbb E b}{s}
+    \quad \text{and} \quad
+    \mathbb V a
+    + \frac{\mathbb V b}{s^2}
+    
+Both of these decline with firm size :math:`s`, consistent with the data.
+
+Moreover, the law of motion :eq:`firm_dynam_2` clearly approaches Gibrat's law
+:eq:`firm_dynam_gb` as :math:`s_t` gets large.
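+
+The variance expression above relies on the independence of :math:`a_{t+1}` and :math:`b_{t+1}`. As a short sketch, assuming finite second moments and a fixed :math:`s > 0`,
+
+.. math::
+
+    \mathbb V \left( a + \frac{b}{s} \right)
+    = \mathbb V a + \frac{\mathbb V b}{s^2} + \frac{2}{s} \mathrm{Cov}(a, b)
+    = \mathbb V a + \frac{\mathbb V b}{s^2}
+
+since the covariance term is zero under independence. Note also that the mean is strictly decreasing in :math:`s` only when :math:`\mathbb E b > 0`, which holds, for example, when :math:`b_t` is nonnegative and not identically zero.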
+