@@ -55,7 +55,7 @@ settings include

* the distribution of city sizes (:cite:`rozenfeld2011area`, :cite:`gabaix2016power`).

- These heavy tails turn out to be important for our understanding economic outcomes and their impact .
+ These heavy tails turn out to be important for our understanding of economic outcomes.

As one example, the heaviness of the tail in the wealth distribution is one
natural measure of inequality.
@@ -341,7 +341,7 @@ for some positive constants :math:`\bar x` and :math:`\alpha`.

It is easy to see that if :math:`X \sim F`, then :math:`\mathbb P\{ X > x\}` satisfies :eq:`plrt`.

- Thus, in line with the terminology, a Pareto distributed random variables have a Pareto tail.
+ Thus, in line with the terminology, Pareto distributed random variables have a Pareto tail.


Rank-Size Plots
@@ -362,11 +362,11 @@ A discussion of why this occurs can be found in :cite:`nishiyama2004estimation`.

The figure below provides one example, using simulated data.

- The rank-size plots shows draws from three different distributions: folded normal, chi squared with 1 degree of freedom and Pareto.
+ The rank-size plot shows draws from three different distributions: folded normal, chi-squared with 1 degree of freedom and Pareto.

In each case, the largest 5\% of 1,000 draws are shown.

- The Pareto sample produces a straight line, while the line produced by the other samples is concave.
+ The Pareto sample produces a straight line, while the lines produced by the other samples are concave.
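
As a rough illustration of how the data behind such a plot can be prepared, here is a minimal sketch. The helper name ``rank_size_data``, the cutoff argument ``c``, the generator ``rng`` and the tail index 1.5 are all assumptions made for this example and are not taken from the lecture's own code.

```python
import numpy as np

def rank_size_data(data, c=0.05):
    """
    Return (rank, size) arrays for the largest fraction c of the data.
    (Hypothetical helper, named for this sketch only.)
    """
    data = np.asarray(data)
    k = int(len(data) * c)             # number of observations to keep
    size = np.sort(data)[::-1][:k]     # largest k values, in descending order
    rank = np.arange(1, k + 1)         # rank 1 is the largest observation
    return rank, size

rng = np.random.default_rng(13)
n = 1_000
pareto_draws = rng.uniform(size=n) ** (-1 / 1.5)     # x_bar = 1, tail index 1.5 (illustrative)
folded_normal_draws = np.abs(rng.standard_normal(n))

for name, draws in [("Pareto", pareto_draws), ("folded normal", folded_normal_draws)]:
    rank, size = rank_size_data(draws)
    # Plotting log(rank) against log(size) gives the rank-size plot
    print(name, np.log(rank[:3]), np.log(size[:3]))
```

Passing ``np.log(rank)`` and ``np.log(size)`` to a scatter or line plot then gives one panel of the figure.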

.. _rank_size_fig1:

@@ -413,6 +413,66 @@ Replicate the rank-size plot figure :ref:`presented above <rank_size_fig1>`.
Use ``np.random.seed(13)`` to set the seed.


+ Exercise 5
+ ----------
+
+ There is an ongoing argument about whether the firm size distribution should
+ be modeled as a Pareto distribution or a lognormal distribution (see, e.g.,
+ :cite:`fujiwara2004pareto`, :cite:`kondo2018us` or :cite:`schluter2019size`).
+
+ This sounds esoteric but has real implications for a variety of economic
+ phenomena.
+
+ To illustrate this fact in a simple way, let us consider an economy with
+ 100,000 firms, an interest rate of ``r = 0.05`` and a corporate tax rate of
+ 15%.
+
+ Your task is to estimate the present discounted value of projected corporate
+ tax revenue over the next 10 years.
+
+ Because we are forecasting, we need a model.
+
+ We will suppose that
+
+ 1. the number of firms and the firm size distribution (measured in profits) remain fixed and
+
+ 2. the firm size distribution is either lognormal or Pareto.
+
+ Present discounted value of tax revenue will be estimated by
+
+ 1. generating 100,000 draws of firm profit from the firm size distribution,
+
+ 2. multiplying by the tax rate, and
+
+ #. summing the results with discounting to obtain present value.
+
+ The Pareto distribution is assumed to take the form :eq:`pareto` with :math:`\bar x = 1` and :math:`\alpha = 1.05`.
+
+ (The value of the tail index :math:`\alpha` is plausible given the data :cite:`gabaix2016power`.)
+
+ To make the lognormal option as similar as possible to the Pareto option,
+ choose its parameters such that the mean and median of both distributions are
+ the same.
+
+ Note that, for each distribution, your estimate of tax revenue will be random
+ because it is based on a finite number of draws.
+
+ To take this into account, generate 100 such estimates in each case and compare the two
+ samples by
+
+ * producing a `violin plot <https://en.wikipedia.org/wiki/Violin_plot>`__ visualizing the two samples side-by-side and
+
+ * printing the mean and standard deviation of both samples.
+
+ For the seed use ``np.random.seed(1234)``.
+
+ What differences do you observe?
+
+ (Note: a better approach to this problem would be to model firm dynamics and
+ try to track individual firms given the current distribution. We will discuss
+ firm dynamics in later lectures.)
+
+

Solutions
=========
@@ -556,3 +616,112 @@ First we will create a function and then generate the plot
    plt.show()


+ Exercise 5
+ ----------
+
+ To do the exercise, we need to choose the parameters :math:`\mu`
+ and :math:`\sigma` of the lognormal distribution to match the mean and median
+ of the Pareto distribution.
+
+ Here we understand the lognormal distribution as that of the random variable
+ :math:`\exp(\mu + \sigma Z)` where :math:`Z` is standard normal.
+
+ The mean and median of the Pareto distribution :eq:`pareto` with
+ :math:`\bar x = 1` are
+
+ .. math::
+
+     \text{mean} = \frac{\alpha}{\alpha - 1}
+     \quad \text{and} \quad
+     \text{median} = 2^{1/\alpha}
+
+ Using the corresponding expressions for the lognormal distribution leads us to
+ the equations
+
+ .. math::
+
+     \frac{\alpha}{\alpha - 1} = \exp(\mu + \sigma^2/2)
+     \quad \text{and} \quad
+     2^{1/\alpha} = \exp(\mu)
+
+ which we solve for :math:`\mu` and :math:`\sigma` given :math:`\alpha = 1.05`.
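
Spelling out the algebra as a worked step: taking logs of the median equation gives :math:`\mu` directly, and substituting it into the log of the mean equation gives :math:`\sigma^2`.

.. math::

    \mu = \frac{\ln 2}{\alpha}
    \quad \text{and} \quad
    \sigma^2 = 2 \left( \ln \frac{\alpha}{\alpha - 1} - \frac{\ln 2}{\alpha} \right)

With :math:`\alpha = 1.05` these evaluate to approximately :math:`\mu \approx 0.66` and :math:`\sigma \approx 2.18`.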
+
+ Here is code that generates the two samples, produces the violin plot and
+ prints the mean and standard deviation of the two samples.
+
+
+ .. code:: ipython3
+
+     num_firms = 100_000
+     num_years = 10
+     tax_rate = 0.15
+     r = 0.05
+
+     β = 1 / (1 + r)   # discount factor
+
+     x_bar = 1.0
+     α = 1.05
+
+     def pareto_rvs(n):
+         "Generate n Pareto draws via the inverse transform method."
+         u = np.random.uniform(size=n)
+         y = x_bar / (u**(1/α))
+         return y
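
To see why this transformation yields Pareto draws, note that if :math:`U` is uniform on :math:`(0, 1)`, then :math:`\mathbb P\{\bar x U^{-1/\alpha} > x\} = \mathbb P\{U < (\bar x / x)^\alpha\} = (\bar x / x)^\alpha` for :math:`x \geq \bar x`. The following sketch checks this numerically. The generator ``rng``, the sample size and the evaluation point ``x = 5`` are choices made for illustration only.

```python
import numpy as np

x_bar, α = 1.0, 1.05
rng = np.random.default_rng(0)          # assumption: any seed will do

u = rng.uniform(size=1_000_000)
draws = x_bar / u**(1 / α)              # same transformation as pareto_rvs

x = 5.0
empirical_tail = np.mean(draws > x)     # fraction of draws exceeding x
theoretical_tail = (x_bar / x)**α       # Pareto tail probability at x

print(empirical_tail, theoretical_tail)
print(np.median(draws), 2**(1 / α))     # sample median vs theoretical median
```

The two numbers on each printed line should be close, up to sampling error.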
+
+ Let's compute the lognormal parameters:
+
+ .. code:: ipython3
+
+     μ = np.log(2) / α
+     σ_sq = 2 * (np.log(α/(α - 1)) - np.log(2)/α)
+     σ = np.sqrt(σ_sq)
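
A quick way to confirm that these parameters do what is intended is to plug them back into the lognormal mean and median formulas. The sketch below is self-contained and uses variable names introduced only for this check.

```python
import numpy as np

α = 1.05
μ = np.log(2) / α
σ = np.sqrt(2 * (np.log(α / (α - 1)) - np.log(2) / α))

pareto_mean, pareto_median = α / (α - 1), 2**(1 / α)
lognormal_mean, lognormal_median = np.exp(μ + σ**2 / 2), np.exp(μ)

# Each pair should agree up to floating point error
print(pareto_mean, lognormal_mean)
print(pareto_median, lognormal_median)
```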
677
+
678
+ Here's a function to compute a single estimate of tax revenue for a particular
679
+ choice of distribution ``dist ``.
680
+
681
+ .. code :: ipython3
682
+
683
+ def tax_rev(dist):
684
+ tax_raised = 0
685
+ for t in range(num_years):
686
+ if dist == 'pareto':
687
+ π = pareto_rvs(num_firms)
688
+ else:
689
+ π = np.exp(μ + σ * np.random.randn(num_firms))
690
+ tax_raised += β**t * np.sum(π * tax_rate)
691
+ return tax_raised
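
As a rough benchmark for the magnitudes involved, when the profit distribution has finite mean :math:`m`, the expected value of the quantity returned by ``tax_rev`` is the tax rate times the number of firms times :math:`m` times :math:`\sum_{t=0}^{9} \beta^t`. The sketch below evaluates this with :math:`m = \alpha / (\alpha - 1)`, the common mean of the two distributions. The names ``mean_profit``, ``discount_sum`` and ``expected_revenue`` are introduced here for illustration.

```python
num_firms, num_years, tax_rate, r = 100_000, 10, 0.15, 0.05
α = 1.05

β = 1 / (1 + r)                               # discount factor
mean_profit = α / (α - 1)                     # common mean of both distributions
discount_sum = (1 - β**num_years) / (1 - β)   # geometric sum of β**t for t = 0, ..., 9

expected_revenue = tax_rate * num_firms * mean_profit * discount_sum
print(expected_revenue)
```

Simulated estimates can then be compared against this figure.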
692
+
693
+ Now let's generate the violin plot.
694
+
695
+ .. code :: ipython3
696
+
697
+ num_reps = 100
698
+ np.random.seed(1234)
699
+
700
+ tax_rev_lognorm = np.empty(num_reps)
701
+ tax_rev_pareto = np.empty(num_reps)
702
+
703
+ for i in range(num_reps):
704
+ tax_rev_pareto[i] = tax_rev('pareto')
705
+ tax_rev_lognorm[i] = tax_rev('lognorm')
706
+
707
+ fig, ax = plt.subplots()
708
+
709
+ data = tax_rev_pareto, tax_rev_lognorm
710
+
711
+ ax.violinplot(data)
712
+
713
+ plt.show()
714
+
+ Finally, let's print the means and standard deviations.
+
+ .. code:: ipython3
+
+     tax_rev_pareto.mean(), tax_rev_pareto.std()
+
+ .. code:: ipython3
+
+     tax_rev_lognorm.mean(), tax_rev_lognorm.std()
+
+
+ Looking at the output of the code, our main conclusion is that the Pareto
+ assumption leads to a lower mean and greater dispersion.
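
One way to make sense of the lower mean is the following. With :math:`\alpha = 1.05` the Pareto distribution has a finite mean of :math:`\alpha / (\alpha - 1) = 21`, but much of that mean comes from very rare, very large draws, so the average of a finite sample typically falls short of it. The sketch below illustrates this. The seed, the number of replications and the use of ``default_rng`` are assumptions for this example, so it does not reproduce the output of the solution code.

```python
import numpy as np

α, num_firms, num_reps = 1.05, 100_000, 50
rng = np.random.default_rng(0)

sample_means = np.empty(num_reps)
for i in range(num_reps):
    draws = rng.uniform(size=num_firms)**(-1 / α)   # Pareto draws with x_bar = 1
    sample_means[i] = draws.mean()

print("population mean:       ", α / (α - 1))
print("median of sample means:", np.median(sample_means))
print("largest sample mean:   ", sample_means.max())
```

The occasional very large sample mean is also what drives the greater dispersion of the Pareto estimates.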