diff --git a/examples/case_studies/multilevel_modeling.ipynb b/examples/case_studies/multilevel_modeling.ipynb
index 88a79b984..f1425083c 100644
--- a/examples/case_studies/multilevel_modeling.ipynb
+++ b/examples/case_studies/multilevel_modeling.ipynb
@@ -464,7 +464,80 @@
"outputs": [
{
"data": {
- "image/svg+xml": "\n\n\n\n\n",
+ "image/svg+xml": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
"text/plain": [
""
]
@@ -5033,7 +5106,94 @@
"outputs": [
{
"data": {
- "image/svg+xml": "\n\n\n\n\n",
+ "image/svg+xml": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
"text/plain": [
""
]
@@ -5338,15 +5498,15 @@
"\n",
"When we pool our data, we imply that they are sampled from the same model. This ignores any variation among sampling units (other than sampling variance) -- we assume that counties are all the same:\n",
"\n",
- "\n",
+ "\n",
"\n",
"When we analyze data unpooled, we imply that they are sampled independently from separate models. At the opposite extreme from the pooled case, this approach claims that differences between sampling units are too large to combine them -- we assume that counties have no similarity whatsoever:\n",
"\n",
- "\n",
+ "\n",
"\n",
"In a hierarchical model, parameters are viewed as a sample from a population distribution of parameters. Thus, we view them as being neither entirely different nor exactly the same. This is ***partial pooling***:\n",
"\n",
- "\n",
+ "\n",
"\n",
"We can use PyMC to easily specify multilevel models, and fit them using Markov chain Monte Carlo."
]
@@ -5373,7 +5533,108 @@
"outputs": [
{
"data": {
- "image/svg+xml": "\n\n\n\n\n",
+ "image/svg+xml": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
"text/plain": [
""
]
@@ -5630,7 +5891,136 @@
"outputs": [
{
"data": {
- "image/svg+xml": "\n\n\n\n\n",
+ "image/svg+xml": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
"text/plain": [
""
]
@@ -6008,7 +6398,164 @@
"outputs": [
{
"data": {
- "image/svg+xml": "\n\n\n\n\n",
+ "image/svg+xml": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
"text/plain": [
""
]
@@ -6233,7 +6780,179 @@
"outputs": [
{
"data": {
- "image/svg+xml": "\n\n\n\n\n",
+ "image/svg+xml": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
"text/plain": [
""
]
@@ -6552,7 +7271,155 @@
"outputs": [
{
"data": {
- "image/svg+xml": "\n\n\n\n\n",
+ "image/svg+xml": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
"text/plain": [
""
]
@@ -7167,7 +8034,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.9.7"
+ "version": "3.9.6"
}
},
"nbformat": 4,
diff --git a/examples/case_studies/partial_pooled_model.png b/examples/case_studies/partial_pooled_model.png
new file mode 100644
index 000000000..c17f11e13
Binary files /dev/null and b/examples/case_studies/partial_pooled_model.png differ
diff --git a/examples/case_studies/pooled_model.png b/examples/case_studies/pooled_model.png
new file mode 100644
index 000000000..c146f8daa
Binary files /dev/null and b/examples/case_studies/pooled_model.png differ
diff --git a/examples/case_studies/unpooled_model.png b/examples/case_studies/unpooled_model.png
new file mode 100644
index 000000000..321b0ee17
Binary files /dev/null and b/examples/case_studies/unpooled_model.png differ
diff --git a/examples/generalized_linear_models/GLM-hierarchical.ipynb b/examples/generalized_linear_models/GLM-hierarchical.ipynb
index 1bee7c405..e5fef0e8b 100644
--- a/examples/generalized_linear_models/GLM-hierarchical.ipynb
+++ b/examples/generalized_linear_models/GLM-hierarchical.ipynb
@@ -233,7 +233,7 @@
"\n",
"Where $i$ represents the measurement, $c$ the county and floor contains a 0 or 1 if the house has a basement or not, respectively. If you need a refresher on Linear Regressions in `PyMC`, check out my [previous blog post](http://twiecki.github.io/blog/2013/08/12/bayesian-glms-1/). Critically, we are only estimating *one* intercept and *one* slope for all measurements over all counties pooled together as illustrated in the graphic below ($\\theta$ represents $(\\alpha, \\beta)$ in our case and $y_i$ are the measurements of the $i$th county).\n",
"\n",
- "\n",
+ "\n",
"\n",
"### Unpooled measurements: separate regressions\n",
"But what if we are interested in whether different counties actually have different relationships (slope) and different base-rates of radon (intercept)? Then you might say \"OK then, I'll just estimate $n$ (number of counties) different regressions -- one for each county\". In math-speak that model would be:\n",
@@ -242,7 +242,7 @@
"\n",
"Note that we added the subindex $c$ so we are estimating $n$ different $\\alpha$s and $\\beta$s -- one for each county.\n",
"\n",
- "\n",
+ "\n",
"\n",
"This is the extreme opposite model; where above we assumed all counties are exactly the same, here we are saying that they share no similarities whatsoever. As we show below, this type of model can be very noisy when we have little data per county, as is the case in this data set.\n",
"\n",
@@ -255,7 +255,7 @@
"\n",
"We thus assume the intercepts $\\alpha$ and slopes $\\beta$ to come from a normal distribution centered around their respective group mean $\\mu$ with a certain variance $\\sigma^2$, the values (or rather posteriors) of which we also estimate. That's why this is called a multilevel, hierarchical or partial-pooling model.\n",
"\n",
- "\n",
+ "\n",
"\n",
"How do we estimate such a complex model, you might ask? Well, that's the beauty of Probabilistic Programming -- we just formulate the model we want and press our [Inference Button(TM)](http://twiecki.github.io/blog/2013/08/12/bayesian-glms-1/).\n",
"\n",
@@ -1883,7 +1883,7 @@
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
- "display_name": "Python 3",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -1897,7 +1897,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.8.10"
+ "version": "3.9.6"
},
"latex_envs": {
"bibliofile": "biblio.bib",
diff --git a/examples/generalized_linear_models/partial_pooled_model.png b/examples/generalized_linear_models/partial_pooled_model.png
new file mode 100644
index 000000000..c17f11e13
Binary files /dev/null and b/examples/generalized_linear_models/partial_pooled_model.png differ
diff --git a/examples/generalized_linear_models/pooled_model.png b/examples/generalized_linear_models/pooled_model.png
new file mode 100644
index 000000000..c146f8daa
Binary files /dev/null and b/examples/generalized_linear_models/pooled_model.png differ
diff --git a/examples/generalized_linear_models/unpooled_model.png b/examples/generalized_linear_models/unpooled_model.png
new file mode 100644
index 000000000..321b0ee17
Binary files /dev/null and b/examples/generalized_linear_models/unpooled_model.png differ