Skip to content

NegativeBinomial, ZeroInflatedPoisson, ZeroInflatedBinomial moments #5163

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 9, 2021

Conversation

theorashid
Copy link
Contributor

Add moments and tests for the below distributions as part of #5078:

  • pymc.distributions.Discrete.NegativeBinomial
  • pymc.distributions.Discrete.ZeroInflatedPoisson
  • pymc.distributions.Discrete.ZeroInflatedBinomial

This is my first PR (first time with pytest, rebasing etc – ah the bliss life of the end user) so please check carefully.

@ricardoV94 ricardoV94 mentioned this pull request Nov 9, 2021
51 tasks
@codecov
Copy link

codecov bot commented Nov 9, 2021

Codecov Report

Merging #5163 (1381155) into main (bdd4d19) will increase coverage by 0.04%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5163      +/-   ##
==========================================
+ Coverage   78.02%   78.07%   +0.04%     
==========================================
  Files          88       88              
  Lines       14123    14138      +15     
==========================================
+ Hits        11020    11038      +18     
+ Misses       3103     3100       -3     
Impacted Files Coverage Δ
pymc/distributions/discrete.py 99.40% <100.00%> (+0.02%) ⬆️
pymc/backends/report.py 91.60% <0.00%> (+2.09%) ⬆️

@ricardoV94
Copy link
Member

@theorashid Could you provide some context on the moments you choose for the ZeroInflated distributions. I checked and in V3 we were simply using the mode of the non-zero component. If I read correctly the moment you used for the ZeroInflatedPoisson is the truncated mean of the mixture, and I guess the same is the case for the ZeroInflatedBinomial?

If that's the case, should we call the mu variable computed inside the ZeroInflatedPoisson mean or truncated_mean to be easier to read?

@theorashid
Copy link
Contributor Author

Sure, I used the mean specified in the class docstring. I then named and truncated them in accordance with their not-zero-inflated counterparts from #5150 (here, we don't call them truncated – just mu and mean).

@ricardoV94
Copy link
Member

Sure, I used the mean specified in the class docstring. I then named and truncated them in accordance with their not-zero-inflated counterparts from #5150 (here, we don't call them truncated – just mu and mean).

Right but in the ZeroInflatedPoisson it's no longer mu because you multiply it by psi, right?

@theorashid
Copy link
Contributor Author

True. Although ZIP uses theta not mu, you are right, it is confusing. Do you want to change the ZIP to mean?

@ricardoV94
Copy link
Member

True. Although ZIP uses theta not mu, you are right, it is confusing. Do you want to change the ZIP to mean?

Yeah I think that's less confusing

@ricardoV94 ricardoV94 merged commit 35f9966 into pymc-devs:main Nov 9, 2021
@ricardoV94
Copy link
Member

Thanks @theorashid just merged! Looking forward to your next PR

@theorashid theorashid deleted the discrete-moments branch November 9, 2021 18:41
@sagartomar
Copy link
Contributor

I might be wrong here, but the mean for zero inflated binomial distribution looks incorrect. Since psi is expected proportion of binomial variates, the mean should have been psi * n * p

@ricardoV94
Copy link
Member

ricardoV94 commented Nov 15, 2021

I was confused as well. If you are correct, then the docstrings of the distribution are incorrect, because they mention (1-psi). Do you want to investigate what is going on @sagartomar?

@sagartomar
Copy link
Contributor

The docstrings mention that psi is the expected proportion from the binomial distribution. We have 2 options here. We can either change psi to 1-psi in the docstrings or 1-psi to psi in the get_moment function and also update the tests. I think that second option is better as that way the distribution will remain consistent with how other zero inflated distributions are defined in pymc. What do you think @ricardoV94 ?

@ricardoV94
Copy link
Member

@sagartomar I think you are right. We should change the moment and the definition of the mean in the docstrings as that also has the (1-psi) in the equation. Do you want to open a PR for that?

@sagartomar
Copy link
Contributor

Yes, I'll open a PR. I am also working on the ZeroInflatedNegativeBinomial distribution. Is it okay to address both of these in the same PR or should I open separate PRs for both?

@ricardoV94
Copy link
Member

Yes, I'll open a PR. I am also working on the ZeroInflatedNegativeBinomial distribution. Is it okay to address both of these in the same PR or should I open separate PRs for both?

Fine to do them in the same, but try to keep the changes in separate commits

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants