Add Multinomial moments #5201
Conversation
…ents outcomes/events, the second-from-last dimension represents batches (independent multinomial distributions), and the first dimension is set by the size argument. Note that this is different from the previous effect of the size argument.
Before we didn't have
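As a rough NumPy sketch (not the PR's actual code; shapes and numbers are made up) of the dimension layout described in the PR text above: the last axis holds the outcomes/events, the second-to-last axis holds the batches, and any size dimensions come first.

```python
import numpy as np

# Hypothetical shapes, only to illustrate the layout described above.
p = np.array([[0.2, 0.3, 0.5],
              [0.6, 0.3, 0.1]])      # shape (2, 3): 2 batches, 3 outcomes
n = np.array([10, 20])               # one total count per batch
size = (4,)                          # extra dimensions requested via `size`

base = np.round(n[..., None] * p)    # per-batch rounded expectation, shape (2, 3)
moment = np.broadcast_to(base, size + base.shape)
print(moment.shape)                  # (4, 2, 3): size dims first, then batches, then outcomes
```

The real moment additionally redistributes any rounding deficit so each row sums to n, as discussed below.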
Codecov Report

|          | main   | #5201  | +/-    |
|----------|--------|--------|--------|
| Coverage | 78.94% | 78.97% | +0.02% |
| Files    | 88     | 88     |        |
| Lines    | 14256  | 14270  | +14    |
| Hits     | 11255  | 11270  | +15    |
| Misses   | 3001   | 3000   | -1     |
@pytest.mark.parametrize(
    "p, n, size, expected",
    [
        (np.array([0.25, 0.25, 0.25, 0.25]), 1, None, np.repeat(0, 4)),
This is a bad moment, because it has a logp of -np.inf. We always need the moment to add up to n. IIRC there is no simple closed-form solution for the mode of the multinomial, but you should be able to use our hacky approximation from V3.
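A minimal NumPy sketch of that kind of adjustment (an assumed illustration of the idea, not the actual V3 code): floor n * p and then hand the leftover counts to the categories with the largest remainders, so the result always sums to n.

```python
import numpy as np

def approx_multinomial_mode(n, p):
    """Assumed sketch (1-D case only): a mode-like value that always sums to n."""
    scaled = n * p
    mode = np.floor(scaled).astype(int)
    diff = n - mode.sum()                     # counts still missing after flooring
    order = np.argsort(scaled - mode)[::-1]   # largest remainders first
    mode[order[:diff]] += 1                   # top them up so the total is exactly n
    return mode

print(approx_multinomial_mode(1, np.array([0.25, 0.25, 0.25, 0.25])))
# sums to 1 (which category gets the single count depends on tie-breaking),
# unlike np.repeat(0, 4) above, which has a logp of -inf
```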
This?

pymc/pymc3/distributions/multivariate.py, line 583 in c332f8c:

mode = tt.inc_subtensor(mode[inc_bool_arr.nonzero()], diff[inc_bool_arr.nonzero()])
Yeah, I think that's it. Basically it's a hack to make sure the values always add up to n, and not just in the case where they would all be zero.
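For readers unfamiliar with inc_subtensor, here is a NumPy analogue of what the quoted line does (made-up values, not code from the PR): add the missing counts to the entries picked out by a boolean mask. The Theano/Aesara version returns a new tensor rather than modifying mode in place.

```python
import numpy as np

mode = np.array([2, 3, 0, 0])                        # rounded values, sum = 5 but n = 7
diff = np.array([1, 1])                              # counts still needed
inc_bool_arr = np.array([False, False, True, True])  # where to put them

mode[inc_bool_arr.nonzero()] += diff                 # NumPy stand-in for tt.inc_subtensor(...)
print(mode, mode.sum())                              # [2 3 1 1] 7 -> now adds up to n
```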
Cool, I pretty much copied the code but made some changes for the cases when either p has more than 1 dimension or when n is an array rather than a scalar. Added tests for the corner cases I could think of. Once this PR is finalized, the same code can be copied to the DirichletMultinomial distribution!
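As a rough, assumed sketch (not the PR's code) of what that batched generalisation has to handle, here is the same round-and-top-up idea applied row by row when p has a batch dimension and n is an array:

```python
import numpy as np

p = np.array([[0.30, 0.60, 0.05, 0.05],
              [0.25, 0.25, 0.25, 0.25]])
n = np.array([10, 2])                                # one total count per batch

scaled = n[..., None] * p
mode = np.floor(scaled).astype(int)                  # shape (2, 4)
diff = n - mode.sum(axis=-1)                         # leftover counts per batch
remainder = scaled - mode

for i, d in enumerate(diff):                         # top up the largest remainders in each row
    order = np.argsort(remainder[i])[::-1]
    mode[i, order[:d]] += 1

print(mode.sum(axis=-1))                             # [10  2] -- each row adds up to its n
```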
LGTM! Just need to fix the conflicts with main.
In the meantime, feel free to start a PR on the DirichletMultinomial :)
You got a pre-commit formatting complaint.
Darn! How do I run the same checks locally before committing? Can't find instructions here: https://github.com/pymc-devs/pymc/blob/35f9966f15e76d8ba304914be69e4115fdcfd4ef/CONTRIBUTING.md
They were recently moved to: https://github.com/pymc-devs/pymc/blob/main/docs/source/contributing/python_style.md
/pre-commit-run |
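For reference, the usual way to run the same hooks locally is with the standard pre-commit commands (assuming pre-commit is installed in your environment; see the linked style guide for project-specific details):

```bash
pip install pre-commit          # once, if it is not installed yet
pre-commit install              # optional: run the hooks automatically on each commit
pre-commit run --all-files      # run all configured hooks on the whole repository
```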
Note that the size argument behaves differently in this implementation than it did previously! Now, new dimensions are added to the output tensor, as specified by size. Previously, no new dimensions were added to the mode if the size argument matched the shape of the mode tensor.
Also, the mode is rounded to the nearest integer in this implementation. The new tests differ from the previous ones: instead of checking that the values of the last dimension sum to 1 (which isn't guaranteed with rounding), each element of the output tensor is now tested.
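An illustration of the element-wise style of check described above (hypothetical values, not the PR's actual test code): the expected moment is written out explicitly and every entry is compared, rather than only asserting something about the sum.

```python
import numpy as np

p = np.array([0.26, 0.26, 0.26, 0.22])
n = 100
expected = np.array([26, 26, 26, 22])              # assumed rounded-and-adjusted mode

computed = np.round(n * p).astype(int)             # stand-in for the value the moment returns
np.testing.assert_array_equal(computed, expected)  # every element checked explicitly
assert computed.sum() == n                         # and the total still adds up to n
```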
Thank you for opening a PR!
Depending on what your PR does, here are a few things you might want to address in the description: