Skip to content

Add Multinomial moments #5201

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 26, 2021
Merged

Conversation

morganstrom
Copy link
Contributor

NOTE that the size argument has a different behavior with this implementation compared to previously! Now, new dimensions are appended to the output tensor, as specified by size. Previously, no new dimensions were added to the mode if the size argument matched the shape of the mode tensor..

Also, the mode is rounded to the nearest integer in this implementation. The new tests are different from the previous ones - instead of making sure that the values of the last dimension summed to 1 (which isn't guaranteed with rounding), now each element of the output tensor are tested.

Thank your for opening a PR!

Depending on what your PR does, here are a few things you might want to address in the description:

…ents outcomes/events, second from last dimension represents batches (independent multinomial distributions) and first dimension is set by the size argument. Note that this is different from previous effects of the size argument
@ricardoV94
Copy link
Member

ricardoV94 commented Nov 18, 2021

NOTE that the size argument has a different behavior with this implementation compared to previously!

Before we didn't have size, only shape. Indeed size behaves differently and is now what we use internally. The refactoring of the old tests can (but don't need to) keep using the shape definition, so that they are exactly the same as before. Any new tests should use size.

@ricardoV94 ricardoV94 mentioned this pull request Nov 18, 2021
51 tasks
@codecov
Copy link

codecov bot commented Nov 18, 2021

Codecov Report

Merging #5201 (f39892c) into main (5107dae) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5201      +/-   ##
==========================================
+ Coverage   78.94%   78.97%   +0.02%     
==========================================
  Files          88       88              
  Lines       14256    14270      +14     
==========================================
+ Hits        11255    11270      +15     
+ Misses       3001     3000       -1     
Impacted Files Coverage Δ
pymc/distributions/multivariate.py 71.91% <100.00%> (+0.56%) ⬆️
pymc/step_methods/hmc/base_hmc.py 91.12% <0.00%> (+0.80%) ⬆️

@pytest.mark.parametrize(
"p, n, size, expected",
[
(np.array([0.25, 0.25, 0.25, 0.25]), 1, None, np.repeat(0, 4)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bad moment, because it has a logp of -np.inf. We always need the moment to add up to n. IIRC there is no simple closed formula solution for the mode of the multinomial, but you should be able to use our hacky approximation from V3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This?

mode = tt.inc_subtensor(mode[inc_bool_arr.nonzero()], diff[inc_bool_arr.nonzero()])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think that's it. Basically it's a hack to make sure the values always add up to n, and not just in the case where they would all be zero

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I pretty much copied the code but made some changes for the cases when either p has more then 1 dimension or when n is an array rather than a scalar. Added tests for the corner cases I could think of. Once this PR is finalized, the same code can be copied to the DirichletMultinomial distribution!

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just need to fix the conflicts with main

@ricardoV94
Copy link
Member

In the meantime feel free to start a PR on the DirichletMultinomial :)

@ricardoV94
Copy link
Member

You got a pre-commit formatting complaint

@morganstrom
Copy link
Contributor Author

You got a pre-commit formatting complaint

Darn! How do I run the same tests locally before commiting? Can't find instructions here: https://github.com/pymc-devs/pymc/blob/35f9966f15e76d8ba304914be69e4115fdcfd4ef/CONTRIBUTING.md

@ricardoV94
Copy link
Member

You got a pre-commit formatting complaint

Darn! How do I run the same tests locally before commiting? Can't find instructions here: https://github.com/pymc-devs/pymc/blob/35f9966f15e76d8ba304914be69e4115fdcfd4ef/CONTRIBUTING.md

They were recently moved to: https://github.com/pymc-devs/pymc/blob/main/docs/source/contributing/python_style.md

@twiecki
Copy link
Member

twiecki commented Nov 26, 2021

/pre-commit-run

@ricardoV94 ricardoV94 merged commit e709c2c into pymc-devs:main Nov 26, 2021
@morganstrom morganstrom deleted the multinomial_moments branch November 26, 2021 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants