Skip to content

Bring back distribution moments #5078

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
51 tasks done
ricardoV94 opened this issue Oct 14, 2021 · 65 comments · Fixed by #5407
Closed
51 tasks done

Bring back distribution moments #5078

ricardoV94 opened this issue Oct 14, 2021 · 65 comments · Fixed by #5407

Comments

@ricardoV94
Copy link
Member

ricardoV94 commented Oct 14, 2021

With #4983 and #5087 finished, it's a good time to bring back moments for our distributions for more stable starting points.

With this we can then do the switch in #5009

We should also update the distribution developer guide to mention the implementation of moments: https://github.com/pymc-devs/pymc/blob/main/docs/source/developer_guide_implementing_distribution.md


How to help?

  1. This PR should give a template on how to implement and test new moments for distributions: https://github.com/pymc-devs/pymc/pull/5087/files

  2. In most cases we should be able to copy the moments we were using in the V3 branch. For example here is what we were doing for the Beta

    self.mean = self.alpha / (self.alpha + self.beta)

    2.1 We used to have multiple moments for some distributions such as mean, median, mode. We only support one moment now, and probably the "higher-order" one is the most useful (that is mean > median > mode)... You might need to truncate the moment if you are dealing with a discrete distribution.
    2.2 We left some of these moments commented out inside the distribution dist classmethod. Make sure they are removed when you implement them!
    # mean = median = mode = mu = at.as_tensor_variable(floatX(mu))
    # variance = 1.0 / self.tau

  3. We have to be careful with size != None and broadcasting properly when when some parameters that are not used in the moment may nevertheless inform about the shape of the distribution. E.g. pm.Normal.dist(mu=0, sigma=np.arange(1, 6)) returns a moment of [mu, mu, mu, mu, mu]. Again Add tests for distributions moments #5087 should give some template to think about this

    def get_moment(rv, size, mu, sigma):
    mu, _ = at.broadcast_arrays(mu, sigma)
    if not rv_size_is_none(size):
    mu = at.full(size, mu)
    return mu

    3.1 In the case where you have to manually broadcast the parameters with each other it's important to add test conditions that would fail if you were not to do that. A straightforward way to do this is to make the used parameter a scalar, the unused one(s) a vector (one at a time) and size None

  4. Just to keep things uniformish, please add the get_moment immediately below the dist classmethod and before logp or logcdf methods.

  5. New tests have to be added in test_distributions_moments.py. Make sure to test different combinations of size and broadcasting to cover the cases mentioned in point 3.

  6. Don't hesitate to ask any questions. You can grab as many distributions to implement moments as you want. Just make sure to write in this issue so that we can keep track of it.

  7. Profit with your new open source KARMA!

The following distributions don't have a moment method implemented:

@ricardoV94
Copy link
Member Author

@twiecki I updated #5087 which includes a possible template for adding and testing moments. Once we merge that, we can use it as an example

@ricardoV94
Copy link
Member Author

ricardoV94 commented Nov 5, 2021

Updated the top comment with a list of missing moments and some instructions

@michaeloriordan
Copy link
Contributor

michaeloriordan commented Nov 5, 2021

I'll add a PR for pymc.distributions.continuous.Beta as a warmup #5145. Will work on some more if I got this one right! 😄

ricardoV94 added a commit that referenced this issue Nov 6, 2021
@michaeloriordan
Copy link
Contributor

michaeloriordan commented Nov 6, 2021

I'll work on the next 5 distributions on the list. PR here #5147

  • pymc.distributions.continuous.Kumaraswamy
  • pymc.distributions.continuous.Exponential
  • pymc.distributions.continuous.Laplace
  • pymc.distributions.continuous.StudentT
  • pymc.distributions.continuous.Cauchy

@lucianopaz
Copy link
Member

lucianopaz commented Nov 6, 2021

I'll do:

  • HalfCauchy
  • Gamma
  • Weibull
  • LogNormal

@farhanreynaldo
Copy link
Contributor

I would like to take on:

  • Binomial
  • Poisson

@twiecki
Copy link
Member

twiecki commented Nov 7, 2021

Thanks @farhanreynaldo!

@azihna
Copy link
Contributor

azihna commented Nov 7, 2021

I'd like to work on HalfStudentT

@ricardoV94
Copy link
Member Author

I'd like to work on HalfStudentT

Sure go ahead :)

@patel-zeel
Copy link
Contributor

I have added ChiSquared moment.

@zoj613
Copy link
Contributor

zoj613 commented Nov 20, 2021

Yeah, as far as I understand. The CAR is modeled using a multivariate normal distribution with a specific covariance matrix. So it should be identical to the MvNormal if both accept the same kind of parameters mu and cov.

Sounds good. Do you want to open a PR for it? I'll mark it above as taken.

Sure, i'll open one.

@ricardoV94
Copy link
Member Author

All distributions have been taken. I've just unchecked those that don't yet have a PR open to facilitate tracking in the final stage.

@ricardoV94
Copy link
Member Author

I would like to try VonMises

@Domenico89 Any progress on this?

@ricardoV94
Copy link
Member Author

I will add KroneckerNormal, Wishart, DiscreteWeibull.

@anirudhacharya Any progress on this?

@Domenico89
Copy link
Contributor

I would like to try VonMises

@Domenico89 Any progress on this?

@ricardoV94 I should have an implementation for it, but I am getting an error for some values of the test parameters for which I haven't managed to figure out the reason yet

@ricardoV94
Copy link
Member Author

I would like to try VonMises

@Domenico89 Any progress on this?

@ricardoV94 I should have an implementation for it, but I am getting an error for some values of the test parameters for which I haven't managed to figure out the reason yet

Feel free to open a PR with what you have so far and we will help you there

@anirudhacharya
Copy link
Contributor

@anirudhacharya Any progress on this?

I will finish up the tests and raise a PR by tomorrow.

@anirudhacharya
Copy link
Contributor

@ricardoV94 Wishart Distribution documentation says - This distribution is unusable in a PyMC3 model should I be adding tests and moments for it? Is there an example usage I can look at for writing the tests?

@twiecki
Copy link
Member

twiecki commented Dec 4, 2021

@anirudhacharya No, we don't need that for Wishart.

@anirudhacharya
Copy link
Contributor

anirudhacharya commented Dec 13, 2021

@ricardoV94 I am having trouble writing tests for the DiscreteWeibull distribution.

Is there some example usage of this distribution that I can check out? The documentation did not provide any - https://docs.pymc.io/en/v3/api/distributions/discrete.html?highlight=Weibull#pymc3.distributions.discrete.DiscreteWeibull

@ricardoV94
Copy link
Member Author

ricardoV94 commented Dec 13, 2021

@ricardoV94 I am having trouble writing tests for the DiscreteWeibull distribution.

Is there some example usage of this distribution that I check out? The documentation did not provide any - https://docs.pymc.io/en/v3/api/distributions/discrete.html?highlight=Weibull#pymc3.distributions.discrete.DiscreteWeibull

Feel free to open a PR with what you have so far, it will be easier to help you then.

I don't know about any examples using the distribution, but the check_logp test gives an indication of some values and parameters that are being tested successfully:

def test_discrete_weibull(self):
self.check_logp(
DiscreteWeibull,
Nat,
{"q": Unit, "beta": NatSmall},
discrete_weibull_logpmf,
)

@ricardoV94 ricardoV94 unpinned this issue Dec 14, 2021
@ricardoV94 ricardoV94 modified the milestones: v4.0.0b2, v4.0.0b3 Jan 7, 2022
@ricardoV94
Copy link
Member Author

@anirudhacharya are you still interested in the DiscreteWeibull or should we put it up for grabs again?

@anirudhacharya
Copy link
Contributor

@anirudhacharya are you still interested in the DiscreteWeibull or should we put it up for grabs again?

Please put it up for grabs.

@zoj613
Copy link
Contributor

zoj613 commented Jan 26, 2022

@ricardoV94 The mean looks like an infinite sum. This is a job for scan (seeing that the computation is meant to be symbolic) with a terminal condition based on tolerance, right? If so I can take a stab at it.

@ricardoV94
Copy link
Member Author

ricardoV94 commented Jan 26, 2022

@zoj613 we are not interested in the exact moments, but just good enough moments to start sampling. It seems the mean is quite a costly computation, so we should pick something else, perhaps the median or the mode? All we care is that the initial point always has a non-zero probability and, provided that, it is in a place with relevant prior mass

@zoj613
Copy link
Contributor

zoj613 commented Jan 26, 2022

Okay I see. If my math is correct it looks like the median can be computed from the CDF expression as:

median = at.power(at.log(0.5) / at.log(q), 1 / beta) - 1

where CDF = 1 - q ** ((x + 1) ** beta)

@ricardoV94
Copy link
Member Author

Okay I see. If my math is correct it looks like the median can be computed from the CDF expression as:

median = at.power(at.log(0.5) / at.log(q), 1 / beta) - 1

where CDF = 1 - q ** ((x + 1) ** beta)

Great! Feel free to open a PR for it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.