-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Implement continuous distribution CDF methods #2688
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3594af7
to
c6b5ae5
Compare
It seems the relevant tests pass for that last commit but the build errors out due to exceeded test time... The missing continuous distributions are 1/3 of all available continuous distributions:
It would be ideal to wait until everything is implemented, but it adds the work of keeping the commits relevant to master. Seeing that a majority of the CDF methods are implemented, could we review for inclusion onto master and add Do you have comments or suggestions, @fonnesbeck, @twiecki, @aseyboldt ? |
My instinct would be to wait until they are all done, so that users aren't confused when they don't exist for some distributions. OTOH, the early audience will probably be small, so perhaps not a big deal. The structure of the distributions code is pretty simple, so merging a large number of them at once shouldn't be a big deal. |
f5ef708
to
96cb41c
Compare
For the tricky ones (SkewNormal, Gamma family and VonMises) user feedback would be a good guide. Presumably more users will have a need when we get around to implementing censored distributions, but it will be difficult to judge if we do not get started there. |
pymc3/distributions/continuous.py
Outdated
based on Cephes library by Steve Moshier (incbet.c). | ||
small: Choose element-wise which continued fraction expansion to use. | ||
''' | ||
big = tt.constant(4.503599627370496e15, dtype='float64') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps use CAPS for constants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
I'm fine with merging this once tests pass. My only suggestion is to use caps for constant names, which makes things easier to read. |
ffccaac
to
2470a5e
Compare
One of the builds consistently times out. In the build that times out, the I've compared the |
The timeout is a bit difficult to deal with, one of the quick fix is modify |
Had a look today (I am in need of doing a censoring model), overall it looks amazing! with pm.Model():
nu = pm.HalfNormal('nu', 5)
mu = pm.Normal('mu', 0, 1)
sd = pm.HalfCauchy('sd', 2.5)
left = pm.StudentT.dist(nu=nu, mu=mu, sd=sd).logcdf(6.)
lcdf = pm.Potential('lcdf', left)
pm.sample() |
I've used truncated Gamma family distributions in many "waiting times" scenarios (broadly speaking), and they're not uncommon in that literature. |
Is there a status update on this? Would be great to have these in for 3.5. |
Hey all, I'm building a beta model on censored data, so need the beta survival function for incomplete observations. I've copied the incomplete beta functions in this PR to my local env, but I get this error when trying to fit the model:
Thanks. Update: I can confirm that the |
I don't think we should wait until all CDFs are implemented and just merge what we have. Seems like this branch needs a rebase, though. @domenzain Are you still interested in working on this? I think we could merge soon if there are no other blockers. |
Hi @twiecki, I will see what it takes to rebase onto master today and come back to you on that. |
Is this still in development? Am I allowed to help here? |
Absolutely, help would be appreciated!
…On Wed, Jul 25, 2018 at 4:32 PM Stephen Anthony Rose < ***@***.***> wrote:
Is this still in development? Am I allowed to help here?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2688 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AApJmGGjYCsxYh_Sx15iY0-yY726ci-gks5uKIFvgaJpZM4QRcl6>
.
|
Can we merge these? Need to rebase (or merge with current master) to resolve conflicts cc @domenzain |
2470a5e
to
bad522b
Compare
Hi @fonnesbeck, I have rebased the commits onto the current master, I hope this helps. |
Seems like there are some test errors, e.g.:
|
bad522b
to
a317160
Compare
I botched the rebase for the Beta distribution, it seems. |
a317160
to
acdcd0a
Compare
Looks like these are working as expected. For the future, which would be the most useful: the missing continuous log CDFs or the discrete log CDFs? |
Thanks for the great work @domenzain! |
Hi @junpenglao, it looks like the implementation of the upper/lower gamma function is solid and has clear references to go to in case of trouble. I'll have a look. |
We need a bit of boilerplate wrapping to make the scalar C functions available as Theano As the |
If there is no licence issue, to me it makes sense to put it in our code base. |
If it's licensed LGPL we can't :(. I suppose that's the reason for the fork. |
The original author has relicensed the code under MIT: And clearly the Theano authors would've preferred the license to be other than LGPL. |
Anything MIT/BSD/Apache v2 we can use. What specifically do you want to include? The c code? |
Without getting bogged down in these details, can we merge this @domenzain from your end? |
I think this is ready to merge as is.
Additional log CDFs can come with a different PRs as they become available.
Comments like the above about having a use for the Gamma log CDF give
useful pointers for future development.
…On Thu, Sep 27, 2018, 15:47 Thomas Wiecki ***@***.***> wrote:
Without getting bogged down in these details, can we merge this @domenzain
<https://github.com/domenzain> from your end?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2688 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAraG5KFKy8lQ9UxN6OLGHUHfRFhee6Gks5ufNcMgaJpZM4QRcl6>
.
|
Congrats @domenzain, this is a solid piece of work! |
Nice – looking forward to try this for something I'm working on with censored data! As a side note I wasted hours trying to get the lower regularized gamma function working in Tensorflow: tensorflow/tensorflow#17995 (not sure what the situation is in Theano though). This is needed for the Gamma distribution (as mentioned further up in the thread). |
@erikbern, in https://github.com/domenzain/Theano_chi2sf I have taken the relicensed code from @gBokiau's link and wrapped most of the remaining Gamma functions. |
@domenzain I have a small obstacle, if I implement incomplete gamma from Theano_chi2sf, then I can't get the NUTS sampling working, because the calculation of the gradients is missing ( |
Hi @aakhmetz, I forgot to add the log CDF of the Gamma distribution after the merge of the pull request in my previous comment... I have just created #3356 to fix that. In your linked issue, you are using the scipy implementation. You should use the Theano tensor implementation: import theano.tensor as tt
def logCDF(alpha, beta, x)
return tt.log(tt.gammainc(alpha, beta*x)) |
Hi @domenzain, Thank you! It looks like working. But I also found that my Theano did not contain gammainc function (I assume it was a conda version, precisely: Theano-1.0.3+2.g3e47d39ac.dirty): ---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-27-08263549df00> in <module>
----> 1 pm.Gamma.dist(10,2).logcdf(.5).eval()
~/anaconda3/lib/python3.6/site-packages/pymc3-3.6-py3.6.egg/pymc3/distributions/continuous.py in logcdf(self, value)
2377 beta = self.beta
2378 return bound(
-> 2379 tt.log(tt.gammainc(alpha, beta * value)),
2380 value >= 0,
2381 alpha > 0,
AttributeError: module 'theano.tensor' has no attribute 'gammainc' I installed the version from your github, but then I arrive to a similar problem as I had before: MethodNotDefined: ('grad', <class 'theano.scalar.basic_scipy.GammaInc'>, 'GammaInc') It looks like I came back to the same point :/ The code I am using looks like this: with pm.Model() as model_Gamma:
dmean = pm.Uniform('dmean',0,35)
dsd = pm.Uniform('dsd',0,35)
# delay_p = tt.exp(pm.Gamma.dist(mu=dmean,sd=dsd).logcdf(shared(df.δt0.get_values())))/\
# tt.exp(pm.Gamma.dist(mu=dmean,sd=dsd).logcdf(shared(df.δt.get_values())))
delay_p = tt.gammainc((dmean/dsd)**2, dmean/(dsd**2)*shared(df.δt0.get_values()))/\
tt.gammainc((dmean/dsd)**2, dmean/(dsd**2)*shared(df.δt.get_values()))
counts = pm.Poisson('counts',\
mu=df.counts.get_values()*delay_p+0.001,\
observed=df.confirmed0.get_values())
trace_Gamma = pm.sample(draws = number_of_iterations, tune=length_of_tunein,
cores = number_of_jobs) and both repos of theano and pymc3 are the latest from github (pymc3 includes your last fix of course) Ok, wakarimashita more or less. @domenzain, thanks! |
I have also been running into: MethodNotDefined: ('grad', <class 'theano.scalar.basic_scipy.GammaInc'>, 'GammaInc') whilst trying to use Presumably this should be implemented in theano, rather than pymc3, so I could submit a feature request on theano? |
I think the |
Sadly I don't think my maths is up to scratch :( This would massively help my research if anyone is willing and able to help out! |
@davidpatti Probably it would be a strange place to say it, but me, I've switched to Stan partially because of that (some problems were with truncated likelihoods) |
Thanks for the suggestion - I've switched too now |
This completes the work started in #2048 and continued in #2073 and includes #2678, but rebased on top of a recent
master
and in more compact commits.These can then be used to more adequately implement censored distributions as described in #1867 and #1864 .