Skip to content

Interaction between logcdf and new check_bounds flag #4429

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ricardoV94 opened this issue Jan 22, 2021 · 2 comments · Fixed by #5233
Closed

Interaction between logcdf and new check_bounds flag #4429

ricardoV94 opened this issue Jan 22, 2021 · 2 comments · Fixed by #5233

Comments

@ricardoV94
Copy link
Member

ricardoV94 commented Jan 22, 2021

The new check_bounds flag can be used to skip the parameter and value checks done by the bound method of most logps (as intended), but it also affects the logcdf methods of distributions evaluated within a model context.

Unlike most logps, however, the logcdf is usually always defined over the real range, even if the respective pdf is only defined/discussed within a finite/non-real domain. Thus, it might make sense to have the cdf give the right result (for valid parameters) even when check_bounds is False.

In [4]: with pm.Model(check_bounds=False) as m: 
   ...:     dist = pm.Bernoulli.dist(p=.1) 
   ...:     print(dist.logcdf(-2).eval()) 
   ...:                                                                         
-0.10536051565782631

This would require changing most of the logcdf methods to apply the bound on distribution parameters, but not on the input values, as is done now. For instance, in the Bernoulli example:

https://github.com/pymc-devs/pymc3/blob/89916ad724ae88ebb4c1af3d65919b87b318f180/pymc3/distributions/discrete.py#L473-L482

The function would be changed to:

return bound(
    tt.switch(
        tt.le(value, 0),
        -np.inf,
        tt.switch(
            tt.lt(value, 1),
            tt.log1p(-p),
            0,
        ),
    ),
    0 <= p,
    p <= 1,
)

So that the logcdf does not return the wrong result when check_bounds is False due to the input value (but can still do due to invalid parameters). I think this may make the check_bounds flag behavior more consistent/intuitive. I would also extend the docstrings of the check_bounds flag to explain the logcdf implications.

This might become more relevant if, and when, we implement a generic Truncated and Censored class as discussed in #1864.

If people agree this makes sense, I am happy to do a PR for it.

@twiecki
Copy link
Member

twiecki commented Jan 22, 2021

Alternatively couldn't these just set check_bounds=True to overwrite whatever setting we have?

@ricardoV94
Copy link
Member Author

ricardoV94 commented Jan 22, 2021

Alternatively couldn't these just set check_bounds=True to overwrite whatever setting we have?

That's an option as well, but in this case I was thinking that we can still benefit from not checking the bounds when the user knows the parameters cannot be invalid (due to careful model specification, etc...).

with pm.Model(check_bounds=False) as m:  
    ...:     p = pm.Beta('p', alpha=1, beta=1) 
    ...:     dist = pm.Bernoulli.dist(p=p)  
    ...:     cdf = pm.Deterministic('cdf', dist.logcdf(-2))   # Something dynamic other than -2

Because in a sense the logcdf value is "never" out of bounds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants