Changing amount of data gives Theano error #3007


Closed

gpfins opened this issue Jun 5, 2018 · 17 comments

Comments

gpfins commented Jun 5, 2018

Description of your problem

When I change the amount of data in a probabilistic model, it seems that the gradient still has the original size. Is there any way to change this without rebuilding the entire model?

Please provide a minimal, self-contained, and reproducible example.

from pymc3 import Normal
import numpy as np
import pymc3
import theano

x0 = np.linspace(0, 1, 10)
x_t = theano.shared(x0)
y_t = theano.shared(5 * x0)

model = pymc3.Model()
with model:
    w = Normal('W', mu=0., tau=1.)

    f = Normal('f', mu=w * x_t, sd=1., observed=y_t)

    trace = pymc3.find_MAP()
    trace = pymc3.sample(10, tune=2, start=trace)

    # Resize the shared data, then sample again
    x1 = np.linspace(0, 1, 11)
    x_t.set_value(x1)
    y_t.set_value(np.sin(x1))

    trace = pymc3.sample(10, tune=2, start=trace)  # fails with the ValueError below

Please provide the full traceback.

/home/nknudde/.anaconda3/envs/gpf/bin/python /home/nknudde/PycharmProjects/BayesianVF/rep.py
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
logp = -54.09, ||grad|| = 17.593: 100%|██████████| 4/4 [00:00<00:00, 1004.86it/s]
Only 10 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [W]
100%|██████████| 12/12 [00:00<00:00, 516.87it/s]
/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/pymc3/sampling.py:472: UserWarning: The number of samples is too small to check convergence reliably.
  "The number of samples is too small to check "
The acceptance probability does not match the target. It is 0.05690858162084905, but should be close to 0.8. Try to increase the number of tuning steps.
The acceptance probability does not match the target. It is 0.010757361148668246, but should be close to 0.8. Try to increase the number of tuning steps.
Only 10 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Traceback (most recent call last):
  File "/home/nknudde/PycharmProjects/BayesianVF/rep.py", line 23, in <module>
    trace = pymc3.sample(10, tune=2, start=trace)
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/pymc3/sampling.py", line 397, in sample
    progressbar=progressbar, **args)
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/pymc3/sampling.py", line 1390, in init_nuts
    step = pm.NUTS(potential=potential, model=model, **kwargs)
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/pymc3/step_methods/hmc/nuts.py", line 152, in __init__
    super(NUTS, self).__init__(vars, **kwargs)
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/pymc3/step_methods/hmc/base_hmc.py", line 63, in __init__
    dtype=dtype, **theano_kwargs)
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/pymc3/step_methods/arraystep.py", line 215, in __init__
    vars, dtype=dtype, **theano_kwargs)
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/pymc3/model.py", line 709, in logp_dlogp_function
    return ValueGradFunction(self.logpt, grad_vars, extra_vars, **kwargs)
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/pymc3/model.py", line 442, in __init__
    grad = tt.grad(self._cost_joined, self._vars_joined)
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 605, in grad
    grad_dict, wrt, cost_name)
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1371, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1371, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1021, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1021, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1021, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1021, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1021, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1021, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1021, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1021, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/nknudde/.anaconda3/envs/gpf/lib/python3.6/site-packages/theano/gradient.py", line 1237, in access_term_cache
    "of shape %s" % (node.op, t_shape, i, i_shape))
ValueError: Elemwise{sub,no_inplace}.grad returned object of shape (10,) as gradient term on input 0 of shape (11,)

Please provide any additional information below.

Versions and main components

  • PyMC3 Version: 3.4.1
  • Theano Version: 1.0.2
  • Python Version: 3.6.2
  • Operating system: Ubuntu 18.04
  • How did you install PyMC3: pip
twiecki (Member) commented Jun 6, 2018

In f = Normal('f', mu=w*x0, sd=1., observed=y_t), I think you want x_t here rather than x0, which has a fixed size.

Also, no reason to call find_MAP() anymore.
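Applied to the example, that advice would reduce the sampling block to something like this (a sketch; as the next comment notes, the shape error persists either way):

with model:
    w = Normal('W', mu=0., tau=1.)
    f = Normal('f', mu=w * x_t, sd=1., observed=y_t)  # x_t, not the fixed-size x0
    trace = pymc3.sample(10, tune=2)  # no find_MAP(); NUTS is auto-initialized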

gpfins (Author) commented Jun 6, 2018

You're absolutely right about x_t, which was a mistake in the example. But correcting this doesn't change the error.

OriolAbril (Member) commented

I am having the same problem while trying to implement reloo and Leave Future Out in ArviZ. reloo refits the model for each observation with pareto_k > 0.7 to get its exact predictive accuracy, since pareto_k > 0.7 indicates the approximation cannot be trusted (see the linked PR with working examples in emcee and PyStan); for the Leave Future Out algorithm, see the linked repo for details.

My approach is basically the following:

In [1]: import pymc3 as pm
   ...: with pm.Model() as model:
   ...:     x = pm.Data('x', [1., 2., 3.])
   ...:     y = pm.Data('y', [1., 2., 3.])
   ...:     beta = pm.Normal('beta', 0, 1)
   ...:     obs = pm.Normal('obs', x * beta, 1, observed=y)
   ...:     trace = pm.sample(1000, tune=1000)
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta]
Sampling 4 chains, 0 divergences: 100%|██████████| 8000/8000 [00:01<00:00, 5438.67draws/s]
In [2]: with model:
   ...:     pm.set_data({'x': [1, 2], 'y': [1, 2]})
   ...:     loo_trace = pm.sample(1000, tune=1000)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Traceback (most recent call last):

  File "<ipython-input-2-8443eb08e117>", line 3, in <module>
    loo_trace = pm.sample(1000, tune=1000)
  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/pymc3/sampling.py", line 398, in sample
    progressbar=progressbar, **kwargs)

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/pymc3/sampling.py", line 1552, in init_nuts
    step = pm.NUTS(potential=potential, model=model, **kwargs)

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/pymc3/step_methods/hmc/nuts.py", line 152, in __init__
    super().__init__(vars, **kwargs)

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/pymc3/step_methods/hmc/base_hmc.py", line 72, in __init__
    super().__init__(vars, blocked=blocked, model=model, dtype=dtype, **theano_kwargs)

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/pymc3/step_methods/arraystep.py", line 228, in __init__
    vars, dtype=dtype, **theano_kwargs)

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/pymc3/model.py", line 723, in logp_dlogp_function
    return ValueGradFunction(self.logpt, grad_vars, extra_vars, **kwargs)

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/pymc3/model.py", line 456, in __init__
    grad = tt.grad(self._cost_joined, self._vars_joined)

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 605, in grad
    grad_dict, wrt, cost_name)

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1371, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1371, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1021, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1021, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1021, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1021, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1021, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1021, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1021, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1021, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]

  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1326, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/oriol/venvs/arviz-dev/lib/python3.6/site-packages/theano/gradient.py", line 1237, in access_term_cache
    "of shape %s" % (node.op, t_shape, i, i_shape))

ValueError: Elemwise{sub,no_inplace}.grad returned object of shape (3,) as gradient term on input 0 of shape (2,)

Is there any way to resample the same model on a subset of the data (for cross-validation purposes, for example) without having to write a new model? I have had no problem sampling posterior predictive samples, nor calculating the pointwise log likelihood (using logp_elemwise), on differently shaped data.
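For contrast, a minimal sketch of the part that already works (it assumes the model, obs, and trace from the session above, and that this PyMC3 version provides sample_posterior_predictive and the logp_elemwise attribute on observed variables):

with model:
    # Swapping the data and drawing posterior predictive samples works,
    # since no new gradient graph needs to be compiled
    pm.set_data({'x': [1., 2.], 'y': [1., 2.]})
    post_pred = pm.sample_posterior_predictive(trace)

# Pointwise log likelihood also handles the resized data:
# logp_elemwise evaluates the per-observation log density at a point
pointwise = obs.logp_elemwise(trace[0])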

Monikasinghjmi commented

Same issue while running a Bayesian regression model.

fonnesbeck (Member) commented

@Monikasinghjmi this warning is because you are drawing far too few samples. A good default is 1000 samples with tune=2000.

Monikasinghjmi commented

@fonnesbeck, when I tried the defaults, I get another warning: "The gelman-rubin statistic is larger than 1.4 for some parameters. The sampler did not converge."

fonnesbeck (Member) commented

Yeah, the default tuning is 500 iterations, which is sometimes too few. The message implies that you need to sample more (specifically, more tuning iterations). So add tune=2000 as an argument to sample and see what happens.
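In code, that is simply (a sketch, reusing the model context from above):

with model:
    trace = pm.sample(1000, tune=2000)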

Monikasinghjmi commented

I tried 500 draws with tune=2000, and 1000 draws with tune=2000.

Monikasinghjmi commented

Same result in both cases: "The derivative of RV EQ_Category.ravel()[0] is zero."

fonnesbeck (Member) commented

Something is wrong with your model specification relative to your data. What does your model formula look like, and what does your data look like (maybe show the head() of it)?

Monikasinghjmi commented

[two image attachments]

Please see the attached files.

Monikasinghjmi commented

And there are 39 rows and 34 columns in the data

fonnesbeck (Member) commented Oct 11, 2019 via email

Monikasinghjmi commented

Thank you so much @fonnesbeck. It was because of a normalization issue.
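For anyone landing here later, this is the kind of normalization in question, as a sketch (X is a hypothetical stand-in for the 39x34 predictor matrix; the real data is only attached as screenshots above):

import numpy as np

# Stand-in for the 39x34 predictor matrix mentioned above
X = np.random.randn(39, 34) * 100 + 50

# Standardize each column to zero mean and unit variance; badly scaled
# inputs can produce flat gradients like the "derivative ... is zero" error
X_std = (X - X.mean(axis=0)) / X.std(axis=0)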

fonnesbeck (Member) commented

Excellent. If you have any usage questions going forward, our Discourse page is a good resource.

MMingyar commented

> You're absolutely right about x_t, which was a mistake in the example. But correcting this doesn't change the error.

Did you ever figure this issue out, or did you just move on? I'm getting the same issue when I add more parameters to my model, and I'm not sure where to go from here.

ricardoV94 (Member) commented

This one seems to work on main, and I believe we have tests for this:

import numpy as np
import pymc as pm
import aesara

x0 = np.linspace(0, 1, 10)
x_t = aesara.shared(x0)
y_t = aesara.shared(5 * x0)

with pm.Model():
    w = pm.Normal('W', mu=0., tau=1.)
    f = pm.Normal('f', mu=w * x_t, sd=1., observed=y_t)
    trace = pm.sample(10, tune=2)

    # Resize the shared data and sample again
    x1 = np.linspace(0, 1, 11)
    x_t.set_value(x1)
    y_t.set_value(np.sin(x1))

    trace = pm.sample(10, tune=2)
