Description
Describe the issue:
I am trying to reproduce Example 1 from the paper by Hogg et al. with PyMC. The original data set of x, y, and sigma_y contains 16 entries, and the PyMC model works fine with these data. Now I am trying to obtain predictions for a new set, x_new.
The suggested way is:
with model:
    pm.set_data({"x": x_new})  # Update the shared data container
    y_new = pm.sample_posterior_predictive(trace)
where trace holds the sampling results obtained in the previous step. One can then average y_new over the chains and draws to obtain predictions.
The problem is that x_new has to be the same length as the original x and y; otherwise the above procedure fails. For len(x_new) = 18:
ValueError: shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (16,) and arg 1 with shape (18,).
This is disappointing if one wants to follow the common practice in big data analysis of splitting a data set into a large training set and a smaller test set.
Reproducible code example:
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr
import pymc as pm
import arviz as az
# x1, y1, model, and trace come from the earlier fitting step (not shown here)
a = np.linspace(0, x1.min(), 8)
b = np.linspace(x1.max(), 1.5*x1.max(), 10)
predictors_out_of_sample = np.zeros(18)
predictors_out_of_sample[0:8] = a
predictors_out_of_sample[8:18] = b
print(len(predictors_out_of_sample), '> length of x1 or y1')
# Fails when len(predictors_out_of_sample) is not equal to len(y1)
#x_new = xr.DataArray(predictors_out_of_sample)
with model:
    pm.set_data({"x": predictors_out_of_sample})  # Update the shared data container
    y_test = pm.sample_posterior_predictive(trace)
Error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
File d:\miniconda3\Lib\site-packages\pytensor\compile\function\types.py:1039, in Function.__call__(self, output_subset, *args, **kwargs)
1038 try:
-> 1039 outputs = vm() if output_subset is None else vm(output_subset=output_subset)
1040 except Exception:
File d:\miniconda3\Lib\site-packages\pytensor\graph\op.py:544, in Op.make_py_thunk.<locals>.rval(p, i, o, n, cm)
536 @is_thunk_type
537 def rval(
538 p=p,
(...) 542 cm=node_compute_map,
543 ):
--> 544 r = p(n, [x[0] for x in i], o)
545 for entry in cm:
File d:\miniconda3\Lib\site-packages\pytensor\tensor\random\op.py:428, in RandomVariable.perform(self, node, inputs, outputs)
426 outputs[0][0] = rng
427 outputs[1][0] = np.asarray(
--> 428 self.rng_fn(rng, *args, None if size is None else tuple(size)),
429 dtype=self.dtype,
430 )
File d:\miniconda3\Lib\site-packages\pytensor\tensor\random\op.py:194, in RandomVariable.rng_fn(self, rng, *args, **kwargs)
193 """Sample a numeric random variate."""
--> 194 return getattr(rng, self.name)(*args, **kwargs)
File numpy/random/_generator.pyx:1290, in numpy.random._generator.Generator.normal()
File numpy/random/_common.pyx:619, in numpy.random._common.cont()
File numpy/random/_common.pyx:536, in numpy.random._common.cont_broadcast_2()
File d:\miniconda3\Lib\site-packages\numpy\__init__.cython-30.pxd:783, in numpy.PyArray_MultiIterNew3()
ValueError: shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (16,) and arg 1 with shape (18,).
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
Cell In[703], line 5
3 with model:
4 pm.set_data({"x": predictors_out_of_sample}) # Update the shared data container
----> 5 y_test = pm.sample_posterior_predictive(trace)
File d:\miniconda3\Lib\site-packages\pymc\sampling\forward.py:951, in sample_posterior_predictive(trace, model, var_names, sample_dims, random_seed, progressbar, progressbar_theme, return_inferencedata, extend_inferencedata, predictions, idata_kwargs, compile_kwargs)
946 # there's only a single chain, but the index might hit it multiple times if
947 # the number of indices is greater than the length of the trace.
948 else:
949 param = _trace[idx % len_trace]
--> 951 values = sampler_fn(**param)
953 for k, v in zip(vars_, values):
954 ppc_trace_t.insert(k.name, v, idx)
File d:\miniconda3\Lib\site-packages\pymc\util.py:390, in point_wrapper.<locals>.wrapped(**kwargs)
388 def wrapped(**kwargs):
389 input_point = {k: v for k, v in kwargs.items() if k in ins}
--> 390 return core_function(**input_point)
File d:\miniconda3\Lib\site-packages\pytensor\compile\function\types.py:1049, in Function.__call__(self, output_subset, *args, **kwargs)
1047 if hasattr(self.vm, "thunks"):
1048 thunk = self.vm.thunks[self.vm.position_of_error]
-> 1049 raise_with_op(
1050 self.maker.fgraph,
1051 node=self.vm.nodes[self.vm.position_of_error],
1052 thunk=thunk,
1053 storage_map=getattr(self.vm, "storage_map", None),
1054 )
1055 else:
1056 # old-style linkers raise their own exceptions
1057 raise
File d:\miniconda3\Lib\site-packages\pytensor\link\utils.py:526, in raise_with_op(fgraph, node, thunk, exc_info, storage_map)
521 warnings.warn(
522 f"{exc_type} error does not allow us to add an extra error message"
523 )
524 # Some exception need extra parameter in inputs. So forget the
525 # extra long error message in that case.
--> 526 raise exc_value.with_traceback(exc_trace)
File d:\miniconda3\Lib\site-packages\pytensor\compile\function\types.py:1039, in Function.__call__(self, output_subset, *args, **kwargs)
1037 t0_fn = time.perf_counter()
1038 try:
-> 1039 outputs = vm() if output_subset is None else vm(output_subset=output_subset)
1040 except Exception:
1041 self._restore_defaults()
File d:\miniconda3\Lib\site-packages\pytensor\graph\op.py:544, in Op.make_py_thunk.<locals>.rval(p, i, o, n, cm)
536 @is_thunk_type
537 def rval(
538 p=p,
(...) 542 cm=node_compute_map,
543 ):
--> 544 r = p(n, [x[0] for x in i], o)
545 for entry in cm:
546 entry[0] = True
File d:\miniconda3\Lib\site-packages\pytensor\tensor\random\op.py:428, in RandomVariable.perform(self, node, inputs, outputs)
424 rng = deepcopy(rng)
426 outputs[0][0] = rng
427 outputs[1][0] = np.asarray(
--> 428 self.rng_fn(rng, *args, None if size is None else tuple(size)),
429 dtype=self.dtype,
430 )
File d:\miniconda3\Lib\site-packages\pytensor\tensor\random\op.py:194, in RandomVariable.rng_fn(self, rng, *args, **kwargs)
192 def rng_fn(self, rng, *args, **kwargs) -> int | float | np.ndarray:
193 """Sample a numeric random variate."""
--> 194 return getattr(rng, self.name)(*args, **kwargs)
File numpy/random/_generator.pyx:1290, in numpy.random._generator.Generator.normal()
File numpy/random/_common.pyx:619, in numpy.random._common.cont()
File numpy/random/_common.pyx:536, in numpy.random._common.cont_broadcast_2()
File d:\miniconda3\Lib\site-packages\numpy\__init__.cython-30.pxd:783, in numpy.PyArray_MultiIterNew3()
ValueError: shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (16,) and arg 1 with shape (18,).
Apply node that caused the error: normal_rv{"(),()->()"}(RNG(<Generator(PCG64) at 0x2038BFA0BA0>), MakeVector{dtype='int64'}.0, Composite{((i0 * i1) + i2)}.0, Composite{sqrt((246.05275315838355 + sqr((0.09374713619993619 * i0))))}.0)
Toposort index: 6
Inputs types: [RandomGeneratorType, TensorType(int64, shape=(1,)), TensorType(float64, shape=(None,)), TensorType(float64, shape=(None,))]
Inputs shapes: ['No shapes', (1,), (18,), (18,)]
Inputs strides: ['No strides', (8,), (8,), (8,)]
Inputs values: [Generator(PCG64) at 0x2038BFA0BA0, array([16]), 'not shown', 'not shown']
Outputs clients: [[output[7](normal_rv{"(),()->()"}.0)], [output[6](y_obs), DeepCopyOp(y_obs), DeepCopyOp(y_obs), DeepCopyOp(y_obs), DeepCopyOp(y_obs), DeepCopyOp(y_obs), DeepCopyOp(y_obs)]]
Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
File "d:\miniconda3\Lib\site-packages\IPython\core\async_helpers.py", line 128, in _pseudo_sync_runner
coro.send(None)
File "d:\miniconda3\Lib\site-packages\IPython\core\interactiveshell.py", line 3362, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "d:\miniconda3\Lib\site-packages\IPython\core\interactiveshell.py", line 3607, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File "d:\miniconda3\Lib\site-packages\IPython\core\interactiveshell.py", line 3667, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "C:\Users\sylvi\AppData\Local\Temp\ipykernel_18312\1755518607.py", line 22, in <module>
y_obs = pm.Normal("y_obs", mu=μ, sigma=σ_μ, observed=y)
File "d:\miniconda3\Lib\site-packages\pymc\distributions\distribution.py", line 529, in __new__
rv_out = cls.dist(*args, **kwargs)
File "d:\miniconda3\Lib\site-packages\pymc\distributions\continuous.py", line 491, in dist
return super().dist([mu, sigma], **kwargs)
File "d:\miniconda3\Lib\site-packages\pymc\distributions\distribution.py", line 598, in dist
return cls.rv_op(*dist_params, size=create_size, **kwargs)
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
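The "Inputs shapes" and "Inputs values" lines above show that the size argument of the normal draw is still array([16]), while the mean and standard deviation already have shape (18,). The same failure can be reproduced with NumPy alone. This is a minimal sketch of the mechanism, not PyMC code, and the arrays `mu` and `sigma` are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.zeros(18)     # stands in for the mean computed from 18 new x values
sigma = np.ones(18)   # stands in for the per-point standard deviation

try:
    # size is still the training length, as in the failing Apply node
    rng.normal(mu, sigma, size=(16,))
    message = None
except ValueError as err:
    message = str(err)

print(message)

# Once size matches the parameters, the draw succeeds
ok = rng.normal(mu, sigma, size=(18,))
print(ok.shape)
```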
PyMC version information:
'5.25.1'
win11
Context for the issue:
No response