The time series module provides the base classes for building and composing
forecasting models. Models are built by combining components using arithmetic
operators.
This class provides the foundation for building time series models in vangja.
It handles data preprocessing, scaling, model fitting, and prediction. Model
components can be combined using arithmetic operators (+, *, **) to create
complex models.
>>> from vangja import LinearTrend, FourierSeasonality
>>> # Create an additive model
>>> model = LinearTrend() + FourierSeasonality(period=365.25, series_order=10)
>>> model.fit(data)
>>> predictions = model.predict(horizon=30)
>>> # Create a multiplicative model
>>> model = LinearTrend() ** FourierSeasonality(period=7, series_order=3)
>>> model.fit(data)
Notes
Subclasses should implement:
- definition: Add parameters to the PyMC model
- _get_initval: Provide initial values for parameters
- _predict_map: Predict using MAP estimates
- _predict_mcmc: Predict using MCMC samples
- _plot: Plot the component’s contribution
data (pd.DataFrame) – A pandas dataframe that must at least have columns ds (predictor), y
(target) and series (name of time series).
scaler (Scaler) – Whether to use maxabs or minmax scaling of the y (target).
scale_mode (ScaleMode) – Whether to scale each series individually or together.
t_scale_params (TScaleParams|None) – Whether to override scale parameters for ds (predictor).
sigma_sd (float) – The standard deviation of the Normal prior of y (target).
sigma_pool_type (PoolType) – Type of pooling for the sigma parameter that is performed when sampling.
sigma_shrinkage_strength (float) – Shrinkage between groups for the hierarchical modeling.
method (Method) – The Bayesian inference method to be used: either a point estimate (MAP), a
VI method (e.g. ADVI), or full Bayesian sampling (MCMC).
optimization_method (OptimizationMethod) – The optimization method to be used for MAP inference. See
scipy.optimize.minimize documentation for details.
maxiter (int) – The maximum number of iterations for the L-BFGS-B optimization algorithm
when using MAP inference.
n (int) – The number of iterations to be used for the VI methods.
samples (int) – Denotes the number of samples to be drawn from the posterior for MCMC and
VI methods.
chains (int) – Denotes the number of independent chains drawn from the posterior. Only
applicable to the MCMC methods.
nuts_sampler (NutsSampler) – The sampler for the NUTS method.
progressbar (bool) – Whether to show a progressbar while fitting the model.
idata (az.InferenceData|None) – Sample from a posterior. If it is not None, Vangja will use this to set the
parameters’ priors in the model. If idata is not None, each component from
the model should specify how idata should be used to set its parameters’
priors.
For MCMC/VI methods, uncertainty is derived from posterior samples.
Each posterior draw is propagated through the model to produce a
family of prediction trajectories, from which percentile-based
credible intervals are computed.
For MAP methods, uncertainty is estimated using a hybrid approach:
The fitted observation noise sigma provides a base noise level.
In-sample residuals are used to calibrate the noise estimate.
A forecast-distance scaling factor sqrt(1+h/n) widens
the intervals for predictions further from the training data,
reflecting increasing epistemic uncertainty.
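The MAP-based widening described above can be sketched in a few lines of numpy. This is an illustrative reimplementation, not vangja's internal code; the helper name `map_intervals` and the Gaussian z-multiplier are assumptions.

```python
import numpy as np

def map_intervals(yhat, residuals, horizon_steps, z=1.96):
    """Residual-calibrated noise widened by distance from training data."""
    sigma = np.std(residuals)               # base noise level from in-sample residuals
    n = len(residuals)                      # number of training points
    h = np.asarray(horizon_steps, dtype=float)
    width = z * sigma * np.sqrt(1 + h / n)  # forecast-distance scaling factor
    return yhat - width, yhat + width
```

Intervals computed this way grow with the forecast step h, matching the sqrt(1 + h/n) factor above.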
Parameters:
horizon (int) – Number of future steps to forecast.
freq (FreqStr, default "D") – Frequency of forecast steps.
uncertainty_samples (int, default 200) – Number of posterior draws to use for interval estimation.
Only used for MCMC/VI methods.
interval_width (float, default 0.95) – Width of the prediction interval (e.g. 0.95 for a 95% interval).
Returns:
The future DataFrame with columns yhat_<group>,
yhat_lower_<group>, and yhat_upper_<group> for each group.
Return type:
pd.DataFrame
Notes
See uncertainty.md for a detailed description of the approaches.
y_true (pd.DataFrame|None) – A pandas dataframe containing the true values for the inference period that
must at least have columns ds (predictor), y (target) and series (name of
time series).
clip_to_data (bool) – If True, clip predictions to the date range of the training data
(and y_true if provided). This avoids plotting predictions for
periods before the target series’ start date, which can happen
when transfer learning shifts t_scale_params.
Generates simulated observations from the model’s priors before
conditioning on data, enabling visual and quantitative verification
that the chosen priors are scientifically plausible.
Parameters:
samples (int, default 500) – Number of samples to draw from the prior predictive.
Returns:
ArviZ InferenceData with prior and prior_predictive groups.
Return type:
az.InferenceData
Raises:
RuntimeError – If the model has not been fit yet (self.model does not exist).
Notes
The model must be fit first so that the PyMC model graph exists.
Calling this method does not alter the fitted posterior.
Sample from the posterior predictive distribution.
Generates replicated datasets from the posterior to assess goodness of
fit. Requires the model to have been fitted with an MCMC or VI
method so that self.trace is available.
Returns:
ArviZ InferenceData with a posterior_predictive group added.
This class serves as the foundation for composing multiple time series
components together. It provides common functionality for combining
two components (left and right) and propagating method calls to both.
Parameters:
left (TimeSeriesModel|int|float) – The left operand of the combination. Can be a model component
or a numeric constant.
right (TimeSeriesModel|int|float) – The right operand of the combination. Can be a model component
or a numeric constant.
Combines two components using addition: y = left + right.
This class is created when using the + operator between time series
components. The resulting model sums the contributions from both
components.
Parameters:
left (TimeSeriesModel|int|float) – The left operand of the addition.
right (TimeSeriesModel|int|float) – The right operand of the addition.
Examples
>>> from vangja import LinearTrend, FourierSeasonality
>>> # Create an additive model with trend + seasonality
>>> model = LinearTrend() + FourierSeasonality(period=365.25, series_order=10)
>>> print(model)
LT(n=25,r=0.8,tm=None) + FS(p=365.25,n=10,tm=None)
Combines two components using y = left * (1 + right).
This class is created when using the ** operator between time series
components. This follows the Prophet-style multiplicative seasonality
where the right component modulates the left component around its value.
This formulation is useful when the amplitude of seasonality scales
with the trend level (heteroscedastic seasonal patterns).
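A plain-numpy numeric illustration of the two multiplicative conventions, independent of vangja's operators:

```python
import numpy as np

trend = np.array([100.0, 110.0, 120.0])
seasonal = np.array([0.10, -0.05, 0.0])   # fractional seasonal effect

prophet_style = trend * (1 + seasonal)    # '**': seasonality modulates the trend level
simple_product = trend * seasonal         # '*': contributions multiply directly
```

With the `**` convention a seasonal value of 0 leaves the trend unchanged, whereas with `*` it zeroes the output, which is why `**` suits modulating effects and `*` suits scaling factors.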
Parameters:
left (TimeSeriesModel|int|float) – The base component (typically a trend).
right (TimeSeriesModel|int|float) – The multiplicative modifier (typically seasonality).
Examples
>>> from vangja import LinearTrend, FourierSeasonality
>>> # Create a model with multiplicative seasonality
>>> model = LinearTrend() ** FourierSeasonality(period=365.25, series_order=10)
>>> print(model)
LT(n=25,r=0.8,tm=None) * (1 + FS(p=365.25,n=10,tm=None))
Notes
The ** operator was chosen because * is used for simple
multiplication of components.
Combines two components using simple multiplication: y = left * right.
This class is created when using the * operator between time series
components. The resulting model multiplies the contributions from both
components directly.
This is useful for applying scaling factors or when components should
truly multiply (not modulate around 1).
Parameters:
left (TimeSeriesModel|int|float) – The left operand of the multiplication.
right (TimeSeriesModel|int|float) – The right operand of the multiplication.
Examples
>>> from vangja import LinearTrend, UniformConstant
>>> # Create a model with a scaling factor
>>> model = LinearTrend() * UniformConstant(lower=0.8, upper=1.2)
>>> print(model)
LT(n=25,r=0.8,tm=None) * UC(l=0.8,u=1.2,tm=None)
Components are the building blocks for time series models. They can be combined
using + (additive), * (simple multiplicative), or ** (Prophet-style
multiplicative).
A piecewise linear trend component with optional changepoints.
This component models the trend of a time series as a piecewise linear
function, following the Prophet approach. The trend can have multiple
changepoints where the slope is allowed to change.
The trend is defined as:
trend(t) = (k + a(t)^T * delta) * t + (m + a(t)^T * gamma)
where:
k is the base slope
m is the intercept
delta is a vector of slope changes at changepoints
a(t) is an indicator vector for changepoints before time t
gamma is computed to make the trend continuous
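The formula can be sketched directly in numpy. This is a hypothetical helper, not vangja's API; note that gamma_j = -s_j * delta_j is exactly what keeps the trend continuous at each changepoint s_j.

```python
import numpy as np

def piecewise_trend(t, k, m, changepoints, delta):
    t = np.asarray(t, dtype=float)
    A = (t[:, None] >= changepoints[None, :]).astype(float)  # indicator a(t)
    gamma = -changepoints * delta   # offsets that keep the trend continuous
    return (k + A @ delta) * t + (m + A @ gamma)

# slope 1 before the changepoint at t=0.5, slope 3 after, no jump at t=0.5
y = piecewise_trend(np.linspace(0, 1, 101), k=1.0, m=0.0,
                    changepoints=np.array([0.5]),
                    delta=np.array([2.0]))
```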
Parameters:
n_changepoints (int, default 25) – The number of potential changepoints. Changepoints are placed
uniformly in the first changepoint_range fraction of data.
changepoint_range (float, default 0.8) – The proportion of the time range where changepoints are allowed.
For example, 0.8 means changepoints only in the first 80% of data.
slope_mean (float, default 0) – The mean of the Normal prior for the slope parameter.
slope_sd (float, default 5) – The standard deviation of the Normal prior for the slope parameter.
intercept_mean (float, default 0) – The mean of the Normal prior for the intercept parameter.
intercept_sd (float, default 5) – The standard deviation of the Normal prior for the intercept parameter.
delta_mean (float, default 0) – The mean of the Laplace prior for the slope changes at changepoints.
delta_sd (float|None, default 0.05) – The scale of the Laplace prior for slope changes. If None, the scale
is learned as a random variable with an Exponential(1.5) prior.
delta_side ({"left","right"}, default "left") – If "left", the slope parameter controls the slope at the earliest
time point. If "right", it controls the slope at the latest time.
pool_type (PoolType, default "complete") –
Type of pooling for multi-series data. One of:
"complete": All series share the same trend parameters
"partial": Hierarchical pooling with shared hyperpriors
"individual": Each series has independent parameters
delta_pool_type (PoolType, default "complete") – Pooling type specifically for changepoint deltas. Only used when
pool_type="partial".
tune_method (TuneMethod|None, default None) –
How the transfer learning is to be performed. One of:
"parametric": Use posterior mean/std as new priors
"prior_from_idata": Use posterior samples directly
None: No transfer learning
delta_tune_method (TuneMethod|None, default None) – Transfer learning method for changepoint deltas.
override_slope_mean_for_tune (np.ndarray|None, default None) – Override the slope mean during transfer learning.
override_slope_sd_for_tune (np.ndarray|None, default None) – Override the slope standard deviation during transfer learning.
override_delta_loc_for_tune (np.ndarray|None, default None) – Override the delta location during transfer learning.
override_delta_scale_for_tune (np.ndarray|None, default None) – Override the delta scale during transfer learning.
shrinkage_strength (float, default 100) – Controls hierarchical shrinkage. Higher values pull individual
series parameters more strongly toward the shared mean.
loss_factor_for_tune (float, default 0) – Regularization factor for transfer learning. Adds a penalty to
keep transferred parameters close to original values.
>>> # With hierarchical pooling for multiple series
>>> model = LinearTrend(
...     pool_type="partial",
...     shrinkage_strength=50,
...     n_changepoints=10
... )
>>> # Transfer learning from a pre-trained model
>>> target_model = LinearTrend(tune_method="parametric")
>>> target_model.fit(short_series, idata=source_trace)
The changepoint formulation follows the Facebook Prophet paper [1]_.
The delta_side="right" option is an extension that allows the
slope parameter to represent the end slope rather than the start slope.
model (TimeSeriesModel) – The model to which the parameters are added.
data (pd.DataFrame) – A pandas dataframe that must at least have columns ds (predictor), y
(target) and series (name of time series).
model_idxs (dict[str, int]) – Count of the number of components from each type.
priors (dict[str, pt.TensorVariable]|None) – A dictionary of multivariate normal random variables approximating the
posterior sample in idata.
idata (az.InferenceData|None) – Sample from a posterior. If it is not None, Vangja will use this to set the
parameters’ priors in the model. If idata is not None, each component from
the model should specify how idata should be used to set its parameters’
priors.
This is the simplest possible trend component: a single intercept
parameter with no slope and no changepoints. It models the baseline
level of the time series as a constant.
The model is:
trend(t)=intercept
This is useful when:
The time series has no discernible upward or downward trend.
You want a minimal trend component that adds only one parameter.
The series is short and estimating a slope would overfit.
Parameters:
intercept_mean (float, default 0) – The mean of the Normal prior for the intercept parameter.
intercept_sd (float, default 5) – The standard deviation of the Normal prior for the intercept.
pool_type (PoolType, default "complete") –
Type of pooling for multi-series data. One of:
"complete": All series share the same intercept.
"partial": Hierarchical pooling with shared hyperpriors.
"individual": Each series has an independent intercept.
>>> # With hierarchical pooling for multiple series
>>> model = FlatTrend(pool_type="partial", shrinkage_strength=50)
>>> # Transfer learning from a pre-trained model
>>> target_model = FlatTrend(tune_method="parametric")
>>> target_model.fit(short_series, idata=source_trace)
FlatTrend is equivalent to LinearTrend(n_changepoints=0) with
the slope fixed to 0, but is more explicit and has fewer parameters
to estimate. When composing models, it serves as a clean baseline
that relies on other components (seasonality, GP, etc.) to explain
temporal variation.
A seasonal component using Fourier series representation.
This component models periodic patterns in time series using a Fourier
series, following the Prophet approach. It allows flexible representation
of seasonal effects with controllable complexity via the number of terms.
The Fourier series representation is based on the Prophet paper [2]_.
Using more Fourier terms allows fitting more complex seasonal patterns
but increases the risk of overfitting.
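Concretely, a Prophet-style seasonal component is a learned linear combination of sine and cosine columns: s(t) = sum over n of a_n * cos(2*pi*n*t/P) + b_n * sin(2*pi*n*t/P). A sketch of the feature construction follows; the function name and signature are illustrative assumptions, not vangja's internals.

```python
import numpy as np

def fourier_features(t, period, series_order):
    """Build 2*series_order Fourier columns; the seasonal effect is X @ beta."""
    t = np.asarray(t, dtype=float)
    n = np.arange(1, series_order + 1)
    angles = 2 * np.pi * np.outer(t, n) / period   # shape (len(t), series_order)
    return np.hstack([np.cos(angles), np.sin(angles)])

# weekly seasonality with 3 Fourier terms -> 6 feature columns
X = fourier_features(np.arange(14), period=7, series_order=3)
```

Each extra term adds one cosine and one sine column, which is why higher series_order fits sharper seasonal shapes at the cost of more parameters.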
Add the FourierSeasonality parameters to the model.
Parameters:
model (TimeSeriesModel) – The model to which the parameters are added.
data (pd.DataFrame) – A pandas dataframe that must at least have columns ds (predictor), y
(target) and series (name of time series).
model_idxs (dict[str, int]) – Count of the number of components from each type.
priors (dict[str, pt.TensorVariable]|None) – A dictionary of multivariate normal random variables approximating the
posterior sample in idata.
idata (az.InferenceData|None) – Sample from a posterior. If it is not None, Vangja will use this to set the
parameters’ priors in the model. If idata is not None, each component from
the model should specify how idata should be used to set its parameters’
priors.
A constant component with a Normal (Gaussian) prior distribution.
This component adds a constant term to the model that is sampled from a
Normal distribution. It’s useful for modeling baseline offsets or intercept
terms that may vary across different time series.
Parameters:
mu (float, default 0) – The mean of the Normal prior for the constant parameter.
sd (float, default 1) – The standard deviation of the Normal prior for the constant parameter.
pool_type (PoolType, default "complete") –
Type of pooling performed when sampling. Options are:
"complete": All series share the same constant value.
"partial": Series have individual constants with shared hyperpriors.
"individual": Each series has a completely independent constant.
tune_method (TuneMethod|None, default None) –
How the transfer learning is to be performed. Options are:
"parametric": Use posterior mean and std from idata as new priors.
"prior_from_idata": Use the posterior samples directly as priors.
None: This component will not be tuned even if idata is provided.
override_mu_for_tune (float|None, default None) – Override the mean of the Normal prior for the constant parameter with
this value during transfer learning.
override_sd_for_tune (float|None, default None) – Override the standard deviation of the Normal prior for the constant
parameter with this value during transfer learning.
shrinkage_strength (float, default 1) – Shrinkage between groups for the hierarchical modeling. Higher values
result in stronger shrinkage toward the shared mean.
>>> from vangja import LinearTrend, NormalConstant
>>> # Add a normal constant offset to a linear trend
>>> model = LinearTrend() + NormalConstant(mu=0, sd=10)
>>> model.fit(data)
>>> predictions = model.predict(horizon=30)
>>> # Use partial pooling for multi-series data
>>> model = LinearTrend() + NormalConstant(mu=0, sd=10, pool_type="partial")
A constant component with a Beta prior distribution scaled to a range.
This component adds a constant term to the model that is sampled from a
Beta distribution and then scaled to lie within [lower, upper]. It’s useful
for modeling parameters that should be bounded and have flexible shapes
controlled by the alpha and beta parameters.
Parameters:
lower (float) – The lower bound for the constant parameter after scaling.
upper (float) – The upper bound for the constant parameter after scaling.
alpha (float, default 0.5) – The alpha parameter of the Beta distribution. Controls the shape.
beta (float, default 0.5) – The beta parameter of the Beta distribution. Controls the shape.
pool_type (PoolType, default "complete") –
Type of pooling performed when sampling. Options are:
"complete": All series share the same constant value.
"partial": Series have individual constants with shared hyperpriors.
"individual": Each series has a completely independent constant.
tune_method (TuneMethod|None, default None) –
How the transfer learning is to be performed. Options are:
"parametric": Use posterior samples to derive new Beta parameters.
"prior_from_idata": Use the posterior samples directly as priors.
None: This component will not be tuned even if idata is provided.
shrinkage_strength (float, default 1) – Shrinkage between groups for the hierarchical modeling. Higher values
result in stronger shrinkage toward the shared mean.
The transformation from Beta to the scaled constant is:
c = beta_value * (upper - lower) + lower
Common choices for alpha and beta:
alpha=beta=0.5: Jeffreys prior (U-shaped, more mass at extremes)
alpha=beta=1: Uniform distribution
alpha=beta=2: Symmetric bell-shaped
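The scaling transformation itself can be illustrated with plain numpy draws, independent of the component's PyMC internals:

```python
import numpy as np

lower, upper = 0.8, 1.2
rng = np.random.default_rng(0)
beta_value = rng.beta(2.0, 2.0, size=1000)   # Beta(2, 2) draws in (0, 1)
c = beta_value * (upper - lower) + lower     # scaled into [lower, upper]
```

Every draw lands inside the bounds, and with alpha=beta=2 the mass concentrates symmetrically around the midpoint of the range.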
Examples
>>> from vangja import LinearTrend, BetaConstant
>>> # Add a beta-distributed scaling factor between 0.8 and 1.2
>>> model = LinearTrend() * BetaConstant(lower=0.8, upper=1.2, alpha=2, beta=2)
>>> model.fit(data)
>>> predictions = model.predict(horizon=30)
>>> # Use partial pooling for multi-series data
>>> model = LinearTrend() * BetaConstant(lower=0.5, upper=1.5,
...                                      pool_type="partial")
A constant component with a Uniform prior distribution.
This component adds a constant term to the model that is sampled from a
Uniform distribution bounded by lower and upper limits. It’s useful for
modeling parameters that should be constrained to a specific range.
Parameters:
lower (float) – The lower bound of the Uniform prior for the constant parameter.
upper (float) – The upper bound of the Uniform prior for the constant parameter.
pool_type (PoolType, default "complete") –
Type of pooling performed when sampling. Options are:
"complete": All series share the same constant value.
"partial": Series have individual constants with shared hyperpriors.
"individual": Each series has a completely independent constant.
tune_method (TuneMethod|None, default None) –
How the transfer learning is to be performed. Options are:
"parametric": Use posterior mean and std from idata to create a
truncated Normal prior.
"prior_from_idata": Use the posterior samples directly as priors.
None: This component will not be tuned even if idata is provided.
shrinkage_strength (float, default 1) – Shrinkage between groups for the hierarchical modeling. Higher values
result in stronger shrinkage toward the shared mean.
Filter predictions to only include dates relevant to a specific series.
When fitting multiple series simultaneously with different date ranges,
the predict() method generates predictions for the entire combined time
range. This function filters predictions to only include dates within a
specific series’ range, which is essential for correct metric calculation
and plotting.
Parameters:
future (pd.DataFrame) – Predictions dataframe from model.predict() containing ‘ds’ and yhat columns.
series_data (pd.DataFrame) – The original data for a specific series (train + test combined, or just
the portion you want to filter to). Must have ‘ds’ column.
yhat_col (str, default "yhat_0") – The name of the prediction column to include in the output.
horizon (int, default 0) – Additional days beyond the series' max date to include (for the forecast period).
Returns:
Filtered predictions with columns [‘ds’, ‘yhat_0’] containing only dates
within the series’ range plus the specified horizon.
Return type:
pd.DataFrame
Examples
>>> # After fitting a multi-series model
>>> future_combined = model.predict(horizon=365)
>>> # Filter to only Air Passengers' relevant dates
>>> future_passengers = filter_predictions_by_series(
...     future_combined,
...     air_passengers,  # full dataset (train + test)
...     yhat_col=f"yhat_{passengers_group}",
...     horizon=365
... )
Calculate evaluation metrics for time series predictions.
Computes Mean Squared Error (MSE), Root Mean Squared Error (RMSE),
Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE)
for each time series in the dataset.
Parameters:
y_true (pd.DataFrame) – A pandas dataframe containing the true values for the inference period
that must at least have columns ds (predictor), y (target) and series
(name of time series).
future (pd.DataFrame) – Pandas dataframe containing the timestamps and predictions. Must have
columns named ‘yhat_{group_code}’ for each group. The ‘ds’ column is
used to match predictions to test data by date.
pool_type (PoolType) – Type of pooling performed when sampling. Used to determine group
assignments in y_true.
Returns:
A dataframe with series names as index and columns for each metric:
‘mse’, ‘rmse’, ‘mae’, ‘mape’.
Predictions are matched to test data by merging on the ‘ds’ column. This
correctly handles cases where predictions are at a different frequency
than the test data (e.g., daily predictions vs monthly test data).
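The merge-then-score logic described in these notes can be sketched as follows for a single series. This is a hypothetical helper, not vangja's implementation.

```python
import numpy as np
import pandas as pd

def basic_metrics(y_true, future, yhat_col="yhat_0"):
    """Merge on 'ds' so predictions and truth align by date, then score."""
    merged = y_true.merge(future[["ds", yhat_col]], on="ds", how="inner")
    err = merged["y"] - merged[yhat_col]
    mse = float(np.mean(err ** 2))
    return {
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "mae": float(np.mean(np.abs(err))),
        "mape": float(np.mean(np.abs(err / merged["y"]))),
    }
```

Because the join is an inner merge on 'ds', prediction rows without a matching test date are simply dropped, which is what makes mixed-frequency comparisons work.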
Remove random continuous intervals (gaps) from a time series DataFrame.
Creates realistic missing-data scenarios by removing n_gaps
non-overlapping contiguous blocks from the data. Each block removes
approximately gap_fraction of the total data points.
Parameters:
df (pd.DataFrame) – A time series DataFrame. Must have at least a ds column.
n_gaps (int, default 4) – Number of contiguous intervals to remove.
gap_fraction (float, default 0.2) – Fraction of total data points removed per gap.
Returns:
A copy of the input DataFrame with the specified gaps removed,
index reset.
Return type:
pd.DataFrame
Raises:
ValueError – If the total number of points to remove exceeds the length of the
DataFrame.
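A minimal sketch of the gap-removal idea. This is hypothetical code; unlike the documented function, the sketch allows sampled blocks to overlap rather than enforcing non-overlapping gaps.

```python
import numpy as np
import pandas as pd

def remove_gaps(df, n_gaps=2, gap_fraction=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n = len(df)
    gap_len = max(1, int(n * gap_fraction))
    if n_gaps * gap_len >= n:
        raise ValueError("gaps would remove the entire series")
    drop = set()
    for _ in range(n_gaps):
        start = int(rng.integers(0, n - gap_len))
        drop.update(range(start, start + gap_len))  # one contiguous block
    return df.drop(df.index[sorted(drop)]).reset_index(drop=True)
```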
Compare multiple fitted models using information criteria.
Wraps arviz.compare to produce a ranked table of models scored by
WAIC or LOO-CV (PSIS).
Parameters:
model_dict (dict[str, az.InferenceData|object]) – Mapping of model names to either arviz.InferenceData objects or
fitted vangja model objects that expose a .trace attribute.
ic ({"loo","waic"}, default "loo") – Information criterion to use.
Returns:
Comparison table sorted by the chosen criterion (best model first).
Calculate the fraction of prior predictive samples within a plausible range.
This is a quantitative complement to visual prior predictive checks.
Because vangja scales the data so that \(y \approx [-1, 1]\) and
\(t \in [0, 1]\), comparing the prior predictive samples against a
fixed plausible window (default [-2,2]) reveals how informative or
diffuse the chosen priors are.
How to interpret the result:
< 5 % — priors are too loose. The sampler wastes time in
physically impossible regions. Reduce the prior standard deviations.
> 95 % — priors may be too tight. The model risks being unable to
capture sudden spikes or changepoints. Increase the prior standard
deviations.
30–60 % — a reasonable sweet spot for flexible models like Prophet.
The prior covers the data range without encouraging absurd values.
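The coverage number itself is straightforward to compute by hand. The sketch below operates on a raw sample array rather than an InferenceData object, purely to illustrate the calculation.

```python
import numpy as np

def coverage(samples, low=-2.0, high=2.0):
    """Fraction of sample values inside the plausible window [low, high]."""
    samples = np.asarray(samples, dtype=float)
    return float(np.mean((samples >= low) & (samples <= high)))
```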
Parameters:
prior_predictive (az.InferenceData) – Result of model.sample_prior_predictive().
low (float, default -2.0) – Lower bound of the plausible range (in scaled space).
high (float, default 2.0) – Upper bound of the plausible range (in scaled space).
series_idx (int or None) – If the prior predictive contains multiple series (e.g. from a hierarchical
model), specify which one to calculate coverage for. If None, calculates for everything.
If series_idx is not None you must also pass the corresponding group array
to the group parameter.
group (np.ndarray or None) – If the prior predictive contains multiple groups (e.g. from a hierarchical
model), specify which element belongs to which group.
Returns:
Fraction of individual sample values inside [low,high],
between 0 and 1.
>>> model = LinearTrend() + FourierSeasonality(365.25, 10)
>>> model.fit(data, method="mapx")
>>> ppc = model.sample_prior_predictive(samples=500)
>>> coverage = prior_predictive_coverage(ppc)
>>> print(f"{coverage*100:.1f}% of prior samples are within [-2, 2]")
Plot prior and posterior densities on the same axes.
Generates a grid of subplots, one per parameter, showing the prior
density (from the analytic specification) and the posterior density
(from MCMC/VI samples).
Parameters:
trace (az.InferenceData) – Posterior samples from a fitted model.
prior_params (dict[str, dict[str, float]]) – Mapping of variable names to dicts describing the prior. Each dict
must contain "dist" (one of "normal", "halfnormal",
"laplace") and the relevant parameters ("mu"/"sigma" for
Normal, "sigma" for HalfNormal, "mu"/"b" for Laplace).
var_names (list[str] or None) – Subset of variables to include. Defaults to all keys in
prior_params.
Plot posterior predictive samples, overlaid on observed data.
Parameters:
posterior_predictive (az.InferenceData) – Result of model.sample_posterior_predictive().
series_idx (int or None) – If the posterior predictive contains multiple series (e.g. from a hierarchical
model), specify which one to plot. If None, plots everything.
If series_idx is not None you must also pass the corresponding group array
to the group parameter.
group (np.ndarray or None) – If the posterior predictive contains multiple groups (e.g. from a hierarchical
model), specify which element belongs to which group.
data (pd.DataFrame or None) – Observed data with columns ds and y.
n_samples (int, default 50) – Number of posterior predictive traces to draw.
show_hdi (bool, default False) – If True, shade the Highest Density Interval across time.
hdi_prob (float, default 0.9) – Probability mass for the HDI band (ignored when show_hdi=False).
show_ref_lines (bool, default False) – If True, draw horizontal dashed lines at the scaled-data bounds
given by ref_values.
ref_values (tuple[float, float], default (-1.0, 1.0)) – (lower, upper) values for the reference lines.
t (np.ndarray or None) – x-axis values. When None the observation index is used. Pass
model.data["t"].values for the normalised time axis, or
model.data["ds"].values for calendar dates.
Plot prior predictive samples, optionally overlaid on observed data.
Draws a “spaghetti plot” of prior predictive traces and, optionally,
an HDI envelope and horizontal reference lines to help judge whether
the chosen priors are plausible in the scaled data space.
Parameters:
prior_predictive (az.InferenceData) – Result of model.sample_prior_predictive().
series_idx (int or None) – If the prior predictive contains multiple series (e.g. from a hierarchical
model), specify which one to plot. If None, plots everything.
If series_idx is not None you must also pass the corresponding group array
to the group parameter.
group (np.ndarray or None) – If the prior predictive contains multiple groups (e.g. from a hierarchical
model), specify which element belongs to which group.
data (pd.DataFrame or None) – Observed data with columns ds and y.
n_samples (int, default 50) – Number of prior predictive traces to draw.
ax (matplotlib.axes.Axes or None) – Axes to plot on. Created if None.
show_hdi (bool, default False) – If True, shade the Highest Density Interval across time.
hdi_prob (float, default 0.9) – Probability mass for the HDI band (ignored when show_hdi=False).
show_ref_lines (bool, default False) – If True, draw horizontal dashed lines at the scaled-data bounds
given by ref_values. Useful for checking whether the prior
predictive concentrates within the plausible region of scaled data.
ref_values (tuple[float, float], default (-1.0, 1.0)) – (lower, upper) values for the reference lines (ignored when
show_ref_lines=False). The defaults correspond to the
approximate extent of maxabs-scaled data.
t (np.ndarray or None) – x-axis values. When None the observation index is used. Pass
model.data["t"].values for the normalised time axis, or
model.data["ds"].values for calendar dates.
Return type:
matplotlib.axes.Axes
Examples
>>> model = LinearTrend() + FourierSeasonality(365.25, 10)
>>> model.fit(data, method="mapx")
>>> ppc = model.sample_prior_predictive(samples=200)
>>> # Simple spaghetti plot
>>> plot_prior_predictive(ppc)
>>> # With HDI, reference lines and scaled time axis
>>> plot_prior_predictive(
...     ppc,
...     data=data,
...     show_hdi=True,
...     show_ref_lines=True,
...     t=model.data["t"].values,
... )
The Air Passengers dataset is a classic time series dataset containing
monthly totals of international airline passengers from January 1949 to
December 1960 (144 observations).
This dataset exhibits:
- Clear upward trend
- Strong yearly seasonality
- Multiplicative seasonality (variance increases with level)
Returns:
DataFrame with columns:
- ds: datetime, monthly timestamps from 1949-01 to 1960-12
- y: float, number of passengers (in thousands)
Return type:
pd.DataFrame
Examples
>>> from vangja.datasets import load_air_passengers
>>> df = load_air_passengers()
>>> print(f"Shape: {df.shape}")
Shape: (144, 2)
>>> print(f"Date range: {df['ds'].min()} to {df['ds'].max()}")
Date range: 1949-01-01 to 1960-12-01
Notes
Data is downloaded from the Prophet examples repository on GitHub.
Original source: Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (1976)
Time Series Analysis, Forecasting and Control. Third Edition.
This dataset contains daily bike ride counts from Citi Bike station 360
in New York City (2013-07-01 to 2014-10-31). It is used to demonstrate
forecasting short time series with transfer learning.
The dataset exhibits:
Strong weekly seasonality (weekday vs weekend patterns)
Yearly seasonality correlated with temperature/weather
Approximately 3 months of initial data used for training (~106 days)
Returns:
DataFrame with columns:
ds: datetime, daily timestamps from 2013-07-01 to 2014-10-31
This dataset is from Tim Radtke’s blog post “Modeling Short Time Series
with Prior Knowledge”. The vangja library was partially inspired by this
work and Juan Orduz’s PyMC implementation.
Requires the pyreadr package (install with pip install vangja[datasets]).
Load New York City historical daily temperature data.
This dataset contains daily maximum temperatures (Fahrenheit) for
New York City from 2012-10-01 to 2017-11-29. It is used to learn
yearly seasonality patterns that can be transferred to short time series.
This dataset is from Tim Radtke’s blog post “Modeling Short Time Series
with Prior Knowledge”. The temperature seasonality can be used as prior
information for forecasting related short time series (e.g., bike sales).
Load historical hourly temperature data from Kaggle.
Downloads the temperature.csv file from the
Historical Hourly Weather Data
dataset. Returns data for the requested city, filtered to the given
date range and aggregated to the specified frequency.
The raw data contains hourly observations in Kelvin. Values are
converted to Celsius before returning.
Parameters:
city (KaggleTemperatureCity, default "New York") – City column to extract. Must be one of the 36 cities in the
dataset (see KaggleTemperatureCity).
start_date (str, pd.Timestamp, or None, default None) – Start of the date range (inclusive). If None, the earliest
available date is used (~2012-10-01).
end_date (str, pd.Timestamp, or None, default None) – End of the date range (inclusive). If None, the latest
available date is used (~2017-11-30).
freq (str, default "D") – Pandas offset alias for temporal aggregation (e.g. "D" for
daily mean, "W" for weekly mean, "h" for hourly — no
aggregation). The aggregation function is mean.
Returns:
DataFrame with columns:
ds: datetime
y: float, temperature in degrees Celsius
series: str, the original city name from the Kaggle dataset
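The Kelvin-to-Celsius conversion and the mean aggregation to a target frequency described above can be sketched with plain pandas. This is a toy reconstruction on synthetic hourly data, not the loader's actual implementation:

```python
import numpy as np
import pandas as pd

# Synthetic hourly observations in Kelvin, mimicking the raw Kaggle file.
idx = pd.date_range("2013-01-01", periods=48, freq="h")
raw = pd.DataFrame({
    "ds": idx,
    "y": 288.15 + np.random.default_rng(0).normal(0, 1, 48),
})

# Convert Kelvin -> Celsius before returning.
raw["y"] = raw["y"] - 273.15

# Aggregate to the requested frequency with a mean, e.g. freq="D".
daily = raw.set_index("ds")["y"].resample("D").mean().reset_index()
print(daily)  # two daily rows, values near 15 degrees Celsius
```

Passing `"h"` as the frequency would leave the hourly resolution untouched, since each resample bin then contains a single observation.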
Downloads the HomeC.csv file from the
Smart Home Dataset with Weather Information
dataset. Returns data for the requested appliance or total column(s),
filtered to the given date range and aggregated to the specified
frequency.
The raw data has 1-minute resolution and covers roughly
2016-01-01 to 2016-12-16. Each column is in kW.
Parameters:
column (SmartHomeColumn or list[SmartHomeColumn], default "use [kW]") –
The appliance or total column(s) to extract (see
SmartHomeColumn). When a single string is passed the
returned DataFrame has columns ds and y. When a list
is passed the result is in long format with an additional
series column identifying each appliance.
Common choices:
"use[kW]" — total energy use
"gen[kW]" — total energy generation
"Houseoverall[kW]" — house overall consumption
"Dishwasher[kW]", "Fridge[kW]", etc. — individual
appliances
start_date (str, pd.Timestamp, or None, default None) – Start of the date range (inclusive). If None, the earliest
available date is used (~2016-01-01).
end_date (str, pd.Timestamp, or None, default None) – End of the date range (inclusive). If None, the latest
available date is used (~2016-12-16).
freq (str or None, default None) – Pandas offset alias for temporal aggregation (e.g. "D" for
daily mean, "h" for hourly mean, "W" for weekly mean).
The aggregation function is mean. If None, no aggregation
is performed and the original 1-minute data is returned.
Returns:
DataFrame with columns:
ds: datetime
y: float, energy reading in kW
series: str (only when column is a list) — the original column name from the Kaggle dataset
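When a list of columns is requested, the wide per-appliance table is reshaped into the long ds/y/series layout described above. A minimal sketch of that reshape using pandas melt, on toy data standing in for HomeC.csv:

```python
import pandas as pd

# Toy wide frame mimicking two appliance columns in kW at 1-minute resolution.
wide = pd.DataFrame({
    "ds": pd.date_range("2016-01-01", periods=3, freq="min"),
    "Dishwasher [kW]": [0.0, 0.7, 0.7],
    "Fridge [kW]": [0.1, 0.1, 0.1],
})

# Wide -> long: one row per (timestamp, appliance), with the original
# column name preserved in the `series` column.
long = wide.melt(id_vars="ds", var_name="series", value_name="y")
print(long)
```

When a single string is passed instead of a list, no melt is needed and the result keeps only the ds and y columns.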
Creates 5 synthetic time series representing different stores, all sharing
the same date range. Each series has:
- Linear trend with different slopes and intercepts
- Yearly seasonality with different amplitudes
- Weekly seasonality
- Random noise
This dataset is ideal for demonstrating:
- Simultaneous vs sequential fitting
- Individual pooling across multiple series
- Vectorized multi-series forecasting
Parameters:
start_date (str, default "2015-01-01") – Start date for the time series
end_date (str, default "2019-12-31") – End date for the time series
freq (str, default "D") – Frequency of the time series (e.g., "D" for daily)
seed (int or None, default 42) – Random seed for reproducibility. Set to None for random data.
Returns:
df (pd.DataFrame) – Combined DataFrame with columns:
- ds: datetime timestamps
- y: target values
- series: store name (e.g., “store_north”)
params (list of dict) – List of parameter dictionaries for each store, containing:
- name: store name
- trend_slope, trend_intercept: trend parameters
- yearly_amplitude, weekly_amplitude: seasonality amplitudes
- noise_std: noise standard deviation
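A single series of the kind this generator produces (linear trend plus yearly and weekly sinusoids plus Gaussian noise) can be sketched as follows; the slope, amplitudes, and noise level here are illustrative, not the generator's actual per-store parameters:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
ds = pd.date_range("2015-01-01", "2019-12-31", freq="D")
t = np.arange(len(ds))

trend = 0.05 * t + 100.0                         # linear trend
yearly = 10.0 * np.sin(2 * np.pi * t / 365.25)   # yearly seasonality
weekly = 2.0 * np.sin(2 * np.pi * t / 7)         # weekly seasonality
noise = rng.normal(0.0, 1.5, len(ds))            # random noise

df = pd.DataFrame({
    "ds": ds,
    "y": trend + yearly + weekly + noise,
    "series": "store_north",
})
```

The full generator repeats this with different parameters for each of the 5 stores and concatenates the results into one long DataFrame.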
Load historical stock data split into training and test sets.
Downloads daily OHLCV data for the specified tickers using Yahoo
Finance and computes the typical price as
(Open + High + Low + Close) / 4. The data is split into a
training window and a test horizon around split_date.
Parameters:
tickers (list[str]) – List of ticker symbols to download (e.g., ["AAPL", "MSFT"]).
split_date (str or pd.Timestamp) – The date separating training and test data. Training data
covers [split_date - window_size, split_date) and test
data covers [split_date, split_date + horizon_size].
window_size (int) – Number of calendar days for the training window (before
split_date).
horizon_size (int) – Number of calendar days for the test horizon (from
split_date onwards).
cache_path (Path or None, default None) – Directory for caching downloaded data. Each ticker is stored
as a CSV file. If None, data is downloaded without caching.
If provided, parent directories are created if they do not
exist.
interpolate (bool, default False) – If True, missing days (weekends, holidays) within each series
are filled using linear interpolation after reindexing to a
daily calendar.
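The typical-price computation and the window split around split_date can be sketched with pandas; synthetic OHLC data stands in for the Yahoo Finance download here:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
ds = pd.date_range("2020-01-01", "2020-12-31", freq="D")
close = 100 + rng.normal(0, 1, len(ds)).cumsum()
ohlc = pd.DataFrame({"ds": ds, "Open": close - 0.5, "High": close + 1.0,
                     "Low": close - 1.0, "Close": close})

# Typical price: mean of the four OHLC values.
ohlc["y"] = (ohlc["Open"] + ohlc["High"] + ohlc["Low"] + ohlc["Close"]) / 4

split_date = pd.Timestamp("2020-10-01")
window_size, horizon_size = 90, 30

# Training covers [split_date - window_size, split_date);
# test covers [split_date, split_date + horizon_size].
train = ohlc[(ohlc["ds"] >= split_date - pd.Timedelta(days=window_size))
             & (ohlc["ds"] < split_date)]
test = ohlc[(ohlc["ds"] >= split_date)
            & (ohlc["ds"] <= split_date + pd.Timedelta(days=horizon_size))]
```

Note that both window sizes count calendar days, so with real exchange data the number of rows in each window is smaller than the day count unless interpolate=True fills the non-trading days.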
Get the tickers that were consistently in the S&P 500 during a date range.
Returns tickers that were part of the S&P 500 for the entire duration
between start_date and end_date. A ticker is excluded if it
was removed at any point during the range, even if it was later
re-added.
Parameters:
start_date (str, datetime, or pd.Timestamp) – Start of the date range (inclusive).
end_date (str, datetime, or pd.Timestamp) – End of the date range (inclusive).
cache_path (Path or None, default None) – Directory for caching Wikipedia data as CSV files. If None,
data is fetched without caching. If provided, parent directories
are created if they do not exist.
Returns:
Sorted list of ticker symbols that were consistently in the
S&P 500 during the entire date range.
Accuracy depends on Wikipedia’s “List of S&P 500 companies”
historical changes table, which has comprehensive data from
approximately 1997 onwards. Results for earlier periods may be
less accurate.
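The "consistently in the index" filter amounts to dropping any ticker with a removal event inside the range, even if it was later re-added. A toy sketch of that logic, assuming a membership-change log with ticker/action/date columns (the real function parses Wikipedia's historical-changes table, whose layout differs):

```python
import pandas as pd

# Toy membership-change log standing in for the Wikipedia changes table.
changes = pd.DataFrame({
    "ticker": ["AAA", "BBB", "BBB", "CCC"],
    "action": ["added", "removed", "added", "removed"],
    "date": pd.to_datetime(
        ["1995-01-01", "2016-05-01", "2017-02-01", "2021-06-01"]),
})
current = {"AAA", "BBB"}  # members as of end_date

start, end = pd.Timestamp("2015-01-01"), pd.Timestamp("2019-12-31")

# Exclude any ticker removed at any point inside the range,
# even if it was later re-added (as BBB was in 2017).
removed_in_range = set(
    changes.loc[(changes["action"] == "removed")
                & changes["date"].between(start, end), "ticker"]
)
consistent = sorted(current - removed_in_range)
print(consistent)  # ['AAA']
```

BBB is excluded because of its 2016 removal despite being re-added before the end of the range; CCC's removal falls outside the range and is ignored.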