vangja#

A Bayesian time series forecasting package that extends Facebook Prophet with hierarchical modeling and transfer learning capabilities. Vangja enables practitioners to model short time series using prior knowledge derived from similar long time series and is particularly good at forecasting horizons longer than the available data.

The package has been inspired by:

Key Features#

🚀 Vectorized Multi-Series Fitting — Fit multiple time series simultaneously with vectorized computations, significantly faster than fitting sequentially with Facebook Prophet
📊 Hierarchical Bayesian Modeling — Model multiple related time series with flexible pooling strategies (complete, partial, or individual) for each component
🔄 Bayesian Transfer Learning — Learn from long time series and transfer knowledge to short time series, enabling accurate long-horizon forecasts from limited data
↔️ Bidirectional Changepoints — Interpret trend changepoints from right-to-left (in addition to left-to-right), essential for hierarchical modeling of time series with different lengths
🎯 Component-Level Flexibility — Independently configure pooling strategies and transfer learning methods for each model component (trend, seasonalities, etc.)

Installation#

You need to create a conda PyMC environment before installing vangja. The recommended way of installing PyMC is by running:

conda create -c conda-forge -n pymc_env python=3.13 "pymc>=5.27.1"

Install vangja with pip:

pip install vangja

Usage#

The data used for fitting the models is expected to be in the same format as the data used for fitting the Facebook Prophet model i.e. it should be a pandas dataframe, where the timestamp is stored in column ds and the value is stored in column y.

The API is heavily inspired by TimeSeers. A simple model consisting of a linear trend, a yearly seasonality and a weekly seasonality can be fitted like this:

from vangja import LinearTrend, FourierSeasonality

model = LinearTrend() + FourierSeasonality(365.25, 10) + FourierSeasonality(7, 3)
model.fit(data)
predictions = model.predict(horizon=365)

Vectorized Multi-Series Fitting#

Unlike Facebook Prophet, which fits time series one at a time, vangja can fit multiple time series simultaneously using vectorized computations. This is significantly faster when you have many related time series:

# Data must have a 'series' column identifying each time series
# Example: sales data from multiple stores
multi_series_data = pd.DataFrame({
    'ds': [...],  # timestamps
    'y': [...],   # values
    'series': ['store_A', 'store_A', ..., 'store_B', 'store_B', ...]
})

# Fit all series at once with independent parameters (no pooling)
model = LinearTrend(pool_type="individual") + FourierSeasonality(365.25, 10, pool_type="individual")
model.fit(multi_series_data)

Multiplicative operators#

There are two types of multiplicative operators that vangja supports. The first one supports creating models from components $g(t)$ and $s(t)$ in the form $y(t)=g(t) * (1 + s(t))$. Using vangja, this can be written by using the __pow__ operator:

model = LinearTrend() ** FourierSeasonality(365.25, 10)

The second multiplicative operator supports creating models from components $g(t)$ and $s(t)$ in the form $y(t)=g(t) * s(t)$. Using vangja, this can be written by using the __mul__ operator:

model = LinearTrend() * FourierSeasonality(365.25, 10)

Components#

Currently, vangja supports the following components:

LinearTrend#

A piecewise linear trend with changepoints. Vangja extends Prophet’s trend component with bidirectional changepoint interpretation.

LinearTrend(
    n_changepoints=25,       # Number of potential changepoints
    changepoint_range=0.8,   # Proportion of data for changepoint placement
    slope_mean=0,            # Prior mean for initial slope
    slope_sd=5,              # Prior std for initial slope
    intercept_mean=0,        # Prior mean for intercept
    intercept_sd=5,          # Prior std for intercept
    delta_mean=0,            # Prior mean for changepoint adjustments
    delta_sd=0.05,           # Prior std for changepoint adjustments
    delta_side="left",       # "left" or "right" - direction for changepoint interpretation
    pool_type="complete",    # Pooling: "complete", "partial", or "individual"
    delta_pool_type="complete",  # Separate pooling strategy for changepoints
    tune_method=None         # Transfer learning: "parametric" or "prior_from_idata"
)

Bidirectional Changepoints (`delta_side`)#

By default (delta_side="left"), the slope parameter controls the trend slope at the earliest timestamp, and changepoints modify the slope going forward in time.

Setting delta_side="right" reverses this: the slope parameter controls the trend slope at the latest timestamp, and changepoints modify the slope going backward in time. This is essential for hierarchical modeling when you have:

A long time series spanning many years
Multiple short time series that only cover the recent period

With delta_side="right", the slope parameter is informed by both the long and short time series (since they overlap at the end), rather than only the long time series (which alone covers the beginning).

FourierSeasonality#

Seasonal patterns modeled using Fourier series.

FourierSeasonality(
    period,                  # Period in days (e.g., 365.25 for yearly)
    series_order,            # Number of Fourier terms (higher = more flexible)
    beta_mean=0,             # Prior mean for Fourier coefficients
    beta_sd=10,              # Prior std for Fourier coefficients
    pool_type="complete",    # Pooling: "complete", "partial", or "individual"
    tune_method=None         # Transfer learning: "parametric" or "prior_from_idata"
)

NormalConstant#

A constant term with a Normal prior, useful for baseline offsets.

NormalConstant(
    mu=0,                    # Prior mean
    sigma=1,                 # Prior standard deviation
    pool_type="complete",    # Pooling: "complete", "partial", or "individual"
    tune_method=None         # Transfer learning: "parametric" or "prior_from_idata"
)

UniformConstant#

A constant term with a Uniform prior.

UniformConstant(
    lower=0,                 # Lower bound
    upper=1,                 # Upper bound
    pool_type="complete",    # Pooling: "complete", "partial", or "individual"
    tune_method=None         # Transfer learning: "parametric" or "prior_from_idata"
)

FlatTrend#

A constant-level baseline (intercept only, no slope, no changepoints). Useful when the time series has no discernible trend or is too short to estimate one reliably.

FlatTrend(
    intercept_mean=0,        # Prior mean for intercept
    intercept_sd=5,          # Prior std for intercept
    pool_type="complete",    # Pooling: "complete", "partial", or "individual"
    tune_method=None         # Transfer learning: "parametric" or "prior_from_idata"
)

BetaConstant#

A constant term with a scaled Beta prior, bounded between [lower, upper].

BetaConstant(
    lower=0,                 # Lower bound for scaling
    upper=1,                 # Upper bound for scaling
    alpha=2,                 # Beta distribution alpha parameter
    beta=2,                  # Beta distribution beta parameter
    pool_type="complete",    # Pooling: "complete", "partial", or "individual"
    tune_method=None         # Transfer learning: "parametric" or "prior_from_idata"
)

Pooling Types (Hierarchical Modeling)#

When modeling multiple time series together, you can control how parameters are shared using hierarchical Bayesian modeling. This is inspired by TimeSeers but with greater flexibility — vangja allows different pooling strategies for each component and even for different parameters within the same component.

"complete": All series share the same parameters. Best when series are very similar.
"partial": Hierarchical pooling with shared hyperpriors — parameters are drawn from a common distribution but can differ between series. This balances information sharing with individual variation.
"individual": Each series has completely independent parameters. Equivalent to fitting each series separately, but vectorized for speed.

Note: The pandas dataframe must have a series column that identifies which rows belong to which time series.

# Example: Hierarchical model with different pooling per component
# - Trend slope: partial pooling (similar but not identical across series)
# - Changepoints: complete pooling (shared across all series)
# - Yearly seasonality: complete pooling (same seasonal pattern)
# - Weekly seasonality: partial pooling (similar weekly patterns)

model = (
    LinearTrend(pool_type="partial", delta_pool_type="complete") 
    + FourierSeasonality(365.25, 10, pool_type="complete")
    + FourierSeasonality(7, 3, pool_type="partial")
)
model.fit(multi_series_data)

Why Different Pooling Strategies?#

Unlike TimeSeers which applies the same pooling to all components, vangja lets you choose based on domain knowledge:

Yearly seasonality: Often similar across related series → use "complete"
Weekly seasonality: May vary by series (e.g., different stores have different weekly patterns) → use "partial"
Trend slope: Usually similar for related series → use "partial"
Changepoints: When dealing with a long context series and short target series, changepoints are only observable in the long series → use "complete"

Model Tuning (Bayesian Transfer Learning)#

A core feature of vangja is the ability to transfer knowledge from a long time series to multiple short time series. This is particularly useful when:

You have only a few months of data but need to model yearly seasonality
You want to forecast a horizon longer than your available short time series
You have a “context” time series (e.g., market index) and want to use it to inform forecasts for related series (e.g., individual stocks)

Forecasting short time series is challenging because:

Long-period seasonalities (e.g., yearly) cannot be estimated from short data
Overfitting is likely when the forecast horizon exceeds the data length
Standard methods like Facebook Prophet will produce unreliable forecasts

Vangja implements Bayesian transfer learning: fit a model on a long time series, extract the posterior distributions of parameters, and use them as informed priors when fitting short time series.

Transfer Learning Methods#

There are two tuning methods available:

1. Parametric Transfer (`"parametric"`)#

Uses the posterior mean (you can also set the mode, or any other value that you need) and standard deviation from the fitted model to set new priors while keeping the same distribution form:

# Step 1: Fit on long time series
base_model = (
    LinearTrend(tune_method="parametric") 
    + FourierSeasonality(365.25, 10, tune_method="parametric")
)
base_model.fit(long_time_series, method="nuts", samples=1000, chains=4)

# Step 2: Transfer to short time series
# The posterior from step 1 becomes the prior for step 2
target_model = (
    LinearTrend(tune_method="parametric") 
    + FourierSeasonality(365.25, 10, tune_method="parametric")
)
target_model.fit(short_time_series, idata=base_model.trace)

# Step 3: Forecast with confidence
predictions = target_model.predict(horizon=365)  # Can forecast beyond the short series length!

2. Prior from InferenceData (`"prior_from_idata"`)#

Uses the full posterior samples via multivariate normal approximation, preserving correlations between parameters:

base_model = (
    LinearTrend(tune_method="prior_from_idata") 
    + FourierSeasonality(365.25, 10, tune_method="prior_from_idata")
)
base_model.fit(long_time_series, method="nuts", samples=1000, chains=4)

target_model = (
    LinearTrend(tune_method="prior_from_idata") 
    + FourierSeasonality(365.25, 10, tune_method="prior_from_idata")
)
target_model.fit(short_time_series, idata=base_model.trace)

This method captures parameter dependencies (e.g., correlation between trend slope and seasonality amplitude) that the parametric method ignores.

Combining Hierarchical Modeling with Transfer Learning#

Vangja uniquely allows you to combine both approaches:

# Step 1: Fit base model on long "context" time series
base_model = (
    LinearTrend(tune_method="parametric") 
    + FourierSeasonality(365.25, 10, tune_method="parametric")
)
base_model.fit(long_context_series, method="nuts", samples=1000, chains=4)

# Step 2: Combine short target time series
target_data = pd.concat([
    short_series_1.assign(series='target_1'),
    short_series_2.assign(series='target_2'),
])

# Step 3: Hierarchical model with transfer learning on targets
target_model = (
    LinearTrend(
        pool_type="partial",           # Hierarchical pooling
        delta_side="right",            # Slope parameter informed by all series
        tune_method="parametric"       # Transfer from context to targets
    ) 
    + FourierSeasonality(365.25, 10, pool_type="complete", tune_method="parametric")
)

# Fit targets using posterior from context as priors
target_model.fit(target_data, idata=base_model.trace)
predictions = target_model.predict(horizon=365)

Regularization for Transfer Learning#

To prevent overfitting when transferring knowledge, vangja supports regularization via the loss_factor_for_tune parameter. This adds a penalty term that constrains parameters to stay close to the values learned from the long time series:

FourierSeasonality(
    365.25, 10, 
    tune_method="parametric",
    loss_factor_for_tune=1.0  # Higher = stronger regularization toward context series
)

Plotting#

After fitting, you can visualize the model components:

# Make predictions
predictions = model.predict(horizon=365)

# Plot the overall model fit and forecast
model.plot(predictions)

# Plot with actuals overlaid
model.plot(predictions, y_true=test_data)

Metrics#

Evaluate forecast accuracy using built-in metrics:

from vangja.utils import metrics

# Compare actual vs predicted values
results = metrics(test_data, predictions, pool_type="complete")
# Returns MSE, RMSE, MAE, MAPE per series

Vangja vs Facebook Prophet vs TimeSeers#

Feature	Facebook Prophet	TimeSeers	Vangja
Single time series	✅	✅	✅
Vectorized multi-series	❌	✅	✅
Hierarchical Bayesian	❌	✅	✅
Per-component pooling	❌	❌	✅
Bidirectional changepoints	❌	❌	✅
Transfer learning	❌	❌	✅
Parametric prior transfer	❌	❌	✅
Multivariate Gaussian prior	❌	❌	✅
Regularization for transfer	❌	❌	✅
Modern PyMC (5.x)	❌	❌	✅

Inference Methods#

Vangja supports multiple inference methods:

# MAP estimation (fast, recommended for quick results)
model.fit(data, method="mapx")  # Uses JAX backend via pymc-extras

# Full Bayesian inference with MCMC
model.fit(data, method="nuts", samples=1000, chains=4)

# Variational inference
model.fit(data, method="advi", samples=1000)

Contributing#

Pull requests and suggestions are always welcome. Please open an issue on the issue list before submitting in order to avoid doing unnecessary work.

vangja

Contents