Models¶

Contains all Gaussian process (GP) models of the library.

class ife_surrogate.gp.models.GPModel(kernel, X, Y)¶

Bases: ABC

Abstract base class for a GP, used as the skeleton for concrete GP implementations.

Parameters:

kernel (Kernel)
X (Array)
Y (Array)

abstract predict()¶

get_attributes()¶

save(filename='IfeSurrModel.pkl')¶

static load(filename)¶

class ife_surrogate.gp.models.WidebandGP(X, Y, kernel, frequency=None)¶

Bases: GPModel

Wideband Gaussian Process (GP) model for multi-output regression tasks.

This GP handles multiple correlated outputs across a frequency domain, where the outputs can be thought of as different frequency points.

\[f \sim \mathcal{GP}(\mathbf{0}, \sigma_p^2 \otimes k(x, x'))\]

Parameters:

X (Array) – Training input data of shape (n_samples, n_features).
Y (Array) – Training output data of shape (n_samples, n_outputs), where each output dimension typically corresponds to a different frequency.
kernel (Kernel) – Instance of a kernel class defining the covariance structure.
frequency (Array, optional) – Array of frequency values corresponding to the output dimensions. If provided, can be used to structure the multi-output modeling more explicitly.

train()¶

train_scipy()¶

train_swarm()¶

predict_old()¶

predict(X_test)¶

Predict the mean and variance for the given test inputs.

\[p(\mathbf{f}_{*,p} \mid \mathbf{X}, \mathbf{y}_p, \mathbf{X}_*) = \mathcal{N}(\mathbf{f}_{*,p} \mid \boldsymbol{\mu}_p, \boldsymbol{\Sigma}_p)\]

\[\boldsymbol{\mu}_p = \mathbf{K}_*^\top \mathbf{K}_x^{-1} \mathbf{y}_p\]

\[\boldsymbol{\Sigma}_p = \sigma_p^2 \left( \mathbf{K}_{**} - \mathbf{K}_*^\top \mathbf{K}_x^{-1} \mathbf{K}_* \right)\]

Parameters: X_test (Array) – Test input data of shape (n_test_samples, n_features).
Returns: Mean predictions and variances for each output dimension. Shapes: (n_test_samples, n_outputs).
Return type: Tuple[Array, Array]
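The predictive equations above can be sketched in plain NumPy. This is an illustrative sketch, not the library's implementation: the squared-exponential kernel `rbf`, the helper `wideband_predict`, the `jitter` term, and the explicit `sigma2` argument are all assumptions introduced here. It only shows how one shared input kernel yields a mean and a variance per output.

```python
import numpy as np

def rbf(A, B, lengthscale=0.5):
    """Squared-exponential kernel between the rows of A and B (assumed kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def wideband_predict(X, Y, X_test, sigma2, jitter=1e-8):
    """Posterior mean and marginal variance for every output p (hypothetical helper)."""
    K_x = rbf(X, X) + jitter * np.eye(len(X))        # K_x, shape (N, N)
    K_s = rbf(X, X_test)                             # K_*, shape (N, N*)
    K_ss = rbf(X_test, X_test)                       # K_**, shape (N*, N*)
    L = np.linalg.cholesky(K_x)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))  # K_x^{-1} Y, shape (N, P)
    mu = K_s.T @ alpha                               # mu_p for all p, shape (N*, P)
    V = np.linalg.solve(L, K_s)                      # L^{-1} K_*
    base_var = np.diag(K_ss) - (V**2).sum(axis=0)    # diag(K_** - K_*^T K_x^{-1} K_*)
    var = base_var[:, None] * sigma2[None, :]        # scaled by sigma_p^2 per output
    return mu, var

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(8, 2))
Y = np.stack([np.sin(X[:, 0]), np.cos(X[:, 1]), X[:, 0] * X[:, 1]], axis=1)
sigma2 = np.array([1.0, 0.5, 2.0])

# Predicting at the training inputs should reproduce Y with near-zero variance.
mu, var = wideband_predict(X, Y, X, sigma2)
```

Both returned arrays have shape (n_test_samples, n_outputs), matching the documented return value.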

log_marginal_likelihood()¶

Compute the (negative) log marginal likelihood of the wideband model.

\[ -\log p(\mathbf{Y} \mid \mathbf{X}, \boldsymbol{\theta}, \boldsymbol{\sigma}^2) = -\sum_{p=1}^P \log\big( p(\mathbf{y}_p \mid \mathbf{X}, \boldsymbol{\theta}, \sigma_p^2) \big)\]

Returns: The negative log marginal likelihood value.
Return type: float

log_likelihood_scalar()¶

Compute the (negative) log marginal likelihood of each output separately, treating every output as a scalar GP.

\[\mathcal{L}_p = -\log p(\mathbf{y}_p \mid \mathbf{X}, \boldsymbol{\theta}, \sigma_p^2)\]

Returns: An array containing the negative log marginal likelihood of every task (output). Summing its entries gives the complete negative log marginal likelihood.
Return type: Array
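The relation between the per-output values and the total can be illustrated as follows. This is a sketch that assumes each output has covariance \(\sigma_p^2 \mathbf{K}_x\); the kernel `rbf` and the helper `nll_per_output` are hypothetical names, not part of the library.

```python
import numpy as np

def rbf(A, B, lengthscale=0.5):
    """Squared-exponential kernel between the rows of A and B (assumed kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def nll_per_output(X, Y, sigma2, jitter=1e-8):
    """Negative log marginal likelihood of each output, assuming y_p ~ N(0, sigma_p^2 K_x)."""
    N = len(X)
    K_x = rbf(X, X) + jitter * np.eye(N)
    L = np.linalg.cholesky(K_x)                  # one factorization shared by all outputs
    W = np.linalg.solve(L, Y)                    # L^{-1} Y, shape (N, P)
    quad = (W**2).sum(axis=0) / sigma2           # y_p^T K_x^{-1} y_p / sigma_p^2
    logdet = 2.0 * np.log(np.diag(L)).sum() + N * np.log(sigma2)  # log|sigma_p^2 K_x|
    return 0.5 * quad + 0.5 * logdet + 0.5 * N * np.log(2.0 * np.pi)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(8, 2))
Y = np.stack([np.sin(X[:, 0]), np.cos(X[:, 1]), X[:, 0] * X[:, 1]], axis=1)
sigma2 = np.array([1.0, 0.5, 2.0])

nll_p = nll_per_output(X, Y, sigma2)   # one value per output, shape (P,)
nll_total = nll_p.sum()                # complete negative log marginal likelihood
```

Because the input kernel is shared, a single Cholesky factorization serves all outputs, which is the main computational appeal of this structure.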

sample_posterior(key, X, n_samples)¶

Sample from the posterior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the posterior.
n_samples (int) – Number of posterior samples to draw.

Returns:

Posterior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array

sample_prior(key, X, n_samples)¶

Sample from the prior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the prior.
n_samples (int) – Number of prior samples to draw.

Returns:

Prior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array
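A common way to draw such samples is through a Cholesky factor of the covariance. The sketch below is an assumption about the general technique, not the library's code: it uses a NumPy `Generator` in place of a JAX random key, and `rbf`, `sample_gp`, `sigma2`, and `jitter` are hypothetical. Passing a zero mean and the prior kernel gives prior samples; passing a predictive mean and covariance would give posterior samples.

```python
import numpy as np

def rbf(A, B, lengthscale=0.5):
    """Squared-exponential kernel between the rows of A and B (assumed kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def sample_gp(rng, mean, cov, sigma2, n_samples, jitter=1e-6):
    """Draw samples of shape (n_samples, n_points, n_outputs) (hypothetical helper).

    mean: (n_points, n_outputs), cov: shared (n_points, n_points), sigma2: (n_outputs,).
    """
    n_points, n_outputs = mean.shape
    L = np.linalg.cholesky(cov + jitter * np.eye(n_points))
    z = rng.standard_normal((n_samples, n_points, n_outputs))
    correlated = np.einsum("ij,sjp->sip", L, z)           # L @ z for every sample
    return mean[None, :, :] + correlated * np.sqrt(sigma2)[None, None, :]

rng = np.random.default_rng(0)
X_test = np.linspace(0.0, 3.0, 12)[:, None]
sigma2 = np.array([1.0, 0.5, 2.0])
prior_mean = np.zeros((12, 3))

prior_samples = sample_gp(rng, prior_mean, rbf(X_test, X_test), sigma2, n_samples=5)
```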

class ife_surrogate.gp.models.WidebandGPBaysian(X, Y, kernel)¶

Bases: GPModel

Bayesian Wideband Gaussian Process model using MCMC sampling.

This model performs full Bayesian inference over the kernel parameters using Hamiltonian Monte Carlo (HMC) with the No-U-Turn Sampler (NUTS).

Parameters:

kernel (Kernel) – Kernel object specifying the covariance structure.
X (Array) – Training input data of shape (n_samples, n_features).
Y (Array) – Training output data of shape (n_samples, n_outputs).

model_forward()¶

Defines the probabilistic model for Bayesian inference.

Samples kernel parameters from their priors and defines the joint likelihood over all output frequencies under a multivariate normal distribution.

Notes

Assumes that the data variance per output is known and fixed.
Kernel parameters (like “power” and “lengthscale”) are sampled from priors.

train(key, num_samples=100, num_warmup=100)¶

Run MCMC to sample from the posterior distribution of the kernel parameters.

Parameters:

key (Key) – Random key for MCMC sampling.
num_samples (int, optional) – Number of MCMC samples to collect after warm-up.
num_warmup (int, optional) – Number of warm-up (burn-in) steps before sampling.

Returns:

Nothing is returned; the posterior samples are stored in the model’s samples attribute.

Return type:

None

predict(X_test)¶

Predict mean and variance at test points using posterior samples.

For each posterior sample of the kernel parameters, predictions are made and aggregated.

Parameters:

X_test (Array) – Test input data of shape (n_test_samples, n_features).

Returns:

Mean predictions for each MCMC sample: (n_samples, n_test_samples, n_outputs).
Variances for each MCMC sample: (n_samples, n_test_samples, n_outputs).

Return type:

Tuple[Array, Array]
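Since predictions are returned per MCMC sample, a caller will often want to collapse them into one predictive mean and variance. One standard option, sketched here with a hypothetical helper `aggregate_predictions`, is the law of total variance: average the per-sample variances and add the spread of the per-sample means. Whether the library offers this aggregation itself is not stated here.

```python
import numpy as np

def aggregate_predictions(means, variances):
    """Collapse per-sample predictions of shape (n_samples, n_test_samples, n_outputs).

    Returns the mixture mean and the total variance
    E[var] + Var[mean], each of shape (n_test_samples, n_outputs).
    """
    mean = means.mean(axis=0)
    var = variances.mean(axis=0) + means.var(axis=0)
    return mean, var

# Two posterior samples, one test point, one output.
means = np.array([[[0.0]], [[2.0]]])
variances = np.ones_like(means)

mean, var = aggregate_predictions(means, variances)
# mean is 1.0; var is 1.0 (average variance) + 1.0 (variance of the means) = 2.0
```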

sample_posterior(key, X, n_samples)¶

Sample from the posterior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the posterior.
n_samples (int) – Number of posterior samples to draw.

Returns:

Posterior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array

sample_prior(key, X, n_samples)¶

Sample from the prior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the prior.
n_samples (int) – Number of prior samples to draw.

Returns:

Prior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array

save(path)¶

Save the model parameters to a file.

Parameters: path (str) – Path where the model parameters should be saved.

class ife_surrogate.gp.models.ScalarGP(X, Y, kernel)¶

Bases: GPModel

Scalar Gaussian Process (GP) model for scalar regression tasks.

\[f \thicksim \mathcal{GP}(0, k(x, x'))\]

Parameters:

X (Array) – Training input data of shape (n_samples, n_features).
Y (Array) – Training output data of shape (n_samples, 1).
kernel (Kernel) – Instance of a kernel class defining the covariance structure.

train(key, optim_dictionary={'opt': <function adam>, 'settings': {'learning_rate': 0.01}}, n_steps=1000, n_restarts=1, save_history=False, verbose=False)¶

Optimize kernel parameters by minimizing the negative marginal log-likelihood using gradient-based optimizers.

Parameters:

key (Key) – Random key for parameter initialization and restarts.
optim_dictionary (Dict, optional) – Dictionary containing the optimizer (“opt”) and its settings (“settings”).
n_steps (int, optional) – Number of optimization steps for each restart.
n_restarts (int, optional) – Number of independent optimization restarts with different initializations.
save_history (bool, optional) – If True, saves the full optimization trajectory (parameter history).
verbose (bool, optional) – If True, prints optimization progress.

Returns:

Optimized parameters and, if requested, parameter history.

Return type:

Tuple

train_scipy(key, number_restarts=1, opt_algorithm={'method': 'L-BFGS-B'}, save_history=False, verbose=True)¶

Optimize kernel parameters using a scipy-based optimizer (e.g., L-BFGS-B).

Parameters:

key (Key) – Random key for parameter initialization and restarts.
number_restarts (int, optional) – Number of independent optimization restarts with different initializations.
opt_algorithm (Dict, optional) – Dictionary specifying the optimization method and settings (e.g., “method”: “L-BFGS-B”).
save_history (bool, optional) – If True, saves the full optimization trajectory.
verbose (bool, optional) – If True, prints optimization progress.

Returns:

Optimized parameters and, if requested, parameter history.

Return type:

Tuple
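The idea of restarting a SciPy optimizer on the negative log marginal likelihood can be sketched as follows. `scipy.optimize.minimize` with `method="L-BFGS-B"` is the real SciPy interface; everything else (the squared-exponential kernel, the log-parameterization, the bounds, and the NumPy `Generator` used instead of a random key) is an assumption made for illustration and need not match what `train_scipy` does internally.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal_likelihood(log_params, X, y, jitter=1e-5):
    """NLML of a zero-mean GP with an assumed squared-exponential kernel."""
    lengthscale, variance = np.exp(log_params)       # optimize in log space
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = variance * np.exp(-0.5 * d2 / lengthscale**2) + jitter * np.eye(len(X))
    L = np.linalg.cholesky(K)
    w = np.linalg.solve(L, y)
    return 0.5 * w @ w + np.log(np.diag(L)).sum() + 0.5 * len(X) * np.log(2.0 * np.pi)

rng = np.random.default_rng(0)
X = np.linspace(-1.0, 1.0, 15)[:, None]
y = np.sin(3.0 * X[:, 0])

n_restarts = 3
bounds = [(-2.0, 2.0), (-2.0, 2.0)]                  # bounds on the log-parameters
starts = rng.uniform(-1.0, 1.0, size=(n_restarts, 2))

# One independent L-BFGS-B run per random start; keep the best result.
results = [
    minimize(neg_log_marginal_likelihood, s, args=(X, y), method="L-BFGS-B", bounds=bounds)
    for s in starts
]
best = min(results, key=lambda r: r.fun)
lengthscale, variance = np.exp(best.x)
```

Restarts matter because the marginal likelihood is generally non-convex in the hyperparameters, so a single run may stop in a poor local optimum.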

predict(X_test)¶

Predict the mean and variance for the given test inputs.

Parameters: X_test (Array) – Test input data of shape (n_test_samples, n_features).
Returns: Mean predictions and variances for each output dimension. Shapes: (n_test_samples, n_outputs).
Return type: Tuple[Array, Array]

log_marginal_likelihood()¶

Compute the (negative) log marginal likelihood of the current model.

Returns: The negative log marginal likelihood value.
Return type: float

sample_posterior(key, X, n_samples)¶

Sample from the posterior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the posterior.
n_samples (int) – Number of posterior samples to draw.

Returns:

Posterior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array

sample_prior(key, X, n_samples)¶

Sample from the prior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the prior.
n_samples (int) – Number of prior samples to draw.

Returns:

Prior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array

save(path)¶

Save the model parameters to a file.

Parameters: path (str) – Path where the model parameters should be saved.

class ife_surrogate.gp.models.ScalarGPBaysian(X, Y, kernel)¶

Bases: GPModel

Fully Bayesian scalar Gaussian process (GP) model for scalar regression tasks.

\[f \thicksim \mathcal{GP}(0, k(x, x'))\]

Parameters:

X (Array) – Training input data of shape (n_samples, n_features).
Y (Array) – Training output data of shape (n_samples, 1).
kernel (Kernel) – Instance of a kernel class defining the covariance structure.

model_forward()¶

Forward pass of the model required for Bayesian inference with NumPyro.

This function defines the joint distribution over the observations and the kernel hyperparameters.

\[p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\theta}) p(\boldsymbol{\theta})\]

By specifying priors over hyperparameters and a Gaussian Process likelihood, we enable NumPyro to perform inference and approximate the posterior over kernel parameters:

\[p(\boldsymbol{\theta} \mid \mathcal{D}) \propto p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\theta}) \, p(\boldsymbol{\theta})\]

where:

\(p(\boldsymbol{\theta})\) are the user-defined priors over kernel parameters,
\(p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\theta})\) is the GP marginal likelihood:

\[\mathbf{y} \sim \mathcal{N}(\mathbf{0}, K_{\boldsymbol{\theta}} + \sigma^2 I)\]

The kernel matrix \(K_{\boldsymbol{\theta}}\) is computed using the sampled hyperparameters.
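The unnormalized log posterior described above is simply the GP log marginal likelihood plus the log prior. The sketch below evaluates it in NumPy rather than NumPyro, so it shows the quantity being sampled, not the sampler. The squared-exponential kernel, the independent standard log-normal priors, and treating \(\sigma^2\) as a sampled parameter are assumptions chosen for illustration.

```python
import numpy as np

def log_marginal_likelihood(theta, X, y):
    """log N(y | 0, K_theta + sigma^2 I) with theta = (lengthscale, sigma^2)."""
    lengthscale, noise = theta
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    C = np.exp(-0.5 * d2 / lengthscale**2) + noise * np.eye(len(X))
    L = np.linalg.cholesky(C)
    w = np.linalg.solve(L, y)
    return -0.5 * w @ w - np.log(np.diag(L)).sum() - 0.5 * len(X) * np.log(2.0 * np.pi)

def log_prior(theta):
    """Independent LogNormal(0, 1) priors on each parameter (assumed priors)."""
    theta = np.asarray(theta, dtype=float)
    return np.sum(-np.log(theta) - 0.5 * np.log(2.0 * np.pi) - 0.5 * np.log(theta) ** 2)

def log_joint(theta, X, y):
    """Unnormalized log posterior: log p(y | X, theta) + log p(theta)."""
    return log_marginal_likelihood(theta, X, y) + log_prior(theta)

X = np.linspace(0.0, 1.0, 6)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
theta = np.array([1.0, 1.0])

value = log_joint(theta, X, y)
```

An MCMC sampler such as NUTS only needs this function (and its gradient) up to an additive constant, which is why the normalizing evidence never has to be computed.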

predict(samples, X_test)¶

Predict the mean and variance for the given test inputs.

Parameters: X_test (Array) – Test input data of shape (n_test_samples, n_features).
Returns: Mean predictions and variances for each output dimension. Shapes: (n_test_samples, n_outputs).
Return type: Tuple[Array, Array]

log_marginal_likelihood()¶

Compute the (negative) log marginal likelihood of the current model.

Returns: The negative log marginal likelihood value.
Return type: float

sample_posterior(key, X, n_samples)¶

Sample from the posterior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the posterior.
n_samples (int) – Number of posterior samples to draw.

Returns:

Posterior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array

sample_prior(key, X, n_samples)¶

Sample from the prior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the prior.
n_samples (int) – Number of prior samples to draw.

Returns:

Prior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array

save(path)¶

Save the model parameters to a file.

Parameters: path (str) – Path where the model parameters should be saved.

class ife_surrogate.gp.models.SeparableMultiOutputGP(X, Y, kernel, frequency=None)¶

Bases: GPModel

Separable Multi-Output Gaussian Process.

\[\begin{aligned} \text{Model:} \quad & \operatorname{vec}(Y) \sim \mathcal{N}\big(0, K_w \otimes K_x \big) \\[1em] \text{NMLL:} \quad & \mathcal{L}(Y; K_x, K_w) \\ & = \tfrac{1}{2} \operatorname{vec}(Y)^{\mathsf{T}} \big( K_w^{-1} \otimes K_x^{-1} \big) \operatorname{vec}(Y) + \tfrac{1}{2}\log \big| K_w \otimes K_x \big| + \tfrac{NP}{2}\log(2\pi) \\[0.5em] & = \tfrac{1}{2} \operatorname{Tr}\big( K_x^{-1} Y K_w^{-1} Y^{\mathsf{T}} \big) + \tfrac{1}{2} \Big( P \log |K_x| + N \log |K_w| \Big) + \tfrac{NP}{2}\log(2\pi) \end{aligned} \tag{1}\]
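The equality of the dense Kronecker form and the trace form of the NMLL can be checked numerically. The sketch below uses random symmetric positive-definite matrices as stand-ins for \(K_x\) (N × N) and \(K_w\) (P × P), and takes \(\operatorname{vec}(Y)\) to stack the columns of Y; these choices are assumptions for the check, not library code.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 5, 3

# Random symmetric positive-definite stand-ins for the two covariance matrices.
A = rng.standard_normal((N, N))
K_x = A @ A.T + N * np.eye(N)
B = rng.standard_normal((P, P))
K_w = B @ B.T + P * np.eye(P)
Y = rng.standard_normal((N, P))

# Dense form: build the full (NP x NP) covariance explicitly.
vec_Y = Y.reshape(-1, order="F")          # column-stacking vec(Y)
K = np.kron(K_w, K_x)
dense = (
    0.5 * vec_Y @ np.linalg.solve(K, vec_Y)
    + 0.5 * np.linalg.slogdet(K)[1]
    + 0.5 * N * P * np.log(2.0 * np.pi)
)

# Structured form: only N x N and P x P solves and determinants are needed.
trace_term = np.trace(np.linalg.solve(K_x, Y) @ np.linalg.solve(K_w, Y.T))
logdet_term = P * np.linalg.slogdet(K_x)[1] + N * np.linalg.slogdet(K_w)[1]
structured = 0.5 * trace_term + 0.5 * logdet_term + 0.5 * N * P * np.log(2.0 * np.pi)
```

The structured form avoids ever forming the NP × NP matrix, which is what makes the separable model tractable for many outputs.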

Parameters:

X (Array) – Training input data of shape (n_samples, n_features).
Y (Array) – Training output data of shape (n_samples, n_outputs).
kernel (Kernel) – Instance of a kernel class defining the covariance structure.
frequency (Array, optional) – Array of frequency values corresponding to the output dimensions.

train(key, optimizer={'opt': <function adam>, 'patience': 20, 'settings': {'learning_rate': 0.01}, 'tolerance': 1e-05}, sample_parameters=True, n_steps=1000, n_restarts=1, save_history=False, verbose=False)¶

Optax training function: Optimize kernel parameters by minimizing the negative marginal log-likelihood using gradient-based optimizers.

Parameters:

key (Key) – Random key for parameter initialization and restarts.
optimizer (Dict, optional) – Dictionary containing the optimizer (“opt”) and its settings (“settings”); the default additionally sets “patience” and “tolerance”.
sample_parameters (bool, optional) – If True, the initial parameters for optimization are sampled from the priors. If False, the current hyperparameter values are used as the starting point.
n_steps (int, optional) – Number of optimization steps for each restart.
n_restarts (int, optional) – Number of independent optimization restarts with different initializations.
save_history (bool, optional) – If True, saves the full optimization trajectory (parameter history).
verbose (bool, optional) – If True, prints optimization progress.

Returns:

Optimized parameters and, if requested, parameter history.

Return type:

Tuple

train_scipy(key, n_restarts=1, optimizer={'method': 'L-BFGS-B'}, save_history=False, verbose=True)¶

Scipy training function: Optimize kernel parameters using a scipy-based optimizer (e.g., L-BFGS-B).

Parameters:

key (Key) – Random key for parameter initialization and restarts.
n_restarts (int, optional) – Number of independent optimization restarts with different initializations.
optimizer (Dict, optional) – Dictionary specifying the optimizer method and its settings (e.g., “method”: “L-BFGS-B”). (As defined in scipy: https://docs.scipy.org/doc/scipy/tutorial/optimize.html)
save_history (bool, optional) – If True, saves the full optimization trajectory.
verbose (bool, optional) – If True, prints optimization progress.

Returns:

The optimized hyperparameters are updated in the kernel.

Return type:

None

train_swarm(key, bounds, n_restarts=1, optimizer={'c1': 0.5, 'c2': 0.3, 'w': 0.9}, n_particles=20, n_iterations=100, save_history=False, verbose=True)¶

Pyswarms training function: Optimize kernel parameters using particle swarm optimization (PSO).

Parameters:

key (Key) – Random key for parameter initialization and restarts.
n_restarts (int, optional) – Number of independent optimization restarts with different initializations.
optimizer (Dict, optional) – Dictionary specifying the optimization settings as defined in pyswarms: c1 (weight of the particle’s own best position), c2 (weight of the global best position), w (inertia weight), k (number of neighboring particles to consult), and p (distance metric; 1: Manhattan, 2: Euclidean).
n_particles (int, optional) – Number of particles in the swarm.
save_history (bool, optional) – If True, saves the full optimization trajectory.
verbose (bool, optional) – If True, prints optimization progress.
bounds (Tuple[Array, Array])
n_iterations (int)

Returns:

The optimized hyperparameters are updated in the kernel.

Return type:

None
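To make the roles of c1, c2, and w concrete, here is a minimal global-best PSO written directly in NumPy. It does not call pyswarms and is not the library's training routine; `pso_minimize` and the quadratic `objective` (standing in for the negative log marginal likelihood) are hypothetical, and the integer `seed` replaces the random key.

```python
import numpy as np

def pso_minimize(f, bounds, n_particles=20, n_iterations=100, c1=0.5, c2=0.3, w=0.9, seed=0):
    """Minimal global-best particle swarm optimizer (illustrative sketch)."""
    lo, hi = bounds
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))   # particle positions
    v = np.zeros_like(x)                                   # particle velocities
    pbest = x.copy()                                       # each particle's best position
    pbest_val = np.array([f(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()                   # global best position
    g_val = pbest_val.min()
    history = [g_val]
    for _ in range(n_iterations):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # w keeps momentum, c1 pulls toward the own best, c2 toward the global best.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)                         # stay inside the bounds
        val = np.array([f(p) for p in x])
        improved = val < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = val[improved]
        if pbest_val.min() < g_val:
            g = pbest[pbest_val.argmin()].copy()
            g_val = pbest_val.min()
        history.append(g_val)
    return g, g_val, history

def objective(p):
    """Stand-in objective with its minimum at (0.3, 0.3)."""
    return float(np.sum((p - 0.3) ** 2))

lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
best_x, best_val, history = pso_minimize(objective, (lo, hi))
```

Because PSO needs no gradients, it can be a useful fallback when the likelihood surface is rough, at the cost of many more objective evaluations than a gradient-based method.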

predict_old()¶

predict(XW_test)¶

Predict the mean and variance for the given test inputs.

\[p(\mathbf{f}_{*,p} \mid \mathbf{X}, \mathbf{y}_p, \mathbf{X}_*) = \mathcal{N}(\mathbf{f}_{*,p} \mid \boldsymbol{\mu}_p, \boldsymbol{\Sigma}_p)\]

\[\boldsymbol{\mu}_p = \mathbf{K}_*^\top \mathbf{K}_x^{-1} \mathbf{y}_p\]

\[\boldsymbol{\Sigma}_p = \sigma_p^2 \left( \mathbf{K}_{**} - \mathbf{K}_*^\top \mathbf{K}_x^{-1} \mathbf{K}_* \right)\]

Parameters: XW_test (Tuple[Array, Array]) – A tuple (X_test, f_test).
Returns: Mean predictions and variances for each output dimension. Shapes: (n_test_samples, n_outputs).
Return type: Tuple[Array, Array]

log_marginal_likelihood()¶

Compute the (negative) log marginal likelihood of the wideband model.

\[ -\log p(\mathbf{Y} \mid \mathbf{X}, \boldsymbol{\theta}, \boldsymbol{\sigma}^2) = -\sum_{p=1}^P \log\big( p(\mathbf{y}_p \mid \mathbf{X}, \boldsymbol{\theta}, \sigma_p^2) \big)\]

Returns: The negative log marginal likelihood value.
Return type: float

log_likelihood_scalar()¶

Compute the (negative) log marginal likelihood of each output separately, treating every output as a scalar GP.

\[\mathcal{L}_p = -\log p(\mathbf{y}_p \mid \mathbf{X}, \boldsymbol{\theta}, \sigma_p^2)\]

Returns: An array containing the negative log marginal likelihood of every task (output). Summing its entries gives the complete negative log marginal likelihood.
Return type: Array

sample_posterior(key, X, n_samples)¶

Sample from the posterior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the posterior.
n_samples (int) – Number of posterior samples to draw.

Returns:

Posterior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array

sample_prior(key, X, n_samples)¶

Sample from the prior distribution at new input locations.

Parameters:

key (Key) – Random key for sampling.
X (Array) – Input locations at which to sample the prior.
n_samples (int) – Number of prior samples to draw.

Returns:

Prior samples of shape (n_samples, n_test_samples, n_outputs).

Return type:

Array

save(path)¶

Save the model parameters to a file.

Parameters: path (str) – Path where the model parameters should be saved.