Calibration

The calibration module estimates model parameters by comparing simulations with observations.

Use calibration when you have:

| Item | Meaning |
| --- | --- |
| Observed data | Measured time series, event values, or other reference outputs. |
| A simulation model | A function that maps parameter rows to simulated outputs. |
| Parameter bounds | Lower and upper limits for the parameters to calibrate. |
| A performance metric | For example rmse, nse, or kge. |

Calibration methods in UQPyL work with a ModelProblem, not an ordinary Problem.

Choose a Calibration Method

Start from how you want to use candidate parameter sets.

| Method | Use when | Main output |
| --- | --- | --- |
| GLUE | You already have candidate parameters and want to keep the behavioral samples that pass a threshold. | behavioralDecs, behavioralSims |
| SUFI2 | You want elite samples and updated uncertainty bounds. | eliteDecs, updatedLb, updatedUb, pfactor, rfactor |
| ES | You want one ensemble-smoother update. | posteriorDecs, posteriorSims |
| IES | You want repeated ensemble-smoother updates. | posteriorDecs, posteriorSims, iterative history |

Practical default: use GLUE when you want a simple first calibration pass with existing samples. Use SUFI2 when uncertainty bounds matter. Use ES or IES when you are working with ensemble smoothing.

Calibration Workflow

The usual workflow is:

text
obs + simFunc + parameter bounds -> ModelProblem -> calibration.run(...) -> CalResult

| Step | Action |
| --- | --- |
| Prepare observations | Store observations as a 2D array with shape (n_time, n_series). |
| Define simulation | Write simFunc(X) for batched parameter rows. |
| Build ModelProblem | Provide nInput, lb, ub, simFunc, obs, and optional mask. |
| Choose method | Use GLUE, SUFI2, ES, or IES. |
| Read result | Inspect bestDecs, bestSim, posterior, elite, or behavioral samples. |

Build a ModelProblem

ModelProblem connects parameter samples to simulation outputs.

In this toy model, the two parameters directly simulate two time steps:

text
params [p1, p2] -> simulation [p1, p2]
obs = [1.0, 2.0]

python
import numpy as np

from UQPyL.problem import ModelProblem

np.set_printoptions(precision=4, suppress=True)


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, simLabels=["Q"], name="ToyModel")

sim = problem.simFunc([[1.0, 2.0]])

print(sim.shape)
print(problem.flattenObs())
print(problem.flattenMask())

Example output:

text
(1, 2, 1)
[1. 2.]
[False False]

Shape rules:

| Object | Required shape | Meaning |
| --- | --- | --- |
| X | (n_samples, n_input) | Candidate parameter rows. |
| obs | (n_time, n_series) | Observed values. |
| simFunc(X) | (n_samples, n_time, n_series) | Simulated values for every candidate row. |
| flattened simulation | (n_samples, n_time * n_series) | Internal scoring layout. |
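
The flattened layout can be reproduced with plain NumPy. This is a minimal sketch that does not call UQPyL; it assumes row-major (C-order) flattening, which is consistent with the flattenMask() output shown in the mask example on this page. The array values are made up for illustration.

```python
import numpy as np

# Hypothetical batch: 3 samples, 2 time steps, 2 series.
n_samples, n_time, n_series = 3, 2, 2
sim = np.arange(n_samples * n_time * n_series, dtype=float).reshape(n_samples, n_time, n_series)

# Assumed layout: each sample becomes one row of length n_time * n_series.
flat = sim.reshape(n_samples, n_time * n_series)

# Entry (t, s) of a sample lands at flat index t * n_series + s.
t, s = 1, 0
assert flat[2, t * n_series + s] == sim[2, t, s]

print(flat.shape)  # (3, 4)
```

Under this assumption, all series values for the first time step come first in each row, followed by the values for the second time step, and so on.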

If you are not used to array notation, read X as a table:

text
one row = one parameter set
one column = one parameter

simFunc must return one simulation for each row of X.
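
One way to confirm this rule before calibrating is to call the function on a small batch and compare shapes. The helper check_sim_shape is hypothetical (it is not part of UQPyL), and the all-zero test rows are placeholders; use rows inside your own bounds if zero is not valid for your model.

```python
import numpy as np


def check_sim_shape(sim_func, n_input, obs):
    """Hypothetical helper: run sim_func on a small batch and check the output shape."""
    X = np.zeros((3, n_input))  # three placeholder parameter rows
    sim = np.asarray(sim_func(X))
    expected = (X.shape[0],) + obs.shape  # (n_samples, n_time, n_series)
    if sim.shape != expected:
        raise ValueError(f"simFunc returned {sim.shape}, expected {expected}")
    return sim.shape


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    # Same toy model as above: the two parameters are the two time steps.
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    return sim


print(check_sim_shape(simFunc, n_input=2, obs=obs))  # (3, 2, 1)
```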

Run GLUE

GLUE scores every candidate parameter row and keeps behavioral samples.

For lower-is-better metrics such as rmse, a sample is behavioral when:

text
score <= threshold

For higher-is-better metrics such as nse, a sample is behavioral when:

text
score >= threshold

python
import numpy as np

from UQPyL.calibration import GLUE
from UQPyL.problem import ModelProblem

np.set_printoptions(precision=4, suppress=True)


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, simLabels=["Q"], name="ToyModel")

X = np.array([[1.0, 2.0], [1.0, 2.4], [0.0, 0.0]])

result = GLUE(metric="rmse", verboseFlag=False, logFlag=False, saveFlag=False).run(problem, X, threshold=0.3)

print(result.bestDecs)
print(result.bestSim)
print(result.behavioralDecs)
print(result.diagnostics["scores"])
print(result.diagnostics["behavioralMask"])

Example output:

text
[[1. 2.]]
[[1. 2.]]
[[1.  2. ]
 [1.  2.4]]
[0.     0.2828 1.5811]
[ True  True False]

Interpretation:

| Output | Meaning |
| --- | --- |
| bestDecs | Best parameter row. |
| bestSim | Simulation from the best parameter row. |
| behavioralDecs | Candidate rows that pass the threshold. |
| scores | Metric value for every candidate row. |
| behavioralMask | Boolean mask showing which rows passed. |

The first sample is perfect. The second sample has RMSE below 0.3, so it is also behavioral. The third sample is rejected.
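
The scores and mask above can be checked by hand. The sketch below recomputes RMSE with plain NumPy and applies the lower-is-better rule; it uses the standard RMSE definition and does not call UQPyL.

```python
import numpy as np

obs = np.array([1.0, 2.0])  # flattened observations
sims = np.array([[1.0, 2.0], [1.0, 2.4], [0.0, 0.0]])  # one flattened simulation per row

# RMSE per sample: root of the mean squared residual over all observation entries.
scores = np.sqrt(np.mean((sims - obs) ** 2, axis=1))

threshold = 0.3
behavioral_mask = scores <= threshold  # lower-is-better rule

print(np.round(scores, 4))  # [0.     0.2828 1.5811]
print(behavioral_mask)      # [ True  True False]
```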

Metric Direction

Calibration methods accept metric names or a custom callable.

| Metric | Better direction |
| --- | --- |
| mse | Lower is better |
| mae | Lower is better |
| rmse | Lower is better |
| nse | Higher is better |
| r2 | Higher is better |
| pbias | Lower is better |
| pearson_r | Higher is better |
| kge | Higher is better |
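
The direction decides which comparison a threshold uses. As a rough sketch, the rule can be written as a small lookup; the metric names are copied from the list above, while the helper passes_threshold is hypothetical and not a UQPyL function.

```python
# Direction sets copied from the metric list; the helper itself is hypothetical.
HIGHER_IS_BETTER = {"nse", "r2", "pearson_r", "kge"}
LOWER_IS_BETTER = {"mse", "mae", "rmse", "pbias"}


def passes_threshold(metric, score, threshold):
    """Return True when a score is acceptable for the given metric name."""
    if metric in HIGHER_IS_BETTER:
        return score >= threshold
    if metric in LOWER_IS_BETTER:
        return score <= threshold
    raise ValueError(f"unknown metric: {metric}")


print(passes_threshold("rmse", 0.2828, 0.3))  # True
print(passes_threshold("nse", -1.0, 0.0))     # False
```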

For example, nse uses a higher-is-better threshold:

python
import numpy as np

from UQPyL.calibration import GLUE
from UQPyL.problem import ModelProblem

np.set_printoptions(precision=4, suppress=True)


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, simLabels=["Q"], name="ToyModel")

X = np.array([[1.0, 2.0], [1.0, 3.0], [0.0, 0.0]])
result = GLUE(metric="nse", verboseFlag=False, logFlag=False, saveFlag=False).run(problem, X, threshold=0.0)

print(result.diagnostics["scores"])
print(result.diagnostics["behavioralMask"])

Example output:

text
[ 1. -1. -9.]
[ True False False]

Only the first sample has nse >= 0.0.
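
These values follow from the standard Nash-Sutcliffe efficiency: one minus the sum of squared residuals divided by the sum of squared deviations of the observations from their mean. The sketch below recomputes them with plain NumPy, without calling UQPyL.

```python
import numpy as np

obs = np.array([1.0, 2.0])
sims = np.array([[1.0, 2.0], [1.0, 3.0], [0.0, 0.0]])

residual = np.sum((sims - obs) ** 2, axis=1)  # [0, 1, 5]
variance = np.sum((obs - obs.mean()) ** 2)    # 0.5
nse = 1.0 - residual / variance

print(nse)         # [ 1. -1. -9.]
print(nse >= 0.0)  # [ True False False]
```

A value of 0 means the simulation is no better than predicting the observation mean, which is why 0.0 is a common starting threshold.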

Use Masks

Use mask to ignore observation entries during scoring.

mask must have the same shape as obs.

python
import numpy as np

from UQPyL.problem import ModelProblem


obs = np.array([[1.0, 10.0], [2.0, 20.0]])
mask = np.array([[False, True], [False, True]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 2))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    sim[:, :, 1] = 999.0
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, mask=mask, simLabels=["Q", "Ignored"], name="MaskedToyModel")

print(problem.obs.shape)
print(problem.mask.shape)
print(problem.flattenMask())

Example output:

text
(2, 2)
(2, 2)
[False  True False  True]

Masked entries are ignored by calibration metrics. In this example, the second series is ignored even though the simulation writes 999.0 into it.
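
To make the effect concrete, the sketch below scores one simulated sample while skipping masked entries. It assumes, as in the example above, that True marks an entry to ignore; the scoring code is plain NumPy and only illustrates the idea, not UQPyL's internal implementation.

```python
import numpy as np

obs = np.array([[1.0, 10.0], [2.0, 20.0]])
mask = np.array([[False, True], [False, True]])  # True = ignore this entry

# One simulated sample with shape (n_time, n_series); the second series is junk.
sim = np.array([[1.0, 999.0], [2.4, 999.0]])

keep = ~mask.ravel()  # entries that take part in scoring
residual = sim.ravel()[keep] - obs.ravel()[keep]
rmse = np.sqrt(np.mean(residual ** 2))

print(round(float(rmse), 4))  # 0.2828
```

The 999.0 values never reach the metric, so the score depends only on the first series.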

Run SUFI2

SUFI2 selects elite samples and updates uncertainty bounds from those elite samples.

python
import numpy as np

from UQPyL.calibration import SUFI2
from UQPyL.problem import ModelProblem

np.set_printoptions(precision=4, suppress=True)


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, simLabels=["Q"], name="ToyModel")

X = np.array([[1.0, 2.0], [1.0, 2.4], [0.0, 0.0]])

result = SUFI2(verboseFlag=False, logFlag=False, saveFlag=False).run(problem, X, eliteSize=2)

print(result.bestDecs)
print(result.eliteDecs)
print(result.diagnostics["scores"])
print(result.diagnostics["updatedLb"])
print(result.diagnostics["updatedUb"])
print(result.diagnostics["pfactor"], result.diagnostics["rfactor"])

Example output:

text
[[1. 2.]]
[[1.  2. ]
 [1.  2.4]]
[0.     0.2828 1.5811]
[1. 2.]
[1.  2.4]
0.5 0.38000000000000034

Read this as:

| Output | Meaning |
| --- | --- |
| eliteDecs | Best eliteSize parameter rows. |
| updatedLb, updatedUb | New parameter bounds inferred from elite samples. |
| pfactor | Fraction of observations bracketed by the prediction uncertainty band. |
| rfactor | Average width of the uncertainty band relative to observation variability. |
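
The numbers in the example output can be reproduced by hand under a few assumptions. This sketch takes the new bounds as the per-parameter minimum and maximum of the elite rows, the band as the 2.5th and 97.5th percentiles across elite simulations, and observation variability as the population standard deviation. These choices are assumptions picked because they match the printed values for this toy case; UQPyL's internal definitions may differ.

```python
import numpy as np

obs = np.array([1.0, 2.0])
elite_decs = np.array([[1.0, 2.0], [1.0, 2.4]])  # elite parameter rows
elite_sims = np.array([[1.0, 2.0], [1.0, 2.4]])  # their flattened simulations

# Assumed bound update: tightest box that contains the elite parameter rows.
updated_lb = elite_decs.min(axis=0)
updated_ub = elite_decs.max(axis=0)

# Assumed 95% prediction band across the elite simulations.
lower = np.percentile(elite_sims, 2.5, axis=0)
upper = np.percentile(elite_sims, 97.5, axis=0)

# Share of observations inside the band, and mean band width over observation spread.
pfactor = np.mean((obs >= lower) & (obs <= upper))
rfactor = np.mean(upper - lower) / np.std(obs)

print(updated_lb, updated_ub)             # [1. 2.] [1.  2.4]
print(pfactor, round(float(rfactor), 4))  # 0.5 0.38
```

A pfactor close to 1 with a small rfactor is the usual goal: most observations fall inside a narrow band.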

SUFI2 can also generate samples internally. Set nSamples on the method and omit X:

python
import numpy as np

from UQPyL.calibration import SUFI2
from UQPyL.problem import ModelProblem

np.set_printoptions(precision=4, suppress=True)


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, simLabels=["Q"], name="ToyModel")

result = SUFI2(maxIters=3, nSamples=12, verboseFlag=False, logFlag=False, saveFlag=False).run(problem, eliteSize=4, seed=123)

print(result.bestDecs)
print(result.posteriorDecs.shape)
print(len(result.history.metricsHistory))
print(result.history.metricsHistory[-1].keys())

Example output:

text
[[0.9825 1.8991]]
(12, 2)
3
dict_keys(['iter', 'pfactor', 'rfactor', 'updatedLb', 'updatedUb'])

Run ES

ES performs one ensemble-smoother update.

python
import numpy as np

from UQPyL.calibration import ES
from UQPyL.problem import ModelProblem

np.set_printoptions(precision=4, suppress=True)


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, simLabels=["Q"], name="ToyModel")

X = np.array([[0.0, 0.0], [2.0, 3.0], [1.5, 0.5]])

result = ES(verboseFlag=False, logFlag=False, saveFlag=False).run(problem, X)

print(result.posteriorDecs)
print(result.bestDecs)
print(result.posteriorSims.shape)
print(result.diagnostics["priorMean"])
print(result.diagnostics["posteriorMean"])
print(result.diagnostics["scores"])

Example output:

text
[[1. 2.]
 [1. 2.]
 [1. 2.]]
[[1. 2.]]
(3, 2)
[1.1667 1.1667]
[1. 2.]
[0. 0. 0.]

In this linear toy model, the ensemble update moves all members exactly to the observation-matching parameters.
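
A textbook ensemble-smoother update shows why this happens. The sketch below is an illustration in plain NumPy, not UQPyL's code: it assumes a Kalman-type gain built from ensemble covariances, no observation noise, and no perturbation of the observations.

```python
import numpy as np

obs = np.array([1.0, 2.0])
X = np.array([[0.0, 0.0], [2.0, 3.0], [1.5, 0.5]])  # prior ensemble, one member per row
D = X.copy()  # toy model: the simulation equals the parameters

# Ensemble anomalies (deviations from the ensemble mean).
dX = X - X.mean(axis=0)
dD = D - D.mean(axis=0)

n = X.shape[0]
C_xd = dX.T @ dD / (n - 1)  # parameter-simulation cross-covariance
C_dd = dD.T @ dD / (n - 1)  # simulation covariance

# Gain with zero observation-error covariance (an assumption of this sketch).
K = C_xd @ np.linalg.pinv(C_dd)

# Shift every member by the gain times its own data mismatch.
posterior = X + (obs - D) @ K.T

print(np.round(posterior, 4))
```

Because the simulation equals the parameters, the two covariances are identical and the gain reduces to the identity matrix, so every member is shifted by exactly its own mismatch and lands on [1, 2].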

Run IES

IES repeats the ensemble-smoother update for multiple iterations.

python
import numpy as np

from UQPyL.calibration import ES, IES
from UQPyL.problem import ModelProblem

np.set_printoptions(precision=4, suppress=True)


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1] ** 2
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, simLabels=["Q"], name="NonlinearToyModel")

X = np.array([[0.0, 0.5], [2.0, 1.0], [1.5, 2.0]])

esResult = ES(verboseFlag=False, logFlag=False, saveFlag=False).run(problem, X)
iesResult = IES(maxIters=4, lam=1e-6, verboseFlag=False, logFlag=False, saveFlag=False).run(problem, X)

print(iesResult.bestDecs)
print(iesResult.posteriorDecs.shape)
print(iesResult.posteriorSims.shape)
print(len(iesResult.history.metricsHistory))
print(np.mean(esResult.diagnostics["scores"]), np.mean(iesResult.diagnostics["scores"]))

Example output:

text
[[1.     1.2353]]
(3, 2)
(3, 2)
4
0.33520286859016274 0.3352026900493963

Use history.metricsHistory to inspect iteration-level summaries.

Use Verbose Output

Set verboseFlag=True to print a compact final summary.

python
import numpy as np

from UQPyL.calibration import GLUE
from UQPyL.problem import ModelProblem


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, simLabels=["Q"], name="ToyModel")
X = np.array([[1.0, 2.0], [1.0, 2.4], [0.0, 0.0]])

result = GLUE(metric="rmse", verboseFlag=True, logFlag=False, saveFlag=False).run(problem, X, threshold=0.3)

Example output:

text
GLUE finished
  problem   : ToyModel
  metric    : rmse
  bestScore : 0
  bestX     : [1.0000e+00, 2.0000e+00]
  iters     : 0
  runtime   : 0.000s

Read CalResult

Every calibration method returns a CalResult.

python
import numpy as np

from UQPyL.calibration import GLUE
from UQPyL.problem import ModelProblem

np.set_printoptions(precision=4, suppress=True)


obs = np.array([[1.0], [2.0]])


def simFunc(X):
    X = np.atleast_2d(X)
    sim = np.zeros((X.shape[0], 2, 1))
    sim[:, 0, 0] = X[:, 0]
    sim[:, 1, 0] = X[:, 1]
    return sim


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=simFunc, obs=obs, simLabels=["Q"], name="ToyModel")
X = np.array([[1.0, 2.0], [1.0, 2.4], [0.0, 0.0]])

result = GLUE(metric="rmse", verboseFlag=False, logFlag=False, saveFlag=False).run(problem, X, threshold=0.3)

print(result.method)
print(result.bestDecs)
print(result.bestSim)
print(result.diagnostics["scores"])
print(result.summary()["best_score"])

Example output:

text
GLUE
[[1. 2.]]
[[1. 2.]]
[0.     0.2828 1.5811]
0.0

Important fields:

| Field | Meaning |
| --- | --- |
| bestDecs | Best parameter row under the configured metric. |
| bestSim | Simulation output for bestDecs, flattened to the observation vector layout. |
| behavioralDecs, behavioralSims | GLUE samples that pass the threshold. |
| eliteDecs, eliteSims | SUFI2 elite samples. |
| posteriorDecs, posteriorSims | ES or IES posterior ensemble. |
| diagnostics | Method-specific scores, masks, bounds, and summary values. |
| history.metricsHistory | Iteration summaries for iterative methods. |
| summary() | Compact dictionary for reporting. |

Read a Saved SQLite Result

Set saveFlag=True to save a sqlite file under Result/.

python
from pathlib import Path

import numpy as np

from docs_v2.assets.calibration_demo_model import sqliteSimFunc
from UQPyL.calibration import CalReader, GLUE
from UQPyL.problem import ModelProblem

np.set_printoptions(precision=4, suppress=True)


obs = np.array([[1.0], [2.0]])


problem = ModelProblem(nInput=2, ub=3.0, lb=0.0, simFunc=sqliteSimFunc, obs=obs, simLabels=["Q"], name="ToyModel")
X = np.array([[1.0, 2.0], [1.0, 2.4], [0.0, 0.0]])

resultDir = Path("Result")
before = set(resultDir.glob("*.sqlite3")) if resultDir.exists() else set()

GLUE(metric="rmse", verboseFlag=False, logFlag=False, saveFlag=True).run(problem, X, threshold=0.3)

after = set(resultDir.glob("*.sqlite3"))
dbPath = sorted(after - before)[0]

with CalReader(dbPath) as reader:
    summary = reader.get_run_summary()
    params = reader.get_run_params()
    loaded = reader.load_result()

print(dbPath.as_posix())
print(summary["method"], summary["problem_name"])
print(summary["metric"], summary["best_score"])
print(params["metric"])
print(loaded.bestDecs)
print(loaded.behavioralDecs)
print(loaded.diagnostics["scores"])

Example output:

text
Result/glue_ToyModel_YYYYMMDD_HHMM_xxxx.sqlite3
GLUE ToyModel
rmse 0.0
'rmse'
[[1. 2.]]
[[1.  2. ]
 [1.  2.4]]
[0.     0.2828 1.5811]

If you already know the sqlite path, use only the reader part:

python
from UQPyL.calibration import CalReader

dbPath = "Result/glue_ToyModel_YYYYMMDD_HHMM_xxxx.sqlite3"

with CalReader(dbPath) as reader:
    summary = reader.get_run_summary()
    result = reader.load_result()

Saved calibration runs include a serialized ModelProblem. If you define simFunc interactively inside a notebook cell or temporary script, Python may not be able to pickle it. For persistent sqlite runs, prefer importable simulation functions or model classes.
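
The reason is that Python's standard pickle module stores a function as a reference (module name plus function name) rather than as code. The sketch below shows the difference using only the standard library; statistics.mean stands in for an importable simulation function, and the example assumes the standard pickler is the one involved.

```python
import pickle
import statistics

# An importable function is stored as a reference and can be restored.
payload = pickle.dumps(statistics.mean)
restored = pickle.loads(payload)
print(restored([1.0, 2.0, 3.0]))  # 2.0

# A lambda has no importable name, so the standard pickler refuses it.
try:
    pickle.dumps(lambda X: X)
    picklable = True
except (pickle.PicklingError, AttributeError):
    picklable = False

print(picklable)  # False
```

A function defined in a notebook cell may pickle in the current session, but it can only be loaded again where the same name exists in the same module, which is why an importable module is the safer home for simFunc.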

Calibration persistence currently saves the final CalResult and related artifacts. It does not save per-iteration snapshots.

Common Mistakes

| Mistake | What happens | Fix |
| --- | --- | --- |
| Using Problem instead of ModelProblem | Calibration methods reject the problem. | Build a ModelProblem with simFunc and obs. |
| Returning the wrong simFunc shape | Scoring fails or simulations do not align with observations. | Return (n_samples, n_time, n_series). |
| Passing one-dimensional observations | ModelProblem rejects obs. | Use a 2D array, for example obs.reshape(-1, 1). |
| Forgetting metric direction | GLUE may keep the wrong samples. | Use <= threshold for lower-is-better metrics and >= threshold for higher-is-better metrics. |
| Setting a GLUE threshold too strict | No behavioral samples are found. | Inspect diagnostics["scores"] and adjust the threshold. |
| Using a mask with the wrong shape | The model problem cannot validate it. | Make mask.shape == obs.shape. |
| Expecting bestSim to keep 3D shape | Result simulations are flattened for scoring. | Reshape using obs.shape when you need time-series layout. |
| Saving an interactive simFunc | Pickling can fail. | Define persistent functions in importable modules. |
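
Two of these fixes are plain NumPy reshapes. The sketch below turns 1D observations into the required 2D column and restores a flattened best simulation to the observation layout. The array best_sim_flat is a stand-in shaped like the bestSim output shown earlier, and the reshape assumes row-major flattening.

```python
import numpy as np

# 1D observations must become a 2D column: (n_time,) -> (n_time, 1).
obs_1d = np.array([1.0, 2.0])
obs = obs_1d.reshape(-1, 1)

# Stand-in for a flattened bestSim with shape (1, n_time * n_series).
best_sim_flat = np.array([[1.0, 2.0]])

# Restore the (n_time, n_series) layout.
best_sim = best_sim_flat.reshape(obs.shape)

print(obs.shape)       # (2, 1)
print(best_sim.shape)  # (2, 1)
```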

Next Steps

| Goal | Read |
| --- | --- |
| Build simulation problems | Problem |
| Generate candidate parameter sets | Design of Experiment |
| Look up constructors and result fields | Calibration API |
| Compare with inference workflows | Inference |
| See complete workflows | Examples |