Analysis

The analysis module estimates how input variables affect objective or constraint outputs.

Use analysis when you need to answer questions such as:

| Question | Suitable method |
| --- | --- |
| Which input variables matter most? | RBDFAST, RSA, DeltaTest, Morris |
| How much output variance is explained by each variable? | Sobol, FAST |
| Are interactions likely to matter? | Sobol with secondOrder=True, or compare S1 and ST |
| Which variables can be screened out early? | Morris, DeltaTest |
| I already have ordinary samples and outputs. What can I run? | RBDFAST, RSA, DeltaTest |

Analysis does not optimize the model. It explains the relationship between sampled inputs and model outputs.

What You Need Before Analysis

Analysis workflows work with up to four objects:

| Object | Meaning |
| --- | --- |
| problem | A Problem or benchmark problem defining inputs, bounds, labels, and outputs. |
| X | Input sample matrix with shape (n_samples, n_input). |
| Y | Output matrix with shape (n_samples, n_outputs). Each row must correspond to the same row in X. |
| meta | Sampling metadata from sampleWithMeta(). Required by Sobol, FAST, and Morris. |

The usual workflow is:

text
define problem -> generate X -> evaluate Y -> run analysis -> read AnaResult

If your model is expensive, compute Y once and pass it into analyze(). If Y is omitted, UQPyL evaluates the problem internally.
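
One way to compute Y once and reuse it is to archive X and Y together. The sketch below uses only NumPy; `expensive_model` and the file name `runs.npz` are illustrative stand-ins, not part of UQPyL.

```python
import tempfile
from pathlib import Path

import numpy as np


def expensive_model(X):
    """Stand-in for a costly simulation: one output column per row of X."""
    X = np.atleast_2d(X)
    return (X[:, 0] ** 2 + 0.2 * X[:, 1] ** 2).reshape(-1, 1)


rng = np.random.default_rng(123)
X = rng.uniform(-1.0, 1.0, size=(16, 2))
Y = expensive_model(X)  # evaluate once

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "runs.npz"
    # Store X and Y in one file so their rows cannot drift apart.
    np.savez(archive, X=X, Y=Y)

    # Later session: reload both matrices instead of re-running the model.
    with np.load(archive) as data:
        X_loaded, Y_loaded = data["X"], data["Y"]

print(X_loaded.shape, Y_loaded.shape)
```

The reloaded pair can then be passed to analyze() as X and Y=Y, in the same way as freshly evaluated outputs.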

Basic Workflow

This example asks: in f(x1, x2) = x1^2 + 0.2*x2^2, which input contributes more?

python
import numpy as np

from UQPyL.analysis import RBDFAST
from UQPyL.doe import LHS
from UQPyL.problem import Problem

np.set_printoptions(precision=4, suppress=True)


def objFunc(X):
    X = np.atleast_2d(X)
    y = X[:, 0] ** 2 + 0.2 * X[:, 1] ** 2
    return y.reshape(-1, 1)


problem = Problem(nInput=2, nObj=1, lb=-1.0, ub=1.0, objFunc=objFunc, optType="min", name="WeightedSphere2D", xLabels=["x1", "x2"])

X = LHS("classic").sample(problem, nSamples=256, seed=123)
Y = problem.evaluate(X).objs

result = RBDFAST(verboseFlag=False).analyze(problem, X, Y=Y)

print(X.shape, Y.shape)
print(result.metricNames)
print(result.getMetric("S1").values)
print(result.getMetric("S1").rowLabels)
print(result.getMetric("S1").colLabels)

Example output:

text
(256, 2) (256, 1)
['S1']
[[0.951  0.0475]]
['obj1']
['x1', 'x2']

Interpretation: x1 is much more influential than x2. The metric columns follow problem.xLabels, so the first value belongs to x1 and the second value belongs to x2.
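
For this toy function the first-order indices can also be worked out by hand, which gives a reference for the estimate. The sketch assumes the two inputs are independent and uniform on [-1, 1]; that matches the bounds above but is an assumption about how the samples are distributed.

```python
from fractions import Fraction

# For x ~ Uniform(-1, 1): E[x^2] = 1/3 and E[x^4] = 1/5,
# so Var(x^2) = E[x^4] - E[x^2]^2 = 1/5 - 1/9 = 4/45.
var_x_squared = Fraction(1, 5) - Fraction(1, 3) ** 2

# f(x1, x2) = 1.0 * x1^2 + 0.2 * x2^2 is additive, so each term
# contributes coefficient^2 * Var(x^2) to the output variance.
coefficients = [Fraction(1), Fraction(1, 5)]
partial_variances = [c ** 2 * var_x_squared for c in coefficients]
total_variance = sum(partial_variances)

s1_exact = [float(v / total_variance) for v in partial_variances]
print([round(s, 4) for s in s1_exact])
```

The exact values are 25/26 (about 0.9615) for x1 and 1/26 (about 0.0385) for x2, so the RBDFAST estimates from 256 samples are close but not identical.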

Understand X, Y, and meta

X is a table of inputs.

text
X[row, column] = one sampled value
row = one model run
column = one input variable

Y is a table of outputs.

text
Y[row, column] = one output value
row = the same model run as X[row]
column = one objective or constraint output

The row order must stay aligned:

| Row | Input row | Output row |
| --- | --- | --- |
| 0 | X[0, :] | Y[0, :] |
| 1 | X[1, :] | Y[1, :] |
| 2 | X[2, :] | Y[2, :] |

Do not shuffle X or Y separately after evaluation.
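
If rows of ordinary samples do need to be reordered, one safe pattern is to apply a single shared permutation to both matrices. This is a NumPy-only sketch with a toy model; it is not a UQPyL function.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(6, 2))
Y = (X[:, 0] ** 2 + 0.2 * X[:, 1] ** 2).reshape(-1, 1)

# One permutation, applied to both matrices, keeps X[i] paired with Y[i].
perm = rng.permutation(X.shape[0])
X_shuffled, Y_shuffled = X[perm], Y[perm]

# Each output row still matches the model evaluated on its input row.
expected = X_shuffled[:, 0] ** 2 + 0.2 * X_shuffled[:, 1] ** 2
aligned = bool(np.allclose(Y_shuffled[:, 0], expected))
print(aligned)
```

This only applies to ordinary sample matrices. Designs whose row order carries meaning should be left in the order in which they were generated.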

meta is a dictionary that describes how X was generated. It is not model output. Some analysis methods need it because the rows of X have a special structure.

Specified Designs and Free Designs

Analysis methods fall into two groups.

| Group | Methods | Meaning |
| --- | --- | --- |
| Specified-design methods | Sobol, FAST, Morris | The rows of X must follow a special structure generated by the matching DOE design. Use sampleWithMeta() and pass meta. |
| Free-design methods | RBDFAST, RSA, DeltaTest, MARS | The method can analyze ordinary sample matrices. You can use LHS, Random, low-discrepancy Sobol, or existing simulation samples. |

Specified-design methods are strict:

| Analysis method | Required DOE design | Why it is required |
| --- | --- | --- |
| Sobol | SaltelliDesign(secondOrder=...).sampleWithMeta(...) | Sobol expects base samples and hybrid samples arranged in a Saltelli row pattern. |
| FAST | FASTDesign(M=...).sampleWithMeta(...) | FAST expects samples arranged by Fourier frequencies. |
| Morris | MorrisDesign(numLevels=...).sampleWithMeta(...) | Morris expects trajectory samples, where each consecutive step changes one variable. |

Free-design methods are more flexible:

| Analysis method | Typical sample source | Notes |
| --- | --- | --- |
| RBDFAST | LHS("classic"), Random, ordinary DOE Sobol, existing X | A good first choice when you already have general samples. |
| RSA | LHS("classic"), Random, ordinary DOE Sobol, existing X | Needs enough samples in output regions. |
| DeltaTest | LHS("classic"), Random, ordinary DOE Sobol, existing X | Uses nearest neighbors, so sample density matters. |
| MARS | Ordinary training-style samples | Available only when optional dependencies are installed. |

Here "free" means the method is not tied to a method-specific DOE design. It does not mean sample quality is unimportant. For interpretation, prefer space-filling samples and enough points to cover the input range.
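
To make "space-filling" concrete, here is a minimal generic Latin hypercube in NumPy. It illustrates the idea only and is not the implementation behind LHS("classic"); the function name and arguments are hypothetical.

```python
import numpy as np


def latin_hypercube(n_samples, n_input, lb=0.0, ub=1.0, seed=None):
    """Generic Latin hypercube: one point per equal-width stratum in every column."""
    rng = np.random.default_rng(seed)
    # Row k starts inside stratum k of every column: [k / n, (k + 1) / n).
    jitter = rng.random((n_samples, n_input))
    u = (np.arange(n_samples)[:, None] + jitter) / n_samples
    # Shuffle each column independently so strata are paired at random across inputs.
    for j in range(n_input):
        u[:, j] = rng.permutation(u[:, j])
    return lb + (ub - lb) * u


X = latin_hypercube(n_samples=8, n_input=2, seed=123)

# Stratum index of every point, sorted per column: each stratum appears once.
strata = np.sort(np.floor(X * 8).astype(int), axis=0)
print(strata.T)
```

Every input range is covered evenly regardless of the seed, which purely random sampling does not guarantee at small sample sizes.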

Be careful with the name Sobol: UQPyL.doe.Sobol() is a low-discrepancy sampler, but UQPyL.analysis.Sobol is a sensitivity method. The analysis method requires SaltelliDesign, not ordinary Sobol() samples.

Choose a Method

Start from the question you want to answer.

| Situation | Recommended start | Notes |
| --- | --- | --- |
| You want a quick ranking from ordinary samples. | RBDFAST | Gives first-order sensitivity S1. |
| You need standard variance-based sensitivity indexes. | Sobol | Requires SaltelliDesign. |
| You need a Fourier-based variance method. | FAST | Requires FASTDesign. |
| You have many inputs and need screening. | Morris | Requires MorrisDesign. |
| You already have simulation archives from ordinary DOE. | RBDFAST, RSA, DeltaTest | These do not require method-specific metadata. |
| You want a nearest-neighbor contribution estimate. | DeltaTest | Sensitive to sample density and neighborhood structure. |

Practical default: if you already have LHS or Random samples, start with RBDFAST. If you can choose a new design and need formal variance-based indexes, use Sobol.

Sobol Analysis

Sobol estimates first-order and total-order variance contributions.

Use it when you can generate a SaltelliDesign before running the model.

python
import numpy as np

from UQPyL.analysis import Sobol
from UQPyL.doe import SaltelliDesign
from UQPyL.problem import Problem

np.set_printoptions(precision=4, suppress=True)


def objFunc(X):
    X = np.atleast_2d(X)
    y = X[:, 0] + 0.1 * X[:, 1]
    return y.reshape(-1, 1)


problem = Problem(nInput=2, nObj=1, lb=0.0, ub=1.0, objFunc=objFunc, optType="min", name="Linear2D", xLabels=["flow", "roughness"])

X, meta = SaltelliDesign(secondOrder=False).sampleWithMeta(problem, N=64, seed=123)
Y = problem.evaluate(X).objs

result = Sobol(verboseFlag=False).analyze(problem, X, Y=Y, meta=meta)

print(X.shape)
print(meta)
print(result.metricNames)
print(result.getMetric("S1").values)
print(result.getMetric("ST").values)
print(result.getMetric("S1_norm").values)

Example output:

text
(256, 2)
{'designType': 'saltelli', 'N': 64, 'secondOrder': False, 'skipValue': 0, 'scramble': True, 'blockSize': 4, 'seed': 123}
['S1', 'S1_norm', 'ST', 'ST_norm']
[[0.9878 0.0104]]
[[0.9908 0.0098]]
[[0.9896 0.0104]]

Interpretation: flow dominates the output variance. roughness has a small contribution because its coefficient is 0.1.
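
To show why base and hybrid samples are needed, the sketch below estimates S1 and ST for the same linear model in plain NumPy. It uses the Jansen estimators with random base matrices; the estimator, sampling, and row order inside UQPyL may differ, and every name here is illustrative.

```python
import numpy as np


def model(X):
    """Toy linear model from the example: y = x1 + 0.1 * x2."""
    return X[:, 0] + 0.1 * X[:, 1]


def jansen_indices(model, n_base, n_input, seed):
    rng = np.random.default_rng(seed)
    A = rng.random((n_base, n_input))  # first base matrix
    B = rng.random((n_base, n_input))  # second, independent base matrix
    fA, fB = model(A), model(B)
    var_y = np.var(np.concatenate([fA, fB]))

    s1 = np.empty(n_input)
    st = np.empty(n_input)
    for i in range(n_input):
        AB = A.copy()
        AB[:, i] = B[:, i]  # hybrid matrix: column i from B, the rest from A
        fAB = model(AB)
        # fB and fAB share only input i, so their agreement measures its main effect.
        s1[i] = 1.0 - np.mean((fB - fAB) ** 2) / (2.0 * var_y)
        # fA and fAB differ only in input i, so their gap measures its total effect.
        st[i] = np.mean((fA - fAB) ** 2) / (2.0 * var_y)
    return s1, st


s1, st = jansen_indices(model, n_base=8192, n_input=2, seed=123)
print(np.round(s1, 3), np.round(st, 3))
```

For independent uniform inputs the exact indices of this model are 1/1.01 (about 0.990) and 0.01/1.01 (about 0.010), so both the sketch and the library output above should land near those values. The sketch evaluates the model on A, B, and one hybrid matrix per input.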

For nInput=2 and secondOrder=False, the total number of model runs is:

text
N * (nInput + 2) = 64 * 4 = 256

If secondOrder=True, Sobol also returns S2, and the sample count becomes:

text
N * (2*nInput + 2)
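
Because the run count grows with the number of inputs, it can help to budget model evaluations before sampling. The helper below only encodes the two formulas above; `saltelli_rows` is a hypothetical name, not a UQPyL function.

```python
def saltelli_rows(n_base, n_input, second_order=False):
    """Total model runs for a Saltelli design, following the formulas above."""
    blocks = 2 * n_input + 2 if second_order else n_input + 2
    return n_base * blocks


for second_order in (False, True):
    print(second_order, saltelli_rows(64, 2, second_order))
```

With N=64 and nInput=2 this gives 256 runs without second-order indices and 384 with them.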

FAST Analysis

FAST also estimates first-order and total-order sensitivity, but it uses a Fourier design.

Use FASTDesign.sampleWithMeta(). The base sample size N must satisfy N > 4*M^2.

python
import numpy as np

from UQPyL.analysis import FAST
from UQPyL.doe import FASTDesign
from UQPyL.problem import Problem

np.set_printoptions(precision=4, suppress=True)


def objFunc(X):
    X = np.atleast_2d(X)
    y = X[:, 0] + 0.1 * X[:, 1]
    return y.reshape(-1, 1)


problem = Problem(nInput=2, nObj=1, lb=0.0, ub=1.0, objFunc=objFunc, optType="min", name="Linear2D", xLabels=["flow", "roughness"])

X, meta = FASTDesign(M=4).sampleWithMeta(problem, N=256, seed=123)
Y = problem.evaluate(X).objs

result = FAST(verboseFlag=False).analyze(problem, X, Y=Y, meta=meta)

print(X.shape)
print(meta)
print(result.getMetric("S1").values)
print(result.getMetric("ST").values)

Example output:

text
(512, 2)
{'designType': 'fast', 'N': 256, 'M': 4, 'blockSize': 256, 'seed': 123}
[[0.9879 0.01  ]]
[[0.9901 0.0101]]

For nInput=2, FASTDesign returns N * nInput = 512 rows.
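
The two FAST rules stated in this section, N > 4*M^2 and N * nInput rows, can be checked before any model run. `fast_rows` is a hypothetical helper that encodes only those two rules.

```python
def fast_rows(n_base, n_input, M=4):
    """Validate the FAST base sample size and return the total row count."""
    minimum = 4 * M ** 2
    if n_base <= minimum:
        raise ValueError(f"N must be greater than 4*M^2 = {minimum}, got {n_base}")
    return n_base * n_input


print(fast_rows(256, 2, M=4))

try:
    fast_rows(64, 2, M=4)  # 64 is not greater than 4 * 4^2 = 64
except ValueError as err:
    print("rejected:", err)
```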

Morris Screening

Morris is a screening method. It does not try to give a full variance decomposition. Instead, it estimates elementary effects along trajectories.

| Pattern | Meaning |
| --- | --- |
| high mu_star | The variable has strong overall influence. |
| high sigma | The variable may be nonlinear or involved in interactions. |
| low mu_star | The variable may be inactive for the analyzed output. |

python
import numpy as np

from UQPyL.analysis import Morris
from UQPyL.doe import MorrisDesign
from UQPyL.problem import Problem

np.set_printoptions(precision=4, suppress=True)


def objFunc(X):
    X = np.atleast_2d(X)
    y = X[:, 0] + 0.5 * X[:, 1] + 0.0 * X[:, 2]
    return y.reshape(-1, 1)


problem = Problem(nInput=3, nObj=1, lb=0.0, ub=1.0, objFunc=objFunc, optType="min", name="Screening3D", xLabels=["rain", "soil", "unused"])

X, meta = MorrisDesign(numLevels=4).sampleWithMeta(problem, numTrajectory=8, seed=123)
Y = problem.evaluate(X).objs

result = Morris(verboseFlag=False).analyze(problem, X, Y=Y, meta=meta)

print(X.shape)
print(meta)
print(result.metricNames)
print(result.getMetric("mu_star").values)
print(result.getMetric("sigma").values)
print(result.getMetric("S1_norm").values)

Example output:

text
(32, 3)
{'designType': 'morris', 'numTrajectory': 8, 'numLevels': 4, 'trajectorySize': 4, 'seed': 123}
['mu', 'mu_star', 'sigma', 'S1_norm']
[[1.  0.5 0. ]]
[[0. 0. 0.]]
[[0.6667 0.3333 0.    ]]

Interpretation: rain is strongest, soil is weaker, and unused is inactive. sigma is zero because the toy model is linear and has no interactions.

For nInput=3, each Morris trajectory has nInput + 1 = 4 rows. With numTrajectory=8, the sample matrix has 32 rows.
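
The trajectory idea can be sketched in plain NumPy. This is a simplified variant that always steps upward by a fixed delta and uses the sample standard deviation for sigma; both are assumptions, and MorrisDesign and Morris may differ in those details.

```python
import numpy as np


def model(X):
    """Toy screening model from the example: y = x1 + 0.5 * x2 + 0 * x3."""
    return X[:, 0] + 0.5 * X[:, 1] + 0.0 * X[:, 2]


def simple_trajectories(n_traj, n_input, num_levels, seed):
    """One-at-a-time trajectories on the unit cube; every step moves one input up."""
    rng = np.random.default_rng(seed)
    delta = num_levels / (2.0 * (num_levels - 1))
    # Base levels low enough that base + delta stays inside [0, 1].
    base_levels = np.arange(num_levels // 2) / (num_levels - 1)
    rows, orders = [], []
    for _ in range(n_traj):
        x = rng.choice(base_levels, size=n_input)
        order = rng.permutation(n_input)  # which input changes at each step
        rows.append(x.copy())
        for i in order:
            x[i] += delta
            rows.append(x.copy())
        orders.append(order)
    return np.array(rows), np.array(orders), delta


def elementary_effects(Y, orders, delta):
    n_traj, n_input = orders.shape
    Y = Y.reshape(n_traj, n_input + 1)
    ee = np.empty((n_traj, n_input))
    for t in range(n_traj):
        for step, i in enumerate(orders[t]):
            # Output change caused by moving input i alone, per unit of input.
            ee[t, i] = (Y[t, step + 1] - Y[t, step]) / delta
    return ee


X, orders, delta = simple_trajectories(n_traj=8, n_input=3, num_levels=4, seed=123)
ee = elementary_effects(model(X), orders, delta)

mu_star = np.abs(ee).mean(axis=0)
sigma = ee.std(axis=0, ddof=1)
print(X.shape)
print(np.round(mu_star, 4), np.round(sigma, 4))
```

For this linear model every elementary effect equals the input's coefficient, so mu_star is approximately [1, 0.5, 0] and sigma is approximately zero. Dividing mu_star by its sum gives about [0.667, 0.333, 0], which appears consistent with the S1_norm values in the example output.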

Analyze Existing Outputs

When model evaluations are expensive, save Y and reuse it.

python
import numpy as np

from UQPyL.analysis import DeltaTest, RSA
from UQPyL.doe import LHS
from UQPyL.problem import Problem

np.set_printoptions(precision=4, suppress=True)


def objFunc(X):
    X = np.atleast_2d(X)
    y = X[:, 0] ** 2 + 0.2 * X[:, 1] ** 2
    return y.reshape(-1, 1)


problem = Problem(nInput=2, nObj=1, lb=-1.0, ub=1.0, objFunc=objFunc, optType="min", name="WeightedSphere2D", xLabels=["x1", "x2"])

X = LHS("classic").sample(problem, nSamples=128, seed=123)
Y = problem.evaluate(X).objs

deltaResult = DeltaTest(nNeighbors=2, verboseFlag=False).analyze(problem, X, Y=Y)
rsaResult = RSA(nRegion=4, verboseFlag=False).analyze(problem, X, Y=Y)

print(deltaResult.metricNames)
print(deltaResult.getMetric("S1_norm").values)
print(rsaResult.metricNames)
print(rsaResult.getMetric("S1_norm").values)

Example output:

text
['S1', 'S1_norm']
[[ 1.0331 -0.0331]]
['S1', 'S1_norm']
[[0.7945 0.2055]]

DeltaTest values can be negative after normalization when finite samples and nearest-neighbor estimates are noisy. Use the result as a screening signal, then confirm important variables with a larger sample or a variance-based method when needed.

Select Targets and Output Columns

Use target to choose which output block to analyze.

| target | Meaning |
| --- | --- |
| "objs" | Analyze objective outputs. |
| "cons" | Analyze constraint outputs. |

Use index to choose output columns.

| index | Meaning |
| --- | --- |
| "all" | Analyze all columns in the selected output block. |
| 0 | Analyze only the first output column. |
| [0, 2] | Analyze selected output columns. |

This example has two objective columns and analyzes only the second one.

python
import numpy as np

from UQPyL.analysis import RBDFAST
from UQPyL.doe import LHS
from UQPyL.problem import Problem

np.set_printoptions(precision=4, suppress=True)


def objFunc(X):
    X = np.atleast_2d(X)
    first = X[:, 0] ** 2
    second = X[:, 1] ** 2
    return np.column_stack([first, second])


problem = Problem(nInput=2, nObj=2, lb=-1.0, ub=1.0, objFunc=objFunc, optType=["min", "min"], xLabels=["x1", "x2"])

X = LHS("classic").sample(problem, nSamples=128, seed=123)
Y = problem.evaluate(X).objs

result = RBDFAST(verboseFlag=False).analyze(problem, X, Y=Y, target="objs", index=1)

print(result.getMetric("S1").rowLabels)
print(result.getMetric("S1").values)

Example output:

text
['obj1']
[[0.0092 0.9801]]

The row label is obj1 because labels are numbered within the analyzed result, which contains only the one selected column. That column is the original second objective (index=1).
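
The index values map naturally onto NumPy column selection. The helper below is a hypothetical illustration of those semantics, not UQPyL's internal code; it keeps the result two-dimensional so each selected output stays a column.

```python
import numpy as np


def select_columns(Y, index="all"):
    """Hypothetical helper: pick output columns while keeping Y two-dimensional."""
    Y = np.asarray(Y)
    if Y.ndim == 1:
        Y = Y.reshape(-1, 1)  # treat a flat vector as one output column
    if isinstance(index, str):
        if index != "all":
            raise ValueError(f"unsupported index: {index!r}")
        return Y
    # An int becomes a one-element selection; a list selects several columns.
    return Y[:, np.atleast_1d(index)]


Y = np.arange(12).reshape(4, 3)
print(select_columns(Y, "all").shape)
print(select_columns(Y, 1).shape)
print(select_columns(Y, [0, 2]).shape)
```

The three calls give shapes (4, 3), (4, 1), and (4, 2).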

Use Verbose Output

Set verboseFlag=True when you want a compact runtime trace.

python
import numpy as np

from UQPyL.analysis import RBDFAST
from UQPyL.doe import LHS
from UQPyL.problem import Problem


def objFunc(X):
    X = np.atleast_2d(X)
    y = X[:, 0] ** 2 + 0.2 * X[:, 1] ** 2
    return y.reshape(-1, 1)


problem = Problem(nInput=2, nObj=1, lb=-1.0, ub=1.0, objFunc=objFunc, optType="min", name="WeightedSphere2D", xLabels=["x1", "x2"])

X = LHS("classic").sample(problem, nSamples=64, seed=123)
result = RBDFAST(verboseFlag=True).analyze(problem, X)

Example output:

text
Analysis: RBDFAST
Problem: WeightedSphere2D
nInput: 2
nOutput: 1
params: {'M': 4}
Analysis finished
runtime: 0.001s
[S1]
obj1: x1=9.3350e-01  x2=1.0117e-01

Use verboseFlag=False in scripts where you only need the result object.

Read AnaResult

Every analysis method returns an AnaResult.

python
import numpy as np

from UQPyL.analysis import RBDFAST
from UQPyL.doe import LHS
from UQPyL.problem import Problem

np.set_printoptions(precision=4, suppress=True)


def objFunc(X):
    X = np.atleast_2d(X)
    y = X[:, 0] ** 2 + 0.2 * X[:, 1] ** 2
    return y.reshape(-1, 1)


problem = Problem(nInput=2, nObj=1, lb=-1.0, ub=1.0, objFunc=objFunc, optType="min", name="WeightedSphere2D", xLabels=["x1", "x2"])

X = LHS("classic").sample(problem, nSamples=256, seed=123)
Y = problem.evaluate(X).objs
result = RBDFAST(verboseFlag=False).analyze(problem, X, Y=Y)

metric = result.getMetric("S1")

print(result.summary()["metric_names"])
print(metric.name)
print(metric.values)
print(metric.rowLabels)
print(metric.colLabels)

Example output:

text
['S1']
S1
[[0.951  0.0475]]
['obj1']
['x1', 'x2']

AnaResult groups all metrics from one analysis run.

| Field or method | Meaning |
| --- | --- |
| result.method | Analysis method name, such as "RBDFAST" or "Sobol". |
| result.target | The analyzed output block, such as "objs" or "cons". |
| result.metricNames | Names of available metrics. |
| result.getMetric("S1") | Return one AnaMetric. |
| result["S1"] | Shortcut for result.getMetric("S1"). |
| result.X | Recorded input matrix. |
| result.Y | Recorded output matrix. |
| result.summary() | Compact runtime summary. |

AnaMetric stores one matrix.

| Field | Meaning |
| --- | --- |
| metric.name | Metric name, such as S1, ST, or mu_star. |
| metric.values | Numeric matrix. Rows are outputs; columns are variables or variable pairs. |
| metric.rowLabels | Output labels, such as obj1 or con1. |
| metric.colLabels | Input labels, such as x1, rain, or soil. |
| metric.colDim | Column type. decsDim1 means one input per column; decsDim2 means input pairs. |
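
Since values, rowLabels, and colLabels line up by position, turning a metric into a ranking takes a few lines. The sketch uses plain lists copied from the example output above instead of a live AnaMetric, and `rank_inputs` is a hypothetical helper.

```python
# Values copied from the example output above.
values = [[0.951, 0.0475]]
row_labels = ["obj1"]
col_labels = ["x1", "x2"]


def rank_inputs(values, row_labels, col_labels):
    """Return, for each output, the inputs sorted from most to least influential."""
    ranking = {}
    for out_label, row in zip(row_labels, values):
        pairs = zip(col_labels, row)
        ranking[out_label] = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return ranking


ranking = rank_inputs(values, row_labels, col_labels)
for out_label, pairs in ranking.items():
    print(out_label + ":", ", ".join(f"{name}={value:.4f}" for name, value in pairs))
```

With a live result, the same function could be fed metric.values, metric.rowLabels, and metric.colLabels.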

Read a Saved SQLite Result

Set saveFlag=True to save a sqlite file under Result/.

python
from pathlib import Path

import numpy as np

from UQPyL.analysis import RBDFAST
from UQPyL.analysis.runtime import AnaReader
from UQPyL.doe import LHS
from UQPyL.problem import Sphere

np.set_printoptions(precision=4, suppress=True)


problem = Sphere(nInput=3, ub=1.0, lb=-1.0)
X = LHS("classic").sample(problem, nSamples=64, seed=123)
Y = problem.evaluate(X).objs

resultDir = Path("Result")
before = set(resultDir.glob("*.sqlite3")) if resultDir.exists() else set()

RBDFAST(verboseFlag=False, logFlag=False, saveFlag=True).analyze(problem, X, Y=Y)

after = set(resultDir.glob("*.sqlite3"))
dbPath = sorted(after - before)[0]

with AnaReader(dbPath) as reader:
    summary = reader.get_run_summary()
    params = reader.get_run_params()
    metric = reader.get_metric("S1")
    loaded = reader.load_result()

print(dbPath.as_posix())
print(summary["method"], summary["problem_name"], summary["target"])
print(summary["metric_names"])
print(params["M"])
print(metric.values)
print(loaded.X.shape, loaded.Y.shape)

Example output:

text
Result/rbdfast_Sphere_YYYYMMDD_HHMM_xxxx.sqlite3
RBDFAST Sphere objs
['S1']
4
[[0.1853 0.2958 0.3061]]
(64, 3) (64, 1)

If you already know the sqlite path, use only the reader part:

python
from UQPyL.analysis.runtime import AnaReader

dbPath = "Result/rbdfast_Sphere_YYYYMMDD_HHMM_xxxx.sqlite3"

with AnaReader(dbPath) as reader:
    summary = reader.get_run_summary()
    result = reader.load_result()

Saved runs include a serialized problem payload. If you define objFunc interactively inside a notebook cell or temporary script, Python may not be able to pickle it. For persistent sqlite runs, prefer importable problem classes or objective functions defined in importable modules.
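
The pickling limitation can be reproduced with the standard library alone. The sketch contrasts an importable function with an anonymous one; it illustrates Python's pickle behavior in general and does not show how UQPyL serializes the problem.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Functions are pickled by reference: module name plus qualified name.
print(can_pickle(math.sqrt))         # importable, so the reference resolves
print(can_pickle(lambda X: X ** 2))  # a lambda has no importable name
```

The first call prints True and the second prints False. An objective function defined in an importable module behaves like the first case.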

Interpret Common Metrics

Metric availability depends on the method.

| Metric | Produced by | Meaning |
| --- | --- | --- |
| S1 | Sobol, FAST, RBDFAST, RSA, DeltaTest, MARS | First-order or method-specific single-variable contribution. Larger usually means more important. |
| S1_norm | Sobol, FAST, RSA, DeltaTest, Morris, MARS | Row-normalized contribution. Useful for ranking variables within one output. |
| ST | Sobol, FAST | Total-order contribution, including interactions with other variables. |
| ST_norm | Sobol, FAST | Row-normalized total-order contribution. |
| S2 | Sobol with secondOrder=True | Pairwise second-order contribution. Columns are input pairs. |
| mu | Morris | Mean elementary effect. Sign can indicate direction. |
| mu_star | Morris | Mean absolute elementary effect. Use this for Morris variable ranking. |
| sigma | Morris | Spread of elementary effects. Larger values suggest nonlinear effects or interactions. |

Use metric magnitudes as rankings, not as exact physical constants. Sensitivity estimates depend on the sampled input ranges in problem.lb and problem.ub.
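
The dependence on bounds is easy to see for a linear model, where first-order indices have a closed form. The sketch assumes independent uniform inputs and uses y = x1 + x2 as a hypothetical model.

```python
from fractions import Fraction


def uniform_variance(lb, ub):
    """Variance of a uniform variable on [lb, ub]: (ub - lb)^2 / 12."""
    return Fraction(ub - lb) ** 2 / 12


def first_order_linear(coeffs, bounds):
    """Exact S1 for y = sum(c_i * x_i) with independent uniform inputs."""
    parts = [Fraction(c) ** 2 * uniform_variance(lb, ub)
             for c, (lb, ub) in zip(coeffs, bounds)]
    total = sum(parts)
    return [part / total for part in parts]


# Same model, y = x1 + x2, analyzed under two different sets of bounds.
narrow = first_order_linear([1, 1], [(0, 1), (0, 1)])
wide = first_order_linear([1, 1], [(0, 1), (0, 2)])

print([float(s) for s in narrow])
print([float(s) for s in wide])
```

With equal ranges both inputs have S1 = 0.5. Doubling the range of x2 shifts the split to 0.2 and 0.8, although the model itself is unchanged.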

Common Mistakes

| Mistake | What happens | Fix |
| --- | --- | --- |
| Passing Sobol, FAST, or Morris samples without meta | The method cannot interpret the row structure. | Use X, meta = sampler.sampleWithMeta(...) and pass meta=meta. |
| Generating X with one sampler but meta with another | Results are invalid or validation fails. | Keep the X and meta returned by the same call. |
| Shuffling Y after evaluation | Inputs and outputs no longer match row by row. | Keep X[i] aligned with Y[i]. |
| Passing a one-dimensional Y with unclear shape | The method may reshape it, but intent is less clear. | Prefer Y with shape (n_samples, n_outputs). |
| Using too few samples | Rankings can be noisy. | Start small for smoke tests, then increase sample count for interpretation. |
| Comparing sensitivities across different bounds | Results change because the input uncertainty range changed. | Document lb and ub with every analysis result. |
| Treating S1_norm as an absolute physical contribution | It is normalized for ranking within one output. | Use it to compare variables in the same result row. |
| Confusing S1 and ST | Interactions may be missed. | If ST is much larger than S1, the variable acts largely through interactions with other variables. |
| Saving an interactive objFunc | Pickling can fail when saveFlag=True. | Use importable problem classes or functions for saved sqlite runs. |
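
The gap between S1 and ST is clearest in a pure interaction model. The sketch computes both indices exactly for the hypothetical model y = x1 * x2 on a small symmetric grid, which is a discrete stand-in for a continuous input range.

```python
from itertools import product
from statistics import fmean, pvariance

levels = [-1.0, -0.5, 0.0, 0.5, 1.0]  # equally likely values for each input


def model(x1, x2):
    """Pure interaction: neither input has an effect on its own."""
    return x1 * x2


var_y = pvariance([model(a, b) for a, b in product(levels, levels)])

# S1 for x1: variance of the conditional mean E[y | x1], relative to Var(y).
cond_means = [fmean([model(a, b) for b in levels]) for a in levels]
s1_x1 = pvariance(cond_means) / var_y

# ST for x1: average over x2 of the variance caused by x1 alone, relative to Var(y).
cond_vars = [pvariance([model(a, b) for a in levels]) for b in levels]
st_x1 = fmean(cond_vars) / var_y

print(s1_x1, st_x1)
```

Here S1 is 0 while ST is 1, so a ranking based on S1 alone would discard an input that drives the whole output through its interaction with x2.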

Next Steps

| Goal | Read |
| --- | --- |
| Generate compatible samples | Design of Experiment |
| Define the evaluated system | Problem |
| Look up constructors and result fields | Analysis API |
| See complete workflows | Examples |