Dakota Reference Manual
Version 6.15
Explore and Predict with Confidence
Activates principal components analysis of the response matrix of N samples * L responses.
Alias: none
Argument(s): none
Child Keywords:
Required/Optional | Description of Group | Dakota Keyword | Dakota Keyword Description |
---|---|---|---|
Optional | | percent_variance_explained | Specifies the number of components to retain to explain the specified percent variance. |
Dakota can calculate the principal components of the response matrix of N samples * L responses using the keyword principal_components. Principal components analysis (PCA) is a data reduction method. The Dakota implementation is under active development: the PCA capability may ultimately be specified elsewhere or used in different ways. For now, it is performed as a post-processing analysis based on a set of Latin Hypercube samples.
Dakota now supports field responses, and PCA is an initial approach to analyzing and representing field data. Specifically, given a sample ensemble of field responses, we want to identify the principal components responsible for the spread of that data. We can then build a surrogate model by representing the overall response as a weighted sum of M principal components, where the weights are determined by Gaussian processes (GPs) that are functions of the uncertain input variables. This reduced form can then be used for sensitivity analysis, calibration, etc.
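To make the weighted-sum representation concrete, here is a minimal sketch in plain Python. The function name `reconstruct_field` and its arguments are hypothetical and are not part of Dakota. The weights are passed in directly, whereas in Dakota they would come from the GPs. Adding back a mean field is an assumption, made because the PCA operates on centered data.

```python
def reconstruct_field(mean_field, components, weights):
    """Approximate one field response as mean + sum_k weights[k] * components[k].

    mean_field: list of L values (assumed mean of the sampled fields)
    components: list of M principal components, each a list of L values
    weights:    list of M weights, one per retained component
    """
    if len(components) != len(weights):
        raise ValueError("need one weight per retained component")
    field = list(mean_field)
    for weight, component in zip(weights, components):
        for j, value in enumerate(component):
            field[j] += weight * value
    return field


# Illustrative values only: a field of length 3 and M = 2 components.
approx = reconstruct_field(
    [1.0, 2.0, 3.0],
    [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    [0.5, -1.0],
)
print(approx)
```

Because only the M weights vary from sample to sample, a model of those weights as functions of the inputs is enough to predict the whole length-L field.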
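When principal_components is specified, Dakota proceeds in the steps outlined above: it evaluates the responses at a set of Latin Hypercube samples, computes the principal components of the resulting response matrix, retains the most significant components, and uses them to generate predictions.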
principal_components has one optional child keyword, percent_variance_explained, which is a threshold that determines the number of components that are retained to explain at least that amount of variance. The value is given as a fraction. For example, if the user specifies percent_variance_explained = 0.99, the number of components that accounts for at least 99 percent of the variance in the responses will be retained. The default threshold is 0.95. In many applications, only a few principal components explain the majority of the variance, resulting in significant data reduction.
Default Behavior
principal_components is off by default. It may be used with either scalar responses or field responses, but it is intended to be used with large field responses as a data reduction method. For example, we typically expect the number of LHS samples, N, to be less than the number of field responses, L (e.g., if there is one field, the number of response values is the length of that field).
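The percent_variance_explained threshold can be illustrated with a short sketch. It assumes the retained count is the smallest number of leading components whose cumulative variance share reaches the threshold. This is one natural reading of the description, not a statement of Dakota's internal code. The function name `components_to_retain` and the example variances are hypothetical.

```python
def components_to_retain(variances, percent_variance_explained=0.95):
    """Return how many leading components reach the variance threshold.

    variances: variance explained by each component, largest first.
    Assumption: the smallest qualifying count is the one retained.
    """
    total = sum(variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    cumulative = 0.0
    for count, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative / total >= percent_variance_explained:
            return count
    return len(variances)  # guard against round-off when the threshold is 1.0


# Hypothetical component variances, for illustration only.
example_variances = [80.0, 15.0, 4.0, 1.0]
print(components_to_retain(example_variances, 0.90))
print(components_to_retain(example_variances, 0.98))
```

With these illustrative variances, a threshold of 0.90 retains two components and a threshold of 0.98 retains three, so a small change in the threshold can change the size of the reduced model.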
Expected Outputs
When principal_components is specified, the number of significant principal components is printed along with the predictions based on the principal components. If output debug is specified, additional information is printed, including the original response matrix, the centered data, the principal components, and the factor scores.
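The quantities named in the debug output (centered data, principal components, factor scores) can be reproduced on a small synthetic example. The sketch below uses NumPy and a singular value decomposition of the centered response matrix, which is one standard way to compute PCA. This section does not state which algorithm Dakota uses, so treat the approach as an assumption. The function name `pca_by_svd` and the synthetic field data are hypothetical.

```python
import numpy as np


def pca_by_svd(response_matrix):
    """PCA of an N x L response matrix via SVD of the centered data."""
    responses = np.asarray(response_matrix, dtype=float)
    n_samples = responses.shape[0]
    mean = responses.mean(axis=0)          # mean of each response over samples
    centered = responses - mean            # centered data
    # Thin SVD: centered = u @ diag(s) @ vt; rows of vt are the components.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s**2 / (n_samples - 1)     # variance along each component
    scores = centered @ vt.T               # factor scores, one row per sample
    return mean, centered, vt, scores, variances


# Synthetic stand-in for field data: N = 20 samples of an L = 50 point field
# built from two underlying modes (illustrative values only).
t = np.linspace(0.0, 1.0, 50)
a = np.linspace(-1.0, 1.0, 20)
b = np.cos(3.0 * a)
fields = 3.0 + np.outer(a, np.sin(np.pi * t)) + 0.5 * np.outer(b, np.cos(np.pi * t))

mean, centered, components, scores, variances = pca_by_svd(fields)
print(variances[:3] / variances.sum())
```

Since the synthetic fields are built from two modes, essentially all of the variance falls on the first two components, which mirrors the N less than L data-reduction setting described above.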
Usage Tips
This is a preliminary capability that is under active development. Please contact the Dakota development team if you have problems using this capability or want to suggest additional features.
    method,
      sampling
        sample_type lhs
        samples = 100
        principal_components
          percent_variance_explained = 0.98
An extensive statistical literature is available on PCA. We recommend that interested users consult it when using the PCA capability.