OPA
- class OPA(n_modes: int, tau_max: int, center: bool = True, standardize: bool = False, use_coslat: bool = False, check_nans: bool = True, n_pca_modes: int = 100, compute: bool = True, sample_name: str = 'sample', feature_name: str = 'feature', solver: str = 'auto', random_state: int | None = None, solver_kwargs: dict = {})
Optimal Persistence Analysis.
Optimal Persistence Analysis (OPA) [1] [2] identifies the patterns with the largest decorrelation time in a time-varying field, known as optimal persistence patterns or optimally persistent patterns (OPP).
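To make "decorrelation time" concrete, the following is a minimal pure-Python sketch for a single time series. It assumes one common definition, T = 1 + 2 Σ ρ(τ) for τ = 1 … tau_max, where ρ is the sample autocorrelation; whether this class uses exactly this measure is an assumption, and the helper names `autocorrelation` and `decorrelation_time` here are hypothetical and not part of the library.

```python
import random


def autocorrelation(x, lag):
    """Sample autocorrelation of the sequence x at a non-negative integer lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def decorrelation_time(x, tau_max):
    """Assumed definition: T = 1 + 2 * sum of autocorrelations for lags 1..tau_max."""
    return 1.0 + 2.0 * sum(autocorrelation(x, lag) for lag in range(1, tau_max + 1))


rng = random.Random(0)

# White noise: no memory from one step to the next.
white = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# AR(1) "red" noise: each value keeps 90% of the previous one.
red = [0.0]
for _ in range(1999):
    red.append(0.9 * red[-1] + rng.gauss(0.0, 1.0))

t_white = decorrelation_time(white, tau_max=50)
t_red = decorrelation_time(red, tau_max=50)
print(f"white noise: {t_white:.2f}, red noise: {t_red:.2f}")
```

The persistent (red) series should yield a noticeably larger value than white noise. OPA looks for spatial patterns whose time series maximize a measure of this kind.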
- Parameters:
n_modes (int) – Number of optimal persistence patterns (OPP) to be computed.
tau_max (int) – Maximum time lag for the computation of the covariance matrix.
center (bool, default=True) – Whether to center the input data.
standardize (bool, default=False) – Whether to standardize the input data.
use_coslat (bool, default=False) – Whether to use cosine of latitude for scaling.
n_pca_modes (int, default=100) – Number of modes to be computed in the EOF pre-processing step.
compute (bool, default=True) – Whether to compute elements of the model eagerly, or to defer computation. If True, four pieces of the fit will be computed sequentially: 1) the preprocessor scaler, 2) optional NaN checks, 3) SVD decomposition, 4) scores and components.
sample_name (str, default="sample") – Name of the sample dimension.
feature_name (str, default="feature") – Name of the feature dimension.
solver ({"auto", "full", "randomized"}, default="auto") – Solver to use for the SVD computation.
solver_kwargs (dict, default={}) – Additional keyword arguments to pass to the solver.
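The effect of the center, standardize and use_coslat options can be sketched on a tiny samples-by-features table. This is an illustration only: the function `preprocess` is hypothetical, and the choice of sqrt(cos(latitude)) as the weight is an assumption (a common convention for area weighting), not something stated above.

```python
import math


def preprocess(rows, lats=None, center=True, standardize=False, use_coslat=False):
    """Scale a samples-by-features table (list of lists).

    lats holds one latitude in degrees per feature and is only needed when
    use_coslat is True. Assumes no feature is constant when standardize is True.
    """
    n_samples = len(rows)
    n_features = len(rows[0])
    out = [list(r) for r in rows]
    for j in range(n_features):
        col = [r[j] for r in rows]
        mean = sum(col) / n_samples
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n_samples)
        # Assumption: weight each feature by sqrt(cos(lat)).
        weight = math.sqrt(math.cos(math.radians(lats[j]))) if use_coslat else 1.0
        for i in range(n_samples):
            v = out[i][j]
            if center:
                v -= mean
            if standardize:
                v /= std
            out[i][j] = v * weight
    return out


data = [[1.0, 10.0], [3.0, 30.0]]  # 2 samples, 2 features
scaled = preprocess(data, lats=[0.0, 60.0], standardize=True, use_coslat=True)
print(scaled)
```

With standardization both features vary by the same amount, and the latitude weight then shrinks the feature at 60° relative to the one at the equator.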
References
Examples
>>> from xeofs.single import OPA
>>> model = OPA(n_modes=10, tau_max=50, n_pca_modes=100)
>>> model.fit(X, dim=("time"))
Retrieve the optimally persistent patterns (OPP) and their time series:
>>> opp = model.components()
>>> opp_ts = model.scores()
Retrieve the decorrelation time of the OPPs:
>>> decorrelation_time = model.decorrelation_time()
- __init__(n_modes: int, tau_max: int, center: bool = True, standardize: bool = False, use_coslat: bool = False, check_nans: bool = True, n_pca_modes: int = 100, compute: bool = True, sample_name: str = 'sample', feature_name: str = 'feature', solver: str = 'auto', random_state: int | None = None, solver_kwargs: dict = {})
Methods
__init__(n_modes, tau_max[, center, ...])
check_needed_module(module) – Check if a necessary non-core dependency is available.
components() – Return the optimally persistent patterns (OPPs).
compute(**kwargs) – Compute and load delayed model results.
decorrelation_time() – Return the decorrelation time of the optimal persistence pattern (OPP).
deserialize(dt) – Deserialize the model and its preprocessors from a DataTree.
filter_patterns() – Return the filter patterns.
fit(X, dim[, weights]) – Fit the model to the input data.
fit_transform(data, dim[, weights]) – Fit the model to the input data and project the data onto the components.
get_params() – Get the model parameters.
get_serialization_attrs() – Get the attributes to serialize.
inverse_transform(scores[, normalized]) – Reconstruct the original data from transformed data.
load(path[, engine]) – Load a saved model.
save(path[, overwrite, save_data, engine]) – Save the model.
scores() – Return the time series of the OPPs.
serialize() – Serialize a complete model with its preprocessor.
transform(data[, normalized]) – Project data onto the components.
Attributes
extra_modules
uses_complex
- check_needed_module(module: str)
Check if a necessary non-core dependency is available.
- components() → DataArray | Dataset | list[DataArray | Dataset]
Return the optimally persistent patterns (OPPs).
- compute(**kwargs)
Compute and load delayed model results.
- Parameters:
**kwargs – Additional keyword arguments to pass to dask.compute().
- decorrelation_time() → DataArray
Return the decorrelation time of the optimal persistence pattern (OPP).
- classmethod deserialize(dt: DataTree) → Self
Deserialize the model and its preprocessors from a DataTree.
- filter_patterns() → DataArray | Dataset | list[DataArray | Dataset]
Return the filter patterns.
- fit(X: DataArray | Dataset | list[DataArray | Dataset], dim: Sequence[Hashable] | Hashable, weights: DataArray | Dataset | list[DataArray | Dataset] | None = None) → Self
Fit the model to the input data.
- Parameters:
X (DataObject) – Input data.
dim (Sequence[Hashable] | Hashable) – Specify the sample dimensions. The remaining dimensions will be treated as feature dimensions.
weights (DataObject | None, default=None) – Weighting factors for the input data.
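How dim partitions the dimensions of the input can be sketched with a small helper. The function `split_dims` below is hypothetical and only mimics the rule described above (named dimensions are samples, all others are features); it is not the library's implementation and, for simplicity, only handles string dimension names.

```python
def split_dims(all_dims, dim):
    """Split dimension names into (sample_dims, feature_dims).

    dim may be a single name or a sequence of names, mirroring the
    `Sequence[Hashable] | Hashable` annotation (restricted to strings here).
    """
    sample_dims = (dim,) if isinstance(dim, str) else tuple(dim)
    feature_dims = tuple(d for d in all_dims if d not in sample_dims)
    return sample_dims, feature_dims


# A single sample dimension; "lat" and "lon" become feature dimensions.
print(split_dims(("time", "lat", "lon"), "time"))

# ("time") is just the string "time", not a tuple, so this is the same call.
print(split_dims(("time", "lat", "lon"), ("time")))

# Several sample dimensions at once.
print(split_dims(("year", "month", "lat", "lon"), ("year", "month")))
```

Note that `dim=("time")` passes a plain string, since parentheses without a trailing comma do not create a tuple in Python.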
- fit_transform(data: DataArray | Dataset | list[DataArray | Dataset], dim: Sequence[Hashable] | Hashable, weights: DataArray | Dataset | list[DataArray | Dataset] | None = None, **kwargs) → DataArray
Fit the model to the input data and project the data onto the components.
- Parameters:
data (DataObject) – Input data.
dim (Sequence[Hashable] | Hashable) – Specify the sample dimensions. The remaining dimensions will be treated as feature dimensions.
weights (DataObject | None, default=None) – Weighting factors for the input data.
**kwargs – Additional keyword arguments to pass to the transform method.
- Returns:
projections – Projections of the data onto the components.
- Return type:
DataArray
- get_params() → dict[str, Any]
Get the model parameters.
- get_serialization_attrs() → dict
Get the attributes to serialize.
- inverse_transform(scores: DataArray, normalized: bool = False) → DataArray | Dataset | list[DataArray | Dataset]
Reconstruct the original data from transformed data.
- Parameters:
scores (DataArray) – Transformed data to be reconstructed. This could be a subset of the scores data of a fitted model, or unseen data. Must have a ‘mode’ dimension.
normalized (bool, default=False) – Whether the scores data have been normalized by the L2 norm.
- Returns:
data – Reconstructed data.
- Return type:
DataObject
- classmethod load(path: str, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs) → Self
Load a saved model.
- Parameters:
path (str) – Path to the saved model.
engine ({"zarr", "netcdf4", "h5netcdf"}, default="zarr") – Xarray backend engine to use for reading the saved model.
**kwargs – Additional keyword arguments to pass to open_datatree().
- Returns:
model – The loaded model.
- Return type:
BaseModel
- save(path: str, overwrite: bool = False, save_data: bool = False, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs)
Save the model.
- Parameters:
path (str) – Path to save the model.
overwrite (bool, default=False) – Whether to overwrite the path if it already exists. Ignored unless engine="zarr".
save_data (bool, default=False) – Whether to save the full input data along with the fitted components.
engine ({"zarr", "netcdf4", "h5netcdf"}, default="zarr") – Xarray backend engine to use for writing the saved model.
**kwargs – Additional keyword arguments to pass to DataTree.to_netcdf() or DataTree.to_zarr().
- scores() → DataArray
Return the time series of the OPPs.
The time series have maximal decorrelation time and are uncorrelated with each other.
- serialize() → DataTree
Serialize a complete model with its preprocessor.
- transform(data: DataArray | Dataset | list[DataArray | Dataset], normalized=False) → DataArray
Project data onto the components.
- Parameters:
data (DataObject) – Data to be transformed.
normalized (bool, default=False) – Whether to normalize the scores by the L2 norm.
- Returns:
projections – Projections of the data onto the components.
- Return type:
DataArray