ComplexCPCCARotator
- class ComplexCPCCARotator(**kwargs)
Rotate a solution obtained from xe.cross.ComplexCPCCA.
Rotate the obtained components and scores of a CPCCA model to increase interpretability. The algorithm here is based on the approach of Cheng & Dunkerton (1995) [1] and adapted to the CPCCA framework [2].
- Parameters:
n_modes (int, default=10) – Specify the number of modes to be rotated.
power (int, default=1) – Set the power for the Promax rotation. A power value of 1 results in a Varimax rotation.
max_iter (int, default=1000) – Determine the maximum number of iterations for the computation of the rotation matrix.
rtol (float, default=1e-8) – Define the relative tolerance required to achieve convergence and terminate the iterative process.
squared_loadings (bool, default=False) – Specify the method for constructing the combined vectors of loadings. If True, the combined vectors are loaded with the singular values (termed “squared loadings”), conserving the squared covariance under rotation. This allows estimation of mode importance after rotation. If False, the combined vectors are loaded with the square root of the singular values, following the method described by Cheng & Dunkerton.
compute (bool, default=True) – Whether to compute the rotation immediately.
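The statement that a power of 1 yields a Varimax rotation can be illustrated with the standard Promax target construction, in which each Varimax-rotated loading keeps its phase while its magnitude is raised to the chosen power. The sketch below is a plain-Python illustration under that assumption, not the library's implementation; promax_target and the loading values are hypothetical.

```python
def promax_target(loadings, power=1):
    """Build a Promax target from (complex) Varimax-rotated loadings.

    Each entry keeps its phase/sign while its magnitude is raised to `power`.
    """
    return [[value * abs(value) ** (power - 1) for value in row] for row in loadings]


# Hypothetical 2 x 2 complex loading matrix (rows: features, columns: modes).
loadings = [[0.8 + 0.6j, 0.1 + 0.0j], [0.0 + 0.2j, 0.6 - 0.8j]]

# power=1 leaves the loadings unchanged, so the Promax step adds nothing
# on top of the Varimax solution.
same = promax_target(loadings, power=1)

# power=2 shrinks small loadings relative to large ones.
sharper = promax_target(loadings, power=2)
```

With a power above 1, small loadings are pushed towards zero faster than large ones, which is what makes the subsequent oblique fit favour a simpler structure.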
References
Examples
Perform a CPCCA analysis:
>>> model = ComplexCPCCA(n_modes=10)
>>> model.fit(X, Y, dim='time')
Then, apply Varimax rotation to the first 5 components and scores:
>>> rotator = ComplexCPCCARotator(n_modes=5)
>>> rotator.fit(model)
Retrieve the rotated components and scores:
>>> rotator.components()
>>> rotator.scores()
- __init__(**kwargs)
Methods
__init__(**kwargs)
components([normalized]) – Get the components of the model.
components_amplitude([normalized]) – Get the amplitude of the components.
components_phase([normalized]) – Get the phase of the components.
compute(**kwargs) – Compute and load delayed model results.
correlation_coefficients_X() – Get the correlation coefficients for the scores of \(X\).
correlation_coefficients_Y() – Get the correlation coefficients for the scores of \(Y\).
cross_correlation_coefficients() – Get the cross-correlation coefficients.
deserialize(dt) – Deserialize the model and its preprocessors from a DataTree.
fit(model) – Rotate the solution obtained from xe.cross.CPCCA.
fraction_variance_X_explained_by_X() – Get the fraction of variance explained (FVE X).
fraction_variance_Y_explained_by_X() – Get the fraction of variance explained (FVE YX).
fraction_variance_Y_explained_by_Y() – Get the fraction of variance explained (FVE Y).
get_params() – Get the model parameters.
get_serialization_attrs() – Get the attributes needed to serialize the model.
heterogeneous_patterns([correction, alpha]) – Get the heterogeneous correlation patterns.
homogeneous_patterns([correction, alpha]) – Get the homogeneous correlation patterns.
inverse_transform([X, Y]) – Reconstruct the original data from transformed data.
load(path[, engine]) – Load a saved model.
predict(X) – Predict Y from X.
save(path[, overwrite, save_data, engine]) – Save the model.
scores([normalized]) – Get the scores of the model.
scores_amplitude([normalized]) – Get the amplitude of the scores.
scores_phase([normalized]) – Get the phase of the scores.
serialize() – Serialize a complete model with its preprocessor.
squared_covariance_fraction() – Get the squared covariance fraction (SCF).
transform([X, Y, normalized]) – Transform the data.
- components(normalized=True) -> tuple[DataArray | Dataset | list[DataArray | Dataset], DataArray | Dataset | list[DataArray | Dataset]]
Get the components of the model.
The components may be referred to differently depending on the model type. Common terms include canonical vectors, singular vectors, loadings or spatial patterns.
- Parameters:
normalized (bool, default=True) – Whether to return L2 normalized components.
- Returns:
Components of X and Y.
- Return type:
tuple[DataObject, DataObject]
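As an illustration of the normalized option, the sketch below assumes that "L2 normalized" means each component is scaled to unit Euclidean norm. It uses plain Python with hypothetical values; l2_normalize is an illustrative helper, not part of the library.

```python
import math


def l2_normalize(component):
    """Scale a (complex) component so that its Euclidean norm equals 1."""
    norm = math.sqrt(sum(abs(entry) ** 2 for entry in component))
    return [entry / norm for entry in component]


# Hypothetical complex component with three entries; its norm is 13.
component = [3 + 4j, 0j, 12 + 0j]
unit = l2_normalize(component)

# Norm of the rescaled component, expected to be 1 up to rounding.
unit_norm = math.sqrt(sum(abs(entry) ** 2 for entry in unit))
```

Normalization changes only the overall scale of a component; the relative amplitudes and the phases of its entries are unaffected.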
- components_amplitude(normalized=True) -> tuple[DataArray | Dataset | list[DataArray | Dataset], DataArray | Dataset | list[DataArray | Dataset]]
Get the amplitude of the components.
The amplitudes of the components are defined as
\[A_{x, ij} = |p_{x, ij}|\]
\[A_{y, ij} = |p_{y, ij}|\]
where \(p_{ij}\) is the \(i\)-th entry of the \(j\)-th component and \(|\cdot|\) denotes the absolute value.
- Returns:
Component amplitudes of \(X\) and \(Y\).
- Return type:
tuple[DataObject, DataObject]
- components_phase(normalized=True) -> tuple[DataArray | Dataset | list[DataArray | Dataset], DataArray | Dataset | list[DataArray | Dataset]]
Get the phase of the components.
The phases of the components are defined as
\[\phi_{x, ij} = \arg(p_{x, ij})\]
\[\phi_{y, ij} = \arg(p_{y, ij})\]
where \(p_{ij}\) is the \(i\)-th entry of the \(j\)-th component and \(\arg(\cdot)\) denotes the argument of a complex number.
- Returns:
Component phases of \(X\) and \(Y\).
- Return type:
tuple[DataObject, DataObject]
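The amplitude and phase definitions can be checked on a handful of complex numbers. The following is a minimal sketch using only the standard library; the component entries are hypothetical.

```python
import cmath

# Hypothetical entries p_ij of one complex component.
component = [1 + 1j, -2 + 0j, 0 - 3j]

# A_ij = |p_ij|
amplitude = [abs(p) for p in component]

# phi_ij = arg(p_ij), in radians within (-pi, pi]
phase = [cmath.phase(p) for p in component]

# Amplitude and phase together recover the original complex entries.
rebuilt = [cmath.rect(a, ph) for a, ph in zip(amplitude, phase)]
```

Amplitude and phase are the polar form of each entry, so no information is lost by inspecting the two fields separately.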
- compute(**kwargs)
Compute and load delayed model results.
- Parameters:
**kwargs – Additional keyword arguments to pass to dask.compute().
- correlation_coefficients_X()
Get the correlation coefficients for the scores of \(X\).
The correlation coefficients of the scores of \(X\) are given by:
\[c_{x, ij} = \text{corr} \left(\mathbf{r}_{x, i}, \mathbf{r}_{x, j} \right)\]
where \(\mathbf{r}_{x, i}\) and \(\mathbf{r}_{x, j}\) are the \(i\)-th and \(j\)-th scores of \(X\).
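For complex scores, a correlation of this kind can be sketched as a Pearson correlation in which one series is conjugated. The example below is a plain-Python illustration with hypothetical score series; which argument is conjugated is a convention chosen here and may differ from the library's.

```python
import math


def complex_corr(a, b):
    """Pearson correlation of two (possibly complex) series of equal length."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    # The second series is conjugated so that corr(a, a) is real and equal to 1.
    cov = sum(x * y.conjugate() for x, y in zip(da, db))
    norm_a = math.sqrt(sum(abs(x) ** 2 for x in da))
    norm_b = math.sqrt(sum(abs(y) ** 2 for y in db))
    return cov / (norm_a * norm_b)


# Two hypothetical complex score series.
r1 = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]
r2 = [0 + 1j, -1 + 0j, 0 - 1j, 1 + 0j]  # r1 multiplied by 1j (90 degree phase shift)

c11 = complex_corr(r1, r1)
c12 = complex_corr(r1, r2)
```

Here the two series differ only by a constant phase shift, so the correlation has unit magnitude and its argument encodes the shift.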
- correlation_coefficients_Y()
Get the correlation coefficients for the scores of \(Y\).
The correlation coefficients of the scores of \(Y\) are given by:
\[c_{y, ij} = \text{corr} \left(\mathbf{r}_{y, i}, \mathbf{r}_{y, j} \right)\]
where \(\mathbf{r}_{y, i}\) and \(\mathbf{r}_{y, j}\) are the \(i\)-th and \(j\)-th scores of \(Y\).
- cross_correlation_coefficients()
Get the cross-correlation coefficients.
The cross-correlation coefficients between the scores of \(X\) and \(Y\) are computed as:
\[c_{xy, i} = \text{corr} \left(\mathbf{r}_{x, i}, \mathbf{r}_{y, i} \right)\]
where \(\mathbf{r}_{x, i}\) and \(\mathbf{r}_{y, i}\) are the \(i\)-th scores of \(X\) and \(Y\), respectively.
Notes
When \(\alpha=0\), the cross-correlation coefficients are equivalent to the canonical correlation coefficients.
- classmethod deserialize(dt: DataTree) -> Self
Deserialize the model and its preprocessors from a DataTree.
- fit(model: CPCCA) -> Self
Rotate the solution obtained from xe.cross.CPCCA.
- Parameters:
model (xe.cross.CPCCA) – The CPCCA model to be rotated.
- fraction_variance_X_explained_by_X()
Get the fraction of variance explained (FVE X).
The FVE X is the fraction of variance in \(X\) explained by the scores of \(X\). It is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):
\[FVE_{X|X,i} = 1 - \frac{\|\mathbf{d}_{X,i}\|_F^2}{\|X\|_F^2}\]
where \(\mathbf{d}_{X,i}\) are the residuals of the input data \(X\) after reconstruction by the \(i\)-th scores of \(X\).
References
- Swenson, E. Continuum Power CCA: A Unified Approach for Isolating Coupled Modes. Journal of Climate 28, 1016–1030 (2015).
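The FVE formula can be traced on a toy example. The sketch below assumes real-valued data and that "reconstruction by the scores" is a least-squares fit of each data column on a single score series; both are simplifying assumptions for illustration, and the helper names and values are hypothetical.

```python
def frobenius_sq(matrix):
    """Squared Frobenius norm of a real matrix given as a list of rows."""
    return sum(value ** 2 for row in matrix for value in row)


def residual_after_score(data, score):
    """Residual of `data` after a least-squares fit on a single score series."""
    n_features = len(data[0])
    score_sq = sum(s ** 2 for s in score)
    # Regression coefficient of each data column on the score.
    coeffs = [
        sum(s * row[j] for s, row in zip(score, data)) / score_sq
        for j in range(n_features)
    ]
    return [
        [row[j] - s * coeffs[j] for j in range(n_features)]
        for s, row in zip(score, data)
    ]


# Hypothetical data (2 samples x 2 features) and one score series.
X = [[2.0, 0.0], [0.0, 1.0]]
score = [1.0, 0.0]

d = residual_after_score(X, score)
fve = 1 - frobenius_sq(d) / frobenius_sq(X)
```

In this toy case the score captures the first sample entirely, leaving a residual with squared norm 1 out of a total of 5, so the score explains four fifths of the variance.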
- fraction_variance_Y_explained_by_X() -> DataArray
Get the fraction of variance explained (FVE YX).
The FVE YX is the fraction of variance in \(Y\) explained by the scores of \(X\). It is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):
\[FVE_{Y|X,i} = 1 - \frac{\|(X^TX)^{-1/2} \mathbf{d}_{X,i}^T \mathbf{d}_{Y,i}\|_F^2}{\|(X^TX)^{-1/2} X^TY\|_F^2}\]
where \(\mathbf{d}_{X,i}\) and \(\mathbf{d}_{Y,i}\) are the residuals of the input data \(X\) and \(Y\) after reconstruction by the \(i\)-th scores of \(X\) and \(Y\), respectively.
References
- Swenson, E. Continuum Power CCA: A Unified Approach for Isolating Coupled Modes. Journal of Climate 28, 1016–1030 (2015).
- fraction_variance_Y_explained_by_Y()
Get the fraction of variance explained (FVE Y).
The FVE Y is the fraction of variance in \(Y\) explained by the scores of \(Y\). It is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):
\[FVE_{Y|Y,i} = 1 - \frac{\|\mathbf{d}_{Y,i}\|_F^2}{\|Y\|_F^2}\]
where \(\mathbf{d}_{Y,i}\) are the residuals of the input data \(Y\) after reconstruction by the \(i\)-th scores of \(Y\).
References
- Swenson, E. Continuum Power CCA: A Unified Approach for Isolating Coupled Modes. Journal of Climate 28, 1016–1030 (2015).
- get_params() -> dict[str, Any]
Get the model parameters.
- get_serialization_attrs() -> dict
Get the attributes needed to serialize the model.
- Returns:
Attributes needed to serialize the model.
- Return type:
dict
- heterogeneous_patterns(correction=None, alpha=0.05)
Get the heterogeneous correlation patterns.
The heterogeneous patterns are the correlation coefficients between the input data and the scores of the other field:
\[G_{X, i} = \text{corr} \left(X, \mathbf{r}_{y,i} \right)\]
\[G_{Y, i} = \text{corr} \left(Y, \mathbf{r}_{x,i} \right)\]
where \(X\) and \(Y\) are the input data, and \(\mathbf{r}_{x,i}\) and \(\mathbf{r}_{y,i}\) are the \(i\)-th scores of \(X\) and \(Y\), respectively.
- Parameters:
correction (str, default=None) – Method to apply a multiple testing correction. If None, no correction is applied. Available methods are:
    - bonferroni : one-step correction
    - sidak : one-step correction
    - holm-sidak : step-down method using Sidak adjustments
    - holm : step-down method using Bonferroni adjustments
    - simes-hochberg : step-up method (independent)
    - hommel : closed method based on Simes tests (non-negative)
    - fdr_bh : Benjamini/Hochberg (non-negative)
    - fdr_by : Benjamini/Yekutieli (negative)
    - fdr_tsbh : two stage fdr correction (non-negative)
    - fdr_tsbky : two stage fdr correction (non-negative)
alpha (float, default=0.05) – The desired family-wise error rate. Not used if correction is None.
- Returns:
tuple[DataObject, DataObject] – Heterogeneous correlation patterns of X and Y.
tuple[DataObject, DataObject] – p-values of the heterogeneous correlation patterns of X and Y.
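To make the difference between a one-step and a step-down correction concrete, the sketch below implements the textbook Bonferroni and Holm procedures in plain Python. It illustrates what the bonferroni and holm options refer to, and is not the code the library runs; the function names and p-values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """One-step Bonferroni: compare every p-value with alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Step-down Holm: test sorted p-values against alpha / (m - k)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for k, idx in enumerate(order):
        if p_values[idx] > alpha / (m - k):
            break  # stop at the first p-value that is not rejected
        reject[idx] = True
    return reject


# Hypothetical p-values for three grid points.
p_values = [0.01, 0.02, 0.03]
bonf = bonferroni_reject(p_values)  # only the smallest p-value passes alpha / 3
holm = holm_reject(p_values)        # thresholds relax to alpha / 3, alpha / 2, alpha
```

Both procedures control the family-wise error rate at alpha, but the step-down variant is never less powerful than the one-step variant, as the example shows.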
- homogeneous_patterns(correction=None, alpha=0.05)
Get the homogeneous correlation patterns.
The homogeneous correlation patterns are the correlation coefficients between the input data and the scores of the same field. They are defined as:
\[H_{X, i} = \text{corr} \left(X, \mathbf{r}_{x,i} \right)\]
\[H_{Y, i} = \text{corr} \left(Y, \mathbf{r}_{y,i} \right)\]
where \(X\) and \(Y\) are the input data, and \(\mathbf{r}_{x,i}\) and \(\mathbf{r}_{y,i}\) are the \(i\)-th scores of \(X\) and \(Y\), respectively.
- Parameters:
correction (str, default=None) – Method to apply a multiple testing correction. If None, no correction is applied. Available methods are:
    - bonferroni : one-step correction
    - sidak : one-step correction
    - holm-sidak : step-down method using Sidak adjustments
    - holm : step-down method using Bonferroni adjustments
    - simes-hochberg : step-up method (independent)
    - hommel : closed method based on Simes tests (non-negative)
    - fdr_bh : Benjamini/Hochberg (non-negative)
    - fdr_by : Benjamini/Yekutieli (negative)
    - fdr_tsbh : two stage fdr correction (non-negative)
    - fdr_tsbky : two stage fdr correction (non-negative)
alpha (float, default=0.05) – The desired family-wise error rate. Not used if correction is None.
- Returns:
tuple[DataObject, DataObject] – Homogeneous correlation patterns of X and Y.
tuple[DataObject, DataObject] – p-values of the homogeneous correlation patterns of X and Y.
- inverse_transform(X: DataArray | None = None, Y: DataArray | None = None) -> Sequence[DataArray | Dataset | list[DataArray | Dataset]] | DataArray | Dataset | list[DataArray | Dataset]
Reconstruct the original data from transformed data.
- Parameters:
X (DataArray | None) – Transformed X data to be reconstructed. At least one of X and Y must be provided.
Y (DataArray | None) – Transformed Y data to be reconstructed. At least one of X and Y must be provided.
- Returns:
Reconstructed data.
- Return type:
Sequence[DataObject] | DataObject
- classmethod load(path: str, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs) -> Self
Load a saved model.
- Parameters:
path (str) – Path to the saved model.
engine ({"zarr", "netcdf4", "h5netcdf"}, default="zarr") – Xarray backend engine to use for reading the saved model.
**kwargs – Additional keyword arguments to pass to open_datatree().
- Returns:
model – The loaded model.
- Return type:
BaseModel
- predict(X: DataArray | Dataset | list[DataArray | Dataset]) -> DataArray
Predict Y from X.
- Parameters:
X (DataObject) – Data to be used for prediction.
- Returns:
Predicted data in transformed space.
- Return type:
DataArray
- save(path: str, overwrite: bool = False, save_data: bool = False, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs)
Save the model.
- Parameters:
path (str) – Path to save the model.
overwrite (bool, default=False) – Whether or not to overwrite the existing path if it already exists. Ignored unless engine="zarr".
save_data (bool, default=False) – Whether or not to save the full input data along with the fitted components.
engine ({"zarr", "netcdf4", "h5netcdf"}, default="zarr") – Xarray backend engine to use for writing the saved model.
**kwargs – Additional keyword arguments to pass to DataTree.to_netcdf() or DataTree.to_zarr().
- scores(normalized=False) -> tuple[DataArray, DataArray]
Get the scores of the model.
The component scores may be referred to differently depending on the model type. Common terms include canonical variates, expansion coefficients, principal component (scores) or temporal patterns.
- Parameters:
normalized (bool, default=False) – Whether to return L2 normalized scores.
- Returns:
Scores of X and Y.
- Return type:
tuple[DataArray, DataArray]
- scores_amplitude(normalized=False) -> tuple[DataArray, DataArray]
Get the amplitude of the scores.
The amplitudes of the scores are defined as
\[A_{x, ij} = |r_{x, ij}|\]
\[A_{y, ij} = |r_{y, ij}|\]
where \(r_{ij}\) is the \(i\)-th entry of the \(j\)-th score and \(|\cdot|\) denotes the absolute value.
- Returns:
Score amplitudes of \(X\) and \(Y\).
- Return type:
tuple[DataArray, DataArray]
- scores_phase(normalized=False) -> tuple[DataArray, DataArray]
Get the phase of the scores.
The phases of the scores are defined as
\[\phi_{x, ij} = \arg(r_{x, ij})\]
\[\phi_{y, ij} = \arg(r_{y, ij})\]
where \(r_{ij}\) is the \(i\)-th entry of the \(j\)-th score and \(\arg(\cdot)\) denotes the argument of a complex number.
- Returns:
Score phases of \(X\) and \(Y\).
- Return type:
tuple[DataArray, DataArray]
- serialize() -> DataTree
Serialize a complete model with its preprocessor.
- squared_covariance_fraction()
Get the squared covariance fraction (SCF).
The SCF is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):
\[SCF_{i} = 1 - \frac{\|\mathbf{d}_{X,i}^T \mathbf{d}_{Y,i}\|_F^2}{\|X^TY\|_F^2}\]
where \(\mathbf{d}_{X,i}\) and \(\mathbf{d}_{Y,i}\) are the residuals of the input data \(X\) and \(Y\) after reconstruction by the \(i\)-th scores of \(X\) and \(Y\), respectively.
References
- Swenson, E. Continuum Power CCA: A Unified Approach for Isolating Coupled Modes. Journal of Climate 28, 1016–1030 (2015).
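The SCF expression can be evaluated by hand on small matrices. The sketch below uses real-valued toy inputs and takes the residuals as given rather than deriving them from a fitted model; all values and helper names are hypothetical and serve only to show how the two Frobenius norms enter the ratio.

```python
def matmul_t(a, b):
    """Compute a^T b for real matrices given as lists of rows."""
    return [
        [sum(row_a[i] * row_b[j] for row_a, row_b in zip(a, b)) for j in range(len(b[0]))]
        for i in range(len(a[0]))
    ]


def frobenius_sq(matrix):
    """Squared Frobenius norm of a real matrix given as a list of rows."""
    return sum(value ** 2 for row in matrix for value in row)


# Hypothetical input data (2 samples x 2 features each).
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[2.0, 0.0], [0.0, 1.0]]

# Hypothetical residuals after removing the part captured by one mode
# (here: everything contributed by the first sample).
d_X = [[0.0, 0.0], [0.0, 1.0]]
d_Y = [[0.0, 0.0], [0.0, 1.0]]

scf = 1 - frobenius_sq(matmul_t(d_X, d_Y)) / frobenius_sq(matmul_t(X, Y))
```

The total squared covariance is 5 and the residual squared covariance is 1, so this toy mode accounts for four fifths of the squared covariance.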
- transform(X: DataArray | Dataset | list[DataArray | Dataset] | None = None, Y: DataArray | Dataset | list[DataArray | Dataset] | None = None, normalized: bool = False) -> DataArray | list[DataArray]
Transform the data.
- Parameters:
X (DataObject | None) – X data to be transformed. At least one of X and Y must be provided.
Y (DataObject | None) – Y data to be transformed. At least one of X and Y must be provided.
normalized (bool, default=False) – Whether to return L2 normalized scores.
- Returns:
Transformed data.
- Return type:
Sequence[DataArray] | DataArray