CPCCARotator

class CPCCARotator(n_modes: int = 10, power: int = 1, max_iter: int | None = None, rtol: float = 1e-08, compute: bool = True)

Rotate a solution obtained from xe.cross.CPCCA.

Rotate the obtained components and scores of a CPCCA model to increase interpretability. The algorithm here is based on the approach of Cheng & Dunkerton (1995) [1] and adapted to the CPCCA framework [2].

Parameters:

n_modes (int, default=10) – Specify the number of modes to be rotated.
power (int, default=1) – Set the power for the Promax rotation. A power value of 1 results in a Varimax rotation.
max_iter (int or None, default=None) – Determine the maximum number of iterations for the computation of the rotation matrix. If not specified, defaults to 1000 if compute=True and 100 if compute=False, since a lazy computation cannot be terminated early based on rtol.
rtol (float, default=1e-8) – Define the relative tolerance required to achieve convergence and terminate the iterative process.
compute (bool, default=True) – Whether to compute the rotation immediately.
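The interplay of max_iter, rtol and compute described above can be sketched with a plain NumPy Varimax iteration. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names resolve_max_iter and varimax are hypothetical, the update is the common SVD-based Varimax scheme, and the compute flag here only selects the default iteration cap.

```python
import numpy as np


def resolve_max_iter(max_iter, compute):
    """Mirror the documented default: 1000 when eager, 100 when lazy."""
    if max_iter is not None:
        return max_iter
    return 1000 if compute else 100


def varimax(loadings, max_iter=None, rtol=1e-8, compute=True):
    """Plain Varimax via repeated SVD updates (illustrative only)."""
    n_samples, n_modes = loadings.shape
    rotation = np.eye(n_modes)
    objective = 0.0
    for _ in range(resolve_max_iter(max_iter, compute)):
        rotated = loadings @ rotation
        col_norms = (rotated**2).sum(axis=0)
        # Gradient of the Varimax criterion with respect to the rotation.
        gradient = loadings.T @ (rotated**3 - rotated * col_norms / n_samples)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # closest orthogonal matrix to the gradient
        new_objective = s.sum()
        # Stop once the relative improvement falls below rtol.
        if new_objective < objective * (1 + rtol):
            break
        objective = new_objective
    return loadings @ rotation, rotation


rng = np.random.default_rng(0)
loadings = rng.normal(size=(50, 3))  # hypothetical unrotated loadings
rotated, rotation = varimax(loadings)
```

The rotation matrix stays orthogonal at every step, so the total squared loading is preserved while it is redistributed across modes.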

References

Examples

Perform a CPCCA analysis:

>>> model = CPCCA(n_modes=10)
>>> model.fit(X, Y, dim='time')

Then, apply Varimax rotation to the first 5 components and scores:

>>> rotator = CPCCARotator(n_modes=5)
>>> rotator.fit(model)

Retrieve the rotated components and scores:

>>> rotator.components()
>>> rotator.scores()

__init__(n_modes: int = 10, power: int = 1, max_iter: int | None = None, rtol: float = 1e-08, compute: bool = True)

Methods

`__init__`([n_modes, power, max_iter, rtol, ...])
`components`([normalized])	Get the components of the model.
`compute`(**kwargs)	Compute and load delayed model results.
`correlation_coefficients_X`()	Get the correlation coefficients for the scores of \(X\).
`correlation_coefficients_Y`()	Get the correlation coefficients for the scores of \(Y\).
`cross_correlation_coefficients`()	Get the cross-correlation coefficients.
`deserialize`(dt)	Deserialize the model and its preprocessors from a DataTree.
`fit`(model)	Rotate the solution obtained from `xe.cross.CPCCA`.
`fraction_variance_X_explained_by_X`()	Get the fraction of variance explained (FVE X).
`fraction_variance_Y_explained_by_X`()	Get the fraction of variance explained (FVE YX).
`fraction_variance_Y_explained_by_Y`()	Get the fraction of variance explained (FVE Y).
`get_params`()	Get the model parameters.
`get_serialization_attrs`()	Get the attributes needed to serialize the model.
`heterogeneous_patterns`([correction, alpha])	Get the heterogeneous correlation patterns.
`homogeneous_patterns`([correction, alpha])	Get the homogeneous correlation patterns.
`inverse_transform`([X, Y])	Reconstruct the original data from transformed data.
`load`(path[, engine])	Load a saved model.
`predict`(X)	Predict Y from X.
`save`(path[, overwrite, save_data, engine])	Save the model.
`scores`([normalized])	Get the scores of the model.
`serialize`()	Serialize a complete model with its preprocessor.
`squared_covariance_fraction`()	Get the squared covariance fraction (SCF).
`transform`([X, Y, normalized])	Transform the data.

components([normalized])

Get the components of the model.

The components may be referred to differently depending on the model type. Common terms include canonical vectors, singular vectors, loadings or spatial patterns.

Parameters: normalized (bool, default=True) – Whether to return L2 normalized components.
Returns: Components of X and Y.
Return type: tuple[DataObject, DataObject]

compute(**kwargs)

Compute and load delayed model results.

Parameters: **kwargs – Additional keyword arguments to pass to dask.compute().

correlation_coefficients_X()

Get the correlation coefficients for the scores of \(X\).

The correlation coefficients of the scores of \(X\) are given by:

\[c_{x, ij} = \text{corr} \left(\mathbf{r}_{x, i}, \mathbf{r}_{x, j} \right)\]

where \(\mathbf{r}_{x, i}\) and \(\mathbf{r}_{x, j}\) are the \(i\)th and \(j\)th scores of \(X\).
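As a rough illustration of this definition, the matrix of pairwise correlations can be computed with NumPy. The array scores_x is hypothetical stand-in data with samples along rows and modes along columns; the method itself returns labelled xarray objects.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical scores of X: 100 samples (rows) by 3 modes (columns).
scores_x = rng.normal(size=(100, 3))

# np.corrcoef treats rows as variables by default, so pass rowvar=False
# to correlate the columns (modes) with each other.
corr_x = np.corrcoef(scores_x, rowvar=False)
```

Entry corr_x[i, j] corresponds to \(c_{x, ij}\). Off-diagonal entries are of interest after rotation, since rotated scores are not guaranteed to remain uncorrelated.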

correlation_coefficients_Y()

Get the correlation coefficients for the scores of \(Y\).

The correlation coefficients of the scores of \(Y\) are given by:

\[c_{y, ij} = \text{corr} \left(\mathbf{r}_{y, i}, \mathbf{r}_{y, j} \right)\]

where \(\mathbf{r}_{y, i}\) and \(\mathbf{r}_{y, j}\) are the \(i\)th and \(j\)th scores of \(Y\).

cross_correlation_coefficients()

Get the cross-correlation coefficients.

The cross-correlation coefficients between the scores of \(X\) and \(Y\) are computed as:

\[c_{xy, i} = \text{corr} \left(\mathbf{r}_{x, i}, \mathbf{r}_{y, i} \right)\]

where \(\mathbf{r}_{x, i}\) and \(\mathbf{r}_{y, i}\) are the \(i\)th scores of \(X\) and \(Y\), respectively.

Notes

When \(\alpha=0\), the cross-correlation coefficients are equivalent to the canonical correlation coefficients.
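The per-mode pairing in the formula above can be sketched as follows. The function cross_correlation and the synthetic score arrays are hypothetical and only illustrate the definition; they are not part of the library.

```python
import numpy as np


def cross_correlation(scores_x, scores_y):
    """Correlate the i-th column of scores_x with the i-th column of scores_y."""
    x = scores_x - scores_x.mean(axis=0)
    y = scores_y - scores_y.mean(axis=0)
    return (x * y).sum(axis=0) / np.sqrt((x**2).sum(axis=0) * (y**2).sum(axis=0))


rng = np.random.default_rng(1)
scores_x = rng.normal(size=(200, 2))
# Make the Y scores a noisy copy of the X scores so the modes are coupled.
scores_y = scores_x + 0.5 * rng.normal(size=(200, 2))
c_xy = cross_correlation(scores_x, scores_y)
```

Unlike the full correlation matrix of one field, this yields one coefficient per mode.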

classmethod deserialize(dt: DataTree) → Self

Deserialize the model and its preprocessors from a DataTree.

fit(model: CPCCA) → Self

Rotate the solution obtained from xe.cross.CPCCA.

Parameters: model (xe.cross.CPCCA) – The CPCCA model to be rotated.

fraction_variance_X_explained_by_X()

Get the fraction of variance explained (FVE X).

The FVE X is the fraction of variance in \(X\) explained by the scores of \(X\). It is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):

\[FVE_{X|X,i} = 1 - \frac{\|\mathbf{d}_{X,i}\|_F^2}{\|X\|_F^2}\]

where \(\mathbf{d}_{X,i}\) are the residuals of the input data \(X\) after reconstruction by the \(i\)th scores of \(X\).
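The formula can be illustrated with a small NumPy sketch. It assumes that "reconstruction" means an unweighted least-squares projection of the data onto a single score vector; the library may apply weights or a different reconstruction, and the function name fraction_variance_explained is hypothetical.

```python
import numpy as np


def fraction_variance_explained(data, score):
    """FVE of `data` (samples x features) by a single score vector."""
    # Least-squares pattern for this score, then the rank-1 reconstruction.
    pattern = score @ data / (score @ score)
    residual = data - np.outer(score, pattern)
    # np.linalg.norm of a 2-D array defaults to the Frobenius norm.
    return 1.0 - np.linalg.norm(residual) ** 2 / np.linalg.norm(data) ** 2


rng = np.random.default_rng(2)
score = rng.normal(size=30)
pattern = rng.normal(size=4)
rank_one = np.outer(score, pattern)             # fully explained by the score
noisy = rank_one + rng.normal(size=(30, 4))     # only partly explained

fve_exact = fraction_variance_explained(rank_one, score)
fve_noisy = fraction_variance_explained(noisy, score)
```

Under this assumption the value lies between 0 and 1, reaching 1 only when the score reproduces the data exactly.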

References

Swenson, E. Continuum Power CCA: A Unified Approach for Isolating Coupled Modes. Journal of Climate 28, 1016–1030 (2015).

fraction_variance_Y_explained_by_X() → DataArray

Get the fraction of variance explained (FVE YX).

The FVE YX is the fraction of variance in \(Y\) explained by the scores of \(X\). It is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):

\[FVE_{Y|X,i} = 1 - \frac{\|(X^TX)^{-1/2} \mathbf{d}_{X,i}^T \mathbf{d}_{Y,i}\|_F^2}{\|(X^TX)^{-1/2} X^TY\|_F^2}\]

where \(\mathbf{d}_{X,i}\) and \(\mathbf{d}_{Y,i}\) are the residuals of the input data \(X\) and \(Y\) after reconstruction by the \(i\)th scores of \(X\) and \(Y\), respectively.

References

Swenson, E. Continuum Power CCA: A Unified Approach for Isolating Coupled Modes. Journal of Climate 28, 1016–1030 (2015).

fraction_variance_Y_explained_by_Y()

Get the fraction of variance explained (FVE Y).

The FVE Y is the fraction of variance in \(Y\) explained by the scores of \(Y\). It is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):

\[FVE_{Y|Y,i} = 1 - \frac{\|\mathbf{d}_{Y,i}\|_F^2}{\|Y\|_F^2}\]

where \(\mathbf{d}_{Y,i}\) are the residuals of the input data \(Y\) after reconstruction by the \(i\)th scores of \(Y\).

References

Swenson, E. Continuum Power CCA: A Unified Approach for Isolating Coupled Modes. Journal of Climate 28, 1016–1030 (2015).

get_params() → dict[str, Any]

Get the model parameters.

get_serialization_attrs() → dict

Get the attributes needed to serialize the model.

Returns: Attributes needed to serialize the model.
Return type: dict

heterogeneous_patterns(correction=None, alpha=0.05)

Get the heterogeneous correlation patterns.

The heterogeneous patterns are the correlation coefficients between the input data and the scores of the other field:

\[G_{X, i} = \text{corr} \left(X, \mathbf{r}_{y,i} \right)\]

\[G_{Y, i} = \text{corr} \left(Y, \mathbf{r}_{x,i} \right)\]

where \(X\) and \(Y\) are the input data, and \(\mathbf{r}_{x,i}\) and \(\mathbf{r}_{y,i}\) are the \(i\)th scores of \(X\) and \(Y\), respectively.

Parameters:

correction (str, default=None) – Method to apply a multiple testing correction. If None, no correction is applied. Available methods are:

- bonferroni : one-step correction
- sidak : one-step correction
- holm-sidak : step-down method using Sidak adjustments
- holm : step-down method using Bonferroni adjustments
- simes-hochberg : step-up method (independent)
- hommel : closed method based on Simes tests (non-negative)
- fdr_bh : Benjamini/Hochberg (non-negative)
- fdr_by : Benjamini/Yekutieli (negative)
- fdr_tsbh : two-stage FDR correction (non-negative)
- fdr_tsbky : two-stage FDR correction (non-negative)

alpha (float, default=0.05) – The desired family-wise error rate. Not used if correction is None.

Returns:

tuple[DataObject, DataObject] – Heterogeneous correlation patterns of X and Y.
tuple[DataObject, DataObject] – p-values of the heterogeneous correlation patterns of X and Y.
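To make the "other field" pairing concrete, here is a minimal NumPy sketch of the two correlation maps. The helper correlation_pattern and the synthetic arrays are hypothetical; p-values and any multiple testing correction are left out.

```python
import numpy as np


def correlation_pattern(data, score):
    """Pearson correlation of every column of `data` with `score`."""
    d = data - data.mean(axis=0)
    s = score - score.mean()
    return d.T @ s / np.sqrt((d**2).sum(axis=0) * (s @ s))


rng = np.random.default_rng(4)
X = rng.normal(size=(100, 5))       # hypothetical field X: samples x features
Y = rng.normal(size=(100, 3))       # hypothetical field Y: samples x features
score_x = rng.normal(size=100)      # hypothetical i-th score of X
score_y = rng.normal(size=100)      # hypothetical i-th score of Y

# Heterogeneous patterns pair each field with the score of the other field.
G_X = correlation_pattern(X, score_y)
G_Y = correlation_pattern(Y, score_x)
```

Swapping the score arguments, so that each field is correlated with its own score, gives the homogeneous counterpart.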

homogeneous_patterns(correction=None, alpha=0.05)

Get the homogeneous correlation patterns.

The homogeneous correlation patterns are the correlation coefficients between the input data and the scores. They are defined as:

\[H_{X, i} = \text{corr} \left(X, \mathbf{r}_{x,i} \right)\]

\[H_{Y, i} = \text{corr} \left(Y, \mathbf{r}_{y,i} \right)\]

where \(X\) and \(Y\) are the input data, and \(\mathbf{r}_{x,i}\) and \(\mathbf{r}_{y,i}\) are the \(i\)th scores of \(X\) and \(Y\), respectively.

Parameters:

correction (str, default=None) – Method to apply a multiple testing correction. If None, no correction is applied. Available methods are:

- bonferroni : one-step correction
- sidak : one-step correction
- holm-sidak : step-down method using Sidak adjustments
- holm : step-down method using Bonferroni adjustments
- simes-hochberg : step-up method (independent)
- hommel : closed method based on Simes tests (non-negative)
- fdr_bh : Benjamini/Hochberg (non-negative)
- fdr_by : Benjamini/Yekutieli (negative)
- fdr_tsbh : two-stage FDR correction (non-negative)
- fdr_tsbky : two-stage FDR correction (non-negative)

alpha (float, default=0.05) – The desired family-wise error rate. Not used if correction is None.

Returns:

tuple[DataObject, DataObject] – Homogeneous correlation patterns of X and Y.
tuple[DataObject, DataObject] – p-values of the homogeneous correlation patterns of X and Y.
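The effect of the correction argument on the returned p-values can be illustrated with two of the listed methods, bonferroni and holm. The sketch below is written from the textbook definitions in plain Python; the function names and the sample p-values are hypothetical, and the library may delegate to a statistics package instead.

```python
def bonferroni(p_values, alpha=0.05):
    """One-step correction: scale every p-value by the number of tests."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p <= alpha for p in adjusted]
    return reject, adjusted


def holm(p_values, alpha=0.05):
    """Step-down correction using Bonferroni adjustments."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is scaled by the number of tests remaining.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity of the adjusted p-values.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    reject = [p <= alpha for p in adjusted]
    return reject, adjusted


p_values = [0.01, 0.04, 0.03, 0.005]  # hypothetical per-grid-point p-values
reject_bonf, adjusted_bonf = bonferroni(p_values)
reject_holm, adjusted_holm = holm(p_values)
```

Holm never produces larger adjusted p-values than Bonferroni, which is why it is often preferred when many grid points are tested at once.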

inverse_transform([X, Y])

Reconstruct the original data from transformed data.

Parameters:

X (DataArray | None) – Transformed data of X to be reconstructed. At least one of X and Y must be provided.
Y (DataArray | None) – Transformed data of Y to be reconstructed. At least one of X and Y must be provided.

Returns:

Reconstructed data.

Return type:

Sequence[DataObject] | DataObject

classmethod load(path: str, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs) → Self

Load a saved model.

Parameters:

path (str) – Path to the saved model.
engine ({"zarr", "netcdf4", "h5netcdf"}, default="zarr") – Xarray backend engine to use for reading the saved model.
**kwargs – Additional keyword arguments to pass to open_datatree().

Returns:

model – The loaded model.

Return type:

BaseModel

predict(X: DataArray | Dataset | list[DataArray | Dataset]) → DataArray

Predict Y from X.

Parameters: X (DataObject) – Data to be used for prediction.
Returns: Predicted data in transformed space.
Return type: DataArray

save(path: str, overwrite: bool = False, save_data: bool = False, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs)

Save the model.

Parameters:

path (str) – Path to save the model.
overwrite (bool, default=False) – Whether or not to overwrite the existing path if it already exists. Ignored unless engine="zarr".
save_data (bool, default=False) – Whether or not to save the full input data along with the fitted components.
engine ({"zarr", "netcdf4", "h5netcdf"}, default="zarr") – Xarray backend engine to use for writing the saved model.
**kwargs – Additional keyword arguments to pass to DataTree.to_netcdf() or DataTree.to_zarr().

scores(normalized=False) → tuple[DataArray, DataArray]

Get the scores of the model.

The component scores may be referred to differently depending on the model type. Common terms include canonical variates, expansion coefficients, principal component (scores) or temporal patterns.

Parameters: normalized (bool, default=False) – Whether to return L2 normalized scores.
Returns: Scores of X and Y.
Return type: tuple[DataArray, DataArray]
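As a rough sketch of what L2 normalization means here, each mode can be divided by its Euclidean norm. The example assumes the norm is taken along the sample dimension; the array scores is hypothetical stand-in data.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical raw scores: 120 samples (rows) by 4 modes (columns),
# scaled so that each mode has a different amplitude.
scores = rng.normal(size=(120, 4)) * np.array([5.0, 3.0, 2.0, 1.0])

# Divide each mode by its Euclidean (L2) norm along the sample axis.
norms = np.linalg.norm(scores, axis=0)
scores_normalized = scores / norms
```

After normalization every mode has unit length, so amplitude information is carried only by the norms.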

serialize() → DataTree

Serialize a complete model with its preprocessor.

squared_covariance_fraction()

Get the squared covariance fraction (SCF).

The SCF is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):

\[SCF_{i} = 1 - \frac{\|\mathbf{d}_{X,i}^T \mathbf{d}_{Y,i}\|_F^2}{\|X^TY\|_F^2}\]

where \(\mathbf{d}_{X,i}\) and \(\mathbf{d}_{Y,i}\) are the residuals of the input data \(X\) and \(Y\) after reconstruction by the \(i\)th scores of \(X\) and \(Y\), respectively.
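The SCF formula can be sketched in NumPy under the same simplifying assumption as before: the residuals are taken after an unweighted least-squares reconstruction from a single score. The helper names rank_one_residual and squared_covariance_fraction are hypothetical and do not mirror the library's internals.

```python
import numpy as np


def rank_one_residual(data, score):
    """Residual of `data` after least-squares reconstruction from `score`."""
    pattern = score @ data / (score @ score)
    return data - np.outer(score, pattern)


def squared_covariance_fraction(X, Y, score_x, score_y):
    """SCF of one mode, following the formula with Frobenius norms."""
    d_x = rank_one_residual(X, score_x)
    d_y = rank_one_residual(Y, score_y)
    return 1.0 - np.linalg.norm(d_x.T @ d_y) ** 2 / np.linalg.norm(X.T @ Y) ** 2


rng = np.random.default_rng(5)
score = rng.normal(size=40)
# Both fields are driven by the same score, so one mode carries all covariance.
X = np.outer(score, rng.normal(size=6))
Y = np.outer(score, rng.normal(size=3))
scf = squared_covariance_fraction(X, Y, score, score)
```

In this idealized case the residuals vanish and the single mode accounts for the entire squared covariance.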

References

Swenson, E. Continuum Power CCA: A Unified Approach for Isolating Coupled Modes. Journal of Climate 28, 1016–1030 (2015).

transform([X, Y, normalized])

Transform the data.

Parameters:

X (DataObject | None) – Data of X to be transformed. At least one of X and Y must be provided.
Y (DataObject | None) – Data of Y to be transformed. At least one of X and Y must be provided.
normalized (bool, default=False) – Whether to return L2 normalized scores.

Returns:

Transformed data.

Return type:

Sequence[DataArray] | DataArray