xeofs.models.ComplexMCARotator#

class xeofs.models.ComplexMCARotator(**kwargs)#

Bases: MCARotator, ComplexMCA

Rotate a solution obtained from xe.models.ComplexMCA.

Complex Rotated MCA [1] [2] [3] extends MCA by incorporating both amplitude and phase information using a Hilbert transform prior to performing MCA and subsequent Varimax or Promax rotation. This adds a further layer of dimensionality to the analysis, allowing for a more nuanced interpretation of complex relationships within the data, particularly useful when analyzing oscillatory data.

Parameters:
  • n_modes (int, default=10) – Specify the number of modes to be rotated.

  • power (int, default=1) – Set the power for the Promax rotation. A power value of 1 results in a Varimax rotation.

  • max_iter (int, default=1000) – Determine the maximum number of iterations for the computation of the rotation matrix.

  • rtol (float, default=1e-8) – Define the relative tolerance required to achieve convergence and terminate the iterative process.

  • squared_loadings (bool, default=False) – Specify the method for constructing the combined vectors of loadings. If True, the combined vectors are loaded with the singular values (termed “squared loadings”), conserving the squared covariance under rotation. This allows estimation of mode importance after rotation. If False, the combined vectors are loaded with the square root of the singular values, following the method described by Cheng & Dunkerton.

  • compute (bool, default=True) – Whether to compute the rotation immediately.
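The effect of squared_loadings can be illustrated with a small plain-Python sketch. It is not the library's implementation: the vectors, singular values, and the helper names load_combined and squared_norm are hypothetical, and a fixed orthogonal rotation stands in for Varimax. With orthonormal singular vectors, loading the combined vectors with the singular values gives a total squared norm of \(2 \sum_k \sigma_k^2\), which an orthogonal rotation leaves unchanged.

```python
import math

def load_combined(u, v, sigma, squared_loadings):
    """Stack left and right singular vectors and weight them (hypothetical helper)."""
    weight = sigma if squared_loadings else math.sqrt(sigma)
    return [weight * x for x in u + v]

def squared_norm(vec):
    return sum(x * x for x in vec)

# Hypothetical orthonormal singular vectors and singular values for two modes.
u1, u2 = [1.0, 0.0], [0.0, 1.0]
v1, v2 = [0.6, 0.8], [-0.8, 0.6]
sigma1, sigma2 = 4.0, 1.0

col1 = load_combined(u1, v1, sigma1, squared_loadings=True)
col2 = load_combined(u2, v2, sigma2, squared_loadings=True)

# Apply an arbitrary orthogonal 2x2 rotation to the two loaded columns.
theta = 0.3
c, s = math.cos(theta), math.sin(theta)
rot1 = [c * a + s * b for a, b in zip(col1, col2)]
rot2 = [-s * a + c * b for a, b in zip(col1, col2)]

total_before = squared_norm(col1) + squared_norm(col2)  # 2 * (4**2 + 1**2), up to rounding
total_after = squared_norm(rot1) + squared_norm(rot2)
print(total_before, total_after)
```

With squared_loadings=False the same argument applies to \(2 \sum_k \sigma_k\) instead. An oblique Promax rotation (power greater than 1) is not orthogonal, so this conservation argument only covers the Varimax case.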

References

Examples

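>>> from xeofs.models import ComplexMCA, ComplexMCARotator
>>> # da1 and da2 are xarray DataArrays that share a 'time' dimension
>>> model = ComplexMCA(n_modes=5)
>>> model.fit(da1, da2, dim='time')
>>> rotator = ComplexMCARotator(n_modes=5, power=2)
>>> rotator.fit(model)
>>> rotator.components()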
__init__(**kwargs)#

Methods

__init__(**kwargs)

components()

Return the singular vectors of the left and right field.

components_amplitude()

Compute the amplitude of the components.

components_phase()

Compute the phase of the components.

compute([verbose])

Compute and load delayed model results.

covariance_fraction()

Get the covariance fraction (CF).

deserialize(dt)

Deserialize the model and its preprocessors from a DataTree.

fit(model)

Rotate the solution obtained from xe.models.MCA.

get_params()

Get the model parameters.

get_serialization_attrs()

heterogeneous_patterns(**kwargs)

Return the heterogeneous patterns of the left and right field.

homogeneous_patterns(**kwargs)

Return the homogeneous patterns of the left and right field.

inverse_transform(scores1, scores2)

Reconstruct the original data from transformed data.

load(path[, engine])

Load a saved model.

save(path[, overwrite, save_data, engine])

Save the model.

scores()

Return the scores of the left and right field.

scores_amplitude()

Compute the amplitude of the scores.

scores_phase()

Compute the phase of the scores.

serialize()

Serialize a complete model with its preprocessors.

singular_values()

Get the singular values of the cross-covariance matrix.

squared_covariance()

Get the squared covariance.

squared_covariance_fraction()

Calculate the squared covariance fraction (SCF).

total_covariance()

Get the total covariance.

transform(**kwargs)

Project new "unseen" data onto the rotated singular vectors.

components()#

Return the singular vectors of the left and right field.

Returns:

  • components1 (DataArray | Dataset | List[DataArray]) – Left components of the fitted model.

  • components2 (DataArray | Dataset | List[DataArray]) – Right components of the fitted model.

components_amplitude() → Tuple[DataArray | Dataset | List[DataArray | Dataset], DataArray | Dataset | List[DataArray | Dataset]]#

Compute the amplitude of the components.

The amplitude of the components is defined as

\[A_{ij} = |C_{ij}|\]

where \(C_{ij}\) is the \(i\)-th entry of the \(j\)-th component and \(|\cdot|\) denotes the absolute value.

Returns:

  • DataObject – Amplitude of the left components.

  • DataObject – Amplitude of the right components.

components_phase() → Tuple[DataArray | Dataset | List[DataArray | Dataset], DataArray | Dataset | List[DataArray | Dataset]]#

Compute the phase of the components.

The phase of the components is defined as

\[\phi_{ij} = \arg(C_{ij})\]

where \(C_{ij}\) is the \(i\)-th entry of the \(j\)-th component and \(\arg(\cdot)\) denotes the argument of a complex number.

Returns:

  • DataObject – Phase of the left components.

  • DataObject – Phase of the right components.
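As a minimal illustration of the amplitude and phase definitions for components, the following sketch applies them entry by entry to a small nested list of complex numbers. The values are hypothetical, and plain Python lists stand in for the DataArray objects the methods actually return.

```python
import cmath

# Hypothetical complex-valued component entries C_ij (rows: i, columns: j).
components = [
    [3 + 4j, 1j],
    [-2 + 0j, 1 - 1j],
]

# Amplitude A_ij = |C_ij| and phase phi_ij = arg(C_ij), entry by entry.
amplitude = [[abs(c) for c in row] for row in components]
phase = [[cmath.phase(c) for c in row] for row in components]

print(amplitude)
print(phase)
```

The phase returned by cmath.phase is in radians and lies in the interval \([-\pi, \pi]\).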

compute(verbose: bool = False, **kwargs)#

Compute and load delayed model results.

Parameters:
  • verbose (bool) – Whether or not to provide additional information about the computing progress.

  • **kwargs – Additional keyword arguments to pass to dask.compute().

covariance_fraction()#

Get the covariance fraction (CF).

Cheng and Dunkerton (1995) define the CF as follows:

\[CF_i = \frac{\sigma_i}{\sum_{j=1}^{m} \sigma_j}\]

where m is the total number of modes and \(\sigma_i\) is the ith singular value of the covariance matrix.

In this implementation the sum of singular values is estimated from the first n modes, therefore one should aim to retain as many modes as possible to get a good estimate of the covariance fraction.
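The effect of estimating the sum from a truncated set of modes can be seen in a short sketch. The singular values and the helper name covariance_fraction_sketch are hypothetical, and the code only mirrors the CF formula rather than the library's internals.

```python
def covariance_fraction_sketch(singular_values):
    """CF_i = sigma_i / sum of the available singular values."""
    total = sum(singular_values)
    return [s / total for s in singular_values]

# Hypothetical full spectrum of singular values.
sigma_all = [4.0, 3.0, 2.0, 1.0]

cf_full = covariance_fraction_sketch(sigma_all)           # all four modes retained
cf_truncated = covariance_fraction_sketch(sigma_all[:2])  # only two modes retained

print(cf_full)
print(cf_truncated)
```

With only two modes retained, the first mode's CF rises from 4/10 to 4/7, because the denominator no longer includes the discarded singular values.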

Note

It is important to differentiate the CF from the squared covariance fraction (SCF). While the SCF is an invariant quantity in MCA, the CF is not. Therefore, the SCF is used to assess the relative importance of each mode. Cheng and Dunkerton (1995) introduced the CF in the context of Varimax-rotated MCA to compare the relative importance of each mode before and after rotation. In the special case of both data fields in MCA being identical, the CF is equivalent to the explained variance ratio in EOF analysis.

classmethod deserialize(dt: DataTree) → Self#

Deserialize the model and its preprocessors from a DataTree.

fit(model: MCA) → Self#

Rotate the solution obtained from xe.models.MCA.

Parameters:

model (xe.models.MCA) – The MCA model to be rotated.

get_params() → Dict#

Get the model parameters.

heterogeneous_patterns(**kwargs)#

Return the heterogeneous patterns of the left and right field.

The heterogeneous patterns are the correlation coefficients between the input data and the scores of the other field.

More precisely, the heterogeneous patterns \(r_{het}\) are defined as

\[r_{het, x} = corr \left(X, A_y \right)\]
\[r_{het, y} = corr \left(Y, A_x \right)\]

where \(X\) and \(Y\) are the input data, \(A_x\) and \(A_y\) are the scores of the left and right field, respectively.

Parameters:
  • correction (str, default=None) – Method to apply a multiple testing correction. If None, no correction is applied. Available methods are:

      - bonferroni : one-step correction
      - sidak : one-step correction
      - holm-sidak : step-down method using Sidak adjustments
      - holm : step-down method using Bonferroni adjustments
      - simes-hochberg : step-up method (independent)
      - hommel : closed method based on Simes tests (non-negative)
      - fdr_bh : Benjamini/Hochberg (non-negative)
      - fdr_by : Benjamini/Yekutieli (negative)
      - fdr_tsbh : two stage fdr correction (non-negative)
      - fdr_tsbky : two stage fdr correction (non-negative)

  • alpha (float, default=0.05) – The desired family-wise error rate. Not used if correction is None.
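To make the difference between a one-step and a step-down correction concrete, here is a plain-Python sketch of the bonferroni and holm adjustments applied to a handful of hypothetical p-values. It is an illustration of the two textbook procedures, not the code the library runs, and the function names are made up for this example.

```python
def bonferroni_adjust(pvals):
    """One-step correction: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def holm_adjust(pvals):
    """Step-down correction using Bonferroni adjustments on the sorted p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda k: pvals[k])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvals[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted

# Hypothetical p-values, e.g. one per grid point of a pattern.
pvals = [0.01, 0.02, 0.03, 0.005]
alpha = 0.05

bonferroni_reject = [p <= alpha for p in bonferroni_adjust(pvals)]
holm_reject = [p <= alpha for p in holm_adjust(pvals)]

print(bonferroni_reject)
print(holm_reject)
```

For these values the one-step Bonferroni correction keeps two of the four tests significant at alpha, while the step-down Holm procedure keeps all four, which is why the step-down variants are often preferred when many correlations are tested at once.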

homogeneous_patterns(**kwargs)#

Return the homogeneous patterns of the left and right field.

The homogeneous patterns are the correlation coefficients between the input data and the scores.

More precisely, the homogeneous patterns \(r_{hom}\) are defined as

\[r_{hom, x} = corr \left(X, A_x \right)\]
\[r_{hom, y} = corr \left(Y, A_y \right)\]

where \(X\) and \(Y\) are the input data, \(A_x\) and \(A_y\) are the scores of the left and right field, respectively.
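The distinction between homogeneous and heterogeneous patterns comes down to which field's scores the data are correlated with. The sketch below shows this for a single grid point using real-valued toy series; the data, the scores, and the helper pearson are hypothetical, and the handling of complex-valued data and p-values in the library is not reproduced here.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equally long real-valued series."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)

# Hypothetical time series at one grid point of the left field X.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
# Hypothetical scores of one mode for the left (A_x) and right (A_y) field.
a_x = [-2.0, -1.0, 0.0, 1.0, 2.0]
a_y = [-1.0, -2.0, 1.0, 0.0, 2.0]

r_hom_x = pearson(x, a_x)  # homogeneous: X against its own scores
r_het_x = pearson(x, a_y)  # heterogeneous: X against the other field's scores

print(r_hom_x, r_het_x)
```

Here the homogeneous correlation is 1 because the toy scores are just the centered series, while the heterogeneous correlation is approximately 0.8.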

Parameters:
  • correction (str, default=None) – Method to apply a multiple testing correction. If None, no correction is applied. Available methods are:

      - bonferroni : one-step correction
      - sidak : one-step correction
      - holm-sidak : step-down method using Sidak adjustments
      - holm : step-down method using Bonferroni adjustments
      - simes-hochberg : step-up method (independent)
      - hommel : closed method based on Simes tests (non-negative)
      - fdr_bh : Benjamini/Hochberg (non-negative)
      - fdr_by : Benjamini/Yekutieli (negative)
      - fdr_tsbh : two stage fdr correction (non-negative)
      - fdr_tsbky : two stage fdr correction (non-negative)

  • alpha (float, default=0.05) – The desired family-wise error rate. Not used if correction is None.

Returns:

  • patterns1 (DataArray | Dataset | List[DataArray]) – Left homogeneous patterns.

  • patterns2 (DataArray | Dataset | List[DataArray]) – Right homogeneous patterns.

  • pvals1 (DataArray | Dataset | List[DataArray]) – Left p-values.

  • pvals2 (DataArray | Dataset | List[DataArray]) – Right p-values.

inverse_transform(scores1: DataArray, scores2: DataArray) → Tuple[DataArray | Dataset | List[DataArray | Dataset], DataArray | Dataset | List[DataArray | Dataset]]#

Reconstruct the original data from transformed data.

Parameters:
  • scores1 (DataObject) – Transformed left field data to be reconstructed. This could be a subset of the scores data of a fitted model, or unseen data. Must have a ‘mode’ dimension.

  • scores2 (DataObject) – Transformed right field data to be reconstructed. This could be a subset of the scores data of a fitted model, or unseen data. Must have a ‘mode’ dimension.

Returns:

  • Xrec1 (DataArray | Dataset | List[DataArray]) – Reconstructed data of left field.

  • Xrec2 (DataArray | Dataset | List[DataArray]) – Reconstructed data of right field.

classmethod load(path: str, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs) → Self#

Load a saved model.

Parameters:
  • path (str) – Path to the saved model.

  • engine ({"zarr", "netcdf4", "h5netcdf"}, default="zarr") – Xarray backend engine to use for reading the saved model.

  • **kwargs – Additional keyword arguments to pass to open_datatree().

Returns:

model – The loaded model.

Return type:

_BaseCrossModel

save(path: str, overwrite: bool = False, save_data: bool = False, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs)#

Save the model.

Parameters:
  • path (str) – Path to save the model.

  • overwrite (bool, default=False) – Whether or not to overwrite the existing path if it already exists. Ignored unless engine="zarr".

  • save_data (bool, default=False) – Whether or not to save the full input data along with the fitted components.

  • engine ({"zarr", "netcdf4", "h5netcdf"}, default="zarr") – Xarray backend engine to use for writing the saved model.

  • **kwargs – Additional keyword arguments to pass to DataTree.to_netcdf() or DataTree.to_zarr().

scores()#

Return the scores of the left and right field.

The scores in MCA are the projections of the left and right fields onto the left and right singular vectors of the cross-covariance matrix, respectively.

Returns:

  • scores1 (DataArray) – Left scores.

  • scores2 (DataArray) – Right scores.
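The projection that defines the scores can be sketched with plain nested lists. This is a simplified, real-valued illustration under stated assumptions: the field, the singular vectors, and the helper project are hypothetical, and preprocessing, normalization, rotation, and complex conjugation are all left out.

```python
def project(data, vectors):
    """Project each sample (row of data) onto each mode vector via a dot product."""
    return [
        [sum(x * w for x, w in zip(sample, vec)) for vec in vectors]
        for sample in data
    ]

# Hypothetical left field: 3 samples (e.g. time steps) x 2 features (e.g. grid points).
left_field = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# Hypothetical orthonormal left singular vectors, one list per mode.
left_vectors = [[0.6, 0.8], [-0.8, 0.6]]

left_scores = project(left_field, left_vectors)  # 3 samples x 2 modes
print(left_scores)
```

The right scores follow in the same way from the right field and the right singular vectors.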

scores_amplitude() → Tuple[DataArray, DataArray]#

Compute the amplitude of the scores.

The amplitude of the scores is defined as

\[A_{ij} = |S_{ij}|\]

where \(S_{ij}\) is the \(i\)-th entry of the \(j\)-th score and \(|\cdot|\) denotes the absolute value.

Returns:

  • DataArray – Amplitude of the left scores.

  • DataArray – Amplitude of the right scores.

scores_phase() → Tuple[DataArray, DataArray]#

Compute the phase of the scores.

The phase of the scores is defined as

\[\phi_{ij} = \arg(S_{ij})\]

where \(S_{ij}\) is the \(i\)-th entry of the \(j\)-th score and \(\arg(\cdot)\) denotes the argument of a complex number.

Returns:

  • DataArray – Phase of the left scores.

  • DataArray – Phase of the right scores.
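Amplitude and phase together carry the same information as the complex scores, since \(S_{ij} = A_{ij} e^{i \phi_{ij}}\). The following sketch checks this round trip on a few hypothetical complex values, using plain Python numbers in place of DataArray objects.

```python
import cmath

# Hypothetical complex score values for one mode.
scores = [1 + 1j, -2j, -3 + 0j]

amplitude = [abs(s) for s in scores]
phase = [cmath.phase(s) for s in scores]

# Rebuild each score from its polar form A * exp(i * phi).
rebuilt = [cmath.rect(a, p) for a, p in zip(amplitude, phase)]
max_error = max(abs(r - s) for r, s in zip(rebuilt, scores))

print(max_error)
```

The reconstruction error is at the level of floating-point rounding, so either representation can be used for plotting or further analysis.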

serialize() → DataTree#

Serialize a complete model with its preprocessors.

singular_values()#

Get the singular values of the cross-covariance matrix.

squared_covariance()#

Get the squared covariance.

The squared covariance corresponds to the explained variance in PCA and is given by the squared singular values of the covariance matrix.

squared_covariance_fraction()#

Calculate the squared covariance fraction (SCF).

The SCF is a measure of the proportion of the total squared covariance that is explained by each mode i. It is computed as follows:

\[SCF_i = \frac{\sigma_i^2}{\sum_{j=1}^{m} \sigma_j^2}\]

where m is the total number of modes and \(\sigma_i\) is the ith singular value of the covariance matrix.
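A short sketch of the SCF formula, using hypothetical singular values and a made-up helper name, shows how squaring shifts weight toward the leading modes. It mirrors the formula only and does not call the library.

```python
def squared_covariance_fraction_sketch(singular_values):
    """SCF_i = sigma_i**2 / sum of all squared singular values."""
    squared = [s ** 2 for s in singular_values]
    total = sum(squared)
    return [sq / total for sq in squared]

# Hypothetical singular values of the cross-covariance matrix.
sigma = [4.0, 3.0, 2.0, 1.0]

scf = squared_covariance_fraction_sketch(sigma)
print(scf)
```

For these values the leading mode accounts for 16/30 of the squared covariance, compared with 4/10 of the plain sum of singular values.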

total_covariance() → DataArray#

Get the total covariance.

This measure follows the definition of Cheng and Dunkerton (1995). Note that this measure is not an invariant in MCA.

transform(**kwargs)#

Project new “unseen” data onto the rotated singular vectors.

Parameters:
  • data1 (DataArray | Dataset | List[DataArray]) – Data to be projected onto the rotated singular vectors of the first dataset.

  • data2 (DataArray | Dataset | List[DataArray]) – Data to be projected onto the rotated singular vectors of the second dataset.

Returns:

Projected data.

Return type:

DataArray | List[DataArray]