xeofs.models.CCA
- class xeofs.models.CCA(n_modes: int = 2, use_coslat: bool = False, check_nans: bool = True, c: float = 0, pca: bool = True, variance_fraction: float = 0.99, init_pca_modes: float = 0.75, compute: bool = True, eps: float = 1e-06)
Bases: CCABaseModel
Canonical Correlation Analysis.
Canonical Correlation Analysis (CCA) identifies linear combinations of variables from multiple datasets that maximize their mutual correlations. An optional regularisation parameter (ridge regression) can be used to improve the conditioning of the covariance matrix.
The objective function of (regularised) CCA is:
\[
\begin{aligned}
w_{opt} &= \underset{w}{\mathrm{argmax}} \{ w_1^T X_1^T X_2 w_2 \} \\
\text{subject to:} \quad & (1-c_1) w_1^T X_1^T X_1 w_1 + c_1 w_1^T w_1 = n \\
& (1-c_2) w_2^T X_2^T X_2 w_2 + c_2 w_2^T w_2 = n
\end{aligned}
\]
where \(c_i\) is the regularisation parameter for dataset \(i\).
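For two views, an objective of this form can be solved by whitening each view with the Cholesky factor of its regularised within-view matrix and taking an SVD of the whitened cross-product matrix. The NumPy sketch below illustrates that idea. It assumes that \(n\) is the number of samples and that each view is centred; the function name `regularised_cca_two_views` and the synthetic data are hypothetical. It is not the library's implementation, which follows the MCCA class from cca_zoo.

```python
import numpy as np


def regularised_cca_two_views(X1, X2, c1=0.0, c2=0.0):
    """Leading pair of canonical weight vectors for two views (illustrative)."""
    n = X1.shape[0]
    # Centre each view so that X_i^T X_j is proportional to a covariance.
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    # Regularised within-view matrices that appear in the constraints.
    R1 = (1 - c1) * X1.T @ X1 + c1 * np.eye(X1.shape[1])
    R2 = (1 - c2) * X2.T @ X2 + c2 * np.eye(X2.shape[1])
    # Cholesky factors R_i = L_i L_i^T (requires R_i to be positive definite).
    L1 = np.linalg.cholesky(R1)
    L2 = np.linalg.cholesky(R2)
    # Whitened cross-product matrix M = L1^{-1} X1^T X2 L2^{-T}.
    A = np.linalg.solve(L1, X1.T @ X2)
    M = np.linalg.solve(L2, A.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the leading singular vectors back; sqrt(n) enforces the "= n" constraints.
    w1 = np.sqrt(n) * np.linalg.solve(L1.T, U[:, 0])
    w2 = np.sqrt(n) * np.linalg.solve(L2.T, Vt[0])
    return w1, w2, s[0]


# Synthetic example: two views that share one latent signal.
rng = np.random.default_rng(0)
n = 200
z = rng.standard_normal(n)
X1 = np.outer(z, [1.0, 0.5, -0.3]) + 0.5 * rng.standard_normal((n, 3))
X2 = np.outer(z, [0.8, -0.6]) + 0.5 * rng.standard_normal((n, 2))

w1, w2, rho = regularised_cca_two_views(X1, X2, c1=0.0, c2=0.0)
```

With `c1 = c2 = 0` the returned `rho` equals the first canonical correlation between `X1 @ w1` and `X2 @ w2`. Increasing `c_i` towards 1 shrinks the within-view matrix towards the identity.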
- Parameters:
n_modes (int, optional) – Number of latent dimensions to use, by default 2
use_coslat (bool, optional) – Whether to use the square root of the cosine of the latitude as weights, by default False
pca (bool, optional) – Whether to perform PCA on the input data, by default True
variance_fraction (float, optional) – Fraction of variance to keep when performing PCA, by default 0.99
init_pca_modes (int | float, optional) – Number of PCA modes to compute. If float, the number of modes is given by the fraction of maximum number of modes for the given data. A value of 1.0 will perform a full SVD of the data. Choosing a smaller value can increase computation speed. Default 0.75
c (Sequence[float] | float, optional) – Regularisation parameter, by default 0 (no regularisation)
compute (bool, optional) – Whether to compute the decomposition immediately, by default True
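The preprocessing options above can be pictured with plain NumPy. The sketch below is an illustration only, not the library's code: `coslat_weights` and `pca_reduce` are hypothetical helper names, the latitude grid and data are synthetic, and the truncation rule (keep the smallest number of leading components whose cumulative variance reaches `variance_fraction`) is an assumption about what "fraction of variance to keep" means.

```python
import numpy as np


def coslat_weights(lat_deg):
    """Square root of the cosine of latitude (in degrees), as described for use_coslat."""
    # The clip guards against tiny negative cosines caused by rounding near the poles.
    return np.sqrt(np.clip(np.cos(np.deg2rad(lat_deg)), 0.0, None))


def pca_reduce(X, variance_fraction=0.99):
    """Project X (samples x features) onto its leading principal components."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Cumulative share of the total variance captured by the leading components.
    cum = np.cumsum(s**2) / np.sum(s**2)
    # Smallest k whose cumulative share reaches the requested fraction.
    k = min(int(np.searchsorted(cum, variance_fraction)) + 1, len(s))
    return U[:, :k] * s[:k], Vt[:k], cum[k - 1]


rng = np.random.default_rng(1)
lat = np.array([0.0, 30.0, 60.0, 85.0])
# 100 samples on 4 grid points, driven by two latent signals plus weak noise.
X = rng.standard_normal((100, 2)) @ rng.standard_normal((2, 4))
X += 0.01 * rng.standard_normal((100, 4))

weights = coslat_weights(lat)
pcs, modes, kept = pca_reduce(X * weights, variance_fraction=0.99)
```

Here `pcs` holds the retained principal component scores, `modes` the matching spatial patterns, and `kept` the fraction of variance actually retained.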
Notes
This implementation is largely based on the MCCA class from the cca_zoo repository [3] .
References
Examples
>>> from xeofs.models import CCA
>>> model = CCA(n_modes=5)
>>> model.fit(data, dim="time")  # "time" is an assumed sample dimension name
>>> can_loadings = model.components()
- __init__(n_modes: int = 2, use_coslat: bool = False, check_nans: bool = True, c: float = 0, pca: bool = True, variance_fraction: float = 0.99, init_pca_modes: float = 0.75, compute: bool = True, eps: float = 1e-06)
Methods
__init__([n_modes, use_coslat, check_nans, ...])
components([normalize]) – Get the canonical loadings for each view.
explained_covariance() – Get the explained covariance.
explained_covariance_ratio() – Get the explained covariance ratio.
explained_variance() – Get the explained variance for each view.
explained_variance_ratio() – Get the explained variance ratio for each view.
fit(views, dim)
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator.
scores() – Get the canonical variates for each view.
set_fit_request(*[, dim, views]) – Request metadata passed to the fit method.
set_params(**params) – Set the parameters of this estimator.
set_transform_request(*[, views]) – Request metadata passed to the transform method.
transform(views) – Transform the input data into the canonical space.
weights()
- components(normalize: bool = True) → List[DataArray | Dataset | List[DataArray | Dataset]]
Get the canonical loadings for each view.
- explained_covariance() → DataArray
Get the explained covariance.
- explained_covariance_ratio() → DataArray
Get the explained covariance ratio.
- explained_variance() → List[DataArray]
Get the explained variance for each view.
- explained_variance_ratio() → List[DataArray]
Get the explained variance ratio for each view.
- get_metadata_routing()
Get metadata routing of this object.
Please check the scikit-learn User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- scores() → List[DataArray]
Get the canonical variates for each view.
- set_fit_request(*, dim: bool | None | str = '$UNCHANGED$', views: bool | None | str = '$UNCHANGED$') → CCA
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the scikit-learn User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
dim (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the dim parameter in fit.
views (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the views parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
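The `<component>__<parameter>` form comes from scikit-learn's estimator API. As a hedged illustration that uses only scikit-learn classes (`Pipeline`, `StandardScaler` and `Ridge` are stand-ins chosen for this example and are not part of xeofs), nested parameters can be read and updated as follows:

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested object: each step is addressed by its name in the pipeline.
pipe = Pipeline([("scale", StandardScaler()), ("ridge", Ridge(alpha=1.0))])

# Update parameters of individual components via <component>__<parameter>.
pipe.set_params(ridge__alpha=0.1, scale__with_mean=False)

# get_params(deep=True) exposes the same nested names.
params = pipe.get_params(deep=True)
```

For a simple (non-nested) estimator, the parameter names are the constructor arguments themselves.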
- set_transform_request(*, views: bool | None | str = '$UNCHANGED$') → CCA
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the scikit-learn User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
views (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the views parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- transform(views: Sequence[DataArray | Dataset | List[DataArray | Dataset]]) → List[DataArray]
Transform the input data into the canonical space.
- Parameters:
views (List[DataArray | Dataset]) – Input data to transform