
Generalized contrastive PCA is equivalent to generalized eigendecomposition

  • Joshua P. Woller,

    Roles Conceptualization, Formal analysis, Methodology, Writing – original draft

    joshua.woller@uni-tuebingen.de

    Affiliations Institute for Neuromodulation and Neurotechnology, University Hospital and University of Tübingen, Tübingen, Baden-Württemberg, Germany, German Center for Mental Health (DZPG), Tübingen, Baden-Württemberg, Germany, Max Planck Institute for Biological Cybernetics, Tübingen, Baden-Württemberg, Germany

  • David Menrath,

    Roles Conceptualization, Writing – review & editing

    Affiliations Institute for Neuromodulation and Neurotechnology, University Hospital and University of Tübingen, Tübingen, Baden-Württemberg, Germany, German Center for Mental Health (DZPG), Tübingen, Baden-Württemberg, Germany

  • Alireza Gharabaghi

    Roles Funding acquisition, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations Institute for Neuromodulation and Neurotechnology, University Hospital and University of Tübingen, Tübingen, Baden-Württemberg, Germany, German Center for Mental Health (DZPG), Tübingen, Baden-Württemberg, Germany, Center for Digital Health, Tübingen, Baden-Württemberg, Germany, Center for Bionic Intelligence Tübingen Stuttgart (BITS), Tübingen, Baden-Württemberg, Germany, Max Planck-University of Toronto Centre for Neural Science & Technology (MPUTC), Canada/Germany

PLOS Computational Biology recently published an article by de Oliveira and colleagues introducing a novel data decomposition method that enables explicit contrasts between experimental conditions, demonstrating its applicability across a range of biological datasets [1]. We fully agree with their call for improved multivariate analyses and share the authors’ critique of the limitations of contrastive PCA (cPCA; [2]), especially its reliance on hyperparameters. To address this, the authors introduce a contrastive decomposition via an eigendecomposition of pairs of contrast matrices, called generalized contrastive PCA (gcPCA).

However, our analysis suggests that the core approach of gcPCA is mathematically equivalent to solving a generalized eigenvalue problem, a well-known approach in linear algebra for performing a contrastive matrix decomposition. Here, we aim to position gcPCA within the broader framework of generalized eigenvalue decomposition (GED), highlighting its theoretical foundation and connection to established methods. Toward the end, we provide a mathematical derivation of this equivalence.

The generalized eigenproblem can be formulated and solved in multiple ways, making it non-trivial to recognize that it underpins a broad range of seemingly distinct algorithms. It is used, for example, in linear discriminant analysis (LDA) and canonical correlation analysis (CCA). An excellent overview of its background and applications, particularly in neuroscience, e.g., to decompose task-evoked activity with respect to baseline activity, is provided by Cohen [3]. GED has been used to remove stimulation-evoked activity from oscillatory background signals [8], to extract spatial filters that maximize theta-band activity [9], and to remove specific neurostimulation artifacts from EEG data [10].

Closer to gcPCA, discriminative PCA (dPCA) [7] has previously been proposed as a GED-based improvement of cPCA that also eliminates the need for hyperparameters. This shows that GED is the basis of a whole family of methods.

Paralleling other GED-based methods, we hence argue that gcPCA constitutes a supervised method, since it relies on the definition of two matrices derived from covariances of different data segments. Aggregating data into separate matrices based on a researcher-defined criterion effectively provides the algorithm with labeled data, even if labels are not passed along explicitly. By construction, such methods should not be considered unsupervised, in contrast to approaches such as principal component analysis (PCA) or independent component analysis (ICA).

Furthermore, we would like to highlight a key advantage of gcPCA: the extracted components are not required to be orthogonal in feature space, a property shared with other GED-derived methods and, notably, ICA. Non-orthogonality in feature space is a key strength compared to standard PCA. Orthogonalization restricts subsequent components to subspaces orthogonal to previously extracted ones, which makes it difficult to interpret later principal components independently of earlier ones. In this light, a stronger mathematical rationale for orthogonal gcPCA, especially for how regular gcPCs relate to their orthogonalized counterparts, would clarify whether orthogonality aids or hinders analysis. For instance, alternating extraction of high- and low-eigenvalue components may introduce bias, particularly if low-eigenvalue noise components are allowed to restrict the subspace into which high-eigenvalue signals are then projected. The degree to which extraction order influences orthogonal gcPCA outcomes remains unclear. While such drawbacks of the orthogonalization are not evident in S3 Fig of [1], the standard gcPCs in this example already appear to be close to orthogonal, obscuring possible detrimental effects of the method.
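
To make this property concrete, the following minimal sketch (our addition, using random stand-in matrices rather than data from [1]) shows that the eigenvectors of a symmetric-definite generalized eigenproblem are orthogonal with respect to B, but in general not orthogonal in feature space:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)

def random_spd(n):
    # Random symmetric positive-definite matrix as a stand-in covariance.
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

n = 6
A, B = random_spd(n), random_spd(n)

# eigh(A, B) solves the symmetric-definite problem A x = lambda B x and
# returns eigenvectors normalized such that X.T @ B @ X = I.
_, X = eigh(A, B)

print(np.allclose(X.T @ B @ X, np.eye(n)))  # True: components are B-orthogonal
print(np.allclose(X.T @ X, np.eye(n)))      # False: not orthogonal in feature space
```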

Turning from these general observations, we now briefly review the original algorithm and demonstrate its equivalence to a generalized eigenproblem.

Let $R_A, R_B \in \mathbb{R}^{n \times n}$ be covariance matrices, and let $A, B \in \mathbb{R}^{n \times n}$ be matrices defined as linear compositions of $R_A$ and $R_B$ (or $R_A - \alpha R_B$ for cPCA). The authors define various versions of gcPCA, with varying choices of $A$ and $B$ (see Table 1):

Table 1. Variants of contrastive or generalized contrastive PCA (gcPCA), defined by different choices of matrices A and B.

https://doi.org/10.1371/journal.pcbi.1013555.t001
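
As a brief illustration of how such matrix pairs arise in practice, the sketch below (our addition) assembles $A$ and $B$ from two data conditions. The variant definitions follow our reading of the variants in [1]; the covariance normalization used there is omitted for simplicity, so this is not the authors' reference implementation:

```python
import numpy as np

def contrast_matrices(Z_a, Z_b, version="v4", alpha=1.0):
    """Assemble an assumed matrix pair (A, B) for one gcPCA variant.

    Z_a, Z_b are (samples x features) arrays for the two conditions,
    assumed centered. Variant definitions are our reading of [1]; the
    covariance normalization applied there is omitted here.
    """
    R_a = Z_a.T @ Z_a / (Z_a.shape[0] - 1)  # covariance of condition a
    R_b = Z_b.T @ Z_b / (Z_b.shape[0] - 1)  # covariance of condition b
    I = np.eye(R_a.shape[0])
    variants = {
        "cPCA": (R_a - alpha * R_b, I),  # requires the hyperparameter alpha
        "v1":   (R_a - R_b, I),
        "v2":   (R_a, R_b),
        "v3":   (R_a - R_b, R_b),
        "v4":   (R_a - R_b, R_a + R_b),
    }
    return variants[version]
```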

The authors use the following optimization criterion:

$\underset{u}{\arg\max}\; \dfrac{u^\top A u}{u^\top B u}$ (1)

To maximize this contrast (1) between the matrices $A$ and $B$, the authors solve the following eigenproblem:

$B^{-1/2} A B^{-1/2}\, u = \lambda u$ (2)

where $\lambda$ and $u$ are the eigenvalues and eigenvectors, respectively.

We may identify expression (1) as the generalized Rayleigh coefficient:

$\rho(x; A, B) = \dfrac{x^\top A x}{x^\top B x}$ (3)

The Rayleigh coefficient is used to solve standard (assuming $B = I$) and generalized eigenvalue problems.
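
As a quick numerical illustration (our addition, again with random stand-in matrices), the generalized Rayleigh coefficient (3) is maximized by the eigenvector belonging to the largest generalized eigenvalue of the pair $(A, B)$:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n)); A = M @ M.T                  # symmetric A
M = rng.standard_normal((n, n)); B = M @ M.T + n * np.eye(n)  # positive-definite B

def rayleigh(x):
    return (x @ A @ x) / (x @ B @ x)

# eigh(A, B) returns the generalized eigenvalues of A x = lambda B x in ascending order.
evals, evecs = eigh(A, B)

# The coefficient attains lambda_max at the leading generalized eigenvector...
print(np.isclose(rayleigh(evecs[:, -1]), evals[-1]))  # True
# ...and stays below lambda_max for arbitrary directions.
print(max(rayleigh(rng.standard_normal(n)) for _ in range(10_000)) <= evals[-1])  # True
```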

The generalized eigenproblem can be formulated as follows:

$A x = \lambda B x$ (4)

where $\lambda$ and $x$ are the generalized eigenvalues and eigenvectors, respectively. Decomposing a matrix pair like this is called a generalized eigendecomposition (GED) [3].

Solvers for this problem are implemented in numerical libraries such as LAPACK and in scientific computing environments such as MATLAB or SciPy, e.g., using eig(A,B) and scipy.linalg.eig(A,B), respectively. We have verified that these solvers produce solutions equivalent to those of the gcPCA methods in Table 1.
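
The following sketch (our addition; random stand-in matrices, not the authors' data or code) illustrates the kind of check we performed: eigendecomposing $B^{-1/2} A B^{-1/2}$ as in (2) and rotating the eigenvectors by $B^{-1/2}$ reproduces, up to sign and scale, the eigenpairs returned by scipy.linalg.eig(A, B):

```python
import numpy as np
from scipy.linalg import eig, eigh

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n)); A = M @ M.T                  # symmetric A
M = rng.standard_normal((n, n)); B = M @ M.T + n * np.eye(n)  # positive-definite B

# B^{-1/2} via the eigendecomposition of B (symmetric by construction).
w, V = eigh(B)
B_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T

# Route 1: symmetric eigenproblem (2), eigenvectors rotated back by B^{-1/2}.
lam, U = np.linalg.eigh(B_inv_sqrt @ A @ B_inv_sqrt)
X = B_inv_sqrt @ U
X /= np.linalg.norm(X, axis=0)

# Route 2: generalized eigensolver (4), sorted into ascending order to match.
lam_ged, X_ged = eig(A, B)
order = np.argsort(lam_ged.real)
lam_ged = lam_ged.real[order]
X_ged = X_ged.real[:, order]
X_ged /= np.linalg.norm(X_ged, axis=0)

print(np.allclose(lam, lam_ged))                    # identical eigenvalues
print(np.allclose(np.abs(X.T @ X_ged), np.eye(n)))  # same eigenvectors up to sign
```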

In the following, we show that (2) is a generalized eigenproblem (4), up to a rotation of the eigenvectors by $B^{-1/2}$, which is also applied later in the provided gcPCA software package (https://github.com/SjulsonLab/generalized_contrastive_PCA).

The eigenproblem to maximize (1) is formulated as (2):

$B^{-1/2} A B^{-1/2}\, u = \lambda u$

Substituting $u = B^{1/2} x$ and left-multiplying by $B^{1/2}$, this can be reformulated as follows:

$B^{-1/2} A B^{-1/2} B^{1/2} x = \lambda B^{1/2} x$ (5)

$B^{1/2} B^{-1/2} A B^{-1/2} B^{1/2} x = \lambda B^{1/2} B^{1/2} x$ (6)

which simplifies to:

$A x = \lambda B x$ (7)

which is the generalized eigenproblem (4).

The eigenvalues of (2) and (4) for a given matrix pair $(A, B)$ are identical. The corresponding eigenvectors $u$ of (2) and $x$ of (4) are related by the transforms $x = B^{-1/2} u$ and $u = B^{1/2} x$.

For invertible $B$, (4) can be reduced to the following standard eigenproblem:

$B^{-1} A x = \lambda x$ (8)
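
A short check (our addition, once more with stand-in matrices) confirms that this reduction preserves the generalized spectrum:

```python
import numpy as np
from scipy.linalg import eigvals

rng = np.random.default_rng(3)
n = 5
M = rng.standard_normal((n, n)); A = M @ M.T
M = rng.standard_normal((n, n)); B = M @ M.T + n * np.eye(n)  # invertible B

# Standard eigenproblem (8): B^{-1} A x = lambda x (via solve, not explicit inverse).
lam_std = np.sort(np.linalg.eigvals(np.linalg.solve(B, A)).real)

# Generalized eigenproblem (4): A x = lambda B x.
lam_ged = np.sort(eigvals(A, B).real)

print(np.allclose(lam_std, lam_ged))  # True
```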

A thorough mathematical and computational overview of generalized eigenproblems is given by Ghojogh and colleagues [4]. Given the connection between gcPCA and GED, it may further be helpful to consider how sparse gcPCA relates to sparse GED [5, 6]. This could offer insight into theoretical links and efficient algorithms.

Placing gcPCA within this broader methodological context of generalized eigenvalue decomposition provides a better understanding of its practical value and interpretive strengths. The study by de Oliveira and colleagues underlines the value of supervised contrastive decomposition techniques for biological data. Their approach, based on GED, produces interpretable components and shows substantial improvements over conventional PCA.

References

  1. de Oliveira EF, Garg P, Hjerling-Leffler J, Batista-Brito R, Sjulson L. Identifying patterns differing between high-dimensional datasets with generalized contrastive PCA. PLoS Comput Biol. 2025;21(2):e1012747. pmid:39919147
  2. Abid A, Zhang MJ, Bagaria VK, Zou J. Exploring patterns enriched in a dataset with contrastive principal component analysis. Nat Commun. 2018;9(1):2134. pmid:29849030
  3. Cohen MX. A tutorial on generalized eigendecomposition for denoising, contrast enhancement, and dimension reduction in multichannel electrophysiology. Neuroimage. 2022;247:118809. pmid:34906717
  4. Ghojogh B, Karray F, Crowley M. Eigenvalue and generalized eigenvalue problems: tutorial. arXiv preprint. 2019. arXiv:1903.11240
  5. Song J, Babu P, Palomar DP. Sparse generalized eigenvalue problem via smooth optimization. IEEE Trans Signal Process. 2015;63(7):1627–42.
  6. Han X, Clemmensen L. Regularized generalized eigen-decomposition with applications to sparse supervised feature extraction and sparse discriminant analysis. Pattern Recognition. 2016;49:43–54.
  7. Wang G, Chen J, Giannakis GB. DPCA: dimensionality reduction for discriminative analytics of multiple large-scale datasets. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2018. p. 2211–5. https://doi.org/10.1109/icassp.2018.8461744
  8. Kragel JE, Lurie SM, Issa NP, Haider HA, Wu S, Tao JX, et al. Closed-loop control of theta oscillations enhances human hippocampal network connectivity. Nat Commun. 2025;16(1):4061. pmid:40307237
  9. Arnau S, Liegel N, Wascher E. Frontal midline theta power during the cue-target-interval reflects increased cognitive effort in rewarded task-switching. Cortex. 2024;180:94–110. pmid:39393200
  10. Haslacher D, Nasr K, Robinson SE, Braun C, Soekadar SR. Stimulation artifact source separation (SASS) for assessing electric brain oscillations during transcranial alternating current stimulation (tACS). Neuroimage. 2021;228:117571. pmid:33412281