G protein-coupled receptors (GPCRs) are the most widely targeted gene family for Food and Drug Administration (FDA)-approved drugs. To assess possible roles for GPCRs in cancer, we analyzed The Cancer Genome Atlas (TCGA) data for mRNA expression, mutations, and copy number variation (CNV) in 20 categories and 45 subtypes of solid tumors and quantified differential expression (DE) of GPCRs by comparing tumors against normal tissue from the Gene Tissue Expression Project (GTEx) database. GPCRs are overrepresented among coding genes with elevated expression in solid tumors. This analysis reveals that most tumor types differentially express >50 GPCRs, including many targets for approved drugs, hitherto largely unrecognized as targets of interest in cancer. GPCR mRNA signatures characterize specific tumor types and correlate with expression of cancer-related pathways. Tumor GPCR mRNA signatures have prognostic relevance for survival and correlate with expression of numerous cancer-related genes and pathways. GPCR expression in tumors is largely independent of staging, grading, metastasis, and/or driver mutations. GPCRs expressed in cancer cell lines largely parallel GPCR expression in tumors. Certain GPCRs are frequently mutated and appear to be hotspots, serving as bellwethers of accumulated genomic damage. CNV of GPCRs is common but does not generally correlate with mRNA expression. Our results suggest a previously underappreciated role for GPCRs in cancer, perhaps as functional oncogenes, biomarkers, surface antigens, and pharmacological targets.
Citation: Sriram K, Moyung K, Corriden R, Carter H, Insel PA (2019) GPCRs show widespread differential mRNA expression and frequent mutation and copy number variation in solid tumors. PLoS Biol 17(11): e3000434. https://doi.org/10.1371/journal.pbio.3000434
Academic Editor: Nancy Hynes, Friedrich Miescher Institute, SWITZERLAND
Received: February 7, 2019; Accepted: October 24, 2019; Published: November 25, 2019
Copyright: © 2019 Sriram et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files, with additional supporting material at an open-access website providing data on GPCRs in cancer: https://insellab.github.io/. Sources for numerical values used to plot certain figures are provided in figure captions, where relevant.
Funding: This work was supported by the American Society of Pharmacology and Experimental Therapeutics (ASPET) David Lehr Award and Padres Pedal the Cause Number PTC2017 (PAI). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations:: ACC, Adrenocortical Cancer; BLCA, bladder cancer; BRCA, breast cancer; CAF, cancer-associated fibroblast; CCLE, Cancer Cell Line Encyclopedia; CESC, Cervical Cancer; CNA, copy number amplification; CNV, copy number variation; COAD, colon adenocarcinoma; CPM, Counts Per Million; DE, differential expression; EBI, European Bioinformatics Institute; EMA, European Medicines Agency; ESCA, esophageal cancer; FDA, Food and Drug Administration; FDR, false discovery rate; FPKM, Fragments Per Kilobase of exon, per Million reads; GO, gene ontology; GPCR, G protein-coupled receptor; GSEA, gene set enrichment analysis; GTEx, Gene Tissue Expression Project; GtoPdb, Guide to Pharmacology database; Her2, Human Epidermal Growth Factor Receptor-2; HR, hormone receptor; IDC, infiltrating ductal carcinoma; iRAP, integrated RNA-seq analysis pipeline; KEGG, Kyoto Encyclopedia of Genes and Genomes; KIRC, kidney clear cell carcinoma; KIRP, kidney papillary cell carcinoma; LIHC, liver hepatocellular carcinoma; LSQC, lung squamous cell carcinoma; LUAD, lung adenocarcinoma; MAPK, mitogen-activated protein kinase; MDS, Multidimensional Scaling; mPP, modified Peto-Peto; NCBI, National Center for Biotechnology Information; Nmut, number of mutations per genome; NOS, Not Otherwise Specified; OV, ovarian cancer; PAAD, pancreatic adenocarcinoma; PDAC, pancreatic ductal adenocarcinoma; PI3K, Phosphoinositide 3-Kinase; PRAD, Prostate Adenocarcinoma; RNA-seq, RNA sequencing; SKCM, skin cutaneous melanoma; STAD, stomach adenocarcinoma; TCGA, The Cancer Genome Atlas; THCA, thyroid cancer; TPM, Transcripts Per Million; UCS, Uterine Carcinoma; VWF, Von Willebrand Factor
G protein-coupled receptors (GPCRs), the largest family of cell-surface receptors (>800 in the human genome), mediate the signaling of a wide variety of ligands, including hormones, neurotransmitters, proteases, lipids, and peptides. GPCRs regulate many functions (e.g., metabolism, migration, proliferation) and interactions of cells with their environment, with diverse expression in normal tissue (mined in this study from the Gene Tissue Expression Project [GTEx] database ) and in disease. GPCRs are also the largest family of targets for approved drugs [2,3,4], interacting with approximately 35% of Food and Drug Administration (FDA)-approved drugs, but are infrequently targeted in tumors other than endocrine cancers, even though a role for GPCRs has been implicated in features of the malignant phenotype [5,6]. One reason for their limited use is the notion that GPCRs are rarely mutated in cancer [7,8]—although mutations occur in heterotrimeric GTP binding (G) proteins that GPCRs activate —and that GPCRs regulate pathways, such as Wnt, mitogen-activated protein kinase (MAPK), and Phosphoinositide 3-Kinase (PI3K) signaling, with mutations in cancer . The biological relevance of GPCRs for the malignant phenotype and their high druggability imply that GPCRs might be an underexplored class of contributors to and targets in cancer.
To define the landscape of GPCRs in cancer, we undertook an integrated analysis of Differential Expression (DE), mutations, and copy number variation (CNV) of GPCRs, which are annotated by the Guide to Pharmacology database (GtoPdb) , in 20 types of solid tumors (Table 1 and S1 and S2 Tables). Using RNA sequencing (RNA-seq) data from The Cancer Genome Atlas (TCGA) and the GTEx database , we performed DE analysis of GPCRs in tumors compared to normal tissue, respectively, an analysis facilitated by the TOIL recompute project . We studied GPCRs annotated by GtoPdb , including endo-GPCRs (which respond to endogenous agonists) and taste receptors but not olfactory GPCRs for which such annotations are unavailable (S1 Table). Our findings identify many differentially expressed GPCRs in solid tumors and corresponding cancer cell lines but a less important role for mutations and CNV. GPCRs with DE predict survival and are associated with expression of oncogenes and tumorigenic pathways. Overall, these results reveal a largely underappreciated potential of GPCRs as contributors to cancer biology and as potential therapeutic targets. Our results from DE (N = 6,224 individual tumors), mutation (N = 5103), and CNV analyses (N = 7545) are available as a resource at insellab.github.io.
TCGA cancer type and subclassification, if applicable, for solid tumors with distinct histological classification are shown, along with the number of replicates and GPCRs with increased or decreased expression for each type of tumor.
DE of GPCRs in solid tumors compared to normal tissues
We focused on GPCRs with both substantial DE and magnitude of expression in solid tumors, i.e., (1) >2-fold increase or decrease in DE in tumors compared to normal tissue, (2) false discovery rate (FDR) < 0.05 and (3) median expression in tumors > 1 Transcripts Per Million (TPM). We used the latter threshold for median expression in order to identify GPCRs that may be useful as therapeutic targets, for which higher expression is preferable. For DE analysis, we divided the 20 TCGA tumor types into 45 tumor subtypes (Table 1), based on histological classification of tumors in TCGA metadata. We found that different tumor subtypes within the same TCGA tumor classification have distinct GPCR expression, e.g., subtypes of breast cancer (BRCA), thyroid cancer (THCA), and esophageal cancer (ESCA) (S1A–S1F Fig).
Fig 1A shows a heatmap with fold changes (where statistically significant) for mRNA expression of all GPCR genes in solid tumors, relative to their corresponding normal tissue. Hierarchical clustering was performed on the GPCR genes, revealing 3 main clusters of GPCRs: (a) those with frequent increases in expression across multiple tumor types, (b) those with relatively little change, and (c) GPCRs frequently reduced in expression in solid tumors compared to normal tissue. A phylogenetic tree identifying the GPCRs in each cluster is shown in S2 Fig. Among the most lethal forms of cancer (in terms of annual deaths), Fig 1B shows that >25 GPCRs have >2-fold increased expression relative to normal tissue (with median expression in tumors >1 TPM), whereas >20 GPCRs have significantly down-regulated expression while remaining expressed >1 TPM in solid tumors. Thus, large numbers of GPCRs that are expressed in solid tumors show DE, including tumor types that are most lethal. Fig 1C shows the 20 GPCRs that have >2-fold increased expression (with median expression >1 TPM) in the largest number of tumor types. Fig 1D shows the same, but for GPCRs frequently reduced >2-fold in expression but that are still detected at >1 TPM in those tumors. Among the GPCRs with frequently increased expression are receptors likely expressed in the tumor cells themselves (e.g., GPRC5A [11,12]) and expressed in the tumor microenvironment, such as in fibroblasts (e.g., F2R ) and immune cells (e.g., FPR3, CCR1, CCR5).
(A) For all 45 tumor subtypes, a heatmap showing the log2 fold-change of GPCR expression in tumors compared to normal tissue (positive values indicate higher expression in tumors), with hierarchical clustering of GPCR genes to reveal patterns of DE. (B) The number of GPCRs that show significant (FDR < 0.05) changes in expression compared to normal tissue among tumor types tested with large numbers of replicates (Table 1) and that correspond to the most lethal types of cancer. (C–D) The GPCRs that most frequently (i.e., in most tumor types) show increases (C) or decreases (D) in expression among the 45 tumor subtypes. DE data for GPCRs in all analyzed tumor types can be found in S2 Table, sheets 6–8. BRCA, breast cancer; COAD, colon adenocarcinoma; DE, differential expression; FDR, false discovery rate; GPCR, G protein-coupled receptor; Her2, Human Epidermal Growth Factor Receptor-2; IDC, Infiltrating Ductal Carcinoma; LSQC, lung squamous cell carcinoma; LUAD, lung adenocarcinoma; NOS, Not Otherwise Specified; PDAC, pancreatic ductal adenocarcinoma; PRAD, Prostate Adenocarcinoma.
S3A–S3C Fig shows DE for pancreatic ductal adenocarcinoma (PDAC) tumors (as an example) compared to normal pancreas. A Multidimensional Scaling (MDS) plot (S3A Fig) reveals clusters for tumors and normal tissue, implying distinct transcriptomic profiles. The more diffuse cluster of PDAC samples likely reflects their heterogeneity. Smear and volcano plots (S3B and S3C Fig) reveal many genes (>5,000) with high, statistically significant DE (FDR ≪ 0.05). S3D–S3F Fig show examples of genes (other than GPCRs) with high overexpression that prior studies implicated as having a role in PDAC. Multiple other tumor types also show expression of genes relevant to the malignant phenotype, cluster separately from their respective normal tissues, and have a large number of genes with DE, thus supporting the validity of our analysis.
Many GPCRs show DE in tumors, including those from each GPCR class: A (rhodopsin-like), B (secretin-like), C (metabotropic glutamate and others), frizzled, and adhesion GPCRs. The highest expressed GPCRs in PDAC tumors (as an example, this finding is generalizable to other tumor types) are generally overexpressed compared to normal tissue and include orphan receptors (e.g., GPRC5A and ADGRF4/GPR115) and GPCRs with known agonists (e.g., GPR68) (Fig 2A and 2C). GPRC5A, the most highly expressed GPCR in PDAC, is 50-fold higher expressed; 95% of PDAC samples have >8-fold higher median GPRC5A expression than in normal pancreas (S3J Fig). Within a tumor type, a large majority of individual tumors express such overexpressed GPCRs at far higher levels than corresponding normal tissue (Fig 3A, e.g., GPRC5A); a subset of GPCRs are expressed in >90% of PDAC tumors at abundances greater than in any normal pancreas sample (Fig 3B). As discussed in subsequent sections, Fig 3B, 3D–3F indicate that GPCR expression is relatively consistent, ubiquitous among members of a given tumor cohort, and largely independent of patient characteristics (e.g., tumor grade, pathological T, and sex, Fig 3D–3F) that one might use to stratify patients. In addition, the GPCRs expressed couple to all major classes of Gα G protein signaling mechanisms (Fig 3C).
(A–B) The 30 highest expressed GPCRs in PDAC (A) or primary SKCM (B) and their corresponding expression in normal pancreatic (A) or skin (B) tissue. Expression data in TPM and CPM for all tumor types and normal tissue can be found at https://insellab.github.io/gpcr_tcga_exp. (C–D) The 30 GPCRs with the highest fold-increase in expression in tumors compared to normal tissue for (C) PDAC and (D) primary SKCM, sorted by fold-increase in tumors compared to normal tissue. Data on GPCR DE used in these plots can be found in S2 Table, sheets 6–8. (E–F) For two highly expressed GPCRs in (A–D) as examples, the median expression of (E) GPRC5A and (F) GPR143 in all tumor types tested and corresponding normal tissue, normalized in CPM, allowing for comparison between tissue/tumor types. A lookup file that enables generation of similar plots (as well as upper and lower quartiles of expression) for any GPCR can be found at https://insellab.github.io/gpcr_tcga_exp. Plots for GPCR expression can be generated using the spreadsheet for visualization of expression provided at https://insellab.github.io/gpcr_tcga_exp. CPM, Counts Per Million; DE, differential expression; GPCR, G protein-coupled receptor; PDAC, pancreatic ductal adenocarcinoma; SKCM, skin cutaneous melanoma; TPM, Transcripts Per Million.
(A) The expression of GPRC5A in all PDAC samples and normal pancreas tissues analyzed. (B) Frequency of 2-fold increase and percent of TCGA-PDAC samples with higher maximal expression compared to normal pancreas of the indicated GPCRs with comparison of the frequency of mutations of KRAS and TP53, the most frequent somatic, nonsilent mutations in PDAC tumors in TCGA. (C) Changes in GPCR expression in PDAC alter the GPCR repertoire that couple to different G proteins. (D–F) Tumor grade (D), pathological T (E), and patient sex (F) do not impact on GPCR expression. The 30 highest expressed GPCRs in PDAC tumors are shown for each case; no statistically significant differences occur between the groups. The numerical values used to generate panels A–F can be found at https://insellab.github.io/data. CPM, Counts Per Million; GPCR, G protein-coupled receptor; GTEx, Gene Tissue Expression Project; PDAC, pancreatic ductal adenocarcinoma; TCGA, The Cancer Genome Atlas; TPM, Transcripts Per Million.
Similar to PDAC, numerous GPCRs are highly, consistently overexpressed in other tumor types. For instance, in skin cutaneous melanoma (SKCM) (Fig 2B and 2D), ADGRG1/GPR56, GPR143, and EDNRB are highly overexpressed and highly expressed in magnitude compared to normal skin in >90% of melanoma samples (S3G–S3I Fig). In general, such highly overexpressed GPCRs are expressed in the vast majority—typically >90%—of samples within a tumor subtype.
Fig 2E shows the median expression of GPRC5A, the highest expressed GPCR in PDAC (and highly differentially expressed, Fig 2A and 2C) and highly expressed in many adenocarcinomas. Fig 2F shows this analysis across all tumor types and normal tissue for GPR143, the highest up-regulated GPCR in SKCM. For many GPCRs, we observe similar patterns of expression with pronounced up-regulation in tumors, as shown in Fig 1. To facilitate exploration of these data, we provide a spreadsheet-based tool (downloadable at https://insellab.github.io/gpcr_tcga_exp) wherein users can generate similar plots (along with upper and lower quartiles of expression) as in Fig 2E–2F for any GPCR gene of interest. Of note, overexpression of certain GPCRs tends to be more prevalent within specific tumor types and/or subtypes than are common mutations. For example, KRAS and TP53 are the most frequently mutated genes in PDAC (>70% and >60% of TCGA samples, respectively), but increased expression of multiple GPCRs occurs with greater frequency (Fig 3B). Each GPCR shown in Fig 3B shows statistically elevated expression in tumors compared to normal tissue, with FDRs ≪ 0.05. The magnitude of DE and corresponding FDR for each GPCR shown are provided in S2 Table and the project website.
A resource for exploring GPCR expression in tumors and normal tissue
We compiled a list of GPCRs overexpressed in solid tumors with fold-changes and FDR along with expression in TPM (for median expression and within-group comparisons of different genes) and Counts Per Million (CPM; for intergroup comparisons of the same gene). The analysis revealed that 35 of 45 tumor types/subtypes show increased expression of >30 GPCRs; 203 GPCRs are overexpressed in at least one type of cancer (S2 and S6 Tables), including 47 orphan GPCRs and >15 GPCRs that couple to each of the major G protein classes. Increased expression of 130 GPCRs occurs in ≥4 tumor subtypes (S2 Table). A subset of GPCRs is overexpressed in many tumors, e.g., FPR3 in 38 of the 45 tumor categories. S6 Table lists other examples along with GPCRs that have reduced expression compared to normal tissue. Importantly, of the 203 GPCRs with increased expression in one or more tumors, 77 are targets for approved drugs. These include ADORA2B, CCR5, and F2R, which are overexpressed in 27, 27, and 26 tumor subtypes, respectively. S6 and S2 Tables provide further details regarding such druggable GPCRs.
Data generated and mined in this study (including DE analysis), renormalized GPCR expression data, expression of every GPCR in every individual tumor sample analyzed, accompanying CNV data, and mutation data for GPCRs are provided in the online resource at https://insellab.github.io. This resource also includes plots of GPCR expression in tumors and corresponding normal tissue (such as those shown in Fig 2A and 2C), MDS scaling plots showing the extent to which tumor and normal samples cluster together based on overall gene expression, the aforementioned lookup tool to plot GPCR expression, and information on GPCR expression in cancer cell lines. In addition, high-resolution images of the heatmaps and phylogenetic trees in Figs 1A and 4A and 4B and S2 are available for download. Thus, while we have shown examples of GPCR expression and DE in certain tumor types in this text, the resource website provides similar information for all GPCRs in all tumor types.
(A) A phylogenetic tree showing the hierarchical clustering of GPCRs from Fig 1A reveals subsets of GPCRs that are either high or low expressed in solid tumors. (B) Hierarchical clustering of types of solid tumors, based on their expression (in CPM) of GPCRs, reveals clusters of tumor types, characterized by expression of particular GPCRs. CPM, Counts Per Million; GPCR, G protein-coupled receptor; TPM, Transcripts Per Million.
Patterns of GPCR expression across solid tumors
Fig 4A shows a heatmap that plots median GPCR expression in TPM across all tumor types and for all nonolfactory GPCRs and reveals that GPCRs can be divided into 4 groups: (a) those widely expressed at high levels in all tumor types (often at >10 TPM), (b) those with intermediate expression more broadly (approximately 1 TPM), but with high expression in specific tumor types, (c) GPCRs with generally very low expression (approximately O[0.1] TPM or less) in most tumor types, and (d) GPCRs that are generally not detected in solid tumors. A phylogenetic (Fig 4B) tree shows the identities of these groups of GPCRs. In general, many of the GPCRs in Fig 4 that are widely expressed in cancer are also widely differentially expressed (Fig 1).
We also performed hierarchical clustering on tumor types to explore whether GPCR expression is distinct in different subsets of solid tumors. Fig 5 shows a phylogenetic tree of solid tumors, based on hierarchical clustering of median GPCR expression in CPM (thereby allowing for comparisons between groups) in each tumor type. Certain types of tumors cluster together, in a manner that corresponds more generally to their biology. For example, nearly all adenocarcinomas cluster together and express a common set of GPCRs (Fig 5). Similar findings are observed for the other clusters. All tumor types within each group had a median expression of these GPCRs ≥ 10 TPM, implying a common GPCR profile for adenocarcinomas compared to squamous cell carcinomas or other groups highlighted. Such widely expressed GPCRs may be potential drug targets. Several of these GPCRs commonly appear across multiple groups/clusters but at different levels of expression. For example, GPRC5A is widely expressed in both adenocarcinomas and squamous cell carcinomas but is typically higher expressed in adenocarcinomas.
GPCR expression is associated with cancer-related pathways and with survival: PDAC as an example
A subset of GPCRs in PDAC is highly overexpressed and prominently expressed in tumors and in PDAC cells (vide infra). Combining expression (normalized to median expression in PDAC) of 5 of the most highly differentially expressed GPCRs (ADGRF1, ADGRF4, GPRC5A, HRH1, and LPAR5) yields a composite “marker” the expression of which positively correlates with a subset of approximately 1,200 genes (Benjamini-Hochberg adjusted p < 0.001) (S5A Fig). A reconstruction of the resulting network of genes using STRING  is shown in Fig 6A.
(A) Network analysis via STRING  of the genes positively correlated with expression of GPCRs highly expressed in PDAC, with FDR < 0.001, revealing distinct clusters of genes associated with specific cancer-related pathways. A high-resolution version of the network is available on the accompanying website. (B) Based on GSEA preranked analysis , KEGG  gene sets with positive enrichment among genes most significantly positively or negatively correlated with expression of the identified subset of overexpressed GPCRs. (C) Analysis via GO [17,18] and Enrichr  of the genes positively correlated in (A), with FDR < 0.001, indicates enrichment for cellular compartments associated with exosomes, microvesicles and others; data shown are for results from Enrichr using the Jensen compartment database . (D) Leading-edge analysis of GSEA results from (B), showing numerous cancer-related genes, including KRAS, which are common to multiple enriched KEGG genes sets. (E) GPCRs in PDAC show a positive, statistically significant association (FDR < 0.001) between their expression and that of markers for various cell types as shown. The top 10 GPCRs associated with each cell type are highlighted, along with the strength/significance of these correlations. (F, G) Kaplan-Meier curves for the impact of combined, normalized expression of subsets of GPCRs on PDAC patients. Total number of patients = 141; 59 patients were censored due to inadequate follow-up. (F) Impact of highly expressed GPCRs on median survival: 652 days (if below median expression) and 470 days (if above median expression) and for (G) impact of highly expressed immune-related GPCRs on median survival: 460 days (below median expression) and 603 days (above median expression). Numerical values used to generate panels B, C, D, F, and G can be found at https://insellab.github.io/data. FDR, false discovery rate; GO, gene ontology; GPCR, G protein-coupled receptor; GSEA, gene set enrichment analysis; KEGG, Kyoto Encyclopedia of Genes and Genomes; PDAC, pancreatic ductal adenocarcinoma.
We conducted further analyses related to PDAC—including with gene set enrichment analysis (GSEA)  of the sets of negatively and positively associated genes (with respect to the 5 gene markers described above), preranked/preweighted by their FDRs—and found an enrichment of a number of KEGG  pathways relevant to cancer (Fig 6B). The set of positively associated genes shows similar associations with cancer-related pathways when analyzed via gene ontology (GO) [17,18] and Enrichr . Enrichr also identifies, based on the Jensen compartment database , an enrichment of vesicle and exosome-related gene products among the set of positively correlated genes (Fig 6C). Network analysis of the genes positively associated with the composite GPCR marker via STRING  provides an intuitive picture of this gene set (Fig 6A). Two “clusters” of genes and pathways are evident: those associated with KRAS (including KRAS itself) and related processes (e.g., focal adhesion pathways) and a second group associated with regulation of cell cycle, cell division, and differentiation. Expression of highly overexpressed GPCRs is positively correlated with one another and with expression of KRAS, implicating this GPCR subset as a PDAC signature. Leading edge analysis of the GSEA results confirmed that KRAS and other oncogenes are common elements in multiple enriched gene sets associated with this GPCR signature (Fig 6D). Survival analysis indicates that GPCR expression has prognostic relevance: patients with above-median expression of the 5 GPCRs had an approximately 200-day-shorter survival compared to those with less than the median expression (Fig 6F).
We tested whether we could identify associations between GPCR expression in PDAC tumors and expression of markers for cell types commonly found in these tumors, thereby perhaps serving as an indicator for the cell types in the tumor microenvironment that express certain GPCRs. The cell types and markers we explored were as follows: tumor epithelial cells (E-cadherin as a marker), cancer-associated fibroblasts (CAFs; Collagen1A1), endothelial cells (Von Willebrand Factor [VWF]), the immune compartment (CD45), T cells (CD3G), myeloid cells (CD33), and macrophages (CD14). Pearson correlations were calculated between expression of each GPCR and of these markers, from which p-values were then calculated, followed by adjusted p-values using the Benjamini-Hochberg method so as to identify GPCRs associated with markers for each cell type. Fig 6E lists the most strongly correlated GPCRs for each cell type. For CAFs and cancer cells, these data appear in excellent agreement with prior experimental results. For example, GPR68/OGR1 is strongly associated with CAFs in our analysis, consistent with evidence that it is a novel functional receptor in PDAC-derived CAFs . Similarly, epithelial-enriched GPCRs (Fig 6E) are expressed in cancer cells  and, as shown below, in cancer cell lines.
PDAC tumors have a subset of highly overexpressed (and highly expressed in terms of magnitude) chemokine receptors (CCR6, CCR7, CXCR3 and CXCR4) not expressed in PDAC cancer cells but likely associated with immune cells and their activation. Genes that correlated with expression of these GPCRs are involved with immune-associated processes, especially T-cell–and B cell–related pathways (S4B Fig). Combined expression of these GPCRs is a positive predictor of survival (Fig 6G). The observation that GPCR expression may be a marker for survival is not unique to PDAC (see subsequent section “GPCRs as potential therapeutic targets in cancer,” where survival is discussed in more detail): In SKCM, ESCA, and liver hepatocellular carcinoma (LIHC), expression of individual GPCRs is associated with survival, but combinations of such GPCRs are even better predictors of survival. GPCR expression may thus serve as a prognostic indicator in multiple tumor types.
The finding that GPCRs with high expression and DE in tumors show an association with tumorigenic pathways appears generalizable. For example, GPCRs that are highly expressed or overexpressed in other adenocarcinomas (Fig 5) are positively correlated with expression of genes from pathways similar to the ones shown in Fig 6A–6C for PDAC, i.e., focal adhesion, cell motility, cell cycle and division. By contrast, GPCR expression of solid tumor types from different branches of the cancer/GPCR phylogenetic tree shows an association with different pathways from those observed in adenocarcinomas. For example, GPR143, EDNRB, and ADGRG1 are highly expressed and differentially expressed in SKCM and are adverse indicators of survival (Figs 10C and 10D and S3G–S3I). Pathways enriched among genes that correlate with expression of the GPCRs are ones implicated in metastatic SKCM (S5A–S5C Fig), such as transferrin transport , melanosome organization , and insulin receptor signaling . In general, highly expressed GPCRs in solid tumors show a positive correlation between GPCR expression and expression of tumorigenic pathways, implicating these GPCRs as potentially “functional oncogenes.”
Functionality of overexpressed GPCRs
Evidence for functional roles in cancer cells of GPCRs that are highly expressed and overexpressed in solid tumors and cancer cells include findings for PAR1/F2R in BRCA , gastric cancer , colon cancer , and melanoma  cells and for PAR2/F2RL1 in melanoma , BRCA , and colon cancer cells . Higher PAR2 expression in ovarian cancer (OV) predicts poorer prognosis . EDNRB, which is highly overexpressed in SKCM, promotes migration and transformation of melanocytes and melanoma cells, and inhibition of EDNRB is pro-apoptotic [32,33]. GPR143 promotes migration  and chemotherapeutic resistance  of melanoma cells. GPR160 and GPRC5A, two frequently overexpressed GPCRs, are orphan receptors that influence the malignant phenotype. Knockdown of GPR160 in prostate cancer cells increases apoptosis and growth arrest . It has been suggested that GPRC5A is an oncogene that promotes proliferation, migration, and colony formation of PDAC cells [11,12]. GPCRs with increased expression may thus be functional in cancer cells and activated by endogenous agonists or have constitutive activity that regulates signaling via heterotrimeric G proteins and/or β-arrestin . At least certain of the many overexpressed GPCRs may thus serve as phenotypic drivers.
Incorporating omics analysis similar to what is presented here, our laboratory has recently shown  that GPR68 (a proton-sensing GPCR) is highly overexpressed in PDAC tumors, in particular in CAFs. We validated these data at the protein level and discovered that GPR68 mediates symbiotic crosstalk between CAFs and PDAC cells and contributes to the tumor phenotype. Such findings provide an example as to how such omics data can identify overexpressed GPCRs with relevance to cancer cells themselves and to other cells in the tumor microenvironment.
Driver mutations, patient sex, and stage/grade of tumors does not impact on GPCR expression
GPCR expression and DE are largely independent of tumor stage and grade. Fig 3D shows the similarity in GPCR expression for Grades 1 to 3 (G1 to G3) PDAC tumors. Median expression of GPCRs was also similar in PDAC tumors with different pathological T (Fig 3E). Similarly, Stage 1 and Stage 3A BRCA infiltrating ductal carcinoma (IDC) Hormone Receptor–positive (HR+) tumors have comparable GPCR expression and DE (Fig 7D).
(A) Correlation of median expression of GPCRs in TP53 mutant and PI3K mutant HR-positive BRCA IDC tumors. (B) Median expression of the 25 highest expressed GPCRs in TP53-mutated tumors compared to expression in PI3K mutant HR-positive BRCA IDC tumors. (C) Fold-changes of GPCRs in TP53 mutant and PI3K mutation HR-positive BRCA IDC tumors compared to normal breast tissue. (D) Fold-changes of GPCRs in Stage 1 and Stage 3 HR-positive BRCA IDC tumors over normal breast tissue. (E) Expression of the 25 highest expressed GPCRs and (inset) correlation of median GPCR expression between primary and metastatic SKCM. (F) Gene expression of primary and distant metastatic SKCM tumors cluster differently and have large numbers of differentially expressed genes (S2 Table). (G) Median expression of highest expressed GPCRs in PDAC tumors compared to cancer cells, including those analyzed via methods in  (CCLE , n = 33; Genentech , n = 16; Witkiewicz and colleagues , n = 72). (H) Median expression of highest expressed GPCRs in CCLE PDAC cell lines compared to other cell lines (as in panel G), primary cells and PDAC tumors. (I) Median expression of highest expressed GPCRs in SKCM tumors compared to cancer cells, including those analyzed via methods in  (CCLE , n = 45; Genentech , n = 44; Müller and colleagues , n = 29). (J) Median expression of highest expressed GPCRs in CCLE SKCM cell lines compared to other cell lines (as in panel I), primary cells and SKCM tumors. Numerical values used to generate panels A–E and G–J can be found at https://insellab.github.io/data. MDS files for panel F and for all other tumor types can be found at https://insellab.github.io/mds_plots. BRCA, breast cancer; CCLE, Cancer Cell Line Encyclopedia; CPM, Counts Per Million; FC, fold-change; FPKM, Fragments Per Kilobase of exon, per Million reads; GPCR, G protein-coupled receptor; HR, hormone receptor; IDC, infiltrating ductal carcinoma; MDS, Multidimensional Scaling; PDAC, pancreatic ductal adenocarcinoma; SKCM, skin cutaneous melanoma; TCGA, The Cancer Genome Atlas; TPM, Transcripts Per Million.
GPCR expression appears largely independent of driver mutations, such as in BRCA HR+ IDC tumors with either PI3KA or TP53 mutations (Fig 7A–7C); both groups have similar GPCR expression and DE of the same GPCRs compared to normal breast tissue. Similar results occur for lung adenocarcinoma (LUAD) and stomach adenocarcinoma (STAD) that have or lack TP53 mutations. Increased GPCR expression in solid tumors may thus not depend on specific driver mutations. The presence of highly overexpressed GPCRs may be a more ubiquitous feature of tumors than the presence of specific driver mutations, as exemplified by PDAC (Fig 3B) and in other tumor types with DE of numerous GPCRs (Table 1).
Moreover, GPCR expression also appears to be independent of a patient’s sex. Fig 3F shows this for PDAC as a representative example, with the 30 highest expressed GPCRs in males and females. This finding appears to be generalizable to other tumor types as well. Thus, the elevated expression of highly expressed GPCRs in tumors appears to occur in nearly every patient within a tumor type and is independent of attributes such as sex, tumor progression, and the mutations present. As discussed below, GPCR expression tends to be independent of CNV. Each tumor type thus expresses a repertoire of GPCRs that is largely conserved among all patients with that type of cancer. This finding suggests the potential relevance of such highly expressed GPCRs as cancer drug targets.
GPCR expression is likely to be similar in metastatic and primary tumors
The SKCM gene expression dataset in TCGA has the most replicates of metastases. Primary and metastatic SKCM show similar expression and DE of GPCRs (e.g., GPR143, EDNRB, and other highly expressed GPCRs) even though major differences occur in overall gene expression between primary and metastatic SKCM (Fig 7E and 7F and S3G–S3I). We found similar results for GPCR expression with primary and metastatic BRCA and THCA tumors and for recurrent and primary ovarian tumors (S6A and S6F Fig), though the number of replicates for each is small (<10). As new databases with large numbers of metastatic and primary tumor become available, we anticipate extending this analysis to strengthen this conclusion.
GPCRs highly expressed in tumors are highly expressed in cancer cells
We assessed RNA-seq data for GPCR expression in cancer cell lines from the European Bioinformatics Institute (EBI) portal generated via the integrated RNA-seq analysis pipeline (iRAP)  for cell lines in the Cancer Cell Line Encyclopedia (CCLE)  and from Genentech . The use of a different analysis pipeline than that used for TCGA data does not allow direct statistical comparisons of the datasets but confirms that most GPCRs in TCGA tumors are present in cancer cells and vice versa. We also mined RNA-seq data for primary melanoma cells  and PDAC cells  from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database. The data from these sources (Methods, Section 1) allow an approximate comparison with data for tumors. S2 Table shows GPCR expression in cancer cell lines.
As an example, GPCRs with the highest median expression in TCGA PDAC tumors are generally highly expressed in pancreatic adenocarcinoma (PAAD; most likely PDAC) cell lines and patient-derived PDAC cells  (Fig 7G). A few exceptions exist, perhaps from effects of cell culture or expression by noncancer cells in tumors. Even so, highly expressed GPCRs in PAAD cells are highly expressed in PDAC tumors (Fig 7H), findings that also occur in other tumors, such as SKCM (Fig 7I and 7J). Thus, most highly expressed GPCRs in tumors are also highly expressed in cancer cells and vice versa.
Most overexpressed GPCRs are rarely mutated
The most frequently mutated GPCRs in solid tumors are rarely overexpressed (Fig 8A and 8B), and conversely, highly overexpressed GPCRs in solid tumors are rarely mutated (Fig 8A and 8B and S7 Table). In SKCM, which has the highest mutation burden among TCGA tumor types, the most highly overexpressed GPCRs (GPR143, EDNRB, and GPR56) are mutated in <2% of SKCM tumors, whereas frequently mutated GPCRs (e.g., GPR98, mutated in nearly 40% of tumors) typically have low expression. The most frequently overexpressed GPCRs across all tumors (e.g., FPR3; S7 Table) are mutated in <1% of all tumors surveyed, compared to frequently mutated GPCRs—e.g., GPR98, GPR112—which are mutated in >5% of all TCGA tumors surveyed. Thus, the frequency of GPCR mutations and likelihood of overexpression do not correlate (Fig 8B). The majority of GPCRs overexpressed in >20 tumor types/subtypes are mutated in <50 samples out of >5,000 TCGA samples surveyed. Furthermore, as discussed in the following sections on GPCR mutation, mutations to these GPCRs are predicted to have no functional impact and are not enriched significantly for mutations at specific sites; thus, overexpressed GPCRs in tumors are not expected to be altered in their function by mutations.
(A) The number of tumors with GPCR mutations and of tumor types/subtypes in which the same GPCR is overexpressed for the 10 most frequently mutated GPCRs in TCGA tumors surveyed and the 10 most frequently overexpressed GPCRs. (B) The frequency of overexpression compared to the frequency of mutations. Numerical values used to generate panels A and B can be found at https://insellab.github.io/data. GPCR, G protein-coupled receptor; TCGA, The Cancer Genome Atlas.
The predicted association of expressed GPCRs with Gα G proteins
S1 Table shows GPCRs annotated in the GtoPdb GPCR database , their signal transduction via G protein heterotrimers, and whether they are orphan GPCRs. Tissues and tumors typically express >150 GPCRs (at detection thresholds >0.1 TPM) that couple to the major types of G proteins (Gs, Gi/o, Gq/11, G12/13), most frequently Gi/Go and Gq/G11 (S7A and S7B Fig). We calculated the abundance of GPCRs that couple to each G protein by summing median GPCR expression (TPM), thereby yielding an expression “repertoire” for each signaling mechanism (S7C and S7D Fig). Gs-coupled GPCRs typically account for the smallest GPCR expression repertoire for which such coupling is known. S2 Table provides GPCR expression and G protein linkage data for normal tissues and solid tumors. Summing expression (TPM) of all GPCRs provides an estimate of the proportion of GPCRs among total mRNA (S7E and S7F Fig).
The GPCR expression repertoire varies among tissues in terms of total expression, number, and identities of GPCRs. Tumors typically have a different GPCR repertoire than normal tissue, with increased or decreased expression of many GPCRs (Figs 1C and S7E and S7F). Total GPCR expression and the number of GPCRs above expression thresholds increases in certain tumors (e.g., PDAC) but decreases in others (e.g., LIHC and SKCM) compared to normal tissue. Tumors also differ from normal tissue with respect to the abundance of GPCRs that couple to different G proteins (Figs 1I and S7C and S7D), suggesting changes in signaling. For example, Gs-coupled GPCR expression decreases in many tumors (e.g., PDAC, Fig 3C), implying decreases in cAMP signaling.
GPCRs as potential therapeutic targets in cancer
Among the >200 GPCRs overexpressed in at least one of the 45 tumor subtypes, 77 are targets for drugs approved by the FDA and/or European Medicines Agency (EMA). (S2 Table). Among these GPCR drug targets, >50% are overexpressed in 4 or more tumor subtypes (Fig 9A), and 15 GPCRs are increased in expression in 10 or more tumor subtypes. These results highlight the potential of GPCRs as targets in cancers and, importantly, for the possible repurposing of drugs approved for other indications. Among the 77 GPCRs with increased expression in tumors, nearly two-thirds link to either Gs- or Gi-coupled signaling and thus are predicted to regulate cAMP formation (Fig 9B).
(A) The number of GPCRs that are targets for approved drugs and have increased expression in 1–3, 4–9, or ≥10 tumor subtypes. (B) The linkage to G proteins of the 77 GPCRs targeted by approved drugs and with increased expression in at least 1 tumor subtype. Note: multiple GPCRs couple to more than one G protein. (C) The number of GPCRs targeted by approved drugs that show increased expression in lung, colon, pancreatic, breast, and prostate cancers, the leading causes of cancer deaths in the US. (D) Overrepresentation of GPCRs among genes with >4-fold elevated expression (FDR < 0.05) for the indicated tumor types/subtypes with p-value calculated via Fischer’s exact test. (E) The magnitude of overrepresentation (relative enrichment) of GPCRs corresponding to the p-value in panel H. Numerical values used to generate panel C can be found in S2 Table, in sheets on DE of druggable GPCRs. Numerical values for panels D and E can be found at https://insellab.github.io/data. ACC, Adrenocortical Cancer; Ad, Adenocarcinoma; BRCA, breast cancer; CESC, Cervical Cancer; COAD, colon adenocarcinoma; DE, differential expression; ESCA, esophageal cancer; FDR, false discovery rate; GPCR, G protein-coupled receptor; Her2, Human Epidermal Growth Factor Receptor-2; IDC, infiltrating ductal carcinoma; KIRC, kidney clear cell carcinoma; KIRP, kidney papillary cell carcinoma; LIHC, liver hepatocellular carcinoma; LSQC, lung squamous cell carcinoma; LUAD, lung adenocarcinoma; NOS, Not Otherwise Specified; OV, ovarian cancer; PDAC, pancreatic ductal carcinoma; PRAD,; SKCM, skin cutaneous melanoma; STAD, stomach adenocarcinoma; TGCT, testicular cancer; THCA, thyroid cancer; UCS, Uterine Carcinosarcoma.
Of the solid tumor types that we analyzed, lung, colon, pancreatic, breast, and prostate cancers account for the largest annual number of deaths in the US (https://www.cancer.gov/types/common-cancers). Fig 9C shows currently druggable GPCRs with increased expression in subtypes of those tumor types. Approved drugs target at least 10 GPCRs that have increased expression in those tumors. Hierarchical clustering of GPCR expression in different tumor types (based on their median expression of all GPCRs) reveals that GPCR expression distinguishes tumor types into groups that are consistent with other molecular/physiological traits (Fig 5). As discussed in sections above, GPCR expression appears to characterize categories of tumors, and certain GPCRs may be targets across tumor classes or families.
As a protein-coding family of genes, GPCRs are disproportionately enriched among overexpressed genes in solid tumors, when compared to all protein-coding genes. Evidence for this was obtained as follows: In each tumor type indicated (Fig 9D and 9E), the ratio of number of coding genes with increased expression (above a prescribed threshold) over the total number of differentially expressed genes present was computed for (a) GPCRs only and (b) all coding genes. Fischer’s exact test was used to verify whether overrepresentation of GPCRs among genes with increased expression is significant (p < 0.05). Data shown are for coding genes with >4-fold increased expression in solid tumors, i.e., highlighting genes with drastic increases in expression. We found that many tumor types have 2-fold or greater overrepresentation of GPCRs among coding genes with large increases in expression.
Moreover, extending from the data shown in Fig 6F and 6G, we find that many GPCRs expressed by solid tumors may be prognostic markers. GPCR mRNA expression is associated with differences in survival in multiple tumor types (Fig 10). Using GPCR expression normalized in CPM, we performed survival analysis using the modified Peto-Peto (mPP) test for every GPCR in tumor types for which sufficient numbers of TCGA replicates were available (typically >60 replicates, allowing for >30 samples in each group for survival analysis). We omitted tumor types such as testicular cancer (TGCT) for which almost no fatalities were recorded in the metadata. For certain tumor types with large numbers of replicates (e.g. OV, LUAD, lung squamous cell carcinoma [LSQC], etc.), we have also provided alternative survival analysis on the related website omitting patients who are >75 years old at time of diagnosis, to reduce the risk of other sources of mortality confounding the analysis. This analysis yielded many GPCRs with a statistically significant impact on survival of each tumor type. Combined expression, i.e., a mean-normalized sum of expression, of GPCRs that individually predict survival and/or are highly differentially expressed yield stronger composite markers with prognostic relevance. Fig 10A–10C shows examples of such combined markers in LSQC (panel A), LUAD (panel B), and SKCM (panel C), which demonstrate strong adverse effects on survival. Certain tumor types—in particular, kidney papillary cell carcinoma (KIRP), kidney clear cell carcinoma (KIRC), LSQC, and LUAD—express individual GPCRs that have statistically significant associations with survival (Fig 10D).
(A–C) Kaplan-Meier survival curves in the indicated tumor types, for weighted, combined expression of the indicated GPCRs. (D) In the indicated tumor types, the number of GPCRs with significant (p < 0.05) association with survival. (E–F) As examples, the impact on mean survival of GPCRs in (E) ESCA adenocarcinoma and (F) SKCM (distant metastases). Numerical values used to generate panels A–C can be found at https://insellab.github.io/data. Numerical values for panels D–F can be also found at https://insellab.github.io/data; in addition, the analysis results, metadata, GPCR expression values, and R-code for generating these survival data can be found at https://insellab.github.io/analyses, under the survival analysis section. ACC, Adrenocortical Cancer; BLCA, bladder cancer; BRCA, breast cancer; CESC, Cervical Cancer; COAD, colon adenocarcinoma; ESCA, esophageal cancer; GPCR, G protein-coupled receptor; KIRC, kidney clear cell carcinoma; KIRP, kidney papillary cell carcinoma; LIHC, liver hepatocellular carcinoma; LSQC, lung squamous cell carcinoma; LUAD, lung adenocarcinoma; NOS, Not Otherwise Specified; OV, ovarian cancer; PDAC, pancreatic ductal adenocarcinoma; PRAD, Prostate Adenocarcinoma; SKCM, skin cutaneous melanoma; THCA, thyroid cancer.
For two tumor types as examples (ESCA adenocarcinoma and SKCM distant metastases, Fig 10E and 10F), we show the difference in mean survival times (in days) between patients with high expression (above median) and low expression (below median); negative values imply that elevated expression of these GPCRs is associated with adverse survival rates. GPCR expression is associated with both negative and positive survival effects, depending on the tumor type and GPCR. Several GPCRs are associated with differences in survival of >1 year. Some GPCRs have low median expression (<1 TPM) in the tumor population in general (e.g., taste receptors TAS2R14 and TAS2R20 in ESCA) but may be expressed in subsets of these populations, wherein they may contribute to differences in survival. Our analysis focuses on higher expressed GPCRs, but we provide (at the accompanying website) expression and DE data for all GPCRs regardless of expression level, since in certain tumors very low-expressed GPCRs may have disease relevance in subsets of patients within a cancer cohort.
In total, we found that 301 GPCRs show statistically significant evidence (p < 0.05) of an association with survival, in at least one tumor type/subtype (among 20 for which such analysis was feasible). A subset of 32 GPCRs was significantly associated with survival in 5 or more of these tumor types/subtypes. All associations of GPCRs with survival in the tumor types tested are available at the accompanying website.
Somatic mutations of GPCRs in solid tumors
Analysis of 5,103 TCGA samples in 20 tumor types (S3 Table; 21 tumor types if one divides ESCA into esophageal adenocarcinoma and squamous cell carcinomas) revealed many GPCRs with frequent nonsilent mutations (Figs 11A and S8A), including a more frequently mutated subset (Fig 11A, inset). GPR98/ADGRV1, the most frequently mutated GPCR, occurs in >8% of TCGA samples. Tumor types with high mutation burdens have a high frequency of GPCR mutations (Fig 11B). SKCM has the highest frequency: approximately 40% of SKCM tumors have GPR98 mutations (Fig 11C and 11H). Approximately 65% of tumors have ≥1 nonsilent GPCR mutation. Certain GPCRs are mutated in >10% of specific tumor types (Fig 11C).
(A) Frequency of GPCR mutations in the TCGA cohort (n = 5,103). Inset: 20 most frequently mutated GPCRs. (B) The average number of all genes (red) and GPCRs (blue) with somatic, nonsilent mutations per tumor genome for the TCGA tumor types surveyed. (C) The number of mutated GPCRs per tumor for several types of solid tumors in TCGA (black), the proportion of samples in each tumor type with at least one mutated GPCR (orange), and proportion with nonsilent mutations for the most commonly mutated GPCR (gray) for each tumor type. (D) GPCR mutation frequency is linearly related to Nmut in SKCM. (E) Probability of GPCR mutation as Nmut increases in SKCM for the 10 most frequently mutated GPCRs. (F) Normalized probability of GPR98 mutation as Nmut increases in SKCM, and the same for several cancers with high mutational burden and combined for BLCA, LUAD, LSQC, COAD, and SKCM. (G) The 20 most frequently mutated GPCRs in primary SKCM, with frequency of mutation and median (and upper and lower quartile) expression in TPM. (H) Expression (CPM) of the 5,000 most abundant genes in SKCM correlate closely in primary SKCM tumors that have or lack GPR98 mutations. (I) MutSig 2CV version 3.1 (gdac.broadinstitute.org) scores in SKCM obtained from https://gdac.broadinstitute.org/, showing the q-values for the significance of mutation scores for each annotated coding gene (blue); GPCR (black) and for those GPCRs, the number of mutation events (red) among the SKCM cohort. Numerical values used to generate all panels of this figure can be found at https://insellab.github.io/data. BLCA, bladder cancer; COAD, colon adenocarcinoma; CPM, Counts Per Million; GPCR, G protein-coupled receptor; LSQC, lung squamous cell carcinoma; LUAD, lung adenocarcinoma; Nmut, Number of Mutations per Genome; SKCM, skin cutaneous melanoma; TCGA, The Cancer Genome Atlas; TPM, Transcripts Per Million.
Nmut, the number of genes with somatic nonsilent mutations per tumor genome, and the number of mutated GPCRs scale linearly in individual tumors (Figs 11D and S8E–S8G). Frequently mutated GPCRs (e.g., GPR98, GPR112, and BAI3) are more likely to be mutated as Nmut increases (Fig 11E, SKCM as an example). The relationship between Nmut and likelihood of GPR98 mutation is similar in SKCM and other cancers (Fig 11F); this is also observed for other frequently mutated GPCRs. Hence, the likelihood of a GPCR being mutated appears to depend on the accumulation of genome damage and to be independent of the mechanisms for the mutations. The relationship between GPCR mutation rates and Nmut is identical in bladder cancer (BLCA), LUAD, and SKCM, although the factors driving DNA damage and oncogenesis are likely different. Mutations of certain GPCRs, such as GPR98, may thus serve as a bellwether for genome-wide DNA damage.
Missense mutations and in-frame deletions are the most frequent nonsilent mutations in GPCR genes (S8C and S8D Fig and S5 Table). Mutations in frequently mutated GPCRs occur at many sites (S9A Fig), which contrasts with the smaller number of such sites in common oncogenes, e.g., KRAS . (S9B Fig). Certain GPCR genes (e.g., GPR98) may be in genomic regions vulnerable to dysregulation of DNA damage and repair and belong to a subset of mutated genes; GPR98 mutations frequently occur alongside other frequently mutated genes such as TTN and MUC16 (S10A–S10G Fig). GPR98 is among the 25 most frequently mutated genes in all tumor types surveyed; its mutational frequency is similar to that of genes (e.g., BAGE2, S8B Fig) that are mutational hotspots . As the GPCR with the largest gene length (approximately 19,000 bp), GPR98 has more mutational events. Compared with other very long genes, e.g., genes > 10,000 bp, GPR98 belongs to a subset of approximately 20 genes with high mutational frequencies (Fig 12A), implicating GPR98 as a hotspot for both silent and nonsilent somatic mutations. GPR98 has a >4-fold increased density of mutational events (normalized for gene length) compared to the average of these very long genes. One obtains a similar result by assessing the density of somatic mutations across all genes, irrespective of length (Fig 12B). GPR98, GPR112, and BAI3 are among the top 5% of genes in number of mutations per unit gene length, highlighting these genes as chromosomal regions with higher than normal rates of somatic mutation.
(A–B) Number of somatic (silent and nonsilent) mutations per unit of gene length (total length of all exons per genes) for (A) all genes and (B) genes >10 kb in length in SKCM. (C–D) Survival analysis in metastatic SKCM for (C) GPR112 and (D) GPR98 mutations, with p-values calculated using the Peto-Peto modification of the Gehan-Wilcoxon test. Numerical values used to generate all panels of this figure can be found at https://insellab.github.io/data. GPCR, G protein-coupled receptor; SKCM, skin cutaneous melanoma.
Survival analysis of metastatic SKCM samples was performed in order to evaluate the impact on tumors of somatic nonsilent mutations to GPR98, GPR112, or other frequently mutated GPCRs. Analysis of Kaplan-Meier survival curves using the mPP method reveals that the presence of somatic mutations in these GPCRs does not have a statistically significant impact on survival. Fig 12C and 12D shows this for GPR98 and GPR112, the two most frequently mutated GPCRs in SKCM. We find the same result in other tumor types as well and thus conclude that somatic nonsilent mutations to GPCRs have no impact on patient survival.
Most mutated GPCR genes have low levels of mRNA expression (Figs 8 and 11G [SKCM as an example]) and so may not be functionally relevant, but certain GPCR genes (e.g., CELSR1 and LPHN2/ADGRL2) are frequently mutated and moderately or highly expressed. Several such GPCRs are orphan receptors (without known physiologic agonists or roles), for which it is unclear whether they impact on cell function. As cell-surface receptors, frequently mutated, well-expressed GPCRs may represent neo-antigens. For SKCM, which has the most GPCR mutations among tumors types surveyed, DE analysis of primary melanomas and distant metastases that have or lack GPCR mutations (e.g., GPR98 and LPHN2) revealed little evidence that these mutations alter the tumor transcriptome, implying that such GPCR mutations are likely passenger, rather than driver, mutations (Figs 11H and S8C and S8D). Conversely, previous work has suggested that for known oncogenes (e.g., for TP53 ), there are often widespread transcriptomic changes associated with specific mutations. We found similar behavior for other tumors (e.g., BLCA) that have frequent GPCR mutations.
As a further approach, we evaluated GPCR mutations, predicting the likelihood of functional consequences and site-specific enrichment of the mutations via MutSig 2CV version 3.1 (gdac.broadinstitute.org). The majority of GPCRs frequently mutated (Fig 11I, SKCM as example) show nonsilent mutations that are nonsignificant in terms of enrichment (compared to the background mutation rate of silent mutations over the same regions) for individual mutation sites. These mutations are not predicted to be functional (calculated from estimations of functional impact of mutations based on whether mutated regions are highly evolutionarily conserved) by MutSig 2CV, consistent with the idea that the frequent GPCR mutations are likely passenger and not driver mutations. MutSig 2CV results for all GPCRs in all tumor types are available at the accompanying website.
CNV of GPCRs in solid tumors
CNV of certain GPCRs occurs frequently in TCGA solid tumor samples (Fig 13A and 13B and S1 Table), with some GPCRs (e.g., GPR160) amplified in >5% of all TCGA samples surveyed. CNV data were obtained as GISTIC 2.0  thresholded data, wherein values of −2, −1, 0, 1, and 2, respectively, denote homozygous deletion, single-copy deletion, diploid copy number, low-level amplification (i.e., increase of 0.1 to 0.9 of copy number, expressed as a log2 ratio), and high-level amplification (amplification of >0.9 of the log2 ratio, i.e., >1.7 extra copies in a diploid cell) . The distribution of amplification events among GPCRs parallels that of all genes (Fig 13C, SKCM as an example), but a subset of GPCRs is disproportionally amplified (Fig 13D–13F and S2 Table). Most GPCRs have infrequent amplification (in 2% of tumors or less). Amplification does not predict high expression or overexpression of GPCRs (Fig 13G–13J; OV as an example): frequently amplified GPCRs in tumors often have limited mRNA expression in those tumors, whereas most highly expressed, overexpressed GPCRs are not amplified.
(A) The number of solid tumors with CNV for each GPCR across all TCGA samples (plotted in descending order of frequency for high-level amplification and homozygous deletions; see text for definition of high- and low-level amplification). (B) The same as in (A) for low-level amplification and single-copy deletions. (C) In SKCM tumors (n = 367), the distribution of high-level amplification of GPCRs (n = 390 genes) compared to that of all protein coding genes (n = 24,776). (D) The total number of homozygous deletions, single-copy deletions, and low-level and high-level amplifications for all GPCRs combined in the 7,545 TCGA tumors surveyed for CNV. (E) The most frequently amplified GPCR for each TCGA tumor type and the proportion of samples with high and low-level amplification for that GPCR. (F) Proportion of samples of various tumors types with high and low-level amplification of GPR160, the most frequently amplified GPCR overall). (G) OV samples with and without high-level amplification of GPR160, with the median for each group indicated. The difference between groups was not statistically significant). (H) The risk ratio of elevated GPR160 expression (above the median value for OV) when GPR160 also shows high-level amplification; amplification of GPR160 increases the likelihood that GPR160 expression is elevated. (I) For the 30 GPCRs in OV with the highest fold-increase in expression relative to normal ovarian tissue (with FDR < 0.05 and median expression in OV > 1 TPM), the corresponding number (of 579) OV tumors with high-level amplification of those same GPCRs. (J) The same as (I) but comparing fold-increase against low-level amplification. Numerical values used to generate all panels of this figure can be found at https://insellab.github.io/data. CNA, copy number amplification; CNV, copy number variation; GPCR, G protein-coupled receptor; OV, ovarian cancer; SKCM, skin cutaneous melanoma; TCGA, The Cancer Genome Atlas; TPM, Transcripts Per Million.
Single-copy/heterozygous deletions of GPCRs are widespread, whereas homozygous deletions are rare (Fig 13A and 13D). GPCR genes with single-copy deletions are generally not significantly expressed in tumors or normal tissues, implying that such deletions lack functional effects, but exceptions exist. PTH1R, frequently deleted in KIRC (approximately 77% of samples have single-copy deletions), has approximately 10-fold reduced expression compared to normal kidney tissue. Similarly, ADRA1A is frequently deleted and has reduced expression in hepatocellular and PRADs.
Fig 13E shows the identity and frequency of amplification of the most frequently amplified GPCR in each tumor type. Several cancers (e.g., OV and LSQC) have a high level of amplification of specific GPCRs in >25% and low-level amplification in >40% of samples. Fig 13F shows the copy number amplification (CNA) frequency of GPR160, the most frequently amplified GPCR overall, among all tumors surveyed. Except for GPR160 and FZD6, most GPCRs with frequent amplification are rarely overexpressed (S6 Table). CNA alone thus does not generally predict increased mRNA expression in tumors compared to normal tissue, and highly expressed GPCRs in tumors are typically not amplified. We tested for association between high-level amplification and increased mRNA expression. Fig 13G shows GPR160 expression in OV tumors with and without high-level amplification as an example; CNA is not a prerequisite for high mRNA expression, and the small difference in median expression is not statistically significant, based on DE analysis via EdgeR. However, tumors with amplification of GPR160 show a higher likelihood (approximately 33%, p = 0.003, Fig 13H) of expressing GPR160 at levels above the median for OV. We did not observe such statistically significant risk ratios relating GPCR expression with amplification for other frequently amplified GPCRs. We conclude that CNA and GPCR mRNA expression are generally weakly associated; therefore, examination of amplified GPCR genes does not predict which GPCRs are highly and/or differentially expressed in a tumor (Fig 13I and 13J, OV as an example).
S1 and S2 Tables provide, respectively, the frequency of GPCR CNV and changes in expression of GPCRs in the tumors surveyed. The widespread CNV of certain GPCRs suggests that they (and/or neighboring genes that vary along with these GPCRs) contribute to the malignant phenotype and may be markers for malignancy.
Discussion and conclusions
In this study, we identified mutations, CNVs, and alterations in mRNA expression of GPCRs in a range of solid tumors. The results reveal a broad landscape of changes, suggesting a functional role for this gene superfamily in such tumors and possible therapeutic utility of GPCRs in a large number of solid tumors.
Mutations of certain GPCRs have been implicated in cancer , but a comprehensive analysis of GPCR amplification, expression, and DE has been lacking. The largest public datasets of normal (GTEx)  and cancer tissues (TCGA) provide RNA-seq data analyzed and/or normalized differently, making it difficult to compare datasets. For many tumor types, few replicates of “normal” TCGA tissue are available, and samples from matched non-tumor tissue from patients may not be representative of normal tissue (Methods, Section 1). The TOIL project enabled comparison of TCGA and GTEx data with analysis of both RNA-seq datasets via the same pipeline.
Our analysis identified frequently mutated GPCRs (e.g., GPR98/ADGRV1 and GPR112/ADGRG4) in multiple cancers, especially melanoma (SKCM). GPCR mutations appear to reflect accumulation of DNA damage and mutations across the genome and may be tumor markers for this process. Multiple prior studies (e.g., [8,46]) have suggested that GPCRs that are frequently mutated (e.g., certain adhesion GPCRs) in tumors should be further studied for their role as drivers of tumor progression and as targets for novel therapeutics. Several observations lead us to question this idea:
- The absence of enrichment of mutations in specific sites or regions of frequently mutated GPCRs
- The analysis from MutSig 2CV indicating that such mutations are not enriched at specific residues at rates statistically elevated above the background mutation rate, i.e., the rate of nonsynonymous mutations is not statistically elevated, adjusted for the rate of synonymous mutations
- The general low expression of frequently mutated GPCRs at the mRNA level
- The lack of impact of GPCR mutations on survival
- The lack of effect of GPCR mutations on gene expression
While GPCRs are of interest in cancer due to their high mRNA expression, using their frequency of mutation as a rationale to choose particular GPCRs as potentially functionally important or possible therapeutic targets is likely a flawed approach—a conclusion that contrasts with previous suggestions about mutated GPCRs. Our data also raise a cautionary point about mutations: frequency of mutation of a gene is insufficient evidence to suggest its importance as a potential oncogene or therapeutic target without additional analyses such as those noted above.
While we conclude that frequently mutated GPCRs are unlikely to be targets for therapeutic intervention in solid tumors, such GPCR genes may be useful for studies related to oncogenesis. For example, GPR98 is frequently mutated in a number of tumor types and is one of several genes with a high frequency of somatic mutations per kilobase of coding gene length, suggesting that the GPR98 gene lies on a chromosomal region particularly vulnerable to accumulation of DNA damage. Mutation to this gene seems to scale with mutational burden, in a manner that is independent of tumor type: in tumors such as SKCM and BLCA, in which DNA damage from environmental factors may occur via different mechanisms, the accumulation of damage at the GPR98 locus in response to genome-wide damage appears to occur at a constant rate. This rate is elevated relative to the rest of the coding genome in terms of mutation rate per kilobase. Understanding how and why this occurs may provide additional insight into the mechanisms that drive DNA damage.
Known driver mutations do not appear to influence GPCR expression in tumors, but we excluded rare mutations. In order to ensure large numbers of replicates and high statistical significance, we analyzed tumors with high-frequency mutations (e.g., TP53 or KRAS) . Highly expressed GPCRs are widely expressed among replicates of specific tumor types and are more prevalent in tumors than are common driver mutations. The overrepresentation of GPCRs among protein-coding genes with increased expression in solid tumors supports the hypothesis that the elevated expression of specific GPCRs is a hitherto underappreciated feature of solid tumors.
The finding that many GPCRs show altered mRNA expression in tumors at rates higher than occurs for coding genes in general raises the question: what causes GPCR-specific changes in expression? Regulation of GPCR mRNA expression is poorly understood; therefore, exploration of potential mechanisms for such regulation or dysregulation is of interest. Studies of GPCRs with DE in tumors may shed light on such mechanisms and perhaps also have relevance for other disease settings with altered GPCR expression.
Given the widespread changes in copy number that occur in cancer, especially CNA, which might explain the increased expression of particular GPCRs in tumors, we tested but failed to find that CNV can generally explain altered GPCR mRNA expression or DE. CNV is not stochastically distributed among the GPCR family; certain GPCRs are more frequently amplified (e.g., GPR160) or deleted (e.g., PTH1R in KIRC). Amplified and deleted GPCRs may have potential as biomarkers, as has been suggested for other genes with frequent CNV . Our findings with respect to CNV and GPCRs raise a broader concern and cautionary note: changes in copy number of a gene should not be taken as evidence of dysregulation of expression of that gene, unless one obtains clear evidence for corresponding changes at the mRNA level. In particular, it is unclear that CNA of GPCR genes leads to elevated expression of those GPCR mRNAs in tumors.
Numerous solid tumors have increased mRNA expression of large numbers of mostly nonmutated GPCRs: 72 GPCRs are overexpressed in >10 tumor subtypes, implying that common mechanisms may regulate GPCR expression in such tumors. Highly overexpressed GPCRs are potential candidates as drug targets. Of note, 77 such overexpressed GPCRs are targets of approved drugs that have the potential to be repurposed to treat tumors. The similarity of GPCR expression in primary tumors and metastases supports such therapeutic potential.
The data reveal that clusters of GPCRs may be prognostic indicators for survival and provide a molecular signature of the malignant phenotype. GPCRs whose expression adversely or positively predicts survival are candidates for antagonists or agonists, respectively, as novel cancer drugs. Hierarchical clustering of tumor types based on GPCR expression identifies groups of tumors consistent with other molecular or phenotypic features of these tumors. Thus, the tumor GPCRome appears to be predictive of the broader molecular landscape of tumors.
Do GPCR mRNA data predict protein expression? Direct quantification of GPCR proteins is challenging, due to their generally low abundance and paucity of well-validated antibodies. However, mRNA expression of GPCRs, especially highly expressed GPCRs, generally predicts the presence of functionally active GPCRs in human and animal cells [48–52]. In contrast to earlier ideas, recent evidence supports the view that mRNA expression broadly predicts protein expression [53–57] (S1 Text and references [58–59] within). As an example, GPRC5A protein and mRNA abundance are concordant (S2 Text and S11 Fig). GPCR detection via mass spectrometry has been challenging; proteomics data (e.g., ) indicate that, at present, few GPCRs are detectable by such methods, likely due to the low abundance of GPCR proteins. As noted in Results, functional evidence is available for numerous GPCRs with DE in solid tumors. As cell-surface proteins enriched in tumors and cancer cells, certain GPCRs may represent novel tumor-associated proteins that might be targeted for diagnosis and/or treatment.
As a caveat to those ideas regarding mRNA and protein expression, most analyses on their concordance cited above have used model organisms. Such concordance may be less evident in native mammalian cells, particularly in certain cell and tissue types, as observed by Edfors and colleagues . It is unclear whether GPCR mRNA expression predicts protein expression and functional effects in native cells and/or tissues. However, our recent work on Gq-coupled GPCRs in pancreatic cancer cells indicates a concentration-response relationship between GPCR mRNA expression and function, such that high GPCR expression corresponds to strong signaling and functional responses . Previous work on GPCR mRNA expression in native cells has shown robust functional effects and/or protein expression in a range of settings, including cancer, for GPCRs identified as highly expressed, based on omics methods (reviewed in ). We thus suggest that highly expressed GPCRs (e.g., those expressed at ≥5–10 TPM from RNA-seq data) are very likely to be functional.
GPCR mutations, CNV, and DE thus occur at a high frequency in solid tumors. Therefore, this receptor superfamily may have unappreciated functional roles in such tumors, especially because GPCR expression appears to be largely independent of tumor grade or stage and mutations. Our results imply that new insights may derive from further studies of GPCRs regarding mechanisms of gene expression and phenotype in solid tumors and perhaps other cancers. Of particular—and perhaps rapid—translational importance is the potential of GPCR-targeted drugs, including FDA/EMA-approved drugs, that might be repurposed as therapeutics for a variety of solid tumors.
The table below (Table 2) lists all of the tools used in the analysis performed, as well as links where readers can find relevant data. All tools and data used are free to use and openly accessible, as are all results from this study.
1. Contact for resource sharing
A website has been created for sharing all data at https://insellab.github.io/. Links to data files will be posted on this website following peer review. All data provided will be open access following peer review; information about how to cite data generated from this study is available at https://insellab.github.io/.
2. Details of methods
2.1. DE analysis.
Gene expression for the GTEx and TCGA datasets, assayed via RNA-seq, was downloaded from the UCSC Xena Portal (xena.ucsc.edu). For DE analysis, RSEM expected counts were obtained, which were computed via the TOIL pipeline, described in  and available at the Xena Portal by the authors of the TOIL project (https://xenabrowser.net/datapages/?host=https://toil.xenahubs.net).
The analyzed data from the TOIL project were generated as follows. Merged FASTQ files were adapter-trimmed via CUTADAPT, followed by alignment via STAR . Gene expression was then quantified using RSEM . The HG38 reference genome, with Gencode version 23 annotations, was used in the TOIL analysis. For this study, both RSEM estimated counts (for DE analysis) and RSEM TPMs (for evaluating magnitudes of expression) were used.
Files were accessed from https://xenabrowser.net/datapages/?dataset=tcga_gene_expected_count&host=https://toil.xenahubs.net for TCGA expected counts (version 2016-09-01) and https://xenabrowser.net/datapages/?dataset=gtex_gene_expected_count&host=https://toil.xenahubs.net for GTEx expected counts (version 2016-05-19).
Expression in TPMs for GPCRs was queried via https://xenabrowser.net/heatmap/ for both TCGA and GTEx.
Following the download of gene expression data, corresponding files on sample phenotype were obtained from the relevant links hosted at https://xenabrowser.net/datapages/?host=https://tcga.xenahubs.net for TCGA and at https://xenabrowser.net/datapages/?cohort=GTEX for GTEx samples, respectively. Samples were then grouped for DE analysis based on attributes such as tissue type and tumor type.
We used a table of estimated counts from tumor and normal tissue as input for analysis in edgeR , yielding normalized abundance in CPM (using TMM normalization) and DE, showing magnitude of fold-change and statistical significance estimated by FDR. We estimated fold-changes of gene expression in tumors compared to normal tissue via an exact test. Genes that changed with FDR < 0.05 were considered statistically significant; however, we focused attention on GPCRs that, besides low FDRs, are also expressed at >1 TPM in tumors, because high expressed GPCRs are likely of greater interest. The 20 TCGA tumor types were divided into 45 subtypes/categories (Table 1) based on histological classification. Different tumor subtypes show distinct GPCR expression, e.g., Classical versus Follicular THCA, Triple negative versus Her2 BRCA IDC (“BRCA IDC”), and Esophageal (ESCA) squamous cell carcinoma versus Adenocarcinoma (S1A–S1F Fig), thus requiring this subdivision into tumor subtypes.
In addition to the standard TMM approach in edgeR, we tested upper-quartile normalization before conducting DE analysis in edgeR. The two methods yielded nearly identical results (S11C and S11D Fig). Log2 fold-changes for all genes and GPCRs show that expression changes calculated by both methods are closely correlated, with nearly identical magnitude. We also evaluated DE via EBseq . EBseq and edgeR yielded very similar results (S11A and S11B Fig), in particular for GPCRs, implying that assumptions implicit in the DE analysis via edgeR/TMM normalization do not skew or bias the results. As an empirical test, we verified that GPCRs that show large fold-changes between tumor and normal samples also show large differences in normalized gene expression in TPM, e.g., EDNRB, GPR143, and ADGRG1 in SKCM (S2F–S2H Fig).
2.2. Database of normalized GPCR expression in tumors and normal tissue.
For GPCR expression in tissue, expression as TPM (a normalization of gene abundance that corrects for effective length of genes) and CPM (number of times a gene is encountered per million reads, hence normalization for library size without length normalization) are provided in S2 Table.
Data in TPM are provided for assessing the relative abundance of members of a gene family, such as GPCRs, within an individual sample or a set of biological replicates, while data in CPM are provided for comparing a specific gene between multiple groups of dissimilar samples, where length normalization is problematic because dissimilar datasets may be normalized differently. We provide GPCR expression in both formats to facilitate different approaches for analysis.
This resource thus enables an estimate of GPCR abundance in more rigorous terms than comparing Fragments Per Kilobase of exon, per Million reads (FPKM)/TPM values for a particular gene across different tissue types. Furthermore, this approach provides normalized gene expression estimates (in TPM and CPM) using the same units and analysis methods for both normal and cancer tissue, thus allowing for direct comparison.
Gene abundances in CPM were calculated via EdgeR. In several cases, the same normal tissue dataset was used to compare multiple tumors (e.g., GTEx kidney data were used for comparison with KIRP, KICH, and KIRC). In these cases, the calculated CPM values for normal tissue from each analysis were (as expected) very similar but not equal, as EdgeR TMM normalization yields slightly different normalization factors in each case. The CPM values presented for these tissues are therefore average values (e.g., for CPMs in normal kidney tissue, the values provided are the average of the data obtained from comparisons of normal kidney with KIRP, KICH, and KIRC, respectively). The normal tissue types in which this was performed are breast, lung, and kidney.
2.3. GPCR mutation and copy number analysis.
For each cancer type, tables of somatic, nonsilent mutations (gene-level) and somatic mutations (SNPs and small INDELs) and gene-level GISTIC2 thresholded CNV were obtained using https://xenabrowser.net/datapages/?host=https://tcga.xenahubs.net and links within.
For mutation data, we used results obtained via the Broad Automated Pipeline, where available. In other cases, we used data from the Baylor College of Medicine sequencing center. The source of the mutation data is indicated on the respective downloadable files and S3 Table. In all cases, the HG19 reference genome was used for calling mutations. Mutation data for genes coding for GPCRs were extracted as part of the present study; all GPCR mutation data are available for download as supplemental material.
TCGA copy number estimates were obtained using Affymetrix SNP 6.0 arrays. The data were analyzed via GISTIC 2.0  to obtain gene-level estimates of CNV. The resulting “thresholded” GISTIC 2.0 data yield values of −2,−1, 0, 1, and 2, indicating homozygous/two-copy deletion, heterozygous/single-copy deletions, no change, low-level amplification, and high-level amplification, respectively. CNV for GPCR genes in each tumor type was extracted and is available as downloadable material.
2.4. Which genes are included in this analysis?
We evaluated all GPCRs annotated by the GtoPdb maintained by IUPHAR/British Journal of Pharmacology . This list primarily focuses on endo-GPCRs (GPCRs natively expressed in peripheral tissue that possess endogenous ligands and receptors primarily used as drug targets). The IUPHAR list includes taste and vision GPCRs but not olfactory receptors. We excluded several GPCRs annotated by IUPHAR but that are thought to be pseudogenes. The list of GPCRs in this analysis is provided in S1 Table.
For these annotated GPCRs, we included information about their linkages to G proteins and their status as orphans or not. For nonorphans, an example of an endogenous ligand is provided. These data are almost entirely based on information available at the IUPHAR website cited earlier. In a few cases in which such information is not provided by IUPHAR, we have used other literature sources for annotation.
2.5 GPCR expression in cancer cells mined from other sources.
GPCR expression in a range of cancer cell lines was queried via the EBI Expression Atlas (https://www.ebi.ac.uk/gxa/home) for cell line profiles part of CCLE  as well as by Genentech , profiled via RNA-seq. These data were analyzed as part of the EBI Expression Atlas via the iRAP bioinformatics analysis pipeline, described in detail in  wherein gene expression was computed in FPKM, a length-normalized expression abundance estimate analogous to TPM units, used for TOIL TCGA data. Precisely, statistically relevant comparisons between gene abundances in TPM and FPKM are not feasible; however, empirical comparisons between such datasets are possible. In general, genes with high abundances in TPM or FPKM will be highly expressed relative to other genes within sets of samples; therefore, our comparison of CCLE and other cell-based data versus TOIL TCGA data serves as an empirical confirmation of the fact that GPCRs highly expressed in TCGA tumors are also present in cancer cells.
Normalized gene expression in cancer cells from other sources [40,41,66] was obtained via NCBI GEO, wherein analyzed RNA-seq data with quantification of gene expression were provided. As with the data from EBI above, such data allowed for empirical comparisons versus TCGA TOIL data to confirm the presence of GPCRs in cancer cells, which were also detected in tumors.
3. Quantification and statistical analysis
DE analysis was performed in the R software environment via EdgeR , as discussed above in section 2.1. We used the following criteria to evaluate GPCRs with significant DE:
- FDR < 0.05. In the majority of cases, genes with a high fold-change also showed FDRs ≪ 0.05.
- Magnitude of fold-change > 2-fold (increase or decrease).
- Magnitude of expression > 1 TPM median expression in tumors, as calculated by RSEM in the TOIL pipeline, discussed in section 2 above. We focused on genes with significant DE and high expression because our primary goal was to identify GPCRs that may be drug targets and/or biomarkers.
For compilation and distribution of data, we assembled data files primarily in Microsoft Excel, with files stored in .xlsb format.
Plots of normalized expression in tumors and normal tissue, whether in TPM or CPM, show median expression for respective cohorts, along with upper and lower quartiles, as indicated in figure legends where applicable.
The numbers of replicates in each sample group/category of normal tissue and tumors are provided in S3 and S4 and 1 Tables, respectively. Tumor types with 10 or more replicates were considered for DE analysis. A small number of samples (≪1% of TCGA samples studied) were excluded because they were duplicated in downloaded databases from TOIL and showed discrepancies between expression in these downloaded data versus expression data queried via the visualization tool on the TOIL website. These discrepant samples are provided on a downloadable list at insellab.github.io.
DE results presented in this text are from comparisons between TCGA and GTEx samples, but we also include—in our MDS analysis and in all downloadable counts files—data for TCGA-matched “normal” samples taken from tissue adjacent to tumors of TCGA patients. In general, normal TCGA and GTEx samples cluster closer together than do TCGA tumors and GTEx normal samples (S1G–S1J Fig).
The overlap, however, is not exact. In several cases, we found differences between TCGA normal and GTEx samples. It is unclear whether these differences result from biological or technical factors; prior data show that tumors impact surrounding “normal” tissue and can also induce global changes [67–70]. Therefore, we have not used batch-correction methods to account for these variations between TCGA normal and GTEx tissues. In general, DE of GPCRs is similar whether TCGA normal tissue or GTEx tissue is compared to TCGA tumor samples (e.g., S11E and S11F Fig), suggesting that such differences are unlikely to impact upon the general conclusions of this study.
For TCGA data, some recent efforts (e.g., http://bioinformatics.mdanderson.org/tcgambatch/) have been made to account for batch effects, though peer-reviewed studies have not yet established best practices for batch correction of TCGA data. Many TCGA datasets for individual tumor types contain numerous batches (with batches defined in terms of factors such as sequencing runs, or location of tissue collection), with small numbers of replicates in each batch. Given this, it is unclear whether such batch corrections account for technical variation among batches or merely suppress biological variation, especially with the known heterogeneity among tumor samples. In light of this, we present all data from TCGA and GTEx without correcting for batch effects.
In nearly all cases, the DEs of GPCRs that we highlight have large fold-changes with high statistical significance (i.e., FDR ≪ 0.05), such that minor technical variations ought to not substantially impact our key findings. Moreover, the fact that TCGA tumors and GTEx normal tissues form distinct, separated clusters (and thus show a high degree of DE) is unlikely to be due to technical factors. In several cases (e.g., KICH matched normal versus GTEx kidney samples; S1G Fig), TCGA matched normal and GTEx normal tissues are in fact highly similar, whereas in other cases they are not (e.g., PRAD, S1J Fig). This suggests that technical factors between the two studies do not consistently skew or bias the two datasets versus each other.
Analysis to identify dependence of GPCR expression on patient characteristics, such as sex or tumor stage, was done as follows: GPCR expression for tumor samples normalized in CPM, output from EdgeR, was used to evaluate whether grouping samples together based on a specified attribute (e.g., sex) resulted in a statistically significant relative risk for that attribute tested for results in elevated GPCR expression in one group compared to the other. We tested this using Fisher’s exact test in R, yielding calculations of risk ratio and p-values, with the terms in the contingency table for Fisher’s test being the number of samples in each group with GPCR expression either above or below the population median. p-Values were then adjusted for multiple testing using the p_adjust function in R, utilizing the Benjamini–Hochberg method. These adjusted p-values thus indicate whether any GPCRs have expression that is significantly associated with a given patient attribute. Specific attributes tested for were sex, tumor stage, grade, and pathological T (depending on availability of metadata for each tumor type and the number of available replicates). This method was also used to evaluate the significance of associations between expression of GPCRs and presence of specific driver mutations (e.g., presence or absence of mutations to TP53 or KRAS) and association between GPCR mRNA expression and the thresholded GISTIC 2.0 CNV call.
Survival analysis was performed in R using gene expression data normalized in CPM (i.e., units appropriate for comparisons between samples) via the “survival” and “survminer” packages. For each tumor type analyzed, samples were divided into two groups based on median expression of the gene being tested, and differences in survival between the two groups were calculated. The mPP method was used to estimate the statistical significance of differences in survival rates. We noted cases in which the effects of GPCR expression on survival were related and/or coupled (hence, the presence of composite markers of survival noted in the text). Thus, these statistical tests were not independent; given a lack of understanding about the nature of such dependencies (which, to our knowledge, were hitherto unknown), we did not adjust p-values for multiple testing. We also note that (a) due to our use of relevant units for normalizing the gene expression data prior to performing survival analysis (CPM instead of units such as FPKM or RPKM) and (b) the subdivision of broad TCGA categories into appropriate subtypes of tumors with consistent histological classification (e.g., dividing ESCA into adenocarcinomas and squamous cell carcinomas and analyzing each separately), our analysis yields associations of genes with survival that may differ in some cases from other sources that also provide survival analysis of TCGA data. We selected a significance threshold of p < 0.05, which is relatively lenient for survival analysis, as our initial priority was to minimize false negatives; we anticipate that these analyses will need subsequent validation efforts in specific tumor types, with larger numbers of patients.
S1 Fig. Comparisons between GTEx normal samples and different TCGA tumor subtypes.
(A–F) DE of GPCRs differs in different cancer subtypes within the same cancer category. (A) The repertoire of overexpressed (OE) GPCRs in Her2-positive and triple-negative BRCA IDC (breast adenocarcinoma, IDC). (B) For commonly OE GPCRs, the correlation of magnitude of fold-changes in expression in each tumor subtype compared to normal breast tissue. (C) The repertoire of OE GPCRs in classical and follicular THCA. (D) For commonly OE GPCRs, the correlation of magnitude of fold-changes in expression in each tumor subtype compared to normal thyroid tissue. (E) The repertoire of OE GPCRs in ESCA adenocarcinoma and squamous cell carcinoma. (F) For commonly OE GPCRs, the correlation of magnitude of fold-changes in expression in each tumor type compared to normal esophageal mucosal tissue. BRCA IDC, either Her2-positive or triple-negative, overexpresses a number of GPCRs. Several of these GPCRs are commonly overexpressed (A), but others are OE in one type but not the other. In general, fold-changes of commonly overexpressed GPCRs correlated among cancer subtypes, but often with some scatter (B). Similar results are found in other tumors (e.g., C–D), showing the degree of overlap of overexpressed GPCRs in classical or follicular THCA. Further, in tumors that occur in the same tissue but with different precursor cells (e.g., squamous cell carcinomas versus adenocarcinomas), the repertoire of differentially expressed GPCRs is distinct. Panels E–F illustrate this for ESCA. Thus, in general, tumor types and subtypes with distinct histological classification possess distinct repertoires and changes in expression of GPCRs. (G–J) Differences between TCGA-matched “normal,” GTEx normal tissue, and tumors (KICH, LSQC [NOS]). MDS plots indicate that in some cases (G, H), TCGA-matched normal and GTEx normal tissue are similar, whereas in others (I, J), LSQC (NOS) and PRAD TCGA matched normal and GTEx normal samples differ, although these differences are smaller than the differences between normal tissue (from either source) and tumors. Differences in tumor biology with different tumor types influencing surrounding tissue to different degrees may explain the apparent differences in the “normal” tissue in the TCGA samples. Numerical values used to generate panels A–F of this figure can be found at https://insellab.github.io/data. MDS plots for tumor and normal tissues can be found at https://insellab.github.io/mds_plots.
S3 Fig. DE of genes between PDAC tumors and normal pancreatic tissue.
(A) MDS plot of gene expression in normal pancreatic tissue and PDAC tumors. (B) Volcano plot showing significantly differentially expressed genes (FDR < 0.05) in red, with FDR plotted against fold-change. (C) Smear plot showing genes with significant fold-change (red), with fold-change plotted against magnitude of gene expression in CPM. (D–F) Expression of MMP11, S100A6, and LGALS3 in all samples for PDAC and normal pancreas, with medians (dashed lines) also indicated. (G–I) Expression of EDNRB, ADGRG1, and GPR143 in all samples for primary and distant SKCM and normal skin, with medians (dashed lines) also indicated. (J) The fraction of PDAC tumors that express GPRC5A above the indicated thresholds, compared to median expression in normal tissue. MDS plot for part A can be found at https://insellab.github.io/mds_plots. Numerical values for all other plots can be found at https://insellab.github.io/data.
S4 Fig. DE of genes between PDAC tumors and normal pancreatic tissue.
(A) The number of patients whose survival was tracked in the TCGA PDAC cohort at each time point, along with the rates of dropout and mortality. (B) Network construction via STRING of the genes the expression of which correlates with that of CCR6, CCR7, CXCR3, and CXCR4. Numerical values for panel A can be found at https://insellab.github.io/data.
S5 Fig. Pathways enriched among genes associated with expression of GPCRs in SKCM.
(A) The combined, weighted expression of GPR143, ADGRG1, and EDNRB in SKCM shows positive correlation with expression of a subset of nearly 2,000 genes. (B) Network construction via STRING of the top 500 most strongly correlated genes from (A) illustrating the presence of genes related to the melanosome and to insulin response as examples of cancer-associated pathways in SKCM. (C) Analysis of the 500 most strongly correlated genes via Enrichr shows enrichment of pathways such as transferring signaling, insulin response, etc. among these positively correlated genes. Numerical values for panel C can be found at https://insellab.github.io/data.
S6 Fig. Additional results on GPCR expression and DE: Metastatic versus primary tumors, primary versus recurrent tumors, and normal melanocytes versus melanoma cell.
(A–F) GPCR expression in metastatic and recurrent OV, thyroid cancer, and BRCA is similar to that in primary tumors. Most TCGA tumor types have few replicates of metastases or recurrent tumors. However, for those with available data (SKCM, Fig 7E; BRCA, THCA, and OV in this figure, discussed below), we tested whether GPCR expression is similar in primary tumors and metastases and in recurrent tumors. Panels A–F show that recurrent ovarian cancers, metastatic THCA (classical) tumors, and BRCA IDC, respectively, have similar GPCR expression, with identities of expressed GPCRs and magnitude of expression similar to primary tumors. One exception in BRCA IDC was NPY1R, which is more highly expressed in metastases than in primary tumors. All BRCA IDC subtypes were combined (e.g., Her2+, triple negative) for these analyses as there were only 6 metastases, precluding comparison of metastases in the different BRCA IDC subtypes. (A) Correlation of expression of the 100 highest expressed genes in primary OV with that of recurrent tumors. (B) Identity and relative expression of the 30 highest expressed GPCRs in primary and recurrent ovarian tumors. (C) Correlation of expression of the 100 highest expressed genes in primary THCA (classical) compared to that of metastatic tumors. (D) Identity and relative expression of the 30 highest expressed GPCRs in the primary and metastatic thyroid tumors. (E) Correlation of expression of the 100 highest expressed genes in primary and metastatic BRCA (IDC, all types combined), excluding NPY1R, which is much higher expressed in metastatic than primary tumors. (F) Identity and relative expression of the 30 highest GPCRs (including NPY1R) in primary and metastatic BRCA IDC tumors. (G, H) GPCR expression in nondiseased melanocytes differs from that in melanoma cells. (G) GPCR expression in low-passage melanoma cancer cells (data mined from Müller and colleagues ) indicates that most highly expressed GPCRs in TCGA tumors are also highly expressed in melanoma cells. By comparison, nondiseased melanocytes  typically show much lower expression of these GPCRs than in melanoma cells. (H) GPCR expression (in FPKM) in low-passage melanoma cells and melanocytes is very poorly correlated. Numerical values for the plots in all panels can be found at https://insellab.github.io/data.
S7 Fig. The GPCR expression “repertoire” of normal tissues and solid tumors.
(A, B) The number of GPCRs, orphan GPCRs, and GPCRs that couple to each G protein class in normal tissues (A) and solid tumors (B) and that have ≥0.1 TPM median expression. (C, D) GPCRs that couple to different G proteins compared to the total GPCR expression repertoire in normal tissue (C) and solid tumors (D). (E, F) GPCR expression (TPM) in normal tissue (E) and solid tumors (F), and the number of GPCRs detected at different thresholds of expression. GPCRs typically account for <0.1% of the tissue and tumor transcriptomes. Numerical values for all panels can be found at https://insellab.github.io/data.
S8 Fig. GPCR mutation events and their relationship to genome-wide mutations.
(A, B) Missense mutations are the most frequent type of nonsilent mutation of GPCRs in TCGA tumors, with GPR98/ADGRV1 the most frequently mutated GPCR. (A) The number of each type of somatic mutation for GPCRs in TCGA tumors surveyed (n = 5,103 tumors with 32,727 somatic mutational events in GPCRs. (B) The same analysis for SKCM, which has the highest number (11,348) of GPCR mutational events, i.e., more than one-third of all somatic mutation events but only approximately 9% of TCGA samples. S5 Table lists the total number of mutation events for the most commonly mutated GPCRs; a complete list is provided in S1 Table. GPCRs with the most frequent mutation events are also mutated in the largest number of tumors (Fig 10A). In the 5,103 TCGA tumors surveyed, missense mutations are the most frequent type of nonsilent GPCR mutation, occurring approximately 10-fold more frequently than frameshift deletions, the next most common type of nonsilent mutation. Missense mutations were the most common type of mutational event for GPCRs in all cancer types except LIHC, which had a high frequency of frameshift deletions (n = 1,875 events), with GPR98 the most frequently so mutated (n = 79 events). (C) The number of somatic mutation events in all TCGA tumors for all annotated GPCRs. Inset: the number of mutation events for the 10 most frequently mutated GPCRs. These data mirror those in Fig 10 that show the number of tumors in TCGA that possess somatic, nonsilent mutations to GPCRs. (D) The most frequently mutated GPCR and non-GPCR genes in solid tumors: the number of tumors across all tumor types surveyed with somatic nonsilent mutations for the genes indicated. (E–G) The number of mutated GPCRs in a tumor scales linearly with Nmut, the number of mutated genes per tumor genome. For SKCM, LUAD, and LSQC, their number of mutated GPCRs increases linearly with Nmut. This linear relationship is found with other tumor types and is nearly identical among tumor types, implying a general pattern of accumulation of genome-wide mutations and mutations in the GPCR superfamily. Numerical values for all panels can be found at https://insellab.github.io/data.
S9 Fig. Additional results on the nature of nonsilent GPCR mutations compared to certain other non-GPCR mutations.
(A, B) Location of silent and nonsilent mutations of GPR98/ADGRV1 in TCGA SKCM tumors (A) and KRAS mutations in PAAD tumors (B) accessed via Xena (xena.ucsc.edu). Data are shown for 204/472 SKCM samples in which GPR98 has silent or nonsilent mutations. Introns are not included; hence, the figure shows exonic locations of mutations; 132/186 PAAD samples had somatic KRAS mutations. Vertical gray bars indicate exons. Mutations of GPR98/ADGRV1 are distributed along the length of the gene and are not enriched at specific locations or exons. Thus, large portions of GPR98 represent mutational hotspots, findings that contrast with what occurs in driver mutations such as KRAS, in which mutations at specific sites and specific exons result in gain of function or loss of function, respectively. S9B Fig shows that virtually all somatic (almost exclusively missense) mutations occur at exon 2 (resulting in an oncogenic KRAS) in PAAD, in contrast with the range of mutations in SKCM for GPR98, GPR112, and other GPCR genes. The distribution of mutations along gene length is a feature of GPCR mutations in other cancers as well (e.g., BLCA, LUAD). (C, D) For primary (C) and distant metastatic (D) SKCM samples, relatively few genes show DE if one compares samples with or without LPHN2/ADGRL2 mutations. Fewer than 100 genes increase or decrease >2-fold (with FDR < 0.05 and >1 TPM median expression). Lists of DE genes for each case are provided in S2 Table. Red dots correspond to DE genes with FDR < 0.05. Similar data occur for other frequently mutated GPCRs (e.g., GPR98) in SKCM and in LUAD and BLCA. (E, G) Correlation of GPCR expression (n = 100 highest expressed GPCRs) in primary and distant metastatic SKCM tumors, respectively, with GPR98 somatic nonsilent mutations compared to tumors without GPR98 mutation. (F, H) The identities and median expression (TPM) of the 30 highest-expressed GPCRs in primary and distant metastatic SKCM tumors, respectively, for tumors with or without GPR98 mutations. Numerical values for panels C–H can be found at https://insellab.github.io/data.
S10 Fig. GPR98 is part of a tumor mutation profile.
(A–D) Mutations in GPR98 are frequently accompanied by mutations in other genes. The proportion of tumors possessing somatic, nonsilent mutations in GPR98 correlates with that of several other frequently mutated genes. Tumor types that show a high frequency of mutations to genes such as TTN and MUC16 also typically show a high frequency of GPR98 mutations and vice versa for tumors with infrequent mutations to these genes. TTN and MUC16 form a group of genes that, along with GPR98, are frequently mutated across a range of tumors. (A) The proportion of tumors in each TCGA tumor type that show mutations in TTN, MUC16, and GPR98. (B) The correlation between the proportions of tumor samples possessing TTN mutations and GPR98 mutations for the 20 TCGA tumor types shown above. (C) The same data for mutations in MUC16 versus GPR98 mutations. (E, F, G) Tumors with GPR98 mutations frequently have mutations in other frequently mutated genes. SKCM tumors (n = 472) have a high frequency of TTN, MUC16, and DNAH5 somatic, nonsilent mutations. Most SKCM tumors with somatic, nonsilent mutations to GPR98 also have mutations in these other genes. Numerical values for panels A–D can be found at https://insellab.github.io/data.
(A) GPRC5A mRNA expression in normal tissue. Median normalized mRNA expression from GTEx, in CPM of GPRC5A in normal tissues profiled by RNA-seq. (B) GPRC5A protein expression in normal tissue. Protein abundance (by immunohistochemistry from the human protein atlas; https://www.proteinatlas.org/) of GPRC5A in normal tissue. (C) GPRC5A mRNA expression in tumors. Median normalized mRNA expression (in CPM in TCGA) of GPRC5A in tumors profiled by RNA-seq. For TCGA tumor types with multiple subtypes (e.g., LUAD), values are the median for all subtypes of the tumor type to facilitate comparison with protein atlas data, in which TCGA tumor types are not separated into subtypes. (D) GPRC5A protein expression in tumor tissue. Protein abundance (by immunohistochemistry from the human protein atlas; proteinatlas.org) of GPRC5A in a range of tumor types. Data are provided for multiple replicates at https://www.proteinatlas.org/ENSG00000013588-GPRC5A/pathology; the proportion of samples staining at different intensity levels are shown. Values plotted for all panels are available at https://insellab.github.io/data. Data from the human protein atlas were downloaded from https://www.proteinatlas.org/about/download (bulleted items 1 and 2), and data for GPRC5A were extracted from the relevant files.
S12 Fig. Supplemental results validating DE analysis methods.
(A, B) EBseq and EdgeR yield similar results for DE analysis. (A) Comparison of fold-changes determined via EdgeR for the 10,000 genes with lowest FDRs (i.e., the most significantly altered genes), compared to the fold-changes for the same genes determined via Ebseq, using data for PDAC tumors compared to normal pancreatic tissue. (B) The same comparison (between EBseq and EdgeR) for the 100 GPCRs with lowest FDRs calculated via EdgeR. Dashed lines indicate linear fits. The two methods make different statistical assumptions but yield similar results, especially for GPCRs. We chose to use EdgeR rather than Ebseq based on the much lower processing times in EdgeR for the large files we generated. These results, along with the similarity of DE analysis between upper-quartile and TMM normalized data, helps allay concerns regarding the validity of this analysis stemming from the large numbers of DE genes found when comparing tumor and normal samples. (C, D) TMM and upper-quartile normalization in EdgeR yield similar results for DE analysis. (C) Comparison of fold-changes determined via EdgeR for the 10,000 genes with lowest FDRs (i.e., the most significantly altered genes) compared to fold-changes for the same genes determined via Ebseq, using data for PDAC tumors compared to normal pancreatic tissue. (D) The same comparison for the 100 GPCRs with lowest FDRs calculated via EdgeR. Dashed lines indicate linear fits. (E, F) DE analysis of fold-changes for GPCRs identified as meaningfully increased or decreased expression in KICH (E) and LSQC (NOS); (F) is similar (especially for highly expressed GPCRs with high fold-change) whether one compares these tumors with GTEx normal tissue or TCGA-matched normal samples. Numerical values for all figure panels can be found at https://insellab.github.io/data.
S13 Fig. DE of GPCRs with and without corrections for batch effects.
(A, B) Correlation of median (a) and average (b) expression of GPCRs in TCGA PDAC samples, with and without batch corrections performed for TCGA plate ID. (C, D) Correlation of median (c) and average (d) expression of GPCRs in OV (TCGA “Ovarian Cancer”) samples, with and without batch corrections performed for TCGA plate ID. Numerical values for all figure panels, showing corresponding GPCR expression before and after batch corrections, can be found at https://insellab.github.io/data.
S1 Table. Annotated GPCRs, GPCR mutations, and CNV.
S2 Table. GPCR expression and DE in tumors and normal tissue.
S3 Table. The types of cancers and number of replicates from TCGA surveyed for GPCR mutations and CNV.
Mutation and CNV data were obtained from xena.ucsc.edu. For mutations, data were generated at the Broad Institute Sequencing Center, except for *: Baylor College of Medicine Sequencing Center and #: Washington University Sequencing Center. Mutation data were generated via automated pipelines from the respective sources, hosted at xena.ucsc.edu.
S4 Table. The normal tissue types and number of replicates (GTEx database) used for DE analysis of RNA-seq data in normal tissue compared to tumors (TCGA).
COAD tumors in the sigmoid and transverse colon were compared to normal tissue from those regions. For esophageal tumors (adenocarcinomas and squamous cell carcinomas), DE analysis was compared to esophageal mucosal tissue; data in GTEx for esophageal muscularis tissue were not used in this analysis. Data for sun-exposed skin were compared to melanomas; similar DE results were found if non–sun-exposed skin (in GTEx) was used. Only 9 of 952 TCGA BRCA samples for which gender was recorded were from males, therefore we only used GTEx breast tissue from females as reference normal tissue for DE analysis of BRCA.
S5 Table. The total number of different types of somatic mutational events for the 30 most frequently mutated GPCRs in the TCGA tumors surveyed.
A complete list is provided as downloadable supplemental material at insellab.github.io. In addition, a breakdown of numbers of mutation events for each GPCR in each tumor type is provided in S1 Table.
S6 Table. GPCRs showing DE in multiple tumor types.
(Left) Multiple GPCRs have increased expression in multiple types of cancers. The 25 most widely/commonly OE GPCRs are listed along with the number of cancer types in which they are OE. (Center) The same tabulation for the 25 most commonly OE GPCRs that are targets for approved drugs. (Right) The 25 GPCRs most frequently reduced in expression in tumors compared to normal tissue. GPCRs were considered to have DE if fold-changes were >2, FDR < 0.05, and median expression in tumors was >1 TPM. Details of the tumor types in which such DE occurs, the size of fold-changes, etc., are available in S2 Table and at insellab.github.io. A list of GPCRs that are targets for approved drugs was obtained by querying the IUPHAR and CHEMBL databases for lists of approved drugs and the genes that they target. Most such GPCRs are targets for drugs approved by the FDA. The resulting list of “druggable” GPCRs is provided in S2 Table. OE, overexpressed.
S7 Table. GPCRs that show frequent overexpression typically show infrequent mutation and vice versa.
(Left) The 30 most frequently overexpressed GPCRs across the 45 different types/subtypes of cancer profiled, along with the number of TCGA tumors (of 5,103 total) in which they have somatic, nonsilent mutations. (Right) The same data, sorted for the 30 GPCRs mutated in the highest number of tumors. Among the most frequently mutated GPCRs, the CELSR genes are also frequently overexpressed, but most other frequently mutated genes are not.
S1 Text. Concordance of mRNA and protein expression.
Results are in part based upon data from The Cancer Genome Atlas (TCGA) Research Network: http://cancergenome.nih.gov/.
- 1. Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, et al. The Genotype-Tissue Expression (GTEx) project. Nat Genet. 2013;45: 580. pmid:23715323
- 2. Alexander SP, Davenport AP, Kelly E, Marrion N, Peters JA, Benson HE, et al. The Concise Guide to PHARMACOLOGY 2015/16: G protein-coupled receptors. Br J Pharmacol. 2015/12/10. pmid:26650439
- 3. Sriram K IP. G protein-coupled receptors as targets for approved drugs: How many targets and how many drugs? Mol Pharmacol. 2018;Apr 1;93(4.
- 4. Wacker D, Stevens RC, Roth BL. How Ligands Illuminate GPCR Molecular Pharmacology. Cell. 2017;170: 414–427. pmid:28753422
- 5. Dorsam RT, Gutkind JS. G-protein-coupled receptors and cancer. Nat Rev Cancer. 2007;7: 79–94. https://doi.org/10.1038/nrc2069 pmid:17251915
- 6. Hanahan D, Weinberg RA. The hallmarks of cancer. Cell. 2000;100: 57–70. pmid:10647931
- 7. Bar-Shavit R, Maoz M, Kancharla A, Nag JK, Agranovich D, Grisaru-Granovsky S, et al. G Protein-Coupled Receptors in Cancer. Int J Mol Sci. 2016;17. pmid:27529230
- 8. O’Hayre M, Vazquez-Prado J, Kufareva I, Stawiski EW, Handel TM, Seshagiri S, et al. The emerging mutational landscape of G proteins and G-protein-coupled receptors in cancer. Nat Rev Cancer. 2013;13: 412–424. pmid:23640210
- 9. Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, et al. Mutational landscape and significance across 12 major cancer types. Nature. 2013;502: 333–339. pmid:24132290
- 10. Vivian J, Rao AA, Nothaft FA, Ketchum C, Armstrong J, Novak A, et al. Toil enables reproducible, open source, big biomedical data analyses. Nat Biotechnol. 2017;35: 314–316. pmid:28398314
- 11. Zhou H, Rigoutsos I. The emerging roles of GPRC5A in diseases. Oncoscience. 2014;1: 765–776. pmid:25621293
- 12. Zhou H, Telonis AG, Jing Y, Xia NL, Biederman L, Jimbo M, et al. GPRC5A is a potential oncogene in pancreatic ductal adenocarcinoma cells that is upregulated by gemcitabine with help from HuR. Cell Death Dis. 2016;7: e2294. pmid:27415424
- 13. Wiley SZ, Sriram K, Liang W, Chang SE, French R, McCann T, et al. GPR68, a proton-sensing GPCR, mediates interaction of cancer-associated fibroblasts and cancer cells. Faseb j. 2018;32: 1170–1183. pmid:29092903
- 14. Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, et al. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 2017;45: D362–d368. pmid:27924014
- 15. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102: 15545–15550. pmid:16199517
- 16. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28: 27–30. pmid:10592173
- 17. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. 2000;25: 25–29. pmid:10802651
- 18. Consortium TGO. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res. 2017;45: D331–D338. pmid:27899567
- 19. Kuleshov M V, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44: W90–7. pmid:27141961
- 20. Binder JX, Pletscher-Frankild S, Tsafou K, Stolte C, O’Donoghue SI, Schneider R, et al. COMPARTMENTS: unification and visualization of protein subcellular localization evidence. Database (Oxford). 2014;2014: bau012. pmid:24573882
- 21. Insel PA, Sriram K, Wiley SZ, Wilderman A, Katakia T, McCann T, Yokouchi H, Zhang L, Corriden R, Liu D FM. GPCRomics: GPCR expression in cancer cells and tumors identifies new, potential biomarkers and therapeutic targets. Front Pharmacol. 2018;9. pmid:29872392
- 22. Richardson DR, Baker E. The uptake of iron and transferrin by the human malignant melanoma cell. Biochim Biophys Acta. 1990;1053: 1–12. pmid:2364114
- 23. Chen KG, Leapman RD, Zhang G, Lai B, Valencia JC, Cardarelli CO, et al. Influence of melanosome dynamics on melanoma drug sensitivity. J Natl Cancer Inst. 2009;101: 1259–1271. pmid:19704071
- 24. Stracke ML, Engel JD, Wilson LW, Rechler MM, Liotta LA, Schiffmann E. The type I insulin-like growth factor receptor is a motility receptor in human melanoma cells. J Biol Chem. 1989;264: 21544–21549. pmid:2557332
- 25. Boire A, Covic L, Agarwal A, Jacques S, Sherifi S, Kuliopulos A. PAR1 is a matrix metalloprotease-1 receptor that promotes invasion and tumorigenesis of breast cancer cells. Cell. 2005;120: 303–313. pmid:15707890
- 26. Fujimoto D, Hirono Y, Goi T, Katayama K, Matsukawa S, Yamaguchi A. The activation of Proteinase-Activated Receptor-1 (PAR1) mediates gastric cancer cell proliferation and invasion. BMC Cancer. 2010;10: 443. pmid:20723226
- 27. Darmoul D, Gratio V, Devaud H, Peiretti F, Laburthe M. Activation of proteinase-activated receptor 1 promotes human colon cancer cell proliferation through epidermal growth factor receptor transactivation. Mol Cancer Res. 2004;2: 514–522. pmid:15383630
- 28. Shi X, Gangadharan B, Brass LF, Ruf W, Mueller BM. Protease-activated receptors (PAR1 and PAR2) contribute to tumor cell motility and metastasis. Mol Cancer Res. 2004;2: 395–402. pmid:15280447
- 29. Su S, Li Y, Luo Y, Sheng Y, Su Y, Padia RN, et al. Proteinase-activated receptor 2 expression in breast cancer and its role in breast cancer cell migration. Oncogene. 2009;28: 3047–3057. pmid:19543320
- 30. Darmoul D, Marie JC, Devaud H, Gratio V, Laburthe M. Initiation of human colon cancer cell proliferation by trypsin acting at protease-activated receptor-2. Br J Cancer. 2001;85: 772–779. pmid:11531266
- 31. Jahan I, Fujimoto J, Alam SM, Sato E, Sakaguchi H, Tamaya T. Role of protease activated receptor-2 in tumor advancement of ovarian cancers. Ann Oncol. 2007;18: 1506–1512. pmid:17761706
- 32. Lahav R, Suva ML, Rimoldi D, Patterson PH, Stamenkovic I. Endothelin receptor B inhibition triggers apoptosis and enhances angiogenesis in melanomas. Cancer Res. 2004/12/18. 2004;64: 8945–8953. pmid:15604257
- 33. Lahav R. Endothelin receptor B is required for the expansion of melanocyte precursors and malignant melanoma. Int J Dev Biol. 2005/05/21. Institut Universitaire de Pathologie, Lausanne, Switzerland. email@example.com; 2005;49: 173–180. pmid:15906230
- 34. Bai J, Xie X, Lei Y, An G, He L, Lv X. Ocular albinism type 1-induced melanoma cell migration is mediated through the RAS/RAF/MEK/ERK signaling pathway. Mol Med Rep. 2014;10: 491–495. pmid:24736838
- 35. Hertzman Johansson C, Azimi A, Frostvik Stolt M, Shojaee S, Wiberg H, Grafstrom E, et al. Association of MITF and other melanosome-related proteins with chemoresistance in melanoma tumors and cell lines. Melanoma Res. 2013;23: 360–365. pmid:23921446
- 36. Zhou C, Dai X, Chen Y, Shen Y, Lei S, Xiao T, et al. G protein-coupled receptor GPR160 is associated with apoptosis and cell cycle arrest of prostate cancer cells. Oncotarget. 2016;7: 12823–12839. pmid:26871479
- 37. Fonseca NA, Petryszak R, Marioni J, Brazma A. iRAP—an integrated RNA-seq Analysis Pipeline. bioRxiv. 2014; Available from: http://biorxiv.org/content/early/2014/06/06/005991.abstract. [cited 2019 Nov 19].
- 38. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483: 307–603. pmid:22460905
- 39. Klijn C, Durinck S, Stawiski EW, Haverty PM, Jiang Z, Liu H, et al. A comprehensive transcriptional portrait of human cancer cell lines. Nat Biotech. 2015;33: 306–312. pmid:25485619
- 40. Witkiewicz AK, Balaji U, Eslinger C, McMillan E, Conway W, Posner B, et al. Integrated Patient-Derived Models Delineate Individualized Therapeutic Vulnerabilities of Pancreatic Cancer. Cell Rep. 2016;16: 2017–2031. pmid:27498862
- 41. Müller J, Krijgsman O, Tsoi J, Robert L, Hugo W, Song C, et al. Low MITF/AXL ratio predicts early resistance to multiple targeted drugs in melanoma. Nat Commun. 2014;5: 5712. pmid:25502142
- 42. Ruault M, van der Bruggen P, Brun ME, Boyle S, Roizes G, De Sario A. New BAGE (B melanoma antigen) genes mapping to the juxtacentromeric regions of human chromosomes 13 and 21 have a cancer/testis expression profile. Eur J Hum Genet. 2002;10: 833–840. pmid:12461691
- 43. Miller LD, Smeds J, George J, Vega VB, Vergara L, Ploner A, Pawitan Y, Hall P, Klaar S, Liu ET BJ. An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc Natl Acad Sci. 2005;Sep 20;102.
- 44. Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 2011;12: R41. pmid:21527027
- 45. Beroukhim R, Getz G, Nghiemphu L, Barretina J, Hsueh T, Linhart D, et al. Assessing the significance of chromosomal aberrations in cancer: methodology and application to glioma. Proc Natl Acad Sci U S A. 2007/12/14. Broad Institute, Massachusetts Institute of Technology and Harvard University, 7 Cambridge Center, Cambridge, MA 02142, USA.; 2007;104: 20007–20012. pmid:18077431
- 46. Purcell RH HR. Adhesion G Protein–Coupled Receptors as Drug Targets. Annu Rev Pharmacol Toxicol. 2018; Jan 6;58:4. pmid:28968187
- 47. Vang Nielsen K, Ejlertsen B, Møller S, Trøst Jørgensen J, Knoop A, Knudsen H, et al. The value of TOP2A gene copy number variation as a biomarker in breast cancer: Update of DBCG trial 89D. Acta Oncol (Madr). 2008;47: 725–734. pmid:18465341
- 48. Snead AN, Insel PA. Defining the cellular repertoire of GPCRs identifies a profibrotic role for the most highly expressed receptor, protease-activated receptor 1, in cardiac fibroblasts. FASEB J. 2012;26: 4540–4547. pmid:22859370
- 49. Berger M, Scheel DW, Macias H, Miyatsuka T, Kim H, Hoang P, et al. Galphai/o-coupled receptor signaling restricts pancreatic beta-cell expansion. Proc Natl Acad Sci USA. 2015;112: 2888–2893. pmid:25695968
- 50. Insel PA, Wilderman A, Zambon AC, Snead AN, Murray F, Aroonsakool N, et al. G Protein-Coupled Receptor (GPCR) Expression in Native Cells: “Novel” endoGPCRs as Physiologic Regulators and Therapeutic Targets. Mol Pharmacol. 2015;88: 181–187. pmid:25737495
- 51. Koyama H, Iwakura H, Dote K, Bando M, Hosoda H, Ariyasu H, et al. Comprehensive Profiling of GPCR Expression in Ghrelin-Producing Cells. Endocrinology. 2016;157: 692–704. pmid:26671185
- 52. Regard JB, Sato IT, Coughlin SR. Anatomical profiling of G protein-coupled receptor expression. Cell. 2008;135: 561–571. pmid:18984166
- 53. Csardi G, Franks A, Choi DS, Airoldi EM, Drummond DA. Accounting for experimental noise reveals that mRNA levels, amplified by post-transcriptional processes, largely determine steady-state protein levels in yeast. PLoS Genet. 2015;11: e1005206. pmid:25950722
- 54. Edfors F, Danielsson F, Hallstrom BM, Kall L, Lundberg E, Ponten F, et al. Gene-specific correlation of RNA and protein levels in human cells and tissues. Mol Syst Biol. 2016;12: 883. pmid:27951527
- 55. Guimaraes JC, Rocha M, Arkin AP. Transcript level and sequence determinants of protein abundance and noise in Escherichia coli. Nucleic Acids Res. 2014;42: 4791–4799. pmid:24510099
- 56. Koussounadis A, Langdon SP, Um IH, Harrison DJ, Smith VA. Relationship between differentially expressed mRNA and mRNA-protein correlations in a xenograft model system. Sci Rep. 2015;5: 10775. pmid:26053859
- 57. Li JJ, Bickel PJ, Biggin MD. System wide analyses have underestimated protein abundances and the importance of transcription in mammals. PeerJ. 2014;2: e270. pmid:24688849
- 58. Peshkin L, Wühr M, Pearl E, Haas W, Freeman RM Jr, Gerhart JC, Klein AM, Horb M, Gygi SP, Kirschner MW. On the relationship of protein and mRNA dynamics in vertebrate embryonic development. Developmental cell. 2015 Nov 9;35(3):383–94. pmid:26555057
- 59. Schwanhausser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, et al. Global quantification of mammalian gene expression control. Nature. 2011;473: 337–342. pmid:21593866
- 60. Lapek JD Jr, Greninger P, Morris R, Amzallag A, Pruteanu-Malinici I, Benes CH, et al. Detection of dysregulated protein-association networks by high-throughput proteomics predicts cancer vulnerabilities. Nat Biotechnol. 2017;35: 983–989. pmid:28892078
- 61. Sriram K, Wiley SZ, Moyung K, Gorr MW, Salmerón C, Marucut J, et al. Detection and quantification of GPCR mRNA: An assessment and implications of data from high-content methods. ACS Omega. 2019;4: 17048–17059. pmid:31646252
- 62. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2012/10/30. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA. firstname.lastname@example.org; 2013;29: 15–21. pmid:23104886
- 63. Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 2011/08/06. Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, USA.; 2011;12: 323. pmid:21816040
- 64. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2009/11/17. Cancer Program, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, NSW 2010, Australia. email@example.com; 2010;26: 139–140. pmid:19910308
- 65. Leng N, Dawson JA, Thomson JA, Ruotti V, Rissman AI, Smits BM, et al. EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics. 2013/02/23. Department of Statistics, University of Wisconsin, Madison, WI 53706, USA.; 2013;29: 1035–1043. pmid:23428641
- 66. Reemann P, Reimann E, Ilmjarv S, Porosaar O, Silm H, Jaks V, et al. Melanocytes in the skin—comparative whole transcriptome analysis of main skin cell types. PLoS ONE. 2014;9: e115717. pmid:25545474
- 67. Acharyya S, Ladner KJ, Nelsen LL, Damrauer J, Reiser PJ, Swoap S, et al. Cancer cachexia is regulated by selective targeting of skeletal muscle gene products. J Clin Invest. 2004;114: 370–378. pmid:15286803
- 68. Egeblad M, Nakasone ES, Werb Z. Tumors as organs: complex tissues that interface with the entire organism. Dev Cell. 2010;18: 884–901. pmid:20627072
- 69. Gabrilovich DI, Nagaraj S. Myeloid-derived suppressor cells as regulators of the immune system. Nat Rev Immunol. 2009;9: 162–174. pmid:19197294
- 70. Redon CE, Dickey JS, Nakamura AJ, Kareva IG, Naf D, Nowsheen S, et al. Tumors induce complex DNA damage in distant proliferative tissues in vivo. Proc Natl Acad Sci USA. 2010;107: 17992–17997. pmid:20855610