CINS: Cell Interaction Network inference from Single cell expression data

Studies comparing single cell RNA-Seq (scRNA-Seq) data between conditions mainly focus on differences in the proportion of cell types or on differentially expressed genes. In many cases these differences are driven by changes in cell interactions which are challenging to infer without spatial information. To determine cell-cell interactions that differ between conditions we developed the Cell Interaction Network Inference (CINS) pipeline. CINS combines Bayesian network analysis with regression-based modeling to identify differential cell type interactions and the proteins that underlie them. We tested CINS on a disease case control and on an aging mouse dataset. In both cases CINS correctly identifies cell type interactions and the ligands involved in these interactions improving on prior methods suggested for cell interaction predictions. We performed additional mouse aging scRNA-Seq experiments which further support the interactions identified by CINS.


Author summary
Single cell transcriptomics has emerged as a leading technology for studying the composition of organs and tissues in the human body, development and several other biological processes. More recent studies, including studies of various diseases (such as cancer), treatment-response studies and aging studies aim at comparing samples at the single cell level. To date, such analysis mainly focused on the differences in expression of genes in the different cell types. However, in addition to differences in expression such studies also provide information on the differences of cell type proportions between the conditions. To use such information for inferring cell interactions we developed a new computational framework termed CINS. CINS combines Bayesian network learning (which is used to infer cell type-cell type interactions) with constrained regression analysis (used to infer the specific proteins involved in such interactions). We applied CINS to a number of different case-control scRNA-Seq datasets including a lung disease and an aging study. By

Introduction
The ability to profile the expression of genes at the single cell level has revolutionized gene expression studies. Single cell RNA-Seq (scRNA-Seq) studies resulted in insights related to the cell type composition of tissues [1,2], changes in cell type composition in various diseases and states [3], various differentiation pathways used within cells [4] and more. However, while scRNA-Seq provides valuable information about expression within cells, it is hard to use it to study interaction between cells. The main problem is that once cells are extracted it is very challenging to determine the spatial relationships among them [5].
A number of methods have been introduced recently to identify ligand receptor interactions in scRNA-Seq studies [6,7]. While these methods differ in the exact formulation and statical analysis, they all focus on finding correlations between ligands expressed in one cluster (or cell type) and receptors expressed in another. This works well for studies that are analyzing a single condition (for example expression in a specific tissue or at a specific time point) but does not fully utilize information in case-control studies single cell studies [8,9]. Unlike single condition studies, in addition to differences in expression case-control studies also provide information on differences in the proportions of different cell types between the conditions. Such information can be very useful in determining which cell types interact. When cell type proportions are correlated between two conditions (for example both high in one and low in the other) it may indicate that they are likely to interact [10,11]. As we show, this information greatly improves the ability to correctly infer cell-cell interactions from scRNA-Seq data.
In addition to methods that attempt to infer cell-cell interaction information from scRNA-Seq, a number of technologies have emerged for spatially profiling single cell expression data [12][13][14][15]. These technologies often combine Fluorescence in situ hybridization (FISH) with rapid sequencing to provide information on the spatial expression of thousands of genes at various resolutions [16,17]. A number of recent computational methods have been developed to allow for the study of signaling pathways involved in cell-cell interactions from this type of spatially-resolved expression data [18]. However, while spatial transcriptomics studies are promising there are several challenges involved in employing them to study intercellular interactions. First, current commercial spatial transcriptomics platforms do not profile cells at the single cell level. Most labs do not have access or ability to perform such studies at the single cell resolution. More importantly, spatial transcriptomics often requires the fixation of the samples which limits their usage and can negatively impact their ability to accurately profile molecular quantities [16]. Finally, spatial transcriptomics methods can scan only a small region of the tissue and so cannot be applied to large number of conditions and samples that are studied using scRNA-Seq.
Here we present a new method, the Cell Interaction Network Inference (CINS) pipeline, that infers cell type interactions in case control scRNA-Seq studies. CINS involves two major steps. First, it uses scRNA-Seq data from multiple samples of a similar condition (i.e. disease, age, etc.) to learn Bayesian networks which highlight the cell types whose distributions are covarying under different conditions. Next, for the high scoring differential interactions identified in the Bayesian network analysis, CINS learns a regression model with ligand-target interaction matrix [6] that identifies the key ligands and targets that participate in the interactions between these cell types.
We tested CINS by applying it to both, disease and aging datasets. We show that CINS correctly identifies known interacting cell type pairs and ligands associated with these interactions and improves upon prior methods for inferring ligand-receptor interactions in scRNA-Seq data. We also discuss several novel predictions made by CINS. Finally, we show that a number of CINS predicted cell type interactions are supported by a new scRNA-Seq lung aging dataset we profiled.

The Cell Interaction Network Inference (CINS) Pipeline
We developed the Cell Interaction Network Inference (CINS) pipeline which uses single cell (sc) RNA-seq expression data to infer cell-cell interactions (Fig 1). Given repeated experiments of the same condition / system CINS uses annotated cell type information to construct a Bayesian network (BN) that models causal relationships between different cell types. For this, CINS first discretizes the proportion data for each cell type using a Gaussian Mixture Model (GMM) with only two components and then learns a BN that models the joint probability distribution of the cell type mixtures observed for each sample. High scoring differential causal relationships are determined based on bootstrapping. Next, for each of the high scoring differential Cell type annotation is used to extract cell type fractions in each sample. Next cell type fraction is discretized by learning Gaussian Mixture Model (GMM) for this type, respectively. (B) A Bayesian network (BN) is learned using the discretized cell abundance information. Bootstrapping is performed to identify high scoring differential interactions between cell types. (C) For pairs identified in the directed bootstrap BN analysis, a ligand-target regression (LTR) model is learned. In this model we use the expression change of ligands in the cell type with the outgoing edge to predict the expression of targets genes in the cell type with incoming edge. (D) Finally, LTR is used to select key ligands that underlie the cell-cell interactions identified in the BN. cell interaction.
https://doi.org/10.1371/journal.pcbi.1010468.g001 pairs identified we infer the genes involved in the interactions by learning a ligand-target regression (LTR) model with ligand-target interaction database from NicheNet [6]. The LTR model aims to explain changes in target genes as a function of changes in their activating ligands allowing CINS to identify the most significant ligands that regulate the cell-cell interactions.

Inferring cell type interactions using Bootstrapped Bayesian Network
We first applied CINS to simulated expression data. For this we generated cells using a fixed BN for cell-cell interactions and for each cell generated expression values based on their cell type. See S4 Text for details on the simulation settings and parameters. Results of our simulation analysis are presented in S18 and S19 Tables. As can be seen, when not assuming dropout, CINS can correctly infer most of the underlying interactions (identifying 5 of 6 real interactions with accuracy of 83% and precision of 100%). While performance decreases when we increase noise, CINS is still able to accurately reconstruct the underlying BN even under relatively high noise levels (4 of 6 for 50% dropout with no false positive interactions).
We next applied CINS to a lung disease scRNA-Seq dataset [8]. The lung disease dataset contained scRNA-Seq data for 28 healthy (controls) and 32 Idiopathic Pulmonary Fibrosis (IPF) individuals. A total of 250,942 cells were profiled for these individuals. Cell type annotations were assigned based on the original study and we used the detailed assignments that provided information on 39 cell types.
We used CINS to explore differential cell type interactions between IPF and control samples. For this, we constructed two different networks based on the cells profiled for each condition. We next performed bootstrap analysis to determine the score of each edge in each condition. Edges that appear in the majority of bootstrap iterations likely represent real relationships in the data rather than noise [19,20]. Resulting BNs for the two conditions are presented in Fig 2A and 2B. As the figures show, there are some edges that appear for both conditions. These include Basal to Goblet cell interactions, which agrees with the fact that club cell's attachment sites are provided by Basal cell [21]. However, there are also many differences between edges selected for the two condition networks. Table 1 summarized the top differences based on the signed difference in edge count in 100 bootstrap iterations for IPF and control (See S1 Table for differences for all detected edges). Several of the highest scoring edges are supported by prior work. For example, the edge from Treg to Fibroblast cell is supported by a previous study suggesting that Treg's can negatively regulate fibroblast activity [22]. The edge between cDC2 and cDC1 is also supported by recent work showing that cDC2 and cDC1 are cross-talking with each other [23]. Several other top scoring edges are supported by the literature as referenced in Table 1. We next compared the interactions predicted by CINS to interactions predicted by CellPhoneDB, iTALK, and NicheNet (Methods), which are all popular methods for inferring ligand-receptor based cell interactions [6,7,24]. As can be seen, in S2 Table. unlike CINS which identified a diverse set of cell type interactions, almost all interactions predicted by CellPhoneDB involved Goblet cells (18 of the top 20). While there is some support for Goblet involvement in IPF [25] they only explain a small fraction (estimated to be less than 20%) of individuals with the disease and it is unlikely that they interact with almost all other cell types. Similarly, for NicheNet, almost all interactions predicted involved a single cell type, Pericyte cells. iTALK performed better, but it has only detected interactions between immune cells in the IPF lung dataset. While these are indeed of interest, the more interesting interactions are those between immune cells and fibroblast cells in the (injured) lung and none of these were identified by iTALK. In contrast, by looking at the overall distribution of cell types CINS was able to find a more general and, as we showed, accurate set of interactions between cell types that are likely relevant for the disease.

Inferring ligand-target interactions for high scoring differential cell type pairs
While the BNs discussed above identify pairs of cell types that likely interact in disease, the network does not show which genes and protein products participate in the interactions. To infer Table 1. Top differential cell type interactions identified by CINS for the IPF dataset. The IPF-Control column lists the difference in the number of times the edge between the two cells was identified in 100 bootstrap runs for each of the two datasets. Negative values indicate that it was identified more for the Control whereas positive numbers mean that the interaction is more prevalent in IPF. For all listed edges the interaction was only identified in one of the two datasets (score of 100 or -100).

cell_type1 cell_type2 IPF-Control Reference
Macrophage Ciliated -100 There is strong interaction between ciliated cell and Macrophage in COVID-19 critical cases [44] Fibroblast Lymphatic -100 Fibroblast produce extracellular matrix which is critical to lymph node microenvironment [45] cDC2 PLOS COMPUTATIONAL BIOLOGY such gene-gene interactions across cells we developed a ligand-target regression (LTR) model. For cell type pairs identified in the BNs our LTR model uses a set of ligands in the first cell type to predict the expression values of their known targets in the second cell type. The LTR model uses the LASSO algorithm which enables the identification of a small set of key ligands predicted to participate in the interaction observed in the BN. We trained the model using a fivefold cross validation strategy. See Methods for details. The LTR method was applied to all high scoring differential pairs identified by the BN. S3 Table presents top scoring ligands for several cell type pairs. S4 Table presents top scoring ligands for one cell type pair (Fibroblast -> Lymphatic cell). Several of the top LTR ligands are known to play an important role in the activated cell (Lymphatic cell). For example, the highest scoring ligand identified by LTR is "FGF2" which was identified as a critical gene for lymphangiogenesis [26]. Another highly ranked ligand, "TGFB1", can also accelerate lymphatic regeneration in wound repair [27]. S5 Table presents top ranked ligands for another pair (Treg cell -> Fibroblast), several of which have also been shown to participate in the interaction between these cell types. For example, fibroblast express IL13 receptor and may behave as an inflammatory cell if stimulated by IL-13 [28], and TGFB1-3 (including TGFB1 and TGFB2 in the table) are all involved in promoting collagen production in fibroblasts [29].

Identified ligands are primarily involved in cell-cell interactions
To test if the predicted ligands are indeed impacting cell type-cell type interactions or mainly represent autocrine relationships we compared the activity of top predicted ligands within and between cell types. For this, we compared the performance of the LTR method for top edges to the performance of a similar method that only uses information from a single cell type. Specifically, if the BN predicted a high scoring differential interaction between cell types A -> B, we first trained LTR using the ligands of A and the targets of B (as we did above) and compared the performance to a LTR model which uses the ligands expressed in B to predict targets in B (autocrine model).
Results for the high scoring differential edges in the IPF and control datasets is presented in Fig 3A and 3B. presents the results for the same pairs (so x axis is fixed based on the BN score) but with the LTR trained using only the ligands of the second cell type. As can be seen, when using the ligand of the predicted interacting cell type LTR obtained a higher average correlation with a p-value of 0.034 (using the scipy function in Python for computing Pearson correlation p-values). In contrast, when using the same cell type for both ligands and targets the Pearson correlation is lower (Fig 3B). We also evaluated the performance of the LTR method on the predicted cell type interactions by comparing the results we obtained with the real ligand-target interaction matrix to results obtained using a random ligand-target interaction matrix. We found that for most of the random assignments the resulting LASSO models contained only a Bias term with all coefficients set to 0 (S3 Fig). This indicates that expression of the ligands did not provide any useful information about the expression of the targets when using the random interaction matrix.

Application to a scRNA-Seq dataset on lung aging
We next applied CINS to another, smaller, scRNA-Seq dataset which studied lung aging in mice [9]. The dataset profiled lung cells in 15 mice, 8 young (three-month, 3M) and 7 old (24-month, 24M). The 14,813 cells profiled in this study were assigned to one of 34 cell types in the original paper. We again learned 100 bootstrapped BNs for the two conditions (young and old) and compared the resulting networks. We found 11 edges to be differentially present between the two conditions when using an edge threshold count of 20 (Fig 4 and S6 Table).
These included an edge between Capillary-endothelial-cell and Type 1-pneumocyte cells which are known to jointly form thin air-blood barriers used for gas exchange [30]. Another pair was Ciliated and Club cells, of which the ratio is reported to alert significantly between young and old mouse lung [9]. We next performed LTR analysis on the high scoring differential edges. The top ranked ligand in Ciliated cells, TNF is known to regulate CC16 gene production, which plays a role in immunomodulatory activity in Club cells [31]. Apoe, a ligand identified for the macrophage to goblet edge, is produced by macrophages to negatively modulate goblet cell hyperplasia [32].
As we did for the IPF study we compared the performance of the LTR method using ligands from the BN identified edges (A -> B) and ligands from the same cell type (B) to predict target expression for genes in B. We observed a Pearson correlation of 0.67 when using the ligands    (Fig 5). And it is noticed that when randomizing the interactions the LTR method again failed to identify any significant correlation between predicted and real expression for the targets (S3 Fig). Computational validation of high scoring differential edges using a second aging mouse lung dataset To test the predictions of the aging BN and to validate them using an independent cohort we next performed additional scRNA-Seq experiments on young and old mice to generate a pilot scRNA-Seq dataset on lung aging. For this, we profiled four young and four old mice of the Fendrr-floxed genotype recently generated in the Kaminski laboratory. We obtained 71,562 cells that were clustered, annotated, and assigned to 20 cell types that overlapped with the cell types assigned by Angelidis I et al. [9]. The problem with both aging datasets is their small size 15 and 8 compared to 60 in IPF dataset). We could not obtain significant results using the 8 dataset aging data given its small size. Thus, we could not use it as a standalone dataset to validate the results of the larger (15 samples) datasets. Instead, we looked at the impact of combining the two. We next used the combined data (from [30] and from our new experiments) to learn a joint BN. Several of the predicted interactions were further supported by our new data. Specifically, we found 19 cell type pairs for which the addition of our new data enhanced both the presence of the edge and the direction predicted when performing the bootstrap analysis. S8 Table presents the top 10 enhanced pairs based on the overall bootstrap score (See S11 Table for all enhanced pairs). For example, the interaction between Neutrophils and Gamma Delta T cell is enhanced from edge count of 40 to 61 and was reported by recent studies that neutrophils can suppress Gamma Delta T cell's activation involved in the resolution of inflammation [33]. And the interaction between B Cell and CD4+ T Cell is enhanced from -16 to -19 (being negative means that old lung has less), and is supported by other studies that B cell will activate CD4 T cells in human cutaneous leishmaniasis infection led by Viannia [34]. In addition, we also found that T-cell-B-cell interactions were calculated to occur less often in older samples, which further validates the comparison between old and young mice [35]. We next focused on the top five predicted interactions in S8 Table (all with an absolute enhanced bootstrap score larger than 15). Permutation analysis indicates that identifying such a large number of edges supported by both studies is significant (p-value = 0.05, Methods and Fig 6, and see S13 Table for result of other threshold values). Specifically, we permutated the cell type fraction of the aging dataset with 8 samples, and then did the BN analysis for 1,000 times. We next calculated the fraction of enhanced pairs with certain edge threshold over the whole pairs reported. We applied LTR to the cell type pairs in S8 Table to find important ligand genes. S9 Table presents the top predicted ligand genes. Several of these (red font) are supported by prior studies on the interaction between these cell types. Comparisons to Cell-PhoneDB, iTALK and NicheNet indicated that, similar to what we observed for the IPF data, the predicted interactions are very different compared to CINS (S4 Fig). In addition, unlike CINS for which the overlap between the pairs identified with and without the new datasets were significant, for CellPhoneDB we did not observe significant overlap between predicted interaction pairs (S13 Table).

Discussion
To enable the study of cell type-cell type interactions using scRNA-Seq data we developed a method termed Cell Interaction Network Inference (CINS). CINS first learns a Bayesian network between cell types (BN) using repeated samples. High scoring differential cell type pairs identified by the BN are further studied to infer the ligands that regulate these interactions. CINS is implemented in python and R and can be downloaded from https://github.com/ xiaoyeye/CINS.
While CINS can be applied to any dataset with multiple samples, it is most appropriate for datasets containing case and control or multiple conditions. For such datasets CINS can infer not only the high scoring differential interactions within a condition but also those interactions that differ between the condition and that may partially explain the differences between the conditions studied. The main difference between CINS and all prior methods is the unique ability of CINS to take advantage of case-control studies to infer cell type interactions. Prior methods mainly focus on ligand gene expression information of each of these datasets separately and do not use the changes in fraction between the case and controls to infer such interactions, while CINS can make use of cell proportion as additional information and the two can prove each other mutually, further confirming the findings. The discretization of cell proportion can fit the data very well and makes it easier for BN to learn the correct structure of the network which is the major focus of CINS. As we show, by using the case control setting the interactions we can obtain are indeed more accurate and more diverse than these prior methods.
We first applied CINS to simulated data and demonstrated its accuracy on simulated NB networks. We next applied CINS to study a case and control dataset profiling lung expression from IPF patients and controls. CINS identified several differences between the interactions observed for IPF patients and for healthy individuals. These include the interaction from Treg to Fibroblast cells which is supported by a recent study that found Treg can negatively regulate fibroblast activity [22], and the edge between cDC2 and cDC1 is also supported by recent work showing that cDC2 and cDC1 are cross-talking with each other [23].
For many of the identified high scoring differential interactions CINS was also able to identify key ligands involved in the interactions. For example, "FGF2" which was identified as a critical gene for lymphangiogenesis [26], and one more highly ranked ligand, "TGFB1", can also accelerate lymphatic regeneration in wound repair [27].
We next applied CINS to a lung scRNA-Seq aging dataset and identified a number of high scoring differential pairs that differ between young and old mice. To validate predicted interactions we performed additional experiments in which we profiled scRNA-Seq expression in 4 additional young and old mice and then used the combined dataset to learn a joint network. As we showed, the network we learned identified a significant number of interactions that are supported by both datasets. These include the interactions between Neutrophils and Gamma Delta T cell [33], and between B Cell and CD4+ T Cell [34,35] which are both supported by previous studies. CINS was again able to identify key ligands involved in these interactions, TNF, identified as the top ligand in the interaction between neutrophils and Gamma Delta T cells was previously identified as expressed in neutrophils [36] and as a regulator of immune cells Gamma Delta T cells [37], and TNFSF18 identified in interactions between CD4+ T cells and Vascular Endothelial Cells, was also previously reported to mediate the interactions between immune cells and endothelial cells [38].
While CINS can be successfully applied to several scRNA-Seq studies, there are still many places where it can be improved. First, it can only be applied if multiple samples are profiled since the BN part requires several repeated samples to compute relationships between cells. In addition, because BNs do not allow self edges, interactions between cells of the same type cannot be identified by CINS. Thirdly, since it uses a bootstrap approach to infer edge score it can miss important interactions if not enough samples and / or cells are available. Finally, CINS indeed requires, and relies on, prior cell type annotations. Since most case control datasets are from the same experiment, differences between cell type assignments attributed to individual preference should be minimal. If such information is not provided users would need to annotate their cells using one of the cell atlas annotation servers (for example, scQuery [1,2]) and then apply CINS.
CINS is one of the first methods to enable the inference of cell type interactions in scRNA-Seq data from repeated samples. Given the growing popularity of this method, and its increased use in clinical studies which are currently less amenable to spatial transcriptomics techniques we believe that CINS provides a solution to an important problem that is not currently addressed.

Materials and methods
We developed a pipeline for modeling interactions between cells of different types from scRNA-Seq data. Our method first identifies cell types that are likely interacting and then tries to provide a mechanistic model to explain how such interactions are manifested at the molecular level.

Datasets
We tested CINS using three scRNA-Seq datasets. The first compared gene expression in lungs of healthy and Idiopathic Pulmonary Fibrosis (IPF) with accession number of GSE136831 [8]. This dataset contained 28 controls and 32 IPF patients with a total of 243,472 cells and the expression levels for 45,947 genes in each cell. We used the original annotations and included in the model all 39 cell types with at least 100 cells. The second dataset studied lung aging in mice with accession number of GSE124872 [9]. This dataset contained 8 three-month-old mice and 7 24-month-old mice for which a total of 14,813 cells were profiled. For each cell the expression levels of 21,969 genes were provided. Each cell was assigned by the authors to one of 34 cell types. The third dataset was a new dataset in which we profiled single cell expression in four young (25 weeks) and four old (2x 103 weeks; 2x 120 weeks) Fendrr-floxed mouse lungs. This dataset contained a total of 71,562 cells with expression values for 45,947 genes. These cells were originally assigned to 37 cell types based on the expression of canonical cell type markers. To combine the two aging datasets we did the following. We first normalized the gene expression data using the same method for both datasets. Next, we manually assigned a common set of cell types to both datasets so all cell type match between the two. Specifically, we identified a joint subset of 20 cell types identified by both and only used cells assigned to these cell types in our combined BN analysis (see S10 Table for cell type information details). Information about ligands and their targets were obtained from a recent paper [6] which provided targets for 688 ligands.

Single-cell sequencing of Fendrr-floxed mice
Animal procedures had been approved by the Institutional Animal Care and Use Committee (IACUC). We created a floxed allele of Fendrr via two-guide, two-oligo CRISPR/Cas mediated cleavage and recombination essentially as described in Yang et al. [39]. A generated mouse which had the expected conditional allele was bred with C57BL/6J mice to establish the colony and to sort the floxed allele from any other possible mutant alleles. Three female and five male mice in two age groups (young: 23 weeks, old: ranging from 103 to 120 weeks; four mice per group) were euthanized, and lungs were harvested and minced in small pieces with a scalpel. Lung pieces were dissociated using the enzyme Liberase TL (Roche).
Single RNA molecules of single cells were barcoded using the 10× chromium single-cell technology according to the manufacturer's instructions (Single Cell 3 0 Reagent Kits v2, 10× Genomics, USA). Barcodes were used to assign reads to cells and quality control was performed to remove low quality cells (S2 Text). Generated sequencing data is available at GEO accession number GSE165638. A modified version of the standard Seurat pipeline was employed to normalize, cluster and annotate the raw counts single-cell expression data for downstream analysis [40]. Briefly, the percent of mitochondrially-expressed genes was calculated for each individual cellbarcode using the PercentageFeatureSet function. Next, unique molecular identifier (UMI) counts were log normalized with a scale factor of 10,000 UMIs per cell and then natural log transformed using a pseudocount of one. Following log normalization, the top 3500 variable genes within the dataset were determined using Seurat's implementation of the FindVariableFeatures function with the "vst" parameter. Next, the gene-level scaling of the data was performed using the ScaleData function. Each feature was centered to have a mean of zero and scaled by the standard deviation of each feature. The percent of mitochondrially-expressed genes captured within each cell were regressed out during scaling by using the "vars.to.regress" parameter. To reduce the dimensionality of the dataset and to identify genes contributing the most variability to the underlying manifold of the dataset, Principal Component Analysis (PCA) was performed using the scaled data and the 3500 variable genes calculated determined for the dataset. Following exploration of the PCs (S3 Text), the first 75 PCs were selected for clustering and following Uniform Manifold Approximation and Projection (UMAP), a dimensionality reduction method. The quality of subject and age representation within each cluster was assessed prior to cell type annotation to note any subject-or agespecific biases.

Cell type assignment of Fendrr-floxed mice
To assign a specific cellular identity to each cluster, differentially expressed markers were determined and assessed within the context of canonical marker genes. Briefly, a differential gene expression test using Wilcoxon Rank Sum test was performed that compared the gene expression within a specific cluster to expression within all cells outside of that cluster. The resulting list of cluster-specific marker genes was assessed and cell types were ascribed based on expression of canonical marker genes. Clusters displaying canonical markers for multiple cell types were flagged as multiplets and were omitted from downstream analysis.

Cell type quantification and discretization
We use the cell type annotation information provided by each study. To use Bayesian network to learn relationships between cell type we first discretize the proportion of each cell type in each sample. Discretization is cell type specific (i.e. different cell type will be assigned different values for the same proportion quantity) and is learned using an unsupervised method based on Gaussian Mixture Model (GMM) with two components. Specifically, let ½x i 1 ; x i 2 ; � � � x i n � � � ; x i N � be the fraction (percentage) of the ith cell type in the N samples. We learn a two components GMM for these values and then assign each value to the class with the higher likelihood for this value. The target function of the GMM aims to maximize the log likelihood: Where N represents gaussian distribution and (p i k ; m i k ; s i k ) represent proportion, mean and standard deviation parameters for the kth component of the ith cell type.
Following convergence, each proportion value x i n is assigned to one of the two classes. We assign labels to the two classes such that the component with lower mean parameter is assigned a value of 0 and the second is assigned a value of 1. Specifically, the cell type specific cutoff is determined by the GMM model, which can automatically find the cutoff based on the clustering it learns. Next, samples assigned to the cluster with the higher mean are set to 1, and those assigned to the lower mean cluster are set to 0. This leads to a learned cell type specific cutoff such that all samples with a value less than that cutoff are assigned to 0 and all those above are assigned to 1. However, the number of 0's and 1's is not pre-determined and may be highly skewed in either direction based on the distribution of the fractions. See S1 Fig for examples of assignments. To learn GMMs we used the Python package "sklearn" with a maximum iteration number of 500 and a convergence threshold of 10 �� -4.

Learning a cell type Bayesian network
We use the discretized cell type values to learn a cell type Bayesian network. Bayesian network is a probabilistic graphical model that uses directed acyclic graph to represent joint probability distributions. The absence of an edge can indicate independence and / or conditional independence. Bayesian networks are parameterized as <G, P> where G = <V, E> is a directed acyclic graph with V as variables and E as directed edges, and P is the global joint distribution for all nodes V. Given the graph structure this probability can be decomposed into local distribution for each node, V q , conditioned on its parent nodes as follows: Where Pa(V q |Θ q ) is parent node set of V q according to G.
To learn a Bayesian network using the discretized cell type proportion data, we iterate between network learning and parameter estimation. We initialize the network using the Hiton Parents and Children strategy which is based on marginal association among variables [41]. Next we iterate a search strategy, that uses penalized Hill-Climbing to add, flip or remove edges based on the Bayesian Information Criterion (BIC) score when using dataset D, where each sample in D contains values for all the variables of V: where N is the number of samples and Dim(G) is the number of parameters in the model. For this, we used the "rsmax2" function from the R library "bnlearn", which implements the iterative Penalized Maximization algorithm to construct a Bayesian network.
To obtain confidence values for edges (predicted interactions) in the network we followed previous learning methods that utilized a bootstrap strategy [19,20,42]. For each iteration of the bootstrap we first randomly sample 80% of all single cells in the dataset. Next, we used these cells to determine cell type frequencies in each sample and to perform the discretization and network learning as described above. This step is repeated 100 times, and for which we counted the presence of all directed edges. While the direction of an edge in a Bayesian network does not always imply casual interactions [43], we observed that high scoring differential edges were also very consistent in their direction (S1 Table).

Ligand-Target Regression (LTR) model
The bootstrapping method presented above provides a small set of high scoring differential interactions between some of the cell types in the dataset. To obtain a mechanistic explanation for these interactions, and to identify the interacting genes between the two cell types we focused on ligand-target interactions between the two cells types. Specifically, for a predicted directed edge between cell types A and B we learned a Ligand-Target Regression (LTR) model to determine if there is an underlying cell type-cell type interaction between A and B. Our assumption is that if these two cell types indeed interact, then the expression of some of the ligands in cell A type should be able to explain some of the expression changes observed in cell type B. Similar approaches have been used by others to explore cell-type interactions in non case control studies [7].To identify a set of ligands in A predicted to activate or repress target genes in B we optimized the following regression model: min a P T t ð P L l I ðt;lÞ L ðlÞ a ðlÞ À T ðtÞ Þ 2 þ lkak ð4Þ Where I represents an input (known) ligand-target interaction matrix [6], L is an input vector of log values for the expression of ligands in cell type A, α represents the (unobserved) ligand activation vector, T represents the expression levels for target genes in cell type B and λ is a regularization parameter. Here we used a L1 regularization which usually leads to the selection of relatively few non zero values (corresponding to relatively few activated ligands in cell type A).
Using the inputs to sett A (t,l) = I (t,l) L (l) , transforms the optimization problem to Which is a standard least absolute shrinkage and selection operator (LASSO) model. To learn parameters for the model we used the "LASSOCV" function from the Python library "scikit-learn", which implements the LASSO cross validation. Note that the model in Eq 5 using the same ligand activity parameters for all genes (there is only 1 ligand activity parameter in the model for each ligand across all target genes). Thus, we can use this model in a cross validation setting to predict the expression levels of held out targets in cell type B. For these, we know the ligand-target interaction from Matrix I and the ligand expression from L allowing us to evaluate the ability of the model to generalize to unseen targets. We also use the model to test if we obtain better prediction accuracy for significant pairs identified in the BNs.

Training and Test for Ligand-Target Regression (LTR) model
We used a five-fold cross validation strategy to train and test the LTR model: We split the training part of each validation set into two sets to select the hyperparameter λ (our penalty term) and then retrain using all training data for this set and the selected λ to obtain the model used for the fold test data. Evaluation of predicted values is based on the average Pearson correlation between the predicted and actual expression changes for each fold. Following testing we use the average product between the log fold change and coefficient value α in the five-fold training models to rank the list of active ligands.

Joint plots of Bayesian Network and LTR model scores for cell type pairs
To jointly plot the Bayesian network bootstrap score and the Pearson correlation regression score for each cell type pair, we first converted the edge count to log value. For the Pearson correlation we used the average correlation for the five-fold results. For both IPF lung data and lung aging data, cell pairs with edge count smaller than threshold of 20 are removed. To test the robustness of the method to the threshold selection we performed a validation study that tested different threshold values. We observed good agreement for values between the BN score and LTR model score, and eventually selected 20 since the Pearson correlation between BN and LTR model score reaches the high value when the threshold is set to 20, as shown in S12 Table. Note that for some of the pairs we tried to model using LASSO the learning terminated with coefficients of 0 for all ligands (this happened for all runs of the random interaction matrix as we mention in Results and to a few of the CV runs of the cell-cell and intra-cell models). In such cases these models were removed from the correlation analysis.

Comparison to CellPhoneDB, iTALK and NicheNet
All the three methods are based on ligand related gene expression analysis. For CellPhoneDB, the result contains all possible cell type pairs with calculated ligand-receptor scores. For each cell type pair, we use the sum of all its ligand-receptor scores as its pair score. iTALK detects significant ligand-receptor pairs, and provides the mean expression level of them. We then sum the product of ligand and receptor expression as the final score for each cell type pair. NicheNet can select top functional ligand genes with prediction scores for a given cell type pair. We next sum these prediction scores for all selected ligands to rank cell type pairs. We first learn a 2 cluster GMM using all samples. Next, we perform Leave one out (LOO) cross validation and compare it with that using all samples. Specifically, we obtain an average accuracy of 0.96 for the larger (60 samples) IPF study across the different cell types (A) and an accuracy higher than 0.9 for the smaller (15 samples) aging dataset (B). (TIF) S1  Table. The overlap cell types between ours and NC mouse aging data. The labeled cell types by different colors are merged as one cell type, respectively. (XLSX) S11 Table. All the enhanced pairs by our own mouse aging scRNA-seq data. Red means that compared to the original bootstrap BN based on NC paper with 34 cell types in section of "Application to lung aging dataset", these overlap cell type pairs have the same difference sign between young and old mice, while green means these overlap cell type pairs have opposite difference sign. All other cell types are those not found by the original bootstrap BN based on NC paper with 34 cell types in section of "Application to lung aging dataset". (XLSX) S12 Table. The model consistence (Person correlation) between BN and LTR/CellPho-neDB model of different edge count threshold for IPF lung and lung aging NC data. We selected edge count 20 as threshold finally used for both data. In each table, the left column corresponds to LTR model using cell 1 ligand to predict cell 2 target gene, the middle column corresponds to a control LTR model using cell 2 ligand to predict cell 2 target gene, and the right column corresponds to CellPhoneDB model. (XLSX) S13 Table. Significance of permutation analysis of different edge score threshold values in aging lung data using BN and CellPhoneDB. The two methods generated 68 and 441 edges respectively. We present the percentage of selected edges of the different score thresholds for each method and the resulting p-value of the overlap. (XLSX) S14 Table. Pearson correlation between original (80%) edge count and new percentage (60/ 70/90%) edge count for IPF data. (XLSX) S15 Table. Detected edges (cell type pairs) for the bootstrap analysis of Bayesian Networks when comparing IPF and healthy data using a simple discretization strategy that does not assume Gaussian distribution and is appropriate for constrained values like percentages. For each cell type, we split the values in half [min, (max+min)/2] and [(max+min)/2, max]. We next analyzed the significance of edge overlap between the two discretization strategies, and found that among 24 edges detected by the new BN, 8 are covered by the original BN result that contains 42 edges, with a p value of 7.39 � 10^-7 using fisher's exact test. Such result indicates that different discretization can impact the result, but does not make a huge difference. It is noticed that all edges have the same absolute value of difference, 100, based on which the Pearson correlation between BN and LTR model would be always 0. That is to say, with such strategy, no randomization can be introduced, and the following LTR model comparison way cannot be used to validate the two models mutually. (XLSX) S16 Table. Model agreement (Person correlation) between LTR model and BNs using the "leave one sample out" (LOSO) strategy that each time we leave one IPF or healthy sample out for the Bootstrapping. (XLSX) S17 Table. Model agreement between BN and LTR model using MSE (instead of Pearson correlation) as its evaluation score, for different edge count threshold on IPF lung data. LTR control model represents the model that cell type B's ligand genes are used to predict B's target genes for the BN identified edge (A->B). (XLSX) S18 Table.