Transcriptional landscape of pulmonary lymphatic endothelial cells during fetal gestation

The genetic programs responsible for pulmonary lymphatic maturation prior to birth are not known. To address this gap in knowledge, we developed a novel cell sorting strategy to collect fetal pulmonary lymphatic endothelial cells (PLECs) for global transcriptional profiling. We identified PLECs based on their unique cell surface immunophenotype (CD31+/Vegfr3+/Lyve1+/Pdpn+) and isolated them from murine lungs during late gestation (E16.5, E17.5, E18.5). Gene expression profiling was performed using whole-genome microarrays, and 1,281 genes were significantly differentially expressed with respect to time (FDR q < 0.05) and grouped into six clusters. Two clusters containing a total of 493 genes strongly upregulated at E18.5 were significantly enriched in genes with functional annotations corresponding to innate immune response, positive regulation of angiogenesis, complement & coagulation cascade, ECM/cell-adhesion, and lipid metabolism. Gene Set Enrichment Analysis identified several pathways coordinately upregulated during late gestation, the strongest of which was the type-I IFN-α/β signaling pathway. Upregulation of canonical interferon target genes was confirmed by qRT-PCR and in situ hybridization in E18.5 PLECs. We also identified transcriptional events consistent with a prenatal PLEC maturation program. This PLEC-specific program included individual genes (Ch25h, Itpkc, Pcdhac2 and S1pr3) as well as a set of chemokines and genes containing an NF-κB binding site in their promoter. Overall, this work reveals transcriptional insights into the genes, signaling pathways and biological processes associated with pulmonary lymphatic maturation in the fetal lung.

During late gestation, the lymphatic vasculature undergoes dynamic structural changes as they gain the capacity to transport cells and macromolecules. This provides evidence that pulmonary lymphatics undergo functional maturation prenatally [18][19][20]. This process may be part of a broader, generalized paradigm whereby cellular maturation in the late fetal lung precedes the transition to air breathing. Examples include alveolar type-II cell surfactant biosynthesis, macrophage-mediated surfactant catabolism, alveolar type I cell flattening, as well as vascular and interstitial remodeling [21][22][23][24]. Accordingly, the complex pathology of the premature lung likely reflects maturational defects in multiple cell types and signaling events.
For survival after birth, a mature pulmonary lymphatic system is necessary for proper immune surveillance and lung fluid homeostasis. In this context, two groups have shown that fetal lymphatic ablation results in lung fluid retention and respiratory distress at birth [17,19] though the effect of lymphatic ablation on immune function was not examined. Notably, alterations in lymphatic structure and function have been implicated in lung diseases that afflict neonates and young children such as bronchopulmonary dysplasia, pulmonary lymphangiectasia, congenital chylothorax and asthma [25].
Despite the fundamental importance of lymphatics for optimal lung function and their implication in respiratory pathologies, little is known regarding the transcriptional mechanisms associated with pulmonary lymphatic maturation prior to birth. To address this gap in knowledge, we established a cell sorting method to isolate CD45-/Epcam-/CD31+/Vegfr3 +/Lyve1+/Pdpn+ primary PLECs from fetal murine lungs for transcriptional profiling using microarrays.
Here, we identified differentially expressed genes and pathways involved in vascular development, complement and coagulation cascades, lipid biosynthesis and metabolism and the innate immune response. The type I interferon (IFN-I) signaling pathway was the gene set with the greatest coordinate upregulation from E16.5 to E18.5, which was confirmed by qRT-PCR and in situ hybridization. Furthermore, NF-kB target genes, chemokines and additional genes were upregulated specifically in late-gestation PLECs. Together, these data provide novel insights into the transcriptional mechanisms, biological processes and key signaling events associated with lymphatic maturation in the fetal lung, a critical event for air breathing and postnatal survival.

Mice and tissue processing
Our protocol was approved under Boston University IACUC Protocol# 14419. Timed pregnant mice (C57BL/6NCrl) were ordered from Charles River Labs and cared for in the BU Animal Science Center on a 12-hour light:dark cycle with access to food and water ad libitum. Timed pregnant dams and fetal offspring were ethically euthanized under deep isoflurane anesthesia followed by harvesting of lung tissues from E16.5, E17.5 and E18.5 mice for PLEC isolation.
In situ hybridization E16.5 and E18.5 fetal lungs were fixed in 4% paraformaldehyde overnight, processed, and paraffin embedded according to manufacturer's recommendations. Antigen retrieval was carried out on 8 um lung sections (n = 3 mice/time point) via the microwave method. All reagents for peroxidase quenching, protease digestion, washing, hybridization and transcript detection for Ifit1 (ACDBio, cat# 500071) were purchased from the manufacturer and used according to the recommendations for the RNAscope method (ACDBio). In order to identify PLECs, sections were concomitantly blocked and immunostained as mentioned with goat anti-Vegfr3 (AF2727, R&D Systems).

Microarray analysis of fetal pulmonary lymphatic endothelial cells
Mouse Gene 2.0 ST CEL files were normalized to produce log2-transformed gene-level expression values using the implementation of the Robust Multiarray Average (RMA) in the affy R package (version 1.36.1) and an Entrez Gene-specific probeset mapping (version 17.0.0) from the Molecular and Behavioral Neuroscience Institute at the University of Michigan (the "Brainarray" group). Array quality was assessed by computing Relative Log Expression (RLE) and Normalized Unscaled Standard Error (NUSE) using the affyPLM R package (version 1.34.0); all samples had median RLE and NUSE values below 0.1 and 1.05, respectively. Principal Component Analysis (PCA) was performed using the prcomp R function with expression values znormalized across all samples (set to a mean of zero and a standard deviation of one). Differential expression was assessed using the moderated (empirical Bayesian) ANOVA and t test implemented in the limma R package (version 3.14.4), i.e., creating simple linear models with lmFit, followed by empirical Bayesian adjustment with eBayes. Correction for multiple hypothesis testing was accomplished using the Benjamini-Hochberg false discovery rate (FDR). All microarray analyses were performed using the R environment for statistical computing (version 2.15.1). The raw and processed gene expression data have been deposited in the Gene Expression Omnibus (GEO), Series GSE121079.

Functional annotation
Mouse Entrez Gene identifiers from each cluster were analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID, v6.8) with the GOTERM_BP_DIR-ECT collection to identify Gene Ontology (GO) biological process terms with significant overrepresentation.

Gene set enrichment analysis
The Entrez Gene identifiers of the human homologs of all mouse Entrez Genes present in the probeset mapping described above (identified using HomoloGene, version 68) were ranked according to the t statistic computed between the E16.5 and E18.5 time points. The ranked list was limited to only those human genes that correspond 1:1 with a mouse homolog, and used to perform pre-ranked GSEA (version 2.2.1, default parameters with random seed 1234) using the Entrez Gene versions of the Biocarta, KEGG, and Reactome (c2 CP) motif (c3) and GO (c5) gene sets obtained from the Molecular Signatures Database (MSigDB), v5.0.

Comparative microarray analysis of whole fetal lung, GSE35485
Mouse Gene 1.0 ST CEL files were obtained from the Gene Expression Omnibus (GEO), Series GSE35485, which profiled gene expression in whole lung during perinatal maturation. Analysis was limited to the samples from the E16.5, E17.5, and E18.5 timepoints in C57BL/6 animals. Normalization to Entrez-Gene-specific expression values, statistical analysis, and GSEA were performed in the same manner and with the same software versions and parameters as described above for the PLEC microarrays.

Statistical analysis
All statistical analyses utilized a one-or two-tailed Student's t test as indicated in each figure legend. P values less than 0.05 were deemed significant.

Immunophenotypic analysis and isolation of pulmonary lymphatic endothelium
To identify genes, gene networks and signaling pathways associated with LEC maturation in the fetal lung, we developed a cell sorting strategy to isolate pulmonary LECs (PLECs) for broad-based transcriptomic profiling. This approach is based on the identification of a unique panel of LEC surface markers that immuno-colocalized with Prox1, the essential transcription factor required for lymphatic development. We found that CD31, Vegfr3, Pdpn, and Lyve1 immuno-colocalization with Prox1 was only detected in lymphatic vessels in the E16.5 and E18.5 lung (Fig 1).
To isolate LECs by flow cytometry, we analyzed cell fractions in collagenase digested lung preparations after staining with the four lymphatic directed antibodies that co-localized with Prox1 (Fig 1). Following exclusion of doublets, dead cells, CD45+ hematopoietic cells and Epcam+ lung epithelial cells, we identified three major cell populations based on CD31 and Vegfr3 expression (Fig 2A) during cell sorting. The CD31+/Vegfr3+ double positive fraction consistently accounted for~1-2% of total viable cells of which 85-95% also expressed the lymphatic markers Lyve1 and Pdpn (Figs 2A and 3, right panels). A subset of CD31+/Vegfr3-cells expressed Lyve1 but not Pdpn, consistent with a venous endothelial phenotype [26] (Fig 2A, yellow cluster,). Importantly, dual expression of both Pdpn and Lyve1 was only observed in the CD31+/Vegfr3+ gate (Fig 2A, purple cluster). Collectively, these data indicate that CD31 +/Vegfr3+/Pdpn+/Lyve1+ cells are highly enriched in PLECs.

Temporal-dependent patterns of gene expression in late-gestation PLECs
We then used whole-genome microarrays to profile gene expression in RNA samples isolated from sorted CD31+/Vegfr3+/Lyve1+/Pdpn+ PLECs at E16.5, E17.5, and E18.5 days of gestation (n = 3 per time point). Principal Component Analysis (PCA) performed across all genes showed that the samples separated strongly with respect to time along the first principal component (PC1), which explains 22% of total variance (Fig 5A), indicating that many genes are up-or down-regulated in a progressive manner from E16.5 to E17.5 to E18.5. A one-way analysis of variance (ANOVA) was then performed to identify the predominant patterns of coordinate differential expression with respect to time, and t tests were performed for each gene between the E16.5 and E18.5 time points (S1 File and Tables 1 and 2). After excluding genes that were not expressed above the median value of at least one array, 1,281 genes, whose expression was significantly associated with time (ANOVA, FDR q < 0.05), were partitioned into six clusters ( Fig 5B and S2 File).
Cluster 3 encompasses 80 genes with an expression pattern that is abruptly down-regulated between E16.5 and E17.5 and remains low at E18.5, including genes involved in ubiquitindependent protein catabolism (GO:0006511, p = 0.020): Psma1, Psma4, Ube2c, and Usp11. Surprisingly, cluster 4, which contains only 11 genes with a trough-like pattern of down-regulation at E17.5 followed by reversion to E16.5 levels at E18.5, was significantly associated with the GO term "steroid metabolic process" (GO:0008202, FDR q = 0.066) as it contains the genes Hmgcs1, Sc4mol, and Soat1. However, cluster 5, which contains only eight genes that peak at E17.5, was too small to achieve statistical significance for any GO terms.

Gene set enrichment analysis
Although the clustering analysis described above identified numerous pathways that are significantly overrepresented in gene clusters with temporal-dependent expression, some of these results are derived from a relatively small number of genes. To increase statistical power to identify biological pathways that are coordinately up-or down-regulated with respect to time, we next performed Gene Set Enrichment Analysis (GSEA), ranking the human homologs of all genes interrogated by the microarray according to the t statistic computed between the early and late time points. The full results of the GSEA analysis are included in S5 File. Many gene sets were significantly coordinately upregulated from E16.5 to E18.5, including the Reactome "interferon alpha/beta signaling" pathway (R-HSA-909733, FDR q < 0.001), which includes Socs3, Irf7, Stat1, Stat2, and Usp18 ( Fig 6A); the KEGG pathway "complement and coagulation cascades" (hsa04610, FDR q = 0.0006), which contains the Serping1 and Serpine1, two genes with fibrinolytic and plasminogen inhibitor activity; and the complement genes C3, C7, Cfh and Hc ( Fig 6B); and the GO term "regulation of angiogenesis" (GO:0045765, FDR q = 0.012) were enriched in genes upregulated at E18.5 ( Fig 6C and Table 1).
We selected several canonical interferon-stimulated genes (ISGs) and other genes involved in the IFN-I pathway and confirmed their upregulation in independently sorted E16.5 and E18.5 PLEC samples by qRT-PCR. We also confirmed down-regulation of Mcm5 in E18.5 PLECs (Fig 7). Next we used single-molecule fluorescent in situ hybridization probes against Ifit1 (interferon-induced protein with tetratricopeptide repeats 1), a well-established ISG significantly upregulated from E16.5 to E18.5 (7.4-fold, FDR q = 3.1x10 -3 ) to further confirm that the IFN-I pathway was activated in E18.5 PLECs in vivo (Fig 8). This analysis demonstrated increased expression of Ifit1 in Vegfr3+ LECs at E18.5 versus E16.5. Combined with the microarray and qRT-PCR analysis, these data provide further evidence that PLECs are a target of IFN-I signaling in late gestation and that ISGs are transcriptionally active in fetal LECs in vivo.

Fetal PLEC specific maturation profile
To identify genes and pathways that are unique to PLECs during prenatal maturation, we repeated the statistical analysis and GSEA using a publicly available microarray dataset (GSE35485) profiling whole fetal lung in C57BL/6 animals at the same time points, and compared these results with those obtained from the PLECs. The full set of statistical results is provided in S6 File, and a tabulation of the E18.5 vs E16.5 GSEA results in both the PLECs and GSE35485 is provided in S7 File. These analyses demonstrated that sets of genes annotated with the GO term "chemokine activity" (GO:0008009) or containing an NF-κB binding site (TRANSFAC motif V$NFKAPPAB_01) in their promoter were significantly upregulated between E16.5 and E18.5 (FDR q < 0.25) in PLECs but not in whole lung. The "chemokine activity" gene set included Cxcl1, Cxcl10, Cxcl16, Cxcl2, and Pf4/Cxcl4, and "NF-κB target genes" included Prx, Nod2, Ddr1, Tnfrsf1b, Sdc4, Ntn1 and Azin1 (Fig 9A and 9B). In addition, the expression of several genes listed in Table 1 changed significantly with respect to time in PLECs (one-way ANOVA p < 0.05) but not in whole lung, including Ch25h, Itpkc, Pcdhac2 and S1pr3, each of which were upregulated from E16.5 to E18.5 in a PLEC-specific manner ( Fig 9C). Overall, our results highlight both broad and PLEC-specific maturation profiles during late gestation lung development.

Discussion
The switch from placental to pulmonary gas exchange at birth requires fluid removal and the capacity to face the immunologic challenges of air breathing. An intact and mature pulmonary lymphatic system is essential for each of these fundamental processes. In this study we developed and implemented a cell sorting strategy that enabled the first transcriptional profiling of PLECs during late gestation. GSEA was used to identify pathways that are coordinately regulated from E16.5 to E18.5 fetal time points. This analysis demonstrated that genes involved in IFN signaling are a salient feature of maturing pulmonary lymphatics, which was confirmed by qRT-PCR and in situ hybridization. Transcriptional activation of IFN target genes at E18.5 indicates that PLECs act as a receiver of IFN signaling during late gestation. Although the specific effects of IFN on lymphatic formation and function in the lung have not been definitively studied, the prevailing concept is that IFNs inhibit both angiogenesis and lymphangiogenesis in postnatal life [27][28][29][30]. Importantly, others have linked IFNα signaling to maturation of fetal hematopoietic stem cells [31]. Taken together, these findings are consistent with a role for IFN signaling in maturation of the pulmonary lymphatic endothelium prior to birth. Genes involved in angiogenesis were upregulated in PLECs from E16.5 to E18.5. One key molecule is sphingosine kinase 1 (Sphk1), which is required for the production of sphingosine-1-phosphate (S1P), a lipid important for angiogenesis, vascular maintenance, LEC maturation, as well as lymphocyte mobilization [32,33]. Another important gene for lymphangiogenesis, Ccbe1 was upregulated in E18.5 PLECs, and this gene is a Vegf-C potentiator, and thus mutations in Ccbe1 result in lymphatic malformation, lymphedema and other complex phenotypes associated with Hennekam syndrome [34][35][36][37].
Genes involved in the complement and coagulation cascade were upregulated in E18.5 PLECs. For proper interstitial fluid clearance and leukocyte trafficking, lymph must be a hypocoagulable biological fluid [38,39]. Consistent with this, we observed increased expression of anti-coagulative genes (Serpine1 and Serping1), however genes known to promote coagulation (F2rl2, Tfpi2, and Clec1b/3b) were also induced in E18.5 PLECs. Coagulation and clot formation play an important role in lymphatic biology throughout life [40][41][42], but a role for complement factors in lymphatic maturation and function has not been described.  The lymphatic endothelium possesses several highly specialized features that make them unique and optimally adapted for clearance of interstitial fluids, macromolecules, and for leukocyte transmigration. For instance, capillary LECs contain incomplete basement membranes devoid of pericyte investment, and exhibit discontinuous intercellular junctions termed "button-junctions;" essentially an entry point for fluids, macromolecules and leukocytes [18,43]. Notably, our analysis identified a cluster of genes that are significantly upregulated at E18.5, which encode for proteins critical to "cell adhesion", "cell-cell junctions" and interaction with the "extracellular matrix". This cluster also contains the surface receptor Fgfr3, a direct target of Prox1 [44], which is critical for lymphatic development, growth and regulation of LEC metabolism [45]. Other genes in this cluster include Ccbp2/D6, which is expressed by LECs to influence adaptive immunity [46]. Transcription factors such as Sp100, a tumor suppressor known to regulate endothelial migration and angiogenesis [47]; Trim30a, a TLR-responsive inhibitor of NF-kB signaling [48] and Irf7, the master regulator of IFN-I mediated immune protection [49] were all upregulated in E18.5 PLECs. Furthermore, there was upregulation of Pdpn, which is required for proper lymphatic-venous separation during development and for dendritic cell migration to lymph nodes in postnatal life [50]. Sema3c, a member of the Semaphorin 3 family involved in lymphatic maturation and valve formation was upregulated at E18.5 [51]. GSEA and cluster analyses showed down-regulation of genes involved in cell cycle regulation and DNA replication pathways, suggestive of cellular maturation or terminal differentiation of the pulmonary lymphatic endothelium prenatally [52]. And lastly, recent evidence indicates a major role for fatty acid β-oxidation and lipid metabolism in lymphatic development [53]. Our data are in line with these findings since the microarray analysis detected increased expression of genes involved in lipid biosynthesis and metabolism in E18.5 PLECs suggesting a link between lipid signaling and prenatal pulmonary lymphatic maturation.
Gene sets such as "chemokine activity" and "NF-kB targets" were enriched specifically in E18.5 PLECs. It is noteworthy that genes with NF-kB binding sites in their promoters were enriched in PLECs during late fetal gestation because constitutive NF-kB activity has previously been shown in lymphatic beds of multiple organs in adult mice [54]. Both Cxcl1 and Cxcl10, ISGs which regulate chemotaxis and leukocyte trafficking [55] were represented in both PLEC specific gene sets and in type-I IFN signaling. Other upregulated genes unique to PLECs were Ch25h and S1pr3. Ch25h is involved in catabolism of cholesterol and leads to generation of oxysterol, a ligand for EBI2, a receptor that regulates chemotaxis and positioning of lymphoid lineages in lymph nodes [56]. S1pr3, a gene upregulated specifically in PLECs promotes lymphoid mobilization from lymphoid tissues [57]. Overall, our analyses identify broad mechanisms and PLEC specific signatures of maturation during late gestational lung development.

Conclusions
Our microarray analysis indicated the repression and activation of numerous fundamental processes during late-gestation lung development and suggests the presence of a distinct prenatal maturation program for fetal PLECs. Further studies are needed to determine the precise contribution of IFN signaling or alternative pathways to lymphatic function and postnatal pulmonary biology. The importance of lung lymphatics is demonstrated by their involvement in immunity, fluid clearance and air breathing at birth. This work reveals transcriptional events associated with PLEC maturation prior to birth and may provide some insight into the dysfunction and pathology of the premature lung.
Supporting information S1 File. Statistical analysis of PLEC gene expression microarray experiment. Each row corresponds to a single mouse Entrez Gene. Columns A-E contain the MBNI probeset identifier, links to the Entrez Gene record(s) for the mouse gene and any of its human homolog(s), and the gene symbol and description as obtained from version 17.0.0 of the MBNI mogene20stmmentrezg.db R package. Columns F-G contain the nominal p value and FDR q value from the moderated one-way ANOVA, and columns H-K contain the signed fold change (computed in linear space), moderated t statistic, nominal p value, and FDR q value for the comparison between the E18.5 and E16.5 time points. All FDR q values were computed after excluding any genes that were not expressed above the median value of at least one of all nine samples. Columns L-T contain the log2 (expression) of each gene in all samples, shaded so that blue and red indicate expression values that are at least two standard deviations below and above, respectively, the mean (white) of each row. The rows are sorted in ascending order by one-way Each row corresponds to a single mouse Entrez Gene. Columns A-E contain the MBNI probeset identifier, links to the Entrez Gene record(s) for the mouse gene and any of its human homolog(s), and the gene symbol and description as obtained from version 17.0.0 of the MBNI mogene10stmmentrezg.db R package. Columns F-G contain the nominal p value and FDR q value from the moderated one-way ANOVA, and columns H-K contain the signed fold change (computed in linear space), moderated t statistic, nominal p value, and FDR q value for the comparison between the E18.5 and E16.5 time points. All FDR q values were computed after excluding any genes that were not expressed above the median value of at least one of all nine samples. Columns L-T contain the log2 (expression) of each gene in all samples, shaded so that blue and red indicate expression values that are at least two standard deviations below and above, respectively, the mean (white) of each row. The rows are sorted in ascending order by one-way ANOVA p value. (XLSX) S7 File. Summary of Gene Set Enrichment Analysis (GSEA) results for PLECs and GSE35485. Each row corresponds to a single gene set. Columns A-B contains the MSigDB v5.0 collection and category of each gene set. Column C contains the name of the gene set, represented as a link to the "card" with more information about each gene set at MSigDB. Column D contains the number of genes in the gene set that overlap with the genes in the ranked list. Columns E-G contain the NES (Normalized Enrichment Score), nominal p value, and FDR q value for each gene set in the PLECs; columns H-J contain the same values for the whole-lung dataset GSE35485. Nominal p or FDR q values equal to 0 indicate nominal p values < 0.001 (i.e., not one shuffled result out of 1000 was more significant than the actual result