Application of multi-omics data integration and machine learning approaches to identify epigenetic and transcriptomic differences between in vitro and in vivo produced bovine embryos

Pregnancy rates for in vitro produced (IVP) embryos are usually lower than for embryos produced in vivo after ovarian superovulation (MOET). This is potentially due to alterations in their trophectoderm (TE), the outermost layer in physical contact with the maternal endometrium. The main objective was to apply a multi-omics data integration approach to identify both temporally differentially expressed and differentially methylated genes (DEG and DMG), between IVP and MOET embryos, that could impact TE function. To start, four and five published transcriptomic and epigenomic datasets, respectively, were processed for data integration. Second, DEG from day 7 to days 13 and 16 and DMG from day 7 to day 17 were determined in the TE from IVP vs. MOET embryos. Third, genes that were both DE and DM were subjected to hierarchical clustering and functional enrichment analysis. Finally, findings were validated through a machine learning approach with two additional datasets from day 15 embryos. There were 1535 DEG and 6360 DMG, with 490 overlapped genes, whose expression profiles at days 13 and 16 resulted in three main clusters. Cluster 1 (188) and Cluster 2 (191) genes were down-regulated at day 13 or day 16, respectively, while Cluster 3 genes (111) were up-regulated at both days, in IVP embryos compared to MOET embryos. The top enriched terms were the KEGG pathway "focal adhesion" in Cluster 1 (FDR = 0.003), and the cellular component: "extracellular exosome" in Cluster 2 (FDR<0.0001), also enriched in Cluster 1 (FDR = 0.04). According to the machine learning approach, genes in Cluster 1 showed a similar expression pattern between IVP and less developed (short) MOET conceptuses; and between MOET and DKK1-treated (advanced) IVP conceptuses. In conclusion, these results suggest that early conceptuses derived from IVP embryos exhibit epigenomic and transcriptomic changes that later affect its elongation and focal adhesion, impairing post-transfer survival.


Introduction
Assisted reproductive technologies (ART) have been used in cattle breeding since the 1950s, generating millions of healthy animals since then [1][2][3]. After artificial insemination (AI), the ART most frequently implemented in cattle reproduction is embryo transfer, where embryos are produced either (i) in vivo by ovarian superstimulation leading to multiple ovulations followed by AI, embryo collection and transfer (MOET) or (ii) in vitro embryo production (IVP), involving maturation and fertilization of oocytes collected from live animals via transvaginal aspiration of follicles or by recovery from the ovaries after slaughter.
Amongst livestock species, ART are used to the greatest extent in cattle, due to their economic importance and the relative ease with which the reproductive tract can be manipulated. Furthermore, in terms of the commercial embryo transfer industry, embryo production has shifted from MOET to IVP during the last few years [4]. However, IVP embryos differ in several characteristics (morphological, ultrastructural, physiological, transcriptional, and metabolic) from those derived by MOET, which can impact their survival [5]. A comprehensive review of the post-transfer consequences of IVP embryos noted that such embryos yielded around~25% lower pregnancy rates compared with MOET embryos, according to the results of 12 studies performed from 1992 to 2014 [6]. Pregnancy losses were more prevalent early in gestation; around 40% of cows receiving an IVP embryo at day 7/8 post-oestrus were no longer pregnant at day 18 to 21 of gestation. One potential factor influencing embryo survival is impaired post-hatching elongation of IVP embryos, which begins at around day 13 of pregnancy [7]. Around this time, the trophectoderm (TE) begins to secrete interferon-tau, the pregnancy recognition factor in cattle [8]. Conceptuses derived from the transfer of IVP conceptuses were smaller at day 13 than their MOET counterparts [9], but were similar in length at day 16 and 17 [10,11]. This suggests that initiation of elongation might be impaired in IVP embryos; moreover, for embryos that succeed to elongate, this process may start slowly and eventually "catch up" with MOET embryos. Furthermore, it has been demonstrated that the transcriptomic response of the endometrium differs between MOET-and IVP-derived conceptuses both in vivo [12,13] and in vitro [14].
Several studies have shown that the use of ART can impact on the embryo transcriptome, both at the blastocyst stage [15][16][17][18][19], and after elongation [20], or in the transition from a spherical to an ovoid blastocyst [21]. Alterations in the transcriptomic profile are manifested in the capacity of the blastocyst to sustain development to term, for both MOET [22] and IVP embryos [23]. The embryo depends on the expression of certain genes encoding for key proteins required to undergo sequential development, including differentiation and lineage commitment, as well as appropriate temporal communication with the female reproductive tract leading to maternal recognition of pregnancy. ART-induced transcriptomic aberrancies may hamper the availability of gene products required for further embryo development.
The divergent transcriptomic profile is likely to be also associated with altered epigenetic regulation [24], defined as heritable changes in gene activity or function without changes in the DNA sequence [25]. DNA methylation is one of the best-characterized epigenetic mechanisms primarily known to influence gene expression [26]. Usually, hypermethylation of CpG islands in promoters is related to transcriptional suppression [27]. However, several studies have shown that gene body methylation is also strongly involved in gene expression [28][29][30], with a bias for exonic regions [31]. Furthermore, epigenetic modifications can be induced by several external factors [32], affecting IVP and MOET embryos differentially. For example, known factors including the in vitro culture medium used [33], and paternal characteristics, such as bull age [34], have been shown to impact at the blastocyst stage. Differential methylation was reported to be most prominent at intragenic sequences within the TE of IVP and MOET embryos, but not in the embryonic disc in elongated-day 17-embryos compared to similar stage embryos produced by AI [35]. Accordingly, abnormalities in the TE, the outermost layer of the conceptus that is in contact with the endometrium, may lead to impaired conceptus elongation.
Undoubtedly, these studies have helped to shed light on the underlying biological mechanisms in the IVP embryo that compromise pregnancy success. However, one of the remaining unanswered questions is whether transcriptomic aberrancies are exclusively related to the in vitro process per se or are induced by the lack of exposure to the oviductal and uterine environment, regardless of variables such as the maturation media, technical procedures, or parental characteristics. In addition, apart from one [21], these studies have compared the transcriptome or epigenome of embryos at the same stage. Therefore, an analysis integrating different sources of transcriptomic and epigenomic data of bovine embryos before and after elongation could unravel crucial differences between IVP-derived embryos and those derived by MOET.
The objective of the present study was to apply a multi-omics data integration approach to identify genes that are temporally differentially expressed and differentially methylated in the exonic regions between IVP and MOET embryos from the blastocyst stage to the elongated conceptus, that could impact on TE function. To accomplish this aim, publicly available data were integrated and re-analysed using a range of bioinformatics tools. In addition, to underpin our findings, the main results were employed to make predictions in additional independent external data, using machine-learning methods.

Materials and methods
The pipeline followed in this study is shown in Fig 1. Each step is explained in detail below.
Step 1: Embryo data collection and bioinformatic processing Both transcriptomic and epigenomic datasets were downloaded from a public functional genomic data repository: Gene Expression Omnibus (GEO) from the National Center for Biotechnology Information [36,37]. The R software platform [38] was employed for all bioinformatics procedures.
Transcriptomic data. Six datasets were employed to obtain transcriptomic data from MOET and IVP embryos. Data from the following types of embryo were downloaded from: • MOET blastocyst: GSE12327 and GSE21030.
All these studies employed the Affymetrix Bovine Genome Array platform for hybridization. The raw data obtained from all samples were processed with the gcRMA package [39]. Data were imported into R, background corrected, and then transformed and normalized using the quantile normalization method. Next, rows of each dataset were collapsed to retain the microarray probe with the highest mean value from the group of genes with the same official symbol. Batch effects (i.e., the fact that data were obtained from different studies) were removed with the ComBat function from the sva package [40].
Epigenomic data. Datasets related to MOET and IVP blastocysts were collected from four datasets: GSE128354, GSE101895, GSE97517 and GSE69173. Data from the TE of day 17 embryos were derived from a previous study [35]. These studies employed the two-channel 400 K EmbryoGENE DNA Methylation Array. Following the quality check of each sample [41], the sample size for the IVP and MOET groups was: blastocyst, n = 8; and day-17 conceptuses, n = 4; per group.

Step 2. Determination of temporally differentially expressed genes (DEG) and differentially methylated regions (DMR)
Identification of DEG and DMR was performed with the limma R package [42]. To identify genes that behaved differently over time in the IVP embryos relative to the MOET embryos, contrasts were made between day 16 and day 13 conceptuses, and between day 13 conceptuses and blastocysts in both groups. For determination of DMR, a previously developed pipeline [41] was adapted to the present study. Briefly, the M-values and A-values were determined in the probes that were above the background level. These values were employed for a withinarray loess normalization. Next, a between array quantile normalization was applied to ensure that the intensities had the same empirical distribution across arrays and across channels. The linear model was fit to the individual log-intensities for each probe using the lmscFit function. This last modification was done to analyse the channels separately, as described previously Step 1: Datasets with the accession number GSE# were downloaded from the Gene Expression Omnibus (GEO) or were obtained from a previous study [35] and processed for data integration. Step 2. Temporally differentially expressed genes (DEG) from day 7 (D7) to D13 and D16 and differentially methylated regions (DMR) in the trophectoderm (TE) from D7 to D17 were determined between groups. Differentially methylated genes (DMG) were the genes in the corresponding exons of the DMR. Step 3. DEG and DMG were overlapped to determine common genes, which were further analysed (Fig 2). Each step is explained in detail in the methodology section. IVP: embryos produced in vitro. MOET: embryos produced in vivo (after ovarian superovulation).
https://doi.org/10.1371/journal.pone.0252096.g001 [43]. Next, contrasts were made between day 17 conceptuses and blastocysts in the IVP vs MOET groups. For the sake of comparisons, and since the subsequent analyses validated the findings, a p-value <0.05 and a fold change higher than 1 were employed as statistical criteria for both DEG and DMR.
Step 3: Identification of common DEG and differentially methylated genes (DMG) Genes in the corresponding exons of DMR were defined as DMG. The DMG that overlapped with the DEG were also methylated in the promoter and/or the intron (S1 Fig) and presumably, these regions were extensively methylated. If a gene showed more than one differentially methylated region, only the region with the lowest p-value was considered. DEG and DMG that were differentially expressed and differentially methylated over time in IVP relative to MOET embryos were identified by Venn Diagram and were subjected to a hierarchical clustering according to their expression profile in day 13 and day 16 conceptuses, using Spearman Rank Correlation as similarity metric and complete linkage as clustering method, implemented with the Cluster 3.0 software [44]. The resulting dendrogram and the heat map were visualized with Java TreeView [45].
Genes in each of the resulting clusters were evaluated through a functional analysis with the DAVID software [46]. The Bos taurus protein-coding genome was selected as background. The goal of this step was to determine the top enriched annotation terms in the Functional Annotation Chart (p<0.01). Additionally, the proportion of hyper-or hypo-methylated DMG was assessed in each cluster, and the probability of such a proportion, given the proportions in all the DMR, was estimated through a hypergeometric test. The relative methylation levels for specific genes in each group were estimated from the fitted coefficients generated from the application of the lmscFit function of the limma package [42].
Moreover, the GSE56513 dataset was employed to verify the consistency of the expression profiles for genes in the resulting clusters. This dataset corresponds to a study where mRNA was extracted from MOET embryos at days 7 and 10, and embryos after AI at days 13, 16 and 19; and measured using RNAseq [47]. Data in RPKM from embryos at days 7, 13 and 16 were downloaded and log2 transformed. Afterwards, the trajectories of average expression for genes in the resulting clusters were plotted.

Predictions in independent datasets
With the aim of validating our results, the expressions of significant genes sets were employed to make predictions in independent datasets. For this, the following datasets were selected and downloaded from the GEO database: • GSE75750: Corresponding to a study where the transcriptome of short and long agematched day 15 conceptuses derived from MOET was compared [48]. Short conceptuses (n = 5) were ovoid in shape, with a length <10 mm. Long conceptuses (n = 5) were early tubular in morphology, with a length between 30 and 100 mm. This study employed the Affymetrix Bovine Genome Array platform for hybridization of the extracted embryonic mRNA. The raw data was processed as described for the transcriptomic data in Step 1.
• GSE126680: this dataset corresponds to a study in which IVP embryos were treated or not with 100 ng/ml of recombinant human Dickkopf-related protein 1 (DKK1) from day 5 to 7.5 of culture [49]. After transfer, conceptuses were recovered at day 15 of gestation and their lengths were measured. In the present study, only conceptuses with similar length were employed (DKK1, n = 5, length: 112 ± 33 mm; control, n = 5, length: 100 ± 26 mm). The extracted mRNA molecules were quantified through RNAseq. The raw counts were transformed through the variance stabilizing transformation method [50], using the vst function from the DESeq2 package [51] for R.
For the prediction, three models were constructed with the following groups of genes: temporally DEG, overlapped DEG/DMG (i.e., genes that were both differentially expressed and differentially methylated), and one of the gene clusters resulting from the clustering of the overlapped DEG/DMG. The training set consisted of the expression of those three groups of genes in the day 13 and day 16 conceptuses, while the expression in the short/long conceptuses and DKK1/control conceptuses constituted the testing sets. An add-on batch effect adjustment of the testing data with the training data was performed with the bapred package [52]. Support vector machine with linear kernels was used as a classifier, employing the leave-one-out cross validation (LOOCV) method as the internal control. The LOOCV was run with 100 resampling interactions. The unknown sample was predicted with 100% accuracy in every case, with a cost ranging from 0.1 to 2. The algorithm was applied with the kernlab package [53], through the caret package [54] for the R software. The accuracy of each prediction is reported.

Temporal DEG and DMR from day 7 blastocyst to elongated conceptus in IVP-and MOET-derived embryos
The numbers of DEG and DMR were 1535 and 43928, respectively. For the DMR, 19562 genes were differentially hypermethylated and 24366 were differentially hypomethylated in the IVP group relative to the MOET group from the blastocyst stage to the day 17 conceptus. The absolute proportions of hypermethylated CpG islands and exonic regions were greater for the IVP group than the MOET group (S2 Fig). Accordingly, there were 6360 DMG (i.e, genes that were differentially methylated in their exonic regions), corresponding to 4495 (70.7%) hypermethylated regions in the IVP group and 1865 regions (29.3%) in the MOET group.

Genes differentially changing both in expression and methylation levels from blastocyst to elongated conceptus in IVP-versus MOET-derived embryos
The main results of this step are illustrated in Fig 2. The Venn Diagram (panel A) depicts 490 genes that were both differentially expressed and hyper -or hypo-methylated from the blastocyst stage to the elongated conceptus in IVP-derived compared to MOET-derived embryos. The heat map and dendrogram resulting from the hierarchical clustering of these 490 genes according to their expression in day 13 and day 16 conceptuses are shown in panel B. There were three main clusters that presented the following average expression patterns in the IVP conceptuses compared to the MOET conceptuses (panel C). Cluster 1: 188 genes decreasing in expression by day 13 but increasing again by day 16. Cluster 2: 191 genes with similar expression by day 13 but lower expression by day 16. Cluster 3: 111 genes with increased expression by day 13 and day 16. The list of genes in each cluster is detailed in S1 Table. Accordingly, the expression profile for genes in these clusters in MOET embryos followed similar trajectories to those in AI-derived embryos, underlining the consistency of the results (Fig 3). The top enriched annotation terms (p<0.01) in each of the three clusters are shown in Table 1. The KEGG pathway "focal adhesion" was the top enriched pathway in Cluster 1 (FDR = 0.003), followed by the cellular component: "extracellular exosome" (FDR = 0.04). This last term was the top enriched term in Cluster 2 (FDR<0.0001), followed by UniProtKB keyword "Phosphoprotein" (FDR = 0.01). Genes in Cluster 3 also enriched the "extracellular exosome" term, but this enrichment was not significant at the adjusted p-value (FDR = 0.6).
The number and proportion of genes hypermethylated in the IVP or MOET groups, respectively, in each cluster, were as follows-Cluster 1: 151 (80.3%) and 37 (19.7%); Cluster 2: 130 (68.1%) and 61 (31.9%); Cluster 3: 78 (70.3%) and 33 (29.7%). From the three clusters, only Cluster 1 presented a significantly higher proportion of hypermethylated genes in the IVP group, given the proportions of methylation in the exons of the DMR (p = 0.001). Furthermore, the differences in expression and methylation levels for each of the genes in Cluster 1

Predictions in independent datasets
The overall rationale behind this analysis was the following: given that a delayed development would be manifested in reduced conceptus length, and that DKK1 treatment improves embryo survival, the expression pattern of at least some genes should resemble those of IVP conceptuses in the first case, and MOET conceptuses in the second. Thus, the hypotheses linked with these predictions, based on expression data of certain genes from day 15 embryos, were the following for each dataset. For GSE75750: short and long conceptuses will be predicted to be IVP and MOET embryos, respectively, while for GSE126680, DKK1-treated and non-treated embryos will be predicted to be MOET and IVP conceptus, respectively. Next, we show the results of each prediction, and the corresponding accuracy according to the hypotheses, using the expression from the following genes to construct each model: • 1535 DEG: the accuracy of the predictions was 70% for samples from both datasets, which was not significant (p = 0.17). Therefore, these genes do not have a comparable expression pattern between IVP and short MOET embryos or between MOET and DKK1-treated embryos.
• 490 overlapping DEG and DMG: for GSE75750 samples, all the short conceptuses and four out of five long conceptuses were predicted as IVP and MOET conceptuses, respectively, giving an accuracy of 90% (p = 0.01). For GSE126680, four out of five of the DKK1-treated and non-treated embryos were predicted as MOET and IVP embryos, respectively, with an accuracy of 80% (p = 0.054).
• Genes in Cluster 1: The accuracy of the prediction was 100% for samples from GSE75750, since all the short and long conceptuses were classified as being derived from IVP and MOET, respectively (p = 0.005). For samples from GSE126680, all the DKK1-treated embryos and four out of five control embryos were predicted as MOET and IVP conceptuses, respectively, with an accuracy of 90% (p = 0.01). Predictions with the other two clusters did not reach such a high accuracy (GSE75750: Cluster 2: 66%, p = 0.37; Cluster 3: 70%, p = 0.17. GSE126680: Cluster 2: 70%, p = 0.17; Cluster 3: 50%, p = 0.62).

Discussion
Embryo transfer technology has been used in cattle more intensively than in any other livestock species [4]. The number of bovine embryos that are transferred worldwide is constantly increasing [55], and thus, for the past few years the number of IVP embryos transferred has surpassed the number produced by MOET. Indeed, data collated by the International Embryo Technology Society illustrate this divergent trend regarding the type of production: a decrease (-17.5%) in the number of in vivo-derived embryos whereas the production of IVP embryos reached a plateau (+0.2%), being 71% of the reported transferred embryos worldwide in 2019 [4]. Nevertheless, the lower performance of IVP embryos compared to MOET embryos to sustain pregnancy is critical for the industry and remains a hot topic of debate; despite much research, a complete understanding of the underlying issues involved in the suboptimal competency of IVP embryo is still lacking. The current meta-analysis of available transcriptomic and epigenomic data from IVP and MOET embryos at different key stages of development will advance the field. However, this study presents challenges as embryos were produced in distinct environments, using, for example, different parental genetic combinations, culture conditions and superovulatory treatments. Furthermore, the combination of different datasets precludes the application of methods that requires the omics techniques to be measured in the same individual, and so, our approach consisted of single-omics analysis and posterior correlation. Nevertheless, and precisely because of the heterogeneity of the data, the results reported here can be interpreted as those occurring consistently because of the in vitro procedure, or more probably, the lack of oviductal environment during the first days, which impacts the subsequent embryonic development of the TE [56].
In the present study, results obtained from the re-analysis of the transcriptomic data were compared to the epigenomic data, to identify genes for which expression and DNA methylation patterns from blastocyst to elongated conceptus changed differentially between IVP and MOET embryos, which could impact the TE function. This embryonic region was chosen because differential DNA methylation was found to be prominent at intragenic sequences within the TE of IVP and MOET day 17 conceptuses, but not in the embryonic disc [35]. In addition, the smaller size observed in IVP embryos at day 13 [9], compared to MOET counterparts, may reflect a defect in elongation, potentially caused by errors in the TE.
Comparison of DEG and DMG resulted in the identification of 490 genes that were classified in three clusters according to their expression in day 13 and day 16 conceptuses (Fig 2A and  2B). To discern the variation more clearly in expression in the embryos undergoing elongation, samples from blastocysts were not considered for the hierarchical clustering. However, the plots of the expression levels for each cluster depict the pattern at the 3 time points, i. e., day 7, day 13 and day 16 of embryonic age (Fig 2C). The first cluster, Cluster 1, was constituted by genes that showed a remarkable difference in expression at day 13 between the IVP and MOET groups. Specifically, these genes, on average, decreased in expression in the IVP conceptuses at day 13, and increased again at day 16. In contrast, genes in Cluster 2 exhibited a similar expression at day 13 between groups but then strongly increased in expression in day 16 MOET-derived conceptuses. Lastly, genes in Cluster 3 were more highly expressed in IVP conceptuses than MOET conceptuses at both days 13 and 16. Notoriously, the comparable expression profiles of genes in the three clusters between MOET conceptuses and conceptuses produced after AI in an independent experiment (Fig 3), validate the pipeline followed to identify these genes and the reproducibility of the results reported here. In addition, they suggest that these genes could behave in a similar fashion in embryos produced in vivo but not in vitro, and they are weakly influenced by other factors such as the parental genetic combination. However, it is worth noting that the endometrium is sensitive to the embryo characteristics, since for example, both the origin [14] and length of the embryos [7] modify the endometrial transcriptome. Therefore, the maternal environment could influence the embryonic transcriptome as well.
The functional annotation analysis for genes in each of the identified clusters revealed that genes in Cluster 1 significantly enriched for the "focal adhesion" pathway (FDR<0.05). In addition, the cellular component "extracellular exosome" was enriched by genes in both Clusters 1 and 2 (FDR<0.05) and involved 23% and 26 of the genes in the Clusters 1 and 2, respectively. The expression pattern of genes encoding for adhesion proteins is critical to establish the biological shape and structure of the embryos since they maintain the polarity of cell associations with their neighbours and the surrounding extracellular matrix (ECM) [57]. Furthermore, remodelling of the actin cytoskeleton, a key component of eukaryotic cells, is achieved through actin-binding proteins [58], which was another term significantly enriched in Cluster 1 (FDR<0.1). Therefore, focal adhesion is essential for cytoskeletal organization, differentiation, proliferation, and survival of the embryo [59,60]. A closer evaluation of the expression and methylation profiles of the 13 genes from Cluster 1 involved in the "focal adhesion" pathway (Fig 4) showed that for almost all the genes (except for MYL12A) the methylation pattern differed at the blastocyst stage. Interestingly, previous work reported that culture conditions of IVP embryos de-regulated the focal adhesion pathway at the blastocyst stage [61]. Furthermore, supplementation of the culture media with epidermal growth factor (EGF) and hyaluronic acid (HA) during or after embryonic genome activation altered the expression and DNA methylation patterns of genes involved in this pathway [62]. Stimulation of growth factors, such as EGF, and adhesion to the ECM is required for normal cell growth [63], and HA is one of the main components of ECM, that can improve the blastocyst rate of IVP bovine embryos [64]. Thus, the deregulation of genes involved in the focal adhesion pathway impacts embryonic developmental competence and quality.
In the present study, all genes from Cluster 1 involved in this pathway (except for COL1A2, MYL12A and LAMA1) showed a similar DNA methylation pattern: it increased or decreased from the blastocyst stage to the day 17 conceptus for IVP or MOET embryos, respectively. For COL1A2 and MYL12A, DNA methylation pattern increased with embryo age in both groups, while for LAMA1, it decreased or increased from blastocyst to day 17 conceptus for IVP or MOET embryos, respectively. DNA methylation in gene bodies is surprisingly abundant and has been demonstrated to be positively correlated with gene expression [65]. However, gene expression levels are better inversely correlated with the methylation of the first exon than with that of the promoter [31]. Therefore, intragenic DNA methylation is involved in the regulation of gene expression, and its inhibition or suppression depends on the genomic region [66]. Here, the exact relationship between DNA methylation and gene expression cannot be directly estimated since changes in DNA methylation were calculated from the blastocyst to day 17 conceptus, while gene expression was assessed from blastocyst to day 13 and day 16 conceptus. Thus, it is not possible to dissect the dynamics of DNA methylation between the blastocyst and day 17 conceptus. One possibility is that the increasing methylation toward day 17 for genes in the focal adhesion pathway in IVP embryos was related with the "normalization" in the expression toward day 16, since, at day 13, the difference in expression between IVP and MOET embryos was remarkable but not at day 16. For COL1A2 and MYL12A, the DNA methylation pattern seemed to be inversely correlated with gene expression; while for LAMA1, the hypomethylation at day 17 of IVP embryos compared to MOET embryos was probably related with a higher expression at day 16. In addition to focal adhesion, the other term significantly enriched in both Cluster 1 and Cluster 2 (FDR<0.05), with around a quarter of the genes in those clusters, was "extracellular exosome". This term was enriched in Cluster 3 as well, but with an FDR = 0.6. For Cluster 1, seven out of the 13 genes involved in focal adhesion (CAPN2, COL1A2, FLNA, FLNB, LAMB1, TLN1 and THBS4) were part of this cellular component. Exosomes are vesicles released into the extracellular region by fusion of the limiting endosomal membrane of a multivesicular body with the plasma membrane [67]. These vesicles can be secreted by oviductal and endometrial cells [68], and also by IVP and in vivo derived embryos [69,70]. They can play essential roles in maternal-embryo communication [71]. Extracellular vesicles derived from trophoblast cells from the conceptus at day 15 and 17 have been shown to contain interferon-tau, in addition to bioactive molecules that modulate the adhesion pathway [70,72]. Therefore, these vesicles participate in embryo-maternal interactions during early embryonic development and the maternal recognition of pregnancy, although the temporal expression of the gene encoding for interferon-tau did not differ between IVP and MOET embryos in this study (S3 Fig). Interestingly, genes that encoded for extracellular vesicles were related to (p<0.05): ECM receptor interaction, collagen, and epidermal growth factor in Cluster 1 (genes downregulated at day 13 in IVP embryos); glycoprotein and glycolysis/gluconeogenesis in Cluster 2 (genes more expressed at day 16 in MOET conceptuses) and protein acetylation in Cluster 3 (genes more expressed at day 13 and day 16 in IVP conceptuses).
The final step in this study was to underpin our results through the application of machine learning methods, to make predictions in data from day 15 IVP and MOET conceptuses under different conditions. The hypothesis tested with these predictions was that short (i.e., potentially developmentally compromised) MOET conceptuses and IVP embryos treated with DKK1 (i.e., developmentally superior) would be predicted as IVP and MOET embryos, respectively, according to the expression of at least some genes. The accuracy of the predictions improved when using the expression of the overlapping genes between DEG and DMG, compared to using only the DEG, highlighting the value of combining two "omics" technologies rather than a single one, to explore biological events. Furthermore, using the expression of the 188 genes in Cluster 1, in the day 13 and day 16 IVP and MOET conceptuses to train the model, the accuracies of the predictions were higher than 90% (p<0.05). That is, all short and long embryos were predicted as IVP and MOET, respectively, while all DKK1-treated embryos and four out five control embryos were predicted as MOET and IVP, respectively. This high accuracy could not be achieved with the genes in the other clusters, probably because the expression at day 16 has more relevance to stratify IVP and MOET embryos, and the models were tested with day 15 conceptuses.
These results are not suggesting that IVP embryos are always shorter or that DKK1-treated embryos would resemble MOET embryos. Rather, these findings support the notion that genes in Cluster 1, which show a strong deviation in their expression at day 13 between the groups, are important for conceptus elongation. In other words, IVP embryos could exhibit de-regulation of genes involved with development in the TE, while the early treatment with DKK1 would normalize the expression of those genes. In the study by Barnwell et al., [48], from which the data about short and long conceptuses were obtained, the authors found DEG in the long versus the short conceptuses were related to actin filaments of the cytoskeleton, consistent with genes in Cluster 1. Thus, inadequate elongation of IVP conceptuses could be a major contributing factor to early embryonic death seen during the peri-implantation period [73]. Predictions performed in the dataset obtained from the study of Tribulo et al. [49] indicate that treatment of IVP embryos with DKK1 between days 5 and 7.5, just before embryo transfer, could induce epigenomics modifications during the morula to blastocyst transition to regulate subsequent trophoblast elongation. Treatment with DKK1 significantly increased the length of filamentous conceptuses from 43.9 mm to 117.4 mm and the intrauterine content of interferon tau from 4.9 μg/ml to 16.6 μg/ml. In the present study, conceptuses of similar length derived from the transfer of embryos treated or not with DKK1 were selected. Therefore, the action of DKK1 probably involved the induction of the expression of genes related to focal adhesion as well, which in part can explain the increase in blastocyst development and pregnancy retention in DKK1-treated IVP embryos [74].

Conclusion
This study applied a multi-omics data integration approach to identify genes that are both differentially expressed and methylated from the blastocyst to elongated conceptus stage, between IVP-and MOET-derived embryos. The integrated transcriptomic and epigenetics datasets were analysed using a set of bioinformatics methods and we evaluated the predictive ability of key genes using additional external data through machine learning methods. The results revealed a group of genes (Cluster 1) with a strong deviation in their expression between IVP and MOET embryos at day 13, when the elongation process is initiated. Several of these genes were significantly related to the focal adhesion pathway. Furthermore, their expression predicted less developed (short) day 15 MOET conceptuses as being IVP embryos and, conversely, day 15 IVP embryos treated with the progestomedin DKK1 as being MOET embryos. Therefore, the IVP process induces epigenomic and therefore transcriptomic changes in the early embryo that have consequences for subsequent conceptus elongation and focal adhesion, essential processes for survival. Results reported here can help in the understanding of some of the factors involved in the post-transfer pregnancy failures, that occur in cattle following the transfer of IVP embryos.