Time-resolved proteomics of adenovirus infected cells

Viral infections cause large problems in the world and deeper understanding of the disease mechanisms is needed. Here we present an analytical strategy to investigate the host cell protein changes during human adenovirus type 2 (HAdV-C2 or Ad2) infection of lung fibroblasts by stable isotope labelling of amino acids in cell culture (SILAC) and nanoLC-MS/MS. This work focuses on early phase of infection (6 and 12 h post-infection (hpi)) but the data is combined with previously published late phase (24 and 36 hpi) proteomics data to produce a time series covering the complete infection. As many as 2169 proteins were quantitatively monitored from 6 to 36 hpi, while some proteins were time-specific. After applying different filter criteria, 2027 and 2150 proteins were quantified at 6 and 12 hpi and among them, 431 and 544 were significantly altered at the two time points. Pathway analysis showed that the De novo purine and pyrimidine biosynthesis, Glycolysis and Cytoskeletal regulation by Rho GTPase pathways were activated early during infection while inactivation of the Integrin signalling pathway started between 6 and 12 hpi. Moreover, upstream regulator analysis predicted MYC to be activated with time of infection and protein and RNA data for genes controlled by this transcription factor showed good correlation, which validated the use of protein data for this prediction. Among the identified phosphorylation sites, a group related to glycolysis and cytoskeletal reorganization were up-regulated during infection. The results show specific aspects on how the host cell proteins, the final products in the genetic information flow, are influenced by Ad2 infection, which would be overlooked if only knowledge derived from mRNA data is considered.


Introduction
A virus infection consists of the entry of the virion into the host cell, the translation of viral mRNA by the host ribosomes, the replication of the viral genome, the assembly of the viral particles enclosing the genome, and the release of the infectious particles from the cell. Human adenoviruses (HAdVs) are non-enveloped icosahedral viruses with linear double-stranded DNA genomes of 30-38 kb. They are among the most efficient DNA viruses that replicate in cell culture and are therefore a good model system of infection. To ensure efficient production of progeny virus, HAdVs counteract the host cell antiviral defence and create optimal conditions for their DNA replication. For this purpose, HAdVs encode several proteins within the early regions: E1A, E1B, E3, and E4. The E1A proteins act as promiscuous transcriptional activators or repressors of both cellular and viral genes [1,2,3], and they are essential for forcing the host cell to enter the S phase. Two E1B proteins, E1B-55K and E1B-19K, play major roles in counteracting the pro-apoptotic program [4,5]. Various proteins encoded in the adenovirus E3 transcription unit interact with the host immune system to maintain cell viability [6]. Proteins encoded in the E4 transcription unit are mainly involved in viral mRNA export from the nucleus, viral DNA replication, and the shut-off of host protein synthesis [7]. After the onset of HAdV DNA replication, viral transcription switches from the early to the late mode. Concurrently, the host cell biomolecules are changed.
Several RNA-based technologies have revealed that human adenovirus type 2 (HAdV-C2 or Ad2) infection in human primary lung fibroblasts (IMR-90) can be divided into four periods, and that the major changes in mRNA, miRNA and lncRNA expression occur after 24 hours post-infection (hpi) [8,9,10,11]. The first period is from 0 to 12 hpi, and during this time the viral gene expression begins. The first response to the incoming virus is most likely growth suppression, since most of the regulated genes (RNAs) observed at early time points have functions linked to inhibition of cell growth [8]. The second period covers the time from 12 to 24 hpi, and it follows expression of the E1A gene. During this period, there are profound changes in the host cell gene expression that create an optimal environment for replication of the viral genome. At the RNA level, about 50% of the altered genes are involved in cell cycle regulation, cell proliferation and antiviral response [9]. The third period extends from 24 to 36 hpi. By this time, the virus has gained control of the cellular metabolic machinery, resulting in efficient replication of the viral genome. During the fourth and last period at 42 hpi, the cytopathic effect becomes apparent and the number of down-regulated genes increases dramatically, including many genes involved in intra- and extracellular structures.
The fact that most studies on the host cell response to Ad2 infection are derived from mRNA studies limits our knowledge, since proteins are the final actors in cellular processes. Proteomes and their variations are efficiently studied by mass spectrometry-based methods [12]. Most proteomic studies report static snapshots, e.g. comparing control and infected conditions, which provide valuable but limited information since proteomes respond dynamically in space and time. A better alternative is to use protein dynamics data, e.g. time series, to describe these processes [13,14,15]. Time series have previously been used to study the Herpes Simplex Virus Type 1 infection of human foreskin fibroblasts [16], the macrophage response to Vesicular Stomatitis Virus infection [17], and the infection by H9N2 influenza virus in a human gastric carcinoma cell line [18], but also to investigate essential cellular functions [19]. Among the different strategies developed for the comparison of protein changes [20], the incorporation of stable isotopes using metabolic labelling of amino acids in cell culture (SILAC) has been shown to decrease the errors in quantification, and it offers good precision when comparing biological samples [21]. However, some additional problems at the MS quantification level can be observed due to the incomplete incorporation of isotopic amino acids or experimental mixing problems. These problems can be solved using computational approaches [22,23], or experimental corrections such as including SILAC label-swap replicates [24]. Different bioinformatics, statistical and graphical software tools have been developed to extract the relevant biological information from these analyses [25], and to visualize the massive amount of data generated by these high-throughput techniques [26].
SILAC in combination with LC−MS/MS has been used in Ad2 studies to investigate the protein degradation after inactivation of the virus by sunlight and UVC light [27], the quantitative changes in the protein composition of the nucleolus during infection [28], and the temporal characterization of the non-structural proteome and phosphoproteome of Ad2 [29], as well as to develop a strategy to identify proteins using transcriptomic data from adenovirus type 5 (Ad5) infection of HeLa cells [30]. We have previously used this technology to study the host cell protein regulation during the late phase of infection [31]. That study demonstrated that at 24 hpi the up-regulated proteins were related to carbohydrate and nucleoside metabolism, and that at 36 hpi they were also involved in protein translation, and in protein and DNA metabolic processes. Proteins involved in cellular structures and in integrin-mediated cell signalling were down-regulated. Using the data from all the proteins and their corresponding mRNAs, an overall low correlation (≈ 0.3) was observed [31]. This result is not surprising, as it has been observed and discussed before [30,32,33,34]. The low correlation makes the prediction of protein changes from RNA data extremely uncertain, and it is therefore highly relevant to study the protein changes at specific times during Ad2 infection with separate technologies.
The aim of the present study is to characterize the host cell protein regulation induced by Ad2 infection using SILAC and high-resolution nanoLC−MS/MS at early and late phases. Stringent cut-off levels were applied to certify high quality data, and different bioinformatics and statistical tools were developed and combined for the identification of biological functions, the visualization of the protein regulation in the different pathways, and the analysis of the altered transcription factors affected during the infection process.

Cell culture, infection and sample harvest
Human lung fibroblasts (IMR-90) were purchased from American Type Culture Collection and cultured in 10 cm² plates under the same conditions as previously described [31]. After six cell doublings and when confluence was reached, cells were kept for 2 days at 37 °C for synchronization by growth inhibition. Then, 16.8 × 10⁶ labelled cells were either mock infected (medium only) or infected with Ad2 at a multiplicity of infection of 100 fluorescence-forming units/cell in 1 mL of serum-free medium for 60 min. Afterwards, viruses were removed, and the cells were collected at 6, 12, 24 and 36 hpi. All samples were harvested on the same occasion, but the late time points (24 and 36 hpi) were used in our previous study. A biological replicate in the form of a swap-labelling experiment was performed. Cells were washed with PBS and directly snap frozen on dry ice.
The raw data from 6 and 12 hpi, and the previously published raw data from 24 and 36 hpi (PRIDE Accession number: PXD004095), were processed together using MaxQuant (1.4.1.2) [36], and database searches were performed using the implemented Andromeda search engine [37]. MS/MS spectra were correlated to the Uniprot human database (release 2017-02, 156787 entries) combined with a human Ad2 database (release 2017-02, 560 entries). The false discovery rate (FDR) was calculated based on reverse sequences from the target-decoy search, and an FDR of 1% was accepted for protein and peptide identification. For data processing, only peptides with a minimum of 7 amino acids and a maximum of two missed cleavages were accepted, and the mass tolerance was 4.5 ppm for the main search and 20 ppm for the fragment masses. Trypsin was selected as digesting enzyme, carbamidomethylation of cysteines as fixed modification, and oxidation of methionine, acetylation of the protein N-terminus and phosphorylation (STY) as variable modifications. For protein identification, at least one unique peptide and two peptides were required. For SILAC labelling quantification, Lys8 and Arg10 were set as heavy labels, and a minimum of two ratio counts was required. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE [38] partner repository with the data set identifier PXD008980. The data for protein identification and quantification are provided in S1 Table.
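The target-decoy idea behind the 1% FDR threshold can be illustrated with a minimal sketch. This is not MaxQuant's actual procedure, which is more elaborate; it only shows the generic principle of estimating the FDR as the ratio of decoy (reverse) hits to target hits above a score threshold. The function name, the input layout and the toy scores are assumptions made for illustration.

```python
def accepted_at_fdr(matches, max_fdr=0.01):
    """Return the target scores accepted at the given FDR.

    matches: iterable of (score, is_decoy) tuples (hypothetical layout).
    The FDR at each rank is estimated as decoys / targets among all
    matches scoring at least as well; the longest prefix of the ranked
    list that still satisfies max_fdr is kept.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of ranked entries to keep
    for rank, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            keep = rank
    # Decoy hits are only used for the estimate and are never reported.
    return [score for score, is_decoy in ranked[:keep] if not is_decoy]


# Toy example: one reverse hit ranked third.
toy = [(10, False), (9, False), (8, True), (7, False)]
strict = accepted_at_fdr(toy, max_fdr=0.01)
lenient = accepted_at_fdr(toy, max_fdr=0.5)
```

With the strict threshold only the two matches ranked above the first decoy survive, whereas the lenient threshold also admits the target ranked below it.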

Statistical and bioinformatics analysis
After MaxQuant analysis, the "proteinGroups.txt" file obtained was loaded into the Perseus software (http://141.61.102.17/perseus_doku/doku.php?id=start). Prior to any statistical analysis, identifications flagged as reverse, potential contaminants, or proteins identified only by site were excluded from further analysis. MaxQuant automatically normalized each data set to the median of the ratios to correct for the mixing of H and L labelled cells at 1:1 ratios, and the values were transformed to the log2 scale for a better comparison between the different conditions. Principal Component Analysis (PCA) was carried out using Statistica software, and Pearson's correlation test was performed using Perseus. To identify the proteins significantly altered/regulated, a 1.5-fold change cut-off (up- or down-regulated, equivalent to 0.585 and -0.585 in log2 scale) was applied to each replicate, and the average of the two replicates was obtained. The lists of significantly altered proteins were then uploaded into the web-based Panther software (http://www.pantherdb.org/) and into Ingenuity Pathway Analysis software (IPA; Qiagen, Redwood City, CA) to perform different analyses. Using Panther software, the overrepresented pathways in the lists of proteins were identified, and p-values were calculated using Fisher's exact test and considered significant when < 0.05. Moreover, IPA was used to perform a causal Upstream Regulator (UR) analysis, where a transcription factor state (activation or inactivation) is predicted based on a Fisher's exact test between the list of proteins and the published targets of this regulator found in the literature. RNA data from a previous publication [31] were used for correlation of protein and RNA in the UR analysis.
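The fold-change criterion described above can be sketched as follows. The sketch assumes that the ratios of the label-swap replicate have already been inverted, so that both values express Ad2/Mock on the log2 scale, and that the cut-off is inclusive; neither detail is stated in the text. The function name and example values are hypothetical.

```python
import math

LOG2_CUTOFF = math.log2(1.5)  # about 0.585, the 1.5-fold change cut-off


def classify(rep1_log2, rep2_log2, cutoff=LOG2_CUTOFF):
    """Classify one protein from its two replicate log2 ratios.

    A protein counts as altered only if BOTH replicates pass the
    cut-off in the same direction; the reported value is the mean.
    """
    mean = (rep1_log2 + rep2_log2) / 2
    if rep1_log2 >= cutoff and rep2_log2 >= cutoff:
        return "up", mean
    if rep1_log2 <= -cutoff and rep2_log2 <= -cutoff:
        return "down", mean
    return "unchanged", mean


# Hypothetical replicate values, for illustration only.
up_call = classify(0.7, 0.8)
mixed_call = classify(0.7, 0.3)   # one replicate below the cut-off
down_call = classify(-1.0, -0.6)
```

Requiring each replicate to pass individually is stricter than thresholding the mean: a protein with replicate ratios of 0.7 and 0.3 is not called altered, because the second replicate falls below the cut-off.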
After the functional annotation, and to generate heat maps, the log2-values of the proteins involved in the identified Panther pathways were plotted by hierarchical clustering. Thereafter, the values were normalized (subtracting the mean value of each data set and dividing by its standard deviation) and the Euclidean distances with respect to the centre were calculated and represented using box-plot graphs using GraphPad Prism software. In box-plot graphs, Euclidean distances are represented by different colours (red < 1; orange, 1-1.5; yellow 1.5-2; green, > 2).
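The normalization and distance calculation described above can be sketched as follows, under stated assumptions: each data set (time point) is standardized with the sample standard deviation, "the centre" is interpreted as the origin of the standardized space, and the colour bins are treated as half-open intervals. The text does not specify any of these details, and all function names are hypothetical.

```python
import math
import statistics


def zscore(values):
    """Standardize one data set: subtract its mean, divide by its SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (assumption)
    return [(v - mean) / sd for v in values]


def protein_distances(columns):
    """columns: one list of log2 ratios per time point, same protein order.

    Returns the Euclidean distance of each protein from the centre
    (taken here to be the origin after standardization).
    """
    z_columns = [zscore(col) for col in columns]
    return [math.sqrt(sum(z * z for z in row)) for row in zip(*z_columns)]


def colour_bin(distance):
    """Map a distance to the colour code used in the box-plot graphs."""
    if distance < 1:
        return "red"
    if distance < 1.5:
        return "orange"
    if distance <= 2:
        return "yellow"
    return "green"


# Toy input: three proteins measured at two time points.
distances = protein_distances([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
colours = [colour_bin(d) for d in distances]
```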
Phosphopeptides were studied using the "Phospho (STY) sites.txt" file generated by MaxQuant. Identifications flagged as reverse or potential contaminants were excluded, and only phosphorylated sites with a localization probability > 0.75 were considered. To identify phosphorylated sites significantly altered, a 1.5-fold change cut-off (equivalent to 0.585 and -0.585 in log2 scale) was applied to each replicate, and the average of the two replicates was obtained. The data for the phosphorylated site identification and quantification are provided in S2 Table.

Results and discussion

Study of cellular protein regulation during early phase of Ad2 infection using SILAC-MS technology
As a first step, the incorporation efficiency of heavy lysine and arginine was evaluated and was > 97% after five passages before infection. The infection process was then studied as a time series with four time points and two biological replicates as swap-labelling experiments, as previously suggested [24]. Proteins from Ad2(Light)/Mock(Heavy) and Mock(Light)/Ad2(Heavy) were combined 1:1, fractionated using SDS-PAGE, in-gel digested, and analysed by nanoLC-MS/MS to obtain SILAC protein ratios for accurate relative quantification of proteins. Using data from all four time points and applying an FDR of 1%, 4591 proteins were identified (S1 Table). An average of 3300 proteins could be quantified in at least one biological replicate at each time point (Fig 1A and Table 1) and among them, more than 82% were quantified in both biological replicates at each time point. This indicates good reproducibility of the technology used. The results were comparable to previous studies by us and by Evans et al. [30,31]. In the latter, HeLa cells were infected by Ad5 and the cellular proteome was evaluated after 8 and 24 hpi. Even though the samples were fractionated into more slices than in the present work (14 instead of 10) and a longer analytical column was used (250 instead of 100 mm), a similar number of proteins was quantified [30]. As presented in the Venn diagram (Fig 1B), 2169 proteins (64% of the quantified proteins) could be quantified at all time points, while some proteins were time-specific.
Thereafter, the correlation between the two biological replicates at each time point was evaluated using Pearson's correlation test. Using all proteins quantified in both replicates (Fig 1A), the obtained Pearson's correlation coefficient was low (r = 0.30) at 6 hpi, and medium (r = 0.54) at 12 hpi. However, the correlation increased at late time points (r = 0.68 for 24 hpi and r = 0.67 for 36 hpi) (Fig 2A, in grey). To increase the reliability of the protein quantification, different filtering criteria were applied. Firstly, proteins with opposite regulation profiles between the two biological replicates were removed from the analyses in order to provide only reliable biological data (Table 1). As a result of this filtering, the Pearson's correlation coefficient increased substantially (Fig 2A, in black), being > 0.80 at all time points. The number of proteins not passing the filtering criteria at early time points (≈ 800) was higher than the number of proteins removed at late time points (≈ 400) (Table 1). These results suggest that the changes in the host proteome are more variable during the early phase of infection and that they stabilize as the infection progresses, giving more reproducible proteomes at late time points. The number of proteins considered at 24 and 36 hpi was lower than in our previous study [31], since this combined analysis of all time points is more stringent. For instance, 60% of the proteins removed due to inconsistent changes between replicates had a fold change between 0 and ± 0.1 (in log2 scale) in at least one of the replicates. Such small changes should not be used for biological conclusions. As a result, higher correlation values were obtained (0.90 at 24 hpi and 0.89 at 36 hpi), compared to the previous ones (0.81 and 0.89) [31]. Proteins passing this first filtering criterion were analysed by PCA to determine, compare and visualize the overall relation between the four conditions studied (Fig 2B).
This analysis indicated that the combination of the two main components, PC1 and PC2, captured more than 65% of the variance of the data. PC1 could separate the samples representing 6 and 12 hpi vs. 24 and 36 hpi, while PC2 could distinguish the two different early time points (6 and 12 hpi), and also the two different late time points (24 and 36 hpi).
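The first filtering criterion and the replicate correlation check described above can be sketched in a few lines. The sketch assumes that both replicate ratios are on the log2 scale and oriented the same way (Ad2/Mock), and that a ratio of exactly zero is not treated as opposite regulation; the function names and toy values are hypothetical and do not come from the data set.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def drop_opposite(pairs):
    """Keep proteins whose two replicate log2 ratios share the same sign."""
    return [(a, b) for a, b in pairs if a * b >= 0]


# Toy replicate pairs (log2 ratios); two of them change sign between replicates.
pairs = [(1.0, 0.8), (-0.5, -0.7), (0.05, -0.08), (0.6, 0.9), (-0.1, 0.1)]
kept = drop_opposite(pairs)
r_all = pearson(*zip(*pairs))
r_kept = pearson(*zip(*kept))
```

As in the toy input, most sign flips are expected among ratios close to zero, which is consistent with the observation above that the majority of removed proteins had a log2 fold change within ± 0.1 in at least one replicate.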
Secondly, and in agreement with other publications [15], a 1.5-fold change cut-off (equivalent to 0.585 and -0.585 in log2 scale) was applied to identify the significantly altered/regulated proteins at each time point (Table 1). In total, approximately 25% (at early phase) and 35% (at late phase) of the quantified proteins were altered with this fold change, and their ratios can be found in Part A-D of S3 Table. During Ad5 infection in HeLa cells, only 1% and 8% of the proteins showed either twofold-or-greater change in protein abundance at 8 and 24 hpi, respectively [30]. These differences could, in addition to the higher cut-off level be explained by the  [8,39,40]. The replication of adenovirus in HeLa cells is also extremely efficient, and therefore the time window for examination of the details of cellular gene expression in this cell line is narrower than in IMR-90 cells [8,39,40]. This could also affect the protein regulation. However, the most relevant proteins, quantified during Ad5 infection of HeLa cells (HSPA1A, MRE11A, ITGA3, RAD50, POLDIP3), were similar (S1 Fig) to Ad2 infection of IMR-90 cells.

Progression of the cellular protein regulation during infection
After the data filtering, different groups/clusters of proteins were formed according to their altered abundances.
Proteins uniquely altered at 6 hpi. Among the 431 proteins significantly altered at 6 hpi, 40 were unique compared to 12, 24 and 36 hpi data (Part A of S3 Table). Of them, 39 and 1 had positive and negative ratios, respectively. Enrichment analysis of these proteins did not reveal any overrepresented pathways due to the small number. However, by searching for functions using the STRING database (https://string-db.org) and GeneCards (https://www.genecards.org), it became apparent that some of them have interesting functions. For example, DSC1 is involved in cell-to-cell adhesion; ACTR1B and TRIP6 in actin polymerization and cytoskeletal organization; and KIF5B and TWF2 are microtubule-related proteins. Other proteins are involved in protein folding (PPIL3), prevention of protein aggregation (CRYAB) or in nucleolar-cytoplasmic transport (TCOF1). One especially interesting protein is BNIP2 (also known as BCL2/adenovirus E1B interacting protein). BNIP2 interacts with the adenovirus E1B 19 kDa protein, which has been implicated in the protection against the cell death program induced by viral infection [41]. The only suppressed protein was CYR61. CYR61 is one of the six extracellular matrix-associated proteins (CCN family), and it plays diverse roles in cell proliferation, survival and migration through interactions with cell adhesion receptors, including integrins [42,43]. Specifically, CYR61 binds to integrin αvβ3 to mediate endothelial cell adhesion [44] and can inhibit cell proliferation and down-regulate the mRNA expression of COL1A1 in normal human fibroblasts [45]. Integrin αvβ3 has also been reported to be used by adenoviruses for internalization [46,47]. Moreover, CYR61 can also interact with ITGB5, a protein which becomes suppressed from 6 to 24 hpi. ITGB5, together with ITGAV, has been demonstrated to act as primary receptor in Coxsackievirus and Adenovirus receptor (CAR)-negative cells [48].
Proteins uniquely altered at 12 hpi. Of the 544 proteins significantly altered at 12 hpi, 57 (54 and 3 had positive and negative ratios, respectively) were uniquely altered in comparison with the other three time points (Part B of S3 Table). The Cytoskeletal regulation by Rho GTPase, the Inflammation mediated by chemokine and cytokine signalling pathway and the Integrin signalling pathways were overrepresented among these proteins. Even though these proteins were uniquely altered at this time point, their corresponding pathways were also active at later time points and will therefore be described in the following sections.
Proteins uniquely altered at both 6 and 12 hpi. To compare early versus late time points of infection, protein data from 6 and 12 hpi were merged and compared with merged 24 and 36 hpi data. The set of proteins significantly altered only at early time points included 48 proteins (47 and 1 with positive and negative ratios, respectively) (Part C of S3 Table). All these proteins showed the same direction of regulation at 6 and 12 hpi. The functional enrichment analysis indicated an overrepresentation of the Cytoskeletal protein binding and the Structural constituent of cytoskeleton molecular functions. Other altered proteins are involved in the regulation of glutathione metabolism and detoxification processes (GCLC, GLRX, TXNDC17, GSTM3, TXN), in the organization of the cytoskeleton (FSCN1, TUBB2B, TBCA, TUBB3, TMSB4X, PFN1) or the proteasome system (UBE2L3, UCHL1), and in signalling pathways such as TGF-β (FKBP1A), Rho (ARHGDIA, GDI1, GDI2) and NF-κB (MTPN). These signalling pathways have also been observed as altered at the transcriptome level [9]. We and other groups have pointed out the stress response protein HSPA1A/B as the protein with the highest change at late phase of infection [31,49]. However, the ratio of this protein was not significant at 6 hpi (0.41 in log2 scale), and it was mildly altered at 12 hpi (0.73 in log2 scale). Similarly, the ratio of mini-chromosome maintenance (MCM) proteins was significantly positive late, but not in the early phase. Conversely, the proteins with the lowest negative ratio at 6 and 12 hpi were collagen proteins (COL1A1, COL1A2, COL3A1, COL5A1, COL5A2, COL12A1) and related proteins such as PCOLCE, which drives the enzymatic cleavage of type I procollagen, and TIMP3, which can form complexes with collagenases. In most cases, similar negative ratios were observed at later time points. 
In addition to the CYR61 protein mentioned above, another protein involved in cell-to-cell interaction (THBS1) was less abundant, which fully agrees with previous mRNA results [9]. Other proteins involved in cell adhesion also had negative ratios: CSPG2 and CTGF. CTGF is another member of the CCN family, and it mediates cell adhesion, directional migration, and proliferation through integrin αvβ3 [50]. It is also related with the platelet-derived growth factor receptor PDGFRA, a protein with a negative ratio from 6 to 36 hpi. Moreover, a negative ratio was observed for SERPINE1, the principal inhibitor of tissue plasminogen activator and urokinase, as well as for the two proteins that are under its control, PLAT and PLAU.
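The grouping of proteins into sets that are uniquely altered at one time point amounts to a set difference, as in the following minimal sketch. The membership shown is a tiny illustrative subset assembled from statements in the text above (for example, CYR61 and BNIP2 being unique to 6 hpi, and PDGFRA being altered from 6 to 36 hpi); it is not the content of S3 Table, and the function name is hypothetical.

```python
def unique_to(time_point, altered):
    """Proteins altered at time_point and at no other time point.

    altered: dict mapping a time point (hpi) to the set of gene names
    that passed the fold-change cut-off at that time point.
    """
    others = set().union(
        *(genes for t, genes in altered.items() if t != time_point)
    )
    return altered[time_point] - others


# Illustrative subset only; see the lead-in for the caveats.
altered = {
    6: {"CYR61", "BNIP2", "ITGB5", "PDGFRA"},
    12: {"ITGB5", "PDGFRA", "HSPA1A"},
    24: {"ITGB5", "PDGFRA", "HSPA1A"},
    36: {"PDGFRA", "HSPA1A"},
}
unique_6 = unique_to(6, altered)
unique_12 = unique_to(12, altered)
```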

Pathway analysis (Functional annotation)
The lists of proteins significantly altered at each specific time were simultaneously analysed using the overrepresentation function included in the web-based software Panther. The significantly overrepresented pathways (p-value < 0.05) in at least one time point are shown in Table 2, and if so, the values for the other time points are presented for comparison.
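The overrepresentation test can be illustrated with a minimal sketch of a one-sided Fisher's exact test, which is equivalent to the upper tail of the hypergeometric distribution. Whether Panther applies the test in exactly this one-sided form, and which reference list it uses as background, are assumptions here; the function and parameter names are hypothetical.

```python
from math import comb


def overrepresentation_p(hits, list_size, pathway_size, background_size):
    """P(at least `hits` pathway members in a random list of `list_size`).

    hits:            altered proteins that belong to the pathway
    list_size:       number of altered proteins submitted
    pathway_size:    proteins annotated to the pathway in the background
    background_size: total proteins in the reference list
    """
    total = comb(background_size, list_size)
    upper = min(list_size, pathway_size)
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Small case that can be checked by hand: drawing all 3 pathway members
# in a list of 3 from a background of 10 has probability 1 / C(10, 3).
p_small = overrepresentation_p(3, 3, 3, 10)
```

A pathway would then be reported as overrepresented when this p-value falls below the 0.05 threshold used in the text.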
6 hpi pathways. Among the identified pathways, two were only found overrepresented at 6 hpi: the Serine glycine biosynthesis and the Mannose metabolism. Even though the ratios of the proteins considered in these pathways were significant (and positive) at 12 and 24 hpi (and some of them at 36 hpi), these pathways were only overrepresented at 6 hpi because other pathways were more significant at later time points. In the case of the Serine glycine biosynthesis pathway, the proteins PSAT1, PHGDH, and PSPH were considered. These proteins are involved in the conversion of the glycolytic intermediate 3-phosphoglycerate into serine, which can then be transformed into glycine. In the case of the Mannose metabolism pathway, the proteins GMPPA, PMM2, and GMPPB are involved in the generation of GDP-mannose, which contributes to N-glycosylation, O-glycosylation, C-mannosylation, and GPI anchor synthesis [51].
12 hpi pathways. Different pathways were uniquely overrepresented at 12 hpi, such as the PDGF signalling pathway, the Fructose galactose metabolism, and the Toll receptor signalling pathway. In these cases, 13, 4 and 10 proteins were significantly altered, respectively. The Toll receptor signalling pathway was the most significantly overrepresented at this time, but as presented in Fig 3A, the overrepresentation was even higher at 24 hpi. In this pathway, several MAP kinases (MAPK1, MAP2K1, MAP2K2, MAPK14, MAP2K3), ubiquitin-conjugating enzymes (UBE2V1 and UBE2N), the NF-kappa-B subunits 2 (NFKB2) and 3 (RELA), and the Inhibitor of nuclear factor kappa-B kinase subunit beta (IKBKB) had positive ratios. Toll-like receptors play critical roles in the innate immune system by recognizing pathogen-associated molecular patterns derived from various microbes, such as adenoviruses [52,53]. Upon the activation of the receptors, the proteins UBE2V1 and UBE2N can form a heterodimer that acts in concert with TRIM5 to activate MAP3K7/TAK1, which in turn phosphorylates and activates IκB kinase (IKK), leading to the activation of NF-κB [54,55].
Another interesting pathway overrepresented at 12 hpi was the PDGF signalling pathway. This pathway is initiated upon binding of PDGF to the PDGF receptor complex. Different homodimers of A-, B-, C- and D-polypeptide chains, and the heterodimer PDGF-AB can be formed, which are synthesized as precursor molecules and must be activated by cleavage [56]. In the case of PDGF-CC and -DD homodimers, the activation is performed by the tissue-type plasminogen activator (PLAT), and in the case of PDGF-DD, by the urokinase-type PA (PLAU). Both PLAT and PLAU had negative ratios at 12 hpi, indicating that the cell is trying to minimize the activation of the PDGF signalling pathway. Once activated, the PDGF receptors create docking sites for signalling molecules, such as the signal transducers and activators of transcription (STATs). However, these receptors also bind adaptor molecules which form complexes with other signalling molecules, such as the regulatory subunit p85 of the phosphatidylinositol 3′-kinase (PI3K), and GRB2, which binds the nucleotide exchange molecule SOS1, activating RAS and the ERK MAP-kinase pathway. The activation of PI3K can also lead to actin reorganization via the GTPase activator for the Rho, Rac and Cdc42 proteins (ARHGAP1). These proteins were found with a positive ratio in the early phase. Furthermore, the PDGF receptors can interact with other receptors, such as the EGF receptor [57] or integrins [58]. The pathways controlled by these receptors were also found overrepresented in this study (Table 2), but the receptors had negative ratios in the late phase. In accordance with the regulation of these receptors, the Platelet-derived growth factor receptor alpha (PDGFRA) had a negative ratio at all time points of the infection, and the beta receptor (PDGFRB) only at late time points, suggesting that the cell is trying to shut off all these pathways to impede the ongoing infection.
The Fructose galactose metabolism function was also highly enriched, and the ratios of 4 proteins (ALDOA, ALDOC, GALM, and GALE) out of the 12 contained in this pathway were positive. These proteins are involved in the production of precursors that are used in the glycolysis cycle for energy production. Moreover, the Gonadotropin-releasing hormone receptor pathway and the Insulin/IGF pathway-mitogen activated protein kinase/MAP kinase cascade were also overrepresented. However, the proteins considered in these pathways are not pathway-specific, therefore these results were regarded as false positives.
The most interesting pathway overrepresented at early time points was the Cytoskeletal regulation by Rho GTPase. This pathway has been studied in depth during adenovirus infection [59]. After the attachment of adenovirus via CAR and/or the αv integrin, adenovirus endocytosis requires the reorganization of the host cell actin cytoskeleton. As commented above, PI3K is responsible for the activation of ARHGAP1, which activates the small GTP-binding proteins Ras, Rho, Rac, and CDC42. Rho GTPases also regulate certain mitogen-activated protein (MAP) kinase pathways following integrin or growth factor ligation [60]. Another interesting protein involved in adenovirus internalization is PAK1, a member of the p-21 serine-threonine kinase family, which contains a high-affinity binding site for the GTP-bound form of Rac and CDC42 [61]. Binding of Rho GTPases to PAK1 or PAK2 results in autophosphorylation and activation of their kinase activity, leading to reorganization of the actin cytoskeleton. PAK2 also phosphorylates MAPK4 and MAPK6 and activates the downstream target MAPKAPK5, a regulator of F-actin polymerization and cell migration. Furthermore, the Arp2/3 complex is involved in regulation of actin polymerization. Together with the nucleation-promoting factor (PFN), it mediates the formation of branched actin networks [62]. Several components of the Arp2/3 complex were found with positive ratios during the infection (ARPC1A, ARPC1B, ARPC2, ARPC3, ARPC1A) as well as the PFN1 and PFN2 units. Different tubulin members (TUBB, TUBB2B, TUBB3, TUBB6) and STMN1 (involved in the regulation of the microtubule filament system) also had positive ratios. As can be observed in Table 2, the number of proteins considered in this pathway was higher at 12 hpi, and for those found at both 6 and 12 hpi, the ratios were higher at 12 hpi (Fig 3B). However, most of these proteins were not significantly altered at later time points.
The functional annotation analysis performed in our previous work using transcriptomic data indicated that the most down-regulated mRNAs at 24 hpi are involved in the regulation of the cytoskeleton (such as RhoB, ARHGEF1, and ARHGAP22) [9]. The down-regulation of these mRNAs could explain the non-significant ratios observed at late time points, since mRNAs are known to be less stable than proteins, which take longer to degrade [33].
24 / 36 hpi pathways. When considering the late time points, we also identified the Ras Pathway as overrepresented at 12 and 24 hpi. This pathway plays a key role in transducing extracellular cues, and stimulating cellular proliferation, differentiation and survival. As we noted previously, this pathway could be triggered by the increased levels of GRB2 after activation of the PDGF signalling pathway, but the key protein KRAS was found with negative ratios at 24 and 36 hpi, as well as the protein RRAS2 at 36 hpi. In our previous study, we observed the overrepresentation of the Pentose phosphate pathway at 24 hpi at the protein level but not at the mRNA level [31]. In this combined study of all time points, the ratios of two proteins involved in this pathway were also higher than the threshold at 12 hpi (GPI and PGD). The Pentose phosphate pathway cooperates with glycolysis for energy production, indicating that the virus requires high amounts of nucleotides for its replication at late time points, and therefore it requires ribose-5-phosphate. Complementary to this pathway, the serine synthesized from the glycolytic intermediate 3-phosphoglycerate can be converted to glycine (an important precursor for purine biosynthesis), via the Serine glycine biosynthesis pathway [63]. Some proteins involved in this pathway (PSAT1, PHGDH, and PSPH) were also more abundant at 24 hpi. Because of its conversion to glycine, serine is also the donor of folate-linked one-carbon units which are required for nucleotide biosynthesis [64]. The Pentose phosphate pathway is tightly connected with the de novo synthesis of nucleotides [65] and we observed that the De novo purine biosynthesis and the De novo pyrimidine ribonucleotides biosynthesis were overrepresented at all time points, but with more proteins and higher ratios at late time points of infection (Table 2). 
Moreover, we observed the overrepresentation of the Glycolysis pathway at all time points, most strikingly at 12 hpi (Panel A in S2 Fig).
Time-independent pathways. Several signalling pathways were also overrepresented in this study. Among them, the FGF signalling pathway and the EGF receptor signalling pathway were found significant at 6, 12 and 24 hpi while the Integrin signalling pathway was significant at all time points. The ratios of most of the proteins involved in the FGF and the EGF receptor signalling pathways were positive, but they decreased with time post infection (Panel B and C in S2 Fig). As for the regulation of cytoskeletal function, the inactivation of these signalling pathways could be explained by the down-regulation of different growth factors at the mRNA level (such as FGF1, FGF2, CTGF or VEGFC) observed at 24 hpi [9]. On the other hand, several proteins involved in the Integrin signalling pathway had negative ratios, such as different integrins (ITGB5, ITGAV, ITGA5, ITGA2, ITGB3, ITGA3), collagens (COL1A1, COL3A1, COL1A2, COL12A1, COL5A2, COL5A1, COL6A2, COL6A1, COL6A3, COL14A1) and RAS family members (RRAS, RRAS2, KRAS). Fig 3C shows that, compared to mock infection, the negative ratios of the proteins in this pathway decreased with time, and most of the genes encoding these proteins were also found down-regulated at 24 and 36 hpi at the mRNA level [31].
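As an illustration of what "overrepresented" means for a pathway, the following minimal sketch computes a one-sided hypergeometric p-value. The choice of test is an assumption made for illustration (the statistic actually used by the pathway tool is not restated in this section), and the function name, parameter names and numbers are hypothetical.

```python
from math import comb


def overrepresentation_p(total, in_pathway, selected, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    total      -- number of quantified proteins (background)
    in_pathway -- background proteins annotated to the pathway
    selected   -- significantly altered proteins
    hits       -- altered proteins that belong to the pathway
    """
    upper = min(in_pathway, selected)
    # Sum the probabilities of observing `hits` or more pathway members
    # among the selected proteins when drawing without replacement.
    tail = sum(
        comb(in_pathway, i) * comb(total - in_pathway, selected - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(total, selected)


# Hypothetical numbers, not taken from the study.
p = overrepresentation_p(total=2000, in_pathway=30, selected=400, hits=15)
print(f"p = {p:.3g}")
```

A small p-value indicates that more pathway members were altered than expected by chance; in practice a multiple-testing correction would be applied across all pathways tested.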

Upstream regulator analysis
The UR analysis is a bioinformatic analysis that identifies upstream transcriptional regulators that are connected to gene or protein datasets through a set of direct or indirect relationships stored in the Ingenuity® Knowledge Base. In the present work, our lists of significantly altered proteins were analysed with this tool and transcriptional regulators considered significant in at least one time point are reported in Table 3.
Table 3 note: In grey, transcription factors activated (z-score > 2.5) or inactivated (z-score < -2.5) with a p-value < 0.05.
This analysis was primarily developed to work with gene datasets, but it has also been useful for protein expression datasets [16,66,67]. Protein synthesis is a biosynthetic process downstream from transcription and its regulation. We have previously observed that the overall correlation between the RNA and the protein levels was low at 24 and 36 hpi (r ≈ 0.3) [31], and even lower correlations have been observed in other studies [30]. However, after the application of a 1.5-fold change cut-off at the protein level, the correlation rose significantly (r ≈ 0.5). This strategy has previously been applied to increase the confidence for the use of differential mRNA for biological discovery [68]. The resulting correlation should reflect differences in mRNA and protein dynamics, as mRNAs are produced at a slower rate than proteins, and the mRNA turnover can be faster while the effect on the protein level may still prevail [34,35]. Thus, the exact time at which transcription factors are activated/deactivated is difficult to determine based on the protein data. To validate our proteomics data used in the UR-prediction, the changes of those proteins controlled by altered transcription factors were correlated (by Pearson's correlation test) with their corresponding RNA changes upon infection [31]. In this case, an acceptable correlation coefficient (r > 0.7) between proteins and RNAs was obtained, and the most relevant results are reported in Panel A in S3 Fig. This exemplifies that a good correlation between altered ratios of protein and RNA can be achieved when zooming in on specific processes, but not when considering all the processes. Similar results were noted for the correlations between mRNA and protein abundances for open reading frames (ORFs) that varied over the course of the cell cycle in yeast [69]. The correlation was higher (r = 0.89) for those ORFs that showed a large degree of variation, while for ORFs with minimal variation, the correlation was poor (r = 0.2). Among all the transcription regulators included in Table 3, MYC was highly activated, and its activation increased with time. It has been shown that during adenovirus infection, the E1A adenovirus protein decreases MYC expression at the mRNA level, but not at the protein level [70]. Moreover, a later study demonstrated that stabilization of MYC requires the p400 protein, and that E1A promotes the association of MYC and p400 at MYC target genes, leading to induction of their transcription [71].
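The protein/RNA comparison described above (Pearson's correlation, optionally after a 1.5-fold change cut-off at the protein level) can be sketched as follows. This is a minimal illustration, assuming the ratios are expressed as log2 values; the function names and the example numbers are hypothetical and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def filter_by_fold_change(pairs, cutoff=1.5):
    """Keep (protein, RNA) log2-ratio pairs whose protein change
    reaches the fold-change cut-off in either direction."""
    threshold = math.log2(cutoff)
    return [(p, r) for p, r in pairs if abs(p) >= threshold]


# Hypothetical (protein log2 ratio, RNA log2 ratio) pairs.
pairs = [(1.0, 0.8), (0.1, -0.3), (-0.9, -0.7), (0.2, 0.4), (0.7, 0.2)]

r_all = pearson([p for p, _ in pairs], [r for _, r in pairs])
kept = filter_by_fold_change(pairs)
r_kept = pearson([p for p, _ in kept], [r for _, r in kept])
print(f"all pairs: r = {r_all:.2f}; after cut-off: r = {r_kept:.2f}")
```

Removing proteins with small changes discards pairs dominated by measurement noise, which is one reason the correlation can rise after filtering.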
Different studies have characterized virus-induced changes in host cell metabolism [72], and it has been suggested that adenovirus E4orf1 binds to MYC and induces the expression of glycolytic target genes and nucleotide biosynthesis from glucose intermediates [73]. Therefore, the activation of MYC by E1A and E4orf1 adenovirus proteins could explain why the Glycolysis and the Pentose phosphate pathways were overrepresented in spite of the decrease in the MYC mRNA level.
Another transcription factor found activated at late time points was E2F1. We observed in our previous transcriptomic studies that several genes involved in cell cycle and DNA replication and controlled by E2F were up-regulated at 12 hpi, caused by the interaction of the E1A protein with the retinoblastoma protein [9,74]. Our proteomic data demonstrate that the ratios of several components of the minichromosome maintenance complex (MCM2, MCM3, MCM4, MCM5, MCM6, MCM7), some replication factors (RFC2, RFC3, RFC4, RFC5), and different cell cycle regulators (CDK2, CDKN2A) were consistent with the ratios for these genes at the mRNA level at 24 hpi, but not at 12 hpi. It is therefore likely that the mRNA transcription starts at 12 hpi, but the effects on the protein levels are delayed and only observed at 24 hpi.
The transcriptomics data also revealed an up-regulation of genes controlled by the ATF/CREB family, predicting an overrepresentation of ATF1-4 and CREB1 at 24 hpi [9]. These transcription factors control the expression of genes involved in DNA and RNA metabolism, but also genes involved in the stress response. The UR analysis confirmed the activation of CREB1 at late time points but the activation of the transcription factor ATF4 could only be observed at early time points (6 and 12 hpi). The UR analysis also showed the activation of the transcription factor XBP1 at early time points. XBP1 is part of the Unfolded Protein Response (UPR) triggered by endoplasmic reticulum (ER) stress. Interestingly, different studies have demonstrated an ER stress response in virus infection [75]. XBP1 mRNA is specifically spliced and activated by the inositol-requiring kinase 1 (IRE1), which increases the transcription of chaperones and other proteins involved in vesicular trafficking and transport between the Golgi complex and the ER, such as the coatomer proteins. Many of these coatomer proteins (COPA, COPB1, COPE1, COPG1 and SEC31A), which had positive ratios mainly at early time points, have been suggested to play important roles in the infection by several viruses [76]. The UPR also generates excess levels of reactive oxygen species (ROS). One of the most important cellular defence mechanisms against ROS excess is controlled by the nuclear factor erythroid 2-related factor 2 (NFE2L2 or NRF2), highly activated at all time points in our study. NRF2 inhibits lipogenesis, activates the oxidation of fatty acids, facilitates the flux through the pentose phosphate pathway and increases NADPH regeneration and purine biosynthesis.
Finally, the UR analysis indicated that the RB1 transcription factor was inactivated after 24 hpi. Our previous transcriptomic study indicated that the RB1 mRNA level was increased at 24 hpi [9], which might be due to the host cell attempting to increase RB1 activity to combat the infection, possibly because of the E1A blockage. Moreover, adenoviruses can block apoptosis via two proteins: E1B-55K and E1B-19K. The E1B-19K protein blocks apoptosis by interacting with and inhibiting the p53-inducible and death-promoting BAX protein [77]. In the present study, the ratio of BAX (and its activator BID) was positive at all time points, which indicates that the host cell tries to stop the anti-apoptotic effect of the E1B-19K protein. In fact, this non-structural adenoviral protein was identified late during Ad2 infection in a recent work using the same model [29].
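The thresholds stated for Table 3 (z-score > 2.5 for activation, z-score < -2.5 for inactivation, p-value < 0.05) translate into a simple classification rule. The sketch below is only an illustration of that rule; the function name, the labels and the example values are hypothetical, and it does not reproduce how the analysis software computes the z-score or p-value.

```python
def classify_regulator(z_score, p_value, z_cut=2.5, p_cut=0.05):
    """Label an upstream regulator using the Table 3 thresholds."""
    if p_value >= p_cut:
        return "not significant"
    if z_score > z_cut:
        return "activated"
    if z_score < -z_cut:
        return "inactivated"
    return "not significant"


# Hypothetical (z-score, p-value) pairs, not values from Table 3.
examples = {
    "regulator_A": (3.1, 0.001),
    "regulator_B": (-2.7, 0.01),
    "regulator_C": (3.1, 0.20),
    "regulator_D": (1.0, 0.001),
}
for name, (z, p) in examples.items():
    print(name, classify_regulator(z, p))
```

Requiring both criteria means that a regulator with a strong z-score but a weak overlap p-value (regulator_C here) is not reported as activated.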

Comparison with previous functional annotation and transcription factor analyses
In our previous RNA data, we observed that most of the up-regulated genes (at 12 and 24 hpi) are involved in DNA replication, nucleic acid biosynthesis, biosynthesis of DNA, RNA and protein, or related to cell cycle progression [9]. Most of the down-regulated genes were involved in signal transduction, vesicle trafficking and cytoskeletal organization [9]. These results agree well with most of the current results, as we observed pathways such as the De novo purine and pyrimidine biosynthesis pathways, which are intimately connected with nucleic acid biosynthesis, as overrepresented. The ratios of the MCM components, DNA replication factors, and some proteins related to the cell cycle (CDK2, CDK5, CDKN2A, CDC5L) were also positive at the protein level. We also observed the inactivation of the Integrin signalling pathway and the negative ratios of several transport proteins (RAB3B, RAB3D), solute carrier family proteins (SLC35F6, SLC35B2, SLC4A7, SLC38A2) and syntaxins (STXBP3, STX12, STX4, STX7). However, some differences were also noted. The ratios of proteins involved in vesicle trafficking, such as SEC24D, SEC16A, SEC31A, SEC24A and SEC23IP, were positive, while they were negative at the mRNA level. Moreover, the Cytoskeletal regulation by Rho GTPase pathway was only scored at the protein level. A previous study using RNA data from mouse embryo fibroblasts infected with Ad5 demonstrated that several pathways related to cytoskeletal regulation (Focal Adhesion, Tight Junction and Actin Cytoskeletal pathways) are up-regulated [78]. In addition, different studies suggest that some of these pathways are interlinked with the Toll-Like Receptor pathway [79,80], a pathway also activated after infection of human A549 cells with Ad5 [78]. This pathway was not found overrepresented in the mRNA data, but it was the most overrepresented in the protein data at 12 hpi.
Among the transcription factors predicted from the RNA data, AP2, E2F1 and different ATF/CREB members were the most overrepresented, while SRF, NFKB and EGR1 were overrepresented in the down-regulated category [9]. The transcription factors E2F1 and ATF4 were also scored as activated using the protein data, and one of the members of the AP2 family (TFAP2A) was found slightly activated at late time points but with a score below the cut-off threshold (S4 Table). The same result was obtained for EGR1, which was found deactivated at 36 hpi but below the cut-off threshold. The activation state of the NFKB transcription factor was not predicted, and for SRF, the prediction was the opposite of that obtained from the RNA data. However, SRF has been observed to be activated in mouse embryo fibroblasts infected by Ad5, as has the NRF2 transcription factor [78]. In agreement with our observations here, that study noted the activation of other transcription factors (IRF7, REL or PPARG).
Differences in changes at the protein and mRNA levels during other viral infections have been described in previous reports. A combined transcriptomic and proteomic study of human hepatocellular carcinoma cells infected by Hepatitis C virus demonstrated that only 15 genes/proteins were common in transcriptomic (RNA-Seq) and proteomic data, and only 6 were common when proteomic data was compared with data from expression microarrays [81]. In addition, only one of the most relevant canonical pathways was found common when RNA and protein data were compared. Moreover, in a recent publication where the expression dynamics of transcripts, proteins and phosphoproteins upon VSV infection of macrophages was studied, surprising differences between the three levels were demonstrated [17].

Phosphopeptide analysis
This study enabled us to identify 305 phosphorylated sites without performing any phosphopeptide enrichment (S2 Table). After the application of the same filtering criteria as for the protein regulation, several phosphorylated sites could be quantified at the different time points (Table 1). Among them, only a few were significantly altered, and their ratios can be found in S5 Table. These low numbers, as well as the lack of values for all time points, indicate that enrichment methodologies are needed in future studies to obtain a better view of the infection progression at the phosphoproteome level. Despite these limitations, some interesting results were found, such as the quantification of the Ser21 residue in triosephosphate isomerase (TPI), a protein involved in the Glycolysis pathway. This phosphorylated site had a positive ratio between 6 and 24 hpi, and it has been suggested as a substrate of the CDK2 kinase protein during etoposide-induced apoptosis in HeLa cells [82]. Another phosphorylated site found at higher abundance between 12 and 36 hpi was Ser2152 of filamin A (FLNA). The phosphorylation of this residue by the PAK1 protein has been pointed out as essential for the PAK1-induced cytoskeletal reorganization [83], which suggests that the activity of PAK1 is induced. Unfortunately, PAK1 was not detected in the present study. Moreover, Ser82 of the HSPB1 protein, one of the three phosphorylation sites needed for the main functions of the protein [84], was more abundant at 6 and 24 hpi. In addition, some proteins were reported with multiple phosphorylation sites, such as Ser6 and Ser37 in the CAV1 protein, or Ser39 and Ser56 in the VIM protein. In the case of CAV1, both phosphorylated sites had positive ratios at the late time points, but in the case of VIM, the ratios of Ser39 and Ser59 were positive and negative, respectively, at 36 hpi.
These results indicate that further in-depth phosphoproteomics studies are necessary to clarify the whole picture of the Ad2 infection.

Conclusions
Virus infections are a large problem in the world and we need mechanistic insights to understand how we can combat them. By using a time-resolved proteomics approach based on SILAC and mass spectrometry, we have successfully studied the early and late phases of the infection of IMR-90 cells by Ad2. It was possible to track the abundances of 2169 proteins between 6 and 36 hpi. These findings point to the De novo purine and pyrimidine biosynthesis, the Glycolysis and the Cytoskeletal regulation by Rho GTPase pathways as pathways activated early during the infection, while the inactivation of the Integrin signalling pathway starts between 6 and 12 hpi. The predicted activation of several transcription factors such as MYC can explain the induction of the expression of glycolytic target genes, as well as the increased nucleotide biosynthesis needed for optimal adenovirus replication. This is also supported by the alteration of some phosphorylated sites related to glycolysis or cytoskeletal reorganization. The reported processes are important for a deeper understanding of the battle between the adenovirus and the host cell, and the results clearly show that proteomic data are essential for revealing the mechanisms and cannot be predicted from RNA data.