Asgard archaea do not close the debate about the universal tree of life topology

Comment on: Complex archaea that bridge the gap between prokaryotes and eukaryotes. [Nature. 2015] Lokiarchaea are close relatives of Euryarchaeota, not bridging the gap between prokaryotes and eukaryotes. [PLoS Genet. 2017]


There is evidence of contamination in the lokiarchaeal genomes
On top of the very high heterogeneity detected in Loki's genome (78.21%), our quality analyses with CheckM (Parks, Skennerton, Imelfort, http://ecogenomics.github.io/CheckM/) and Anvi'o (Meren, http://merenlab.org/software/anvio/) also revealed a high contamination index (between 45% and 57%), which is necessarily an underestimate because quality analyses are limited to defined sets of markers [5]. Even if a good part of this contamination index results from the heterogeneity, this still leaves room for actual contamination from other sources. Contamination need not be extensive to bias phylogenetic reconstruction if it is located in strong markers, as we have shown with elongation factor 2 (EF2) [1].
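To make concrete why a marker-based contamination index can only be a lower bound, here is a minimal sketch of how such an index can be derived from single-copy marker counts. It is a simplified illustration, not CheckM's or Anvi'o's actual implementation (CheckM, for instance, works with lineage-specific collocated marker sets); the function name `marker_quality` and the marker identifiers are hypothetical.

```python
from collections import Counter


def marker_quality(marker_hits, expected_markers):
    """Return (completeness %, contamination %) for a genome bin.

    marker_hits: iterable of marker IDs, one entry per hit found in the bin.
    expected_markers: set of single-copy markers expected exactly once.
    Simplified assumption: every expected marker carries equal weight.
    """
    # Hits outside the expected marker set are ignored entirely, which is
    # why foreign sequence lacking any marker goes undetected.
    counts = Counter(h for h in marker_hits if h in expected_markers)
    n = len(expected_markers)
    present = sum(1 for m in expected_markers if counts[m] > 0)
    # Each copy beyond the first is counted as one contaminating copy.
    extra = sum(counts[m] - 1 for m in expected_markers if counts[m] > 1)
    return 100.0 * present / n, 100.0 * extra / n


# Hypothetical bin: "m1" found twice, "m3" three times, "m4" missing,
# and "x" is a gene that is not part of the marker set.
expected = {"m1", "m2", "m3", "m4"}
hits = ["m1", "m1", "m2", "m3", "m3", "m3", "x"]
completeness, contamination = marker_quality(hits, expected)
```

In this toy case the hit `"x"` contributes nothing to either score, which mirrors the point made above: sequence that falls outside the defined marker sets is invisible to the index.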
We could identify such a likely contamination in EF2 by detecting long insertions matching eukaryotic paralogs, essentially in Heimdall LC3 (formerly Loki 3) [1]. No other Asgard sequences [3] contain the LC3-specific insertions, reinforcing the contamination hypothesis. In an EF2 single-protein tree without bacteria to increase the signal, Heimdallarchaeota are not monophyletic, with LC3 still branching with eukaryotes, whereas all the other Asgards are sister groups to Euryarchaeota (Fig 1), suggesting that patches of contaminating sequences indeed remained in Heimdall LC3 after trimming. The complex evolutionary history of this protein could perhaps explain the accumulation of artificial insertions in the LC3 genome. We suspected that additional hidden patches of contamination could similarly be present elsewhere in other Asgard universal proteins, especially in Heimdall LC2 and LC3, which were reconstructed through a Multiple Displacement Amplification (MDA) process. This could explain why Asgards belonging not only to different phyla (Lokiarchaea and Heimdallarchaea [represented by the formerly named Loki2 and Loki3]) but also to the same phylum (Heimdallarchaeota) were not always monophyletic in our single-protein analyses [1].
We suspected that patches of contamination could also be present in Heimdall LC3 RpoA since this subunit is encoded by a single gene in LC3 (as in Thaumarchaeota and the related Bathyarchaea and Aigarchaea), whereas all other Asgards have a dimeric version (like all Crenarchaeota, all Euryarchaeota, and most DPANN). Unlike Spang et al. [4], we never observed best hits to Lokiarchaea or to the two other Heimdallarchaeotes using the complete LC3 sequence (best hits to Bathyarchaea only) or various portions of N- and C-terminal sequences (Loki3/Heimdall LC3 RpoA accession numbers: AKC94880/OLS19521). In contrast, we recovered best hits to other Asgards when using Heimdallarchaeote LC2 or AB_125 sequences. The branching of LC3 at the base of Asgards in the RpoA tree [3] could indicate that LC3 RpoA contains Heimdallarchaeote sequences mixed with sequences from other archaea related to Bathyarchaea.

Competing interests:
The authors have declared that no competing interests exist.

The presence of FES in the dataset favours 2D topologies
Spang et al. [4] argued that their 2D trees are not affected by Long-Branch Attraction (LBA), because LBA would have attracted eukaryotes outside the Archaea. However, the branching of Archaea in the well-known fast-evolving Methanopyrus kandleri in the 2D tree of Spang et al. [2] clearly reveals an LBA artefact. The Euryarchaea could then possibly be attracted outside Archaea because of their evolutionary relationship with DPANN [6][7][8]. This probably explains why removing FES dramatically increased (from 1 to 11) the number of trees recovering the 3D topology [1].

There is no strong LBA affecting the position of Eukarya in the 3D trees
Spang et al. [4] argued that LBAs were affecting the 3D trees we obtained (i.e., with the concatenated 6 AU-relevant Woese and the 11 Woese proteins, and the RNA polymerase [1]), resulting in the misplacement of Eukaryotes between the bacterial outgroup and the Archaea. This seems unlikely for the RNA polymerase, since we have shown that Asgards remained a sister group to Euryarchaeota (Archaea rooted in Thaumarchaea) in the absence of bacteria (Figure S35 in [1]). We have now performed similar analyses with the two concatenated Woese protein datasets after inclusion of all available Asgard sequences and again obtained trees displaying the grouping of Asgards with Euryarchaeota (Fig 2). These results strongly support the absence of strong LBA affecting the position of Eukarya in the 3D trees. If an LBA artefact were indeed misplacing the Eukaryotes, removing the bacteria would have dramatically impacted the trees. Instead, our new analyses corroborate the position of the putative Asgard superphylum as a sister group to Euryarchaeota.
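The notions of monophyly and sister-group relationship used throughout this argument can be stated operationally. The sketch below assumes a rooted tree written as nested tuples with leaf names as strings; the helper names and the toy topology are hypothetical and serve only to show the test. The toy tree is not a result from any of the analyses discussed here.

```python
def leaves(node):
    """Return the set of leaf names below a node (a str leaf or a tuple of children)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def is_monophyletic(tree, group):
    """True if some node of the rooted tree contains exactly the taxa in `group`."""
    group = frozenset(group)

    def visit(node):
        if leaves(node) == group:
            return True
        if isinstance(node, str):
            return False
        return any(visit(child) for child in node)

    return visit(tree)


def are_sisters(tree, group_a, group_b):
    """True if the two groups are the only two children of a common node."""
    a, b = frozenset(group_a), frozenset(group_b)

    def visit(node):
        if isinstance(node, str):
            return False
        child_sets = [leaves(child) for child in node]
        if len(child_sets) == 2 and set(child_sets) == {a, b}:
            return True
        return any(visit(child) for child in node)

    return visit(tree)


# Toy rooted topology with made-up labels (illustration only).
toy_tree = ("thaum_1", (("asgard_1", "asgard_2"), ("eury_1", "eury_2")))
asgards = {"asgard_1", "asgard_2"}
euryarchaea = {"eury_1", "eury_2"}
sister_result = are_sisters(toy_tree, asgards, euryarchaea)
```

Because both tests depend on where the tree is rooted, the same unrooted topology can pass or fail them under different rootings, which is why the rooting used is stated explicitly in the text above.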

Phylogenetic analyses excluding EF2 are compatible with the presence of conflicting signals within the universal markers
We never claimed that the Asgard/Eukarya affiliation could not be obtained without EF2. We ourselves obtained it when we removed EF2 from the concatenation of the eocyte proteins (Figures S26, S27 in [1]). Zaremba-Niedzwiedzka et al. indeed obtained significant support in a maximum-likelihood framework for a 2D universal protein tree based on 48 universal markers without EF2 [3]. However, their species dataset still contained many FES that could introduce a bias favouring 2D trees [1]. Furthermore, they could not obtain either "good" (maxdiff < 0.1) or "acceptable" (maxdiff < 0.3) convergence in a Bayesian framework with the CAT-GTR model (Supplementary Table 4 in [3]; see Phylobayes manual) and hence could not corroborate their results with this approach. Consequently, despite having more Asgard taxa, Zaremba-Niedzwiedzka et al. [3] could not obtain results as congruent and robust as those obtained by Spang et al. [2] once they removed EF2. The lack of convergence in their Bayesian analysis is actually compatible with the presence of conflicting signals within the universal proteins, previously hidden by the presence of EF2 [1].
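For readers unfamiliar with the maxdiff criterion, the following sketch shows how it can be computed: it is the largest discrepancy in posterior bipartition frequency between independent chains. The thresholds are the ones quoted above from the Phylobayes manual; the function names, the label used above 0.3, and the frequencies are hypothetical and do not reproduce any value from [3].

```python
def maxdiff(freqs_a, freqs_b):
    """Largest absolute difference in bipartition frequency between two chains.

    freqs_a, freqs_b: dicts mapping a bipartition identifier to its posterior
    frequency in one chain. A bipartition absent from a chain has frequency 0.
    """
    splits = set(freqs_a) | set(freqs_b)
    return max(
        (abs(freqs_a.get(s, 0.0) - freqs_b.get(s, 0.0)) for s in splits),
        default=0.0,
    )


def convergence_label(value):
    """Classify a maxdiff value with the thresholds quoted in the text."""
    if value < 0.1:
        return "good"
    if value < 0.3:
        return "acceptable"
    return "not converged"  # label chosen for this sketch


# Made-up bipartition frequencies for two chains (illustration only).
chain_a = {"AB|CD": 0.90, "AC|BD": 0.10}
chain_b = {"AB|CD": 0.65, "AC|BD": 0.30, "AD|BC": 0.05}
value = maxdiff(chain_a, chain_b)
label = convergence_label(value)
```

A single poorly resolved bipartition is enough to push maxdiff above a threshold, so a lack of convergence is what one would expect if a subset of markers supports a conflicting topology.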
Finally, the confirmed presence of many Eukaryotic Signature Proteins (ESPs) in the genomes of the additional Asgards [3] cannot be seen as a confirmation of their grouping with Eukarya, since the presence of ESPs could also be explained by losses in other archaeal lineages, as suggested for ESPs in Thaumarchaeota [9][10][11], or by ancient gene transfers with protoeukaryotes.

Conclusions
Spang et al. [4] stated that we 1) used "inadequate methodology", without explaining why, 2) "misinterpret[ed] data", while presenting some of our findings out of context, and 3) "ignore[d] previous work", on which our publication is actually based. Results, criticism, and debates should be welcome in science and should not trigger hostility. Although our analyses presently favour the 3D topology, we consider that the relationship between Archaea and Eukarya is still an open question requiring more studies.