Figure 1.
The phylogenetic position of the centipedes (Chilopoda), with respect to other arthropods, according to the currently best-supported phylogeny.
(See text for details.) The four traditionally accepted arthropod classes are marked in bold.
Figure 2.
Plot showing that DNA from a male individual contains a distinct fraction of scaffolds that is underrepresented (black arrow), and presumably derives from heterogametic sex chromosomes.
No such fraction is present in the sequenced DNA of two individual females. The data underlying this plot are presented in File S4.
Figure 3.
Conserved macrosynteny signal between S. maritima and the chordate lancelet B. floridae, clustered into ancestral linkage groups.
Each dot represents a pair of genes, one in B. floridae, one in S. maritima, assigned to the same gene family by our orthology analysis. The ancestral linkage group identifiers refer to groups of scaffolds from the S. maritima (SmALG) or B. floridae (BfALG) assemblies, as detailed in File S2. The identification of ALGs is described in the SI. Note that two S. maritima scaffolds were divided across ALGs, and so appear multiple times in File S2.
Figure 4.
(A) The Hox gene cluster of S. maritima compared to the cluster that can be deduced for the ancestral arthropod. S. maritima provides the first instance of an arthropod Hox cluster with tight linkage of an Even-skipped (Eve) gene (see text). Hox3 is the only gene missing from the S. maritima Hox cluster, but may be present elsewhere in the genome on a separate scaffold (see main text and Text S1 for details). The S. maritima cluster is drawn approximately to scale and spans 457 kb from the start codon of labial (lab) to the start codon of Eve-b. Arrows denote the transcriptional orientation. (B) Remains of clustering and linkage of ANTP class genes in S. maritima. The blue boxes are genes belonging to the ANTP class. The brown box is a gene belonging to the HNF class. The orange box is a gene belonging to the SINE class. The intergenic distances are indicated in kb. (C) Clusters of non-ANTP class homeobox genes in S. maritima. The green boxes are genes belonging to the TALE class. The red boxes are genes belonging to the PRD class. The intergenic distances are indicated in kb, except in the case of Rx-Hbn as these genes are overlapping but with opposite transcriptional orientations. All scaffold numbers are indicated in brackets.
Table 1.
Instances of homeobox gene clustering and linkage.
Figure 5.
Expansion of chemosensory receptor families.
(A) Phylogenetic relationships among S. maritima (Smar), I. scapularis (Isca), D. pulex (Dpul), and a few insect GR genes that encode sugar, fructose, and carbon dioxide receptors (Dmel, D. melanogaster, and Amel, A. mellifera). (B) Phylogenetic relationships among S. maritima, I. scapularis, and a few D. melanogaster IRs and IgluR genes (the suffix at the end of the protein names indicates: i, incomplete and p, pseudogene).
Figure 6.
Ancestral protein kinases are extensively lost during arthropod evolution.
S. maritima is an exception and retains the largest number of ancestral kinases. Numbers of kinase subfamilies in selected species are shown in parentheses after species names. The gains, losses, and inferred content of common ancestors are listed on internal branches. Kinases found in at least two of human, C. elegans, and Nematostella vectensis were used as an outgroup.
Figure 7.
Presence and absence of immunity genes in different arthropods.
Counts of immune genes are shown for S. maritima, D. pulex [131], A. mellifera [86], T. castaneum, Anopheles gambiae, and D. melanogaster [132]. ∼, identity of the gene is uncertain; -, not investigated.
Figure 8.
Dscam diversity caused by gene and/or exon duplication in different Metazoa.
(a) Only canonical Dscam paralogues were considered. (b) In D. melanogaster and D. pulex the paralogue Dscam-L2 has two Ig7 alternative coding exons. (c) Potential number of Dscam isoforms, circulating in one individual, produced by mutually exclusive alternative splicing of duplicated exons.
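The potential isoform count in the last note follows from simple combinatorics: with mutually exclusive splicing, one variant is chosen from each cluster of duplicated exons, so the count is the product of the cluster sizes. The sketch below illustrates this. The function name is hypothetical, and the D. melanogaster cluster sizes (12, 48, 33, 2) are the widely cited values for Dscam1, not numbers taken from this figure.

```python
from math import prod

def potential_isoforms(alternatives_per_cluster):
    """Number of distinct isoforms when exactly one variant is picked
    from each cluster of mutually exclusive alternative exons."""
    return prod(alternatives_per_cluster)

# Assumption: widely cited cluster sizes for D. melanogaster Dscam1
# (exon 4, exon 6, exon 9, exon 17); not taken from the figure itself.
dmel_dscam1 = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

print(potential_isoforms(dmel_dscam1.values()))  # 38016
```

A gene with no duplicated exon clusters yields a product of 1, that is, a single isoform, which is why diversity in such genomes has to come from gene duplication instead.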
Figure 9.
Frequency histogram of CpG(o/e) observed in S. maritima gene bodies.
The y-axis depicts the number of genes with the specific CpG(o/e) values given on the x-axis. The distribution of CpG(o/e) in S. maritima is trimodal, with a low-CpG(o/e) peak consistent with the presence of historical DNA methylation in S. maritima, and a high-CpG(o/e) peak. The data underlying this plot are available in File S4.
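For readers unfamiliar with the statistic, CpG(o/e) is the ratio of the observed number of CpG dinucleotides to the number expected from the C and G content of the sequence. The sketch below uses one common normalisation, n_CpG × L / (n_C × n_G). This is an assumption: the caption does not state which variant of the formula was used, and the function name is hypothetical.

```python
def cpg_oe(seq: str) -> float:
    """Observed/expected CpG ratio for one sequence.

    Assumed normalisation: (n_CpG * L) / (n_C * n_G), where L is the
    sequence length. Returns NaN when C or G is absent, because the
    expected count is then zero.
    """
    s = seq.upper()
    n_c = s.count("C")
    n_g = s.count("G")
    n_cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if n_c == 0 or n_g == 0:
        return float("nan")
    return n_cpg * len(s) / (n_c * n_g)

print(cpg_oe("CCGG"))  # 1.0 (CpG as frequent as expected)
print(cpg_oe("CGCG"))  # 2.0 (CpG enriched)
print(cpg_oe("CCAGG")) # 0.0 (CpG depleted)
```

Methylated cytosines in CpG context tend to mutate to thymine over evolutionary time, so gene bodies with a history of methylation are expected to show values well below 1, which is the reasoning behind reading the low peak as a signature of historical methylation.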
Figure 10.
Arthropod phylogenetic tree (with nematode outgroup) showing selected events of gene loss, gene gain, and gene family expansions.
Main taxa are listed on the tips, with representative species for which there is a fully sequenced genome listed below. Major nodes are also named. Data from the genome of S. maritima allow us to map when in arthropod evolution these events occurred, even when these events did not occur on the centipede lineage. A plausible node for the occurrence of each event is marked and colour-coded, with the possible range marked with a thin line of the same colour. The events, listed from left to right, are: (1) Dscam alternative splicing as a strategy for increasing immune diversity is known from D. melanogaster, as well as the crustacean D. pulex, and thus probably evolved in the lineage leading to Pancrustacea, after the split from centipedes. (2) Several wnt genes have been lost in holometabolous insects, leaving only seven of the 13 ancestral families. This loss occurred gradually over arthropod evolution, but reached its peak at the base of the Holometabola. (3) Selenoproteins are rare in insects. The presence of a large number of selenoproteins in S. maritima as well as in other non-insect arthropods suggests that the loss of many selenoproteins occurred at the base of the Insecta. (4) Expansion of chemosensory gene families occurred independently in different arthropod lineages as they underwent terrestrialisation. The OR family is expanded in insects only. (5) Chemosensory genes of the GR and IR families have undergone a lineage-specific expansion in the genome of S. maritima. As these are probably also linked with terrestrialisation, we suggest that this expansion happened at the base of the Chilopoda, but it could also have occurred later in the lineage leading to S. maritima. (6) Cuticular proteins of the RR-1 family are numerous in the S. maritima genome. They are found in other arthropods, but not in chelicerates or in any non-arthropod species. This suggests that the RR-1 subfamily evolved at the base of the Mandibulata. (7) The genome of S. maritima has a large complement of wnt genes, but is missing wnt8. Since this gene is found in the diplopod G. marginata (a species without a fully sequenced genome), the loss most likely occurred at the base of the Chilopoda. (8) Unlike the situation in D. melanogaster, immune diversity in the S. maritima genome is achieved through multiple copies of the Dscam gene. This expansion of the family could have happened at any time after the split between Myriapoda and Pancrustacea. (9) Both circadian rhythm genes and many light receptors are missing in S. maritima. These losses are most likely due to the subterranean lifestyle of geophilomorph centipedes and are probably specific to this group. However, we cannot rule out the possibility that they were lost somewhere in the lineage leading to myriapods. (10) The existence of JH signalling in S. maritima as well as in all other arthropods studied to date strengthens the idea that this signalling system evolved with the exoskeleton of arthropods, though its origins could be even more ancient and date back to the origin of moulting at the base of the Ecdysozoa.