Functional Characterization of Transcription Factor Motifs Using Cross-species Comparison across Large Evolutionary Distances

Extended computational pipeline to use information from other species.

Motifs and GO annotations were collected from Drosophila (“D.mel. Motifs” and “D.mel. Gene Sets”), and the best motif scanning score for each motif was obtained as described in Figure 1(C). GO annotations in Nasonia (“N.vit. Gene Sets”) were derived from the Drosophila gene sets using a “homology map” between the two genomes. Motif scanning was performed using the selected scores, and a motif function map was then constructed in each genome separately. Motif-GO associations that were statistically significant in both species were reported, along with information on the evolutionary conservation of the motifs. An example of how motif conservation was investigated is shown in the bottom left panel. The homeobox domain of the transcription factor ABD-A was identified in Drosophila and Nasonia using HMMER (row 1), the orthologous domains from the two species were aligned (rows 2 and 3), and a similar domain from the PDB was added to the alignment (row 4). Positions marked in yellow are those where amino acid substitutions were observed. None of these coincides with a DNA-contact position (rows 5 and 6) as revealed by the structural template, suggesting that the DNA-binding specificity of ABD-A is conserved between the two species (“four stars”, for MCS = 4).
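The three set-level steps in this caption (transferring gene sets through the homology map, keeping associations significant in both species, and checking whether substitutions fall on DNA-contact positions) can be illustrated with a minimal Python sketch. This is not the authors' implementation. All function names, the data layout (dictionaries of sets, aligned sequences as equal-length strings), and the toy identifiers are assumptions made for illustration, and the significance testing and MCS scoring themselves are not reproduced here.

```python
from typing import Dict, Iterable, List, Set, Tuple


def transfer_gene_sets(
    dmel_sets: Dict[str, Set[str]],
    homology_map: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Map each Drosophila gene set onto Nasonia genes via a homology map.

    Assumed layout: homology_map[dmel_gene] is the set of homologous
    Nasonia genes. Gene sets with no mapped gene are dropped.
    """
    nvit_sets: Dict[str, Set[str]] = {}
    for term, genes in dmel_sets.items():
        mapped: Set[str] = set()
        for gene in genes:
            mapped.update(homology_map.get(gene, set()))
        if mapped:
            nvit_sets[term] = mapped
    return nvit_sets


def shared_associations(
    sig_dmel: Set[Tuple[str, str]],
    sig_nvit: Set[Tuple[str, str]],
) -> Set[Tuple[str, str]]:
    """Keep (motif, GO term) pairs called significant in both species."""
    return sig_dmel & sig_nvit


def substitutions_at_contacts(
    seq_a: str,
    seq_b: str,
    contact_columns: Iterable[int],
) -> List[int]:
    """Return alignment columns (0-based) that are both substituted and
    DNA-contacting. Gap columns ('-') are not counted as substitutions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    substituted = {
        i
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b and a != "-" and b != "-"
    }
    return sorted(substituted & set(contact_columns))


# Toy inputs (hypothetical identifiers, not real annotations).
toy_sets = {"GO:A": {"d1", "d2"}, "GO:B": {"d3"}}
toy_homology = {"d1": {"n1"}, "d2": {"n2", "n3"}}
nvit_sets = transfer_gene_sets(toy_sets, toy_homology)

both = shared_associations(
    {("motif1", "GO:A"), ("motif2", "GO:B")},
    {("motif1", "GO:A")},
)

# Substitutions at columns 1 and 5; assumed contact columns 2 and 4.
overlap = substitutions_at_contacts("RRQTYTR", "RKQTYSR", [2, 4])
```

In this sketch an empty `overlap` corresponds to the situation described for ABD-A, where no substituted position coincides with a DNA-contact position. How such a result translates into an MCS value is defined by the study, not by this code.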

