Bioinformatic analysis of a novel Echinococcus granulosus nuclear receptor with two DNA binding domains

Nuclear receptors are ligand-activated transcription factors capable of regulating the expression of complex gene networks. The family includes seven subfamilies of proteins with a wide phylogenetic distribution. A novel subfamily with two DNA-binding domains (2DBDs) has been reported in Schistosoma mansoni (Platyhelminth, Trematoda). This work describes the cDNA cloning and bioinformatics analysis of Eg2DBDα, a 2DBD nuclear receptor isoform from the parasite Echinococcus granulosus (Platyhelminth, Cestoda). The structure of the Eg2DBDα gene coding domain was analysed. Although two additional 2DBD nuclear receptors are reported in the parasite database GeneDB, they are unlikely to be expressed in the larval stage. Phylogenetic relationships among these atypical proteins from different cestodes, together with the S. mansoni 2DBD nuclear receptors, are also analysed. The presence of two DNA-binding domains makes these nuclear receptors of particular interest, not only with regard to their function but also for the development of new anthelmintic drugs.


Introduction
The field of nuclear receptors (NRs) has undergone an astounding evolution since the biochemical identification of the first nuclear receptor in the early 1960s [1]. Nevertheless, important gaps in our knowledge remain to be addressed concerning NR action, ligands, co-activator and co-repressor proteins and other components of the transcriptional machinery.
NRs belong to a large protein superfamily of ligand-activated transcription factors that bind specifically to DNA sequences. This family includes receptors for steroid hormones, thyroid hormone, vitamin D, retinoids, fatty acids and prostaglandins, which bind hydrophobic ligands and regulate a variety of mammalian genes [2][3]. A large number of them have no defined ligand and are hence described as orphan receptors [2,4,5]. Receptors of this family have been reported in insects, worms and amphibians, in addition to mammalian species [6]. According to phylogenetic studies, NRs emerged long before the divergence of vertebrates and invertebrates, during the earliest metazoan evolution [7].
Hydatid cyst fluid with protoscoleces was aseptically aspirated according to Esteves and collaborators [23]. Once settled by gravity, the protoscoleces were extensively washed in phosphate-buffered saline (PBS) to remove debris from dead protoscoleces and were examined under a light microscope for viability (flame cell activity and vital dye exclusion). Finally, the material was kept frozen at -80˚C in TRIzol™ Reagent (ThermoFisher Scientific) until use.

Identification of putative nuclear receptors of E. granulosus
A search for nuclear receptor sequences was performed in the GeneDB database (www.genedb.org) using the Mus musculus peroxisome proliferator-activated receptor α (PPARα) amino acid sequence as the query. Three high-scoring sequences were identified. The following primers were designed to amplify each coding region from the conserved DBD to the stop codon:

RNA isolation and cDNA synthesis
Total RNA was isolated from protoscoleces using TRIzol™ Reagent according to the manufacturer's instructions. Reverse transcription was performed using SuperScript III reverse transcriptase (Invitrogen), with 200 ng of random primers and 4.6 μg of total RNA. The total RNA was treated with RQ1 RNase-Free DNase (Promega) to digest contaminating DNA, following the manufacturer's instructions.

Polymerase chain reaction
Polymerase chain reaction (PCR) was performed using a KAPA HiFi HotStart kit (KAPA Biosystems) in a final volume of 25 μl containing 10 pmol each of forward and reverse primers (Fw37-Rev37, Fw24-Rev24, Fw45-Rev45), 0.8 mM dNTPs, 1x KAPA buffer, 1.5 mM MgCl2, 1 μg of E. granulosus protoscoleces cDNA and 0.5 U of the enzyme. Reaction conditions were 3 min at 95˚C followed by 35 cycles of 20 sec at 98˚C, 30 sec at 66˚C and 120 sec at 72˚C, with a final extension of 4 min at 72˚C. Negative controls were included. The absence of contaminating genomic DNA was checked by amplifying a known intron-containing gene (egfabp1) [23]. The single PCR product obtained was fractionated by 1% agarose gel electrophoresis, and the band was excised from the gel and purified with a Universal DNA purification kit (TianGen). This product was subjected to a second amplification using RANGER DNA polymerase (RANGER Mix, BIOLINE) and the same set of primers, following the manufacturer's instructions. The same cycling conditions were employed, except for the primer annealing temperature (63˚C), the final extension (5 min at 72˚C) and the number of cycles (30). After the re-amplification step, the DNA band was purified from the agarose gel as described above and sequenced at Macrogen Service (Korea). This PCR product matched the EgrG_000379600.1 sequence. In addition, the following primers were designed to amplify the missing 5´ coding region of EgrG_000379600.1: 5´-ATGGCACAAACACCTGCAGCCACAG-3´ (Eg2DBD-1F) and 5´-GATGAGTTTGTCCGCCAGAG-3´ (37_MSLrev).
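As a quick sanity check on the two primers listed above, their GC content and a rough melting temperature can be computed. The following is a minimal sketch, assuming the Wallace rule (Tm = 2(A+T) + 4(G+C)), which is only a coarse estimate and is intended for short oligonucleotides; the function names are illustrative, and this is not the procedure the authors used to choose the annealing temperatures.

```python
def clean(primer: str) -> str:
    """Upper-case a primer and drop any whitespace left by line breaks."""
    return "".join(primer.split()).upper()


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (degrees C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = clean(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as given in the text.
primers = {
    "Eg2DBD-1F": "ATGGCACAAACACCTGCAGCCACAG",
    "37_MSLrev": "GATGAGTTTGTCCGCCAGAG",
}

for name, seq in primers.items():
    print(name, len(clean(seq)), round(gc_fraction(seq), 2), wallace_tm(seq))
```

Nearest-neighbour thermodynamic models generally give more realistic values than this rule, especially for the 25-mer.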

Cloning
Both PCR products from EgrG_000379600.1 were cloned into the pGM-T vector (pGM-T ligation Kit, TianGen) according to the manufacturer's instructions. Three clones from each cloning reaction were then verified by colony PCR, and the purified plasmids were sequenced at Macrogen Service (Korea) using universal primers (T7 and SP6 promoter), the specific primers HKWRev (5'-TTCTAGACGGCTCCACTTATG-3') and VPQRFw (5'-GTACCACAGATGCCTATCAC-3'), and the primers described above for amplifying the 5´ region.

Bioinformatics studies
Nucleotide sequence was analysed using Chromas v.

DNA binding domains structure
The amino acid sequence of Eg2DBDα was employed to search structurally homologous sequences in the protein data bank (PDB) using the sequences annotated by structure (SAS) server available at EBI [25]. The amino acid similarity between the target (query) and the best templates was observed to be less than 40%.
Two templates were selected to model the DBDs: 2FF0 for DBD I and 2NLL for DBD II. Ab initio models were obtained with the Iterative Threading ASSEmbly Refinement (I-TASSER) server [26], with the following scores (DBD I: TM-score = 0.78, C-score = 0.51, RMSD = 2.8 Å; DBD II: TM-score = 0.77, C-score = 0.44, RMSD = 2.9 Å), and refined by energy minimisation in the MOE ChemComp suite package (MOE-Molecular Operating Environment, Chemical Computing Group, Montreal, 2014) using the ETH-Amber12 force field and the conjugate gradient algorithm with a convergence gradient of 0.01 kcal/mol. Both structures were superposed on their PDB template hits, 2FF0 and 2NLL respectively. The new coordinates of the DBDs were saved together with the double-stranded DNA structure from 2NLL in order to show a possible binding mechanism between the two macromolecules. PROCHECK was then used to analyse the structural and stereochemical properties of the domains by examining overall and residue-by-residue geometry. A Psi/Phi Ramachandran plot was used to assess the quality of the model. The reliability of the model was assessed with ERRAT, which examines the statistics of non-bonded interactions between different atom types.
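The agreement between a model and its superposed template is commonly summarised as a root-mean-square deviation (RMSD). The following is a minimal sketch, assuming the two structures are already superposed and are given as equal-length lists of paired atom coordinates; the function name `rmsd` and the coordinates are hypothetical and are not taken from the I-TASSER or MOE outputs.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between paired (x, y, z) coordinates of two superposed structures."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical C-alpha coordinates (in angstroms), for illustration only.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(round(rmsd(model, template), 3))
```

This sketch does not perform the superposition itself; an optimal rigid-body fit would have to be computed first.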

Phylogenetic studies
A rooted phylogenetic tree was constructed with programs from the MEGA X package, using amino acid data sets for both DBDs of the cestode sequences and of S. mansoni 2DBDα, 2DBDβ and 2DBDγ [27]. Sequence alignment was performed using the CLUSTAL algorithm under default conditions. The topology and the branch lengths of the phylogenetic tree were estimated using the Maximum Likelihood method under the Jones-Taylor-Thornton (JTT) substitution model, with a discrete Gamma distribution to model evolutionary rate differences among sites (eight categories; +G, parameter = 1.4068) [28]. The significance of branching points was assessed by bootstrapping with 100 replicates. A sponge NR sequence was included as the outgroup.
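The bootstrap step rests on resampling alignment columns with replacement, after which a tree is rebuilt from each pseudo-alignment. The following is a minimal sketch of the resampling part only, assuming an alignment stored as equal-length strings; the function names and the toy alignment are hypothetical, and this does not reproduce the MEGA X implementation.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping rows in register."""
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    columns = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(row[c] for c in columns) for row in alignment]


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Return n_replicates pseudo-alignments drawn with a seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Hypothetical aligned fragments, for illustration only.
toy_alignment = ["CEACKK", "CEGCKG", "CEACKR"]
replicates = bootstrap_replicates(toy_alignment, n_replicates=100, seed=1)
print(len(replicates), replicates[0])
```

The bootstrap support of a branch is then the percentage of replicate trees in which that branch appears.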

Primary structure analysis
Three E. granulosus 2DBD nuclear receptor sequences were extracted from the GeneDB database using Mus musculus PPARα as the query, with the following accession numbers: EgrG_000379600.1 (Eg37), EgrG_00240200 (Eg24) and EgrG_000458200 (Eg45). Although three specific primer sets were assayed, only one positive PCR amplification product was obtained from protoscoleces cDNA. The cloned 5´ coding region overlaps the previously mentioned fragment by 843 bp. The identity of both fragments with the EgrG_000379600.1 coding sequence was verified by alignment, which showed some differences that are explained below. The cDNA sequence reported here was deposited in GenBank under the accession number MH092994.2 and the encoded protein as AZM65758.2 (Eg2DBDα). The full-length sequence of 2493 bp encodes 830 residues. Comparison with the database sequence (Eg37) revealed some mismatches that lead to amino acid substitutions: A313 to T, T649 to M, S775 to P and S786 to P. A deletion of 22 residues was also observed. The translated sequence of the cloned nuclear receptor is depicted in Fig 1A. The annotated genomic sequence was compared with the translated sequence in order to identify splicing sites (Fig 1B). The gene contains ten exons and nine introns spanning 10868 bp and encodes an 830 amino acid protein. Several approaches were used to define the topological organisation of the identified 2DBD nuclear receptors: SMART, PROSITE and alignment with known structures. All of them agreed on the location of the DBDs but differed in the identification of the LBD, probably because it is a less conserved region and each tool employs different signatures to identify this domain. Fig 2 shows the domain organisation of the three E. granulosus 2DBD nuclear receptors according to the PROSITE tool.
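The relation between the 2493 bp coding sequence and the 830 residue protein, and the way substitutions and deletions are read off a pairwise alignment, can be sketched as follows. This is a minimal illustration, assuming two pre-aligned protein strings in which "-" marks a gap; the function names and the toy sequences are hypothetical and do not correspond to the actual Eg2DBDα alignment.

```python
def residues_from_cds(length_bp: int) -> int:
    """Number of encoded residues for a coding sequence that includes the stop codon."""
    if length_bp % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return length_bp // 3 - 1


def describe_differences(reference: str, observed: str) -> list:
    """List substitutions (e.g. A2T) and deletions (e.g. del5-7) in reference numbering."""
    if len(reference) != len(observed):
        raise ValueError("aligned sequences must have the same length")
    changes = []
    ref_pos = 0
    gap_start = None
    for ref_aa, obs_aa in zip(reference, observed):
        if ref_aa == "-":
            continue  # insertions relative to the reference are ignored in this sketch
        ref_pos += 1
        if obs_aa == "-":
            if gap_start is None:
                gap_start = ref_pos
            continue
        if gap_start is not None:
            changes.append(f"del{gap_start}-{ref_pos - 1}")
            gap_start = None
        if ref_aa != obs_aa:
            changes.append(f"{ref_aa}{ref_pos}{obs_aa}")
    if gap_start is not None:
        changes.append(f"del{gap_start}-{ref_pos}")
    return changes


print(residues_from_cds(2493))
print(describe_differences("MAKTSLLPQ", "MTKT---PQ"))
```

With real data the aligned strings would come from the alignment of the cloned sequence against the annotated Eg37 sequence.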
UniProt analysis showed that the protein Eg2DBDα has two DBDs. The first DBD is predicted to be composed of two zinc fingers with the conserved sequences Cys-X2-Cys-X13-Cys-X2-Cys and Cys-X5-Cys-X9-Cys-X2-Cys respectively, and the consensus P-box "EACKK" identified in S. mansoni [29]. The second domain also contains two zinc fingers with almost the same pattern as the first DBD (Cys-X2-Cys-X13-Cys-X2-Cys and Cys-X6-Cys-X9-Cys-X2-Cys). Consensus P-box and D-box motifs were also identified (Fig 1). The first core DBD of 68 residues has 73% similarity and 41% identity with the second domain of 67 residues (typical core NR-DBDs are 66-70 residues). C-terminal extension (CTE region) sequences are not conserved among nuclear receptors, and the lack of reported consensus sequences makes their precise identification difficult. The length of the hinge region is also hard to determine since no consensus regions are defined. Hinges are usually short regions (approximately 40 amino acids in most NRs) that allow flexibility between the two domains. However, the length of the hinge in Eg2DBDα can be estimated at 163 amino acids, and a similar estimate (156 residues) was obtained through the PROSITE database.
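The zinc finger spacing patterns described above translate directly into regular expressions. The following is a minimal sketch, assuming a protein sequence held as a plain string; the pattern names, the `find_motif` helper and the toy sequence are hypothetical and were built only to illustrate the spacing rules, not taken from Eg2DBDα.

```python
import re

# Cys-X2-Cys-X13-Cys-X2-Cys (first finger of both DBDs)
ZF1 = re.compile(r"C.{2}C.{13}C.{2}C")
# Cys-X5-Cys-X9-Cys-X2-Cys (second finger of DBD I)
ZF2_DBD1 = re.compile(r"C.{5}C.{9}C.{2}C")
# Cys-X6-Cys-X9-Cys-X2-Cys (second finger of DBD II)
ZF2_DBD2 = re.compile(r"C.{6}C.{9}C.{2}C")


def find_motif(sequence: str, pattern) -> list:
    """Return (start, end, text) for each non-overlapping match, 1-based inclusive."""
    return [(m.start() + 1, m.end(), m.group()) for m in pattern.finditer(sequence)]


# Toy sequence with artificial spacers, for illustration only.
finger1 = "C" + "AA" + "C" + "A" * 13 + "C" + "AA" + "C"
finger2 = "C" + "A" * 5 + "C" + "A" * 9 + "C" + "AA" + "C"
toy_sequence = "MK" + finger1 + "GGGGG" + finger2

print(find_motif(toy_sequence, ZF1))
print(find_motif(toy_sequence, ZF2_DBD1))
print(find_motif(toy_sequence, ZF2_DBD2))
```

Because "." also matches cysteine, such patterns can over-report in cysteine-rich regions, so hits are best confirmed against a domain database.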
Several papers have emphasised that, in addition to the C domain, two regions of the LBD of vertebrate nuclear receptors are conserved: the Tτ signature and AF-2. The consensus sequence of the Tτ signature is φAKXhPXFXXLXXXDQXXLL, where φ is an aromatic residue and h a hydrophobic one [16]. The Tτ region of the cloned E. granulosus 2DBD receptor does not fit this consensus completely but lies within the 20%-45% identity range reported for vertebrate NR sequences [30] (Fig 1A).
Sequence alignment with solved LBD structures is particularly difficult since this region has low conservation, particularly between cestodes and vertebrates. However, careful visual inspection allowed us to detect a residue specific to class II NRs, and the absence of the residue specific to class I, as reported by Brelivet et al. in 2004 [31]. The residue in question is a glutamic acid located at position 668 of Eg2DBDα. Nevertheless, until the 3D structure of the molecule is solved, we cannot state whether it belongs to this class of receptors. A typical AF-2 sequence was identified at residues 716-LYVEMY-721. The AF-2 consensus sequence is ZZXEZZ, where Z is any hydrophobic amino acid [32]. This motif is found in the S. mansoni Sm2DBDα and Sm2DBDβ sequences [29].
The prediction algorithm from PredictProtein localises the protein inside the nucleus. Considering this prediction, and knowing that nuclear receptors must be shuttled from the cytoplasm to the nucleus, we searched for a putative nuclear localisation signal (NLS). The Eg2DBD under study does not contain the "classical" NLS from the SV40 large T antigen [33][34]. Since NLS sequences more complex than the "classical" NLS have been reported, we propose two putative NLSs composed of residues 325-KRX2R-329 and 419-RRK-421, located in the putative hinge region [35][36][37].
The N-terminal (Nt) region and the DBDs are also targets of post-translational modifications, particularly phosphorylation and SUMOylation. The search for predicted sites for these modifications returned so many putative phosphorylation sites that it is not possible to single out any of them. Consensus SUMOylation motifs could not be found in these regions. An RCC1 consensus motif (LAAGfHHGAYV) was identified in the first zinc finger of DBD I (residues 179-189).
In the hinge region, it is interesting to note the presence of a long Ser-Ala tract. A run of alanines could confer considerable flexibility on the region. Our analysis also allowed us to identify a motif similar to the GRIP-box of orphan receptors. The consensus sequence is RXGRZP, where Z represents a hydrophobic amino acid, while the Eg2DBDα motif is 330-RIGRQP-335, in which the hydrophobic consensus amino acid is replaced by the polar residue Q [15] (Fig 1A). As previously mentioned, this motif could be considered the T-box of the CTE region.
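The comparison of a candidate motif with a consensus such as ZZXEZZ (AF-2) or RXGRZP (GRIP-box) can be made explicit position by position. The following is a minimal sketch, assuming that X stands for any residue and Z for a hydrophobic one; the hydrophobic set used here (A, V, I, L, M, F, W, Y) is an assumption of this example, as are the function names.

```python
# Assumed hydrophobic residue set; other classifications exist.
HYDROPHOBIC = set("AVILMFWY")


def residue_fits(residue: str, symbol: str) -> bool:
    """Check one residue against one consensus symbol (X: any, Z: hydrophobic)."""
    if symbol == "X":
        return True
    if symbol == "Z":
        return residue in HYDROPHOBIC
    return residue == symbol


def consensus_mismatches(motif: str, consensus: str) -> list:
    """Return (position, residue, expected symbol) for each mismatch, 1-based."""
    if len(motif) != len(consensus):
        raise ValueError("motif and consensus must have the same length")
    return [
        (i + 1, residue, symbol)
        for i, (residue, symbol) in enumerate(zip(motif, consensus))
        if not residue_fits(residue, symbol)
    ]


print(consensus_mismatches("LYVEMY", "ZZXEZZ"))  # AF-2 motif from the text
print(consensus_mismatches("RIGRQP", "RXGRZP"))  # GRIP-box-like motif from the text
```

Under this assumed residue set, the AF-2 motif fits its consensus at every position, whereas the GRIP-box-like motif departs from it only at the fifth position (Q in place of a hydrophobic residue).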

DNA binding domains structure
The structure of the Eg2DBDα DNA binding domains bound to DNA is shown in Fig 3. Quality assessment with ERRAT and PROVE indicated that the model statistics were appropriate, which supports the validity of the built model and suggests that it adequately represents the native protein. Given that there is no biological evidence of how this new class of nuclear receptors interacts with DNA in the cell and that no crystallographic structure has been reported, we present here a reliable model predicting one possible mode of interaction between the Eg2DBDα DNA binding domains and the response element from the crystal structure with PDB ID 2NLL (Fig 3).
The face of the DBDs that interacts with DNA carries a positive electrostatic potential that complements the negative electrostatic potential of the DNA. The alpha-helix Ha of DBD I engages the major groove of the DNA, making base-specific interactions with the response element through the P-box motif. In particular, Lys194 and Lys195 act as hydrogen-bond donors to the DNA bases at positions G504 and G505. In addition, DBD II interacts with the DNA, although not through its P-box, and thereby contributes to the stabilisation of the complex.

Identification of 2DBD-NRs in other cestodes
A BLASTP search with Eg37 as the query against the GeneDB and WormBase Parasite databases led us to identify 2DBD nuclear receptors from other cestodes: Mesocestoides corti (MCOS_0000009801, MCOS_0000321501, MCOS_0000027401), Hymenolepis microstoma (HmN_002208800.1, HmN_000395600.1, HmN_000166400.1), Taenia saginata (TSA 0003204923; TSAS00045G06070), and Echinococcus multilocularis (EmuJ_000379600.1, EmuJ_000240200.1 and EmuJ_000458200.1). Alignment of the DBD and LBD regions of these protein sequences showed specific residues that suggest that these receptors can be clustered in three groups (Fig 4). M. corti sequences were not included in the LBD alignment because long tracts of mismatches suggest that the reference database has not been curated. P-boxes and finger patterns (C-X2-C-X13-C-X2-C and C-X5-C-X9-C-X2-C; C-X2-C-X13-C-X2-C and C-X6-C-X9-C-X2-C) were conserved in all cestode receptor sequences analysed. However, the first DBD has lower sequence similarity (22%) than the second one (54%) (Fig 4). The high conservation of the first zinc finger sequences of DBD II should be noted (Fig 4B). The distance between the first and second zinc fingers of DBD I is greater in group β NRs. The three residues of the P-box reported to be responsible for specific contacts with the DNA response elements are identical in the analysed cestodes but differ between DBD I (EAK) and DBD II (EGG). The D-box of both domains seems to be specific to each group, denoting different dimerisation preferences. The LBD regions of the analysed cestode NRs have little sequence conservation. The lengths of the putative hinge and Ct regions are specific to each organism, with variable sequence conservation. The LBD signature (Tτ) shows few amino acids with 100% conservation, giving a consensus sequence that departs from the consensus sequences reported. AF-2 regions are conserved in all sequences with a consensus of L (YF) (VSTL) (EQ) M, with a bias characteristic of each subfamily (Fig 4).
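A consensus written as L (YF) (VSTL) (EQ) M lists, for each aligned position, the residues observed across the sequences. The following is a minimal sketch of how such a string can be derived from equal-length aligned motifs; the function name is illustrative and, apart from LYVEM (the start of the Eg2DBDα AF-2 motif), the input motifs are hypothetical.

```python
def column_consensus(motifs) -> str:
    """Summarise aligned motifs column by column, grouping variable positions."""
    if len({len(m) for m in motifs}) != 1:
        raise ValueError("all motifs must have the same length")
    parts = []
    for column in zip(*motifs):
        residues = sorted(set(column))
        if len(residues) == 1:
            parts.append(residues[0])
        else:
            parts.append("(" + "".join(residues) + ")")
    return " ".join(parts)


# LYVEM is taken from the text; the other motifs are made up for illustration.
aligned_af2 = ["LYVEM", "LFSQM", "LYTEM", "LFLEM"]
print(column_consensus(aligned_af2))
```

Residues within a group are sorted alphabetically here, so the grouping is the same as in the text but the order inside each parenthesis may differ.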
We also compared the Eg2DBD sequences with the EgHR3-like protein, the first E. granulosus nuclear receptor characterised [32]. The DBD of the EgHR3-like protein shares high identity with the second DBD of the Eg2DBDs, differing in the number of residues of the second finger. The remaining sequence has low similarity.

Phylogenetic studies
The analysis performed by Wu and collaborators [25] indicated that 2DBD-NRs underwent two rounds of duplication in a common ancestor of the Platyhelminths, giving rise to three genes. To support this conclusion, we performed a phylogenetic reconstruction employing representative cestode sequences and the S. mansoni 2DBD nuclear receptors Sm2DBDα, Sm2DBDβ and Sm2DBDγ (Fig 5). DBD I and DBD II were employed for this purpose. The cestode species analysed express a set of 2DBD-NR proteins that broadly overlaps with that of the related platyhelminth S. mansoni (Sm2DBDα, Sm2DBDβ and Sm2DBDγ) [9,29].

Discussion
Nuclear receptors belong to the transcription factor family that regulates development, homeostasis, differentiation, and reproduction in metazoans via control of gene expression. NRs in parasitic platyhelminths were first identified in S. mansoni [18]. Later, nuclear receptors were reported in Taenia crassiceps, Opisthorchis felineus and mainly in S. mansoni [9,38,39,40]. Three nuclear receptors have been characterised in Echinococcus species: DAF-12-like, HR3-like, and estrogen-like receptors [32,41,42]. The availability of E. granulosus sequences in the GeneDB database allowed us to search for PPARα-like nuclear receptors, which led to the members of the new 2DBD subfamily. The phylogenetic studies we carried out led us to name the annotated sequences Eg2DBDα (EgrG_000379600.1), Eg2DBDβ (EgrG_000458200) and Eg2DBDγ (EgrG_00240200). The GeneDB database refers to the product of all these sequences as 2DBDγ nuclear receptor. We therefore suggest revising the current nomenclature of the other cestode 2DBD sequences analysed here.
In order to characterise these proteins, we used the sequence data for cDNA amplification. Only one transcript could be amplified from the larval stage using specific primers. Although it is possible that errors in the annotated sequences of Eg2DBDβ and Eg2DBDγ hindered the hybridisation of the designed primers, other scenarios are also reasonable. On the one hand, considering that the synthesised cDNA represents the set of transcribed genes, it is possible that these receptors are not transcribed in the protoscolex larval stage. On the other hand, it has been reported that 33% of the genes are subject to alternative splicing in this larval stage, so it is also feasible that the genes corresponding to Eg2DBDβ and Eg2DBDγ undergo such a process, in which case the primer hybridisation regions could be absent [43]. In either case, only the Eg2DBDα isoform would be expressed in protoscoleces. Finally, we cannot rule out that low-level expression of these two receptors prevented their amplification.
Likewise, the cloned fragment showed some differences with respect to the sequence annotated in the database. Since the DNA polymerase used at this stage is a high-fidelity enzyme with proof-reading activity, we attribute these changes not to an enzyme error but to the presence of an isoform of Eg2DBDα. Four of these differences result in an amino acid change, while at position 502 of the protein sequence the absence of 22 amino acids was observed. Analysis of the gene structure of the annotated sequence showed that the deleted region is encoded by the 3' end of the fifth exon, which indicates a splicing event that uses an alternative 5' site. Alternative splicing events are commonly classified into seven types: intron retention (IR), exon skipping (ES), alternative 3'-site (A3'), alternative 5'-site (A5'), first alternative exon (FAE), last alternative exon (LAE) and mutually exclusive exon (MEE) [44]. In turn, it has been observed that the general patterns of alternative splicing vary between species, tissues and stages of development [45][46]. For the E. granulosus protoscolex larval stage, the proportions of the seven alternative splicing events are as follows: IR 39%, A5' 21%, A3' 17%, ES 16%, FAE 4%, LAE 2% and MEE 1% [43]. Accordingly, the type of alternative splicing event (A5') evidenced at position 502 of the protein sequence is the second most frequent at this stage of the development of the parasite under study. Interestingly, the NCBI genome project reports an E. granulosus protein, embryonic gonad (XP_024349575.1), with 70% similarity to the cloned Eg2DBDα. This protein seems to be a second isoform expressed in the adult stage of the parasite.
Figure legend (fragment): [...] and specific of each cluster. Clustal O identity and similarity nomenclature is shown. For space reasons we do not include the complete sequence of the LBDs. Groups are indicated as α, β, and γ.
Bioinformatics tools were employed to characterise the Eg2DBDα cloned region. Specific nuclear receptor motifs, as well as consensus signal sequences, were identified. The P-box motif is critical for DNA-binding specificity and is highly conserved within the family of nuclear receptors. The P-box, first labelled as "Proximal box", was identified by the Evans team as comprising the amino acids essential for DNA binding [47]. Three residues of the P-box, the first, the second and the fifth, distinguish nuclear receptors and discriminate the central two nucleotides of the half recognition element of DNA. This triad was initially reported in the P-boxes of Glucocorticoid receptor "GSV", Estrogen receptor "EGA", Thyroid hormone receptor "EGG", Fushi Tarazu Factor-1 "ESG", v-erbA and knirps receptors "EGS" and Estrogen related receptors "EAA" [48][49].
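The triad is simply the first, second and fifth residues of the five-residue P-box. The following is a minimal sketch of that extraction; the function name is illustrative, EACKK is the P-box given in the text, and the GR, ER and TR P-box sequences used here are values commonly cited in the literature that are assumed for this example rather than taken from this work.

```python
def pbox_triad(pbox: str) -> str:
    """Return the discriminating residues (positions 1, 2 and 5) of a 5-residue P-box."""
    if len(pbox) != 5:
        raise ValueError("a P-box is expected to have exactly five residues")
    return pbox[0] + pbox[1] + pbox[4]


# EACKK comes from the text; the other three sequences are assumed for illustration.
pboxes = {
    "Eg2DBD DBD I": "EACKK",
    "Glucocorticoid receptor": "GSCKV",
    "Estrogen receptor": "EGCKA",
    "Thyroid hormone receptor": "EGCKG",
}

for receptor, pbox in pboxes.items():
    print(receptor, pbox_triad(pbox))
```

The triads obtained for the assumed GR, ER and TR sequences agree with the "GSV", "EGA" and "EGG" triads quoted above.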
Interestingly, the triad "EAK" from the P-box of the first DBD of the Eg2DBD-NRs (EACKK) seems to be specific to the 2DBD-NR subfamily. An exhaustive study of 57 P-box variants and various DNA repeat elements that differ in the sequence of the two central bases has been reported [49]. According to those results, we propose that the target sequence for the EAK P-box could be AGG(C/G)CA. On the other hand, the "EGG" P-box triad of the second DBD of the Eg2DBD-NRs is identical to that of most members of NR subfamily I, for instance, the Thyroid hormone receptor, PPARs and ROR. In general, receptors of this subfamily contain an EGG P-box sequence and recognise DNA elements with the nucleotide sequence AGGTCA. Following the same argument, the putative target sequence for the P-box of the second DBD of the Eg2DBD-NRs could be AGG(T/C/A)CA.
A second consensus motif has been defined in the DNA-binding domain, called D-box, which is a 5-amino acid loop that defines a strong dimerisation interface for homodimer formation and contributes, to a much lesser extent, to heterodimer stabilisation [11,49]. Eg2DBDα D-boxes, "PSGGN" and "VGCKDN", differ from D-boxes reported by Umesono and Evans for GR and ER/TR subfamilies [47]. These regions are also specific for each cestode subtype indicating different dimerisation behaviour. M. corti 2DBD-NRs are an exception since Mc2DBDα and Mc2DBDβ D-boxes of the first DBD differ from those of the other members of the subfamily.
The presence of two DBDs in the same receptor is a novel finding. There are no precedents that allow us to generate a hypothesis about how two different P-boxes would act in the same protein. The fact that each DBD has specific P-and D-boxes could indicate that each domain acts independently. The 3D model of Eg2DBDα DNA binding domains bound to DNA suggests that only one DNA binding domain is able to specifically recognize a DNA response element. The other DBD could contribute to the stabilisation of the complex.
Based on phylogenetic analysis, it was suggested that 2DBD-NRs have a close functional or evolutionary relationship with NR subfamily I [29]. We consider that the known ligands of subfamily I members, fatty acids, prostaglandins, cholesterol and retinoic acid could be suitable ligands for E. granulosus 2DBD-NRs. Of particular interest are fatty acids, since the parasite does not synthesize them de novo. Consequently, these molecules should be obtained from the host and distributed among cellular organelles by fatty acid transporters [23,50]. Recently, our research group has demonstrated the presence of the fatty acid-binding protein, EgFABP1, in the nuclei of protoscoleces cells, suggesting that it could be involved in fatty acid transport to the nucleus and delivery to PPAR-like receptors [50]. The identified Eg2DBDα-NR could be a good candidate to interact with EgFABP1.
NRs characteristically possess a major NLS (termed NL1) that is composed of basic amino acids at the C-terminal region of the DNA-binding domain [36]. Nuclear localisation signals are an apparently diverse set of sequences with a generally polybasic character. The best characterised NLS is the "classical" NLS of the SV40-T antigen, which is identifiable in the primary sequence of a protein as a series of basic residues in the form K(K/R)X(K/R) [32,51,52]. However, several NLSs that do not completely fit this consensus have been identified. Functional NLSs have been reported for the glucocorticoid receptor (GR) as KKKIK, for the mineralocorticoid receptor (MR) as KKX2K, for CRBP II as KRK, and for FABP4 as KRR [34][35][36][37]. A second NLS (termed NL2) was also identified within the GR and estrogen receptor LBDs [36]. A new class of NLSs termed NL0 (K/R X7RR), requiring a serine/threonine motif, was reported in MR [36]. Taken together, these data lead us to propose two putative NLSs composed of residues 325-KRX2R-329 and 419-RRK-421, located in the hinge domain. The same location has been reported in other NRs [12,53].
SUMOylation has been reported to regulate protein subcellular localisation, protein-DNA binding, protein-protein interactions, transcription, DNA repair and genome organisation [54]. SUMO activating enzyme has been identified in E. granulosus (GenBank accession number CDS16643.1) suggesting that the SUMOylation pathway could be active. Two consensus SUMOylation sequences were identified in Eg2DBDα isoform at the hinge and C-terminal region. This post-translational modification has been reported for RORα, ERα and PPARα at the hinge region [55][56] and for PPARγ, LXRβ and PPARα at the LBD domain [57][58][59]. SUMOylation in the C-terminal AF-2 region is now viewed as a critical mechanism regulating the balance between trans-activating and trans-repressive functions of NRs [12]. In this sense, Eg2DBDα could undergo a similar regulatory mechanism.
NRs have become attractive targets for the development of small-molecule drugs, due to the lipophilic nature of their ligands and their ability to modulate the expression of multiple genes in the same pathway [60]. Parasitic helminth NRs have been proposed as targets for a new therapeutic strategy different from that of the known anthelmintic drugs, such as the benzimidazoles, whose target is tubulin [60]. In addition, it should be noted that approximately 13% of the drugs approved by the FDA interact with NRs [61]. Taking into account that vertebrates lack 2DBD-NRs, these atypical proteins seem to be good putative targets for novel drugs.
Considering the above, the elucidation of 2DBD-NR targets in E. granulosus could allow the discovery of new and specific pathways differing from those of the host.

Conclusions
Echinococcus granulosus Eg2DBDα, Eg2DBDβ and Eg2DBDγ proteins are members of the new 2DBD-NR subfamily. These proteins possess the typical architecture and conserved motifs of NRs. Members of this subfamily are present in other cestodes. These proteins, as well as those analysed from other cestodes, are phylogenetically related to S. mansoni 2DBD-NR subtypes α, β and γ, supporting the evolutionary relationship with NR subfamily I suggested by Wu et al. [9].
Although the E. granulosus database reports three different Eg2DBDs, only one of them could be amplified under the conditions tested. The transcript obtained corresponds to an isoform of the GeneDB database gene EgrG_000379600.1. It could be the most highly expressed 2DBD-NR in the protoscolex larval stage. The characterised Eg2DBDα isoform might be a good candidate to interact with EgFABP1. Furthermore, this isoform contains putative NLS and SUMOylation signals.
The high similarity among the DBDs of the Eg2DBD-NRs does not allow us to determine whether they have different targets. However, they probably have specific activation mechanisms, since the LBDs are poorly conserved among them. This should give rise to specific dimerisation behaviour, probably involving different ligands, activators and repressors. The 3D model obtained suggests a new hypothesis of how two different P-boxes would act on a target gene. Identification of putative ligands, dimerisation behaviour and target genes will be the focus of future work to further characterise the function of these particular receptors.