Genome-wide characterization, evolutionary analysis of WRKY genes in Cucurbitaceae species and assessment of its roles in resisting to powdery mildew disease.

The WRKY proteins constitute a large family of transcription factors that have been known to play a wide range of regulatory roles in multiple biological processes. Over the past few years, many reports have focused on analysis of evolution and biological function of WRKY genes at the whole genome level in different plant species. However, little information is known about WRKY genes in melon (Cucumis melo L.). In the present study, a total of 56 putative WRKY genes were identified in melon, which were randomly distributed on their respective chromosomes. A multiple sequence alignment and phylogenetic analysis using melon, cucumber and watermelon predicted WRKY domains indicated that melon WRKY proteins could be classified into three main groups (I-III). Our analysis indicated that no recent duplication events of WRKY genes were detected in melon, and strong purifying selection was observed among the 85 orthologous pairs of Cucurbitaceae species. Expression profiles of CmWRKY derived from RNA-seq data and quantitative RT-PCR (qRT-PCR) analyses showed distinct expression patterns in various tissues, and the expression of 16 CmWRKY were altered following powdery mildew infection in melon. Besides, we also found that a total of 24 WRKY genes were co-expressed with 11 VQ family genes in melon. Our comparative genomic analysis provides a foundation for future functional dissection and understanding the evolution of WRKY genes in cucurbitaceae species, and will promote powdery mildew resistance study in melon.


Introduction
WRKY proteins are widely distributed in all plants and comprise one of the largest transcription factor families. Over the past decades, these proteins have been found to play an increasing number of functions in a wide range of physiological and biochemical processes [1][2][3][4][5] cucurbitaceae species. In the present study, we identified a total of 56 proteins with complete WRKY domains in melon using a hidden Markov model (HMM) that allows for the detection of the WRKY domain across highly divergent sequence. Multiple sequence alignments, phylogenetic relationships, chromosome distributions, gene duplication, syntenic relationship and selection pressure analysis of WRKY orthologous pairs from cucurbitaceae species were also performed. In addition, this study also determined the expression patterns of CmWRKY genes in 29 different tissues and measured their abundance under powdery mildew fungus infection using quantitative RT-PCR (qRT-PCR). Furthermore, we also performed the correlation analysis between CmWRKYs and CmVQs expression using the data from RNA-seq. Our results will undoubtedly provide a foundation for further evolution and functional studies of cucurbitaceae WRKY family genes.

Plant materials and powdery mildew fungus inoculation
Powdery mildew fungus was collected from cultivated melon grown on the experimental farm of Shandong Academy of Agricultural Sciences with normal day/night period. Cultivated melon (B29) were grown in the greenhouse with a photoperiod of 16/8 h (day/night) and a temperature of 22˚C/12˚C (day period/night period). Plants with two or three true leaves were inoculated by powdery mildew fungus with a concentration of 1× 10 6 /mL as previously described [31]. Leaves were harvested at 0, 24, 72, 168 h post inoculation, and immediately frozen in liquid nitrogen and stored at -80˚C for the following RNA extraction. Three biological replicates were prepared for each sample.

Identification of candidate WRKY gene family members in melon
The Hidden Markov Model (HMM) of the WRKY domain (PF03106) was downloaded from the Pfam protein family database (http://pfam.sanger.ac.uk/) and used to identify putative WRKY proteins from the melon protein sequence database (http://cucurbitgenomics.org/) using the HMMER search program with E-value threshold of 0.01 (HMMER 3.0; http:// hmmer.janelia.org/). All non-redundant protein sequences encoding complete WRKY domains were selected as putative WRKY proteins and confirmed using the SMART software program (http://smart.embl-heidelberg.de/).

Multiple sequence alignment and phylogenetic analysis
Multiple sequence alignment of all conserved WRKY core domains was determined using Clustal Omega online software (http://www.ebi.ac.uk/Tools/msa/clustalw2/) with default parameters: number of combined iterations (0), max guide tree iterations (-1) and max HMM iterations (-1). Subsequently, MEGA7.0 software was used for phylogenetic analysis using the neighbor-joining (NJ) method under the Jones-Thornton-Taylor (JTT) model with 1000 replicates of bootstrap based on the alignment results and the flowing parameters: substitution type, poisson model, uniform rates, partial deletion. The phylogenetic tree showed only branches with a bootstrap consensus > 50. Furthermore, a phylogenetic tree was also constructed using Maximum Likelihood (ML) method based on the JTT matrix-based model with 1000 replicates of bootstrap and the flowing parameters: substitution type, poisson model, uniform rates, partial deletion. Branches corresponding to partitions reproduced in less than 30% of bootstrap replicates were collapsed. Based on the multiple sequence alignment, phylogenetic analysis and the previously reported classification of CsWRKY and ClWRKY genes, the CmWRKY genes were assigned to different groups and subgroups.

Chromosomal location, gene duplication and selection pressure analysis
The chromosomal location information of all melon WRKY genes was obtained from cucurbit genomics database (http://cucurbitgenomics.org/). The map was generated using MapInspect software (http://mapinspect.software.informer.com/). To detect segmental and tandem duplication events, every WRKY sequence was respectively aligned against the other WRKY protein sequences in melon using BLASTp program to identify potential homologous gene pairs (Evalue < 1e -5 , top 3 matches) and output format as tabular. Then, the destination tabular file and the GFF file of melon genome were inputted into software MCScanX to analyze duplication types with the flowing parameters: match score (>50); match size (5); gap penalty (-1); overlap window (5); E-value: 1e -5 ; max gaps (25) and visualized using CIRCOS (http://circos. ca/) [32]. Non-synonymous (Ka) and synonymous (Ks) substitution of each orthologous gene pair were calculated by PAL2NAL program based on the codon sites model with the following option setting: Codon table, universal; Remove mismatches, yes; Use only selected positions, yes; Output format, PAML. With this setting, the codon alignment corresponding to the specified regions is generated in the PAML format [33,34].

RNA-seq based expression analysis and correlation calculation
The normalized expression levels of melon WRKY and VQ genes in different tissues as well as in leaves artificially inoculated with powdery mildew fungus based on RNA-seq data were obtained from the Melonet database for functional genomics study of muskmelon (http:// gene.melonet-db.jp/cgi-bin/top.cgi). Gene expression data are presented as log 2 (FPKM value +1) to reveal difference in expression levels among different tissues. To visualize the expression patterns of the WRKY genes in different melon organs, a heat map was created using R project (http://www.r-project.org/). The co-expression analysis among WRKY and VQ genes was performed using R programming language (http://www.r-project.org/). Firstly, the Pearson Correlation Coefficient (r) and respective false discovery rate (FDR) adjusted p-values were calculated according to the expression values in different tissues as well as different developmental stages using multiple hypothesis testing method included in R software package. Then co-expressed WRKY and VQ gene pairs with r > 0.6 and FDR < 0.05 were selected to construct co-expression network using Cytoscape software (http://www.cytoscape.org/).

RNA isolation and qRT-PCR analysis
Total RNA samples were extracted from leaves using the Trizol reagent according to the manufacturer's instruction (Invitrogen, CA, USA). Reverse transcription reactions were performed at 42˚C for 1 h and were terminated at 85˚C for 5 min with 20 μl system contained 1×reverse transcriptase buffer, 50 nM Olig(dT) primer, 0.25 mM each of dNTPs, 50 units reverse transcriptase, 4 units RNase inhibitor and 2 μg DNase I-treated total RNA. CmActin was used as the internal control. The qRT-PCR program was set as follows: 95˚C for 5 min, then followed by 40 cycles of 95˚C for 15 s and 60˚C for 1 min. The 2 -44Ct method was used to calculate the relative expression levels and the analysis of covariance method (ANCOVA) was used to access whether the results were statistically different. Primers used for qRT-PCR experiments were listed in S1 Table.

Identification of WRKY family genes in melon
To identify WRKY family genes in melon, Hidden Markov Model (HMM) and BLASTP searches were performed against reference genomes of melon using the consensus sequence of the WRKY domain. As a result, a total of 56 proteins with complete WRKY domains were identified in melon, which were termed as CmWRKY1 to CmWRKY56. The numbers of WRKY family members was approximately equal with other cucurbitaceae species cucumber and watermelon but less than that in model plants Arabidopsis (74 members) and in rice (109 members). The nucleotide and amino acid sequences of all identified WRKY families are presented in S2 Table. As shown in S2 Table, the lengths of the putative CmWRKY proteins ranged from 111 to 768 amino acids, with an average of 342 amino acids. Based on the number of WRKY domain and the type of zinc finger motif, CmWRKY proteins could be classified into three groups, group I (11 sequences), group II (40 sequences) and group III (5 sequences) based on the number of WRKY domain and the type of zinc finger motif (S3 Table).

Multiple sequence alignment and phylogenetic analysis of WRKY domains
The WRKY domain consists of approximately 60 amino acid residues and contains one or two highly conserved short peptide WRKYGQK as well as a conserved C2H2-or C2HC-type zinc finger motif. The conserved short peptide WRKYGQK is considered to be important for recognizing and binding to W-box elements. A multiple sequence alignment of the core WRKY domain in melon were performed and shown in S1 Fig. WRKYGQK sequences represented the major variant in 54 melon WRKY proteins. WRKYGKK sequence was observed only in two WRKY proteins (CmWRKY16 and CmWRKY48) that belong to group IIa.
To identify the evolutionary relationships of the WRKY proteins among three cucurbitaceae species (melon, cucumber, and watermelon), a NJ and a ML phylogenetic tree were generated using multiple sequence alignments of all the conserved WRKY domain sequences with a bootstrap analysis (1,000 replicates), respectively. Based the classification of WRKY domains in cucumber and watermelon, all melon WRKY genes were classified into three main groups, with five subgroups in group II. As shown in Fig 1 and S2 Fig, group I contained 11 melon WRKYs, 11 watermelon WRKYs and 12 cucumber WRKYs. While, group III contained 5 melon WRKYs, 8 watermelon WRKYs and 6 cucumber WRKYs in both NJ and ML phylogenetic tree. Besides, 40 CmWRKY proteins in group II could be classified into five subgroups based on the phylogenetic trees, IIa (3 sequences), IIb (7 sequences), IIc (15 sequences), IId (7 sequences), and IIe (8 sequences). Meanwhile, 34 ClWRKY proteins in group II could be classified into IIa (2 sequences), IIb (8 sequences), IIc (11 sequences), IId (6 sequences) and IIe (7 sequences), and 38 ClWRKY proteins in group II could be classified into IIa (5 sequences), IIb (7 sequences), IIc (12 sequences), IId (7 sequences) and IIe (7 sequences). Interestingly, many WRKYs belong to group IIa in NJ phylogenetic tree were classified into group IIc in ML phylogenetic tree, which may because the domain sequences of WRKYs in group IIc were prone to mutate. Detailed information about the classification of the WRKY genes, as well as the sequences of conserved WRKY domains and zinc-finger motifs in each gene can be found in S3 Table. Genome-wide distribution and duplication of WRKY family genes Fig 2 showed the distribution of the WRKY genes on melon chromosomes. As shown in the figures, the WRKY genes were unevenly distributed throughout all chromosomes, and the number on each chromosome was not related to its length. In melon, chromosome 6 contained the largest number (9) of CmWRKY genes. Several patterns of gene duplication including whole-genome duplication (or segmental duplication) and single gene duplication have been identified. Single gene duplication is involved in five types of duplication-tandem, proximal, retrotransposed, DNA-transposed and dispersed duplication. Of these patterns, segmental and tandem duplications have been suggested to represent two of the main causes of gene family expansion in plants. Therefore, we focused on segmental and tandem duplication events in our study. To access the contribution of segmental and tandem duplications to the genomewide expansion of WRKY family in the melon genome, we evaluated the gene duplication events of CmWRKY genes using MCScanX program. There are 8 CmWRKY pairs (CmWRKY2-CmWRKY39, CmWRKY3-CmWRKY43, CmWRKY9-CmWRKY25, CmWRKY9-CmWRKY35, CmWRKY15-CmWRKY49, CmWRKY25-CmWRKY35, CmWRKY27-CmWRKY45, CmWRKY26-CmWRKY50) were identified as segmental duplications and no tandem duplication was identified (Fig 3), suggesting that most CmWRKY genes were possibly generated by gene segmental duplication. Indeed, Ling has discovered that no tandem gene duplication events occurred in CsWRKY gene evolution because of no paralogs of cucumber can be detected through phylogenetic and nucleotide identity analysis.

Orthologous gene identification and selection pressure analysis of WRKY orthologous genes among cucurbitaceae species
In our study, we further identified 44 orthologous pairs between melon and watermelon, 43 orthologous pairs between melon and cucumber, and 40 orthologous pairs between watermelon and cucumber according to the phylogenetic and homogeneity analysis (S4 Table). The highest and lowest amino acid identity between melon and cucumber were the pairs CmWRKY50-CsWRKY24 (98.62%) and CmWRKY7-CsWRKY35 (74.21%) with an average sequences identity of 93.00%. The highest and lowest protein sequence identity between melon and watermelon were the pairs CmWRKY53-ClWRKY16 (95.92%) and CmWRKY5-ClWRKY26 (65.51%) with an average sequences identity of 83.57%. The highest and lowest amino acid identity between watermelon and cucumber were the pairs CsWRKY24-ClWRKY15 (95.65%) and CsWRKY37-ClWRKY45 (61.67%) with an average sequences identity of 83.93%. The chromosomal location and syntenic relationship of orthologous gene pairs were shown in Fig 4. Physical mapping revealed that most WRKY genes (98%) in cucurbitaceae species were not located in the corresponding chromosomes of melon, watermelon and cucumber, suggesting the occurrence of large chromosome rearrangement in the cucurbitaceae genomes. Furthermore, the dissimilarity level between the non-synonymous substitution (dN) and synonymous substitution (dS) values was used to infer the direction and magnitude of natural selection acting on WRKY orthologous gene pairs in melon, watermelon and cucumber. The results showed that the WRKY orthologous gene pairs in cucurbitaceae species underwent strong purifying pressure during evolution (Table 1).

Expression patterns of CmWRKYs in different tissues at different developmental stages
To gain insights into the potential functions of CmWRKYs, the expression patterns of the CmWRKYs were investigated using publicly available data from RNA-seq of melon transcript expression generated from 29 different tissues as well as different development stages of melon. The results showed that the transcript abundances of different CmWRKYs were extremely diverse (Fig 5)

Co-expression analysis of CmWRKY and CmVQ in response to powdery mildew infection
In order to gain information about hypothetical interactions between WRKY proteins and VQ proteins in melon, we performed the expression correlation analysis between CmWRKYs and CmVQs using the data from RNA-seq. Firstly, 24 VQ motif-containing proteins were identified in melon genome. A total of 24 CmWRKY genes were co-expressed with 11 CmVQ genes with the correlation coefficient was greater than 0.7 (Fig 6; S5 Table). Nine CmWRKY genes were simultaneously co-expressed with two different CmVQ genes and 15 CmWRKY genes were only co-expressed with one CmVQ gene. Besides, some CmWRKY (CmVQ) genes showed co-expression patterns with other members of WRKY (VQ) family genes. Therefore, the co-expression may be important for further functional analysis of CmWRKYs.
To access the function of WRKY and VQ family genes in resisting to powdery mildew fungus infection, expression profiles of all CmWRKY and CmVQ in leaves artificially inoculated with powdery mildew fungus were performed based on data from RNA-seq. A total of 16 CmWRKY and five CmVQ exhibited different expression levels in response to powdery mildew inoculation, indicating that they are powdery mildew fungus responsive genes and might play important roles in resisting to powdery mildew disease in melon (Fig 7). Among them, most CmWRKY were up-regulated after inoculation except for CmWRKY15. Besides, up-regulation was also observed for CmVQ6, CmVQ16, CmVQ23, while CmVQ18 and CmVQ21 showed the   opposite pattern after inoculation. To validate the RNA-seq data, qRT-PCR was performed to examine the expression of several CmWRKYs and CmVQs that may be involved in resisting to powdery mildew disease and the results were in agreement with the sequencing data (Fig 8).

Genome-wide exploration and phylogenetic analysis of WRKY genes among cucurbitaceae species
The evolutionary relationship and function analysis of WRKY genes have been thoroughly studied in most plants. In the previous report, genome-wide analysis of WRKY family genes in cucumber and watermelon has also performed, and 55 CsWRKYs as well as 63 ClWRKYs in the cucumber and watermelon genome were identified, respectively [11,14]. To further gain information with respect to the evolutionary relationship and function of WRKY genes among cucurbitaceae species, we exploited available genomic resources to characterize the WRKY Genome-wide characterization, evolutionary analysis of WRKY genes in Cucurbitaceae species family genes in the other cucurbitaceae species, melon. A total of 56 putative WRKY genes (proteins) were identified in the genome of melon. The numbers of WRKY family members in melon were approximately equal to that in cucumber (56) and grape (55) but less than that in Arabidopsis (72 members) and in rice (102 members), peanut (152) and other crops, which maybe because unlike Arabidopsis and rice, cucurbitaceae species and grape have a much smaller genome and have not undergone recent whole-genome duplication after the differentiation of eudicots and monocots. Indeed, previous studies showed that Arabidopsis has undergone three whole-genome duplication events, and rice has also undergone at least one wholegenome duplication event, which promoted the rapid expansion of gene family [35,36]. However, Huang and Garcia-Mas have respectively observed the absence of recent whole-genome duplications in cucumber and melon genomes [28,29].
It is worth noting that ClWRKY11 that contain one WRKY domain were clustered in group IN from the phylogenetic analusis. Meanwhile, two CsWRKY, CsWRKY46 and CsWRKY51, were clustered in group IN, and CsWRKY47 and CsWRKY52 were clustered in group IC. Thus, these WRKYs in group II may have arisen from a two-domain WRKY protein in group I that lost one of its WRKY domains located in the N-terminal or C-terminal during evolution. The previous studies have reported that group I, group II, group III contained 13, 45, 14 WRKY proteins in Arabidopsis and 13, 42, 47 WRKY proteins in rice, respectively [37]. Compared with the numbers of WRKY families in Arabidopsis and rice, it is apparent that variations in the number of WRKY genes in group III are the primary cause of the diversity of WRKY gene family size between melon and other plants. Therefore, a key role of group III WRKY genes in plant evolution may exist, which is possible that the genome or gene family duplication events have resulted in the different size of the group III WRKY genes among cucurbitaceae species, Arabidopsis and rice.

The orthologous and expression analysis of WRKY genes provide important clues for their function
The recent gene duplication events including segmental duplication and tandem duplication are most important in the expansion and evolution of gene families, and were considered to be important driving forces in the expansion and evolution of gene families and the raw materials for new biological functions. Therefore, we further analyzed the influence of recent duplication events to WRKY family genes in cucurbitaceae species. The results showed that no recent tandem duplication event in WRKY family genes of melon was found. Besides, only 8 segmental duplications of CmWRKY were confirmed in melon genome, suggesting that suggesting that low tandem and segmental duplications events existed in CmWRKY genes family, which consists with the results in watermelon and cucumber [14]. Therefore, the absence of recent duplication events in melon, watermelon and cucumber genome may attribute to the small size of WRKY members.
Given that orthologous genes among different plants are generally supposed to retain similar functions and to share other key properties. Thus, the comparative analysis of WRKY orthologous genes among cucurbitaceae species could help to predict their genetic relationship and the potential functions of WRKY proteins in melon, cucumber and watermelon. For example, AtWRKY46 played very important roles in response to drought and salt stresses in Arabidopsis, and as the orthologous gene of AtWRKY46 in watermelon, ClWRKY23, was also reported to be up-regulated under drought and salt stresses, implying that these two WRKY play similar functions in different plant species [14]. In our study, a total of 44 orthologous pairs between melon and watermelon, 43 orthologous pairs between melon and cucumber were identified, which provided important clues for further functional prediction of WRKY genes in melon. A WRKY transcription factor from cucumber, CsWRKY46, confers cold resistance in transgenic plants by regulating a set of cold-stress responsive genes in an ABA-dependent manner [38]. As its orthologous genes, CmWRKY29 may also play similar roles in response to cold stress in melon. However, further molecular and biological experiments should be carried out to investigate their biological function.
WRKY transcription factors play very critical roles in different tissues or different developmental stages to control plant growth and development. For instance, virus-induced silencing of GmWRKY58 and GmWRKY76 in soybean causes severe stunted growth with reduced leaf size and plant stature, and overexpression of the OsWRKY31 could reduce lateral root formation and elongation [39,40]. To provide important clues for gene function prediction, thus, we conducted a digital gene expression analysis for CmWRKY genes in different tissues including leaf, root, stem, tendril and flower ovary and fruit as well as their different developmental stages using released data from RNA-seq. The temporal and spatial diversification of WRKY gene expression is widespread, which is important for gene function analysis. CmWRKY6, CmWRKY17, CmWRKY19, CmWRKY23, CmWRKY42, CmWRKY47 and CmWRKY52 had a higher expression level in all tested tissues, implying that these genes play key roles in the whole-plant growth and development. Furthermore, many CmWRKYs had a higher expression level in certain organs/tissues or in different developmental stages, which suggested that they play divergent roles in the different developmental processes.

CmWRKY might play important roles in resistance to powdery mildew disease through interaction with CmVQ
In addition to the roles of WRKY proteins in plant growth and development, increasing reports indicated that WRKY proteins in various plant species are involved in the response to various biotic and abiotic stresses. Moreover, many WRKY proteins are also the components of plant biotic stress regulatory networks and directly regulate the expression of several critical genes of defense-signaling pathways [41,42]. For example, AtWRKY33 can positively modulate defence-related gene expression and improve resistance to B. cinerea disease, while AtWRKY7 and AtWRKY48 have an immediate negative effect on plant defense response [43][44][45][46][47]. Overexpression of OsWRKY67 in rice confirmed enhanced disease resistance to Magnaporthe oryzae and Xanthomonas oryzae, but led to a restriction of plant growth in transgenic lines [48]. Furthermore, Guo demonstrated that 16 VvWRKY have a negative effect on grape powdery mildew resistance and 22 VvWRKY have a positive effect on grape powdery mildew resistance in grape [49]. Transcriptome analysis and qRT-PCR expression profiles in melon leaves generated in the current study also revealed that many WRKY genes were responsive to powdery mildew inoculation, thus highlighting the extensive involvement of WRKY genes in resisting to powdery mildew disease.
Previous reports showed that some rice and grape WRKY genes were co-expressed with VQ genes during the response to attacks by three different pathogens [50,51]. Further studies have revealed that WRKY proteins can interact physiologically with VQ motif-containing proteins [2,52]. Thus, VQ and WRKY proteins might assemble to form one complex to regulate the target gene. Indeed, VQ9 protein acts as a repressor of the WRKY8 protein to modulate salinity stress tolerance and pathogen resistance in Arabidopsis [53]. In the present study, we found that 24 CmWRKY genes were co-expressed with 11 CmVQ transcription factors (Fig 6). Furthermore, some of the co-expressed CmWRKY and CmVQ genes were shown to be response to powdery mildew infection, implying that these VQ and WRKY proteins are involved in the same biological pathway. For example, CmVQ6 positively co-expressed with CmWRKY47, CmWRKY55, CmWRKY56 and all these were up-regulated by powdery mildew inoculation (Figs 7 and 8), suggesting that different WRKY have the same VQ partner. From above results, we have shown which WRKY proteins interact with which VQ proteins. Further research should carried out to explore their physical interactions in vitro during the responses to powdery mildew infection as well as various abiotic stresses in melon to provide further molecular evidence for these interactions.

Conclusions
In this study, we identified 56 WRKY family genes in melon. WRKY proteins in cucurbitaceae species were under strong purifying pressure. The expression pattern analysis of CmWRKYs in different tissues as well as under powdery mildew infection indicated that they were involved in the growth and development of various tissues and that they might positively or negatively participated in plant tolerance against powdery mildew disease. Furthermore, the co-expression analysis between CmWRKY and CmVQ will assist in understanding the roles of these CmWRKY transcription factors in response to powdery mildew disease and their potential interactions among defence-related genes in the disease resistance network. Collectively, our findings provide valuable clues for further research on the specific function and regulatory mechanisms of WRKYs in cucurbitaceae species, and could help to select appropriate candidate genes for further characterization of their pathogen resistant functions in melon.