16 Jan 2014: Zhang DY, Ali Z, Wang CB, Xu L, Yi JX, et al. (2014) Correction: Genome-Wide Sequence Characterization and Expression Analysis of Major Intrinsic Proteins in Soybean (Glycine max L.). PLOS ONE 9(1): 10.1371/annotation/e3307d0c-bb59-4f75-89b4-8e0d5af087d5. https://doi.org/10.1371/annotation/e3307d0c-bb59-4f75-89b4-8e0d5af087d5
Water is essential for all living organisms. Aquaporin proteins are the major facilitators of water transport through the cell membranes of plants, including soybean. These proteins are diverse in plants and belong to the large major intrinsic protein (MIP) family. In higher plants, MIPs are classified into five subfamilies: plasma membrane intrinsic proteins (PIP), tonoplast intrinsic proteins (TIP), NOD26-like intrinsic proteins (NIP), small basic intrinsic proteins (SIP), and the recently discovered X intrinsic proteins (XIP). This paper reports the genome-wide assembly of soybean MIPs, their functional prediction and expression analysis. Using a bioinformatic homology search, 66 GmMIPs were identified in the soybean genome. Phylogenetic analysis of the amino acid sequences of GmMIPs divided this large and highly similar multi-gene family into five subfamilies: GmPIPs, GmTIPs, GmNIPs, GmSIPs and GmXIPs. The GmPIPs comprised 22 genes and the GmTIPs 23, with high sequence similarity within each subfamily. The GmNIPs comprised 13 members and the GmSIPs 6, and both were diverse. In addition, we identified a two-member GmXIP group, a distinct fifth subfamily. GmMIPs were further classified into twelve subgroups based on substrate selectivity filter analysis. Expression analyses were performed for a selected set of GmMIPs using semi-quantitative reverse transcription PCR (semi-qPCR) and qPCR. Our results suggest that many GmMIPs have high sequence similarity but diverse roles, as evidenced by analysis of their sequences and expression. It can be speculated that the GmMIPs contain true aquaporins, glyceroporins, aquaglyceroporins and mixed transport facilitators.
Citation: Zhang DY, Ali Z, Wang CB, Xu L, Yi JX, Xu ZL, et al. (2013) Genome-Wide Sequence Characterization and Expression Analysis of Major Intrinsic Proteins in Soybean (Glycine max L.). PLoS ONE 8(2): e56312. https://doi.org/10.1371/journal.pone.0056312
Editor: Dongsheng Zhou, Beijing Institute of Microbiology and Epidemiology, China
Received: August 13, 2012; Accepted: January 8, 2013; Published: February 20, 2013
Copyright: © 2013 Zhang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: ZA is thankful for financial assistance from Jiangsu and Punjab sister provinces of China and Pakistan for his visit at Jiangsu Academy of Agricultural Sciences, Nanjing, People's Republic of China. This study was sponsored by National Science Foundation of China (31101166, 30971798); the Jiangsu Natural Science Foundation, China (BK2010474); the Special Fund for Independent Innovation of Agricultural Science and Technology in Jiangsu (CX(12)5132, CX(12)1005-2); Jiangsu Key Laboratory for Bioresources of Saline Soils (JKLBS2012002) and the Project in the National Science & Technology Pillar Program during the Twelfth Five-year Plan Period (2011BAD35B06-4-3). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Water is essential for all living organisms. As in other organisms, plant growth and development depend on water uptake and on the regulation of water transport across cellular membranes and tissues. For a long time, it was thought that water moved across cell membranes by free diffusion through the lipid bilayer. However, its transport was thought to be highly selective, thus preventing uncontrolled movement of other solutes, protons, and ions. The first aquaporin gene (AQP1) was identified from human erythrocytes, and NOD26 from nitrogen-fixing symbiosomes in root nodules of soybean plants. Since their discovery, many studies have indicated that aquaporins provide an important selective pathway for water transport across cellular membranes, and they have changed our understanding of water flow regulation in plants under different physiological conditions.
Aquaporins are integral membrane proteins belonging to a large family of water channel proteins that assist the rapid movement of water across cellular membranes. Water and solute transport are universal requirements for living cells, and these proteins are found in most organisms. Of the five AQP subfamilies, the original definition of the PIP and TIP subfamilies was based on their assumed location in the plasma membrane and tonoplast, respectively. When proteins of the TIP group were localized using antibodies, the signal was always confined to the tonoplast membrane fractions. PIP localization seems less well defined. A PIP-family protein has been localized to the plasma membrane in Arabidopsis. In M. crystallinum, PIPs were detected to a small extent in the plasma membrane but mostly in a vacuolar fraction in continuous sucrose gradients or, more likely, in a membrane fraction with a density similar to that of tonoplasts. This is not surprising, as cycling of mammalian AQPs between the plasma membrane and internal vesicles has been documented to be under hormonal control. The distribution of different TIPs in distinct plant vacuoles may also be based on similar mobility.
Aquaporins (AQPs) belong to the ancient major intrinsic protein (MIP) family found in animals, microbes, and plants. Since the discovery of AQP1, 13 different AQPs have been identified in mammals, while a surprisingly high number of homologues have been found in plants: 35 AQPs in Arabidopsis, 31 full-length expressed AQP genes in Zea mays, 33 in Oryza sativa, 23 in Physcomitrella patens, 37 in Solanum lycopersicum, and 71 in Gossypium hirsutum. On the basis of sequence homology, plant AQPs are categorized into four subfamilies: the plasma membrane intrinsic proteins (PIPs), the tonoplast intrinsic proteins (TIPs), the nodulin26-like intrinsic proteins (NIPs) and the small and basic intrinsic proteins (SIPs). However, a novel AQP subfamily, the uncategorized X intrinsic proteins (XIPs), has been reported in several dicots. For instance, on the basis of sequence homology, the 37 AQPs in tomato were classified into 18 PIP, nine TIP, six NIP, three SIP, and one novel XIP isoform.
Plant AQP gene expression is differentially regulated in various tissues and is also altered under different physiological and environmental stresses. AQP gene expression patterns observed in many plant species in specific tissues and cell types, or in response to phytohormones or environmental factors, have highlighted the putative role of water channels. AQPs play a central regulatory role in plant water relations and cellular water transport. AQPs mediate the regulation of root water transport in response to a variety of environmental stimuli and facilitate water transport from the roots through inner leaf tissues during transpiration and in expanding tissues. AQPs also facilitate the transport of low molecular weight molecules such as urea, boric acid and CO2 through the plant cell membrane, and contribute to assimilate transport in the phloem via sieve elements, stomatal control, leaf movement and the control of cytoplasmic homoeostasis. Many different mechanisms appear to be involved in the regulation of plant aquaporin activity in cellular membranes. Beyond the initial regulation of gene expression according to plant cell type, developmental stage and environmental state, the translated aquaporins are sent to their target membrane, where, when required, they facilitate the transmembrane flux of water and/or small non-electrolytes.
The first step in investigating the role of MIPs in soybean water relations is the identification of the MIP gene family. Therefore, the objective of this study is to identify soybean MIP genes and to investigate both their structural properties and expression patterns. In this study we identified 66 MIPs in the soybean genome. This paper presents their isoforms and genome-wide classification, and expression analysis specific to various tissues and water stress.
Materials and Methods
Identification of GmMIPs
A comprehensive tblastn search was conducted at www.phytozome.net/ using all the Physcomitrella and Arabidopsis MIPs as queries. The CDS (coding DNA sequence) and putative protein sequences specific to soybean were downloaded using the BioMart online tool available at the website. Every sequence was individually compared with functional annotations by browsing the soybean genome database at www.phytozome.net/cgi-bin/gbrowse/soybean, resulting in the identification of 66 MIPs for further analyses. The unclassified MIPs were assigned to isoforms by comparing the phylogenetic relationships of their putative protein sequences with those of clearly classified MIPs from soybean and Arabidopsis, downloaded from http://www.phytozome.net/search.php?show=blast and http://www.uniprot.org/uniprot/, respectively.
Multiple alignment, phylogenetic, and domain analysis
The sixty-six MIPs were aligned using ClustalX2 (http://www.ebi.ac.uk/Tools/clustalx2/index.html). The unrooted phylogenetic tree was constructed by the neighbor-joining method using TreeView software. The transmembrane regions were detected using the online tool available at http://www.ch.embnet.org/software/TMPRED_form.html.
In silico subcellular localization prediction, gene expression analysis and computation of Ka/Ks values
Protein subcellular localization was predicted using the online tool WoLF PSORT available at http://wolfpsort.org/. In silico gene expression data were obtained by entering the locus name into the online search tool at www.soybase.org. The Ka/Ks values of the GmMIPs were calculated using the online computation service at http://services.cbu.uib.no/tools/kaks, where Ka and Ks are the numbers of non-synonymous and synonymous substitutions per site, respectively. Ka/Ks > 1 indicates gene evolution under positive selection, Ka/Ks < 1 indicates purifying (stabilizing) selection, and Ka/Ks = 1 suggests a lack of selection or possibly a combination of positive and purifying selection at different points within the gene that cancel each other out.
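The interpretation of the Ka/Ks ratio described above can be written as a small function. The following Python sketch is illustrative only: the function name and the tolerance used to treat a ratio as equal to 1 are assumptions, and it does not reproduce the online service used in this study.

```python
def classify_selection(ka: float, ks: float, tol: float = 1e-6) -> str:
    """Interpret a Ka/Ks ratio with the thresholds given in the text.

    ka, ks: non-synonymous and synonymous substitutions per site.
    tol:    hypothetical tolerance for treating the ratio as exactly 1.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral or mixed selection"
    if ratio > 1.0:
        return "positive selection"
    return "purifying selection"
```

For example, hypothetical values of Ka = 0.05 and Ks = 0.50 give a ratio of 0.1, which would be classed as purifying selection.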
Identification of specificity determining positions (SDPs)
The aligned sequences of GmMIPs were inspected manually for SDPs following the prediction approach explained elsewhere and were grouped into functional groups. The sequences of the functional groups were aligned using ClustalW with output in GDE format. The SDPs were predicted using SDPpred (http://bioinf.fbb.msu.ru/SDPpred/algo.html; last accessed November 2012), and positions where the Z-scores exceeded the Bernoulli estimator threshold were considered SDPs.
Plant materials and growing conditions
Seeds of soybean, Glycine max var. Sudou 3, were used to grow seedlings from which total RNA was extracted for expression analysis of MIPs in the following experiments. The soybean plants were grown in 10 cm diameter pots placed in the greenhouse/field at 28/25°C day/night temperatures, with a 12 h photoperiod and 75% humidity. The agronomic requirements of soybean were followed and kept uniform for all plants.
Expression analysis of GmMIPs in various tissues
Roots, stems, leaves, flowers and young pods were harvested separately. Total RNA was extracted from leaves, roots and stems of plants at the three-leaf stage, and from flowers and pods at maturity. Semi-quantitative polymerase chain reaction (semi-qPCR), as described in the following paragraphs, was used for expression analysis.
Drought-inducible (water withholding) expression analysis of GmMIPs
The soybean plants were grown following the protocols described in the preceding paragraphs. Water application was withheld at the three leaf stage. Roots were harvested at 0, 7, 14 and 21 d from both stressed (drought) and unstressed control plants for total RNA extraction. Semi-qPCR was used for expression analysis.
Polyethylene glycol inducible expression analysis of GmMIPs
The soybean plants were grown following the protocols described above. At the three leaf stage, plants were carefully up-rooted to avoid root injury. Up-rooted plants were immediately transferred to 80 mL glass tubes containing 20% polyethylene glycol (PEG) and placed in the growth chamber. Up-rooted plants were also transferred to a control treatment in 80 mL glass tubes with no PEG. Roots were harvested at 0, 2, 4, and 12 h of PEG stress for RNA extraction. Real time or qPCR was used for expression analysis as per the conditions described in following paragraphs.
Total RNA isolation and RT-PCR
Total RNA was extracted from the collected samples for expression analyses using TRIzol® reagent (Invitrogen & Co.) following the manufacturer's instructions and quantified using a BioPhotometer (Eppendorf). First-strand cDNA was synthesized through reverse transcription PCR (RT-PCR) using avian myeloblastosis virus (AMV) reverse transcriptase (Promega, USA). One µg of RNA was used as the template to produce cDNA in a total reaction volume of 25 µL, following the manufacturer's instructions.
The semi-quantitative polymerase chain reaction (semi-qPCR) mixture consisted of 2.5 µL of 10× PCR buffer, 1.5 µL of 25 mmol/L MgCl2, 0.5 µL of dNTPs (10 mmol/L), 1.0 µL of each primer (10 µmol/L), 17.3 µL of PCR-grade water, 0.2 µL of rTaq (5 U/µL), and 1.0 µL of template, i.e. the reaction product (cDNA) from RT-PCR. The constitutively expressed gene GmTubulin (accession number: XM_003550379; forward primer, 5′-AACCTCCTCCTCATCGTACT-3′; reverse primer, 5′-GACAGCATCAGCCATGTTCA-3′) was used as the internal control. The primer sequences, positions and expected product sizes are given in Table S1. The cycling parameters of semi-qPCR consisted of an initial denaturation at 94°C for 3 min; 27 cycles of denaturation at 94°C for 45 s, annealing at 55°C for 45 s, and extension at 72°C for 1 min; and a final extension at 72°C for 5 min. The semi-qPCR products were separated on a 1.0% agarose gel. The gel was viewed with a high-performance CCD camera fitted in a Peiqing gel photo system (Shanghai Peiqing Science and Technology Co., Ltd, Shanghai, China). Quantification of the bands and normalization were performed as described previously. Three independent repeats of the semi-qPCR experiments were carried out.
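As a quick arithmetic check of the semi-qPCR recipe above, the component volumes can be summed. This Python sketch assumes two primers per reaction (forward and reverse, 1.0 µL each); the dictionary keys and function name are illustrative and not part of the original protocol.

```python
# Component volumes in microlitres, transcribed from the semi-qPCR recipe.
SEMI_QPCR_MIX_UL = {
    "10x PCR buffer": 2.5,
    "MgCl2": 1.5,
    "dNTP": 0.5,
    "forward primer": 1.0,   # assumption: "each primer" means two primers
    "reverse primer": 1.0,
    "PCR-grade water": 17.3,
    "rTaq": 0.2,
    "cDNA template": 1.0,
}


def total_volume_ul(mix: dict) -> float:
    """Sum the component volumes, rounded to 0.1 uL to avoid float noise."""
    return round(sum(mix.values()), 1)


if __name__ == "__main__":
    print(f"Total reaction volume: {total_volume_ul(SEMI_QPCR_MIX_UL)} uL")
```

Under this reading the listed components add up to 25.0 µL per reaction.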
Real time or qPCR
Real-time PCR (qPCR) was performed using a real-time PCR detection system (F. Hoffmann-La Roche Ltd, www.roche.com) with the SYBR® green supermix. Primers for qPCR were designed with the Primer Premier 5.0 program (http://www.premierbiosoft.com/crm/jsp/com/pbi/crm/clientside/ProductList.jsp) and are listed in Table S1. Sample preparation and qPCR analysis were conducted following the SYBR® Premix Ex Taq™ (Perfect Real Time) protocol. Each 10 µL reaction consisted of 5 µL of SYBR® Premix Ex Taq™ (TaKaRa Bio Inc; Shiga, Japan, http://www.takara-bio.com/), 0.8 µL of each primer (forward and reverse, each 10 µmol/L), 1 µL of template, i.e. the reaction product (cDNA) from RT-PCR, and 2.4 µL of ddH2O. The soybean GmPEPC gene (accession number: NM_001250673; forward primer, 5′-TTCCTTTATCAGAAATAACGAGTTTAGCT-3′; reverse primer, 5′-TGTCTCATTTTGCGGCAGC-3′) was used as the internal control (reference) to detect the expression of the target MIPs. An equal amount of cDNA template was used for each sample, including the internal control. PCR amplification conditions were as follows: an initial denaturation step for 10 min at 95°C; 40 cycles of quantification consisting of denaturation for 10 s at 95°C, annealing for 20 s at 58°C, and extension for 30 s at 72°C; and a final melting curve analysis to confirm the specificity of the PCR product. Relative gene expression was calculated using the change in threshold cycle (ΔCt) method described by Winer, in accordance with the manufacturer's instructions. Specific gene expression levels were considered unavailable (N/A) if Ct (gene) was >30 or <15. The qPCR analysis was repeated in three independent experiments.
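The ΔCt calculation and the Ct filter described above can be sketched as follows. This is a minimal Python illustration, assuming that relative expression is taken as 2^(-ΔCt) with perfect doubling per cycle; the exact formula is not stated in the text, and the function names and Ct values are hypothetical.

```python
from typing import Optional

CT_MIN, CT_MAX = 15.0, 30.0  # target Ct outside this range is reported as N/A


def delta_ct(ct_target: float, ct_reference: float) -> Optional[float]:
    """Return Ct(target) - Ct(reference), or None when the target Ct is N/A."""
    if ct_target > CT_MAX or ct_target < CT_MIN:
        return None
    return ct_target - ct_reference


def relative_expression(ct_target: float, ct_reference: float) -> Optional[float]:
    """Relative expression as 2^(-dCt); assumes 100% amplification efficiency."""
    d = delta_ct(ct_target, ct_reference)
    return None if d is None else 2.0 ** (-d)


if __name__ == "__main__":
    # Hypothetical Ct values for a target GmMIP and the GmPEPC reference.
    print(relative_expression(22.0, 20.0))
    print(relative_expression(31.0, 20.0))
```

Here a target gene that crosses the threshold two cycles later than the reference is scored at one quarter of the reference level, while a target Ct above 30 is returned as `None` to mirror the N/A rule.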
The soybean GmMIP genes, nomenclature and their distribution
By mining the database of soybean MIPs, we identified 66 different GmMIPs (Table 1). The nomenclature of GmMIPs was established using phylogenetic relationships with known genes of Physcomitrella patens, Arabidopsis thaliana, Zea mays and Oryza sativa (Fig. S1), as done previously for MIPs of other species. When aligned and compared using the Clustal-X/TreeView programs, the deduced protein sequences separated into five major branches (PIP, TIP, NIP, SIP and XIP; Fig. 1 and Fig. S2). These branches are consistent with the nomenclature currently developing in this field.
A comprehensive phylogenetic analysis was conducted to establish groups of homology within the GmMIP gene family. The 66 GmMIPs were classified into subfamilies comprising 22 GmPIPs, 23 GmTIPs, 13 GmNIPs, six GmSIPs and two new GmXIPs, i.e., uncharacterized isoforms (Fig. 1 and Tables 1 & 2). The GmXIPs might be similar to the novel plant AQP subfamily XIP recently reported in moss and tomato. All members of the GmPIP subfamily were predicted to localize to plasma membranes (Table 1). The predicted localization of members of the GmTIP subfamily was diverse and included cytosol, plasma membrane, endoplasmic reticulum, vacuole, mitochondria and chloroplast. Within the GmNIP subfamily, GmNIP1;1 and GmNIP4;1 were predicted to localize to vacuoles, GmNIP2;2 to extra-cellular structures and the rest to plasma membranes. Two members of the GmSIP subfamily were predicted to localize to vacuoles and the remainder to plasma membranes. GmXIP1;1 was predicted to localize to extra-cellular structures and GmXIP1;2 to the cytosol. The Ka/Ks ratio was >1 for PseudoPIP#3 and PseudoSIP#1, and equal to 1 for PseudoPIP#2. The remaining GmMIPs showed a Ka/Ks ratio <1 (Table 1; see also Table S2 and Fig. S3 for details).
The GmMIPs were distributed throughout the soybean genome except for chromosome 17 (Table 2). Every other chromosome carried between one (chromosome 14) and six (chromosome 11) GmMIPs. Thirty-one GmMIPs are on the + strand of the double-stranded DNA.
Exon-intron structure analysis
The 66 GmMIP sequences were also analyzed for the distribution of introns and exons; the results are shown in Fig. 2 (see also Fig. S4). The number of introns ranged from zero (in GmSIP1;5 and GmSIP1;6) to five (in GmNIP1;4 and GmNIP4;1). The division into five subfamilies based on comparison of the deduced protein sequences (see above) was mirrored in the intron-exon structures. All GmPIPs contained three introns except GmPIP2;13, which contained four (intron 1 being the additional intron). For the GmTIPs, two introns were usual, but three genes (GmTIP1;7, GmTIP1;8 and GmTIP1;9) contained only one intron, while one gene (GmTIP5;1) contained three. GmNIPs contained a variable number of introns, with the majority having four. GmNIP5;1 lacked one intron (intron 2) and GmNIP6;2 lacked two (introns 2 and 3), while GmNIP1;4 and GmNIP4;1 each had five. GmSIPs contained two introns except GmSIP1;5 and GmSIP1;6, which had none. GmXIPs have a single intron. All the AtPIPs and PtPIPs and some of the OsPIPs have three introns; however, some OsPIPs contain fewer, and one member, OsPIP2;8, contains no intron. In AtTIPs, PtTIPs and OsTIPs, the pattern was more varied; the majority of the genes contained two introns and the others either one or none. Most AtNIPs, PtNIPs and OsNIPs were observed to have four introns, while the AtSIPs, PtSIPs and OsSIPs have two.
S stands for strand, NOI for number of introns, E for exon and I for intron.
The intron insertion positions were different among the five sub-families and also varied within sub-families. Intron length varied widely in the range of 30 to 8089 nucleotides. The length of each exon was similar for most members in each subfamily, however, deviations were also noted.
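The usual intron counts per subfamily reported in this section can be encoded as a lookup to flag genes that deviate from them. The Python sketch below is illustrative: the helper names are hypothetical, and the name parsing simply assumes the GmXXXn;m naming pattern used in this paper.

```python
import re

# Usual intron count per subfamily, as reported in the text.
TYPICAL_INTRONS = {"PIP": 3, "TIP": 2, "NIP": 4, "SIP": 2, "XIP": 1}


def subfamily(gene_name: str) -> str:
    """Extract the subfamily from a name such as 'GmPIP2;13'."""
    match = re.match(r"^Gm([A-Z]IP)\d", gene_name)
    if match is None:
        raise ValueError(f"unrecognized GmMIP name: {gene_name}")
    return match.group(1)


def is_atypical(gene_name: str, n_introns: int) -> bool:
    """True when the intron count differs from the subfamily's usual count."""
    return n_introns != TYPICAL_INTRONS[subfamily(gene_name)]


if __name__ == "__main__":
    # Intron counts for these genes are taken from the text.
    for name, count in [("GmPIP2;13", 4), ("GmTIP5;1", 3), ("GmSIP1;5", 0)]:
        print(name, "atypical" if is_atypical(name, count) else "typical")
```

All three genes in the usage example are exceptions named in the text, so each is flagged as atypical.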
Paradigm of GmMIPs function
The specificity of each MIP as a true water facilitator (AQP), a glycerol facilitator (GlpF) or a transporter of other substrates such as ammonia, boron and urea was characterized following previously deduced rules of sequence comparison, aquaporin specificity and phosphorylation sites, the aromatic/arginine (ar/R) filter and non-aqua substrate specificity filters. Multiple alignments were carefully inspected to identify residues that are directly linked to substrate specificity/function.
Five discriminating positions were identified, which were conserved within each subfamily but differed between the subfamilies. These positions are located in highly conserved regions and can be easily retrieved from any sequence (Fig. 3 and Table 3). Table 3 shows that the five discriminating residues and two highly conserved NPA domains characterized all GmMIPs as water facilitators, with some ambiguity among the GmNIPs: some GmNIPs have residues that are characteristic of the GlpF type.
PFR stands for position of first residue in each sequence segment. The positions P1 to P5 predicted to have a functional role in MIP proteins are boxed. The positions P6 and P7 indicate the positions of residues to differentiate between the subfamilies.
We also identified two additional adjacent residues within the same highly conserved region on the basis of the 66 GmMIP sequences, which in combination characterize the five subfamilies. In transmembrane helix H3, the 6th and 7th residues following the most conserved Q residue are VA in AQP1 and CA in GlpF. At these two positions, GmPIPs have C/S/V and G, GmTIPs have A/I/M/T/V and A, GmNIPs have C/L/S and A, GmSIPs have G/V and G, and GmXIPs have I and G, respectively.
AQP1 and GlpF were used as structural templates for the homology comparison of the ar/R region in GmMIPs. In AQP1, the ar/R region is formed by Phe-58 (H2), His-182 (H5), Cys-191 (LE1), and Arg-197 (LE2; Fig. 4). In all the GmPIPs, the ar/R region is formed by Phe (H2), His (H5), Thr (LE1), and Arg (LE2). The ar/R region of GmPIPs is thus closely similar to that of AQP1, except that the LE1 residue is Thr rather than Cys.
GmTIPs show three different ar/R groups: Group I, comprising IA (GmTIP1;1–8) and IB (GmTIP1;9); Group II, comprising IIA (GmTIP2;1–7), IIB (GmTIP3;3–4 and GmTIP4;1–2) and IIC (GmTIP3;1–2); and Group III (GmTIP5;1). Homology comparisons of GmTIPs from Groups I and II show that the ar/R regions have a conserved His residue at the H2 position and a conserved Ile residue at the H5 position. His at the H2 position is replaced by Ser in Group III, while Groups IB and III possess Val at the H5 position. The loop E residue of GmTIPs at the LE1 position is either Ala (Groups I, IIB and IIC) or Gly (Groups IIA and III). At LE2, Group I GmTIPs contain Val, Groups IIA and IIB contain the highly conserved Arg residue, Group IIC contains Leu and Group III contains Cys.
Nodulin 26 is a well-studied GmNIP that has been identified as an aquaglyceroporin. Six GmNIPs (GmNIP1;1–5 and GmNIP4;1; Group I) possess the conserved ar/R tetrad motif of Nodulin 26. Group IIA (GmNIP5;1) and Group IIB (GmNIP7;1–2) GmNIPs have a divergent ar/R tetrad with the substitution of an Ala for Trp at position H2 and Gly for Ala at LE1. Group IIA has a further substitution of Ile for Val at the H5 position. Group III GmNIPs (GmNIP2;1–2) deviate from Group II with substitutions of Gly and Ser at the H2 and H5 positions. GmNIP6;1 is divergent from Group II with substitutions of Asn and Ser at the H2 and LE1 positions, respectively, while in GmNIP6;2 Thr replaces Ala at the H2 position. GmNIP5;1, GmNIP6;1 and GmNIP6;2 also possess a substitution within the NPA motif in loop E, where a bulkier Val residue replaces the conserved Ala residue (i.e. NPV).
Analysis of the ar/R regions of GmSIPs suggests that three different combinations of residues are formed, Group I (GmSIP1;1–2), Group II (GmSIP1;3–4) and Group III (GmSIP1;5–6). Group I shows Ile, Thr, Pro and Phe residues at H2, H5, LE1 and LE2, respectively. The residues Val, Met, Pro and Asn in Group II are present at H2, H5, LE1 and LE2, respectively. In Group III, Asn and Ile replaced Val and Met of Group II at H2 and H5 positions. The GmSIP1;1–4 possess the NPT sequence and the GmSIP1;5–6 have the NPS sequence in place of the characteristic first NPA motif in loop B.
The GmXIPs show a divergent ar/R region: three residues, at H2, H5 and LE1, differ from those of AQP1 and GlpF. The residues at H2 and H5 are Val and Ile, respectively, in both GmXIPs. At the LE1 position, GmXIP1;2 has Ala where GmXIP1;1 has Val. GmXIP1;2 also contains substitutions within both NPA motifs: Ile replaces Ala in the first NPA motif, while Ser replaces Asn in the second. GmXIP1;1 has substitutions within the first NPA motif, where Ser and Val replace Asn and Ala, respectively (i.e., SPV).
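The position-by-position comparison of ar/R tetrads used in this section can be sketched in Python. Only tetrads that the text spells out in full (AQP1, the GmPIPs and the three GmSIP groups) are transcribed; the dictionary labels and function name are illustrative, not an established tool.

```python
POSITIONS = ("H2", "H5", "LE1", "LE2")

# One-letter residues at H2, H5, LE1 and LE2, transcribed from the text.
AR_R_TETRADS = {
    "AQP1": "FHCR",
    "GmPIP": "FHTR",
    "GmSIP group I": "ITPF",
    "GmSIP group II": "VMPN",
    "GmSIP group III": "NIPN",
}


def tetrad_differences(a: str, b: str) -> dict:
    """Return {position: (residue_in_a, residue_in_b)} where two tetrads differ."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("an ar/R tetrad must have exactly four residues")
    return {pos: (x, y) for pos, x, y in zip(POSITIONS, a, b) if x != y}


if __name__ == "__main__":
    print(tetrad_differences(AR_R_TETRADS["AQP1"], AR_R_TETRADS["GmPIP"]))
```

Comparing AQP1 with the GmPIP tetrad reports a single difference at LE1 (Cys versus Thr), matching the description in the text.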
All GmMIPs were analyzed for putative specificity determining positions (SDPs) for non-aqua substrates. SDPs for five substrates (ammonia, boric acid, CO2, H2O2, and urea) were observed, while SDPs for silicic acid were not observed in GmMIPs (Table 4). Most GmPIPs, GmTIPs and GmNIPs contained these SDPs, while none were present in GmSIPs and GmXIPs (Fig. S5).
Regulation of MIP functions
The phosphorylation sites in the C-terminal domain and the N-terminus were detected and are presented in Fig. 4. A highly conserved Ser residue is present in the N-terminal motif, RKXSXXR/K or only KXSXXR/K, which is conserved in all GmPIPs except GmPIP1;2. This motif is also conserved in GmTIPs, GmNIPs, GmSIPs and GmXIPs, but Ser is usually replaced by Thr in GmTIPs, GmSIPs and GmXIPs, and by Pro in GmNIPs. Thr also replaces Ser in GmPIP1;2. His and Arg replace Ser in GmSIP1;5 and GmSIP1;6, respectively. The C-terminal motif SFRS is present in all GmPIP2 subfamily members, while the PFK/ST/S motif is present in all GmPIP1 subfamily members. The motif KXXSXXK is present in GmNIP1 subfamily members including NOD26, the KSXXR motif in GmNIP2 subfamily members, KIFKT in GmNIP4;1, XSFRR in GmNIP5;1, GmNIP6;1 and GmNIP6;2, and XPFCS in GmNIP7 subfamily members. All GmTIP, GmSIP and GmXIP subfamily members lack this phosphorylation motif in the C-terminal region.
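The N-terminal motif RKXSXXR/K (or KXSXXR/K) described above maps directly onto a regular expression, with X read as any residue. The Python sketch below is a hypothetical helper, and the peptide strings in the usage example are invented for illustration rather than taken from GmMIP sequences.

```python
import re
from typing import Optional, Tuple


def find_nterm_motif(seq: str, allow_thr: bool = False) -> Optional[Tuple[int, str]]:
    """Find the first (R)KXSXXR/K motif in a protein sequence.

    With allow_thr=True the central Ser may also be Thr, the variant the
    text reports for GmTIPs, GmSIPs, GmXIPs and GmPIP1;2.
    Returns (0-based start index, matched text) or None.
    """
    site = "[ST]" if allow_thr else "S"
    pattern = re.compile(rf"R?K.{site}..[RK]")
    match = pattern.search(seq.upper())
    return None if match is None else (match.start(), match.group())


if __name__ == "__main__":
    # Invented peptides, for illustration only.
    print(find_nterm_motif("GGRKLSAARGG"))
    print(find_nterm_motif("GGRKLTAARGG"))
    print(find_nterm_motif("GGRKLTAARGG", allow_thr=True))
```

A regular expression only locates candidate sites; it says nothing about whether a residue is actually phosphorylated.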
Expression of GmMIP genes in various plant organs
The in silico expression of 23 GmMIPs in roots was >50 (Table 1). Based on in silico root-specific expression and phylogenetic relationships (Fig. 1), 24 GmMIPs were selected for tissue-specific expression analysis by semi-qPCR (Fig. 5), and their expression patterns in various organs of soybean were determined. GmPIP1;3 was strongly expressed in roots, stems, leaves and pods. Other strongly expressed genes included GmPIP1;4 in stems and pods, GmPIP2;3 in roots, flowers and pods, GmTIP1;4 in flowers, GmTIP1;7 in stems, leaves, flowers and pods, GmTIP1;9 in roots, GmTIP2;1 in stems and pods, GmTIP2;2 in roots and pods, GmTIP2;6 in stems, leaves, flowers and pods, and GmXIP1;2 in leaves. GmPIP2;6 and GmTIP4;1 were expressed marginally, while GmPIP2;11 expression was marginally low in all plant parts. The rest of the GmMIPs gave weak, marginal, marginally low or null expression.
Screening for dehydration-inducible GmMIP genes
The expression patterns of selected GmMIPs that are expressed in roots (Fig. 5) and located on chromosomes 11, 12 and 13 (Table 1) were further analyzed at 0, 7, 14 and 21 d of water stress using semi-qPCR (Fig. 6). Tubulin was used as the internal control. Because GmTIP1;7 is an isoform of AtTIP1;1 or AtTIP1;2, which are known as salt-induced tonoplast intrinsic proteins (SITIP), it was specifically included in the semi-qPCR analysis under drought. Most of the GmMIPs were expressed stably in roots, with slight deviations at various times of water stress. The expression of GmPIP1;8 peaked significantly at 7 d of stress, whereas GmTIP1;7 was expressed only at 7 d without watering and GmPIP2;4 only after 21 d of water stress.
The six MIPs were selected from the 25 GmMIPs expressed in roots (see Fig. 5) and located on chromosomes 11, 12 and 13 (see Table 1), as each of these chromosomes contains the highest number of MIPs. The expression pattern of control plants (watered from the three-leaf stage until 21 d) at the various time points was similar to that at 0 d of no watering and is therefore not shown.
We compared the expression levels of 14 GmMIP genes with GmPEPC (internal control) in roots of soybean by qPCR at 0, 2, 4 and 12 h of 20% PEG treatment (Fig. 7). The relative expression (RE) of GmPIP1;4, GmPIP1;7, GmPIP2;3, GmPIP2;4, GmPIP2;5, GmTIP1;9 and GmTIP2;2 significantly increased (>2-fold) at 12 h compared to 0 h PEG stress. However, GmPIP2;3 first showed a significant decrease in RE before peaking at 12 h PEG stress. GmPIP1;4, GmPIP1;7 and GmTIP1;9 showed a gradual increase from 0 to 4 h followed by a rapid increase at 12 h. GmPIP1;8 showed significant decrease in RE at 2 and 4 h PEG stress. The expression pattern of GmPIP1;3 decreased 2 h after application of PEG, and began to increase at 4 h and peaked at 12 h after application. The expression of GmPIP2;10 and GmPIP2;11 decreased gradually to 4 h and expression was maximized 12 h post application of PEG. The expression pattern of GmTIP4;1 increased at 2 h and decreased gradually with minimum expression at 12 h. GmTIP2;6 expressed 2 and 4 h after application of PEG stress. The RE of the remaining GmMIPs fluctuated non-significantly at various time points after application of PEG stress.
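The >2-fold criterion used above to call a gene induced at 12 h relative to 0 h can be sketched as follows. The Python function names and the relative expression values are hypothetical; the sketch covers only the fold-change threshold and not the statistical significance testing.

```python
def fold_change(re_by_hour: dict, start: int = 0, end: int = 12) -> float:
    """Ratio of relative expression at `end` hours to that at `start` hours."""
    baseline = re_by_hour[start]
    if baseline <= 0:
        raise ValueError("baseline relative expression must be positive")
    return re_by_hour[end] / baseline


def is_induced(re_by_hour: dict, threshold: float = 2.0) -> bool:
    """True when expression at 12 h exceeds `threshold` times the 0 h level."""
    return fold_change(re_by_hour) > threshold


if __name__ == "__main__":
    # Hypothetical relative expression values at 0, 2, 4 and 12 h of PEG stress.
    example = {0: 1.0, 2: 0.6, 4: 1.1, 12: 2.5}
    print(fold_change(example), is_induced(example))
```

A profile like the hypothetical one in the usage example, which dips at 2 h before rising at 12 h, still counts as induced because only the end points are compared.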
Paradigm of GmMIPs function
The water channel protein family is part of the MIP superfamily of proteins. MIP proteins are highly diversified in plants and thus likely influence plant responses to external stresses. Most, if not all, MIP proteins can be classified into two physiological groups: (1) AQPs, which transport water, and (2) GlpFs, which transport small neutral solutes such as glycerol. There is a general view that most AQPs in plants regulate water flow and that a subset may facilitate the movement of glycerol or other small molecules. Considering these two main functions and following the rules of sequence comparison, aquaporin specificity and phosphorylation sites, and ar/R selectivity filters, the present work focuses on the characterization of functional residues in the MIP proteins found in Glycine max. Most MIPs found in soybean are of the AQP type.
We identified a total of 66 aquaporin genes in soybean. The number of aquaporin genes described in this study is roughly double that of rice, maize, tomato and Arabidopsis. It can be speculated that the palaeopolyploid nature of soybean resulted in duplication of these genes across the genome. These 66 genes separated into five subfamilies. The largest subfamilies were GmPIP and GmTIP, which contained 22 and 23 genes, respectively (Fig. 1 and Table 1). The deduced rules of putative protein sequence comparison showed that all GmPIPs and GmTIPs have features similar to AQPs; however, the GmPIP residues at P6 (in H3) are similar to those of GlpF (Table 3, Fig. 3). The N-terminal phosphorylation site and ar/R selectivity filter showed that all GmPIPs are AQP-like (Fig. 4). The C-terminal phosphorylation site in GmPIP1 subfamily members, where Pro replaces Ser, is not similar to the AQP type, whereas that of GmPIP2 members is similar to AQP (Fig. 4). The AtPIP2 group are described as good AQPs in the Xenopus laevis expression system, whereas AtPIP1 proteins often cause lower osmotic water permeability (Pf) values in this expression system. It is speculated that AtPIP1 AQPs could be responsible for the transport of as yet unidentified solutes across the plasma membrane.
The twenty-three GmTIPs were classified into three groups based on the ar/R filter (Fig. 4). AtTIPs have likewise been classified into three groups. Subgroups IA and IIC are unique to the present study. NtTIP1 has been shown to transport water, glycerol, and urea when expressed in X. laevis oocytes.
All GmNIPs were classified into four subgroups. GmNIP2 and GmNIP6 subfamily members, grouped in subgroups III and IV respectively, appeared novel based on the ar/R filter. NIPs have previously emerged as an interesting AQP subclass in terms of transport specificity, and they are subdivided into two subgroups based on the predicted structure of their selectivity filter. It has been reported that phosphorylation of Nodulin-26 on Ser 262 enhanced water permeability and that phosphorylation is stimulated by drought. AtNIPs have been reported to be classified into two subgroups. Mixed transport activities of NIPs have been observed in different organisms; for example, GmNOD26 can form a functional water channel and show glycerol permease activity in X. laevis oocytes. AtNIP1;1 was predicted to be an AQP when expressed in X. laevis oocytes. The amino acids of the ar/R filter in subgroup III consisted of Gly, Ser, Gly, and Arg (GSGR), compared with Ala, Ile, Gly, and Arg (AIGR) in subgroup IIA (Fig. 4), which are comparable with OsLsi1 and AtNIP5;1 respectively. The residue at the H5 position of the ar/R filter of both OsLsi1 and AtNIP5;1 was revealed to play a key role in membrane permeability to silicic acid (Si) and boric acid (B), although there is a relatively low selectivity for arsenite (As). Previous reports show that the G/A substitution at H2 (GSGR to ASGR) does not affect the transport activity of OsLsi1 for Si, As, B and water. The S/I substitution at H5 (GSGR to GIGR or AIGR) resulted in loss of Si, B and water transport activity; As transport, however, was reduced by 60% rather than lost entirely. AtNIP5;1, with AIGR residues in the ar/R filter, was able to transport boric acid, water, and arsenite, but not silicic acid. A single or double mutation at H2 and/or H5 did not result in any Si transport activity.
In contrast, both single and double mutations at the H2 and/or H5 positions showed As transport activity and H5 I/S substitution led to decreased B transport activity while water permeability remained unaffected. The GmNIPs in subgroup III can be speculated to transport Si, As, B and water while those in subgroup IIA may transport As, B and water.
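The speculation above amounts to a lookup from the ar/R tetrad to a transport profile. The following is a minimal sketch, assuming only the two tetrads discussed here (GSGR and AIGR) and treating the first two tetrad positions as H2 and H5, as the substitutions described in the text imply. The names `ARR_PROFILES`, `predict_substrates` and `substitute` are hypothetical, and the profiles are speculative predictions, not measured activities.

```python
# Speculated transport profiles keyed by ar/R tetrad, taken from the discussion above.
ARR_PROFILES = {
    "GSGR": {"Si", "As", "B", "water"},  # OsLsi1-like filter, GmNIP subgroup III
    "AIGR": {"As", "B", "water"},        # AtNIP5;1-like filter, GmNIP subgroup IIA
}

# Index of the two tetrad positions named in the text (assumption: H2 first, H5 second).
POSITIONS = {"H2": 0, "H5": 1}


def predict_substrates(tetrad):
    """Return the speculated substrates for a tetrad, or None if it is not covered."""
    return ARR_PROFILES.get(tetrad.upper())


def substitute(tetrad, position, residue):
    """Return the tetrad with one residue replaced at H2 or H5."""
    i = POSITIONS[position]
    return tetrad[:i] + residue + tetrad[i + 1:]


# The G/A substitution at H2 followed by the S/I substitution at H5
# converts the subgroup III filter into the subgroup IIA filter.
step1 = substitute("GSGR", "H2", "A")   # ASGR
step2 = substitute(step1, "H5", "I")    # AIGR
print(step1, step2, sorted(predict_substrates(step2)))
```

A tetrad outside the table, such as the VIVR filter of GmXIP1;1, returns `None` rather than a guessed profile, which keeps the sketch from over-predicting.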
The GmSIP subfamily contained six diverse members, which separated into three unique subgroups based on the ar/R selectivity filter. The GmXIP subfamily contained two members, which also differ from each other. The members of the GmNIP and GmSIP subfamilies were divergent while those of the GmPIP subfamily were more similar (Fig. 1). Similarity among members of the PIP subfamily and divergence within the NIP subfamily have also been reported in Arabidopsis, tomato and cotton. The distribution of GmMIPs among the five subfamilies in soybean was similar to observations in species such as Arabidopsis, rice and maize. However, the XIP subfamily, represented by two members in soybean, has only recently been reported in moss (see Fig. S1 for comparison of XIPs of soybean and moss), tomato and cotton; it is present in mosses and dicots and has been lost in monocots. Some GmNIPs and a few GmPIPs have characteristics of both AQP and GlpF, and can be called aquaglyceroporins.
GmXIPs were compared with 35 previously reported XIPs for NPA motifs, intron/exon structure, P1-P5 residues, and ar/R filters. The NPA motifs of GmXIP1;1 are similar to those of PtXIP2;1, while the first motif of GmXIP1;2 is NPI, as reported for PtXIP1;1, PtXIP1;2 and PtXIP1;3. Several XIPs from fungi contain one intron, as observed in the present study. The ar/R filter of GmXIP1;1 (VIVR) is similar to that of PtXIP2;1 and/or RcXIP2;3.
MIPs also transport non-aqua substrates such as ammonia and urea. GmMIP specificity for non-aqua substrates was predicted in silico. Members of GmPIPs may facilitate transport of B, CO2, H2O2 and urea, while those of GmTIPs may transport H2O2 and urea (Fig. S5). The SDP analysis revealed that GmNIP2s (subgroup III) may work as H2O2 transporters and GmNIP5;1 (subgroup IIA) as both an H2O2 and a urea transporter; however, none of the GmMIPs were predicted by this analysis to be silicic acid transporters. GmNIP6;1 (subgroup IVA) and GmNIP6;2 (subgroup IVB) have SDPs for B and urea, while GmNIP6;2 may also transport H2O2. Some members of the GmNIPs may also transport ammonia. SDP analysis further predicted that ammonia and CO2 transport are specific characteristics of GmNIPs and GmPIPs, respectively. Such subfamily-specific characteristics have also been predicted in plant MIPs.
Expression analysis of GmMIPs
Specific tools such as cDNA hybridization, fine mapping based on in situ RT-PCR, and qPCR have recently been used to monitor expression of the whole aquaporin gene family. Hybridization of cDNAs to arrays carrying aquaporin gene-specific tags has revealed a coordinated down-regulation of aquaporin genes in response to water and nutrient stresses. Quantitative PCR analyses have been used to monitor expression of aquaporin transcripts in various tissues, organs or stress conditions. In the present study, we used semi-qPCR and qPCR analyses to establish the relationship between specific MIP abundance and drought tolerance in soybean roots. Among 24 soybean aquaporin genes selected on the basis of in silico expression in roots and phylogenetic relationship, our results indicated that at least ten genes were expressed in roots, and that the expression of four PIP-encoding and three TIP-encoding genes was significantly greater than that of the internal control (Fig. 5). Most of these TIPs and PIPs are located on chromosomes 11 and 12. In Arabidopsis, a similar pattern has been observed in roots, leaves and flowers, where three PIPs, five TIPs, seven NIPs and one SIP were expressed in roots. Orthologous MIPs in rice were also expressed in roots; however, the expression of OsTIP, OsNIP1;1, OsNIP1;4, OsNIP2;2, OsNIP3;3, OsNIP4;1 and OsSIP2;1 was weaker than the control. This study showed organ specificity of GmMIPs, as reported earlier in Arabidopsis and rice.
The genes expressed in roots and those located on chromosomes 11 and 12 were selected for subsequent testing of the water stress response. GmPIP2;4 was not expressed in any organ (Fig. 5) until 21 d after cessation of watering (Fig. 6). However, it was expressed in roots 12 h after growth in 20% PEG (Fig. 7). GmTIP1;7 was likewise not expressed in roots (Fig. 5) until 7 d without watering (Fig. 6). Such expression patterns were also reported for OsTIP4;2 in rice, where expression is very low in roots growing in normal conditions and increases significantly in roots after 10 h growth in 15% PEG and 8 h growth in 150 mM NaCl stress. It can be speculated that GmPIP2;4 and GmTIP1;7 are responsive to water stress, as they were only expressed after a period of water stress, and are thus putatively useful for improving drought tolerance. The qPCR analysis of nine GmPIPs and five GmTIPs further established the relationship of these GmMIPs with drought tolerance (Fig. 7). Three patterns were evident. The first two were down-regulation followed by up-regulation of GmMIPs, and vice versa, at different times after initiation of drought stress. The third type of expression pattern was revealed by qPCR analysis of GmPIP2;5 and GmTIP2;6: the former was not expressed in any plant organ (Fig. 5) but was expressed in roots after 12 h of drought stress (Fig. 7), while the latter was not expressed in roots (Fig. 5) but was expressed 2 and 4 h after application of drought stress (Fig. 7). Similar expression patterns of various genes have been studied in Arabidopsis for urea transport and for drought, and in rice for chilling and light reception and for drought and salinity stress. The expression pattern of GmMIPs after application of drought treatment reflected a coordinated regulation of MIP isoforms that collectively contribute to the whole-root water transport capacity. These GmMIPs are a potential resource for the genetic improvement of soybean drought tolerance.
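The three expression patterns described above can be illustrated with a small classifier over a time series of relative expression values. This is only a sketch: the function name, the detection threshold and the example values are hypothetical and are not data from Fig. 5-7, and real qPCR profiles would need replicate-aware statistics rather than simple sign checks.

```python
def classify_pattern(levels, detect=0.0):
    """Classify a relative-expression time course.

    levels: expression values at successive time points; the first value is the
            unstressed control.
    detect: values at or below this threshold are treated as "not expressed"
            (hypothetical cut-off).
    """
    baseline, rest = levels[0], levels[1:]
    if baseline <= detect:
        # Pattern 3: no expression without stress, expression after stress.
        return "induced" if any(v > detect for v in rest) else "not expressed"

    # Non-zero changes between consecutive time points.
    diffs = [b - a for a, b in zip(levels, levels[1:]) if b != a]
    if not diffs:
        return "unchanged"
    first, last = diffs[0], diffs[-1]
    if first < 0 and last > 0:
        return "down then up"   # pattern 1
    if first > 0 and last < 0:
        return "up then down"   # pattern 2
    return "monotonic or other"


# Hypothetical time courses (control, then increasing hours of stress).
examples = {
    "gene_a": [1.0, 0.4, 0.3, 1.5],
    "gene_b": [1.0, 2.0, 2.5, 0.8],
    "gene_c": [0.0, 0.0, 0.0, 1.2],
}
for gene, series in examples.items():
    print(gene, classify_pattern(series))
```

Only the first and last changes are compared, so a profile with several reversals is still assigned to one of the two biphasic patterns; a finer scheme would be needed to separate such cases.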
This study identified and characterized 66 soybean aquaporin genes. Phylogenetic analysis of amino acid sequences divided the large and highly similar multi-gene family into five subfamilies. These genes were further classified into twelve subgroups: GmPIPs fell into one subgroup, GmTIPs into three, GmNIPs into four, GmSIPs into three and GmXIPs into one. It can be speculated that GmMIPs contain true aquaporins, glyceroporins, aquaglyceroporins and mixed transport facilitators. However, their functionality remains to be properly validated. Our results indicate that the genes identified in this study represent an important genetic resource for the improvement of water use efficiency and/or drought tolerance, as well as for the transport of non-aqua substrates in soybean.
The primer sequences, positions and expected product sizes used in semi-qPCR and qPCR.
Phylogenetic analysis of 66 GmMIPs with those of moss, Arabidopsis, rice and maize to establish nomenclature for unknown genes.
Phylogenetic analysis of 66 GmMIPs with bootstrap test.
Ka/Ks-annotated evolutionary tree of 64 GmMIPs.
Exon/intron analysis of 66 GmMIPs. Exon 1 is shown in yellow, 2 in green, 3 in blue, 4 in red, 5 in pink and 6 in dark green.
Conceived and designed the experiments: DYZ ZA CBW JXY IAK RMT HXM. Performed the experiments: LX ZLX XQL XLH YHH. Analyzed the data: DYZ ZA YHH. Contributed reagents/materials/analysis tools: JXY IAK RMT. Wrote the paper: ZA RMT.
- 1. Denker BM, Smith BL, Kuhajda FP, Agre P (1988) Identification, purification, and partial characterization of a novel Mr 28,000 integral membrane protein from erythrocytes and renal tubule. The Journal of Biological Chemistry 263: 15634–15642.
- 2. Fortin MG, Morrison NA, Verma DPS (1987) Nodulin-26, a peribacteroid membrane nodulin is expressed independently of the development of the peribacteroid compartment. Nucleic Acids Research 15: 813–824.
- 3. Javot H, Maurel C (2002) The role of Aquaporins in root water uptake. Annals of Botany 90: 301–313.
- 4. Maurel C, Javot H, Lauvergeat V, Gerbeau P, Tournaire C, et al. (2002) Molecular physiology of aquaporins in plants. In: Thomas Zeuthen WDS, editor. International Review of Cytology: Academic Press. pp. 105–148.
- 5. Tyerman SD, Niemietz CM, Bramley H (2002) Plant aquaporins: multifunctional water and solute channels with expanding roles. Plant, Cell and Environment 25: 173–194.
- 6. Chaumont F, Barrieu F, Herman EM, Chrispeels MJ (1998) Characterization of a maize tonoplast aquaporin expressed in zones of cell division and elongation. Plant Physiology 117: 1143–1152.
- 7. Daniels MJ, Mirkov TE, Chrispeels MJ (1994) The plasma membrane of Arabidopsis thaliana contains a mercury-insensitive aquaporin that is a homolog of the tonoplast water channel protein TIP. Plant Physiology 106: 1325–1333.
- 8. Barkla BJ, Vera-Estrella R, Pantoja O, Kirch H-H, Bohnert HJ (1999) Aquaporin localization - how valid are the TIP and PIP labels? Trends in Plant Science 4: 86–88.
- 9. Gustafson CE, Katsura T, Mckee M, Bouley R, Casanova JE, et al. (2000) Recycling of AQP2 occurs through a temperature- and bafilomycin-sensitive trans-Golgi-associated compartment. American Journal of Physiology and Renal Physiology 278: F317–F326.
- 10. Jiang L, Rogers JC (1998) Integral membrane protein sorting to vacuoles in plant cells: evidence for two pathways. The Journal of Cell Biology 143: 1183–1199.
- 11. Boursiac Y (2005) Early Effects of Salinity on Water Transport in Arabidopsis Roots. Molecular and Cellular Features of Aquaporin Expression. Plant Physiology 139: 790–805.
- 12. Quigley F, Rosenberg JM, Shachar-Hill Y, Bohnert HJ (2002) From genome to function: The Arabidopsis aquaporins. Genome Biology 3: 1–17.
- 13. Chaumont F, Barrieu F, Wojcik E, Chrispeels MJ, Jung R (2001) Aquaporins constitute a large and highly divergent protein family in maize. Plant Physiology 125: 1206–1215.
- 14. Sakurai J, Ishikawa F, Yamaguchi T, Uemura M, Maeshima M (2005) Identification of 33 Rice Aquaporin Genes and Analysis of Their Expression and Function. Plant and Cell Physiology 46: 1568–1577.
- 15. Danielson JÅ, Johanson U (2008) Unexpected complexity of the Aquaporin gene family in the moss Physcomitrella patens. BMC Plant Biology 8: 45.
- 16. Sade N, Vinocur BJ, Diber A, Shatil A, Ronen G, et al. (2009) Improving plant stress tolerance and yield production: is the tonoplast aquaporin SlTIP2;2 a key to isohydric to anisohydric conversion? New Phytologist 181: 651–661.
- 17. Park W, Scheffler BE, Bauer PJ, Campbell BT (2010) Identification of the family of aquaporin genes and their expression in upland cotton (Gossypium hirsutum L.). BMC Plant Biology 10: 142.
- 18. Kaldenhoff R, Fischer M (2006) Functional aquaporin diversity in plants. Biochimica et Biophysica Acta (BBA) - Biomembranes 1758: 1134–1141.
- 19. Alexandersson E, Fraysse L, Sjövall-Larsen S, Gustavsson S, Fellert M, et al. (2005) Whole Gene Family Expression and Drought Stress Regulation of Aquaporins. Plant Molecular Biology 59: 469–484.
- 20. Aharon R (2003) Overexpression of a Plasma Membrane Aquaporin in Transgenic Tobacco Improves Plant Vigor under Favorable Growth Conditions but Not under Drought or Salt Stress. The Plant Cell Online 15: 439–447.
- 21. Heymann JB, Engel A (1999) Aquaporins: Phylogeny, Structure, and Physiology of Water Channels. News in physiological sciences : an international journal of physiology produced jointly by the International Union of Physiological Sciences and the American Physiological Society 14: 187–193.
- 22. Maurel C, Verdoucq L, Luu D-T, Santoni V (2008) Plant Aquaporins: Membrane channels with multiple integrated functions. Annual Review of Plant Biology 59: 595–624.
- 23. Uehlein N, Lovisolo C, Siefritz F, Kaldenhoff R (2003) The tobacco aquaporin NtAQP1 is a membrane CO2 pore with physiological functions. Nature 425: 734–737.
- 24. Chaumont F, Moshelion M, Daniels MJ (2005) Regulation of plant aquaporin activity. Biology of the Cell 97: 749.
- 25. Hove R, Bhave M (2011) Plant aquaporins with non-aqua functions: deciphering the signature sequences. Plant Molecular Biology 75: 413–430.
- 26. Marone M, Mozzetti S, Ritis D, Pierelli L, Scambia G (2001) Semiquantitative RT-PCR analysis to assess the expression levels of multiple transcripts from the same sample. Biological Procedures Online 3: 19–25.
- 27. Tuteja JH, Clough SJ, Chan W-C, Vodkin LO (2004) Tissue-specific gene silencing mediated by a naturally occurring chalcone synthase gene cluster in Glycine max. The Plant Cell 16: 819–835.
- 28. Winer J, Jung CKS, Shackel I, Williams PM (1999) Development and Validation of Real-Time Quantitative Reverse Transcriptase–Polymerase Chain Reaction for Monitoring Gene Expression in Cardiac Myocytes in Vitro. Analytical Biochemistry 270: 41–49.
- 29. Gupta A, Sankararamakrishnan R (2009) Genome-wide analysis of major intrinsic proteins in the tree plant Populus trichocarpa: Characterization of XIP subfamily of aquaporins from evolutionary perspective. BMC Plant Biology 9: 134.
- 30. Johanson U, Karlsson M, Johansson I, Gustavsson S, Sjövall S, et al. (2001) The complete set of genes encoding major intrinsic proteins in Arabidopsis provides a framework for a new nomenclature for major intrinsic proteins in plants. Plant Physiology 126: 1358–1369.
- 31. Froger A, Tallur B, Thomas D, Delamarche C (1998) Prediction of functional residues in water channels and related proteins. Protein Science 7: 1458–1468.
- 32. Johansson I, Karlsson M, Johanson U, Larsson C, Kjellbom P (2000) The role of aquaporins in cellular and whole plant water balance. Biochimica et Biophysica Acta 1465: 324–342.
- 33. Wallace IS (2004) Homology modeling of representative subfamilies of Arabidopsis major intrinsic proteins. Classification based on the aromatic/arginine selectivity filter. Plant Physiology 135: 1059–1068.
- 34. Mitani-Ueno N, Yamaji N, Zhao FJ, Ma JF (2011) The aromatic/arginine selectivity filter of NIP aquaporins plays a critical role in substrate selectivity for silicon, boron, and arsenic. Journal of Experimental Botany 62: 4391–4398.
- 35. Maurel C, Chrispeels MJ (2001) Aquaporins. A molecular entry into plant water relations. Plant Physiology 125: 135–138.
- 36. Johansson I, Karlsson M, Shukla VK, Chrispeels MJ, Larsson C, et al. (1998) Water transport activity of the plasma membrane aquaporin PM28A is regulated by phosphorylation. The Plant Cell 10: 451–459.
- 37. Guenther JF, Roberts DM (2000) Water-selective and multifunctional aquaporins from Lotus japonicus nodules. Planta 210: 741–748.
- 38. Wallace IS, Choi W-G, Roberts DM (2006) The structure, function and regulation of the nodulin 26-like intrinsic protein family of plant aquaglyceroporins. Biochimica et Biophysica Acta (BBA) - Biomembranes 1758: 1165–1175.
- 39. Guenther JF, Chanmanivone N, Galetovic MP, Wallace IS, Cobb JA, et al. (2003) Phosphorylation of soybean nodulin 26 on serine 262 enhances water permeability and is regulated developmentally and by osmotic signals. The Plant Cell Online 15: 981–991.
- 40. Rivers RL, Dean RM, Chandy G, Hall JE, Roberts DM, et al. (1997) Functional analysis of Nodulin 26, an Aquaporin in soybean root nodule symbiosomes. The Journal of Biological Chemistry 272: 16256–16261.
- 41. Weig A, Deswarte C, Chrispeels MJ (1997) The major intrinsic protein family of Arabidopsis has 23 members that form three distinct groups with functional aquaporins in each group. Plant Physiology 114: 1347–1357.
- 42. Maurel C (2007) Plant aquaporins: Novel functions and regulation properties. FEBS Letters 581: 2227–2236.
- 43. Li G-W, Peng Y-H, Yu X, Zhang M-H, Cai W-M, et al. (2008) Transport functions and expression analysis of vacuolar membrane aquaporins in response to various stresses in rice. Journal of Plant Physiology 165: 1879–1888.
- 44. Liu L-H, Ludewig U, Gassert B, Frommer WB, von Wirén N (2003) Urea Transport by Nitrogen-Regulated Tonoplast Intrinsic Proteins in Arabidopsis. Plant Physiology 133: 1220–1228.