Structural bioinformatics studies of glutamate transporters and their AlphaFold2 predicted water-soluble QTY variants and uncovering the natural mutations of L->Q, I->T, F->Y and Q->L, T->I and Y->F

Glutamate transporters play key roles in nervous physiology by modulating excitatory neurotransmitter levels; when they malfunction, they are involved in a wide range of neurological and physiological disorders. However, integral transmembrane proteins, including the glutamate transporters, remain notoriously difficult to study because of their localization within the cell membrane. Here we present structural bioinformatics studies of glutamate transporters and their water-soluble variants generated through the QTY code, a protein design strategy based on systematic amino acid substitutions. These include 2 structures determined by X-ray crystallography and cryo-EM, 6 predicted by AlphaFold2, and their predicted water-soluble QTY variants. In the native structures of glutamate transporters, transmembrane helices contain hydrophobic amino acids such as leucine (L), isoleucine (I), and phenylalanine (F). To design water-soluble variants, these hydrophobic amino acids are systematically replaced by hydrophilic amino acids, namely glutamine (Q), threonine (T) and tyrosine (Y). The QTY variants exhibited water-solubility, with four having identical isoelectric points (pI) and the other four having very similar pI. We present the superposed structures of the native glutamate transporters and their water-soluble QTY variants. The superposed structures displayed remarkable similarity, with RMSD of 0.528Å to 2.456Å, despite significant differences in the transmembrane sequences (41.1% to 53.8%). Additionally, we examined the differences in hydrophobicity patches between the native glutamate transporters and their QTY variants. Upon closer inspection, we discovered multiple natural variations of L->Q, I->T, F->Y and Q->L, T->I, Y->F in these transporters. Some of these natural variations were benign and the remainder were reported in specific neurological disorders.
We further investigated the characteristics of hydrophobic to hydrophilic substitutions in glutamate transporters, utilizing variant analysis and evolutionary profiling. Our structural bioinformatics studies not only provided insight into the differences between the hydrophobic helices and hydrophilic helices in the glutamate transporters, but they are also expected to stimulate further study of other water-soluble transmembrane proteins.


Introduction
Glutamate transporters are a class of membrane proteins that play a vital role in the central nervous system (CNS) by removing excess glutamate from the synapse, and they are involved in critical mechanisms of synaptic plasticity, memory, and neuronal or glial cell death [1,2]. Thus, the proper functioning of glutamate transporters is essential for neuronal physiology and healthy brain function [3]. Several subtypes of glutamate transporters are prevalent in peripheral organs, and their dysregulation has been associated with diverse types of tumors [4].
Vesicular glutamate transporters (VGLUTs) play a crucial role in the storage of glutamate, while the termination of glutamatergic signaling is predominantly mediated by the action of excitatory amino acid transporters (EAATs) located on the plasma membrane of astrocytes and neurons [3]. Consequently, alterations in the functions of these transporters have been associated with a range of psychiatric and neurophysiological disorders [1,3,5]. For instance, EAATs may be involved in the etiologies of schizophrenia and affective disorders [6], and many other nervous system disorders [1,3,5]. VGLUTs may also play an important role in several neurophysiological disorders [1]. The functions of glutamate transporters extend well beyond the central nervous system, with a widespread presence in peripheral organs such as the heart, kidney, and liver [4]. Certain glutamate transporters are also distributed in the placenta, emphasizing their roles in the healthy development of the human fetus [7]. Accordingly, evidence for the roles of glutamate transporters in cancer biology is emerging, as dysregulation can be seen in a range of tumor types [4].
While the glutamate transporters may present critical targets for therapeutics, as some modulators have shown potential, current therapeutic options are limited because of poor efficacy [2]. It therefore holds significant promise to investigate new strategies to regulate the transporters effectively. Nevertheless, unlike water-soluble proteins, the transporter proteins are daunting to study and manipulate since they are embedded within a phospholipid bilayer membrane [8]. Because of their hydrophobic surfaces, detergents are required to isolate them, and the isolated proteins are often unstable [8]. To overcome these challenges, we present an alternative, the QTY (Glutamine, Threonine, Tyrosine) code, which allows the design of water-soluble domains through specific amino acid substitutions rather than through the use of detergents [9][10][11]. Alongside their promising role in developing new therapeutics and in aiding researchers to generate effective therapeutic monoclonal antibodies, these soluble QTY variants of glutamate transporters may have several additional benefits, from the design of membrane proteins with improved properties to potentially even the discovery of new functions.
We previously applied the QTY code to design a range of detergent-free transmembrane protein chemokine receptors and cytokine receptors and used conventional computing programs in this process [9][10][11]. The expressed and purified water-soluble variants exhibited the predicted characteristics and maintained their ligand-binding activity [9][10][11][12][13][14]. After AlphaFold2 was released in July 2021, we immediately used it to make QTY variant protein structure predictions and achieved improved results in less than an hour [15][16][17][18], compared to the previous method, which took approximately 5 weeks per simulation [9][10][11]. Additionally, we developed a program and website for designing water-soluble QTY variants of membrane proteins [19]. The reverse QTY code was recently described based on similar biochemical characteristics [20]. AlphaFold2 greatly accelerated research on predictions of protein structures with high accuracy, enabling the design of novel proteins, and the identification of new protein interactions and functions [21].

Competing interests: ... for the QTY code for GPCRs and OH2Laboratories licensed the technology from MIT to work on water-soluble GPCR variants. However, this article does not study GPCRs. SZ is an inventor of the QTY code and has a minor equity of OH2Laboratories. S.Z. founded a startup, 511 Therapeutics, to generate therapeutic monoclonal antibodies against solute carrier transporters to treat pancreatic cancer. S.Z. has majority equity in 511 Therapeutics. 511 Therapeutics sponsored the study but had no influence and interference in the design of the study, in the data collection, analyses, or interpretation of data and in the writing of the manuscript, or in the decision to publish the results. All other authors have no competing interest. Additional statements: Shuguang Zhang is also a co-founder and Scientific Advisor for 3DMatrix Co Ltd in Japan that produces self-assembling peptide hydrogels for accelerating wound healing for surgical and dental wound healing applications. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
Here we report that, by combining multiple approaches including glutamate transporter structural analysis, genomic variant analysis, and evolutionary conservation studies, we can significantly advance our understanding of protein structures and ultimately provide effective options for medical treatment and diagnosis. The large number of protein-coding gene variants found in populations may provide researchers with a valuable tool. Such variant analysis is essential for drug design, as it enables the identification of amino acid residues crucial for a protein's activity or those that may be targeted by inhibitors. Furthermore, using archives of the human genetic variations found in patient samples, such as ClinVar [22], we show the phenotypical effects of the variants. Insights gained from evolutionary conservation studies may further aid the protein design process, particularly in the case of glutamate transporters, since the structural mechanism of amino acid symport is evolutionarily conserved from archaea to humans [23].
Our findings here provide a comprehensive analysis of the glutamate transporters and their water-soluble QTY variants while demonstrating the viability of in silico tools to manipulate the characteristics of vital transmembrane proteins. By utilizing specific approaches to generate water-soluble variants of proteins, including the QTY code, researchers may be able to develop more effective therapies and diagnostic tools for various disorders that are caused by dysregulation of glutamate transporters.
To compare the effects of the QTY code on the membrane-spanning regions, transmembrane helix predictions for both native transporters and their QTY variants were carried out using TMHMM-2.0 [27,28], which is based on a hidden Markov model. The molecular weights (MW) and isoelectric point (pI) values of the native transporters and their QTY variants were calculated using the Expasy website (https://web.expasy.org/compute_pi/) [29][30][31].

AlphaFold2 predictions
The structure predictions of the QTY variants were performed using the AlphaFold2 [21,32] program, which can be accessed at (https://github.com/sokrypton/ColabFold). The program was run on 2 x 20 Intel Xeon Gold 6248 cores with 384 GB of RAM and an Nvidia Volta V100 GPU, following the instructions provided on the website. The European Bioinformatics Institute (EBI) houses over 200 million AlphaFold2-predicted structures, which can be found at (https://alphafold.ebi.ac.uk).
The native structures of eight transporters and their QTY variants were predicted using AlphaFold2. The superposition of these structures was performed using PyMOL [35], which is available at (https://pymol.org/2/).
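The RMSD values reported for the superposed structures follow the usual definition, the square root of the mean squared distance between paired atoms. The sketch below only illustrates that definition under the assumption that the two coordinate sets are already paired and superposed (the fitting itself was done in PyMOL); the function name and the toy coordinates are hypothetical and are not taken from any structure in this study.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the points are already paired atom by atom and already
    superposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates in angstroms (hypothetical placeholders).
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
variant = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(round(rmsd(native, variant), 3))
```

Every toy atom is displaced by exactly 1Å, so the printed value is 1.0.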

Structure visualization
In the study, two software programs were utilized for structure visualization: PyMOL [35] (https://pymol.org/2/) and UCSF ChimeraX [36] (https://www.rbvi.ucsf.edu/chimerax/). PyMOL was used for the superposition of molecular models, whereas the representation of hydrophobicity models was accomplished using ChimeraX. Additionally, the visualization of natural mutations of the QTY variants was also performed using the ChimeraX software.

Building natural QTY and rQTY mutation libraries
PolyPhen-2 [38] (http://genetics.bwh.harvard.edu/pph2/) was used to predict the impact of the mutations on protein function and structure. The input data for the PolyPhen-2 analysis included all 19 possible amino acid substitutions at each residue where a natural QTY or rQTY substitution occurred. More than 1,800 potential variations were analyzed, and the predicted effects were subsequently visualized using GNUPlot [39].

Building mutation libraries for the TM regions of EAA1
We used PolyPhen-2 [38] to predict the effects of all 19 possible amino acid substitutions at each L, I, V, or F residue in the TM α-helices of EAA1 (97 residues in total), regardless of their occurrence in the population or nature. The predicted effects of the 1,843 variations were plotted using GNUPlot [39], and the L, I, V, F -> Q, T, Y substitutions were compared with other amino acid substitutions.
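The size of such a library follows directly from the enumeration: 97 residues with 19 alternatives each give 97 x 19 = 1,843 substitutions. A minimal sketch of the enumeration is given below. The sequence is a made-up fragment, not the EAA1 sequence, and the function name is hypothetical; how the resulting list is formatted for submission to PolyPhen-2 is not shown here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def substitution_library(sequence, positions):
    """Yield (position, wild_type, mutant) for all 19 alternatives per position.

    Positions are 1-based, as in protein numbering.
    """
    for pos in positions:
        wild_type = sequence[pos - 1]
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield pos, wild_type, mutant


# Hypothetical helix fragment used only to exercise the function.
toy_sequence = "MLIVFAGLLV"
targets = [i + 1 for i, aa in enumerate(toy_sequence) if aa in "LIVF"]
library = list(substitution_library(toy_sequence, targets))
print(len(targets), len(library))
```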

Evolutionary conservation profiles and analysis of sensitive domains
Mutation visualizations for glutamate transporters were accessed from the PMut Repository [40] (https://mmb.irbbarcelona.org/PMut/). The ConSurf server [41][42][43][44][45][46] (https://consurf.tau.ac.il/) was used for generating evolutionary conservation profiles. The server was run with the AlphaFold2-predicted native structures that were also used for RMSD calculations, and these structures were complemented with SEQRES records. The .pdb files generated from AlphaFold2 did not initially contain the SEQRES sequences. The source sequences for the protein structures were derived from Uniprot in FASTA format. To translate and add the amino acid sequences to the .pdb files in the correct SEQRES format, Visual Basic for Applications (VBA) scripting was utilized.
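The translation step can be illustrated as follows. This is a Python sketch of the same idea, not the VBA script used in the study: it converts a one-letter sequence into SEQRES lines with 13 residues per record, following the PDB column layout (record name, serial number, chain identifier, residue count, then three-letter residue names). The function name and the default chain identifier are assumptions, and padding each line to 80 columns is omitted.

```python
THREE_LETTER = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def seqres_records(sequence, chain_id="A", per_line=13):
    """Return SEQRES lines for a one-letter protein sequence."""
    total = len(sequence)
    records = []
    for serial, start in enumerate(range(0, total, per_line), start=1):
        chunk = sequence[start:start + per_line]
        names = " ".join(THREE_LETTER[aa] for aa in chunk)
        # Columns: 1-6 record name, 8-10 serial, 12 chain, 14-17 residue count,
        # residue names from column 20 onward.
        records.append(f"SEQRES {serial:>3} {chain_id} {total:>4}  {names}")
    return records


for line in seqres_records("MALQTY"):
    print(line)
```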
The conservation scores were computed using the Bayesian method, with the amino acid substitution model chosen based on the best fit. The default parameters were employed for homologue search, homologue thresholds and alignment, phylogeny, and conservation scores. The evolutionary conservation grades of each residue were visualized using the UCSF ChimeraX [36] software (https://www.rbvi.ucsf.edu/chimerax/). The conservation grades and residue exposure data obtained from the ConSurf server were complemented with secondary structure information and transporter topology. Per-residue helix and strand assignments of native glutamate transporters were deduced from the models available in the AlphaFold Database [21,32]; the Define Secondary Structure of Proteins (DSSP) algorithm [47] was run using UCSF ChimeraX [36] (https://www.rbvi.ucsf.edu/chimerax/). The default energy cutoff of -0.5 kcal/mol, as recommended by Kabsch and Sander [47], was used for the calculations, and the minimum number of residues allowed in a helix or strand was also set to the default value of 3. These data were subsequently correlated with the predicted phenotypical and structural effects of the natural QTY (as well as rQTY) variants investigated in this study.

AlphaFold2 predicted water-soluble QTY variants
The AlphaFold DB [21,32] (https://alphafold.ebi.ac.uk), a database developed by DeepMind and the European Bioinformatics Institute (EMBL-EBI) at EMBL, serves as the repository for all AlphaFold2 predictions, with over 200 million protein structures. For more detailed information on the water-soluble QTY variants that are reported in this study, please go to the website: https://github.com/eva-smorodina/glut.

Protein sequence alignments and other characteristics
The topological visualizations and predicted sequence features of EAATs and VGLUTs indicated that each transporter has an 8-transmembrane (TM) architecture, whereas the Y+L amino acid transporter-2 (YLAT2) has a 12TM MFS-fold transporter topology (S2 Fig in S1 File) [23,33,48]. In contrast to the VGLUT topology, EAATs also have a larger extracellular loop between TM3 and TM4, which is absent in the structures determined by X-ray crystallography or cryo-EM methods [23,33] 1).
The QTY (Glutamine, Threonine, Tyrosine) code substitutes four hydrophobic amino acids (Leucine, Isoleucine, Valine, and Phenylalanine) with three neutral polar amino acids (Glutamine, Threonine, and Tyrosine) in transmembrane segments, reducing hydrophobicity. The 1.5Å electron density maps show very similar structures between leucine (L) vs glutamine (Q); isoleucine (I), valine (V) vs threonine (T); and phenylalanine (F) vs tyrosine (Y), leading to the implementation of the QTY code.
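The substitution rule itself is simple enough to sketch in a few lines. The example below is a minimal illustration, assuming that every L, I, V, and F inside the given transmembrane ranges is replaced; it is not the authors' design program [19], which may leave some positions unchanged. The sequence, the segment boundaries, and the function name are hypothetical.

```python
QTY_MAP = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def apply_qty(sequence, tm_segments):
    """Apply the QTY substitutions inside transmembrane segments.

    tm_segments holds 1-based inclusive (start, end) pairs. Returns the
    variant sequence and the percentage of TM residues that were changed.
    """
    residues = list(sequence)
    changed = 0
    tm_length = 0
    for start, end in tm_segments:
        for i in range(start - 1, end):
            tm_length += 1
            if residues[i] in QTY_MAP:
                residues[i] = QTY_MAP[residues[i]]
                changed += 1
    rate = 100.0 * changed / tm_length if tm_length else 0.0
    return "".join(residues), rate


# Hypothetical 15-residue sequence with one assumed TM segment (residues 4-12).
toy = "MKSLLIVAFGLLKDE"
variant, rate = apply_qty(toy, [(4, 12)])
print(variant, round(rate, 1))
```

Residues outside the listed segments are left untouched, which mirrors the restriction of the QTY code to transmembrane helices.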
The QTY code results in significant substitutions in the transmembrane helices, ranging from 41% to 54% (Table 1). Despite the high substitution rate, the difference in molecular weight between the native and QTY variants is minimal, in the range of a few hundred Daltons (Da). This observation can be attributed to two factors. First, replacing the CH3 group (15Da) on leucine (L) and valine (V) with the -OH group (17Da) on glutamine (Q) and threonine (T) changes the mass by only about 2Da per substitution. Second, the substitution of phenylalanine (F) with tyrosine (Y) adds an -OH group. The sum of these changes results in a minor effect on the molecular weights of the proteins (Table 1). Furthermore, previous experimental research demonstrated that QTY variants show remarkable thermostability [9,10], despite the variants having a reduced number of aliphatic residues (A, L, V, I), resulting from the substitution of L with Q, and of I as well as V with T. Additionally, the QTY substitutions do not introduce any charged residues into the protein, resulting in minimal changes in pI; a changed pI could lead to non-specific interactions.
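The per-substitution mass shift can be checked with a short calculation. The sketch below uses commonly tabulated average residue masses (in Da); these values are an assumption brought in from standard tables, not numbers from this study. With these values the shift depends on the pair: I->T lowers the mass, while L->Q, V->T, and F->Y raise it, so the changes partly offset one another across a sequence.

```python
# Average residue masses in Da (amino acid minus water), from standard tables.
RESIDUE_MASS = {
    "L": 113.1594, "I": 113.1594, "V": 99.1326, "F": 147.1766,
    "Q": 128.1307, "T": 101.1051, "Y": 163.1760,
}
QTY_PAIRS = [("L", "Q"), ("I", "T"), ("V", "T"), ("F", "Y")]


def mass_shift(native, variant):
    """Mass change in Da when one native residue is replaced by the variant."""
    return RESIDUE_MASS[variant] - RESIDUE_MASS[native]


for native, variant in QTY_PAIRS:
    print(f"{native}->{variant}: {mass_shift(native, variant):+.2f} Da")
```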

AlphaFold2 predictions
Understanding the 3D structure of transmembrane proteins is a crucial task, as it is key to understanding how they function, how they interact with other molecules, and how they can be targeted for therapeutic purposes. However, experimentally determining the structure of transmembrane proteins is a notoriously difficult process, owing to their hydrophobic nature, which requires detergents to stabilize the membrane protein after isolating it from the cell membrane. From gene expression and protein production to selecting the appropriate detergent for maintaining stability and avoiding irreversible aggregation, every step poses significant challenges [8]. Thus, the number of experimentally determined 3D structures of transmembrane proteins lags significantly behind that of water-soluble proteins. Consequently, AlphaFold2 has a significant impact on the field of transmembrane protein research by providing researchers with accurate molecular structural models [21,32].
In previous work, we used AlphaFold2 to predict the structures of water-soluble QTY variants of G protein-coupled receptors [15], glucose transporters [16], solute carrier transporters (SLC) [17], and potassium ion channels [18]. These predictions were in agreement with previously known experimentally determined structures obtained through X-ray crystallography or cryo-EM methods. In this study, we also utilize AlphaFold2 to predict the QTY variant and native transporter structures, and we compare them with two experimentally determined native structures.

Superposition of native transporters and their water-soluble QTY variants
In our current study, the native transporter structures determined by cryo-EM or X-ray crystallography were superposed on and compared to their QTY variants. The experimentally determined structures used in this study are EAA1 (PDB ID: 5LLM) [33] and EAA3 (PDB ID: 6X2Z) [23], both obtained from RCSB PDB. The superposition of structures was performed for EAA1 Crystal vs EAA1 QTY, and EAA3 CryoEM vs EAA3 QTY. The cryo-EM/crystal structures of native proteins and their AlphaFold2-predicted water-soluble QTY variants superposed with RMSD values below 2.5Å (Fig 2). Despite a high substitution rate of up to 54% in the transmembrane alpha-helices in the water-soluble QTY variants, their structures remain similar to the native structures, as demonstrated by the root mean square deviation (RMSD). The RMSD value for EAA1 Crystal vs EAA1 QTY was 1.729Å, and for EAA3 CryoEM vs EAA3 QTY it was 2.456Å (Fig 2). The molecular structures, both experimentally determined and predicted by AlphaFold2, were found to superpose very well. Furthermore, the cryo-EM and crystal structures were also superposed with the corresponding AlphaFold2-predicted native structures (Table 2). The RMSD results support the accuracy of AlphaFold2's predictions, as the predicted native structures are in line with the experimentally determined structures.
Many glutamate transporters currently do not have experimentally determined structures, as is the case for numerous other transmembrane proteins. We obtained the structures of six native transporters (EAA2, EAA4, VGLUT1, VGLUT2, VGLUT3, and YLAT2) using AlphaFold2 predictions. Alongside the predicted structures of these transporters, the AlphaFold2-predicted native EAA1 and EAA3 were also compared with their predicted QTY variants (Table 1 and Fig 3). Despite differences in amino acid composition and chemical characteristics, the structural similarity between the native and QTY variants was high, as demonstrated by the root mean square deviation (RMSD). The RMSD values were: EAA1 vs EAA1 QTY (0.717Å), EAA2 vs EAA2 QTY (0.948Å), EAA3 vs EAA3 QTY (0.905Å), EAA4 vs EAA4 QTY (0.796Å), VGLUT1 vs VGLUT1 QTY (1.604Å), VGLUT2 vs VGLUT2 QTY (0.971Å), VGLUT3 vs VGLUT3 QTY (1.422Å), YLAT2 vs YLAT2 QTY (0.528Å). The native glutamate transporters have four known conformations [49]. The experimental structures used in this study were outward-facing structures for EAA1 and EAA3 [23,33]. All AlphaFold2-predicted native and QTY-variant structures also corresponded to the outward-facing structural conformation, meaning that the protein core is located relatively outward with respect to the rest of the protein (Fig 2). These close alignments reinforce the similarity between the native and water-soluble QTY variants, regardless of hydrophobicity and hydrophilicity (Tables 1 and 2, Figs 2 and 3).

Analysis of the hydrophobic surface of native transporters and the water-soluble QTY variants
Nature has evolved three types of chemically distinct alpha-helices [50][51][52]. These are 1) Type I: the hydrophilic alpha-helix, composed mostly of polar amino acids D, E, N, Q, K, R, S, T, and Y [50], as found in water-soluble enzymes and circulating proteins; 2) Type II: the hydrophobic alpha-helix, which contains mostly hydrophobic amino acids L, I, V, F, M, P and A [50], present in transmembrane proteins including G protein-coupled receptors, ion channels, the glutamate transporters and transmembrane helices in photosynthesis systems; and 3) Type III: amphiphilic alpha-helices, containing both hydrophobic and hydrophilic amino acid residues. These three types of chemically distinct alpha-helices have similar structures regardless of their hydrophobicity or hydrophilicity, which is the molecular basis of the QTY code [9]. The native structures of glutamate transporters have a high hydrophobicity content, particularly in their transmembrane alpha-helical segments, causing them to be insoluble in water and requiring surfactants for isolation [8]. Without these surfactants, the transporters tend to aggregate and precipitate, leading to a loss of biological function [8]. By replacing the hydrophobic amino acids L, I, V, and F with hydrophilic ones (Q, T, Y), the hydrophobic surfaces were significantly reduced (Figs 4 and 5). This change in hydrophobicity does not disrupt the alpha-helix structure, which was unexpected before the systematic experiments reported in our recent publications. Experimental evidence that the QTY transformation from hydrophobic to hydrophilic transporters retains structural stability and ligand-binding function has been presented in previous studies [9][10][11][12][13]. The QTY code approach is a valuable tool for studying transmembrane proteins, including glutamate transporters. The water-soluble variants of glutamate transporters may find potential applications not only in the design of diagnostic medicine but also in generating monoclonal antibodies and other therapeutics.

Analysis of genetic variants containing natural mutations of the QTY code
Following improvements in genomics and variant discovery, and through the integration of vast data obtained from exome and genome sequencing, genetic variant analysis has found many applications in medical science [53]. This variant analysis may also become a major tool for protein engineering since it provides valuable information on protein variants and their functional effects [54]. Our study analyzed the natural mutations of glutamate transporters and revealed a QTY code that arose from natural processes.
We used the gnomAD database [37] of 125,748 exomes and 15,708 genomes to survey missense variations of the 8 glutamate transporters. The variants were filtered as QTY (L->Q, V/I->T, F->Y) and reverse QTY (Q->L, T->V/I, Y->F). A total of 95 variants, 63 QTY and 32 reverse QTY (rQTY), were identified in the glutamate transporter genes. The variations were all single amino acid changes located at various positions within the transporter proteins. In all 95 variations, the second base of the codon was the only base mutated (Tables 3 and 4). The variations and their predicted effects were visualized. Twenty-nine of the natural QTY mutations (~46.0%) were outside the TM domain. Specifically, three mutations were found in the intramembrane regions, 7 in the extracellular regions, and 19 in the cytoplasmic regions. Of these, 15 were predicted to be benign (15/29 = 51.7%), 7 "possibly" damaging with low confidence (7/29 = 24.1%), and 7 probably damaging (7/29 = 24.1%). Notably, regardless of their location, more than half of the natural QTY mutations were predicted to be benign (Table 3). Per-residue secondary structure assignment from the AlphaFold2-predicted models showed that 53 mutations belong to a helical structure, and 29 of those were benign (Table 3).
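The filtering step amounts to checking whether the reference and alternate residues of a missense variant form one of the QTY pairs or its reverse. A minimal sketch is shown below, assuming the variants are available as HGVS-style protein changes with three-letter codes; the regular expression, the function name, and the example variants are hypothetical and do not correspond to entries in Tables 3 and 4.

```python
import re

QTY = {("Leu", "Gln"), ("Ile", "Thr"), ("Val", "Thr"), ("Phe", "Tyr")}
RQTY = {(alt, ref) for ref, alt in QTY}  # reverse QTY pairs

# Matches simple missense notation such as p.Leu100Gln.
HGVS_MISSENSE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def classify(hgvs_p):
    """Return 'QTY', 'rQTY', or None for an HGVS protein-level missense change."""
    match = HGVS_MISSENSE.match(hgvs_p)
    if not match:
        return None
    ref, _position, alt = match.groups()
    if (ref, alt) in QTY:
        return "QTY"
    if (ref, alt) in RQTY:
        return "rQTY"
    return None


# Hypothetical variants used only to exercise the filter.
for variant in ["p.Leu100Gln", "p.Tyr50Phe", "p.Ala10Gly"]:
    print(variant, classify(variant))
```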
On the other hand, the 32 natural reverse QTY (Q->L, T->V/I, Y->F) mutations examined in this study were predominantly found outside the TM regions (24/32 = 75%). In detail, three of the rQTY mutations were found in the intramembrane regions, 11 in the extracellular regions, and 10 in the cytoplasmic regions. Outside the TM regions, 13 mutations were predicted to be benign (13/24 = 54.2%), 5 "possibly" damaging with low confidence (5/24 = 20.8%), and 6 probably damaging (6/24 = 25.0%). Regardless of their location, 17 out of 32 (53.1%) of the reverse QTY mutations were predicted to be benign. Secondary structure assignment data showed that 18 mutations belong to a helical structure, and 9 of those were benign (Table 4). The ClinVar archives [22] documented the clinical effects of 13 natural QTY or rQTY substitutions (Tables 3 and 4). Two of the variants reported in the ClinVar database were benign (VCV000367038.7 and VCV000777038.3), and a total of 11 variants were of uncertain significance in three different conditions: episodic ataxia type 6 (VCV000906384.2), dicarboxylic aminoaciduria (VCV001701474.

Notes to Tables 3 and 4:
1 Protein consequence of the mutation according to HGVS numbering.
2 The second base of the residue codon for the corresponding mutation.
3 Topological localizations of the mutations according to glutamate transporter molecular architecture (TM = Transmembrane, ECL = Extracellular loop, IM = Intramembrane, ICL = Intracellular loop). The topological information of the mature protein was obtained from Uniprot.
4 Secondary structure of the corresponding residue, calculated from the models of native transporters available in the AlphaFold Database.
5 Residue exposure according to the NACSES algorithm, predicted by ConSurf server.
6 Evolutionary conservation grade of the residue predicted by ConSurf server; 1 to 9, in order of increasing conservation (1 = Variable, 5 = Average, 9 = Conserved).
7 Variant effect predicted by Polyphen. Benign = predicted to be benign with high confidence; ?damaging = possibly damaging, predicted to be damaging with low confidence; damaging = probably damaging, predicted to be damaging with high confidence.
8 Based on ClinVar's January 21, 2023 release.
9 A functional residue (exposed and highly conserved) predicted by ConSurf Server.
A structural residue (buried and highly conserved) predicted by ConSurf Server.
https://doi.org/10.1371/journal.pone.0289644.t004

Natural mutations of L->Q, I->T, F->Y and Q->L, T->I, Y->F in glutamate transporters
The second position of the genetic code determines the chemical nature of amino acids [55,56]. For example, i) amino acids with U at the second position are hydrophobic (Phe, Leu, Ile, Val, and Met); ii) amino acids with C at the second position are less hydrophobic (Pro and Ala), or carry a hydroxyl -OH group (Ser and Thr); iii) amino acids with A at the second position are hydrophilic and water soluble (Asp, Glu, Asn, Gln, Lys, His and Tyr), and this group also contains the 2 stop codons Ochre (UAA) and Amber (UAG); iv) amino acids with G at the second position (Arg and Ser) are water soluble, Cys is partially water-soluble, and Gly is achiral and has an H as the side chain [55,56]. The third stop codon, UGA, also has G at the second position. In general, the pyrimidines U and C at the second position confer hydrophobicity; in contrast, the purines A and G at the second position confer hydrophilicity (S1 Fig in S1 File).
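The grouping described above can be reproduced directly from the standard genetic code. The sketch below builds the 64-codon table from the conventional U, C, A, G ordering and collects the amino acids encoded for each second-position base; stop codons are marked with `*`. The variable and function names are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (first base slowest, third base fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def by_second_base(table):
    """Group encoded amino acids (one-letter codes) by the second codon base."""
    groups = {base: set() for base in BASES}
    for codon, aa in table.items():
        groups[codon[1]].add(aa)
    return groups


groups = by_second_base(CODON_TABLE)
for base in BASES:
    print(base, "".join(sorted(groups[base])))
```

The U group contains F, I, L, M, and V, and the A group contains D, E, H, K, N, Q, and Y plus two stop codons. Tryptophan, encoded by UGG, also falls in the G group.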
In the glutamate transporters, there are many natural mutations of L->Q, I->T, F->Y and Q->L, T->I, Y->F. Each of these mutations results from a single nucleotide change at the second position of the codon, which is either a transition mutation, i.e., purine to purine (A->G, G->A) or pyrimidine to pyrimidine (C->U, U->C), or a transversion mutation (U->A, U->G, C->A, C->G, A->U, A->C, G->U, G->C).
Consider the L->Q, I->T, and F->Y mutations. i) Two of the leucine (L) codons are CUA and CUG, and the two glutamine (Q) codons are CAA and CAG; in these cases, the U at the second position is mutated to A, which is a transversion mutation. ii) The three isoleucine (I) codons are AUU, AUC, and AUA, and the four threonine (T) codons are ACU, ACC, ACA, and ACG; in these cases, the U at the second position is mutated to C, which is a transition mutation. iii) The two phenylalanine (F) codons are UUU and UUC, and the two tyrosine (Y) codons are UAU and UAC; here the U at the second position is mutated to A, which is a transversion mutation.
Likewise, the Q->L, T->I, and Y->F mutations are the reverse changes of Q, T, Y to L, I, F. Namely, i) the two glutamine (Q) codons are CAA and CAG; when they are mutated to CUA and CUG, Q changes to L (leucine). ii) The four threonine (T) codons are ACU, ACC, ACA, and ACG; when ACU, ACC, or ACA is mutated to AUU, AUC, or AUA, which is a transition mutation, T changes to I (isoleucine). iii) Following the same logic, the two tyrosine (Y) codons are UAU and UAC; when they are mutated to UUU and UUC, which is a transversion mutation, Y changes to F.
No V->T or T->V mutations were observed in the transporters (Tables 3 and 4). This is because such changes require at least 2 nucleotide changes: the four valine (V) codons are GUU, GUC, GUA, and GUG, and the four threonine (T) codons are ACU, ACC, ACA, and ACG. In this study, we focused only on the QTY-relevant mutations and did not systematically examine other mutations, since that is beyond the scope of this study.
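The codon arguments above can be verified by brute force over the standard genetic code. The sketch below lists every one-nucleotide route between two amino acids, reports the codon position and whether the change is a transition or a transversion, and computes the minimum number of nucleotide changes for a pair. Function and variable names are illustrative and not part of any published tool.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}


def codons_for(aa):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, encoded in CODON_TABLE.items() if encoded == aa]


def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def single_base_routes(src, dst):
    """List (from_codon, to_codon, position, kind) for one-nucleotide changes."""
    routes = []
    for a in codons_for(src):
        for b in codons_for(dst):
            if hamming(a, b) == 1:
                pos = next(i for i in range(3) if a[i] != b[i])
                same_class = (a[pos] in PURINES) == (b[pos] in PURINES)
                kind = "transition" if same_class else "transversion"
                routes.append((a, b, pos + 1, kind))
    return routes


def min_changes(src, dst):
    """Smallest number of nucleotide changes converting src into dst."""
    return min(hamming(a, b) for a in codons_for(src) for b in codons_for(dst))


for src, dst in [("L", "Q"), ("I", "T"), ("F", "Y"), ("V", "T")]:
    print(src, dst, min_changes(src, dst), single_base_routes(src, dst))
```

For V and T the route list is empty and the minimum is 2, which matches the absence of V->T and T->V variants among single-nucleotide changes.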

QTY and rQTY mutation libraries
Mutation libraries are an essential tool for modern genetic and medical analysis. By collectively analyzing a diverse set of genetic variants, mutation libraries provide researchers and medical doctors with the means to investigate variants for desired traits, such as stability or phenotypical effects. These libraries are typically constructed through a process of in vivo and in vitro mutagenesis [57]. In contrast, here we present a comprehensive genetic analysis using solely computational methods, which may be notably faster and less costly than conventional mutagenesis.
To analyze the residues at which naturally occurring QTY and reverse QTY (rQTY) variations were submitted by large-scale sequencing projects, we built mutation libraries by calculating the effects of all 19 possible amino acid substitutions at each residue, that is, every amino acid except the wild type. In total, more than 1,800 potential variations and their impacts on the native protein were predicted. The PolyPhen-2 algorithm considers hydrophobic potentials when predicting the effects of amino acid substitutions on protein function and structure [38]. As a result, substitutions to polar amino acids that lead to soluble variants may be expected to receive a higher predicted score, since such residues are unlikely to be found in proteins within the cell membrane. However, these substitutions may not necessarily change the overall structure of the protein, as the alignment results suggest. Accordingly, to further investigate the natural QTY variations, we compared the effects of the naturally occurring substitutions L->Q, I->T, and F->Y, which introduce polar residues, to substitutions involving other polar amino acids: L to D, E, R, K, H, N, S, T, Y; I to D, E, R, K, H, N, S, Q, Y; and F to D, E, R, K, H, N, S, T, Q.
The PolyPhen-2 calculations showed that the natural QTY code variations are notably less damaging than the average of the other polar amino acid changes. For the residues where the natural QTY code variations occurred, the average pph2_prob score (the probability of a substitution being damaging, ranging from 0.0 to 1.0) was 0.725 for the other polar amino acid substitutions, whereas it was 0.588 for the QTY code substitutions. The natural QTY substitutions also showed a lower impact than the average of all 19 amino acids (0.648), regardless of their polarity. This is perhaps due to the similar molecular structures of L, I/V, and F to those of Q, T, and Y, respectively, so that these mutations change the molecular structure at a given position less.
To analyze the reverse QTY (rQTY) mutations, we compared the effects of the naturally occurring substitutions Q->L, T->I, and Y->F to substitutions involving other nonpolar amino acids (A, C, G, I, L, M, F, P, W, V). The PolyPhen-2 calculations again showed that the rQTY variations are significantly less damaging than the average of the other nonpolar amino acid changes. For the residues where the rQTY code variations occurred, the average pph2_prob score was 0.562 for the other nonpolar amino acid substitutions and just 0.339 for the rQTY substitutions. Moreover, the rQTY substitutions also showed a prominently lower impact than the average of all 19 amino acids (0.541), regardless of their polarity. 3D plots were drawn to visualize the predicted effects of the 19 possible variations at the residues for which natural QTY and rQTY substitutions were submitted by sequencing projects (S8 and S9 Figs in S1 File). These findings can also be explained by the structural similarity described above.
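The three averages compared above (the QTY or rQTY substitution, the other substitutions of the same polarity class, and all 19 substitutions) can be sketched as a simple grouping. This is an illustrative sketch only: the function name `summarize`, the input layout, and the example scores are assumptions made here, and the numbers are placeholders rather than PolyPhen-2 output or results from this study.

```python
from statistics import mean

# Substitution of interest for each wild-type residue (QTY and reverse QTY, as in the text)
CODE = {"L": "Q", "I": "T", "F": "Y", "Q": "L", "T": "I", "Y": "F"}
# Comparison classes as listed in the text
POLAR = set("DERKHNSTQY")
NONPOLAR = set("ACGILMFPWV")


def summarize(scores):
    """Average damage probabilities by group.

    scores maps (wild_type, position, mutant) -> pph2_prob (assumed layout).
    Returns the mean for the QTY/rQTY substitution, for the other mutants
    in the same polarity class as that substitution, and for all mutants.
    """
    code_scores, same_class, everything = [], [], []
    for (wild, _pos, mutant), prob in scores.items():
        everything.append(prob)
        goal = CODE[wild]
        cls = POLAR if goal in POLAR else NONPOLAR
        if mutant == goal:
            code_scores.append(prob)
        elif mutant in cls:
            same_class.append(prob)
    return {
        "code": mean(code_scores),
        "same_class_other": mean(same_class),
        "all": mean(everything),
    }


# Placeholder scores for one hypothetical glutamine residue (NOT study data)
example = {
    ("Q", 10, "L"): 0.25,  # rQTY substitution
    ("Q", 10, "A"): 0.50,  # other nonpolar
    ("Q", 10, "V"): 0.75,  # other nonpolar
    ("Q", 10, "E"): 0.50,  # polar, counted only in the overall mean
}
result = summarize(example)
```

With real data, the dictionary would be filled from the per-substitution pph2_prob values, one entry for each of the 19 substitutions at every residue of interest.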
Interestingly, the transmembrane (TM) regions of glutamate transporters were found to be more conserved than the motifs in the N- and C-termini (S12-S19 Figs in S1 File). This conservation may be attributed to the crucial role played by TM regions in maintaining the structural integrity of these proteins. In support of this, mutation visualization of the whole transporter sequence also showed that the residues in the TM domains are more sensitive to amino acid substitutions than those in the N- and C-termini (S10 and S11 Figs in S1 File). As expected from the evolutionary profiling, EAATs were also found to be more sensitive to mutations than VGLUTs (S10 and S11 Figs in S1 File, respectively).
Although many residues of glutamate transporters are evolutionarily conserved, the Q, T, Y mutations did not affect the overall predicted structure, and the AlphaFold2-predicted QTY variants superposed well with the native structures. To further analyze the phenotypical effects of the QTY code on the TM regions, alongside the natural variant analysis derived from genomic databases, we also built mutation libraries for all L, I, and F amino acids in the TM region of EAA1 (97 in total), regardless of their occurrence in the population or in nature. The results showed that the TM regions are indeed sensitive to changes, confirming the evolutionary data and the mutation visualizations of the entire sequence. The impact of the substitutions varied (S3-S7 Figs in S1 File). For instance, the substitution of L (leucine) with other nonpolar amino acids such as I (isoleucine) is predicted to have less impact on EAA1 function than substitution with polar amino acids (S3 Fig in S1 File). Substitution of I at certain positions in TM segments had minor impacts on protein function (S4 Fig in S1 File). Substitutions of F showed a pattern similar to those of I and L, indicating an effect of polarity on amino acid substitutability (S5 Fig in S1 File). One possible explanation for this observation could be the structural similarity between I and V (as well as L), as their branched side chains allow for similar interactions. Such findings suggest that substituting amino acid residues that share similar structures may not significantly alter protein structure or function, in line with the primary hypothesis of the QTY code [9]. Regarding the primary focus of this study, the L->Q, I->T, F->Y substitutions (QTY code) had a slightly lower impact on function and structure (~0.819) than the average of the 19 amino acids (~0.825), and were notably less damaging than the average of the other polar amino acids (~0.896).
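As an illustration of how the TM-region targets can be located and how the QTY code is applied to them, a minimal sketch follows. The sequence, the TM range, and the function names are hypothetical and chosen only for this example; they are not the EAA1 sequence or its annotated helices. The substitution map follows the replacements named in the figure legends (L to Q, I and V to T, F to Y).

```python
# QTY code: L->Q, I->T, V->T, F->Y
QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def in_tm(pos, tm_ranges):
    """True if the 1-based position lies inside any (start, end) TM range, inclusive."""
    return any(start <= pos <= end for start, end in tm_ranges)


def tm_targets(sequence, tm_ranges, residues="LIF"):
    """1-based positions of the chosen residues (default L, I, F) inside TM ranges."""
    return [
        i for i, aa in enumerate(sequence, start=1)
        if aa in residues and in_tm(i, tm_ranges)
    ]


def apply_qty(sequence, tm_ranges):
    """Apply the QTY code inside TM ranges only; all other residues are unchanged."""
    return "".join(
        QTY_CODE.get(aa, aa) if in_tm(i, tm_ranges) else aa
        for i, aa in enumerate(sequence, start=1)
    )


# Hypothetical toy peptide with one assumed TM segment spanning residues 3-8
toy_sequence = "MKLIVFAGLS"
toy_tm = [(3, 8)]

targets = tm_targets(toy_sequence, toy_tm)  # L, I, F positions within the TM segment
variant = apply_qty(toy_sequence, toy_tm)   # QTY-substituted sequence
```

The positions returned by `tm_targets` are the residues for which a 19-substitution library would be built, and the leucine outside the assumed TM range is left untouched by `apply_qty`.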

Possible implications and future directions of the study
Our study provides insights into the influence of amino acid substitutions in the transmembrane (TM) region of the glutamate transporters, offering approaches to design diagnostic tools and generate therapeutic monoclonal antibodies. Even though the TM domains are sensitive to substitutions and under strong evolutionary conservation, our findings suggest that it may be possible to create soluble variants of these domains that do not alter the overall structure of the transporters. Membrane localization also regulates the dynamics of native glutamate transporters, hence contributing to the transport process [23,33]. In the case of designed soluble variants, properties that differ from those of the wild-type proteins (such as solubility) may also generate valuable research outcomes. Molecular dynamics simulations can facilitate the study of functional properties that result from differences in water accessibility [58,59]. While it may not be easy to explain their functional dynamics and behavior in soluble environments, and this is well beyond the scope of structural informatics analysis, our study using phenotypical profiling sheds light on the roles of TM segments and their bilayer localization in transport function. Even if the QTY variants cannot perform some functions that are specific to the membrane localization of the wild-type protein, the fact that such stable soluble variants share substantial structural composition with their transmembrane counterparts makes them strong tools for both functional studies and drug design. This is because targeting soluble proteins is easier than targeting membrane proteins [8]. Having structural conformations similar to those of their native counterparts, QTY variants could potentially be used with existing pharmaceutical discovery strategies [33]. Furthermore, this structural alignment with native transporters suggests that the QTY variants can also provide valuable tools to produce antibodies for effectively managing various disorders, especially considering the existing studies on the roles of anti-EAA2 autoantibodies in disease etiologies [60]. This characteristic is therefore specific to soluble QTY variants and could not be achieved with native membrane proteins. Molecular dynamics simulations could further be used to explain the mutagenesis-induced dynamics of the variants and of specific amino acid substitutions [61,62]. Since our study focused on the theoretical aspects, experimental studies involving QTY variants are likely to be beneficial. We suggest that further experimental research consider these specific functional differences and the additional applications resulting from the unstudied dynamics of water-soluble TM-like segments. At the same time, we further emphasize the similarities of our suggested QTY code with the reverse QTY code.

Conclusion
Our study also considers evolutionary aspects of the QTY-code design strategy. Such analysis is especially useful for genetic variant analysis, since phenotypical or functional differences cannot always be causally linked with genetic variants, which may therefore become a major limitation of protein design strategies that use genetic variant analysis [63,64]. Through our analysis of genetic variations submitted by large-scale sequencing studies, we uncovered the potential to trace less harmful systematic variations for effective protein design.
Our findings suggest that variant analysis and evolutionary profiling, combined with structural informatics studies, are promising research tools for designing proteins with specific properties, such as water solubility. Accordingly, our data revealed that the QTY code did not alter the overall structure of the 8 glutamate transporters. Moreover, the QTY code had a notably smaller impact on the phenotypical characteristics of the proteins under investigation than the average of the other polar amino acid substitutions.
Our structural bioinformatics studies not only provided insight into the differences between the hydrophobic helices and hydrophilic helices in the glutamate transporters, but they are also expected to stimulate further study of other water-soluble transmembrane proteins.
Meanwhile, VGLUTs have a larger portion of intracellular motifs than those in EAATs and YLAT2 (S2 Fig in S1 File). The isoelectric points (pIs) of the transporters varied between 9.26 for EAA4 and 5.56 for EAA3 (Fig 1 and S22 Fig in S1 File and Table

Fig 1. Sequence and protein alignments of the native and QTY variants of eight glutamate transporters. The alignments performed are as follows: a EAA1 vs EAA1 QTY, b EAA2 vs EAA2 QTY, c EAA3 vs EAA3 QTY, d EAA4 vs EAA4 QTY, e VGLUT1 vs VGLUT1 QTY, f VGLUT2 vs VGLUT2 QTY, g VGLUT3 vs VGLUT3 QTY, and h YLAT2 vs YLAT2 QTY. Molecular weight, isoelectric point (pI), total variation %, and transmembrane variation % are listed for both the natural and QTY variants. The TM alpha-helices (blue) are shown above the protein sequences. The QTY amino acid substitution changes are colored in red. Other color code: Yellow line, intracellular; Blue wave, transmembrane helices; Pinkish line, extracellular; Green line, peripheral domains and hairpin loops. Single letter abbreviations for the amino acid residues are A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr. https://doi.org/10.1371/journal.pone.0289644.g001

Fig 4. Hydrophobic surface of crystal and cryo-EM structures of two native glutamate transporters and the designed QTY variants. After Q, T, and Y replacement of the hydrophobic residues L, I, V, F, the surfaces were more hydrophilic. The hydrophobic surface (brownish) of the native transporters became more cyan, indicating that the hydrophobic surface is largely reduced on the transmembrane helices of the QTY variants: a EAA1 Crystal vs b EAA1 QTY, c EAA3 CryoEM vs d EAA3 QTY. For clarity, N- and C-termini and large loops are deleted. https://doi.org/10.1371/journal.pone.0289644.g004

Fig 5. Hydrophobic surfaces of 8 AlphaFold2 predicted native glutamate transporters and their designed QTY variants. After Q, T, and Y replacement of the hydrophobic residues L, I, V, F, the surfaces were more hydrophilic. The hydrophobic surface (brownish) of the native transporters became more cyan, indicating that the hydrophobic surface is largely reduced on the transmembrane helices of the QTY variants: a EAA1 vs EAA1 QTY, b EAA2 vs EAA2 QTY, c EAA3 vs EAA3 QTY, d EAA4 vs EAA4 QTY, e VGLUT1 vs VGLUT1 QTY, f VGLUT2 vs VGLUT2 QTY, g VGLUT3 vs VGLUT3 QTY, h YLAT2 vs YLAT2 QTY. For clarity, N- and C-termini and large loops are deleted. https://doi.org/10.1371/journal.pone.0289644.g005

Fig 6. Natural mutations of QTY-code. The native structures (green) and predicted effects of QTY and reverse-QTY mutations are shown as colored residues. Blue = benign, orange = possibly damaging with low confidence, red = damaging with high confidence. a EAA1, b EAA2, c EAA3, d EAA4, e VGLUT1, f VGLUT2, g VGLUT3, h YLAT2. For clarity, N- and C-termini and large loops are deleted. https://doi.org/10.1371/journal.pone.0289644.g006

Table 1. Characteristics of native glutamate transporters and their water-soluble QTY variants.
Root-mean-square deviation (RMSD) in Å; isoelectric point (pI); molecular weight (MW); transmembrane (TM); - = not applicable. The internal and external loops have no changes, the overall changes are significant, and the TM changes are rather large. https://doi.org/10.1371/journal.pone.0289644.t001

Table 4. Natural mutations of Q->L, T->I, Y->F in glutamate transporters (no T->V mutations; a single base mutation at the second position of the codons).