An Extended Nomenclature for Mammalian V-ATPase Subunit Genes and Splice Variants

The vacuolar-type H+-ATPase (V-ATPase) is a multisubunit proton pump that is involved in both intra- and extracellular acidification processes throughout the body. Multiple homologs and splice variants of V-ATPase subunits are thought to explain its varied spatial and temporal expression pattern in different cell types. Recently, subunit nomenclature was standardized, with a total of 22 subunit variants identified. However, this standardization did not accommodate the existence of splice variants and is therefore incomplete. Thus, we propose here an extension of subunit nomenclature along with a literature and sequence database scan for additional V-ATPase subunits. An additional 17 variants were identified in a literature search, while 4 uncharacterized potential subunit variants were found in sequence databases. These findings have been integrated with the current V-ATPase knowledge base to create a new V-ATPase subunit catalogue. We envision that this catalogue will form a new platform on which future studies into tissue- and organelle-specific V-ATPase expression, localization and function can be based.


Introduction
The vacuolar-type H+-ATPase (V-ATPase) is a proton pump found in all nucleated cells of the body. Partially embedded within the membrane, its function is to transfer two hydrogen ions out of the cytoplasm at the expense of one ATP molecule [1]. It is thereby able to establish a proton gradient within the lumen of organelles such as lysosomes, endosomes and the trans-Golgi network. Organelle acidification is important for a diverse array of functions including intracellular trafficking and protein degradation [2][3][4].
The V-ATPase is also functionally important at the plasma membrane of specialized cell types in certain tissues. There, it is responsible for critical homeostatic functions such as body acid-base regulation (renal proximal tubule and collecting duct intercalated cells), bone remodeling (osteoclasts), and sperm storage and maturation (clear cells in the epididymis) [2,5,6], as well as other potential functions in other organs.
Loss of V-ATPase activity due to subunit mutations in specific cell types has been implicated in diverse pathophysiological states such as kidney and bone disease, sensorineural deafness and wrinkly skin syndrome [7][8][9][10]. Thus V-ATPase functions in a very broad array of physiological processes.
The V-ATPase is a large (800 kDa) and complex molecular motor. It is made up of at least 13 individual protein subunits organized into two functional domains: V0 and V1 [1,4,11,12]. The V0 domain is composed of several transmembrane subunits that are involved in hydrogen ion translocation across the bilayer, while the V1 domain is peripheral to the membrane and is the site of ATP hydrolysis. V0 is composed of 5 subunits labeled a to e, while V1 consists of 8 subunits denoted A to H. All 13 different subunits are encoded by separate genes located throughout the genome.
Many of the 13 V-ATPase subunits exist as homologs, thereby adding another level of complexity to the motor. The diverse functions and locations of V-ATPases are believed to be encoded within the various homologs. For example, the d1 subunit is ubiquitously expressed, while the d2 homolog is seen only in the kidney, osteoclast and lung [13]. Similarly, two isoforms of the B subunit were initially described as so-called "kidney" (B1) and "brain" (B2) specific isoforms, although it is now clear that expression of B1 is not restricted to the kidney [12]. Currently, homologs have been identified for the a, d, e, B, C, E and G subunits [11,14].
An additional level of V-ATPase subunit variation is encoded through splice variants. To date, splice variants have been identified for the a, d, e, C, G and H subunits [9,13-23]. Just like homologs, splice variants have been shown to exhibit different expression patterns. For example, two splice variants of subunit a1 are expressed in rat neurons: one variant localized to axonal varicosities while the other was sorted to distal dendrites and axons [17].
To date, discovery of V-ATPase homologs and splice variants has largely been subunit focused and experimentally based. This method has proven successful as demonstrated by the large number of identified variants. However, this fragmented discovery process led to a fragmented naming system which was recently standardized [11]. A large effort was undertaken to associate the multiple names of a given subunit to one new standard nomenclature, but extension of this nomenclature to include splice-variants was not systematically pursued at that time. Thus, here we have performed a literature search of all V-ATPase subunits, their homologs and splice variants. In addition we scanned both the RefSeq nucleotide and protein databases [24] to identify any novel subunits that have been deposited within these databases. The results of this analysis are presented here.

Results and Discussion
At the time of the first V-ATPase nomenclature standardization, splice variants were not incorporated [11]. Thus, we now suggest augmenting the naming policy to allow for the differentiation of splice variants. In accordance with HUGO Gene Nomenclature Committee (HGNC) specifications, we propose the addition of a "v[1..x]" suffix to the relevant gene symbols, and "i[1..x]" to the predicted proteins. For example, ATP6V0A1v1 and ATP6V0A1v2 would differentiate between splice variants 1 and 2 of subunit a1. As is the case at present, we propose that if new homologs of subunits with only a single currently known isoform are identified in the future, they should be named in numerical order (e.g., ATP6V1F will become ATP6V1F1 if a new isoform, which will become ATP6V1F2, is discovered). We also propose that the first full-length "known" subunit splice variant should be named "v1", allowing for consistent naming of any additional variants discovered in the future. For subunits with only one known variant, the v1 suffix should be appended to the existing name only if and when a second variant (which will be named v2) is identified. In view of our extensive database search we believe, however, that this possibility is unlikely.
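The naming rules proposed above can be illustrated with a minimal sketch. The function names (`variant_symbol`, `split_unique_homolog`, `parse_symbol`) are hypothetical helpers introduced here for illustration only; they are not part of any HGNC tool, and the parsing assumes symbols are written exactly as in the examples above.

```python
import re


def variant_symbol(gene, variant, protein=False):
    """Append 'v<n>' (gene/transcript) or 'i<n>' (predicted protein) to a symbol."""
    if variant < 1:
        raise ValueError("variant numbers start at 1")
    suffix = "i" if protein else "v"
    return f"{gene}{suffix}{variant}"


def split_unique_homolog(gene):
    """Names used when a second homolog is found for a single-homolog subunit.

    Example: ATP6V1F becomes ATP6V1F1, and the new homolog becomes ATP6V1F2.
    """
    return f"{gene}1", f"{gene}2"


def parse_symbol(symbol):
    """Split an extended symbol into (gene, kind, number).

    kind is 'v' or 'i'; symbols without a variant suffix give (symbol, None, None).
    Matching is case-sensitive, so the upper-case 'V' in 'ATP6V0A1' is never
    mistaken for the lower-case variant marker.
    """
    match = re.fullmatch(r"(.+?)([vi])(\d+)", symbol)
    if match is None:
        return symbol, None, None
    return match.group(1), match.group(2), int(match.group(3))


print(variant_symbol("ATP6V0A1", 1))                # ATP6V0A1v1
print(variant_symbol("ATP6V0A1", 2, protein=True))  # ATP6V0A1i2
print(split_unique_homolog("ATP6V1F"))              # ('ATP6V1F1', 'ATP6V1F2')
print(parse_symbol("ATP6V1F1v2"))                   # ('ATP6V1F1', 'v', 2)
```

Keeping the variant marker in lower case is what makes the suffix unambiguous to parse, since the gene symbols themselves are entirely upper case and digits.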
The V-ATPase proton pump is composed of 13 subunits. According to the data presented in Smith et al. (2003), these 13 are encoded through 22 homologs. These are denoted with a "K" for known in Table 1. A simple search of the literature and the Entrez Gene database [24] reveals another 17 variants. These are denoted with either an "E" for Entrez or "L" for literature in Table 1. All of these additional subunits are splice variants except for e2, which was cloned after the publication of Smith et al. (2003). Our computational analysis of sequence repositories has identified 4 additional uncharacterized potential subunit variants (Figure 1).
Table 1 footnotes. #No human transcript could be identified in RefSeq, so the rat ortholog is provided. *If a novel homolog is discovered for a subunit with no known homologs, then the current homolog will be denoted -1 and the novel homolog denoted -2. For example, ATP6V1F will become ATP6V1F1 and the novel homolog ATP6V1F2. If additional splice variants of these new homologs are then discovered, the terminology will become, for example, ATP6V1F1v1 and ATP6V1F1v2. doi:10.1371/journal.pone.0009531.t001
Finally, we propose that accessory proteins (whose functions related to the V-ATPase remain unknown) that also use the ATP6 nomenclature, including ATP6AP1 and ATP6AP2, should be included in the revised nomenclature scheme, and the appropriate suffixes should be added to their names if additional isoforms and splice variants emerge in the future. However, we have also examined these sequences and, apart from pseudogenes, found no variants. Experimentally identifying and characterizing the associated functional proteins was outside the scope of this study but should be the subject of future work by investigators interested in V-ATPase function, including our own groups.
The results of this study suggest that the V-ATPase is regulated in a much more complex manner than is currently assumed. At what level this regulation is exerted remains to be determined experimentally, and having a complete systems overview of all V-ATPase components will help expedite this process. This almost two-fold increase in the number of V-ATPase subunit forms demonstrates how the output of broad genomic scale projects can be utilized for a specialized pursuit. It also highlights the importance of computational methods for sifting and sorting through vast amounts of data deposited in sequence databases worldwide. Identification and standardization of transcript variation offers a powerful approach to guide the future assignation of functional significance among protein variants.
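The "almost two-fold increase" can be checked against the counts given in the abstract. The short tally below is only a sketch: it assumes that the 17 literature/Entrez variants and the 4 database-only candidates do not overlap with each other or with the 22 previously known forms, and the variable names are illustrative.

```python
# Counts as stated in the abstract.
known_forms = 22             # standardized in the earlier nomenclature
literature_or_entrez = 17    # additional variants from literature / Entrez Gene
database_candidates = 4      # uncharacterized potential variants from sequence databases

# Assumption: the three groups are disjoint, so they can simply be summed.
total_forms = known_forms + literature_or_entrez + database_candidates
fold_increase = total_forms / known_forms

print(total_forms)               # 43
print(round(fold_increase, 2))   # 1.95
```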

Materials and Methods
Sequences corresponding to the 13 human V-ATPase subunits described previously [11] were retrieved from the HUGO Gene Nomenclature Committee (HGNC) database [25]. These sequences were used to query the RefSeq non-redundant protein and nucleotide databases [24] with the BLAST algorithm [26]. Various modules from the Bioperl toolkit were utilized to process the resulting BLAST output [27]. The HGNC V-ATPase identifiers were used to query the SpliceCenter databases [28]. The results of this analysis were manually integrated with the BLAST results described above. This integrated catalogue is presented in Table 1 and Figure 1.
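As an illustration of the BLAST post-processing step, the sketch below parses BLAST tabular output (the standard 12-column layout) and groups candidate hits by query. It is not the Bioperl pipeline used in this study: it substitutes the Python standard library, and the identity and E-value thresholds, the function names, and the identifiers in the example are hypothetical placeholders rather than values taken from the analysis.

```python
import csv
import io

# Standard 12 columns of BLAST tabular output.
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_blast_tabular(handle):
    """Yield one dict per hit, skipping blank lines and '#' comment lines."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(BLAST_COLUMNS, row))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        for key in ("length", "mismatch", "gapopen",
                    "qstart", "qend", "sstart", "send"):
            hit[key] = int(hit[key])
        yield hit


def candidate_variants(hits, min_identity=90.0, max_evalue=1e-10):
    """Group subject IDs by query, keeping hits that pass the thresholds.

    The default thresholds are hypothetical, chosen only for this example.
    """
    grouped = {}
    for hit in hits:
        if hit["pident"] >= min_identity and hit["evalue"] <= max_evalue:
            grouped.setdefault(hit["qseqid"], []).append(hit["sseqid"])
    return grouped


# Made-up example rows; the identifiers are placeholders, not real accessions.
example = (
    "query_a1\tsubject_1\t99.5\t800\t4\t0\t1\t800\t1\t800\t0.0\t1500\n"
    "query_a1\tsubject_2\t45.0\t300\t150\t5\t1\t300\t10\t310\t1e-05\t80.0\n"
)
grouped = candidate_variants(parse_blast_tabular(io.StringIO(example)))
print(grouped)  # {'query_a1': ['subject_1']}
```

Hits retained by such a filter would still need the manual integration with SpliceCenter results described above before being assigned a variant name.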