Identification of TENP as the Gene Encoding Chicken Egg White Ovoglobulin G2 and Demonstration of Its High Genetic Variability in Chickens

Ovoglobulin G2 (G2) has long been known as a major protein constituent of chicken egg white. However, little is known about the biochemical properties and biological functions of G2 because the gene encoding it has not been identified. Identifying this gene and analyzing its genetic variability is therefore an important step toward understanding the biological functions of the G2 protein and its utility in poultry production. To identify and characterize the gene encoding G2, we separated G2 from egg white by electrophoresis on a non-denaturing polyacrylamide gel. Two polymorphic forms of G2 protein (G2A and G2B), with different mobilities (fast and slow, respectively), were detected by staining. The protein band corresponding to G2B was electro-eluted from the native gel and re-electrophoresed under denaturing conditions, and its N-terminal sequence was determined by Edman degradation following transfer onto a membrane. Sequencing of the 47 kDa G2B band revealed it to be identical to TENP (transiently expressed in neural precursors), also known as BPI fold-containing family B, member 2 (BPIFB2), a protein with strong homology to the bactericidal/permeability-increasing protein (BPI) family in mammals. Full-length chicken TENP cDNA sequences were determined for 78 individuals across 29 chicken breeds, lines, and populations, and eleven non-synonymous substitutions were detected in the coding region. Of these, G329A, leading to Arg110Gln, was completely associated with the differential electrophoretic mobility of G2. Specifically, G2B, with the slower mobility, is encoded by G329 (Arg110), whereas G2A, with the faster mobility, is encoded by A329 (Gln110). The coding-region sequence data also revealed that the gene encoding G2 shows considerable genetic variability across chicken breeds, lines, and populations. These variants, and how they correlate with egg white properties, may allow us to further understand the functions of G2.


Ethics statement
Animal care and all experimental procedures were approved by the Animal Experiment Committee, Graduate School of Bio-agricultural Sciences, Nagoya University (approval no. 2014021202), and the experiments were conducted according to the 'Regulations on Animal Experiments at Nagoya University'.

Electrophoresis of the G2 Protein and Detection of Polymorphisms
In total, 285 egg white samples were collected from 27 chicken breeds, lines, and populations, including a closed colony of red jungle fowl (Table 1). These samples were subjected to electrophoresis on non-denaturing polyacrylamide gels (38 × 38 × 1 mm) according to the method of Davis [23]. Briefly, thin egg white samples were diluted with 4 volumes of dilution buffer (2 ml of 0.5 M Tris-HCl [pH 6.8], 1.6 ml glycerol, and 0.4 ml of 0.05% [wt/vol] bromophenol blue), and electrophoresis was performed at 4°C at a constant current of 20 mA for 30 min on a 4.5% stacking gel and at 50 mA for 3.5 h on an 8% separating gel with 25 mM Tris-192 mM glycine buffer (pH 8.3). The separating gel was stained with 0.125% (wt/vol) Coomassie brilliant blue R-350 (CBB-R350) in methanol:acetic acid:water (40:7:53), and the protein bands were detected following destaining in methanol:acetic acid:water (25:10:65). Polymorphisms in G2 were detected as differentially migrating protein bands corresponding to G2.

Extraction of G2 Proteins from Gels
Fresh thin egg white was collected from two Ehime-Jidori hens that exhibited the two different electrophoretic forms of G2 (G2A and G2B). G2 bands were visualized by native-PAGE analysis as described above. The G2 bands were excised from the gel and extracted with 25 mM Tris-192 mM glycine buffer (pH 8.3) at 4°C using an electroeluter (Bio-Rad Laboratories, Hercules, CA, USA). The eluted samples were desalted by dialysis against distilled deionized water for 16 h at 4°C. The samples were subsequently lyophilized and stored at −80°C until analysis.

SDS-PAGE and the N-terminal Amino Acid Sequencing of G2 Protein
The G2B sample extracted from the gel was mixed with an equal volume of SDS sample buffer (2 ml of 0  [24], and the gel was stained and then destained using the same method as described for native gel electrophoresis. The G2B protein on the gel was electroblotted onto a polyvinylidene difluoride (PVDF) membrane using a semi-dry blotter system at a constant current of 20 mA for 1.5 h, as described by Towbin et al. [25]. After transfer, the membrane was stained with 0.1% CBB-R350, 40% methanol, and 1% acetic acid for a few minutes and destained in 50% methanol until the bands became clearly visible. The G2B protein on the PVDF membrane was placed directly into a Model 492 Procise 1 cLC capillary protein sequencer (Applied Biosystems, Carlsbad, CA, USA) for automatic Edman degradation analysis according to the manufacturer's protocol. The partial amino acid sequence obtained by N-terminal protein sequencing was used to search for homologous proteins with a translated BLAST search (tblastn) on the NCBI BLAST homepage (http://www.ncbi.nlm.nih.gov/BLAST/) [26].

RT-PCR and Nucleotide Sequencing of G2 cDNAs from Different Chicken Sources
In total, 78 individuals, consisting of 21 egg-laying hens and 57 embryos, were used to sequence the cDNAs encoding the G2 protein (S1 Table). Hens were sacrificed by cervical dislocation, and a small piece of the oviduct magnum tissue was taken from each animal and quickly placed in RNAlater (Ambion, Austin, TX, USA). Whole embryos, lacking the head, were taken from eggs after five days of incubation and placed in RNAlater. Total RNA was extracted from these samples using TRIzol reagent according to the manufacturer's instructions (Invitrogen, Carlsbad, CA, USA). The quality of the RNA was evaluated electrophoretically on a 2% agarose gel. Total RNA (1.5 μg) was reverse transcribed using a PrimeScript RT-PCR Kit (Takara, Otsu, Japan) according to the manufacturer's instructions. The PCR reaction mixture contained 20-50 ng of cDNA, 1× KOD FX buffer, 200 μM of each dNTP, 1.5 μl each of the forward and reverse primers (10 μM each), and 1 unit of KOD FX DNA polymerase (TOYOBO, Osaka, Japan) in a final volume of 50 μl. The PCR was performed on a GeneAmp PCR system 9700 (Applied Biosystems) with the following program: initial denaturation for 2 min at 94°C; 35 cycles of 98°C for 10 s, 65°C for 30 s, and 68°C for 1.5 min; and final extension at 72°C for 7 min. The PCR products were purified from the gel using a Gel-M gel extraction kit (Viogene, Umeå, Sweden), and nucleotide sequences were determined using an ABI PRISM 3130 DNA Analyzer after completion of a sequencing reaction using a Big Dye Terminator Cycle Sequencing Kit v3.1 (Life Technologies-Applied Biosystems). Sequence data sets were analyzed using ATGC sequence assembly software (Ver.5) (Genetyx, Tokyo, Japan).

Genotyping of the G2 Gene
Genotyping of the gene encoding G2 was performed for a wide variety of chicken breeds, lines, and populations using a PCR-RFLP (restriction fragment length polymorphism) method (detailed below). Whole blood samples (1-5 ml) were collected from the wing vein of each bird using heparinized syringes, and genomic DNA was extracted from 10 μl of whole blood using the DNAZOL BD reagent (Molecular Research Center, Inc., Cincinnati, OH, USA). PCR products were digested with restriction endonucleases, electrophoresed on a 2% agarose gel, and the resultant DNA bands were visualized by staining with ethidium bromide.

Isolation of Ovoglobulin G2 Variants
Electrophoretic polymorphisms in egg white G2 protein were surveyed for 27 chicken breeds, lines, and populations (Table 1). Two G2 protein bands with differing mobility (namely, G2A and G2B) were detected in egg white by native-PAGE (Fig 1); G2A had a faster mobility than G2B. Egg white homozygous for either G2A or G2B showed a single G2 band of the corresponding electrophoretic mobility, whereas egg white heterozygous for G2 showed both G2 bands. Of the 27 populations used in this study, 21, including red jungle fowl (RJF/NU), were monomorphic at the G2 locus, having the phenotype G2B/G2B. Ukkokei (SIL) was the only population with a G2A/G2A phenotype, and the remaining five populations, including Ehime-Jidori (EJ), were polymorphic (Table 1). EJ chickens were selected as a source of G2 because they carry both G2A and G2B and have been maintained as a closed population. The two G2 forms were identified based on their differing electrophoretic mobility on 8% native gels (Fig 2; lanes 1, 2), with G2A having the faster mobility. The two forms were electroeluted from the gel, and each eluted sample was confirmed to be a single band by re-electrophoresis using 8% native-PAGE (Fig 2; lanes 3, 4). The monomeric molecular mass of G2 was estimated to be approximately 47 kDa using 10% SDS-PAGE (Fig 3).

N-terminal Amino Acid Sequence of G2
We excised the G2B protein band and used it to determine the N-terminal amino acids of the G2 protein by Edman degradation. The first fifteen N-terminal amino acids of the G2B protein were TRAPDCGGILTPLGL. The purity of the samples was confirmed by the absence of significant background interference during sequencing. A tblastn search revealed that the N-terminal amino acid sequence of G2B was identical to amino acid residues 14-28 of chicken TENP (transiently expressed in neural precursors) (Accession no. AF029841), so named because it is transiently expressed prior to overt cell differentiation of neural precursor cells during neurogenesis in chicken embryos [27]. The N-terminal sequence of the G2 protein determined in this study did not contain the first 13 N-terminal amino acid residues (MGALLALLDPVQP) of TENP, likely because this sequence is a processed signal peptide, as predicted by SOSUIsignal (http://bp.nuap.nagoya-u.ac.jp/sosui/sosuisignal/) [28]. The signal cleavage site of G2B was identical to that of chicken TENP (Accession no. HG007958) proposed by Whenham et al. [29]. These results suggest that the egg white G2B protein is the product of the TENP gene.

Mutations Causative for G2 Electrophoretic Variants
To identify the presumptive mutation underlying the differences in the electrophoretic mobility of G2 proteins, we determined the nucleotide sequences of full-length TENP cDNAs prepared from the oviduct RNA of 21 egg-laying hens, as shown in S1 Table. In addition, the full-length cDNA sequences of TENP from 57 embryos representing 23 breeds, lines, and populations were determined. RT-PCR was performed with a pair of primers (Tenp_F1 and Tenp_R1), and internal primers were used for direct sequencing with TENP cDNA as the template (Table 2). These primers were designed based on the nucleotide sequence of the chicken TENP gene (AF029841). Comparison of the TENP cDNA sequences among the 21 hens and 57 embryos identified a total of 21 SNPs: 10 synonymous substitutions at positions 312, 426, 594, 807, 843, 846, 870, 897, 1170, and 1201, and 11 non-synonymous substitutions at positions 143, 233, 238, 283, 286, 301, 329, 616, 1225, 1249, and 1253 (S1 Table, Table 3, Fig 4). The eleven non-synonymous substitutions were Thr48Met, Ser78Leu, Ile80Val, Val95Ile, Thr96Ala, Val101Met, Arg110Gln, Ala206Thr, Met409Leu, Val417Ile, and Ser418Asn. The sequences obtained in this study were deposited in the DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp/index-e.html; accession numbers AB219157–AB219159 and LC144559–LC144608). We examined the association of the non-synonymous substitutions in the TENP gene with the electrophoretic mobility differences in the G2 protein. Two of the eleven non-synonymous substitutions, G329A in exon 4 (Arg110Gln) and G616A in exon 7 (Ala206Thr), corresponded to the egg white G2 phenotype: in the 21 hens examined, A329 (Gln110) and A616 (Thr206) were carried by the G2A allele, and G329 (Arg110) and G616 (Ala206) by the G2B allele.
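The correspondence between the reported nucleotide positions and residue numbers can be checked with a few lines of arithmetic. The sketch below assumes that cDNA positions are counted from the A of the ATG start codon (position 1) and that the positions and residue changes are paired in the order in which the text lists them; both are assumptions of this illustration, and the function names are hypothetical.

```python
import re
from typing import Tuple

# Non-synonymous cDNA positions and residue changes as listed in the text.
# Pairing by listing order is an assumption of this sketch.
REPORTED = {
    143: "Thr48Met", 233: "Ser78Leu", 238: "Ile80Val", 283: "Val95Ile",
    286: "Thr96Ala", 301: "Val101Met", 329: "Arg110Gln", 616: "Ala206Thr",
    1225: "Met409Leu", 1249: "Val417Ile", 1253: "Ser418Asn",
}


def codon_of(nt_pos: int) -> Tuple[int, int]:
    """Return (codon number, position within codon) for a 1-based CDS position."""
    codon = (nt_pos - 1) // 3 + 1
    offset = (nt_pos - 1) % 3 + 1
    return codon, offset


def residue_number(change: str) -> int:
    """Extract the residue number from a change such as 'Arg110Gln'."""
    return int(re.search(r"\d+", change).group())


def all_positions_consistent() -> bool:
    """True if every nucleotide position maps to the residue named in its change."""
    return all(
        codon_of(pos)[0] == residue_number(change)
        for pos, change in REPORTED.items()
    )


if __name__ == "__main__":
    for pos, change in REPORTED.items():
        codon, offset = codon_of(pos)
        print(f"nt {pos:>4} -> codon {codon:>3}, base {offset} ({change})")
    print("all consistent:", all_positions_consistent())
```

Under these assumptions, position 329 is the second base of codon 110. This agrees with the standard genetic code, in which an arginine codon (CGA or CGG) becomes a glutamine codon (CAA or CAG) through a G-to-A change at the second base.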
An N-linked glycosylation site is predicted at Asn265 in the G2 protein by the NetNGlyc 1.0 Server (http://www.cbs.dtu.dk/services/NetNGlyc/) [30]. However, no amino acid substitutions were found at this position, indicating that N-linked glycosylation is most likely not associated with the differences in G2 electrophoretic mobility.

Association between Nucleotide Substitutions in TENP and Electrophoretic Polymorphism in G2
To confirm the association between the two non-synonymous substitutions in the TENP gene, G329A and G616A, and the electrophoretic mobility difference in egg white G2, nucleotide sequences at the two substitution sites and electrophoretic mobility were examined for 412 individuals from 26 different chicken breeds, lines, and populations, including a colony of red jungle fowl, as shown in Table 4. A 355-bp fragment including exon 4 and a 506-bp fragment including exon 7 were amplified separately from genomic DNA using the Tenp_ex3F/Tenp_int4R and Tenp_int6F/Tenp_ex8R primer pairs, respectively (Table 2). The primers used for amplification and sequencing were designed based on a reference sequence from chicken chromosome 20 (NC_006107) taken from the genome assembly Gallus_gallus-4.0/galGal4 (http://www.ncbi.nlm.nih.gov/assembly/GCF_000002315.3/). The 355-bp and 506-bp PCR products were digested with the BcnI and HphI restriction enzymes, respectively, and electrophoresed.
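The principle behind PCR-RFLP genotyping is that a single-base change creates or destroys a restriction site, so the alleles give different fragment patterns after digestion. The sketch below only illustrates that principle. The recognition sequences used (CCSGG for BcnI, where S is C or G, and GGTGA for HphI) are the commonly catalogued ones and are an assumption here, since the text does not state them; the short test sequences are hypothetical and are not taken from TENP, and the text does not say which allele each enzyme cuts.

```python
import re
from typing import List

# IUPAC nucleotide codes needed for the sites below, as regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "[CG]", "W": "[AT]", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

# Recognition sequences assumed from standard enzyme catalogues.
BCNI_SITE = "CCSGG"
HPHI_SITE = "GGTGA"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def site_positions(seq: str, site: str) -> List[int]:
    """0-based start positions (top-strand coordinates) of a recognition site
    found on either strand of seq."""
    seq = seq.upper()
    # Lookahead so that overlapping occurrences are all reported.
    pattern = re.compile("(?=" + "".join(IUPAC[b] for b in site) + ")")
    forward = {m.start() for m in pattern.finditer(seq)}
    rc = reverse_complement(seq)
    reverse = {len(seq) - m.start() - len(site) for m in pattern.finditer(rc)}
    return sorted(forward | reverse)


if __name__ == "__main__":
    # Hypothetical toy amplicons differing by a single G/A base.
    allele_with_site = "ATGCCCGGTA"
    allele_without_site = "ATGCCCAGTA"
    print(site_positions(allele_with_site, BCNI_SITE))
    print(site_positions(allele_without_site, BCNI_SITE))
```

In a real assay, a homozygote for the cut allele would show only digested fragments, a homozygote for the uncut allele only the intact product, and a heterozygote both.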

Discussion
Previously, G2 was fractionated from chicken egg white using basic protein isolation methods such as ammonium sulfate precipitation and/or chromatographic separation, and basic properties such as molecular weight and isoelectric point (pI) were characterized [13,14,16]. Additionally, more than 100 egg white protein components have been identified by proteomic analysis [7,8]; however, the G2 protein remained unidentified because its primary amino acid sequence was unknown [13,14,15,16,17]. G2 can be easily separated from other egg white proteins using starch and/or non-denaturing acrylamide gel electrophoresis. Two principal polymorphic forms (G2A and G2B) with different electrophoretic mobilities are found in a variety of chicken populations [18,19,20,22,31,32,33,34]. Electrophoretic separation is, therefore, a convenient and reliable means of separating the G2 protein from other egg white proteins. Further, the presence of two polymorphic forms is useful for confirming whether G2 and a putative candidate gene correspond. In this study, we separated the polymorphic forms of the G2 protein as single protein bands following electrophoresis of egg white proteins using native-PAGE. The two polymorphic forms (G2A and G2B) were electroeluted directly from the gel, and when re-electrophoresed under denaturing conditions by SDS-PAGE, both forms had a molecular weight of 47 kDa. This value compares favorably with those reported in previous studies (49 kDa, Nakamura et al. [14]; 47 kDa, Stevens and Duncan [16]) but differs substantially from that reported by Feeny (35 kDa) [13]. The first fifteen N-terminal amino acid residues (TRAPDCGGILTPLGL) of the G2 protein corresponded completely to those at positions 14 to 28 of TENP (AF029841), reported by Yan and Wang [27].
SOSUIsignal (http://bp.nuap.nagoya-u.ac.jp/sosui/sosuisignal/) predicted that the first 13 N-terminal amino acid residues (MGALLALLDPVQP) of TENP (AF029841) correspond to a signal peptide. This putative G2 signal cleavage site was identical to that of chicken TENP (HG007958) proposed by Whenham et al. [29], and the N-terminal amino acid residues of the chicken G2 protein were also homologous to those of the emu (Dromaius novaehollandiae) TENP (AB556937) [35]. The theoretical molecular weight and pI values of the two TENP forms (426 amino acids, excluding the first 13 amino acid residues) were calculated to be 47.4 kDa and 5.67, respectively, for the G2B allele (AB219158; Kinoshita, submitted to GenBank, 23 Jun 2005) and 47.4 kDa and 5.56, respectively, for the G2A allele (AB219157; Kinoshita 2005), using the Expasy Compute pI/Mw tool (http://web.expasy.org/compute_pi/). These predicted molecular mass values agree well with the value obtained by SDS-PAGE of the native G2 protein (47 kDa). Our results, therefore, provide strong evidence that the egg white ovoglobulin G2 and the protein product of the TENP gene found in chicken embryos are identical proteins encoded by the same gene.
Chicken TENP was first identified as an early embryonic protein that is transiently expressed in neural precursor cells in the retina and brain [27]. Subsequently, it was also identified as a component of egg white, vitelline membrane, eggshell, and egg yolk in chickens by proteomic analysis [7,8,36,37,38]. The chicken TENP gene consists of 16 exons encoding 439 amino acids and is located at position 10,642,930–10,647,543 on chicken chromosome 20 (NC_006107.3). TENP is a member of the BPI (bactericidal/permeability-increasing protein) fold-containing family B (BPIFB) [39], and chicken TENP was initially classified as a new family member, BPIFB7 (BPI fold-containing family B member 7) [39,40]. Chicken BPIFB7 shows the highest homology with mammalian BPIFB2 (BPI fold-containing family B member 2; also known as BPIL1, bactericidal/permeability-increasing protein-like 1), which is highly expressed in the tonsils in Homo sapiens [41]. Therefore, chicken TENP is now officially named BPIFB2 (NCBI gene ID 395882). Human BPIFB2 is an antibacterial and endotoxin-neutralizing protein and is a key component of the innate immune system involved in defense against bacteria. It binds and neutralizes bacterial lipopolysaccharide and thereby abolishes the bioactivity of these toxic bacterial products [41,42,43,44,45]. A TENP homologue has also been isolated from emu egg white as a major protein; it exhibited antimicrobial activity against gram-positive bacteria, such as Micrococcus luteus and Bacillus subtilis, but not against gram-negative bacteria, such as Escherichia coli and Salmonella typhimurium [35]. These results collectively suggest that this protein may have multiple functions as a component of a BPI-like innate immune system during early avian embryonic development [29].
The amino acid sequences of TENP are conserved among four avian species: chicken (Gallus gallus, Phasianidae, Galliformes), duck (Anas platyrhynchos, Anseriformes), zebra finch (Taeniopygia guttata, Passeriformes), and emu (Dromaius novaehollandiae, Struthioniformes). Amino acid sequence identities, including gaps in the regions equivalent to the chicken G2 protein (439 aa) (GenBank accession no. AB219158), were 72.0% (316/439) with A. platyrhynchos (XP_005011070), 62.4% (274/439) with D. novaehollandiae (AB556937), and 59.2% (260/439) with T. guttata (XP_012425675). The amino acid substitutions tended to cluster in the N-terminal region (Fig 7). Most of the amino acid substitutions detected in the chicken were not located in highly conserved regions. The G2 protein has no N-terminal modification and seems to be less stable in egg white; therefore, the amino acid substitutions may affect egg white stability and/or defense against bacteria rather than altering essential functions of the protein, as has been suggested in previous studies [46]. Our previous studies have shown that a non-synonymous substitution that changes the net charge of the protein is responsible for electrophoretic polymorphism in both chicken ovalbumin (OV) and Japanese quail lysozyme (LYZ) [22,47]. In this study, we discovered a non-synonymous substitution (G329A), leading to Arg110Gln in the TENP protein, which uniquely discriminates G2A from G2B by the difference in electrophoretic mobility. Generally, the net charge of a protein depends on its amino acid composition as well as on post-translational modifications such as the addition of sialic acids and phosphate groups.
The Arg110Gln substitution is expected to reduce the total positive charge on G2A by one unit, because a positively charged hydrophilic amino acid (Arg) is replaced by an uncharged hydrophilic polar amino acid (Gln) at residue 110; the electrophoretic mobility of G2A on the native gel is therefore faster than that of G2B. In contrast, the other ten amino acid substitutions found (Table 3) are electrically neutral and as such probably do not affect the net charge on G2, so we do not expect them to cause a difference in G2 electrophoretic mobility.
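The size of the charge change can be estimated with the Henderson-Hasselbalch relation. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it takes the running buffer pH of 8.3 from the Methods, assumes a textbook side-chain pKa of about 12.5 for arginine (the value in the folded protein may differ), and treats the glutamine side chain as uncharged. The function name is hypothetical.

```python
RUNNING_PH = 8.3   # Tris-glycine running buffer pH given in the Methods
ARG_PKA = 12.5     # assumed textbook pKa of the arginine guanidinium group


def basic_side_chain_charge(ph: float, pka: float) -> float:
    """Mean charge of a basic side chain: fraction protonated at the given pH
    (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


# Arg is almost fully protonated at pH 8.3; Gln carries no ionizable group.
charge_arg = basic_side_chain_charge(RUNNING_PH, ARG_PKA)
charge_gln = 0.0

# Net charge change of the protein caused by Arg110Gln at the running pH.
delta_charge = charge_gln - charge_arg

if __name__ == "__main__":
    print(f"Arg side chain charge at pH {RUNNING_PH}: {charge_arg:+.4f}")
    print(f"Charge change for Arg110Gln: {delta_charge:+.4f}")
```

With these values the change is very close to one full unit of positive charge lost, which is consistent with the faster migration of G2A toward the anode and with its slightly lower predicted pI.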
The G2B allele was found to be predominant in almost all of the chicken breeds surveyed in this study. The G2A allele was found in Japanese native breeds (namely, Ehime-Jidori, Chabo, Chahan, and Ukkokei) and in two commercial layers (white and brown egg). In general, amino acid residues have different functional roles in a protein, contributing to its activity, stability, and folding. Thus, the amino acid substitutions found in this study may influence the function of ovoglobulin G2. However, the effects of these substitutions on the biochemical properties of the G2 protein remain unknown. Several reports have suggested a correlation between variations in the G2 protein and economic traits such as egg production, egg weight, shell thickness, and hatchability [48,49,50,51,52], as well as the embryonic fatality rate [49]. In addition, ovoglobulins have been reported to be important for egg white foaming quality and viscosity [9,10], although their biological and food chemical functions have not been clearly elucidated. The availability of a wide range of chicken resources should facilitate the identification of novel mutations in the gene encoding G2, which could help establish relationships between G2 protein variants and favorable economic traits, such as resistance to disease and parasites, egg white and shell quality, and hatchability, all of which are of potential importance to future poultry production. Further studies examining the correlation of TENP haplotypes with various egg white traits using genetically diverse chicken resources are needed to better understand the physiological functions of TENP in the chicken.

Supporting Information
S1 Table. Missense Mutations Found in the Coding Region of TENP cDNA Prepared from Seventy Eight Individual Samples Representing 28 Different Chicken Breeds, Lines, and Populations. cDNAs were prepared from RNA isolated from 21 chicken oviduct samples (representing six chicken breeds, lines, and populations) and their nucleotide sequences were determined. cDNAs were also prepared from 57 different embryos (representing 23 chicken breeds, lines, and populations) and their respective nucleotide sequences were also determined. Y, cytosine or thymine; R, adenine or guanine; M, adenine or cytosine. (TIF)