PAX6 Haplotypes Are Associated with High Myopia in Han Chinese

Background: The paired box 6 (PAX6) gene is considered a master gene for eye development. Linkage of myopia to the PAX6 region on chromosome 11p13 has been shown in several studies, but the results for association between myopia and PAX6 have so far been inconsistent. Methodology/Principal Findings: We genotyped 16 single nucleotide polymorphisms (SNPs) in the PAX6 gene and its regulatory regions in an initial study of 300 high myopia cases and 300 controls (Group 1), and successfully replicated the positive results in another independent group of 299 high myopia cases and 299 controls (Group 2). Five SNPs were genotyped in the replication study. The spherical equivalent of subjects with high myopia was ≤−8.0 dioptres. The PLINK package was used for genetic data analysis. No association was found between any individual SNP and high myopia. However, exhaustive sliding-window haplotype analysis highlighted an important role for rs12421026 because haplotypes containing this SNP were found to be associated with high myopia. The most significant results were given by the 4-SNP haplotype window consisting of rs2071754, rs3026393, rs1506 and rs12421026 (P = 3.54×10−10, 4.06×10−11 and 1.56×10−18 for Group 1, Group 2 and Combined Group, respectively) and the 3-SNP haplotype window composed of rs3026393, rs1506 and rs12421026 (P = 5.48×10−10, 7.93×10−12 and 6.28×10−23 for the three respective groups). The results remained significant after correction for multiple comparisons by permutations. The associated haplotypes found in a previous study were also successfully replicated in this study. Conclusions/Significance: PAX6 haplotypes are associated with susceptibility to the development of high myopia in Chinese. The PAX6 locus plays a role in high myopia.


Introduction
Myopia is the most common human eye disorder in the world and has become a significant public health problem [1,2]. High myopia, typically defined as a refractive error of −6.0 diopters (D) or worse, is associated with an increased risk of pathological ocular complications [3]. Myopia is regarded as a complex eye disease affected by both genetic and environmental factors as well as gene-environment interactions [4,5]. The prevalence of myopia is significantly higher in Asian populations (~70%) than in populations of European descent (~30%), especially in the younger generations in recent decades [1][6][7][8]. The same also applies to the severity of myopia. This suggests that Asian populations are genetically more susceptible to particular environmental factors which cause myopia [9]. Family and twin studies support a high heritability of myopia and hence suggest a definite genetic basis for high myopia [4,5,10,11]. Many myopia loci have been mapped by linkage studies [4,5]. Quite a number of myopia susceptibility genes have also been identified by our group [12][13][14][15] and other groups, as has recently been reviewed [4,5]. The majority of findings are conflicting, including those for the paired box 6 (PAX6, GeneID: 5080) gene on chromosome 11.
The PAX6 gene is a member of the PAX family of transcription factors containing two DNA-binding motifs: the paired domain and the paired-type homeodomain. It is highly conserved, and essential for normal development of several organs including the brain, pancreas and the eye [16,17]. PAX6 has been considered the master gene for eye development [18]. The correct dosage of PAX6 is crucial for normal eye development: over-expression of Pax6 in mice results in microphthalmia, retinal dysplasia and defective retinal ganglion cell axon guidance [19], whereas haploinsufficiency in the mouse results in the phenotype of the small eye (Pax6+/−) or no eye (Pax6−/−) [20]. In humans, heterozygous PAX6 mutations cause aniridia as well as various other congenital eye abnormalities [21]. Of note is that most mutations are truncating or loss-of-function and cause aniridia.
PAX6 was first reported as a candidate gene for myopia in a genome-wide linkage scan for myopia susceptibility loci in a twin study, and was in fact directly below the highest peak at the 11p13 locus, the MYP7 locus [22]. The same study and two other studies, however, did not find an association between PAX6 single nucleotide polymorphisms (SNPs) and common myopia [22][23][24]. Intriguingly, three Chinese studies reported positive association between different PAX6 polymorphisms and high myopia [14,25,26]. In addition, an Australian group found rare nonsense PAX6 mutations in family members with high myopia [27]. It seems that PAX6 is more likely a susceptibility gene for high myopia than for common myopia. Even for high myopia, however, the three positive association studies suggested three different polymorphisms [14,25,26], and the SNP rs667773 [25] could not be replicated in Han's study [14]. This warrants a larger-scale study to clarify the role of PAX6 in myopia. The present case-control study aims to investigate the relationship between high myopia and polymorphisms in the PAX6 gene and all known regions involved in the regulation of PAX6 expression. PAX6 regulatory elements have been identified in a large region extending from ~15 kb upstream of exon 1 of the gene to a downstream gene known as elongation protein 4 homologue (ELP4; Gene ID: 26610) [28][29][30][31][32][33][34].

Subjects and DNA samples
In an initial association study, 600 unrelated Southern Han Chinese subjects (Group 1) were recruited: 300 high myopes (cases) with spherical equivalent (SE) ≤−8.00 D in both eyes, and 300 emmetropic controls with SE within ±1.00 D in both eyes. A follow-up replication study was performed to validate the results of the initial phase with a second cohort of 598 unrelated Han Chinese subjects (Group 2) with equal numbers of cases and controls. The same entry criteria were used for recruiting subjects of Groups 1 and 2. This study was approved by the Human Subjects Ethics Subcommittee of the Hong Kong Polytechnic University, and adhered to the tenets of the Declaration of Helsinki. Signed informed consent was obtained from all participants. All subjects were recruited from the Optometry Clinic of the Hong Kong Polytechnic University, and the complete ocular examination, collection of blood samples and DNA extraction were performed as described previously [15]. Below is a summary of the study subjects.
Group 1 included 300 high myopia cases and 300 emmetropic control subjects (Table 1)
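The entry criteria described above can be written as a simple classification rule. The following Python sketch is illustrative only: the function name `classify_subject`, the inclusive treatment of the thresholds, and the `None` return for subjects who meet neither criterion are assumptions, not part of the study protocol.

```python
from typing import Optional

def classify_subject(se_right: float, se_left: float) -> Optional[str]:
    """Classify a subject from the spherical equivalent (SE, in dioptres) of each eye.

    Assumed reading of the entry criteria: cases have SE <= -8.00 D in both eyes,
    controls have SE within +/-1.00 D in both eyes (thresholds taken as inclusive).
    """
    eyes = (se_right, se_left)
    if all(se <= -8.00 for se in eyes):
        return "case"
    if all(abs(se) <= 1.00 for se in eyes):
        return "control"
    # Hypothetical handling: anyone else is not eligible for either group.
    return None
```

A subject with −8.50 D in one eye and −3.00 D in the other would be excluded under this reading, because the criterion must hold in both eyes.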

Selection and genotyping of SNPs
Tag SNPs were selected using Tagger [35] from a region of 324.6 kb (chr11: 31,484,873..31,809,434; NCBI B36 assembly) that encompassed the PAX6 gene and its potential regulatory regions (20 kb upstream of PAX6 and 282.16 kb downstream of PAX6). Note that the downstream regulatory region embraces the whole ELP4 gene, which is transcribed in the opposite direction. Based on HapMap data for Han Chinese (release #24/phase II), Tagger used the following criteria for SNP selection: pairwise tagging algorithm, r² ≥ 0.80 and minor allele frequency ≥ 0.10. Five additional SNPs were also included because a simulation study has shown that, in case of positive association between the markers and the phenotype, strong linkage disequilibrium (LD) between markers does not necessarily guarantee correlated association test results [36]. These five SNPs (rs667773, rs3026390, rs2071754, rs1506 and rs12421026) had also been examined in at least one of the previous studies and, in particular, all except rs2071754 were found to be associated with high myopia in single-marker or haplotype analysis [14,[22][23][24][25].
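To make the r² and minor-allele-frequency thresholds concrete, the sketch below computes the standard pairwise LD measure r² for two biallelic SNPs from haplotype and allele frequencies, and checks a SNP against the stated thresholds. It is not Tagger's implementation; the function names and the example frequencies are hypothetical.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Pairwise LD r^2 for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B, respectively
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def is_captured(r2_with_tag: float, maf: float,
                r2_min: float = 0.80, maf_min: float = 0.10) -> bool:
    """True if a SNP is common enough and well enough tagged under the stated thresholds."""
    return maf >= maf_min and r2_with_tag >= r2_min

# Hypothetical example: two SNPs whose alleles always travel together (perfect LD).
example_r2 = ld_r2(p_ab=0.6, p_a=0.6, p_b=0.6)
```

In the hypothetical example the two alleles are perfectly correlated, so r² equals 1 and either SNP could serve as a proxy for the other.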
Two different methods were used to genotype these SNPs: restriction fragment length polymorphisms or unlabelled probe melting analysis [37]. Details of primer sequences and reaction conditions are given in the online Appendix S1. The choice of methods depended on the logistic arrangement for instrument use in our core laboratory and the cost of the assays.

Statistical analysis
High myopia was examined as a dichotomous trait. Subjects were classified as affected (cases) or unaffected (controls). PLINK (ver. 1.07; http://pngu.mgh.harvard.edu/~purcell/plink/index.shtml) was used for the analysis of genetic data: Hardy-Weinberg equilibrium (HWE) on unrelated subjects and association analysis [38]. Exact tests for HWE were performed for controls and cases separately. Haplotype analysis with sliding windows was also performed with PLINK, and multiple comparisons were corrected for by generating empirical P (P_emp) values based on 50,000 permutations. Haplotype blocks were constructed with Haploview (http://www.broadinstitute.org/haploview) only for the initial study with an algorithm known as the solid spine of linkage disequilibrium (SSLD) [39].
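As an illustration of permutation-based correction for multiple comparisons, the sketch below uses a max-statistic scheme: case/control labels are shuffled, the largest test statistic across all tests is recorded for each shuffle, and each observed statistic is compared with that null distribution. This is a generic sketch, not PLINK's code; the simple frequency-difference statistic, the function names and the toy data are assumptions. The empirical P value is computed as (R + 1)/(N + 1), so the smallest attainable value with 50,000 permutations is about 2.00×10−5.

```python
import random

def min_empirical_p(n_perm: int) -> float:
    """Smallest empirical P value attainable with n_perm permutations, using (R + 1) / (N + 1)."""
    return 1.0 / (n_perm + 1)

def max_stat_permutation(labels, markers, n_perm=1000, seed=0):
    """Corrected empirical P values for several tests sharing one set of subjects.

    labels:  list of 0/1 (control/case), one per subject
    markers: list of tests; each is a list of 0/1 carrier indicators, one per subject
    """
    rng = random.Random(seed)

    def stat(lab, marker):
        # Toy statistic (an assumption): absolute difference in carrier frequency.
        cases = [m for l, m in zip(lab, marker) if l == 1]
        ctrls = [m for l, m in zip(lab, marker) if l == 0]
        return abs(sum(cases) / len(cases) - sum(ctrls) / len(ctrls))

    observed = [stat(labels, m) for m in markers]
    exceed = [0] * len(markers)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any label-marker association
        max_null = max(stat(shuffled, m) for m in markers)
        for i, obs in enumerate(observed):
            if max_null >= obs:
                exceed[i] += 1
    return [(r + 1) / (n_perm + 1) for r in exceed]

# Toy data: 20 cases then 20 controls; marker_a tracks case status, marker_b does not.
toy_labels = [1] * 20 + [0] * 20
marker_a = [1] * 18 + [0] * 2 + [0] * 18 + [1] * 2
marker_b = [1, 0] * 20
toy_p = max_stat_permutation(toy_labels, [marker_a, marker_b], n_perm=200, seed=1)
```

Comparing each observed statistic with the maximum over all tests in each permutation is what controls the family-wise error rate across the whole set of tests.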

Initial study -Group 1 subjects
Eleven tag SNPs were selected, and could capture the genetic information for a total of 105 SNPs in the indicated region (324.6 kb) at a mean r² of 0.969. Five additional SNPs were selected as explained above. In total, 16 SNPs were examined, and are designated as S1 to S16 in sequential order from the 5′ end of the PAX6 gene for the sake of easy reference (Figure 1A and Table 2). Half of the markers (S2 to S9) clustered in a low LD region of ~8 kb at the 3′ end of PAX6.
All SNPs were in HWE if a threshold P value of 0.01 was used, except rs3026401 (S9) for the case group (P = 0.0038) (Table 2). Two SNPs showed marginally significant P values for HWE testing of the control group: 0.0155 for rs3026401 (S9) and 0.0304 for rs509628 (S16). This is not unexpected since about 2 significant results could be obtained by chance alone with 32 comparisons (16 SNPs in each of the case and control groups) at a significance level of 0.05. As such, these two SNPs were also included for subsequent data analysis. We identified three haplotype blocks across the 324.6-kb region under study with the SSLD algorithm of the Haploview package (Figure 2). However, the LD among the 16 SNPs under study was in general quite weak with a few exceptions.
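The chance-expectation argument for the HWE tests can be checked with a short calculation. The sketch below assumes that the 32 tests are independent and that all null hypotheses are true, which is a simplification; the function names are hypothetical.

```python
from math import comb

def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests significant by chance alone when every null is true."""
    return n_tests * alpha

def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """Binomial probability of at least k chance findings among independent tests."""
    below_k = sum(comb(n_tests, i) * alpha**i * (1 - alpha)**(n_tests - i)
                  for i in range(k))
    return 1.0 - below_k

# 16 SNPs tested in cases and in controls gives 32 HWE tests at alpha = 0.05.
expected = expected_false_positives(32, 0.05)
p_two_or_more = prob_at_least(2, 32, 0.05)
```

Under these assumptions the expected count is 32 × 0.05 = 1.6, which is the basis for the statement that about two nominally significant HWE results are unsurprising.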
There were more female subjects in the case group than in the control group (72.3% vs 56.3%, P<0.0001; Table 1). However, allele frequencies did not show any significant differences between female and male subjects in either group for all 16 SNPs. This justified the direct comparison of genotypes between the controls and cases without stratification by sex. Table 2 shows the distribution of genotypes and minor allele frequencies in both cases and controls. There were no significant differences between cases and controls of Group 1 under any of the five genetic models tested (genotypic, additive, dominant, recessive and allelic). We compared haplotype frequencies between cases and controls with adjustment for sex as a covariate to avoid its potential confounding. Instead of performing haplotype analysis within the LD blocks identified, exhaustive haplotype analyses were conducted using a sliding-window strategy and examining all haplotypes of all possible sizes (numbers of SNPs per haplotype) (Table 3). There were a total of 136 sliding windows, 69 of which were significantly associated with high myopia (omnibus test P_emp < 0.05). Above all, for every window size there was at least one window showing a significant result. For sliding windows with size up to 10 SNPs per window, it was obvious that the omnibus test result was significant as long as rs12421026 (S7) was included in the window. The crucial importance of S7 was even more obvious if one considered the most significant result among the group of sliding windows of a fixed size for up to 6 SNPs per window. The relative importance of SNPs within a sliding window was less apparent when the size of the sliding windows increased beyond 10 SNPs per window. Most importantly, the window of S4-S5-S6-S7 (indicated as S4..S7 in Table 3) gave the most significant P_emp value for the omnibus test. Table 4 shows the details of haplotype analysis for the most significant 3-SNP and 4-SNP windows.
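The count of 136 sliding windows follows from enumerating every run of consecutive SNPs of every size among 16 markers. The sketch below performs that enumeration; the function name is hypothetical and the S1 to S16 labels follow the designation used in the text.

```python
def sliding_windows(n_snps: int):
    """All windows of consecutive SNPs, for every window size from 1 to n_snps."""
    labels = [f"S{i}" for i in range(1, n_snps + 1)]
    return [tuple(labels[start:start + size])
            for size in range(1, n_snps + 1)          # SNPs per window
            for start in range(n_snps - size + 1)]    # shift one SNP at a time

windows = sliding_windows(16)
n_windows = len(windows)                               # 16 + 15 + ... + 1
n_size_7 = sum(1 for w in windows if len(w) == 7)
has_s4_to_s7 = ("S4", "S5", "S6", "S7") in windows
```

For n SNPs there are n − k + 1 windows of size k, so the total is n(n + 1)/2, which gives 136 for 16 SNPs and 10 windows of size 7.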
One LD block identified by Haploview consisted of 7 SNPs starting from rs667773 (S2) and finishing at rs662702 (S8) (labelled as Block 1 in Figure 2). This haplotype window (indicated as S2..S8 in Table 3) gave the most significant result for sliding-window haplotype analysis among the ten possible 7-SNP sliding windows. The other two LD blocks (S9..S10 and S12..S16) gave negative results for haplotype analysis (Table 3). These results were consistent with those obtained for LD-block-based haplotype analysis by Haploview (data not shown).

Replication study -Group 2 subjects
For Group 2 subjects, there was no significant difference in the proportion of women between cases and controls (65.2% vs 60.2%, P = 0.7468; Table 1). Five SNPs were genotyped in the replication study, and their genotypes were all in HWE (Table 2). Single-marker analysis did not show any significant differences between cases and controls under any of the five genetic models tested (Table 2). Single-marker analysis still did not give any significant results when Group 1 and Group 2 subjects were combined and analysed directly under all genetic models, or when meta-analysis was conducted to compare allele frequencies for Group 1 and Group 2 subjects with control for subject groups.
Table 2 notes: The SNPs are arranged in sequential order from the 5′ end to the 3′ end of the sense strand of the PAX6 gene, and are also designated as S1 to S16 for the sake of easy reference and discussion. The major allele is designated as "1" and the minor allele as "2"; and the genotype counts are indicated as the counts of the genotypes 11 [...]
To maintain consistency with haplotype analysis for Group 1 subjects, haplotype analysis was also performed for Group 2 subjects with adjustment for sex. On the other hand, allele frequencies did not differ significantly between subjects in Group 1 and in Group 2 for either cases or controls. While haplotype frequencies did not differ significantly between controls in Group 1 and in Group 2, haplotype frequencies did differ significantly between cases from these two subject groups. Therefore, haplotype analysis was carried out for Groups 1 and 2 subjects together with adjustment for both sex and subject group as covariates to avoid their potential confounding effects.
With sliding-window haplotype analysis, we successfully replicated in Group 2 subjects the significant association between high myopia and the 4-SNP haplotype window of rs2071754-rs3026393-rs1506-rs12421026 (S4-S5-S6-S7) (Table 4). Interestingly, the omnibus test was in fact slightly more significant with 3-SNP S5-S6-S7 haplotypes (P_asym = 7.93×10−12) than with 4-SNP S4-S5-S6-S7 haplotypes (P_asym = 4.06×10−11). The direction of association was identical in both the initial study and the replication study for both S5-S6-S7 haplotypes and S4-S5-S6-S7 haplotypes (Table 4). There were two high-risk S5-S6-S7 haplotypes: TAA (112; odds ratio (OR) = 12.7, 20.8 and 16.8 for Group 1, Group 2 and both groups combined, respectively), and GTG (221; OR = 10.3, 11.4 and 10.7 for Group 1, Group 2 and both groups combined, respectively). Similarly, there were two high-risk S4-S5-S6-S7 haplotypes: GTAA (1112; OR = 10.7, 19.4 and 15.9 for Group 1, Group 2 and both groups combined, respectively), and AGTG (2221; OR = 10.7, 13.8 and 11.8 for Group 1, Group 2 and both groups combined, respectively). On the other hand, there were two protective haplotypes. For the S5-S6-S7 window, the protective haplotypes were TAG (111; only significant in the combined group with OR = 0.767) and GTA (222; OR = 0.618, 0.592 and 0.606 for Group 1, Group 2 and both groups combined, respectively). For the S4-S5-S6-S7 window, the protective haplotypes were GTAG (1111; only significant in the combined group with OR = 0.761) and AGTA (2222; OR = 0.583, 0.600 and 0.594 for Group 1, Group 2 and both groups combined, respectively). Note that PLINK calculates the OR for a particular haplotype with reference to all the other haplotypes, and hence the reference haplotypes are different for different individual haplotypes under study. It is interesting to note that the two high-risk haplotypes were each found at ~1% in the combined control group, but at ~9% in the combined case group (Table 4). In contrast, the two protective haplotypes were found at ~42-47% in the combined control group, but at ~31-43% in the combined case group.
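The relationship between haplotype frequencies and an odds ratio computed against all other haplotypes can be illustrated with a crude, unadjusted calculation. The sketch below is not PLINK's procedure: it ignores the covariate adjustment used in the study, the function name is hypothetical, and the input frequencies are the rounded approximate values quoted in the text, so the result only indicates the order of magnitude.

```python
def haplotype_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Crude odds ratio for one haplotype versus all other haplotypes pooled.

    freq_cases / freq_controls: frequency of the haplotype in cases and controls.
    """
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls

# Rounded frequencies for a high-risk haplotype: about 9% in cases, about 1% in controls.
crude_or = haplotype_odds_ratio(0.09, 0.01)
```

With these rounded inputs the crude value is close to 10, the same order of magnitude as the adjusted odds ratios reported for the high-risk haplotypes.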
The SNP rs3026390 (S3) was also included in the follow-up study. According to the HapMap Chinese data, rs3026390 (S3), rs3026393 (S5) and rs12421026 (S7) are in tight LD (r² ≥ 0.95) with each other and hence can act as proxies for each other. In Group 1 subjects, they were in moderate LD (r² ≥ 0.62; Figure 2) with each other. Haplotypes and sub-haplotypes of these three SNPs were reported to be associated with high myopia in our previous family-based study [14]. Indeed, haplotypes and sub-haplotypes involving these three SNPs were associated with high myopia in the combined group (Table 5), with the results being essentially the same when Group 1 and Group 2 were analysed separately (data not shown). The only exception was the S3-S5 window together with its sub-haplotypes, which testifies to the critical importance of S7 (Table 3). The directions of association matched those shown in Table 4.

Subset analysis of extreme myopia cases
We performed a subset analysis on cases with SE ≤−10.0 D in both eyes to explore whether any single marker was associated with extreme myopia. For Group 1 subjects, this stringent threshold reduced the number of cases down to 115 (37 males and 78 females) with a mean age of 29.1 years (range, [...]). Of the 16 SNPs tested, two SNPs showed a significant association with extreme myopia: rs12421026 (S7) was significant under all five models with the best result under the additive model (P = 0.0065), and rs11031423 (S14) was significant under the additive and allelic models with the best result also under the additive model (P = 0.0277). However, neither association retained statistical significance after permutation testing for multiple comparisons. We noted that rs667773 (S2) did not show any significant results for any of the five models tested although it was reported to be associated with extreme myopia (≤−10.0 D) in one previous study [25].
For Group 2 subjects, there were 87 extreme myopia (SE ≤−10.0 D) cases: 24 males and 63 females, with a mean age of 35.1 years (range, 18-51). When we performed a subset analysis for such extreme myopia cases in the replication study, all five SNPs under study showed negative results (data not shown). Combined analysis of all Group 1 and Group 2 extreme myopia cases against all controls still did not find any significant differences in allele or genotype frequencies.

Discussion
Three linkage studies provided consistent evidence for the existence of a myopia locus on chromosome 11. A genome-wide linkage scan of 221 UK dizygotic twins first identified the MYP7 locus (maximum LOD score of 6.1) at 40 cM on chromosome 11p13 as a myopia susceptibility locus [22]. Another study of an independent group of 485 UK dizygotic twin pairs (mean SE: −0.55 ± 2.34 D; range, −20.0 D to +8.1 D) demonstrated marginal evidence for this linkage [40]. Finally, a genome-wide scan of 36 white families with a mean SE of −4.0 D also provided strong evidence for linkage of a myopia locus on chromosome 11 [41].
Table 3 (rows for 14, 15 and 16 SNPs per SW):
14   3   3   S1..S14   S3..S16   S3..S16   3.87E-06   4.00E-05
15   2   2   S1..S15   S2..S16   S2..S16   2.23E-05   3.00E-04
16   1   1   S1..S16   S1..S16   S1..S16   1.09E-03   2.02E-02
a Exhaustive haplotype analyses were performed using sliding windows (SW) of all possible sizes (i.e. number of SNPs per SW; 1 to 16 SNPs per SW) with PLINK. With logistic regression, a single case-control omnibus test of (H − 1) degrees of freedom was carried out for each sliding window to jointly assess the significance of the haplotype effects for this SW, adjusted for sex (as a covariate), where H is the number of haplotypes for the SW under consideration. A single asymptotic P value (P_asym) was produced for each SW. For a given window size, the test was performed for all possible windows of the same size, shifting one SNP at a time. There were a total of 136 windows (the sum of numbers in column 2), and multiple comparisons were corrected by running 50,000 permutations to obtain an empirical P value (P_emp). The minimum P value that is achievable with 50,000 permutations is 2.00×10−5.
Table 4 notes: [...] corresponding haplotypes are significantly associated with high myopia (P_emp < 0.05 obtained by 50,000 permutations for each group of subjects under study). Note that the minimum P value achievable with 50,000 permutations is 2.00×10−5. The most significant result (asymptotic P values, Wald test; P_asym) among all possible sliding windows in each group of subjects is marked by an asterisk (*). All three sets of analysis were adjusted for sex while the combined analysis was also adjusted for subject group. doi:10.1371/journal.pone.0019587.t004
The PAX6 gene was found to be directly below the highest peak at 11p13, and analysis of 5 tag SNPs with the quantitative transmission/disequilibrium test showed strong evidence of linkage to all markers (P = 0.006), but no association with the SNPs or their haplotypes [22]. The refractive error of the study subjects ranged from −12.12 D to +7.25 D (mean SE = +0.39 ± 2.38 D). In a subsequent population-based case-control study of 596 individuals from the 1958 British Birth Cohort, 25 tag SNPs from across a 530-kb region that included the PAX6 locus and putative control regions were examined [24]. Both qualitative and quantitative trait analysis found no significant results for either individual SNPs or 3-SNP sliding-window haplotypes. The variance of refraction was 6.25 and the SD from the mean was 2.50 for these subjects although the investigators had selected subjects from the lowest and highest tertiles for case-control comparison.
The initial phase of our study recruited 600 unrelated Hong Kong Chinese subjects (Group 1) including 300 high myopia cases (SE ≤−8.00 D) and 300 controls (SE within ±1.00 D). The initial study provided convincing evidence for the association of 3′ PAX6 haplotypes with high myopia in Chinese. In the second phase, we successfully replicated the initial positive results with another cohort (Group 2) of 299 cases and 299 controls recruited using the same criteria.
Note that the vast majority of cases of human myopia (>95%) develop due to excessive axial eye size resulting from accelerated postnatal eye growth, rather than through changes in corneal or lens power [42]. In fact, our study also examined posterior axial myopia, as is obvious from the strong correlation between axial length (AXL) and SE in our samples.
There was no significant difference in any SNP allele frequency in either the case group or the control group between Group 1 and Group 2 subjects. This justifies the data analysis with both groups combined wherever appropriate. The allele and genotype frequencies in the two separate control groups were also similar to those from the HapMap Chinese data (P>0.05; data not shown). Moreover, the LD pattern observed in the control group of the initial study was similar to that for HapMap Chinese, but slightly different from that of HapMap Caucasians (data not shown).
The initial phase of the present study captured the genetic information in a 324.6-kb region encompassing the PAX6 locus and all potential regulatory regions by genotyping 16 SNPs. Of note was the clustering of 8 SNPs in an 8-kb region of low LD in the 3′ end of the PAX6 gene (Figure 1). No association was demonstrated between high myopia and each of these 16 SNPs individually under any of the five genetic models tested (Table 2), not even for rs3026390 (S3) and rs3026393 (S5), which were previously reported positive [14]. Without a priori knowledge of the haplotype window size that is most appropriate and powerful for detecting the association, we used PLINK to conduct exhaustive haplotype analysis of all possible window sizes (the variable-sized sliding-window strategy), which has been shown to be more powerful than single-marker analysis and LD-block-based haplotype analysis, particularly in regions of low LD as in the 3′ end of the PAX6 locus [43]. The results show that, for every window size, there was at least one haplotype window significantly associated with high myopia (Table 3). The most significant association (P_asym = 3.54×10−10 and P_emp = 2.00×10−5) fell on the 4-SNP window consisting of rs2071754 (S4), rs3026393 (S5), rs1506 (S6) and rs12421026 (S7), while the second top-ranking window was the 3-SNP window of S5-S6-S7 (P_asym = 5.48×10−10 and P_emp = 2.00×10−5).
In the replication study, we genotyped these 4 SNPs for Group 2 case-control subjects (Table 2). In addition, rs3026390 (S3) was also included because high myopia was found to be associated with S3 on its own and with S3-S5-S7 haplotypes in a previous study [14]. As in the initial study, these 5 SNPs did not show any significant results individually (Table 2). However, the association of high myopia with the 4-SNP S4-S5-S6-S7 haplotypes and 3-SNP S5-S6-S7 haplotypes was successfully replicated in the follow-up study (Table 4). Interestingly, the results were more significant with the S5-S6-S7 window (P_asym = 7.93×10−12) than with the S4-S5-S6-S7 window (P_asym = 4.06×10−11). The same pattern was also observed in the combined group (P_asym = 6.28×10−23 for S5-S6-S7 and P_asym = 1.56×10−18 for S4-S5-S6-S7). The results remained very significant even after correction for multiple comparisons by permutation tests. The direction of association was also consistent across both studies. In the combined group, there were two protective haplotypes and two high-risk haplotypes with odds ratios of similar magnitudes in both 3-SNP S5-S6-S7 and 4-SNP S4-S5-S6-S7 windows (Table 4). In general, protective haplotypes were more frequent than the high-risk haplotypes in the general population.
It is interesting to note that rs2071754 (S4), rs3026393 (S5), rs1506 (S6) and rs662702 (S8) were also investigated by Simpson et al. [24]. That study did not find any significant association for individual SNPs or for the 3-SNP sliding-window haplotypes S4-S5-S6 or S5-S6-S8. In our exhaustive sliding-window haplotype analysis, the S4-S5-S6 haplotype window was indeed nominally associated with high myopia in the initial analysis (omnibus test P_asym = 0.0184), but the association did not survive correction by permutations (P_emp = 0.4799). The S5-S6-S8 haplotype window, not tested in the sliding-window approach, was found not to be associated with high myopia (P_asym = 0.1060, sex-adjusted omnibus test) in a separate analysis. While rs12421026 (S7) was found to be critically important in our study (Table 3), it was not studied by Simpson et al. On the other hand, rs667773 (S2) was first reported to be associated with extreme myopia by a Taiwanese study involving 67 cases (SE ≤−10.0 D) and 85 controls [25], but did not give significant results in our initial analysis (Table 2) or in a subset analysis of cases with extreme myopia (SE ≤−10.0 D). A negative finding for rs667773 was also reported by another recent study examining 379 high myopia cases and 349 controls [26]. Intriguingly, three previous studies examined association with common myopia and reported negative results [22][23][24], probably because of the lower heritability of less severe refractive errors [44,45]. The difference in ethnicity could be another reason for the negative results. With both initial and replication testing, we confirm that 3′ PAX6 haplotypes are associated with high myopia in our Chinese population. In our previous study involving 164 Chinese nuclear families with 170 highly myopic siblings, rs3026390 (S3) and rs3026393 (S5) were found to be associated with high myopia upon single-marker analysis [14]. The present study did not replicate these positive single markers in either the initial or the follow-up phase. However, we did successfully replicate the positive results for 2-SNP or 3-SNP haplotypes that included rs12421026 (S7) (Table 5). The consistent association of high myopia with PAX6 haplotypes rather than single SNPs implies that the SNPs being studied are most likely not the causal variants driving the association and that the associated haplotypes are expected to carry or to be in strong LD with the causal variants.
The 4 SNPs (S4, S5, S6 and S7) of the associated haplotypes are found in the 3′ end of the PAX6 gene and span a genomic region of ~3.3 kb encompassing the last two exons and the 3′ untranslated region (UTR) of PAX6. Intriguingly, the highly conserved region of the 3′ UTR is predicted by the TargetScan programme [46] to harbour binding sites for 10 micro-RNAs (Figure 1). Micro-RNAs are small non-coding RNA molecules that bind to messenger RNAs of protein-coding genes to repress their translation and hence add an extra layer of control over gene expression [47]. It is therefore possible that variation in the PAX6 expression level may influence susceptibility to myopia development. Whether unidentified SNPs in these binding sites could be involved in driving the association remains to be determined. Of particular interest is a recent study reporting the association of high myopia with dinucleotide repeats in the PAX6 promoter that were found to affect transcription in a luciferase-reporter assay [26]. These studies tend to suggest that variation in PAX6 expression may be associated with genetic predisposition to myopia development.
In conclusion, several haplotypes in the 3′ end of the PAX6 locus were found to be associated with high myopia in both the initial study and the replication study involving Han Chinese subjects. Our study also successfully replicated the associated haplotypes found in a recent study. The study supports PAX6 as a susceptibility gene for high myopia and provides the foundation for further investigation to identify the genuine causal variants.

Supporting Information
Appendix S1 Detailed protocols are given for SNP genotyping, including primer information and reaction conditions. (DOC)