Genome-wide association studies and CRISPR/Cas9-mediated gene editing identify regulatory variants influencing eyebrow thickness in humans

Hair plays an important role in primates and is clearly subject to adaptive selection. While humans have lost most facial hair, eyebrows are a notable exception. Eyebrow thickness is heritable and widely believed to be subject to sexual selection. Nevertheless, few genomic studies have explored its genetic basis. Here, we performed a genome-wide scan for eyebrow thickness in 2961 Han Chinese. We identified two new loci of genome-wide significance, at 3q26.33 near SOX2 (rs1345417: P = 6.51×10−10) and at 5q13.2 near FOXD1 (rs12651896: P = 1.73×10−8). We further replicated our findings in the Uyghurs, a population from China characterized by East Asian-European admixture (N = 721), the CANDELA cohort from five Latin American countries (N = 2301), and the Rotterdam Study cohort of Dutch Europeans (N = 4411). A meta-analysis combining the full GWAS results from the three cohorts of full or partial Asian descent (Han Chinese, Uyghur and Latin Americans, N = 5983) highlighted a third signal of genome-wide significance at 2q12.3 (rs1866188: P = 5.81×10−11) near EDAR. We performed fine-mapping and prioritized four variants for further experimental verification. CRISPR/Cas9-mediated gene editing provided evidence that rs1345417 and rs12651896 affect the transcriptional activity of the nearby SOX2 and FOXD1 genes, which are both involved in hair development. Finally, suitable statistical analyses revealed that none of the associated variants showed clear signals of selection in any of the populations tested. Contrary to popular speculation, we found no evidence that eyebrow thickness is subject to strong selective pressure.

a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 eyebrow thickness in 2961 Han Chinese. We identified two new loci of genome-wide significance, at 3q26.33 near SOX2 (rs1345417: P = 6.51×10 −10 ) and at 5q13.2 near FOXD1 (rs12651896: P = 1.73×10 −8 ). We further replicated our findings in the Uyghurs, a population from China characterized by East Asian-European admixture (N = 721), the CANDELA cohort from five Latin American countries (N = 2301), and the Rotterdam Study cohort of Dutch Europeans (N = 4411). A meta-analysis combining the full GWAS results from the three cohorts of full or partial Asian descent (Han Chinese, Uyghur and Latin Americans, N = 5983) highlighted a third signal of genome-wide significance at 2q12.3 (rs1866188: P = 5.81×10 −11 ) near EDAR. We performed fine-mapping and prioritized four variants for further experimental verification. CRISPR/Cas9-mediated gene editing provided evidence that rs1345417 and rs12651896 affect the transcriptional activity of the nearby SOX2 and FOXD1 genes, which are both involved in hair development. Finally, suitable statistical analyses revealed that none of the associated variants showed clear signals of selection in any of the populations tested. Contrary to popular speculation, we found no evidence that eyebrow thickness is subject to strong selective pressure.

Introduction
Hair has a range of important functions in primates and has been speculated to be subject to intense natural and sexual selection [1]. Although humans have lost most terminal body hair to allow the development of more efficient sweating as an adaptation to bipedal life, considerable facial hair remains, with great diversity between populations [2]. The eyebrow, an area of thick, short facial hair above the eye that follows the shape of the lower margin of the brow ridge, is one of the most conspicuous features of the face. It is thought that its main function is to prevent sweat, water, and other debris from getting into the eye [2]. As a major facial feature, the eyebrow plays an important role in human communication, facial expression, sexual dimorphism and attractiveness [3][4][5][6]. It has been suggested that eyebrows may be subject to sexual selection [7]. Notably, eyebrow thickness varies between and within human populations [8,9]. A recent genome-wide association study in Latin Americans found that common DNA variants in the FOXL2 gene are associated with eyebrow thickness. However, these variants have very low minor allele frequencies in East Asians and Europeans, suggesting that in these two populations, eyebrow thickness may well be affected by different genes.
To enhance our understanding of the genetic basis underlying the variation of human eyebrow thickness, we conducted a genome-wide association study in East Asians and Eurasians, followed by a trans-ethnic meta-analysis, to identify genetic variants that affect eyebrow thickness in humans. Moreover, in order to validate the findings from our association analyses, we performed fine mapping and conducted functional genetic experiments. Finally, we applied various statistical genetics methods to search for signals of positive selection in and around the identified candidate genes.
We performed a further replication analysis for all SNPs described above in the Rotterdam Study sample, which comprises 4411 Dutch Europeans. With the exception of rs112458845 at 3q22.3, which was specific to Latin Americans, this analysis replicated all signals at a nominal significance level (rs1866188: P = 0.0141; rs112458845: P = 0.366; rs1345417: P = 4.03×10 −3 ; rs12651896: P = 2.93×10 −4 ; Table 1). The allelic effects at all associated SNPs acted in the same direction as in all other populations tested (Table 1).

Fine mapping of associated loci suggests four causal variants affecting eyebrow thickness
Next, we sought to localize the variants driving the association signals at each of the novel loci that attained genome-wide significance in our trans-ethnic meta-analysis. We utilized PAIN-TOR [14,15] to calculate the Bayesian posterior causal probability for each variant included in the signal and identified the sets of variants that collectively explained 99% of the total probability (credible sets). The credible sets for 2q12.3 and 3q26.33 contained a single SNP each (S4 Table). The credible set for 5q13.2 comprised nine variants. Among these, the posterior causal probability was highest for rs12651896 (posterior probability = 0.744; S4 Table). Using CADD [16] and DeepSEA [17] to assess the functional consequences of each variant, we found that rs10061469 was consistently predicted to be functionally important (S4 Table). Our interrogation of ENCODE [18] and REMC [19] data revealed that the regions around rs1866188 and rs1345417 show distinct active enhancer signatures defined by epigenetic markers, such as the histone modifications H3K4me1 and H3K27ac, and DNase hypersensitivity in epithelial cells (Fig 3). The regions around rs12651896 and rs10061469 are involved in long-range chromatin looping interactions with FOXD1 ( Fig 3B). Together, these data suggest the presence of putative regulatory regions in the neighborhood of these variants. We thus chose rs1345417, rs12651896, rs10061469 and rs1866188 for further experimental verification.

Validating regulatory activity within regions underlying the association signals
To further examine whether the four variants identified in our fine mapping analyses (rs1345417, rs12651896, rs10061469 and rs1866188) play a role in the expression of nearby genes, we conducted CRISPR/Cas9-mediated gene editing for each of them. We targeted the genomic regions surrounding each variant using two different sgRNAs (S4 Fig). Our qRT-PCR analysis of annotated transcripts near these SNPs found that mixed clones of A375 cells that were stably infected with rs1345417 targeting lenti-Cas9-sgRNAs displayed significantly reduced SOX2 expression and significantly elevated SOX2-OT expression compared to cells infected with control lenti-Cas9-sgRNAs ( Fig 4A). Infection with rs12651896 targeting lenti-Cas9-sgRNAs led to a significant reduction in the expression of FOXD1 (Fig 4B), while infection with rs10061469 targeting lenti-Cas9-sgRNAs showed no effect on nearby genes ( Fig 4C). Infection with rs1866188 targeting lenti-Cas9-sgRNAs resulted in a significant reduction in LIMS1 expression of ( Fig 4D).  Among the genes showing significant changes in expression levels, only SOX2 and FOXD1 were reported to be related to hair growth [20,21]. We thus chose to focus on these two genes. For each of the two lenti-Cas9-sgRNAs targeting rs1345417 and rs12651896, we derived multiple independent single cell clones of infected A375 cells and characterized their exact deletion/ mutation status at the target SNP sites (S4 Fig). We found that SOX2 expression was significantly reduced in A375 cell clones carrying deletions/substitutions encompassing rs1345417, in comparison with cells infected with empty lenti-Cas9 vector or with lenti-Cas9-control sgRNA (Fig 4E and 4F). Importantly, A375 cell clones that were infected with rs1345417 targeting lenti-Cas9-sgRNA but where the SNP site was not successfully edited, did not show reduced SOX2 expression (Fig 4E and 4F). Similarly, FOXD1 expression was significantly reduced in all A375 cell clones carrying a deletion encompassing the rs12651896 site, compared to both controls and unsuccessfully edited infected cell clones (Fig 4G and 4H). In one clone (896sg1m1), β-actin expression was also unexpectedly reduced, probably due to off-target effects.
To test if a single nucleotide substitution event is sufficient to affect endogenous SOX2 gene expression, we conducted a CRISPR/Cas9-mediated knock-in experiment at rs1345417, for which native A375 cells are homozygous (G/G). Consistent with our expectations, SOX2 expression levels were significantly lower in an edited, G/C heterozygous A375 clone than in the original G/G A375 cells (Fig 4I).
Additionally, we conducted a luciferase reporter experiment for the genomic region surrounding rs1345417 (S5 Fig). We found that a 1,495 bp genomic fragment encompassing the rs1345417 site and containing its common G allele can act as a transcriptional enhancer. Its insertion before a SOX2 promoter sequence led to 3 to 4-fold increase in reporter expression (S5 Fig). Moreover, a G>C mutation at rs1345417 on the reporter construct resulted in significantly reduced reporter activity, in line with our observations in the CRISPR/Cas9 experiments above ( Fig 4I). Together, our data indicate that genetic variation at rs1345417 and rs12651896 can affect the transcription of SOX2 and FOXD1, respectively.

Testing for signatures of positive selection in associated regions
For some of the top associated loci, allele frequencies varied among populations (S6 Fig). This variation could be caused either by random genetic drift or by local positive selection. To assess whether positive selection could have helped shape variation in the genomic regions associated with human eyebrow thickness, we applied several statistical genetics methods. Using the Integrated Haplotype Score (iHS) [22] and the Composite of Multiple Signals (CMS) statistic [23], we did not find any evidence indicating that SOX2, FOXD1 and FOXL2 are subject to positive selection in East Asian (CHB), European (CEU) and African (YRI) populations (S7 and S8 Figs). However, there were highly significant signatures of strong positive selection in the EDAR region. This is not surprising, as our top signal near EDAR (rs1866188) is in high LD with rs3827760, a strongly selected functional SNP causing a number of ectodermal related phenotypic changes, as demonstrated previously [24]. Therefore, the signal of . Red dots represent SNPs that are close (<5 kb) to the genome-wide significant signals. Regional association and linkage disequilibrium plots are shown for the significantly associated regions around B) rs1866188, C) rs112458845, D) rs1345417 and E) rs12651896 (gray bar). Different colors denote different populations. Increasing color intensities represent an increasing degree of linkage disequilibrium (r 2 ) with the top SNP in each panel. The recombination rate (right-hand y axis) is plotted in blue and is based on the ASN population from the 1000 Genome Project. Exons for each gene are represented by vertical bars below the x axis, based on all isoforms available from the hg19 UCSC Genome Browser.
To test whether the differences in allele frequencies may stem from different local selection pressures, we applied a probabilistic approach described by He and colleagues [25] to quantify local inter-population differences in selection for the four top loci. Apart from rs1866188 at EDAR, we found no significant differences in selection for any other SNP in the associated regions (S9 Fig).
To test whether eyebrow thickness might be subject to polygenic selection, we applied the polygenic scores test developed by Berg and Coop [26]. This test measures the total frequency of associated alleles in a population, weighting each allele by its effect size. Loci that have undergone local positive selection will show greater divergence than expected under the neutral model. Previous studies using this method have found excess variance among populations for genetic scores associated to height, demonstrating that height is subject to local positive selection [27]. Here, we calculated the polygenic score for eyebrow thickness based on the top four associated variants. The resulting scores were similar between populations and showed no excess variance due to positive selection (P = 0.567, S10 Fig), indicating that the allele frequency differences observed among populations are better explained by random genetic drift than by selection. Our results thus provide no evidence that eyebrow thickness is under strong positive selection in human populations.

Factors associated with eyebrow thickness
We found that eyebrow thickness is significantly associated with age and gender. As may be expected, older adults tend to have a significantly reduced eyebrow thickness due to the weakened biological function of older follicle cells. The eyebrows of males are generally thicker than those of females. Eyebrow plucking is widespread among females and may thus have affected our results; nevertheless, the large and consistent differences observed between genders are likely to be real, considering that androgens tend to stimulate hair growth [28,29].
The top associated SNP at 3q26.33, rs1345417, is located about 80 kb downstream of the SOX2 gene (Sex Determining Region Y-Box 2), an important transcription factor that is highly expressed in the dermal condensate (DC) and dermal papilla (DP) of growing follicles [30,31]. Interestingly, fine-tuning of SOX2 expression appears to play an important role in the regulation of hair thickness. In murine skin development, from E18.5 onwards, Sox2 expression becomes confined to DPs in thick guard/awl/auchene hairs, but not in thin zigzag hairs [20]. Moreover, conditional ablation of Sox2 from the DP resulted in significant reduction in the length of awl/auchene, but not zigzag hairs [20]. These findings are consistent with our hypothesis that the G>C mutation at rs1345417 may cause reduced eyebrow thickness by downregulating SOX2 expression. The top SNP within the second signal at 5q13.2, rs12651896, is located 242 kb downstream of FOXD1 (Forkhead Box D1). This gene belongs to the forkhead family of transcription factors, whose members are characterized by a distinct forkhead domain. FOXD1 is especially enriched in DC cells, a precursor of DP/dermal sheath niche cells within the mature follicle [21]. Finally, rs1866188 on chr2q12.3 is located 253 kb upstream of EDAR (Ectodysplasin A receptor), a gene which plays an important role in the development of ectodermal tissues such as hair, teeth, and sweat glands [24]. Previous studies have consistently found EDAR to be implicated in beard density [8] and hair thickness [24] and straightness [11,32,33].

Regulatory variants influencing eyebrow thickness
It is notable that all of the signals identified by our GWAS fall into regulatory regions. To date, hundreds of GWAS have been conducted, resulting in the identification of a large number of genetic variants associated with common diseases and phenotypic variation. The majority (~93%) of variants associated with common traits lie within non-coding sequences, complicating their functional evaluation [34]. Recent studies have found that these variants are concentrated in regulatory DNA. This remains true even after adjusting for microarray SNP ascertainment bias and suggests that non-coding variants may act by causing changes in gene expression levels [34], as supported by several lines of evidence [35][36][37][38][39]. The association signals found here at 3q26.33 (rs1345417), 5q13.2 (rs12651896), and 2q12.3 (rs1866188) are all located within active enhancer regions characterized by epigenetic markers, such as H3K4me1 and H3K27ac histone modifications, and DNase hypersensitivity in epithelial cells. Our functional validation experiments provided evidence for enhancer activity around rs1345417 and rs12651896. It is thus highly plausible that these variants affect eyebrow thickness by regulating the expression of SOX2 and FOXD1, respectively. In particular, the CRISPR/Cas9-mediated knock-in experiment provided direct evidence that a single substitution at rs1345417 is sufficient to affect the endogenous gene expression of SOX2. Our study represents a successful example of how GWAS and CRISPR/Cas9 technology can be combined to demonstrate the involvement of non-coding variants with regulatory functions in common diseases and normal phenotypic variation.

Population heterogeneity in genes associated with eyebrow thickness
The FOXL2 variant rs112458845 has previously been found to be associated with eyebrow thickness in Latin Americans. Here, we found no significant evidence of association of this variant with this phenotype in East Asians or Europeans. This may partly be due to the frequency distribution of the minor allele, which is rare in East Asians (4%) and absent in Europeans (0%), but relatively common in Native Americans (26%). Additionally, and perhaps more importantly, allelic effect sizes may be population specific due to a unique yet unidentified environmental or genetic background and may thus vary between Latin Americans on the one hand, and East Asians and Europeans on the other. A similar impact of population heterogeneity on allelic effect sizes has been reported for other traits, such as follicular lymphoma [40], body bone mineral density [41], and Type 1 diabetes [42].

Eyebrow thickness is unlikely to be subject to strong positive selection
It has long been debated whether eyebrow thickness is a selected trait or a 'neutral feature' with no apparent link to individual fitness [3,6,7]. We used several methods to detect potential signatures of positive selection for the variants associated with eyebrow thickness. First, we tested whether any of the associated regions overlapped with signals of positive selection. We then used a recently developed probabilistic method to test and estimate differences in selection of the associated variants between populations. We further tested the presence of polygenic selection by examining subtle allele frequency shifts across multiple loci. Previous studies using the above methods have found signatures of positive selection for a range of human traits, including height, skin color, and BMI [23,25,26,43]. However, apart from the EDAR region around rs3827760, a SNP known to be under strong selection, we found no significant signatures of positive selection for variants associated with eyebrow thickness. These results suggest that eyebrow thickness may not be subject to strong positive selection, at least not via the genes identified here. In this context, it is worth noting that most studies postulating sexual selection of the eyebrow were based on reconstructed attractiveness [3,5,7], and it may be argued that the perception of attractiveness can vary significantly over time [3]. Therefore, while sociological studies have indicated that eyebrow thickness may be subject to sexual selection, our study does not provide any support for this conclusion from an evolutionary or genetic perspective.
In conclusion, we identified three novel genetic variants near SOX2, FOXD1 and EDAR that influence eyebrow thickness. We demonstrated that rs1345417 and rs12651896 affect the transcriptional activity of the nearby SOX2 and FOXD1 genes. Furthermore, we found evidence for population heterogeneity in the genetics of eyebrow thickness. Finally, our results suggest that eyebrow thickness may not be subject to strong positive selection.

Ethics statement
The Taizhou

Populations and samples
This study is based on data from four populations (S1 Table): Han Chinese, Uyghurs, Latin Americans and Europeans. 2,961 Han Chinese (including 1,060 males and 1,901 females, with an age range of 31-87) were enrolled in Taizhou, Jiangsu Province, as part of the Taizhou Longitudinal Study (TZL) [10], in 2014. 721 Uyghurs (including 282 males and 439 females, with an age range of [17][18][19][20][21][22][23][24][25] were enrolled at Xinjiang Medical University, Urumqi, Xinjiang Province, China, as part of the Xinjiang Uyghur Study (UYG) in 2013-2014. Additionally, we collected summary statistics for a previously conducted GWAS on eyebrow thickness in Latin Americans of the CANDELA cohort, details of which have been previously published [8]. The Rotterdam Study is a population-based prospective study including a main cohort (RS1) and two extensions (RS2 and RS3) [12,13]. All participants were examined in detail at baseline. Collection and purification of DNA have been described in detail previously [12]. The Rotterdam Study has been entered into the Netherlands National Trial Register (NTR; www. trialregister.nl) and into the WHO International Clinical Trials Registry Platform (ICTRP; www.who.int/ictrp/network/primary/en/) under shared catalogue number NTR6831. All participants provided written informed consent to participate in the study and to have their information obtained from treating physicians.

Ordinal phenotyping
Eyebrow thickness (i.e., density) was rated by eye on a three-point scale (TZL, UYG, RS: scarce, normal, and dense; CANDELA: low, medium and high), following an established standard and based on photographic imagery (S1 Fig). For the TZL and UYG cohorts, each case was rated independently by two different investigators. The inter-rater reliability was evaluated with the Kappa statistic. Cases where ratings were inconsistent were reviewed by a third investigator, who made the final decision. For the CANDELA cohort, an interview of the volunteers had indicated that most women modified their eyebrows; in this cohort, eyebrow thickness was therefore only scored in men. Phenotyping was performed by an experienced investigator, and intra-rater reliability was assessed for 150 individuals. For RS, three investigators simultaneously and independently evaluated all photos, which were displayed on two identical screens with the same settings. Before evaluation, a total of 50 photos were openly discussed to reach a consensus between the three investigators. Inter-rater reliability was also evaluated with the Kappa statistic. The average score from the three independent ratings was used as the final phenotype in all subsequent analysis.

Genotype quality control and imputation
For TZL and UYG, blood samples were collected, and DNA was extracted. All samples were genotyped using the Illumina HumanOmniZhongHua-8 chip, which interrogates 894,517 SNPs. To control for genotype quality, we used PLINK 1.9 [44] to exclude individuals with more than 5% missing data, related individuals, and those that failed the X-chromosome sex concordance check, or for whom the available information on ethnicity was incompatible with their genetic information. We also excluded SNPs with more than 2% missing data, with a minor allele frequency (MAF) below 1%, and those that failed the test for Hardy-Weinberg equilibrium (P<1×10 −5 ). The chip genotype data were phased using SHAPEIT [45], and IMPUTE2 [46] was then used to impute genotypes at non-genotyped SNPs using variant positions from the 1000 Genomes Phase 3 data as a reference. SNPs with an imputation quality score (INFO) below 0.8 or a MAF below 1% were eliminated from further analyses. For the Han Chinese population, a total of 6,343,243 imputed SNPs passed quality control and were combined with 776,213 genotyped SNPs for association analysis. For the Uyghur population, a total of 6,414,304 imputed SNPs passed quality control and were combined with 810,648 genotyped SNPs for further analyses. In RS1 and RS2, genotyping was carried out using the Infimum II HumanHap550K Genotyping Bead Chip version 3, which contains 6,787,905 probes. Complete information on genotyping protocols and quality control measures for RS1 and RS2 have been described previously [47,48]. In RS3, genotyping methods closely followed those established for RS1 and RS2, but a denser array, the Human 610 Quad Arrays of Illumina with 15,880,747 probes, was used. Individuals with a call rate < 97.5%, gender mismatch with typed X-linked markers, or excess autosomal heterozygosity (>0.33) were excluded, as were duplicates or 1st degree relatives identified using IBS probabilities, and outliers using multi-dimensional scaling analysis with reference to the 210 Hap Map samples. Genome-wide imputation in RS3 closely followed the methods used in RS1 and RS2, as described in detail previously [48]. Genotypes were imputed using MACH [49] based on phased autosomal chromosomes of the 1000 Genome reference panel, orientated on the positive strand.

Population stratification analysis
The effects of possible population stratification were corrected using the EIGENSTRAT [50] tool from the EIGENSOFT package. To this end, TZL and UYG data were combined with 1000 Genomes Phase 3 data for YRI, CHB and CEU populations. 102,284 SNPs in low linkage equilibrium (r 2 <0.2) were selected for analysis. Principal component (PC) analysis did not find any outliers in TZL and UYG (S11 Fig).

Statistical genetics analyses
Initial genome-wide association tests using multiple linear regression with an additive genetic model incorporating gender, age and four genetic PCs as covariates were performed in PLINK 1.9 [44]. Expected and observed association results for all tests were visualized in quantilequantile (Q-Q) plots to assess systematic inflation in association resulting from population stratification or other systematic causes of bias. None of the Q-Q plots showed any sign of inflation, the genomic control factor λ being < 1.06 in all cases (S3 Fig). To evaluate the presence of additional independent signals at each locus, we performed conditional analyses by adding the dosages of the top SNP at each locus to the regression model. Q-Q, Manhattan and regional association plots [51] were created in R. The proportion of variance in eyebrow thickness explained by the genetic variants identified was estimated using GCTA [52]. The metaanalysis of the TZL, UYG and CANDELA data sets was performed using METAL [53]. Heterogeneity of SNP associations across studies was tested via Cochran's Q statistic [54], and its magnitude was expressed by I 2 . For SNPs with significant heterogeneity, a random effects model was applied for meta-analysis using METASOFT [55].

Fine mapping and credible set construction
We performed fine mapping of each locus for a 1 Mb genomic interval flanking the top SNP (500 kb upstream and 500 kb downstream) using PAINTOR [14,15]. For each SNP within this 1 Mb region, the posterior probability that this SNP is driving the region's association signal was calculated by dividing the SNP's Bayes factor (BF) by the sum of the BFs of all SNPs in the region [56]. A 99% credible set was then constructed by (1) ranking all variants according to their Bayes factor, and (2) including ranked variants until their cumulative posterior probability of representing the causal variant at a given locus exceeded 0.99 [57].

Functional annotation
To further facilitate the prioritization of variants for functional analysis, we used CADD [16] (Combined Annotation-Dependent Depletion) and DeepSEA [17] (deep learning-based sequence analyzer) to evaluate the possible functional consequences of the variants in the 99% credible set.

Candidate gene identification and regulatory annotation
Candidate genes were chosen based on their distance to the associated loci as well as their function, involvement in biochemical pathways, tissue expression, and involvement in similar phenotypes. The relevant information was obtained from NCBI [58] and Ensemble [59], as well as available published data. We used HaploReg v4.1 [60] to extract a variety of regulatory annotations, including histone modification (ChIP-seq tracks), chromatin state segmentations (15-state) and ChIA-PET [61] (Chromatin Interaction) from ENCODE [18] and the Roadmap Epigenomics Project [19], conserved regions from GERP [62] and Phastcons [63], and eQTLs from the GTEx [64] and GEUVADIS databases [65].

Cell culture
No cell lines were found in the database of commonly misidentified cell lines maintained by ICLAC and NCBI Biosample. Cell lines were not authenticated. NHEM (human melanocytes) and A375 (human melanoma) were purchased from the cell bank of the Chinese Academy of Sciences. HACAT (immortal keratinocytes) and SCC13 (skin squamous-cell carcinoma cell line) were purchased from ATCC. Cell lines were routinely tested for mycoplasma infection. Cells were cultured using DMEM+10%FBS+1%Penicillin-Streptomycin.

Plasmid construction and luciferase assays
A 1,495 bp genomic fragment comprising the rs1345417 enhancer was amplified from human genomic DNA and cloned into a pGL3-promoter vector, in which the SV40 promoter was replaced by a SOX2 promoter. The G>C mutation was introduced by site-directed mutagenesis (Takara). The inserts in each construct were verified by sequencing. The detail information of primer sequences can be found in S5 Table. Constructs were transfected with equimolar amounts (500 ng) of luciferase reporter plasmids into A375 Melanoma cells using jetPEI (Polyplus), according to the manufacturer's instructions. Luciferase expression was normalized to 200 ng Renilla luciferase expression (pRL-SV40). Cells were harvested after 48 h. Luminescence activity was measured with a Berthold Centro LB 960 Microplate Luminometer. Data represent at least three independent experiments. Student's two-tailed t-test was used to determine statistical significance.
For target site deletion via lentiCRISPR-sgRNA virus infection, stably infected cells were selected on puromycin. For clone selection, stably infected cells were diluted to allow colonial growth. Single colonies were individually picked for DNA sequencing of the target site. Because protospacer adjacent motives were located as far as 100 bp upstream and 22 bp downstream, the deletion at rs1866188 was considerably larger than for the other sites.
For G>C substitution at rs1345417 in A375 cells, we used a CRISPR-Cas9 mediated knockin strategy. Specifically, in order to substitute the G/G of rs1345417 in A375 cell, a~1000bp DNA fragment encompassing the rs1345417 site from a Chinese individual with a C/C homozygote genotype was PCR amplified and TA cloned into the pMD18T Vector (Takara) to construct a C-donor plasmid. We then co-transfected the C-donor plasmid and the lenticrispr-rs1345417-sgRNA2 plasmid into A375 cells using the Effectene reagent (Qiagen). 48 h after transfection, the cells were selected on puromycin for 48 h to eliminate untransfected cells. Successfully transfected cells were then diluted for colonial growth. Single colonies were individually picked for DNA sequencing to screen for substitution at the target site. In total, 44 clones were screened.

Molecular biology
Sequencing of the rs1345417 site: Genomic DNA was extracted from each cell clone using ZR Genomic DNA-tissue MiniPrep (Zymo) following the manufacturer's protocol. The rs1345417 genomic region was amplified from genomic DNA by nested PCR and TA-cloned into pMD18 T vector for sequencing. At least four individual TA-clones were sequenced for each cell clone. RNA extraction was conducted using Direct-zol RNA MiniPrep Plus (Zymo) following the manufacturer's protocol. Reverse transcription was conducted using the iscript cDNA synthesis kit (Biorad) following the manufacturer's protocol. Real-time PCR experiments were conducted on a VIIA7 Fast Real-Time PCR system (Applied Biosystem) using the iTaq universal SYBR Green supermix (Biorad). All statistical analyses were conducted using Microsoft Excel and the GraphPad Prism 6 software.

Tests for positive selection
Genomic characteristics resulting from strong recent positive selection include low haplotype diversity and high linkage disequilibrium. We calculated extended haplotype homozygosity (EHH) [22,67] for all SNPs until EHH < 0.05 in CEU, CHB and UYG samples. Next, the integrated haplotype score (iHS) [68] was calculated for all SNPs, with an allele frequency bin of 0.05 to standardize iHS scores against other SNPs of the same frequency class within the region. Finally, we calculated P values assuming a Gaussian distribution of iHS scores under the neutral model, which was checked by plotting the values against a Gaussian distribution. The empirical significance cutoff was based on the top 0.1% iHS scores. We also performed genome-wide CMS [23] analysis in African (YRI), European (CEU), and East Asian (JPT +CHB) populations from the 1000 Genome Project [69] to validate our results. The empirical significance cutoff was based on the top 0.1% CMS scores. Differences in allele frequencies are indicators of possible differences in selection between populations. To test whether the differences in allele frequencies may result from different local selection pressure, we used a probabilistic method which was recently put forward by He and colleagues [25]. We used this approach to test and estimate selection differences between populations for the four top associated loci. We first estimated differences in selection coefficients between populations (CHB, CEU and YRI) using logarithmic odds ratios of allele frequencies. The variance of the estimation was then calculated based on genome-wide variants. Finally, we calculated P values assuming a Gaussian distribution of the statistic under the neutral model. The significance cutoff was P<0.005 after multiple testing correlation.
To investigate whether the loci associated with eyebrow thickness are more differentiated among populations than expected under neutral genetic drift, for each population m, we calculated the polygenic eyebrow thickness score (genetic score) as where β l is the effect size of the eyebrow thickness increasing allele l, and p ml is the frequency of allele l in population m. We first used the four loci identified here in conjunction with allele frequency data from the 1000 Genome Project dataset to estimate the genetic score for eyebrow thickness in each population, with the effect sizes estimated in the meta-analysis. To test whether there was a signature of polygenic adaptation, we then adopted a framework developed by Berg and Coop [26], which builds a multivariate normal model based on matched, presumably neutral variants, to account for relationships among populations. Traits that have undergone local selection will show excess divergence among populations (significance cutoff: P<0.05).