A Robust GWSS Method to Simultaneously Detect Rare and Common Variants for Complex Disease

The rapid advances in sequencing technologies and the resulting next-generation sequencing data provide the opportunity to detect disease-associated variants with a better solution, in particular for low-frequency variants. Although both common and rare variants might exert their independent effects on the risk for the trait of interest, previous methods to detect the association effects rarely consider them simultaneously. We proposed a class of test statistics, the generalized weighted-sum statistic (GWSS), to detect disease associations in the presence of common and rare variants with a case-control study design. Information of rare variants was aggregated using a weighted sum method, while signal directions and strength of the variants were considered at the same time. Permutations were performed to obtain the empirical p-values of the test statistics. Our simulation showed that, compared to the existing methods, the GWSS method had better performance in most of the scenarios. The GWSS (in particular VDWSS-t) method is particularly robust for opposite association directions, association strength, and varying distributions of minor-allele frequencies. It is therefore promising for detecting disease-associated loci. For empirical data application, we also applied our GWSS method to the Genetic Analysis Workshop 17 data, and the results were consistent with the simulation, suggesting good performance of our method. As re-sequencing studies become more popular to identify putative disease loci, we recommend the use of this newly developed GWSS to detect associations with both common and rare variants.


Introduction
The search for common variants (CVs) that are reproducibly associated with complex human diseases has benefited from large-scale genome-wide association (GWA) studies. However, previously identified CVs only account for a small proportion of the trait variation [1,2], i.e. missing heritability [1]. Even with meta-analyses to combine large GWA datasets, similar situations occur [2]. It is suggested that a certain proportion of the missing heritability might come from rare variants (RVs), which have low frequency in the population to increase the risk of developing complex diseases [3][4][5]. Unfortunately, RVs are poorly covered in GWA studies [1], and the power of detecting their individual effects is low even in large scale studies [6]. A larger sample size, say more than 10,000 subjects, is required to detect disease-associated variants with minor-allele frequency (MAF) less than 0.01 [7]. Recently, there is increasing evidence in support of the contributions of RVs to the risk of diseases, such as CLEC16A gene for type-1 diabetes and ANGPTL gene for metabolism traits [8,9]. With the advancement of high-throughput sequencing technologies, large amounts of sequencing data would be rapidly produced in the future years, which provides a unique opportunity to detect disease associated variants, in particular for RVs.
There are two main hypotheses underlying complex human diseases, the common-disease common-variant (CDCV) and the common-disease rare-variant (CDRV), and they are not mutually exclusive in many cases. The combination of CVs and RVs may exert their effects through synthetic association (i.e., synergetic effect) or have independent effects on the trait [10]. For instance, Ionita-Laza et al [11] uncovered six CVs (p-value = 0.03~1×10 −4 ) and one RV (p-value = 1.7×10 −7 ) in NOD2 that affect the risk of developing Crohn disease. Wessel et al [12] identified both common and rare variants at CHRNB2 and CHRNA5 to exhibit significant associations with nicotine dependence. Nejentsev et al [13] found four RVs (p-value = 1.3× 10 −3~2 .1×10 −16 ) and one CV (p-value = 9.5×10 −17 ) in the IFIH1 gene to be protective against type-1 diabetes. In addition, one review article reported that both common and rare genetic variants in two genes (LPL and APOA5) play roles in the development of hypertriglyceridemia [14]. In prior statistical approaches to identify rare variants (which we describe below), very few of them deal with both common and rare genetic variants. Since both types of genetic markers in the same gene may exert their effects for disease traits, we propose a new method in the present study to identify common variants as well as rare variants for complex diseases.
In the last decade, several association tests have been developed for identifying RVs [9,11,[15][16][17][18][19][20][21][22][23][24][25]. These methods may have limited power to detect the true signals under certain conditions. In particular, various collapsing and weighting schemes have impacts on the performance of association testing as described below. Firstly, some test methods, including weighted sum-test (WSS) [19], variable-threshold (VT) [23], and wSSU [15], sum over RVs where rare information were weighted by MAFs. Under this condition, a rarer (common) variant is always corresponding to a larger (smaller) weight. Moreover, these methods and kernelbased adaptive cluster (KBAC) [9], Sum Test [22] and the combined multivariate and collapsing (CMC) [17] do not take association directions into account. This could result in power loss in detecting associations, especially in the presence of CVs and/or in the presence of both harmful and protective influences. Secondly, methods that adopt a weighting scheme based on MAF, such as WSS [19], VT [23], sequence kernel association test (SKAT), where two types of weighting schemes (the beta density weight function, SKAT b ; and the equal weigh function, SKAT 1 ) are suggested [24], the optimal test of the SKAT (SKAT-O) [25], and wSSU [15], may misclassify signal CVs as noise variants (or vise versa), during the collapsing procedure. In addition, the rarer allele frequencies do not always imply the higher associations. Thus, these methods might be sometimes difficult to identify a signal RV, in particular when the RV is even rarer in normal population than in affected samples [26]. We noticed that Ionita-Laza et al [11] further modified the SKAT [24] to detect both CVs and RVs, which considered the weighted sum as the test statistic (named SKAT-A and SKAT-C). The SKAT-A and SKAT-C combine SKAT tests with adaptive sum test [27] and sum test [22], respectively, for the joint effects of RVs and CVs. Thirdly, methods that adopt a weighting scheme using odds ratio (OR), such as OR weighted sum test (ORWSS) [16], often suffer from the problem of unstable estimation due to sparse or empty cells in the tables for estimating OR for genetic effects, even after the adjustment in adding 0.5 to each cell of contingency table, especially when only RVs are present (this phenomenon is also observed in our simulation results). Finally, KBAC [9] clusters variants together based on combinations of genotypes. However, the number of combinations will be largely increased if too many variants are present, and it is difficult to specify which combinations are causal.
In this study, we reviewed recently proposed methods for the detection of disease associations for both CVs and RVs for comparisons. To overcome aforementioned methodological limitations, we also proposed a class of new methods based on t-statistic, to account for association directions, the magnitude of association, and the presence of noise variants. In addition, we utilized a "filtering algorithm" with or without a threshold to collapse information of rare variants collectively in detecting disease associated variants. In the cases where rare and common variants altogether contribute to the etiology of complex traits, our new methods had higher power in detecting signal variants. We conducted a series of simulations to evaluate detection power in different scenarios, including linkage disequilibrium (LD) structure among variants, association directions, MAF distribution of RVs, and the ratio of signal and noise variants. In addition, we applied our method to the Genetic Analysis Workshop 17 (GAW17) dataset to evaluate the performance of our method.

The GWSS Method
One major aim of this research is to propose a robust test method to overcome the difficulties encountered by existing methods. In view of the robustness, we aim to extend the idea of WSS without specifying any distributional assumption, which includes two core components. First, a weighted sum of RVs is used to aggregate the rare information. Commonly used weights include MAF and OR, but they will suffer the problem of low power as mentioned in the previous section. Second, with the aggregated variants, a summary measure is adopted to obtain the final genetic scores and the corresponding p-values. From these two considerations, we propose a generalized weighted sum statistic (GWSS), which includes various existing methods as special cases. Note that most of existing methods need a predetermined value for MAF to define "rare variants". We believe that a data-driven approach for defining rare variants is more suitable, especially in the presence of signal CVs. We thus also incorporate the idea of VT [23] into GWSS (see Step 3 of the GWSS algorithm below). It is the data-driven weighting scheme that makes GWSS possible to simultaneously consider the effects of CVs and RVs. To formally describe GWSS, we consider a case-control study that examines k variants with n subjects (n 1 cases and n 0 controls). For subject i, let G ij 2 {0,1,2} be the number of minor allele in variant j and D i 2 {1,0} be the disease status (yes/no). For variant j, let q j be its adjusted MAF [19] in the control group, and let OR j be its estimated odds ratio. The algorithm of GWSS is described below.
GWSS Algorithm: 1. Given a threshold θ of MAF, obtaining the aggregated genotype for each subject i fw j Á Iðq j yÞgG ij , where w j is one of the following four possible weights: ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi n 0 q j ð1Àq j Þ p 2. Based onfG i ðyÞ; D i g n i¼1 , calculate the summary statistic S θ using one of the following two summary measures: If θ is known a priori, define S = S θ . Otherwise, repeat Steps 1-2 for all θ 2 Θ and calculate S = sup θ2Θ S θ as the summary statistic, where Θ is a predetermined set of possible values of θ.
4. Obtain the empirical p-value of S via randomly permuting the disease status, while keeping the case/control ratio constant.
The robustness nature of GWSS is threefold as described below: 1. The weightw MAF j , which has been used in constructing WSS [19], implicitly assumes that RVs are more likely to be associated with disease. However, w MAF j is limited to detect only harmful association but ignores the protective one. Further, Feng et al. [16] proposed ORWSS, which used the weight w OR j to estimate association strength and directions simultaneously. Unfortunately, w OR j suffers the problem of instability in estimating the association strength of RVs. It therefore motivates us to propose the weightw D j , which is a special case of aSum when setting the cutoff as 1 [27,28], that considers the direction of OR but ignores the information of strength. The information of strength, however, does play a role in affecting the testing performance. In view of this point, instead of using the unstably estimated OR directly, we propose the weight w DW j that incorporates the information of strength in a robust manner. To see this, we prove that |OR-1| and |log(OR)| are both decreasing functions of the MAF of the control group under the assumption of rare disease (see S1 Appendix for the proof). This result implies that w MAF j can be used as a surrogate for the association strength. As a result, the weight w DW provides a more robust estimation in both association strength (i.e.,w MAF j ) and direction (i.e.,w D j ), and is able to detect the signal variants with both harmful and protective associations. We note that any weight function can be used in our GWSS. For example, we can also use the beta density weight function of SKAT [24] as w j .

In
Step 2, based on the aggregated genotype G i (θ) for each subject i, two summary measures offG i ðyÞg n i¼1 , the "rank-sum" and "t-sum", are considered. The "rank-sum" statistic, which borrows the idea of the Wilcoxon rank-sum statistic, is robust against model assumptions, but at the cost of being less efficient when the distributional assumption is valid. Thus, when the sample size is moderate (i.e., the normal assumption is approximately hold), we suggest to use the more efficient "t-sum" to summarize information. 3. Before implementing WSS, a threshold θ for defining RVs should be determined. The same idea is also used in GWSS, which involves the indicator function I(q j θ) in Step 1 of GWSS algorithm. This indicator function, however, ignores all variants with MAF >θ, which will lose detection power in the presence of signal CVs. Needless to say, it is also hardly the case that researchers have prior knowledge about the value of such θ. It turns out that a data-driven approach for determining θ is preferable. We thus follow the idea of VT to use the maximum value of S θ among possible values of θ (see Step 3) as the summary statistic, which is able to adapt to various situations of signal/noise RVs and CVs (i.e., is more robust than using a predetermined threshold θ).
We close this section by noting that GWSS includes many existing test methods as special cases, such as WSS [19] and ORWSS [16]. See Table 1 for the details. Comparing with WSS and ORWSS, "D" denotes the method using w D j and "DW" denotes the method using w DW j in Step 1, "t" denotes the method using t-sum in Step 2, and "V" denotes the method that taking maximum over θ in Step 3.

Simulation Studies Simulation Settings
We conducted simulation studies to compare the performance of the proposed GWSS with existing methods. Assume that a study examines k SNPs with 500 cases and 500 controls. Two settings for MAF (denoted by MAF j ) of RV were considered. For the case of identical distribution, signal and noise RVs were both generated from Uniform(0.001, 0.01). For the case of different distributions, signal and noise RVs were generated from Uniform(0.001, 0.005) and Uniform(0.001, 0.01), respectively, i.e. the MAFs of signal variants are lower than noise variants [28]. CVs are generated from Uniform(0.01, 0.1) in both cases. Conditional on MAF j , let the latent vector be independent and follow a multivariate normal distribution with mean zero and CovðZ ð'Þ iu ; Z ð'Þ iv Þ ¼ r juÀvj , and [29] and F(Á) represents the cumulative distribution function of standard normal. The positions of signal/noise RVs/ CVs are also randomly specified (the same assumption was adopted in Basu & Pan, 2011). Finally, the disease status D i is generated from the logistic regression model This regression model is designed so that the background disease prevalence is 0.05 with 8 signal variants and k − 8 noise variants. The effects of SNP 1-4 are additive, and the effects of SNP 5-8 are non-additive. As to SNP 5-8, the effect is 3 if any of the SNP 5-8 has mutation. We note that the non-additive effect of multi-markers may be jointly incorporated in haplotypes [30] to capture the combined effects of variants signals, and this setting is to reflect a situation where no specific mutation is necessary but any of them is sufficient to increase the risk of disease. We consider different combinations of the following situations: (1) ρ = 0 or 0.7 for LD between markers; (2) OR j for (RVs, CVs) is (2, 1.5) or (1/2, 1/1.5); (3) ðn RV true ; n CV true ; n RV noise ; n CV noise Þ ¼(8, 0, 0, 0), (8, 0, 8, 0), (8, 0, 4, 4), (7, 1, 8, 0) and (7,1,4,4), wheren RV true , n CV true , n RV noise and n CV noise represents the number of signal RVs, signal CVs, noise RVs, and noise CVs, respectively, andk ¼ n RV true þ n CV true þ n RV noise þ n CV noise ; and (4) same or different MAF distribution for signal and noise RVs. For all methods, the empirical p-values are calculated based on 500 permutations. The detection power is calculated as the proportion of test statistic attained significance level of 0.05 over 1000 simulation datasets.
Results S1 Table shows that the type I errors of all methods under each setting are controlled at the level of 0.05, except for the CMC method. Thus, we used the empirical p-value for CMC statistic instead in the following simulation. We only present methods with better performance in the Tables, including five existing methods (SSUw, wSSU, SKAT b , SKAT-C, SKAT-A) and five tests of our GWSS method (ORWSS-t, DWSS-t, VORWSS-t, VDWSS-t and VDSS-t). The results of comparisons with other methods are listed in the Supporting Tables. Table 2

Harmful Association Direction Without LD (ρ = 0)
In Table 2 and S2 Table (column (a)), when there are only RVs, detection powers of almost all methods are larger than 95%. Among methods, Sum test, KBAC, WSS and WSS-t have the best performance. When there are noise variants, MAF-based methods such as wSSU, VT, WSS-t, and VWSS-t have better performances than others, since the noise CVs are down-weighted by the value of MAF in these methods. In contrast, the detection powers of SSU, Sum Test, KMR, C-alpha, SKAT 1 (SKAT with equal weight) and DSS-t decrease dramatically, suggesting that some underlying assumptions in these methods are inappropriate in the presence of noise CVs. Comparing WSS with WSS-t for which one uses "rank-sum" and the other uses "t-sum", WSSt has better performance than WSS, especially when noise CVs are present. The same pattern Table 2. Detection power for identical MAF distributions of signal and noise rare variants (only list methods with better performance). exists between ORWSS and ORWSS-t. This observation suggests that, in the presence of noise CVs and with moderate sample size, using "t-sum" to obtain the summary score is able to extract more information and, hence, is more efficient than using "rank-sum".
As mentioned in the Introduction section, OR provides information of both association strength and directions, but OR-based weighting scheme might result in loss of power due to the unstable estimation of association strength with sparse data in tabulated genotypic data (i.e. a contingency table). Taking two special cases of GWSS, ORWSS-t (using the weightw OR j ) and DSS-t (using the weightsignðw OR j Þ) to exemplify. When there are only RVs involved, we observe that DSS-t suffices to have better performance than ORWSS-t. On the other hand, when noise CVs exist, ORWSS-t and DWSS-t have better performances than DSS-t, indicating that the strength of association can assist the detection process when it can be estimated well. Moreover, we can observe that VDSS-t has a better performance than DSS-t in the presence of CVs. It indicates that, whether the CVs exist or not, the VT approach is able to improve the detection power even we do not estimate the association strength directly.

Opposite Association Directions Without LD (ρ = 0)
Simulation results are provided in Table 2 and S2 Table (column (b)). It can be seen that variants protectively associated with disease can be detected by SSU, KBAC, KMR, C-alpha, and SKAT-type methods (including SKAT 1 , SKAT b , SKAT-C and SKAT-A) if there is no noise CV. In the presence of noise CVs, SSUw (assuming the variance of each variant does not equal to one), wSSU (assigning small weight to noise CVs), SKAT b (SKAT using beta (1,25) density function evaluated at q j as the weight) and SKAT-C/A (using an optimal weight for combining CV and RV test statistics) perform better as expected. As to the class of GWSS methods, ORWSS-t, DWSS-t, VORWSS-t, VDWSS-t and VDSS-t perform well uniformly in all situations. These results reveal that using OR-based weighting (w OR j ,w DW j ) is able to detect associations for both harmful and protective effects. Besides, when the weight itself has the ability to identify association directions, the corresponding VT-version methods (e.g. VORWSS-t, VDWSS-t, and VDSS-t) still have improved performances. In contrast, we observe power losses in Sum Test, CMC-p (cutoff-point is fixed at 0.01), WSS-t and VWSS-t when opposite association directions are present. The failure to detect associations using Sum Test and CMC-p is due to the offset of harmful and protective effects during the summation process. The power losses in WSS-t and VWSS-t are because the weight w MAF j has no ability to assist identifying protective associations. Comparing with WSS-t, WSS using rank-sum is less affected by the presence of RVs with opposite directions and has a higher detection power than using "t-sum", which is a benefit of using the more robust "rank-sum" summary measure.

With LD (ρ = 0.7)
When variants are correlated with each other (see Table 2 and S2 Table columns (c), (d)), the results are similar to the scenarios of ρ = 0: (1) SSU, KMR, C-alpha, ORWSS and SKAT 1 performed worse in the presence of CVs; (2) for GWSS, using t-sum is better than using rank-sum in most of the situations; (3) methods using OR-based weighting scheme can identify protective association while methods using MAF-based weighting scheme cannot; (4) the VT-version of GWSS methods are superior to non-VT-version ones whether the CVs and protective association exists or not. The impact of LD among variants depends heavily on the association directions among variants. When the effects of signal variants are in the same direction, the joint signals become stronger than the case of uncorrelated variants. In this situation, it is easier to detect association and the detection power is much higher than the case of ρ = 0 ( Table 2 and S2 Table column (c)). On the other hand, when both harmful and protective variants exist simultaneously, the signals become weaker and are not easy to be detected ( Table 2 and S2 Table  column (d)), due to the offset of opposite association effects. S3 Table demonstrates the Type I errors and detection powers under different MAF distributions between signal and noise RVs. Comparing to the scenarios (Fig 1 and S3 Table) with identical MAF distribution (please see results in Table 2 and S2 Table), we found almost the same patterns in all methods. Moreover, we found that VT and WSS-t using MAF-based weighting scheme outperform other methods in OR>1 scenario (S3 Table column (a)), since the noise variants are down-weighted by MAF. Although both SKAT-C and SKAT-A perform worse than GWSS and wSSU in the scenarios of different MAF distribution across CVs and RVs with OR>1(S3 Table column (a)), they outperform other methods in OR<1 scenario (S3 Table column (b)). It reveals the non-robustness of SKAT-C/A, whose performances will be heavily affected by the varying situations. As expected, the detection powers of VT and WSS-t for detecting signal CVs with protective associations are low. On the other hand, the proposed DWSS-t and VDWSS-t take into account the association strength and direction as well as the MAF distributions of RVs, and are detected to perform well in all situations. Notably, in comparison with the scenarios of identical MAF distribution for both noise and signal RVs (see Table 2 and S2 Table columns (a), (b)), all methods encounter a dramatically decline in power (Fig 1 and S3 Table columns (a), (b)). One reason is that the MAF distributions of signal RVs The results were based on 1000 subjects (500 cases and 500 controls). All empirical p-values were calculated from 500 permutations. The detection power is defined as the proportion of test statistic attained significant level 0.05 over 1000 simulations. CMC method use 0.01 as a threshold for rare variants. SKAT 1 and SKAT b represents SKAT using equal weight and using beta (1,25) density function evaluated at q j as the weight, respectively. SKAT-C and SKAT-A combined SKAT and sum test and adaptive sum test, respectively. in the scenario of different MAF distributions (i.e. MAF~Uniform(0.001,0.005)) are rarer than those in the scenario of identical MAF distributions (i.e. MAF~Uniform(0.001,0.01)), and thus, the signals are more difficult to be identified.

Summary of Simulation Studies
We summarize the simulation results for the performances of all methods under different scenarios in Table 3 and S4 Table. The power differences across all methods in different scenarios are first calculated. We classify a method to be sensitive (•), slightly sensitive (Δ), or non-sensitive (X) if the difference is larger than 0.1, between 0.05 and 0.1, or less than 0.05, respectively. We found that the decline in power is mainly caused by opposite association directions and the presence of noise CVs, followed by the presence of noise RVs (Table 3 and S4 Table column (a)). We also evaluate the influences of two factors simultaneously (association directions and the existence of noise variants) in Table 3 and S4 Table column (b). For instance, KBAC is less sensitive in the presence of opposite directions and noise CVs, but it has worse performance when both factors exist simultaneously. Moreover, the MAF distributions of signal and noise variants have impacts on the detection power. For example, ORWSS and SKAT are robust to the presence of opposite directions for identical MAF distributions, but are not robust for different MAF distributions. It is also found that the performance of SKAT is sensitive to the selection of the weight function, since SKAT 1 and SKAT b have totally different behaviors. SKAT-C and SKAT-A are also sensitive to the existence of noise variants, MAF distributions, the association directions, and the chosen weight function. In general, the performances of DWSS-t, VDWSS-t, and wSSU are robust against the association directions and the presence/ absence of signal /noise CVs.
It can be seen that wSSU and VDWSS-t have comparable performances, and are the best performers among all methods. We should emphasize that wSSU assumes the validity of a logistic regression model, and thus a good performance is expected under our simulation settings. On the other hand, VDWSS-t is totally nonparametric tests that does not require any model specification, and is robust to various situations of MAF and signal /noise variants. As a result, VDWSS-t is expected to be more applicable in practice since it is rarely the case that we know what the true model is.

Application
To evaluate the performance of our GWSS method, we used the GAW17 mini-exome dataset (http://www.gaworkshop.org/gaw17, without true answers). The GAW17 dataset consists of 697 unrelated individuals, each of whom provided genotypes (in total 24,487 SNPs assigned to 3,205 genes) that were sequenced from the 1000 Genomes Project (http://www.1000genomes. org). The simulated phenotypes used binary disease status that was generated for each individual (Almasy et al., 2011). We calculated association p-values for disease affected status based on 1000 permutations. We compared our GWSS method to a few commonly implemented methods, including SSUw, WSS, ORWSS, VT, and SKAT-type methods (results are shown in Table 4 and S5 Table).
To conduct the association tests of the GAW17 data, only genes that (1) had significant pvalue 0.001 detected in at least one of the GWSS, SSUw and SKAT-C/A methods, and (2) consisted of at least one RV and CV were considered. We listed fourteen genes that showed significantly aggregated signals to the disease affected status in Table 4 and S5 Table. Eight genes (ARL6IP2, TRIM42, SHC3, FRMPD2, NFKBIA, MBD1, GDF15 and SUSD2) had the smallest pvalue detected by one of our GWSS method (i.e., WSS-t, ORWSS-t, VWSS-t and VDSS-t). The p-values of the other six genes (ADAM15, MSH4, EPHB1, FAM13A1, DGKZ and FLT1) were also ranked in the top three among all methods. We found that SKAT-type methods performed well (p-values 0.001) in seven genes, but completely failed to detect signals for three genes (APL6IP2, TRIM42 and NFKBIA). Notably, three genes (ARL6IP2, NFKBIA and MBD1) can only be detected by our WSS-t and VWSS-t. These results suggested that our GWSS method had good performance among all methods. Taken together, we concluded that our GWSS method robustly showed good performance under different scenarios.

Discussion
Various statistical testing strategies have been developed for identifying associations between RVs and complex disease traits [9,[15][16][17][18][19]21,23,24]. With noted limitations in existing methods, we developed a robust GWSS method to identify rare and common disease-related variants simultaneously. The proposed GWSS method works well under most of the simulated scenarios. Comparing with existing methods, GWSS can correctly control the type-I error rate and achieve high detection powers under harmful and/or protective association settings, when LD structures among variants are absent (i.e. ρ = 0) or present (i.e. ρ = 0.7) ( Table 2 and S2 Table). Many existing methods suffer from low to moderate power loss when noise variants (rare or common) are present. The presence of noise variants can dramatically decrease the detection powers and increase the false positive results, which further complicate the interpretations of the analysis results. Importantly, by incorporating the idea of VT, our GWSS method can separate signal variants from noise ones in a data-driven approach. In practice, noise may also be introduced from calling rare variants and using imputed genotypes [7]. Methods like our GWSS can help to combine information from both RVs and CVs and further provide a basis to enhance our understanding of complex trait genetics.
Power loss may also occur when using methods assuming the same effect direction of variants, since in reality both risk and protective variants are present [31]. It is easy to see that risk effects are neutralized by protective effects in Sum Test and CMC-p methods. Similarly, methods using MAF as a weighting scheme, such as VT, WSS-t, and VWSS-t ignore associations with protective effects (i.e. OR<1), also encounter power loss. Although the protective associations can be detected by using OR as the weighting scheme, the unstable estimation of OR arises another difficulty. The newly proposed VDWSS-t exhibits a more robust estimation in both association strength and direction, which are more applicable in many situations.
The power of association testing also depends on the pattern of LD among markers [32], since noise variants may be associated with a signal variant due to LD. High LD among RVs and CVs often makes it difficult to detect which variants are the true association signals, and causes loss in detection power. In particular, when high LD exists among RVs (including both harmful and protective variants), a huge loss in power is observed due to restricted signals [15]. This situation is also found in our simulation studies. A dramatic power loss can be found in opposite association directions when noise variants are additionally present (Table 2 and S2  Table column (d)). The comprehensive influences of LD structures among variants in RVs detection need further studies.
It is found that WSS-t and ORWSS-t performed better than WSS and ORWSS, suggesting that the use of "t-sum" is beneficial especially when CVs are present. One reason is that "tsum" utilizes more information than "rank-sum", through considering both the mean shift and variation. In particular, when only RVs are involved and all variants have harmful effects, most of the genotypes are common homozygotes (i.e. G ij = 0) and the aggregated genotypes (i.e. G i (θ)) close to zero in most of the subjects, forming a right-skewed distribution. In this situation, "rank-sum" is more appropriate due to its robustness against the normality assumption. Similarly, when the assumption of the same association direction among variants was violated, both WSS and WSS-t had poor performance, while the rank-sum slightly improved the detection power. In contrast, when CVs are involved (i.e. fewer cells with zero value show in tables of genotyping data) with moderate sample size, the distribution of aggregated genotypes would approximately follow the normal distribution, so using "t-sum" is suitable.
Weighting scheme plays an important role in successful identification of signal variants. Inappropriately using weighting scheme may lead to misclassification of noise variants to signal variants. For example, when all RVs have true signals and all CVs are noise, methods using MAF-based weighting can efficiently separate true signals from noise variants, thus yielding gains in power. When signal and/or noise CVs are present, however, methods such as SSU, Sum Test, KMR, C-alpha, and KBAC without considering the effects of CVs will suffer from certain degree of power loss. The choice of the weight function is also critical to some methods. For instance, SKAT and VDWSS-t perform nearly equally well under identical MAF distribution (Table 2 and S2 Table), but SKAT is not a good performer under different MAF distribution (Fig 1). We should emphasize that our GWSS adopts a robust weight function that is able to adapt to various situations, which is more applicable in practice. Additionally, the calculation of GWSS is not time-consuming. In fact, GWSS without using varying threshold values has very similar computational time to WSS. For one simulation data used in our Simulation section, it costs about 2 minutes to obtain p-values. When a varying threshold method is implemented, the calculation can still be completed within 6 minutes.
Due to the nonparametric nature of GWSS, however, currently there is no simple method to adjust for the effects of potential covariates, except the stratification method where one can implement GWSS within the strata. It is of great interest to extend GWSS to accommodate to include potential covariates.
In conclusion, our results demonstrated that the newly proposed GWSS method is able to deal with variants with bidirectional effects and association strength successfully. The GWSS can aggregate information from both CVs and RVs automatically even there is no prior knowledge. The GWSS also takes into account different LD patterns, numbers of signal/noise variants, and association directions. The GWSS can be applied to genome-wide sequence data to assist with the identification of signals of both common and rare variants, and to enhance our understanding of complex trait genetics.

Author Contributions
Conceived and designed the experiments: PHK HH CFK JRL. Performed the experiments: CFK JRL PHK HH. Analyzed the data: JRL CFK. Contributed reagents/materials/analysis tools: JRL CFK. Wrote the paper: CFK JRL. Revised manuscript critically for important intellectual content: PHK CFK HH JRL.