Testing Jumps via False Discovery Rate Control

Many recently developed nonparametric jump tests can be viewed as multiple hypothesis testing problems. For such multiple hypothesis tests, it is well known that controlling the type I error of each individual test often produces a large proportion of erroneous rejections, and the situation becomes even worse when jumps are rare events. To obtain more reliable results, we aim to control the false discovery rate (FDR), an efficient compound error measure for erroneous rejections in multiple testing problems. We perform the test via the Barndorff-Nielsen and Shephard (BNS) test statistic, and control the FDR with the Benjamini and Hochberg (BH) procedure. We provide asymptotic results for the FDR control. Through simulations, we examine the relevant theoretical results and demonstrate the advantages of controlling the FDR. The hybrid approach is then applied to an empirical analysis of two benchmark stock indices with high frequency data.


Introduction
Recently many testing procedures have been proposed for detecting asset price jumps [1][2][3][4][5][6][7][8]. These testing procedures use high frequency data to calculate test statistics for a certain period and then use these test statistics to test whether jumps occur in that period. Formally, the null hypothesis for such a test at each period $i$, $i = 1, \ldots, m$, can be stated as

$$H_0^i: \text{no jump occurs in period } i. \quad (1)$$

In addition to revealing whether the underlying price process has a jump component, the "one test statistic for one period" approach for testing (1) also allows us to extract information about when and how frequently jumps occur in the whole sampling period. Such information is even more important for research on event studies, derivative pricing and portfolio management.
If the number of periods $m$ is greater than one, the jump test can be naturally viewed as a multiple hypothesis testing problem. Previous research used different test statistics, but often followed a similar decision procedure: reject the null hypothesis if the corresponding p-value is less than the controlled type I error $\alpha$. Nevertheless, controlling the type I error of each test often produces a large proportion of erroneous rejections. The situation becomes even worse when jumps are rare events.
To avoid the problem described above, one may look for a more sensible compound error rate measure. In this paper we focus on false discovery rate (FDR). For testing hypothesis (1), we use a nonparametric jump test procedure proposed by [3,4]. After obtaining the p-value for each single hypothesis test, we use the procedure proposed by [9] to control the FDR when simultaneously carrying out these hypothesis tests.
Several studies on jump tests have also tried to deal with the multiplicity issue. For example, Lee and Mykland [8] set the significance level based on the distribution of the extreme value of the test statistic under the null. This ensures that the probability of global misclassification of the jumps can approach zero under some regularity conditions. Bajgrowicz and Scaillet [10] proposed a statistical method based on setting an appropriate threshold for the test statistic to eliminate false detections of jumps. They then applied their method to analyze relationships between jumps in the U.S. stock market and announcements of different kinds of economic news. FDR control has also been applied to the detection of jump components by [5], in which an improved version of the jump test statistic proposed by [1] was used. The main difference between [5] and our paper is that we give theoretical justifications for the performance of the jump test statistic in a multiple hypothesis testing context. We also conduct an intensive simulation study to support our theoretical results.
The rest of the paper is organized as follows. In Section Methods, we first briefly describe the Barndorff-Nielsen and Shephard (BNS) nonparametric test and the Benjamini-Hochberg (BH) procedure. We then discuss some asymptotic results for the FDR control. We focus on the case when p-values are calculated based on asymptotic distributions of the test statistics. We show that under appropriate conditions, the FDR can be asymptotically controlled by the BH procedure when the p-values are obtained via the asymptotic distributions. In addition, the magnitude of the approximation error of the asymptotic FDR control is bounded by a non-decreasing function of the expected number of true null hypotheses. This property indicates that the more false null hypotheses we have, the better the asymptotic FDR control performs. In Section Results, we conduct a simulation study to show that the performance of the BNS-BH hybrid procedure is positively related to the number of false hypotheses and the sampling frequency of the data, and is stable when the number of hypotheses and the required FDR level change. We finally apply the proposed procedure to analyze jumps in the S&P500 index and the Dow Jones Industrial Average index.

The BNS nonparametric jump test
Barndorff-Nielsen and Shephard [3,4] proposed a nonparametric test statistic (henceforth the BNS test statistic) which utilizes realized variance and bi-power variation to test for jump components of price processes which have continuous sample paths. To begin with, we briefly introduce some important theoretical results of the jump test procedure in financial econometrics. We say a random variable $X(i)$ belongs to the Brownian semimartingale plus jump class if

$$X(i) = \int_0^i \mu(t)\,dt + \int_0^i \sigma(t)\,dW(t) + \sum_{j=1}^{N(i)} D(j),$$

where $\mu(t)$ and $\sigma(t)$ are assumed to be càdlàg, $W(t)$ is a standard Brownian motion, $D(j)$ is the size of the $j$th jump within $(0, i]$, and $N(i)$ is the total number of jumps occurring within $(0, i]$. Here, we assume the number of jumps occurring within any finite interval is finite. If the jump term is absent, we say $X(i)$ belongs to the Brownian semimartingale without jump class. Some statistical assumptions can be made on $D(j)$ and $N(i)$ to simplify the analysis. For example, in the empirical finance literature, the jump size $D(j)$ is often assumed to follow a normal distribution, and the number of jumps within the interval $(i-1, i]$, $N(i) - N(i-1)$, is often assumed to follow a counting process with (finite) intensity parameter $\lambda(i)$, which may be time varying.
The realized variance and the realized bi-power variation in period $i$ are defined as

$$RV_i = \sum_{j=1}^{M} r_{i,j}^2, \qquad BV_i = \mu_1^{-2} \sum_{j=2}^{M} |r_{i,j}|\,|r_{i,j-1}|,$$

where $r_{i,j}$ is the $j$th of the $M$ equally spaced intraday log returns in period $i$, and $\mu_1 = E|Z| = \sqrt{2/\pi}$ for $Z \sim N(0,1)$. If $\log P(t)$ belongs to the Brownian semimartingale plus jump class, then

$$RV_i \xrightarrow{p} \int_{i-1}^{i} \sigma^2(t)\,dt + \sum_{j=N(i-1)+1}^{N(i)} D(j)^2, \quad (3)$$

$$BV_i \xrightarrow{p} \int_{i-1}^{i} \sigma^2(t)\,dt, \quad (4)$$

as $M \to \infty$. The term (3) is called the quadratic variation for the cumulative (log) return process $\log P(i) - \log P(i-1)$, and it is a sum of contributions due to the continuous log price process ($\int_{i-1}^{i} \sigma^2(t)\,dt$) and the jump process ($\sum_{j=N(i-1)+1}^{N(i)} D(j)^2$). The result of (3) follows from the theory of quadratic variation (e.g., [11]) and the result of (4) follows from the theory of the power variation process, which is a generalized version of the theory of the quadratic variation process [3]. Here $BV_i \xrightarrow{p} \int_{i-1}^{i} \sigma^2(t)\,dt$ without any further assumptions on the jump process or on the joint distribution of the jump process and $\sigma(t)$. Finally, if $\log P(t)$ belongs to the Brownian semimartingale without jump class, it is easy to see that both $RV_i$ and $BV_i$ converge in probability to $\int_{i-1}^{i} \sigma^2(t)\,dt$ as $M \to \infty$.

Barndorff-Nielsen and Shephard [3,4] showed that $JV_i = RV_i - BV_i$ can consistently estimate the quantity $\sum_{j=N(i-1)+1}^{N(i)} D(j)^2$. In practice, to guarantee nonnegativity of the estimate, some truncation rule can be applied to $JV_i$, for example using $\max(JV_i, 0)$ or a shrinkage type estimator like (15) in our empirical analysis. To construct a test statistic for whether the jump term is present, now suppose that for $t \in (i-1, i]$, $\log P(t)$ belongs to the Brownian semimartingale without jump class, and the following conditions hold:

1. The process $\sigma(t)$ is pathwise bounded away from 0.
2. The joint process of $\mu(t)$ and $\sigma(t)$ is independent of the Brownian motion term $W(t)$ of the log price process.

Then, conditioning on $\mu(t)$, $\sigma(t)$, the quadratic variation and the realized bi-power variation process, Barndorff-Nielsen and Shephard [3,4] showed that the joint distribution of $RV_i$ and $BV_i$ converges asymptotically to a bivariate normal distribution. Under the null hypothesis that no jumps are present in period $i$, it can then be shown that

$$\frac{RV_i - BV_i}{\sqrt{A\,\frac{1}{M} \int_{i-1}^{i} \sigma^4(t)\,dt}} \xrightarrow{d} N(0,1), \quad (5)$$

where $A = (\pi/2)^2 + \pi - 5$. The term $\int_{i-1}^{i} \sigma^4(t)\,dt$ in the denominator of (5) is called the integrated quarticity, and to consistently estimate it, we can use the realized tri-power quarticity,

$$TP_i = M \mu_{4/3}^{-3} \sum_{j=3}^{M} |r_{i,j-2}|^{4/3}\,|r_{i,j-1}|^{4/3}\,|r_{i,j}|^{4/3},$$

where $\mu_a = E(|Z|^a)$ and $Z \sim N(0,1)$. In the following simulation study and empirical applications, instead of using the test statistic shown in (5), we will use three improved test statistics to obtain better performance.
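As a rough illustration of the quantities introduced so far, the following Python sketch computes the realized variance, the bi-power variation, the tri-power quarticity and a feasible version of the basic statistic in which the integrated quarticity is replaced by the tri-power quarticity. It is only a sketch under stated assumptions: the function names are hypothetical, finite-sample correction factors are omitted, and the one-sided p-value is taken from the upper tail of the standard normal distribution.

```python
import math
from statistics import NormalDist

# Absolute moments of a standard normal: E|Z|^a = 2^(a/2) * Gamma((a+1)/2) / Gamma(1/2)
MU_1 = math.sqrt(2.0 / math.pi)
MU_43 = 2.0 ** (2.0 / 3.0) * math.gamma(7.0 / 6.0) / math.gamma(0.5)
A = (math.pi / 2.0) ** 2 + math.pi - 5.0


def realized_measures(returns):
    """Return (RV, BV, TP) for one period of intraday log returns.

    Assumption: plain textbook forms without finite-sample corrections.
    """
    M = len(returns)
    a = [abs(r) for r in returns]
    rv = sum(r * r for r in returns)
    bv = sum(a[j] * a[j - 1] for j in range(1, M)) / MU_1 ** 2
    tp = M * sum((a[j] * a[j - 1] * a[j - 2]) ** (4.0 / 3.0)
                 for j in range(2, M)) / MU_43 ** 3
    return rv, bv, tp


def bns_statistic(returns):
    """Feasible difference-type statistic: (RV - BV) / sqrt(A * TP / M)."""
    M = len(returns)
    rv, bv, tp = realized_measures(returns)
    return (rv - bv) / math.sqrt(A * tp / M)


def upper_tail_p_value(z):
    """One-sided p-value, since jumps push the statistic upwards."""
    return 1.0 - NormalDist().cdf(z)
```

A large positive value of `bns_statistic` (equivalently a small `upper_tail_p_value`) is evidence against the null of no jump in that period. The transformed statistics used later in the paper are not reproduced here.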
The first one is proposed by [3,4] and uses the log transformation; we denote it by $Z_{\log,i}$. The second one is the Box-Cox transformed test statistic with parameter $r = -1.5$, denoted by $Z_{-1.5,i}$. Here the Box-Cox transformation for a positive number $x$ is defined as

$$B(x, r) = \begin{cases} (x^r - 1)/r, & r \neq 0, \\ \log x, & r = 0. \end{cases}$$

The third one is the ratio type test statistic $Z_{\mathrm{ratio},i}$ [12]. Under the null hypothesis that there is no jump occurring in period $i$, the test statistics $Z_{-1.5,i}$, $Z_{\log,i}$ and $Z_{\mathrm{ratio},i}$ have the standard normal distribution as their limiting distribution. When jumps occur in period $i$, the test statistics diverge to infinity as $M \to \infty$. For more discussion on the theoretical properties of the test statistics under the alternative (when jumps are present), please see [13].

In a multiple testing problem, the numbers of true and false null hypotheses are unknown to researchers, so they are assumed to be random variables. In contrast, the total number of hypotheses $m$ is generally known in advance and so is assumed to be nonrandom. Table 1 shows the different situations that arise when a multiple test is performed. The numbers of hypotheses we reject and do not reject are denoted by $R$ and $m - R$. The notations $U$, $T$, $V$ and $S$ denote the numbers of hypotheses we correctly accept, falsely accept, falsely reject and correctly reject, respectively. The false discovery rate (FDR) is then defined as the expectation of the false discovery proportion (FDP), i.e.

$$\mathrm{FDR} = E(\mathrm{FDP}), \qquad \mathrm{FDP} = \begin{cases} V/R, & R > 0, \\ 0, & R = 0. \end{cases}$$
In testing jumps, controlling the FDR has several advantages over controlling other compound error rates. First, if the price process really does not have a jump component, i.e., all the null hypotheses are true, then controlling the FDR is equivalent to controlling $\Pr(V \geq 1)$, the familywise error rate (FWER). Second, if the intensity of the jump process $\lambda \neq 0$, then as time goes on ($m$ increases), the proportion of false hypotheses among all hypotheses will be a nonzero constant with a high probability. Although this proportion may not be large, one may still expect that the more (fewer) rejections one has, the more (fewer) erroneous rejections are allowed to occur; or that the number of rejections should be proportional to $m$. In this situation, controlling compound error rates associated with the proportion of erroneous rejections, like the FDR, makes sense. In addition, the rejection criteria of some compound error rates, such as the FWER, are sometimes too stringent to yield rejections when the number of hypotheses becomes large. The criterion of the FDR is less conservative in this respect. Finally, controlling the FDR currently seems to be more accepted than controlling other compound error rates in many different research fields [14].
Let $p_{(1)} \leq \ldots \leq p_{(m)}$ be the ordered $p_i$'s and $H_0^{(1)}, \ldots, H_0^{(m)}$ be the corresponding null hypotheses. Benjamini and Hochberg [9] proposed a stepwise procedure to control the FDR at the required level $c$. The BH procedure can be simplified as the following two-step decision rule:

1. Let $i^* = \max\{ i : p_{(i)} \leq i c / m \}$.
2. Reject $H_0^{(1)}, \ldots, H_0^{(i^*)}$; if no such $i^*$ exists, reject no hypothesis.

Some controlling procedures for the FDR need a resampling scheme to construct the rejection region, which relies on intensive computations. The BH procedure, however, requires far fewer computational resources than those computationally intensive methods. The only computational burden of the BH procedure is to rank the p-values. This advantage becomes even more obvious when the number of hypotheses becomes very large.
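A minimal Python sketch of the standard BH step-up rule may make the decision rule concrete: find the largest rank $i$ whose ordered p-value satisfies $p_{(i)} \leq i c / m$, then reject the hypotheses with the $i$ smallest p-values. The function name and return format are illustrative choices, not taken from the paper.

```python
def benjamini_hochberg(p_values, c):
    """Benjamini-Hochberg step-up procedure at FDR level c.

    Returns (rejected, threshold): the sorted indices of rejected
    hypotheses and the implied p-value cutoff i* c / m (0.0 if i* = 0).
    """
    m = len(p_values)
    # Ranking the p-values is the only non-trivial computation.
    order = sorted(range(m), key=lambda i: p_values[i])
    i_star = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * c / m:
            i_star = rank  # keep the largest rank that passes
    rejected = sorted(order[:i_star])
    return rejected, i_star * c / m


# Usage: the first two hypotheses are rejected at level c = 0.05.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected, threshold = benjamini_hochberg(p, 0.05)
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its rank-specific bound, as long as some larger ordered p-value passes its bound.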
It can be shown that there is a relationship between the type I error $\alpha$ and the FDR. That is, if we reject $H_0^i$ when $p_i \leq \alpha$, $i = 1, \ldots, m$, it is possible to know what level of the FDR is controlled for the multiple testing of the $m$ hypotheses. For example, if the hypotheses are identical and the test statistics are all independent, then given the type I error $\alpha$, an estimator from [25], which involves a tuning parameter $k$, can be used to estimate the corresponding FDR. How the BH procedure performs depends on the dependence structure of the test statistics. Benjamini and Yekutieli [16] showed that the BH procedure can still control the FDR when the test statistics are not independent, provided that positive regression dependency (PRDS) on each test statistic under the true null hypotheses is satisfied. In addition, simulation studies in [14] showed that even if the PRDS condition is violated (e.g., there exist negative common correlations between the test statistics or the covariance matrix has an arbitrary structure), the BH procedure can still provide a satisfactory control of the FDR. Finally, if the test statistics have an arbitrary dependence structure, Benjamini and Yekutieli [16] showed that the BH procedure still guarantees that

$$\mathrm{FDR} \leq \frac{m_0}{m}\, c \sum_{j=1}^{m} \frac{1}{j},$$

where $m_0$ is the number of true null hypotheses. A more detailed discussion of the theoretical properties of the FDR and the BH procedure is provided in the next section.
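To illustrate how a fixed per-test cutoff $\alpha$ maps to an FDR level, the sketch below uses a Storey-type plug-in estimator with tuning parameter $k$. Whether this matches the exact estimator used in the paper is an assumption; the form shown estimates the proportion of true nulls from the p-values above $k$ and divides the implied expected number of false rejections by the observed number of rejections.

```python
def estimate_fdr(p_values, alpha, k=0.5):
    """Storey-type plug-in estimate of the FDR when rejecting p <= alpha.

    Assumptions: independent tests, uniform null p-values, and a
    hypothetical default k = 0.5 for the tuning parameter.
    """
    m = len(p_values)
    # Null p-values are roughly uniform, so those above k are mostly nulls.
    pi0_hat = min(1.0, sum(p > k for p in p_values) / (m * (1.0 - k)))
    rejections = sum(p <= alpha for p in p_values)
    # Expected false rejections (pi0 * m * alpha) over observed rejections.
    return pi0_hat * m * alpha / max(rejections, 1)


# Usage on ten illustrative p-values.
p = [0.001, 0.002, 0.003, 0.004, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95]
fdr_hat = estimate_fdr(p, alpha=0.01)
```

The choice of $k$ trades bias against variance in the estimate of the null proportion; values near 0.5 are a common default in practice.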

Asymptotic results
In our case of testing jumps, the null hypotheses are homogeneous. The concept of increasing and decreasing sets was used in [16] and [17] to introduce the concept of positive regression dependency on each one from a subset (PRDS). Let $\tilde{m}_0$ denote the number of true null hypotheses. In practice, $\tilde{m}_0$ is unknown in advance and so is assumed to be random here.
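Conditioning on $\tilde{m}_0 = m_0$ true null hypotheses (or equivalently $m_1 = m - m_0$ false hypotheses), the FDR is given by

$$\mathrm{FDR} = \sum_{v=1}^{m_0} \sum_{s=0}^{m_1} \frac{v}{v+s} \Pr\left(p \in D_{v,s}^{m_0}\right). \quad (6)$$

Here $D_{v,s}^{m_0}$ is a well constructed union of $m$-dimensional cubes such that $\{p \in D_{v,s}^{m_0}\}$ is the event that $v$ true and $s$ false null hypotheses are rejected when the BH procedure is implemented with $p$. Benjamini and Yekutieli [16] and Sarkar [17] showed that if the joint distribution of $p_i$ is PRDS on the set of true null hypotheses $I_0$, then the BH procedure controls the FDR at a level no greater than $c$. If $\hat{p}_M$ is used, the analogue of (6) is obtained by replacing $p$ with $\hat{p}_M$. The aim is then to prove that the difference between the two expressions vanishes as $M \to \infty$, so that implementing the BH procedure with $\hat{p}_M$ is asymptotically equivalent to implementing the procedure with $p$. The main results are the following two theorems, and their proofs are given in the supplementary materials.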
Theorem 1. Suppose we have $m$ hypotheses to be tested simultaneously. If the following conditions hold,

Discussions on the asymptotic results
The two theorems say that under some regularity conditions, we can asymptotically control the FDR. A key condition that makes the two theorems different is the requirement on the dependence structure of the elements of the vectors $p$ and $\hat{p}_M$. If the dependence structure of $p_i$ satisfies PRDS on $I_0$, it ensures that $E(V/R) \leq c$. Here we only require that PRDS holds on $I_0$; the dependence structure of $p_i$ on $I_1$ can be arbitrary. Marginal distributions of $p_i$ and $\hat{p}_{M,i}$ converging at the rate $O(1/M^{d})$ simultaneously for all $i$ is also needed for the consistent control. In addition, we also require the convergence of the joint distribution of the ordered p-values.

As an example of PRDS, let $T = (T_1, \ldots, T_m)$ follow a multivariate normal distribution, where $\Sigma$ is an $m \times m$ covariance matrix with elements $\sigma_{ij}$. Suppose that for each $i \in I_0$ and each $j \neq i$, $\sigma_{ij} \geq 0$; then the distribution of $T$ is PRDS on $I_0$, regardless of the covariance structure for $i \in I_1$. Mutual independence of $T_1, \ldots, T_m$ can easily be seen as a special case of PRDS on $I_0$. As for the nonparametric jump test in this paper, since the limiting distribution of the test statistics is a multivariate normal with $\sigma_{ij} = 0$ for each $i \in I_0$ and each $j \neq i$, PRDS on $I_0$ holds.

When $\Pr(p_i \leq \alpha) \leq \alpha$ for $\alpha \in (0,1)$, the distribution of $p_i$ is said to stochastically dominate the Uniform$(0,1)$ distribution. If $\lim_{M \to \infty} \Pr(\hat{p}_{M,i} \leq \alpha) \leq \alpha$, the distribution of $\hat{p}_{M,i}$ is said to stochastically dominate the Uniform$(0,1)$ distribution asymptotically. In order to control the FDR with the BH method asymptotically, we at least need $\Pr(p_i \leq \alpha) \leq \alpha$ for $\alpha \in (0,1)$ and $i \in I_0$. The condition is more liberal than requiring $p_i$ to have the exact Uniform$(0,1)$ distribution for $i \in I_0$, and it applies to the case when the test statistics are discrete random variables.
In the first equality of the expression shown in the proof of Theorem 1, $\Pr(p_i \leq q_k, \ldots, p_{(m)} > q_m)$ is the probability that, in addition to rejecting hypothesis $i$, we also reject $k - 1$ other hypotheses. Sarkar [23] showed a corresponding result for the case $m_0 = m$. Equation (13) is the difference between two familywise error rates (FWER, the probability that we make at least one false rejection), which are obtained respectively from using $p$ and $\hat{p}_M$ under the BH procedure. The result is not surprising since when all null hypotheses are true, FDR = FWER.
To make (11) vanish as $M \to \infty$, (9) in condition 5 of Theorem 1 is one of the sufficient conditions. However, as shown in Theorem 2, this condition is redundant when the test statistics are independent and continuous.
We finally have a look at the assumption. The assumption says that the convergence in law should hold simultaneously at the points $q_k$ for $1 \leq k \leq m$, and for all $i \in I_0$. Such convergence is reasonable for test statistics with a limiting normal distribution if we set $d = s/2$, $s = 1, 2, \ldots$. Note that if $T_i$ and $\hat{T}_{M,i}$ are continuous,

When M and m both go to infinity
In practice, the number of samples $M$ within a hypothesis may be less than the number of hypotheses $m$. How such a large $m$, small $M$ situation (or, in statisticians' terms, large $p$ (number of dimensions), small $n$ (number of samples)) affects statistical inference has been intensively studied recently, especially with respect to simultaneous convergence of the test statistics. For example, when the samples are i.i.d., sufficient conditions for $\hat{P}_{M,i} \xrightarrow{P} p_i$ uniformly for all $i$ were already provided by [23]. Clarke and Hall [24] documented that the difficulties caused by dependence of the test statistics can be alleviated when $m$ grows, but the result requires that the distributions of the test statistics have light tails, such as the normal or Student's t. Fan et al. [25] proved that if the normal or Student's t distribution is used to approximate the exact null distribution, the rejection area is accurate when $\log m = o(M^{1/3})$; but if bootstrap methods are applied, then $\log m = o(\sqrt{M})$ is sufficient to guarantee the asymptotic-level accuracy.
In practice, high frequency returns might not be i.i.d. distributed. Instead of assuming that samples have certain distributional properties, here we assume that (14) needs to hold. However, by jointly restricting growth rates of M and m, and together with some mild conditions, (14) can also be achieved. It can be seen in the following proposition. The proof of proposition 1 can be found in the supplementary materials.

Simulation study
For the simulation study, we consider a stochastic volatility plus jump (SVJ) model in which $dW_1(t)$ and $dW_2(t)$ are increments of standard Brownian motions and $\sigma^2(t)$ follows the CIR process. $J(t)$ follows a compound Poisson process (CPP) with a constant intensity $\lambda\,dt$, and $N(t)$ is the number of jumps occurring within the small interval $(t - \Delta t, t]$. We set the correlation between $dW_1(t)$ and $dW_2(t)$ equal to zero (no leverage effect). We use the following parameter values for the simulation: $\mu = 0.05$, $\alpha = 0.015$, $\beta = 0.2$, and $v = 0.05$. In the simulation, the unit of a period is one day. We vary the (daily) jump intensity $\lambda$ at six different levels: 0, 0.02, 0.05, 0.1, 0.15, and 0.2. Note that the intensity parameter $\lambda$ here is the expected number of jumps occurring per day. Different values of $\lambda$ tend to produce different numbers of jump days over the whole sampling period, and therefore result in different numbers of false null hypotheses. This allows us to see how such differences affect the outcomes of the simulation.
We mimic the U.S. stock market and generate one minute intradaily log prices over 6.5 hours each day. Thus in our simulation, $M = 6.5 \times 60 = 390$, $dt \approx \Delta t = 1/M$ and $\lambda\,dt \approx \lambda/M$. After obtaining a sample path, the jump test statistics $Z_{-1.5,i}$, $Z_{\log,i}$ and $Z_{\mathrm{ratio},i}$ and their corresponding p-values are calculated. We test hypothesis (1) with the test statistics and control the FDR at the level $c$ with the BH procedure.
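The discretization described here can be sketched in Python as follows. This is a deliberately simplified stand-in rather than the SVJ model of the paper: volatility is held constant instead of following a CIR process, the drift is omitted, and the daily volatility and jump size parameters are hypothetical. It only illustrates how one day of $M = 390$ one-minute returns with jump probability $\lambda/M$ per step might be generated.

```python
import math
import random


def simulate_day(M=390, daily_vol=0.01, lam=0.1, jump_sd=0.01, rng=None):
    """Simulate one day of M intraday log returns with rare jumps.

    Assumptions (not from the paper): constant volatility, zero drift,
    normally distributed jump sizes, at most one jump per step.
    Returns (returns, n_jumps).
    """
    if rng is None:
        rng = random.Random()
    dt = 1.0 / M
    returns = []
    n_jumps = 0
    for _ in range(M):
        # Diffusive part: variance scales with the step length dt.
        r = daily_vol * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # A jump arrives in this step with probability lam * dt.
        if rng.random() < lam * dt:
            r += rng.gauss(0.0, jump_sd)
            n_jumps += 1
        returns.append(r)
    return returns, n_jumps


# Usage: one reproducible day with daily jump intensity 0.1.
day_returns, day_jumps = simulate_day(lam=0.1, rng=random.Random(42))
```

Repeating this for $m$ days and feeding each day's returns to a jump test gives the $m$ p-values on which the BH procedure operates.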

Simulation results
We first focus on the case when the FDR control level $c = 0.05$ and the number of null hypotheses $m = 1000$. Figures 1, 2, 3, 4 and 5 show plots of average values of the relevant quantities from 1000 simulation runs. Figure 1 shows the performance of the three different test statistics when the FDR is controlled with the BH procedure. In the top left panel, we show the realized FDR. The solid horizontal line is at the level $c = 0.05$. It can be seen that the realized FDR of $Z_{-1.5,i}$ is mostly around or under the required level, while $Z_{\log,i}$ has the largest realized FDR for all values of $\lambda$. Overall, as $\lambda$ increases, no matter which test statistic we use, the desired FDR level can be achieved.
Let $\hat{S}$ denote the realized number of correct rejections. We use $\hat{S}/m_1$ to measure the ability of the test statistics to correctly reject the false hypotheses. As shown in the top right panel of Figure 1, the three test statistics have small differences in $\hat{S}/m_1$. It can also be seen that $\hat{S}/m_1$ increases only slightly as $\lambda$ increases. In the bottom left panel of Figure 1, we can see that the significance level $\hat{i}^* c/m$ obtained from the BH procedure increases as $\lambda$ increases. As $\lambda$ goes up, the number of false hypotheses $m_1$ tends to increase, and it becomes less likely that the test statistic will signal a true null as a false one. Consequently, we do not need a more stringent $\hat{i}^* c/m$ to prevent false rejections, and more rejections can be obtained.
The average number of rejections $\hat{S}$ made by the BH procedure is consistently less than the average value of $m_1$, as shown in the bottom right panel of Figure 1. This might be because $c = 0.05$ is too restrictive to obtain more rejections. A remedy is to use a more liberal level ($c = 0.1$ or $0.15$), at the cost of tolerating more false rejections. One thing worth noting here is that the average values of $m_1/m$ would in general be less than their corresponding $\lambda$, since there may be more than one jump on a day, and this becomes even more obvious when $\lambda$ becomes large.
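The gap between $\lambda$ and the expected share of jump days can be made explicit with a short calculation. Assuming the number of jumps per day is Poisson with mean $\lambda$ (consistent with a compound Poisson process with constant intensity), the probability that a day contains at least one jump is $1 - e^{-\lambda}$, which is strictly below $\lambda$ for $\lambda > 0$. The function name below is illustrative.

```python
import math


def jump_day_probability(lam):
    """P(at least one jump in a day) when the daily jump count is Poisson(lam)."""
    return 1.0 - math.exp(-lam)


# Expected proportion of jump days for the intensities used in the simulation.
for lam in (0.0, 0.02, 0.05, 0.1, 0.15, 0.2):
    print(f"lambda = {lam:<4}  P(jump day) = {jump_day_probability(lam):.4f}")
```

The shortfall relative to $\lambda$ is roughly $\lambda^2/2$ for small $\lambda$, so it grows as the intensity increases.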
We then compare the performance of the BH procedure with the conventional procedure of controlling the type I error in each hypothesis: $H_0^i$ is rejected if its realized p-value is no greater than $\alpha$. The values of $\alpha$ we specify are two frequently used levels: 0.01 and 0.05. The relevant results are shown in Figure 2. As can be seen in the first row, when different test statistics are used, the conventional procedure results in a high realized FDR, especially when the jump intensity $\lambda$ is small (the number of false null hypotheses tends to be relatively low in this situation). An extreme case is that when there is no jump ($\lambda = 0$), rejecting $H_0^i$ when $\hat{p}_i \leq 0.01$ (or 0.05) results in 100% false rejections. That is, the probability that we make at least one false rejection (the familywise error rate, FWER) is one if we follow the conventional procedure. The reason is that when all the nulls are true and the test statistics for each hypothesis are almost serially independent, if we reject $H_0^i$ when $\hat{p}_{i,M} \leq \alpha$, on average we would reject $m\alpha$ hypotheses, and all of these rejections are wrong. However, the BH procedure performs far better in this situation. Even in the worst case, on average the probability of making such an error is only about 0.276.
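The arithmetic behind this argument can be checked directly. Under the idealized assumption that all $m$ nulls are true and the p-values are independent and exactly uniform, a fixed cutoff $\alpha$ yields $m\alpha$ false rejections on average, and the probability of at least one false rejection is $1 - (1 - \alpha)^m$. The helper names are illustrative.

```python
def expected_false_rejections(m, alpha):
    """Mean number of rejections when all m nulls are true (uniform p-values)."""
    return m * alpha


def fwer_independent(m, alpha):
    """P(at least one false rejection) for m independent true nulls."""
    return 1.0 - (1.0 - alpha) ** m


# With m = 1000 daily tests, both conventional levels almost surely err.
for alpha in (0.01, 0.05):
    print(alpha, expected_false_rejections(1000, alpha), fwer_independent(1000, alpha))
```

With $m = 1000$, about 10 and 50 days would be flagged at $\alpha = 0.01$ and $0.05$ respectively, and the FWER is essentially one in both cases.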
Since the specified $\alpha$'s are on average greater than $\hat{i}^* c/m$, it is expected that more rejections can be obtained under the conventional procedure than under the BH procedure. This can be seen in the second row of Figure 2. $\hat{S}/m_1$ of the conventional procedure tends to be higher than that of the BH procedure, but as $\lambda$ goes up, the gap becomes small. Figure 3 shows the performance of the method when lower frequency (5-min, 10-min and 15-min) data is used. $Z_{-1.5,i}$ still has the best ability to satisfy the required FDR levels, but it suffers the greatest loss of $\hat{S}/m_1$ when the data frequency goes lower. $Z_{\log,i}$ does not perform better than in the case when 1-min data is used, either in satisfying the required FDR level or in $\hat{S}/m_1$. The performance of $Z_{\mathrm{ratio},i}$ is still in the middle, but overall it is more stable than that of the other two competitors.
We then have a look at how the method performs when the number of hypotheses changes. We vary $m$ at several different levels, ranging from 50 to 2000, and keep $c = 0.05$. The results are shown in Figure 4. It can be seen that when $\lambda \neq 0$ and $m$ is large (no less than 100), the realized FDR and $\hat{S}/m_1$ are stable over different $m$. How does the method perform when the FDR is controlled at different required levels? Figure 5 shows different required levels $c$ and the realized FDR. The thick line is a 45-degree line, and the vertical dotted line is for $c = 1/2$. Ideally the realized FDR should be on or below the 45-degree line. For $\lambda = 0.05$ and $0.15$, the method performs well, especially when $c$ becomes large. However, when $\lambda = 0$, there is a significant difference between the three test statistics, and the required FDR level becomes difficult to achieve in this situation.
The above results suggest that the performance of the hybrid method is positively related to the sampling frequency $M$ and the intensity parameter $\lambda$. Although the BH procedure results in quite stringent rejection criteria, it can still keep $\hat{S}/m_1$ at a satisfactory level. Fixing the rejection region at $\alpha = 0.01$ and $0.05$ can indeed yield a better $\hat{S}/m_1$, but it can suffer far more false rejections when the number of true nulls is large. In sum, the simulation shows that by combining the BNS test with the BH procedure, the FDR can be well controlled and the test statistics can also keep a substantial ability to correctly identify jump components. Finally, we also conduct a simulation study with the stochastic volatility plus jump model (SV1FJ) used in [12]. The results can be found in the supplementary materials (Figures S1, S2, S3, S4 and S5) and they are qualitatively similar to those of the SVJ case shown here.

Real data applications
In the following we present some empirical results with real data. The raw data used for the empirical applications are one minute recorded prices of the S&P500 (SPC500) cash index and the Dow Jones Industrial Average (DJIA) index. The sample period spans from Jan-02-2003 to Dec-31-2007. In order to reduce estimation errors caused by market microstructure noise, we use five minute log returns to estimate $RV_i$, $BV_i$ and $JV_i$ and the jump test statistics. Figures S6 and S7 in the supplementary materials show volatility signature plots for detecting microstructure noise and time series plots of the price variations. A detailed description of the data and a discussion of the microstructure issue can be found in the supplementary materials. Table 2 shows summary statistics of the price variations, the different types of $\hat{T}_{i,M}$, their corresponding $\hat{p}_{i,M}$ and the mutual correlations of these quantities for the two indices. Results of the Ljung-Box test (denoted by LB.10) indicate that the price variations are highly serially correlated. However, for $\hat{T}_{i,M}$ and $\hat{p}_{i,M}$, the Ljung-Box test instead indicates that they exhibit almost no serial correlation, which suggests that the BH procedure may efficiently control the FDR in this case.
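For readers who want to reproduce this kind of serial correlation check, the following Python sketch computes the Ljung-Box $Q$ statistic by hand. Taking "LB.10" to mean 10 lags is an assumption. Under the null of no autocorrelation, $Q$ with 10 lags is approximately chi-square distributed with 10 degrees of freedom, whose 95% critical value is about 18.307.

```python
def ljung_box_q(x, lags=10):
    """Ljung-Box Q = n (n + 2) * sum_{k=1}^{lags} rho_k^2 / (n - k)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    denom = sum(v * v for v in d)
    q = 0.0
    for k in range(1, lags + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(d[t] * d[t - k] for t in range(k, n)) / denom
        q += rho_k * rho_k / (n - k)
    return n * (n + 2) * q


CHI2_95_DF10 = 18.307  # 95% critical value of chi-square with 10 d.o.f.

# Usage: a trending series is strongly autocorrelated, so Q is very large.
q_trend = ljung_box_q([float(t) for t in range(200)])
```

Applying the same function to a series of daily test statistics or p-values and comparing the result with `CHI2_95_DF10` gives a simple check of the near-independence that favours the BH procedure.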
The daily test statistics of the two indices have high mutual correlations. This property is quite different from that of the daily test statistics of individual stocks and the market index. As shown in [2], the jump test statistics of individual stocks and the market index have almost no mutual correlation, even though their returns are highly correlated. Such low correlation is due to a large amount of idiosyncratic noise in the individual stock returns, which causes a low signal-to-noise ratio in the nonparametric jump test statistics. The high mutual correlation between the jump test statistics of the two benchmark indices suggests that the idiosyncratic noise in their returns is not significant and that we may obtain more reliable results when we perform the jump test at the market level.

Common jump days
To measure the daily price variation induced by jumps, we use the sum of squared intradaily jumps, which we estimate with the estimator $JV_{i,c}$ in (15), where $JV_i = RV_i - BV_i$. Table 3 shows summary statistics of $JV_{i,c}$ when the FDR is controlled at level $c = 0.01$ and $0.05$. The mean and standard deviation of (15) reported here are conditional. The conditional mean is around 0.14 to 0.22 for SPC500 and 0.13 to 0.16 for DJIA. For SPC500 and DJIA, the significance levels $\hat{i}^* c/m$ for the three statistics are all below 0.006 when the FDR control level $c = 0.05$. Depending on the test statistic, the proportion of identified jump days among all days is around 1.5% to 11.6% for SPC500 and around 2.4% to 8.6% for DJIA.
Common components in two highly correlated asset prices are often one of the most widely studied issues in empirical finance.
Here we document some relevant empirical findings. Figure 6 shows the time series plots of the identified $JV_{i,c}$ on the common jump days, and Table 4 shows their summary statistics. The term common jump days used here only means that the two indices both have jumps on these days. It does not necessarily mean that the two indices jump at exactly the same time within these days. Since the daily BNS test statistic is obtained from quantities integrated over one day, it cannot tell us how many jumps occur within that day or at what exact times. Nevertheless, such a test at least lets us know on which common days the indices jump, and this information is still valuable for further research.
It can be seen that the results from the two methods are very similar. When the FDR control level $c = 0.05$, the proportion of common jump days among all jump days is around 41% for SPC500. This proportion varies from 31% to 51% for DJIA when different test statistics are used. Comparing the magnitudes of the variations in Table 4 with those in Table 3, the two indices tend to have larger jumps on the common days. The result seems to imply that a common shock, such as an announcement of macroeconomic news, may induce a larger jump than idiosyncratic shocks such as announcements of news about individual stocks.

Jump intensity estimation
The jump intensity of an asset price process is a crucial parameter for evaluating the risk of the asset. As shown in [26] and [27], the jump intensity seems to change over time, which implies that the clustering of jump variations is time varying. The time varying jump intensity also demonstrates very different dynamic behavior across different assets. In the previous literature, the time varying jump intensity is estimated via a moving average of the number of identified jump days, but the threshold for identifying these jump days is a fixed type I error. Here, rather than controlling a fixed type I error over the whole sampling period, we try to incorporate the FDR control into the rolling window estimation.
The simple moving average (rolling window) intensity estimator for the $k$th day, denoted by $\hat{\lambda}_k^{mov}$, is the proportion of identified jump days within a rolling window, where $h$ is the threshold used to identify jump days and $K$ is the length of the rolling window. The estimator can serve as a local approximation to the true intensity of the jump process if we assume that jumps occur at most once per day. In the following analysis, we set $K = 120$, and $h$ is chosen in two different ways: the first is the FDR criterion using all $m = 1247$ hypotheses, and the second is the FDR criterion using the $K$ hypotheses within that window, with the required FDR level $c = 0.15$.
While the first method always has $h$ fixed, the latter method leads to an adaptive FDR criterion which may change over time, since including a new $\hat{p}_{i,M}$ may produce a different FDR criterion. Time series plots of the estimates with the three different jump test statistics are shown in Figure 7. The left panel shows plots for the SPC500 and the right panel shows plots for the DJIA. It can be seen that with $Z_{-1.5,i}$, $\hat{\lambda}_k^{mov}$ tends to be consistently lower than with the other two test statistics. When $h$ is chosen adaptively over the whole sampling period, $\hat{\lambda}_k^{mov}$ is more volatile, and it tends to be higher (lower) when more (fewer) jump days are identified. This phenomenon holds no matter which test statistic is used. On the other hand, with $h$ fixed, $\hat{\lambda}_k^{mov}$ is less sensitive to such large price movements. Finally, one should note that adaptively choosing $h$ is only meaningful if the control procedure can lead to a different choice of $h$ as new information is appended, which is possible for the BH procedure but can never be achieved via the conventional type I error control.
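The contrast between the fixed and the adaptive threshold can be sketched in Python as follows. The sketch rests on assumptions not confirmed by the text: the window is taken to be the $K$ days ending at day $k$, a day counts as a jump day when its p-value is no greater than $h$, and $h$ is the BH cutoff $i^* c/m$ computed either once from all p-values (fixed) or afresh from the $K$ p-values in each window (adaptive). The function names are illustrative.

```python
def bh_threshold(p_values, c):
    """BH p-value cutoff i* c / m (0.0 when nothing is rejected)."""
    m = len(p_values)
    i_star = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank * c / m:
            i_star = rank
    return i_star * c / m


def rolling_intensity(p_values, K, c, adaptive):
    """Share of identified jump days in each trailing window of length K.

    adaptive=False: one cutoff from all p-values (fixed h).
    adaptive=True:  a new cutoff from the K p-values of each window.
    """
    fixed_h = bh_threshold(p_values, c)
    estimates = []
    for k in range(K - 1, len(p_values)):
        window = p_values[k - K + 1:k + 1]
        h = bh_threshold(window, c) if adaptive else fixed_h
        estimates.append(sum(p <= h for p in window) / K)
    return estimates


# Usage on a toy series with two consecutive "jump days".
toy_p = [0.5] * 10 + [0.0001] * 2 + [0.5] * 8
fixed = rolling_intensity(toy_p, K=5, c=0.15, adaptive=False)
adapt = rolling_intensity(toy_p, K=5, c=0.15, adaptive=True)
```

In the adaptive variant every new p-value entering the window can move the cutoff, which is the source of the extra variability described above.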

Conclusion
In this paper, we have tested whether a stochastic process has jump components by the BNS nonparametric statistics, and controlled the FDR of the multiple testing with the BH procedure.
Theoretical and simulation results are presented to support validity of the hybrid method. Under appropriate conditions, the FDR can be asymptotically controlled by the BH procedure if the p-values are obtained via the asymptotical distributions. The simulation results show that the transformed BNS test statistics can perform well in satisfying the required FDR level with the BH procedure. Their ability to correctly reject false hypotheses is also improved as the frequency of jumps increases. By controlling the FDR, we can have a large chance to avoid any wrong rejection when the stochastic process does not have any jump components. Overall, our simulation results suggest that performance of the method is positively related to the jump intensity and sampling frequency, and is stable over different numbers of hypotheses and the required FDR levels.
As for the empirical results, we find that the daily nonparametric test statistics and their corresponding p-values have almost no serial correlation, either for the SPC500 or the DJIA. But the test statistics of the two indices are highly mutually dependent. The two indices tend to have larger jumps on the common jump days. We also demonstrate different properties of the jump intensity estimates from the fixed and adaptive threshold methods. The jump intensity estimated with the adaptive threshold method is more sensitive to large price movements.

Figure S1 Realized FDR, $\hat{S}/m_1$, significance level obtained from the BH procedure and number of rejections. In the graphs, each point is an average value from 1000 simulations. (TIF)

Figure S2 Realized FDR and $\hat{S}/m_1$ of the hybrid method and the conventional procedure. In the graphs, each point is an average value from 1000 simulations. (TIF)

Figure S3 Realized FDR and $\hat{S}/m_1$ of the hybrid method with lower frequency data. In the graphs, each point is an average value from 1000 simulations.