Nonparametric testing of lack of dependence in functional linear models

Wenjuan Hu; Nan Lin; Baoxue Zhang

doi:10.1371/journal.pone.0234094

Abstract

An important inferential task in functional linear models is to test the dependence between the response and the functional predictor. The traditional testing theory was constructed based on functional principal component analysis, which requires estimating the covariance operator of the functional predictor. Due to the intrinsic high dimensionality of functional data, the sample is often not large enough to allow accurate estimation of the covariance operator, which causes the follow-up test to be underpowered. To avoid the expensive estimation of the covariance operator, we propose a nonparametric method called Functional Linear models with U-statistics TEsting (FLUTE) to test the dependence assumption. We show that the FLUTE test is more powerful than the current benchmark method (Kokoszka P, 2008; Patilea V, 2016) in the small or moderate sample case. We further prove the asymptotic normality of our test statistic under both the null hypothesis and a local alternative hypothesis. The merit of our method is demonstrated by both simulation studies and real examples.

Citation: Hu W, Lin N, Zhang B (2020) Nonparametric testing of lack of dependence in functional linear models. PLoS ONE 15(6): e0234094. https://doi.org/10.1371/journal.pone.0234094

Editor: Xiaofeng Wang, Cleveland Clinic Lerner Research Institute, UNITED STATES

Received: December 31, 2019; Accepted: May 18, 2020; Published: June 26, 2020

Copyright: © 2020 Hu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data underlying this study are from the ‘CanadianWeather’ dataset, which is publicly available. The authors used the ‘CanadianWeather’ dataset by loading the R package ‘fda’ directly in the R program. The link of the R package ‘fda’ is available here: https://cran.r-project.org/web/packages/fda/ and the description of this dataset is here: https://www.rdocumentation.org/packages/fda/versions/5.1.4/topics/CanadianWeather. Another link to the dataset is here: http://www.psych.mcgill.ca/misc/fda/ex-weather-a1.html. Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis. The authors did not have special access privileges.

Funding: Research was supported by the National Natural Science Foundation of China (Grant No.11671268). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Functional regression studies how a response variable Y varies with a functional predictor X(s), where Y can be scalar (Y ∈ ℝ) or functional (Y(t) ∈ L²([0, 1])). The space L²([0, 1]) denotes the Hilbert space of square integrable functions. Without loss of generality, we define the indices s and t on the closed interval [0, 1]. If the raw support of s and t is a closed interval [a, b], one can simply rescale it to the interval [0, 1]. In this paper, we assume that the data follow the widely used functional linear model (FLM) [1–4]. For a functional response, the FLM is defined as

$$Y(t) = \alpha(t) + \int_0^1 \beta(t, s)\,X(s)\,ds + \epsilon(t), \tag{1}$$

where both the intercept α(t) and the random error ϵ(t) are square integrable and independent of X(s), and the regression coefficient β(t, s) is in L²[0, 1] × L²[0, 1]. Denote ℬX(t) = ∫₀¹ β(t, s)X(s)ds, where ℬ is the regression operator on L²[0, 1]. For a scalar response variable Y, the FLM has the simpler form Y = α + ∫₀¹ β(s)X(s)ds + ϵ, where both the intercept α and the random error ϵ are real valued, and the regression parameter β(s) ∈ L²[0, 1]. Hereafter, we mainly focus on FLMs with functional responses, but the general methodology also applies to scalar responses.
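As a rough numerical illustration of the functional-response model, the sketch below evaluates the noise-free part α(t) + ∫β(t, s)X(s)ds on a grid with the trapezoidal rule and shows the rescaling of a support [a, b] to [0, 1]. The function names and the particular β, X and α are hypothetical choices made only so that the integral has a known value; they are not taken from the paper.

```python
def trapezoid(values, grid):
    """Trapezoidal approximation of an integral from sampled values."""
    total = 0.0
    for j in range(1, len(grid)):
        total += 0.5 * (values[j] + values[j - 1]) * (grid[j] - grid[j - 1])
    return total


def rescale(u, a, b):
    """Map a point u in [a, b] to the unit interval [0, 1]."""
    return (u - a) / (b - a)


def flm_mean(beta, x, alpha, grid):
    """Evaluate alpha(t) + integral of beta(t, s) x(s) ds at each t in grid."""
    return [
        alpha(t) + trapezoid([beta(t, s) * x(s) for s in grid], grid)
        for t in grid
    ]


# Hypothetical inputs: with beta(t, s) = t * s and x(s) = s,
# the noise-free response is t / 3.
grid = [j / 200 for j in range(201)]
mean_curve = flm_mean(lambda t, s: t * s, lambda s: s, lambda t: 0.0, grid)
```

Adding a simulated error curve to `mean_curve` would give one synthetic response trajectory on the grid.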

In this paper, we consider testing whether the regression operator ℬ has an assigned structure ℬ₀, that is, to test

$$H_0: \mathcal{B} = \mathcal{B}_0 \quad \text{versus} \quad H_1: \mathcal{B} \neq \mathcal{B}_0. \tag{2}$$

In practice, people often focus on the special case with ℬ₀ = 0, i.e., testing the dependence between the response variable and the predictor. Existing tests in the literature for this problem can be categorized into parametric and nonparametric tests. In parametric tests, the test statistics are usually established by first estimating the functional regression coefficient through dimension reduction, such as functional PCA [5, 6–8]. Methods for real-valued responses include [6], [7] and [8]. [6] used a test statistic based on the L₂ norm of the empirical cross-covariance operator of (X, Y). [8] proposed a Wald-type test with varying thresholds in selecting the number of principal components. [7] developed four test statistics based on the functional principal component (FPC) scores. They assume normality of the error distribution because the likelihood function is needed. For functional responses, the test statistic proposed by [5] is constructed from the eigenvalues and eigenfunctions obtained by functional PCA of the response variable Y(t) and the predictor X(s). Such parametric methods require the costly estimation of the covariance operator of the predictor. Due to the intrinsic high dimensionality of functional data, the inaccuracy and numerical instability in the covariance operator estimation may render the parametric tests invalid, especially for small or moderate samples. The same issue also occurs in high-dimensional problems in multivariate statistics [9, 10]. Another limitation of FPC is that the principal component scores are computed from the predictor alone, without reference to the response. The directions that best explain X(s) may therefore not be the best predictors of the response, which may lead to disparate test results for regression problems.

On the other hand, nonparametric tests use a different idea to avoid estimating the covariance operator [11, 12, 15]. For real-valued responses, [11] used the Nadaraya-Watson technique [13, 14] to estimate the conditional mean of Y given X = x. [15] also proposed a nonparametric test based on a kernel function for real responses. However, this method still requires estimating the covariance operator to calculate the semimetric. Furthermore, this test needs to split the sample into three groups: one is used to estimate the kernel function, another to estimate the sample mean of the response variables, and the last contributes to the statistic, so the test is suited to large samples. The test proposed by [12] is for functional responses, and its test statistic is a weighted U-statistic with weights obtained from nearest neighbor smoothing. While this test possesses the correct Type-I error rate, identification of the neighbors requires defining distances between the functional predictors in the least favorable direction, which tends to result in lower power in general.

Motivated by [10], we propose a novel nonparametric test, called FLUTE, based on a U-statistic that measures the L₂ distance in the induced space after transforming the original space of the functional predictor by the covariance operator. Our approach avoids explicit estimation of the covariance operator as it is based on the distance in the induced space. The FLUTE test can be applied to both real-valued and functional responses.

The paper is organized as follows. In Section Methodology, we introduce the basic notation for the functional linear regression model and the FLUTE test statistic. After presenting the theory in Section Asymptotic theory for functional responses, we further discuss the FLUTE test for FLMs with a scalar response in the next section. Section Simulation and real data reports results from simulation studies and a real data application. The last section concludes.

Methodology

Notation and assumptions

Let 〈⋅, ⋅〉 denote the inner product in L²[0, 1], that is, for any f₁, f₂ ∈ L²[0, 1], 〈f₁, f₂〉 = ∫₀¹ f₁(s)f₂(s)ds. The L₂ norm ‖⋅‖ is defined by ‖f‖² = 〈f, f〉. We assume in the FLM (1) that both the predictor X(s) and response variable Y(t) are random elements of L²[0, 1] and integrable. The sample functions X_i(s), i = 1, …, n, are independently and identically distributed (i.i.d.) with E[X(s)] = μ_x(s) and E‖X(s)‖⁴ < ∞. We also assume that the random trajectories ϵ_i(t) are i.i.d. with E[ϵ(t)] = 0, E[ϵ²(t)] = σ²(t) and E‖ϵ(t)‖⁴ < ∞.

Suppose {ϕ_k, k ≥ 1} and {η_ℓ, ℓ ≥ 1} are orthonormal bases of the Hilbert spaces containing the predictor and the response, respectively. To simplify the presentation, hereafter we focus on the case where both Hilbert spaces are L²[0, 1]. We then represent the predictor X_i(s) and the regression coefficient β(t, s) via the Karhunen-Loève decomposition [16]. For any s ∈ [0, 1], we have

$$X_i(s) = \mu_x(s) + \sum_{k=1}^{\infty} \xi_{ik}\,\phi_k(s), \tag{3}$$

where μ_x(s) is the mean function of the predictor X_i(s), and the expansion coefficient ξ_ik = 〈X_i − μ_x, ϕ_k〉, with E[ξ_ik] = 0. For any t, s ∈ [0, 1], the regression coefficient β(t, s) is represented as

$$\beta(t, s) = \sum_{\ell=1}^{\infty}\sum_{k=1}^{\infty} \beta_{\ell k}\,\eta_\ell(t)\,\phi_k(s), \tag{4}$$

where β_ℓk = ∬β(t, s)η_ℓ(t)ϕ_k(s)dtds.
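To make the expansion coefficients concrete, the following sketch computes ξ_k = 〈X, ϕ_k〉 by numerical integration for a mean-zero curve. It assumes an orthonormal Fourier basis on [0, 1] ordered as constant, sine, cosine; the basis choice, the ordering and all function names are illustrative assumptions rather than part of the method.

```python
import math


def fourier_basis(k, s):
    """k-th orthonormal Fourier basis function on [0, 1], k = 1, 2, ..."""
    if k == 1:
        return 1.0
    freq = k // 2
    if k % 2 == 0:
        return math.sqrt(2.0) * math.sin(2.0 * math.pi * freq * s)
    return math.sqrt(2.0) * math.cos(2.0 * math.pi * freq * s)


def inner(f, g, m=2000):
    """Trapezoidal approximation of the L2[0, 1] inner product <f, g>."""
    h = 1.0 / m
    vals = [f(j * h) * g(j * h) for j in range(m + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


def scores(x, K):
    """First K expansion coefficients of a mean-zero curve x."""
    return [inner(x, lambda s, k=k: fourier_basis(k, s)) for k in range(1, K + 1)]


# Hypothetical curve built from two basis functions, so the scores are known.
x = lambda s: 2.0 * fourier_basis(2, s) - 0.5 * fourier_basis(5, s)
xi = scores(x, 5)
```

Here `xi` recovers the coefficients 2 and −0.5 in positions 2 and 5 (up to floating-point error) and values near zero elsewhere, which is the orthonormality property the expansion relies on.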

Next we introduce the covariance operator 𝒞 of the predictor X(s) and its empirical counterpart 𝒞̂_n. For any element f ∈ L²([0, 1]), we define 𝒞f = E[〈X − μ_x, f〉(X − μ_x)] and 𝒞̂_n f = n⁻¹∑_{i=1}^{n}〈X_i − X̄, f〉(X_i − X̄), where X̄ = n⁻¹∑_{i=1}^{n} X_i. Denote the corresponding eigenelements by (λ_k, ν_k), k ≥ 1, with the eigenvalues λ₁ ≥ λ₂ ≥ … and ν_k the eigenfunction corresponding to λ_k.

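If both the predictor X(s) and the response variable Y(t) are fully observed, and we assume hereafter that α(t) = 0 and μ_x(s) = 0 (which will be justified in the next section), then the FLM with a functional response (1) can be represented as

$$Y_i(t) = \sum_{\ell=1}^{\infty}\sum_{k=1}^{\infty} \beta_{\ell k}\,\xi_{ik}\,\eta_\ell(t) + \epsilon_i(t). \tag{5}$$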
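The basis representation of the FLM follows from the orthonormality of {ϕ_k}. A short sketch of the step, assuming α(t) = 0, μ_x(s) = 0 and that the sums and the integral may be interchanged:

```latex
\begin{aligned}
\int_0^1 \beta(t,s)\,X_i(s)\,ds
 &= \int_0^1 \Big(\sum_{\ell}\sum_{k} \beta_{\ell k}\,\eta_\ell(t)\,\phi_k(s)\Big)
              \Big(\sum_{k'} \xi_{ik'}\,\phi_{k'}(s)\Big)\,ds \\
 &= \sum_{\ell}\sum_{k}\sum_{k'} \beta_{\ell k}\,\xi_{ik'}\,\eta_\ell(t)\,
    \langle \phi_k, \phi_{k'} \rangle \\
 &= \sum_{\ell}\sum_{k} \beta_{\ell k}\,\xi_{ik}\,\eta_\ell(t),
\end{aligned}
```

since 〈ϕ_k, ϕ_k′〉 equals 1 when k = k′ and 0 otherwise. Adding the error ϵ_i(t) gives the response Y_i(t) in terms of the loadings ξ_ik and the coefficients β_ℓk.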

In practice, the infinite expansions (3), (4) and (5) above are usually approximated by truncated basis expansions (e.g., with a B-spline or Fourier basis) [4, 11, 16, 17]. If the functional variables are densely observed, then recovering each trajectory of the functional variables by the least squares method is straightforward [18, 19]. If the functional variables are sparsely observed, [20] and [21] proposed to estimate the FPC scores through a local linear surface smoother for the covariance operator, and then approximate each trajectory using the first K eigenfunctions. For sparse observations, the approximation error cannot be ignored. Because establishing the asymptotic normality of the statistic is considerably more complex in that setting, we leave it for future investigation.

We will represent the FLM with a functional response (1) using basis expansion when the approximation error is controlled. Denote e_i(t) as the approximation error produced through and approximating , that is, (6)

Similar with Condition 1 in Appendix B in [22], then by the Cauchy-Schwarz inequality, uniformly across all i = 1, …, n, we have (Please see lemma 1), where C and are two positive constants.

As K, L → ∞, the approximation error vanishes and can be ignored. Hence the FLM with a functional response (1) can be rewritten as

$$Y_i(t) = \sum_{\ell=1}^{L}\sum_{k=1}^{K} \beta_{\ell k}\,\xi_{ik}\,\eta_\ell(t) + \epsilon_i(t). \tag{7}$$

In this paper, we assume that both the predictor X(s) and response variable Y(t) are fully observed or the approximation error is controlled.

The FLUTE test

In this section, we introduce the FLUTE test whose test statistic is a U-statistic. The theory of U-statistics for fixed-dimensional data, pioneered by [23], has been well documented; see [24] and [25] for summaries. Recently, [10] developed the theory for high-dimensional multivariate data.

Under the functional linear model (1), if α(t) = 0 and E[X(s)] = 0, we can see that , which is then the perturbed L² norm for measuring the distance between and . Further, it is easy to see that (8) where c(s, e) = E[X(s)X(e)]. Note that is equivalent to the first term on the right-hand side of Eq (8) being zero. Thus we may consider testing the hypothesis (2) by a U-statistic with as the kernel, whose expectation is , where .

For the general case where α(t) ≠ 0 and E[X(s)] ≠ 0, we consider the U-statistic T_n, (9) where , and , with , and denotes combinations over all subscripts (i₁, …, i₄). As the statistic T_n is invariant to location shifts in both X_i(s) and Y_i(t), without loss of generality, we assume that α(t) = 0 and μ_x(s) = 0 in the rest of the paper. Define θ(F) = E[ψ(i₁, i₂, i₃, i₄)], then E[T_n] = θ(F).

As the statistic T_n measures the distance between the regression operator and the assumed structure under the null hypothesis, large values of T_n favor the alternative hypothesis and lead to rejection of the null hypothesis.
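The following sketch illustrates how a fourth-order U-statistic of this kind can be evaluated by brute force for a small sample when testing the no-dependence null. The kernel used here, ¼〈X_i₁ − X_i₂, X_i₃ − X_i₄〉〈Y_i₁ − Y_i₂, Y_i₃ − Y_i₄〉, is an assumption patterned on difference-based constructions and is not claimed to be the exact kernel of Eq (9). Curves are represented by their coefficient vectors in orthonormal bases, so L² inner products reduce to dot products. All names and data are hypothetical.

```python
from itertools import permutations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def diff(u, v):
    return [a - b for a, b in zip(u, v)]


def kernel(x, y, i1, i2, i3, i4):
    """Assumed kernel: 1/4 <X_i1 - X_i2, X_i3 - X_i4> <Y_i1 - Y_i2, Y_i3 - Y_i4>."""
    x_part = dot(diff(x[i1], x[i2]), diff(x[i3], x[i4]))
    y_part = dot(diff(y[i1], y[i2]), diff(y[i3], y[i4]))
    return 0.25 * x_part * y_part


def u_statistic(x, y):
    """Average of the kernel over all ordered 4-tuples of distinct indices."""
    total = 0.0
    count = 0
    for idx in permutations(range(len(x)), 4):
        total += kernel(x, y, *idx)
        count += 1
    return total / count


# Hypothetical coefficient vectors for n = 6 curves with K = L = 2.
x = [[0.2, -1.0], [1.5, 0.3], [-0.7, 0.8], [0.1, 0.1], [-1.1, -0.4], [0.9, 1.2]]
y = [[0.5, 0.0], [1.0, -0.2], [-0.3, 0.6], [0.0, 0.4], [-0.8, -0.1], [0.7, 0.9]]
t_n = u_statistic(x, y)
```

Because the kernel depends on the data only through pairwise differences, the value is unchanged when a constant is added to every X_i or every Y_i, which is the location invariance noted in the text. Under this assumed kernel the expectation is the squared Hilbert-Schmidt norm of the cross-covariance of X and Y. The brute-force loop costs on the order of n⁴ kernel evaluations, so it is practical only for small n.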

For the representation of the predictor X(s), we have

$$X(s) = \sum_{k=1}^{K} \xi_k\,\phi_k(s) = \Phi(s)'\xi, \tag{10}$$

where Φ(s) = (ϕ₁(s), …, ϕ_K(s))′, ξ = (ξ₁, …, ξ_K)′, and var[ξ] = Σ. We next follow the general condition in [9] and assume that the loadings ξ of the predictor X(s) have a factor design structure.

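Assumption 1 There exists an m-variate random vector N = (N₁, …, N_m)′ for some m < ∞ so that ξ = ΓN. Here Γ is a K × m matrix such that ΓΓ′ = Σ, and E[N] = 0, var[N] = I_m, where I_m is the m × m identity matrix. Each random variable N_ℓ, ℓ = 1, …, m, is assumed to have finite 8th moment and E[N_ℓ⁴] = 3 + ρ for some constant ρ ∈ [0, 1). Further, for any positive integers $\ell_1, \ldots, \ell_d$ with $\sum_{\nu=1}^{d} \ell_\nu \le 8$ and 1 ≤ m₁ < m₂ < … < m_d ≤ m, we assume $E[N_{m_1}^{\ell_1} \cdots N_{m_d}^{\ell_d}] = E[N_{m_1}^{\ell_1}] \cdots E[N_{m_d}^{\ell_d}]$.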

Assumption 1 allows factors N to have a weak correlation. If the predictor X(s) follows a Gaussian process, [16] pointed out that X(s) admits the following expansion with independent standard normal random variables N_k’s. It is easy to see that this is a special case of the factor design structure in Assumption 1, where the (a, b)th element of the transformation matrix Γ_K×m is .

Let ε_i = (ε_i1, …, ε_iL)′ be the vector of expansion coefficients of ϵ_i(t), and let Λ = var[ε]. We make the following assumption.

Assumption 2 For i ≠ j, and .

Asymptotic theory

In this section, we derive the asymptotic unbiasedness of the FLUTE test and the asymptotic normality of its test statistic under both the null and a local alternative hypothesis through the Hoeffding decomposition.

Let W_i = (X_i(s), ϵ_i(t)), where . Thus, ψ(i₁, i₂, i₃, i₄) in Eq (9) can be represented as . And ψ_c(w₁, …, w_c) = E[ψ(w₁, …, w_c, W_{c + 1}, …, W₄)], be the projections of ψ to lower-dimensional sample spaces for c = 1, …, 4, where w₁, …, w_c are fixed variables (e.g. , , ). The specific forms have been given in the appendix of Proof of Theorem 1. Let v_c = var[ψ_c] be the variance. Let , then we have the Hoeffding decompositions for T_n is , where and with . The decomposition for the variance of T_n is . We assume that E[ψ²(W₁, …, W₄)] exists. The proofs of the Hoeffding decompositions can be found in [23] and also [24]. [10] recently showed that the decomposition also holds when the dimension of the predictor K increases to infinity. Based on Proposition 1 in [10], if we find the minimum c′ such that , c′ = 1, 2, or 3, is of the same order as v₄, then T_n will be dominated by the first c′ terms, so that (11)

Theorem 1. Under the FLM (1) and Assumption 1, with K, L → ∞ as n → ∞, we have

(i). and
(ii).

where and

Please see the Proof of Theorem 1 in Appendix.

Let Δ = (β_ℓk − β_0,ℓk)_ℓ,k, where β_0,ℓk denote the loadings of β₀(t, s). Let M_a = ΔΣ^aΔ′, a = 0, 1, 2, 3 (e.g. Σ⁰ = I_K, Σ² = ΣΣ), Q₀ = Γ′Γ, Q₁ = Γ′Δ′ΔΓ, Q₂ = Γ′ΣΔ′ΔΓ, Q₃ = Γ′ΣΓ, Q₄ = Γ′Δ′ΔΣΔ′ΔΓ. Under H₀, we have Δ = 0, and hence Q₁ = Q₂ = Q₄ = 0 and M_a = 0 for a = 0, 1, 2, 3. It follows that v₁ = 0, and T_n is then a degenerate U-statistic. In this case, we have

Next we show that the form of the variance for T_n also holds under a subclass of local alternative hypothesis specified by the following condition, (12) Under the null hypothesis, the equation v₁ = o(n⁻¹ v₂) holds with v₁ = 0. Under the local hypothesis, the equation v₁ = o(n⁻¹ v₂) still holds (see Appendix). The following theorem then states the asymptotic normality of our test statistic under this local alternative.

Theorem 2. Under the FLM ( 1 ), assuming Assumptions (1) and (2), under either the null hypothesis or the local alternatives , as n → ∞, we have

Please also see the Proof of Theorem 2 in Appendix.

For real data, the trace tr(Σ²) and tr(Λ²) need to be estimated. We use the estimator given in Chen and Qin [26], which was shown to be unbiased and ratio consistent, i.e. , under the null hypothesis or the local alternatives. Specifically, the estimator is given as (13) where , and with . Following the same idea, we can also construct a consistent estimator of tr(Λ²) under H₀.
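To give a feel for how tr(Σ²) can be estimated without forming a covariance matrix, the sketch below uses a simplified pairwise estimator: the average of (ξ_i′ξ_j)² over ordered pairs i ≠ j. It is unbiased when the loadings have mean zero, because E[(ξ_i′ξ_j)²] = tr(Σ²) for independent ξ_i and ξ_j. This is deliberately not the estimator in Eq (13), which also includes leave-out mean corrections; the function name and data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def trace_sigma_squared(xi):
    """Average of (xi_i' xi_j)^2 over ordered pairs i != j (mean-zero loadings assumed)."""
    n = len(xi)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += dot(xi[i], xi[j]) ** 2
    return total / (n * (n - 1))


# Hypothetical loadings for three curves with K = 2.
estimate = trace_sigma_squared([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The estimator only needs inner products between pairs of observations, which is why its cost does not grow with the square of the dimension K.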

Following Theorem 2, the FLUTE test rejects H₀ at significance level α if the standardized statistic of Theorem 2 exceeds z_α, where z_α is the upper α-quantile of N(0, 1).
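The decision rule can be sketched in a few lines once a standardized statistic is available. The code below takes that standardized value as an input and does not compute the normalization itself; the function names are hypothetical, and the strict inequality is an assumption.

```python
from statistics import NormalDist


def upper_quantile(alpha):
    """Upper alpha-quantile z_alpha of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - alpha)


def flute_rejects(standardized_stat, alpha=0.05):
    """One-sided rule: reject when the standardized statistic exceeds z_alpha."""
    return standardized_stat > upper_quantile(alpha)


def one_sided_p_value(standardized_stat):
    """Upper-tail probability under the standard normal approximation."""
    return 1.0 - NormalDist().cdf(standardized_stat)
```

The test is one-sided because only large values of the statistic indicate departure from the null hypothesis.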

Theorem 2 also implies that the asymptotic power of the proposed statistic under the local alternative is The quantity can be viewed as a signal to noise ratio (SNR). If r_n(β − β₀, Σ, Λ) = o(1), it is obvious that the power converges to α. If r_n(β − β₀, Σ, Λ) is in the order of O(1), the power converges to 1.

FLUTE for scalar responses

In the FLM with a scalar response,

$$Y = \alpha + \int_0^1 \beta(s)\,X(s)\,ds + \epsilon, \tag{14}$$

where Y ∈ ℝ. The null hypothesis for the scalar response is defined as H₀: β = β₀ versus H₁: β ≠ β₀.

The idea of the FLUTE method in Section Asymptotic theory directly applies and only requires slight modification toward the dimension of the response and functional regression coefficients. For example, the kernel of the FLUTE statistic is (Y_i − 〈X_i, β₀〉)〈X_i(s), X_j(s)〉(Y_j − 〈X_j, β₀〉) with expectation , where . The expansion of parameter β(t) is . The theory can be developed using the same idea as in Section Asymptotic theory. We distinguish by denoting the counterpart to notations in Section Asymptotic theory with a check mark. For example, the kernel of the FLUTE statistic for scalar response model (14) is denoted by , and its variance is . The following theorems show that the same asymptotic null distribution in Theorems 1 and 2 hold for the scalar response case.

Theorem 3. Under the FLM with scalar response ( 14 ), assuming Assumption (1), when K → ∞ as n → ∞, we have

(i). and
(ii).

We consider the local alternative hypothesis as follows. and where .

Theorem 4. In the FLM with a scalar response ( 14 ), assume Assumption (1) and E[ϵ⁴] is finite. Under either the null hypothesis or the local alternatives , as n → ∞, we have where σ² = var[ϵ].

The proofs of Theorems 3 and 4 are omitted because they can be proved in the same way as Theorems 1 and 2 except with slight modification to the notations to reflect the difference in dimensionality.

Simulation and real data

In this section, we demonstrate the performance of the FLUTE method by simulation studies and an application to a real data example. For cases with functional responses, we compare the FLUTE method with the method in [1], which is constructed based on functional PCA and which we call KMSZ, and with the nonparametric test in [12], which is constructed from a weighted U-statistic and which we call NP. The KMSZ method depends on the functional principal components and is more suitable for large samples, for which the covariance operator can be estimated well. The test statistic of the NP method depends on the so-called least-favorable direction γ and is more suitable for the low-dimensional case. Under the simulation setup, this direction γ can be determined in three different ways: 1) pre-estimate γ based on a very large simulated data set and then use it for all simulated data sets; 2) pre-estimate γ based on the data set generated at each level of |β|² and then use it for simulated data sets generated at the same level of |β|²; and 3) estimate γ based on each simulated data set. The simulation results are given in Table 1, and more details can be found in Supplementary Material A in S1 File. Results reported in this section are based on the second way, which is consistent with applications of the NP method to real data.

Table 1. Size and power for NP test with different searching methods.

https://doi.org/10.1371/journal.pone.0234094.t001

For FLMs with a scalar response, neither the KMSZ nor the NP method is applicable because the former involves functional PCA on the response and the latter requires computing the L² norm between two functional response values. The nonparametric test proposed by [15], which we call NETRF, is for scalar responses. However, the NETRF test still requires estimating the covariance operator to calculate the semimetric. Furthermore, this test needs a sufficiently large sample to provide accurate estimates within each group. Therefore, we do not directly compare Delsol’s method with our FLUTE test, but we conduct simulation studies for small/moderate sample cases to demonstrate the inadequacy of Delsol’s method under these scenarios. We choose as the comparison benchmark the F-test proposed by [7]. [7] actually proposed four asymptotically equivalent tests, which also depend on the functional principal components and can be more suitable for the large sample case. We use the F-test because it behaves the best of the four tests for small to moderate samples.

Simulation results

We next conduct a simulation study to evaluate the empirical size and power of our FLUTE test for small to moderate samples with sample size n varying between 40 and 100. In each simulation, we generate 1,000 Monte Carlo samples. Our computer codes are written in R. For basis expansion and functional PCA, we use the implementation in the R package fda.

Functional response.

First we present the case of the FLM with functional responses. The simulation design follows the FLM (1), where we set β(t, s) = |β|² exp{(t² + s²)/2}. Here |β|² indicates the L₂ norm of β(t, s) and is used to control the SNR. We generate the functional predictor X_i(s) according to Eq (10), where the bases are chosen as Fourier bases. For instance, the first five orthonormal Fourier basis functions are ϕ₁(s) = 1, ϕ₂(s) = √2 sin(2πs), ϕ₃(s) = √2 cos(2πs), ϕ₄(s) = √2 sin(4πs) and ϕ₅(s) = √2 cos(4πs). Without loss of generality, we set the mean μ_x(s) = 0. According to the factor design (Assumption 1), the loadings ξ₁, …, ξ_n are independently generated from the following moving average model,

$$\xi_{ik} = \rho_1 N_{ik} + \rho_2 N_{i(k+1)} + \cdots + \rho_T N_{i(k+T-1)}, \quad k = 1, \ldots, K, \tag{15}$$

where the constant T controls the range of dependency (see Fig 1). The coefficients ρ₁, …, ρ_T are randomly generated from U(0, 1), where U(a, b) denotes the uniform distribution on the interval (a, b). The random vectors N_i = (N_i1, …, N_i(K+T−1))′ are independently generated from the N(0, I_{K + T−1}) distribution. It then follows that the (k, ℓ)th element of the covariance matrix var(ξ) is $\sum_{j=1}^{T-|k-\ell|} \rho_j \rho_{j+|k-\ell|}\, I(|k-\ell| < T)$, which shows that the correlation between ξ_ik and ξ_iℓ is controlled by |k − ℓ| and T. The random error function ε_i(t) is generated according to the decomposition in Eq (10), with its bases set in the same way as {ϕ_k(s)}. The loading vectors (ε_i1, …, ε_iL)′ are independent and identically distributed draws from N(0, Σ_ϵ).
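A minimal sketch of the loading generation is given below. It assumes the moving average takes the form ξ_k = ρ₁N_k + … + ρ_T N_(k+T−1) with i.i.d. standard normal N's; the coefficient notation, the function names and the example coefficients are assumptions made for illustration. The second function returns the covariance matrix implied by that form, which vanishes once |k − ℓ| reaches T.

```python
import random


def draw_loadings(rho, K, rng):
    """One vector of K loadings from a moving average of i.i.d. N(0, 1) draws."""
    T = len(rho)
    z = [rng.gauss(0.0, 1.0) for _ in range(K + T - 1)]
    return [sum(rho[t] * z[k + t] for t in range(T)) for k in range(K)]


def ma_covariance(rho, K):
    """Covariance matrix of the loadings implied by the moving average form."""
    T = len(rho)
    cov = [[0.0] * K for _ in range(K)]
    for k in range(K):
        for l in range(K):
            lag = abs(k - l)
            if lag < T:
                cov[k][l] = sum(rho[j] * rho[j + lag] for j in range(T - lag))
    return cov


rng = random.Random(2020)
rho = [rng.uniform(0.0, 1.0) for _ in range(5)]      # T = 5 coefficients from U(0, 1)
sample = [draw_loadings(rho, 11, rng) for _ in range(40)]  # n = 40 curves, K = 11
sigma = ma_covariance(rho, 11)
```

Each row of `sample` is one loading vector ξ_i; multiplying it into the basis functions evaluated on a grid would give the simulated predictor curve.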

Fig 1. The autocorrelation functions for the loadings.

https://doi.org/10.1371/journal.pone.0234094.g001

To evaluate the impact of dimensionality and sample size, we carry out simulations under four different settings that vary in dimensionality, K = L = 5 (low-dimensional) and K = L = 11 (high-dimensional), and in sample size, n = 40 and 100. When generating X(s), we set T = 5 in Eq (15). The variances of the loadings ε_i1, …, ε_iL are the same; we set Σ_ϵ = I_L. Under each setting, we vary |β|² at 10 levels from 0 to 0.5 (see Tables 2 and 3). When |β|² is 0, the result provides the empirical size of all tests, and the results at the other 9 levels give the power. Each testing method is evaluated at two nominal significance levels α = 0.05 and 0.1.
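Empirical size and power are both rejection proportions over the Monte Carlo replicates. The sketch below computes such a proportion from a list of reject/accept decisions, together with its binomial Monte Carlo standard error; both function names are hypothetical.

```python
import math


def rejection_rate(decisions):
    """Proportion of Monte Carlo replicates in which the test rejected."""
    return sum(1 for d in decisions if d) / len(decisions)


def monte_carlo_se(rate, replicates):
    """Binomial standard error of an estimated rejection rate."""
    return math.sqrt(rate * (1.0 - rate) / replicates)
```

With 1,000 replicates and a true rejection rate of 0.05, the standard error is roughly 0.007, which gives a sense of how far an empirical size can drift from the nominal level by chance alone.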

Table 2. Size and power for different tests at α = 0.05.

https://doi.org/10.1371/journal.pone.0234094.t002

Table 3. Size and power for different tests at α = 0.1.

https://doi.org/10.1371/journal.pone.0234094.t003

Table 2 shows the empirical sizes and power obtained for different dimensionality and sample size under the nominal significance level α = 0.05. Under the same sample size, the power of all three tests decreases as the dimensions K and L increase. When the dimensionality is the same, the power of all three methods improves as the sample size increases. Table 2 also shows that the FLUTE method performs stably in both the low-dimensional and the high-dimensional cases. The KMSZ and NP tests are conservative, and their power decreases significantly as the dimension increases.

Further, the NP method has almost no power in the case of high dimension and small sample size (K = L = 11, n = 40). Table 2 shows that the power of the FLUTE method is consistently higher than that of the KMSZ and NP methods, especially in high-dimensional cases. The simulation results also show that the FLUTE method respects the nominal levels under high dimensionality at both sample sizes, n = 40 and 100.

Table 3 shows the results under the nominal significance level α = 0.1, and provides the same conclusion as Table 2.

To evaluate the impact of the correlation structure, we carry out simulations under two different settings, T = 5 (weakly correlated) and T = 11 (strongly correlated) (see Fig 1). Under each setting, we vary |β|² at 6 levels from 0 to 0.5 (see Table 4). Table 4 shows the empirical sizes and power obtained for the case of K = L = 11 and n = 40. The FLUTE method is stable across T, whereas both the KMSZ and the NP methods are more sensitive to the correlation structure. The power of the NP statistic decreases significantly when T increases, since this method needs to search for the least-favorable direction. The power of the KMSZ statistic, in contrast, decreases significantly when T decreases, since this method depends on functional PCA: when the correlation is weak, more functional PCs are needed to explain the same percentage of variance, so the number of retained components p also increases, which results in lower power.

Table 4. Size and power for different correlation when K = L = 11, n = 40.

https://doi.org/10.1371/journal.pone.0234094.t004

To evaluate the performance of the FLUTE method with heteroscedastic variance, we carry out simulations in which the variances of the expansion coefficients of ε_i(t) are Var(ε_iℓ) = 1/ℓ, for i = 1, …, n, ℓ = 1, …, L. We set T = 5 and n = 40, and we vary |β|² at 6 levels from 0 to 0.5. The significance levels are α = 0.05 and α = 0.1. Table 5 shows the empirical sizes and power for the cases of K = L = 5 and K = L = 11, and leads to a conclusion similar to that of Tables 2 and 3 when n = 40. The power of all three tests decreases as the dimensions K and L increase. However, the FLUTE method performs stably in both the low-dimensional and the high-dimensional cases, whereas the power of the KMSZ and NP tests decreases significantly as the dimension increases.

Table 5. Size and power for heteroscedastic variance when K = L = 11, n = 40.

https://doi.org/10.1371/journal.pone.0234094.t005

Fig 2 shows the histograms of the FLUTE statistic for different dimensionality and sample size under the null hypothesis, which match the superimposed standard normal density well. This is consistent with our results in Theorem 2.

Fig 2. The null distribution of the FLUTE statistic in FLMs with functional responses.

The solid line indicates the density of the standard normal distribution.

https://doi.org/10.1371/journal.pone.0234094.g002

Fig 3 shows the power curves of the FLUTE statistic under four different cases with varying dimensionality and sample size at the nominal significance levels α = 0.05 and α = 0.1. In all four cases, the empirical size is close to the nominal level, and when |β|² reaches 0.2, the four power curves are almost at 1.

Fig 3. Power curves of the FLUTE method.

Case 1: K = L = 5 and n = 40; Case 2: K = L = 11 and n = 40; Case 3: K = L = 5 and n = 100; Case 4: K = L = 11 and n = 100. The left figure is for α = 0.05, and the right is for α = 0.1.

https://doi.org/10.1371/journal.pone.0234094.g003

Scalar response.

This section presents the results for FLMs with scalar responses. This simulation design follows the model (14). We set the coefficient of regression parameter as . The functional predictor X(t) is generated in the same way as in Section Functional response. And the random errors ε_i are independently generated from N(0, 1).

As for FLMs with functional responses in Section Functional response, we carry out simulations under four different settings, K = 5 (low-dimensional) and K = 11 (high-dimensional), n = 40 and 100. Under each setting, we vary |β|² at 10 levels from 0 to 0.5 (see Tables 6 and 7). Each testing method is evaluated at two nominal significance levels α = 0.05 and 0.1.

Table 6. Size and power for normal residual at significance level α = 0.05.

https://doi.org/10.1371/journal.pone.0234094.t006

Table 6 shows the empirical sizes and power obtained for different dimensionality and sample size under the nominal significance level α = 0.05. The power of the two tests, FLUTE and the F-test in [7], is broadly similar in these four cases, although the results show that the FLUTE test is more powerful than the F-test. Table 7 leads to the same conclusions at the nominal significance level α = 0.1.

Table 7. Size and power for normal residual at significance level α = 0.1.

https://doi.org/10.1371/journal.pone.0234094.t007

Table 8 shows the comparison between FLUTE and Delsol’s method at significance level α = 0.1 and K = 11. NETRF1, NETRF2 and NETRF3 stand for the three bootstrap methods in Delsol’s paper. Because the three test statistics are nonparametric and constructed from a kernel function, estimating their bias and variance terms is difficult. Further, it is usually not appropriate to use the quantiles of the asymptotic law to estimate the threshold directly. Thus, a bootstrap procedure is needed. For all three methods, we choose the semimetric induced by functional principal components, and split the samples into three groups of sizes 20, 10 and 10 when n = 40, and 40, 30 and 30 when n = 100. Under each setting, the empirical significance levels are calculated from 1000 bootstrap iterations. FLUTE stands for our method. It is clear that the sizes of Delsol’s methods cannot be well controlled at the nominal level for small/moderate samples.

Table 8. Size for normal residual at significance level α = 0.1.

https://doi.org/10.1371/journal.pone.0234094.t008

Application to Canadian Weather data

The Canadian Weather data are available in the R package fda (http://www.r-project.org) under the name CanadianWeather. The data consist of the daily temperature and rainfall registered at 35 weather stations in Canada, averaged over 1960 to 1994; hence the sample size is 35. We view the daily temperature as the predictor and the rainfall as the response variable. Both the predictor and the response variable are functional. We use the FLUTE test to check the dependence between the daily temperature and the rainfall. Following [3], we choose 11 Fourier bases to fit the temperature curve and the rainfall curve for each station separately.

Let Y_i(t) represent the logarithm of the rainfall at station i at time t, and let x_i(t) be the temperature at the same station at time t of the year. The value of the FLUTE statistic is 12.17159 based on all 35 stations; hence we reject the null hypothesis. To illustrate the efficacy of the test, we repeat the test on 1000 bootstrap samples. Each bootstrap sample consists of data at 35 stations selected randomly with replacement from the 35 stations. Fig 4 shows that the density of the FLUTE statistic is far from the standard normal distribution, which again supports rejecting the null hypothesis.
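The resampling step can be sketched as follows. The code draws station indices with replacement and applies a user-supplied statistic to each resample; the `statistic` callable is a hypothetical placeholder for whatever function computes the test statistic from the selected stations, and the seed is arbitrary.

```python
import random


def bootstrap_indices(n, rng):
    """Indices of n stations drawn with replacement from 0, ..., n - 1."""
    return [rng.randrange(n) for _ in range(n)]


def bootstrap_statistics(statistic, n, n_boot, seed=0):
    """Apply `statistic` to n_boot resampled index lists of size n."""
    rng = random.Random(seed)
    return [statistic(bootstrap_indices(n, rng)) for _ in range(n_boot)]


# Placeholder statistic (number of distinct stations in the resample),
# used here only to show the call pattern.
draws = bootstrap_statistics(lambda idx: len(set(idx)), 35, 1000)
```

A histogram of the values returned for the actual test statistic would correspond to the empirical distribution described for Fig 4.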

Fig 4. Empirical distribution of the FLUTE statistic based on 1000 bootstrap samples of size 35 drawn from the Canadian Weather dataset.

https://doi.org/10.1371/journal.pone.0234094.g004

Conclusion

We proposed the FLUTE test for testing dependence between the response and the functional predictor in FLMs with either a real or a functional response. By constructing a U-statistic that measures the L₂ distance in an induced space, the FLUTE statistic avoids estimating the covariance operator of the predictor. The parametric test in [1] requires estimation of the covariance operator and demands large samples. The nonparametric test in [12], although it avoids explicitly estimating the covariance operator, requires estimating the least-favorable direction γ. In general, using the least-favorable direction leads to lower power. Meanwhile, our experience suggests that the estimation of γ can be numerically unstable across different simulated data sets, which results in poor test performance.

Our FLUTE test does not suffer from these problems. It requires minimal effort in estimating model parameters and hence achieves higher power, especially in high-dimensional cases. One potential weakness of the FLUTE test is the high computational cost of evaluating a U-statistic in large samples. However, estimating the covariance operator is less of a concern in large samples, where one can switch to parametric methods. We therefore recommend the FLUTE test primarily for problems with small to moderate sample sizes.
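To make the cost remark concrete, the sketch below evaluates a generic U-statistic by brute-force enumeration and counts the kernel evaluations, which number n choose m for a kernel of order m. The kernel used here (half the squared difference, whose U-statistic is the unbiased sample variance) is a textbook example and is not the FLUTE kernel. The orders 2 and 4 and the sample sizes are illustrative assumptions, and `u_statistic` and `n_kernel_evaluations` are hypothetical helper names.

```python
import itertools
import math
import numpy as np

def u_statistic(sample, kernel, order):
    """Average a symmetric kernel over all subsets of size `order`."""
    subsets = itertools.combinations(range(len(sample)), order)
    values = [kernel(*(sample[i] for i in subset)) for subset in subsets]
    return float(np.mean(values))

def n_kernel_evaluations(n, order):
    """Number of kernel evaluations needed by brute-force enumeration."""
    return math.comb(n, order)

def half_squared_difference(a, b):
    """Order-2 kernel whose U-statistic equals the unbiased sample variance."""
    return 0.5 * (a - b) ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=35)
print(u_statistic(x, half_squared_difference, order=2), np.var(x, ddof=1))

for n in (35, 200, 1000):
    print(n, n_kernel_evaluations(n, 2), n_kernel_evaluations(n, 4))
```

The counts grow polynomially in n with degree equal to the kernel order, which is why brute-force evaluation is cheap at n = 35 but becomes a burden for samples in the thousands.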

Appendix

Proof of Theorems.

Lemma 1. Suppose the functional predictors {X_i, i = 1, …, n} and the regression function β(t, s) satisfy the following two conditions,

(A). The functional predictors, {X_i, i = 1, …, n}, belong to a Sobolev ellipsoid of order two: there exists a universal constant C, such that for all i = 1, …, n.
(B). The regression functions satisfy with some constant . Further as L → ∞, the summation of coefficients for k = 1, 2, ….

then we have the approximation error .

Proof. Recall that Then by the Cauchy-Schwarz inequality, we have Next we show that the three parts are controlled separately. According to the Hölder inequality and Condition (A), we have (16) And we have (17) Similarly to the proof of Eqs (16) and (17), we get (18) Hence we complete the proof by combining the bounds on each of the three parts.

Next, to prove Theorems 1 and 2, we first introduce some lemmas.

Lemma 2. Suppose random vector , satisfy E[Z_i] = 0, var[Z_i] = I_p, , where ρ is a constant in (0, 1). If the two random variables Z₁ and Z₂ are independent, for any square matrix M = (m_kℓ)_p×p, we have

(1). ;
(2).
(3).

Proof.

(1). Let W₁ = Z₁Z₁′MZ₁Z₁′, where W₁(k, ℓ) denotes the (k, ℓ)th element of W₁. By direct computation, we have W₁(k, ℓ) = Z_1k Z_1ℓ ∑_i,j m_ij Z_1i Z_1j. If k = ℓ, E[W₁(k, k)] = tr(M) + (2 + ρ)m_kk. If k ≠ ℓ, E[W₁(k, ℓ)] = m_kℓ + m_ℓk. Then E[W₁] = M + M′ + tr(M)I_p + ρ diag(M).
(2). Since , and , then we have
(3). It is straightforward to show that .
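Part (1) can be checked numerically in the Gaussian special case. The sketch below computes E[(Z′MZ) ZZ′] exactly for Z ~ N(0, I_p) by tensor-product Gauss-Hermite quadrature and compares it with M + M′ + tr(M)I_p. It assumes that the ρ diag(M) term vanishes for standard normal entries (fourth moment equal to 3), which is an assumption about how ρ enters the moment condition; the function name `expected_w1_gaussian` and the test matrix are hypothetical.

```python
import itertools
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def expected_w1_gaussian(M, n_nodes=5):
    """Exact E[(Z'MZ) ZZ'] for Z ~ N(0, I_p) via tensor Gauss-Hermite quadrature.

    With n_nodes = 5 the rule is exact for polynomials of degree up to 9 in
    each coordinate; the integrand has degree at most 4 in each coordinate.
    """
    p = M.shape[0]
    nodes, weights = hermegauss(n_nodes)        # weight function exp(-x^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)    # normalise to the N(0, 1) density
    total = np.zeros((p, p))
    for idx in itertools.product(range(n_nodes), repeat=p):
        z = nodes[list(idx)]
        w = np.prod(weights[list(idx)])
        total += w * (z @ M @ z) * np.outer(z, z)
    return total

M = np.array([[1.0, 2.0, 0.0],
              [0.5, -1.0, 3.0],
              [0.0, 1.0, 2.0]])
lhs = expected_w1_gaussian(M)
rhs = M + M.T + np.trace(M) * np.eye(3)
print(np.max(np.abs(lhs - rhs)))
```

Quadrature is used instead of Monte Carlo so that the comparison is deterministic; the printed discrepancy should be at the level of floating-point rounding.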

Lemma 3. For symmetric positive semi-definite matrices A and B, the following inequalities hold [27]:

(1). tr((AB)²) ≤ tr(A²)tr(B²);
(2). tr²(AB) ≤ tr(A²)tr(B²).
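A quick numerical sanity check of the two inequalities can be sketched as follows. The matrices are random symmetric positive semi-definite matrices of the form GG′, an arbitrary choice made only for illustration, and `random_psd` and `lemma3_terms` are hypothetical helper names. Such a check does not replace the proof in [27].

```python
import numpy as np

def random_psd(p, rng):
    """Random symmetric positive semi-definite matrix of the form G G'."""
    g = rng.normal(size=(p, p))
    return g @ g.T

def lemma3_terms(a, b):
    """Return tr((AB)^2), tr(AB)^2 and the common bound tr(A^2) tr(B^2)."""
    ab = a @ b
    return np.trace(ab @ ab), np.trace(ab) ** 2, np.trace(a @ a) * np.trace(b @ b)

rng = np.random.default_rng(1)
for _ in range(5):
    a, b = random_psd(4, rng), random_psd(4, rng)
    trace_of_square, squared_trace, bound = lemma3_terms(a, b)
    print(trace_of_square <= bound, squared_trace <= bound)
```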

Lemma 4. For matrices M_a, a = 1, 2, 3, defined as M_a = ΔΣ^aΔ′, we have tr²(M₂) ≤ tr(M₁)tr(M₃).

Proof.
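One short argument for this inequality, sketched here under the assumptions that Σ is symmetric positive semi-definite (so that Σ^{1/2} exists) and that Δ is a real matrix of conformable dimensions, uses the Cauchy-Schwarz inequality for the trace inner product ⟨U, V⟩ = tr(UV′). The auxiliary matrices U and V are introduced only for this sketch.

```latex
\begin{aligned}
U &= \Delta\Sigma^{1/2}, \qquad V = \Delta\Sigma^{3/2},\\
\operatorname{tr}(UV') &= \operatorname{tr}\!\left(\Delta\Sigma^{1/2}\Sigma^{3/2}\Delta'\right)
  = \operatorname{tr}\!\left(\Delta\Sigma^{2}\Delta'\right) = \operatorname{tr}(M_2),\\
\operatorname{tr}(UU') &= \operatorname{tr}\!\left(\Delta\Sigma\Delta'\right) = \operatorname{tr}(M_1),
  \qquad
\operatorname{tr}(VV') = \operatorname{tr}\!\left(\Delta\Sigma^{3}\Delta'\right) = \operatorname{tr}(M_3),\\
\operatorname{tr}^2(M_2) &= \operatorname{tr}^2(UV')
  \le \operatorname{tr}(UU')\,\operatorname{tr}(VV')
  = \operatorname{tr}(M_1)\operatorname{tr}(M_3).
\end{aligned}
```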

Proof of Theorem 1.

Recalling the definition of the statistic in Eq (9), it is straightforward to show that .

To find the dominating terms, we need to calculate the following projections,

Based on the expansion of X_i(t) and the orthogonality of the bases, we can derive the variance of the projections v_c.

With straightforward calculations, we get Here, the Hadamard product is defined as A ∘ B = (a_ijb_ij) for matrices A = (a_ij) and B = (b_ij). Since both variances v₂ and v₄ are linear combinations of , tr²(M₂), tr(M₁ M₃), tr(ΛM₃), , tr(ΛM₁)tr(Σ²), tr(Q₂ ∘ Q₂), tr(Q₀ ∘ Q₁)², tr(Γ′Δ′ΛΔΓ ∘ Q₃), tr(Q₃ ∘ Q₄), and tr(Σ²), they are of the same asymptotic order. This means that the statistic T_n is dominated by the first two terms corresponding to V_n1 and V_n2. Hence we can get the Hoeffding decomposition (11) of T_n, and Then we complete the proof.

Proof of Theorem 2.

Using the inequalities in Lemma 3, under either the null hypothesis or the local alternative, we have (19) (20) Thus v₁ = o(n⁻¹ v₂).

Define (21) where δ_β = β − β₀. Then we can get We have which can be regarded as a U-statistic with the kernel Through direct calculation, we can get the projections of Ψ, , , and . By Hoeffding’s variance decomposition, we have , (see Supplementary Material B in S1 File).

Because we only need to show that From Eq (21) and the form of , let , where and Under the assumptions of this theorem and following Eqs (19) and (20) we have and

To complete the proof, we now need to show Define and , thus , which we define as . Let be a σ-field generated by . It is easy to see that , . This shows that is a zero-mean martingale. Let , . The martingale central limit theorem [28] will hold if we can show that satisfies the following two conditions: (22) and for all τ > 0 We have and Hence we can define , where It can be shown that E[C_n1] = 1, and As tr(Σ⁴) = O(tr²(Σ²)), and var[C_n1] → 0. Then (see Supplementary Material B in S1 File). Similarly, E[C_n2] = 0, , then In summary, Eq (22) holds.

Since , by the law of large numbers, the last step is to prove (23) We have , and , thus under Assumption (16), we have (see details in S1 File) Hence we prove that Eq (23) holds. And this completes the proof.

Supporting information

S1 File.

https://doi.org/10.1371/journal.pone.0234094.s001

(PDF)

References

1. Ramsay JO, Silverman BW. Functional Data Analysis. New York: Springer; 2005.
2. Yao F, Müller HG, Wang J. Functional linear regression analysis for longitudinal data. The Annals of Statistics. 2005;33(6):2873–2903.
3. Malfait N, Ramsay JO. The historical functional linear model. The Canadian Journal of Statistics. 2003 Jun;31(2):115–128.
4. Chiou JM, Yang Y, Chen Y. Multivariate functional linear regression and prediction. Journal of Multivariate Analysis. 2016 Apr;146:301–312.
5. Kokoszka P, Maslova I, Sojka J, Zhu L. Testing for lack of dependence in the functional linear model. Canadian Journal of Statistics. 2008 Jun;36(2):207–222.
6. Cardot H, Ferraty F, Mas A, Sarda P. Testing hypotheses in the functional linear model. Scandinavian Journal of Statistics. 2003 Mar;30(1):241–255.
7. Kong D, Staicu AM, Maity A. Classical testing in functional linear models. Journal of Nonparametric Statistics. 2016;28(4):813–838. pmid:28955155
8. Su Y, Di C, Hsu L. Hypothesis testing in functional linear models. Biometrics. 2017 Jun;73(2):551–561. pmid:28295175
9. Bai Z, Saranadasa H. Effect of high dimension: by an example of a two sample problem. Statistica Sinica. 1996;6:311–329.
10. Zhong P, Chen S. Tests for high-dimensional regression coefficients with factorial designs. Journal of the American Statistical Association. 2011 Jan;106(493):260–274.
11. Delsol L, Ferraty F, Vieu P. Structural test in regression on functional variables. Journal of Multivariate Analysis. 2011 Mar;102(3):422–447.
12. Patilea V, Sellero CS, Saumard M. Testing the predictor effect on a functional response. Journal of the American Statistical Association. 2016 Jul;111(516):1684–1695.
13. Nadaraya EA. On estimating regression. Theory of Probability & Its Applications. 1964;9(1):141–142.
14. Watson GS. Smooth regression analysis. Sankhya: The Indian Journal of Statistics, Series A. 1964;26:359–372.
15. Delsol L. No effect tests in regression on functional variable and some applications to spectrometric studies. Computational Statistics. 2013;28(4):1775–1811.
16. Horváth L, Kokoszka P. Inference for Functional Data with Applications. New York: Springer; 2012.
17. Shin H, Lee MH. On prediction rate in partial functional linear regression. Journal of Multivariate Analysis. 2012 Jan;103(1):93–106.
18. Ramsay JO, Dalzell CJ. Some tools for functional data analysis. Journal of the Royal Statistical Society. Series B (Methodological). 1991;53(3):539–572.
19. Cardot H, Ferraty F, Sarda P. Spline estimators for the functional linear model. Statistica Sinica. 2003 Jul;13:571–591.
20. Yao F, Müller HG, Wang JL. Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association. 2005;100(470):577–590.
21. Yao F, Lei E, Wu Y. Effective dimension reduction for sparse functional data. Biometrika. 2015 Jun;102(2):421–437. pmid:26566293
22. Fan Y, James GM, Radchenko P. Functional additive regression. The Annals of Statistics. 2015 Oct;43(5):2296–2325.
23. Hoeffding W. A class of statistics with asymptotically normal distribution. The Annals of Mathematical Statistics. 1948 Sep;19:293–325.
24. Serfling RJ. Approximation Theorems of Mathematical Statistics. New York: Wiley; 1980.
25. Lee AJ. U-statistics: Theory and Practice. New York: Marcel Dekker; 1990.
26. Chen SX, Qin YL. A two-sample test for high-dimensional data with applications to gene-set testing. The Annals of Statistics. 2010;38(2):808–835.
27. Bellman R. Some Inequalities for Positive Definite Matrices. Springer; 1980.
28. Hall P, Heyde CC. Martingale Limit Theory and Its Application. New York: Academic Press; 1980.

