On Cox proportional hazards model performance under different sampling schemes

Cox’s proportional hazards model (PH) is an acceptable model for survival data analysis. This work investigates PH models’ performance under different efficient sampling schemes for analyzing time to event data (survival data). We will compare a modified Extreme, and Double Extreme Ranked Set Sampling (ERSS, and DERSS) schemes with a simple random sampling scheme. Observations are assumed to be selected based on an easy-to-evaluate baseline available variable associated with the survival time. Through intensive simulations, we show that these modified approaches (ERSS and DERSS) provide more powerful testing procedures and more efficient estimates of hazard ratio than those based on simple random sampling (SRS). We also showed theoretically that Fisher’s information for DERSS is higher than that of ERSS, and ERSS is higher than SRS. We used the SEER Incidence Data for illustration. Our proposed methods are cost saving sampling schemes.


Introduction
The survival time of a particular event is called the time-to-event. The time of death and time to develop a disease are examples of survival data. Statistical methods for survival analysis have been applied to many vital fields of research. Generally, survival analysis uses data to predict survival probability and identify risk and/or prognostic factors related to subjects' survival and disease progression. An essential aspect of survival data is not usually fully observed in all subjects under study, leading to different censored data types.
Subjects in a study are usually assumed to be selected randomly (interred the study randomly) in the sense of simple random sample (SRS) [1]. Another promising sampling method in the literature is Ranked set sampling (RSS) introduced by McIntyre [2,3]. RSS is a useful cost affected sampling scheme for agriculture and environmental studies. Samawi and Al-Sagheer [4] applied RSS in a study involving human subjects by collecting data in the study based on bilirubin's level in the jaundice premature babies' blood. Also, recently Jabrah et al. [5] used RSS in a study involving human subjects, and they selected college students to analyze a psychological intervention to buttress resilience study. More applications of RSS and a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 variations of RSS and their efficiency for estimation population means and other parameters are presents recently by several authors such as [6][7][8][9][10][11][12][13][14] among others. RSS (Balanced) is collected by drawing m 2 items randomly from a target population and then randomly dividing them into m sets of m units. Units from each set of size m then ranked virtually, or by the mean of the available auxiliary variable (concomitant) that associated with the variable of interest. Among the first set of m units, we choose the unit with the lowest rank and measure the unit's actual value. In the second set unit with the second-lowest ranking is measured, and so on until the unit with the highest ranking is measured from the last set. It is essential to point out that the ranking error will reduce RSS's efficiency compared with SRS in some inferential procedures. Therefore, it is crucial to choose a ranking method that improves the ranking precision [15,16].
Furthermore, when the set size m is large, selecting an RSS sample of size m is susceptible to ranking errors, so the literature suggested m not to excide 5. However, to overcome size r of m restriction to reduce ranking errors, several modifications and variations of the RSS are suggested in the literature. Samawi et al. [17] were the first to introduce the ERSS sampling scheme, while Al-Odat and Al-Saleh [18] proposed moving Extreme Ranked Set Sampling (MERSS). Al-Saleh and Samawi [19] and Samawi and Al-Saleh [20] implemented both MERSS max and MERSS min to provide valid odds and more efficient estimates of the odds ratio, respectively. Samawi et al. [21] investigates the use of MERSS min (max) to improve the Cox proportional hazards model's efficiency. This modified sample's advantage is only the maximum (or minimum) sets of varied sizes are identified for quantification in this procedure. Therefore, even for large m, MERSS, or ERSS, can be easily implemented.
As another modification of RSS, is the Double Extreme Ranked Set Sampling (DERSS) sampling scheme was introduced by Samawi [22] for the mean and regression estimators. Helu et al. [23] improved the AFT survival model's analysis efficiency using a modified version of ERSS, namely ERSSmin (ERSSmax). Also, Samawi et al. [24] used the modified MERSS to improve the logistic regression analyses' performance. Recently, Samawi et al. [25] further enhances logistic regression analysis using modified DERSS.
The DERSS min (DERSS max ) procedure, which is an extension to ERSS procedure, is suggested here. This procedure involves randomly drawing m independent sets each contains m random samples of size m each (note that each set has m 2 sample units). Each sample is drawn from a large population. We assume that the largest or the smallest sample unit within each sample with respect to the value of an auxiliary variable Z, which is associated with survival time, can be identified with no or little cost. From each set of size m 2 use ERSS procedure, as described by Samawi at el. [17] to obtain m ERSS min (or ERSS max ) samples of size m each (this completes the first stage). Again apply the same ERSS min (or ERSS max ) procedure on the obtained m ERSS min (or ERSS max ) samples of size m each to obtain a sample of size m, which will be called the second stage of ERSS min (or ERSS max ) or Double Extreme Ranked Set Sample DERSS min (or DERSS max ). This will complete one cycle. The cycle may be repeated r times to obtain a DERSS of size n = rm.
This work is the first to discusses the implementation and the performance of the proportional hazards model when the subjects are selected based on the modified ERSS (ERSS min or ERSSmax) and DERSS (DERSS min or DERSSmax) sampling scheme. In this work, we assume that the ranking of subjects, including in the study, is based on one of the available auxiliary variables associated with the variable of interest (time-to-event). Also, we are the first to show theoretically that the proposed sampling schemes improve the Fisher's information compared with SRS. This implies that Fisher's information for DERSS is higher than that of ERSS, and ERSS is higher than SRS. The rest of the paper is as follows: Section 2 provides basic survival analysis, definitions, DERSS procedure, and notation. The analysis using the Cox proportional hazards model and its properties using DERSS is discussed in section 3. In Section 4, we provide our simulation study to compare the Cox proportional hazards model's performance using the modified DERSS to ERSS and SRS. We illustrate the proposed methods using the SEER Incidence Data in section 5. Final remarks are provided in Section 6.

Basic definitions
Let S(t) denote the survival function at time t with hazard rate function denoted by h(t) and f (t) as p.d.f for a random variable of time to an event T. Furthermore, survival data is usually not fully observed, and data censoring procedure occurs. The right censoring occurs when the subject has not experienced the event of interest at a given time, which means that we observe only the lower bound for the value of t of the censored individuals. On the other hand, type I censoring occurs when the event is unobserved before some pre-specified time, such as at the closing of a study. In this work, we focus on type I right censoring.

Likelihood function for type I right censored data
In survival data, we define t i = min(T i , C i )T i the true survival time and C i the censoring time for the i th individual. Let δ i be an indicator variable, such that The likelihood function for n observations is given by where β is the set of parameters of risk factors.

Sample notation and some basic results of the modified ERSS and DERSS
As in Samawi (2002), let fX l ijk ; i ¼ 1; 2 . . . ; m 2 ; j ¼ 1; 2; . . . ; m; l ¼ 1; 2; . . . ; m; k ¼ 1; 2; . . . ; rg denotes the m independent sets for the kth cycle, each with sample size m 2 from a p.d.f. f(x) and a distribution function F(x) (absolutely continuous). Using similar notations as in Samawi et al. [21] for selecting DERSSmin, and hence also ERSS min , we have, after ranking the sample units within each sample in each set, visually or by any not costly way, we obtain ERSS min as, (X 1(1)k , X 2(1)k ,. . .,X m(1)k , k = 1,2,. . .,r) of size m.r. Then DERSSmin is obtained as . ., r, denotes DERSSmin. However, selecting DERSSmax is similar but selecting the maximum instead of the minimum. Samawi et al. [25] showed that the p.d.f of the smallest and the largest order statistics of an i.i.d sample of size m with p.d.f f X (x) are respectively given by: . .,m and k = 1, 2,. . ., r., then

Cox proportional hazards model using ERSS min and DERSS min
This section investigates the Cox Proportional Hazards Model using ERSS min and DERSS min when the survival time and the ranked auxiliary variable (X) are positively correlated. When they have a negative correlation, we can use either ERRS max or DERSS max , and all consequence results can be derived in similar manners. We will focus on DERSS min in our consequence discussion.
For the k th cycle, k = 1,2,. . .,r, let Y 1(1)k ,Y 2(1)k ,. . .,Y m(1)k be the measurements obtained on the auxiliary variable, using DERSS min with size n = rm. Moreover, let the vector V i [1] . .,r represent the p+1 explanatory variables' observations and one auxiliary variable obtained from the sampled unit in the ith set and kth cycle. Note that the notation (.) stands for perfect ranking while the notation [.] stands for imperfect ranking.
The partial likelihood is usually used to estimate the model parameters β = (β 1 , β 2 ,. . .,β p+1 ) 0 in the Cox model. Under SRS assumptions, Cox and Oakes [26] showed that the estimators of β = (β 1 , β 2 ,. . .,β p+1 ) 0 from partial likelihood are consistent and asymptotically normal distributed. However, Cox and Oakes [26] showed a loss of precision in estimation by using partial likelihood compared to the full likelihood. Also, they indicated that the sample size and the proportion of censoring observation affect inference precision. When the percentage of censoring increases and the sample size is small, the estimators from partial likelihood lose more precision. We propose to use DERSS to improve the accuracy of inferences based on the partial likelihood estimation of Cox model parameters.

Partial likelihood for the Cox model
To estimate the parameter in the Cox proportional hazards model, we order the random sample of size n individuals according to the rank of their survival times, t 1 < t 2 <. . . < t n , assuming there are no ties for the observations. Instead of using the full likelihood function in (5), Cox [27] proposed partial likelihood as follows, It is the consequence of the conditional probability that an individual experiences an event at time t i [1]k , (i = 1,2,. . .,m;k = 1,2,. . .,r) given that the individual in the event's risk set at t i [1]k , where R(t i [1]k ) be the set of all individuals who have not experienced the event and are uncensored just prior to t i [1]k . Under DERSS min , and as described by Samawi et al. [21] the partial likelihood function (PLHF) can be written as When the observation is censored, the product of the conditional probabilities is 1 and equivalently can be written as where (r d .m) is the number of events (failed subjects) in a sample of size (n = r.m). The loglikelihood can be written as, Since the partial likelihood function depends on the ordering of survival times and then the baseline hazard distribution is inherent in ordering the survival times, and no shape is described. Therefore, the produced partial likelihood is only based on the uncensored observations, and hence this may result in a large sample size requirement. However, in contrast to SRS, ERSS and DERSS naturally provide a larger percentage of events, as indicated by Samawi et al. [21,25] and may result in requiring a smaller sample size. For estimating the j th parameter (β j ), we need to solve the following equation where g h ðy ið1Þk ; Rðt i½1�k ÞÞ ¼  (7).
For the consistency and the asymptotic properties ofβ, see [26].
To show the superiority of using ERSS and DERSS proposed in this paper, we need to derive the Fisher's information matrix ofβ, say I(β). To find I(β) we take the second derivative of L(β): : For the jth covariate, we have which is a function of the ranked auxiliary covariate (Y) used to select subjects from the population using the DERSS min scheme randomly and assumed to have some distribution F Y (y), with a continuous density f Y (y). As suggested by [21,25] ranking on this variable induces a corresponding ordering on the response variable (T), which leads to improved precision. When using DERSS min , we have taken the minimum judgment ranking from each set of size m, and as shown by [25], the density function of Y (1) is g Yð1Þ ðyÞ ¼ m 2 f Y ðyÞ ½1 À F Y ðyÞ� m 2 À 1 .
Then the Fisher information using DERSS min is given by: Since we assume that β P+1 > 0, then the number of events and R(t i [1]k ) will be increased. This implies that g h (y (1) ; R(t i [1]k )) is a decreasing function of y. Also, mð1 À F Y ð1Þ ðyÞÞ mÀ 1 is a decreasing function of y, thus as in See and Chen [28] Hence, the last inequality in [15] can be argued similarly. These inequalities are still held when the correlation between the survival time and Y is negative, and DERSS max is used. The inequalities in [15] show that using DERSS min provides more information than ERSS and ERSS provides more information than SRS, and hence DERSS min may require a smaller sample size to achieve similar precision as for ERSS or SRS.

Simulation study and results
For comparison purposes, we used simulation to evaluate the Cox proportional hazards model's performance when ERSSmin and DERSSmin are used. We assume that the Ranked Auxiliary Covariate is positively correlated with survival time. We compared DERSSmin with both ERSSmin and SRS. The power of testing the hypothesis of no treatment effect after controlling for the auxiliary variables is compared. We also studied the performance of estimating the conditional hazard ratios for risk factors and their confidence intervals. We used different conditional hazard ratios and different values of the association between survival time and the auxiliary variables. Two combinations of sample sizes of m and r are considered used in the simulations, i.e., (m = 20, r = 10), and (m = 30, r = 20). We used 5000 simulated samples in our simulation.
The parameter β 1 is the association between T and Y, and β 2 is the risk factor parameter. The simulation results are given in Tables 1-3 for a dichotomous risk factor, while the results of a continuous risk factor are presented in Tables 4-6. We estimated the empirical nominal value (α = 0.05) and the tests' power (see Tables 1 and 4). The estimated conditional hazard ratios and their MSEs are reported in Tables 2 and 5, while the 95% confidence intervals for hazard ratios are reported in Tables 3 and 6.
From Table 1, we can conclude that, when the risk factor is dichotomous, testing the hypothesis of the risk factor's effect, controlling for the Ranked Auxiliary Covariate Y in the model, the DERSS min results in a more powerful test than SRS and ERSS min . All SRS, ERSS min , and DERSS min achieved the test nominal value (0.05) under the null hypothesis in all cases. We demonstrated the test's power increases as the set size m increase and/or the value of β 1 increases. However, using DERSS min in the Cox proportional hazard model provides greater power than SRS and ERSS min , in all cases. Similarly, from Tables 2 and 3, DERSS min has smaller MSE and narrower confidence intervals in estimating the hazard ratios. Similar results are concluded from Tables 4-6 when the risk factor is a continuous variable.

Application using the SEER incidents data set
We illustrate the Cox proportional hazards model based on DERSS, ERSS, and SRS using SEER Incident Data [29]. We selected histologic grade as the primary predictor of interest and month of life after a breast cancer diagnosis as the outcome. Due to breast cancer, deaths were considered death and being alive, or Deaths due to other causes deemed to be censored. In the Cox proportional hazards model, we also include age and tumor marker1 as confounders. The data included 454,517 individuals, and 233,125 have complete data for a grade (1: well-differentiated; 2: moderately differentiated; 3: poorly differentiated; 4: undifferentiated). Our analysis was based on complete data. Since grade levels 3 and 4 are crossed on the survival plots, which violates the proportional hazards assumption for the Cox model, we combined grade 3 and grade 4 into one single group as poorly differentiated or undifferentiated and denoted the new grade variable (1: well-differentiated; 2: moderately differentiated; 3: poorly differentiated or undifferentiated). We treated the entire data set of 233,125 observations as a population and randomly selected ERSS and DERSS samples (with m = 15, r = 20). We selected ERSS and DERSS based on ranking on the lifetime directly since most of the data set variables are categorical. The only potential continuous variable for ranking is age, which is only weakly associated with lifetime with hazards ratio close to 1. We performed the hypothesis testing of no difference in survival time between the patients with a different histologic grade at diagnosis, adjusting for the covariate age. Table 7 represents the survival analysis results for a Cox proportional hazards model using all available data. The estimation of parameters and the conditional hazard ratios are treated as actual for comparison purposes in this example. Table 7 and Fig 1 are the results of the survival analysis based on the entire data. Fig 1 indicates that the proportional hazard assumption of the model is valid. Also, Fig 1 shows that the survival chance for grade 1 is higher than grade2 and grade 3, and grade 2 is higher than grade 3 of the disease. However, based on the p-values, the test of the effect of grade 3 on survival time compared with grade 1 and 2 was significant, controlling for age, based on the 500 bootstrap samples of SRS, ERSS, and DERSS of size n = 300. Both ERSS and DERSS used survival time for ranking and chose the samples with the minimum survival time. The DERSS samples give smaller p-values for the majority of the time and coincide with the whole data analysis. The estimated power out of the 5000 runs was 0.944, 0.930, and 0.856, respectively, under DERSS, ERSS, and SRS. Therefore, whenever possible, ERSS and/or DERSS are to be recommended for survival analysis.

Final remarks and conclusion
It is essential in collecting data to have a sampling scheme that is cost-effective and less timeconsuming. ERSS and DERSS, which are RSS modifications, are encouraging sampling techniques that can effectively have more efficient estimators with less cost than SRS. We proposed a more efficient survival regression analysis method based on Double Extreme Ranked Set Sampling (DERSS) and Extreme Ranked Set Sampling (ERSS). Subjects are assumed to be ranked based on an auxiliary variable associated with the response variable. We discussed parameter estimation based on the maximum likelihood approach and provided an expression for the estimated variance based on the inverse information matrix. Also, we discussed the asymptotic behavior of the ML estimators. Our findings conclude that using ERSS and DERSS can significantly increase power when used in a Cox proportional hazards model. We show that the test's power increases as the set size m increases through the simulation studies.  Moreover, ERSS and DERSS provide more efficient estimates of the parameters associated with hazard ratios in smaller MSEs and narrower confidence intervals than those under SRS.