Abstract
Ranked set sampling (RSS) has attracted broad interest among researchers and remains an active research topic. It has at last begun to find its way into practical applications beyond its agricultural origins in the seminal paper by McIntyre in 1952. One of the extensions of RSS is median ranked set sampling (MRSS). MRSS is a sampling procedure normally used when measuring the variable of interest is difficult or expensive, whereas the units can easily be ranked using an inexpensive sorting criterion. Several researchers have introduced ratio, regression, exponential, and difference type estimators for mean estimation under the MRSS design. In this paper, we propose three new mean estimators under the MRSS scheme. Our idea is based on three-fold utilization of supplementary information. Specifically, we utilize the ranks and second raw moments of the supplementary information together with the original values of the supplementary variable. The appropriateness of the proposed group of estimators is demonstrated using both real and artificial data sets through Monte Carlo simulation. Additionally, their performance is compared with that of the reviewed families of estimators. The results are encouraging, and the superior performance of the proposed group of estimators is observed throughout the paper.
Citation: Shahzad U, Ahmad I, Almanjahie IM, Al-Omari AI (2022) Three-fold utilization of supplementary information for mean estimation under median ranked set sampling scheme. PLoS ONE 17(10): e0276514. https://doi.org/10.1371/journal.pone.0276514
Editor: Anoop Kumar, Amity University - Lucknow Campus, INDIA
Received: September 5, 2022; Accepted: October 8, 2022; Published: October 24, 2022
Copyright: © 2022 Shahzad et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are available within the paper.
Funding: Authors are grateful to the Deanship of Scientific Research at King Khalid University, Kingdom of Saudi Arabia for funding this study through the research groups program under project number R.G.P.2/132/43. Ibrahim Mufrah Almanjahie, Ishfaq Ahmad and Amer Ibrahim Al-Omari received the grant.
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
The ranked set sampling (RSS) design was first developed by [1] as an efficient sampling plan for estimating population parameters, e.g., the mean of pasture and forage yields. This sampling plan is appropriate in circumstances where the observations can be ranked easily, either on the basis of a supplementary variable that is linearly related to the study variable or by any other inexpensive method. RSS has wide applications in numerous scientific problems, particularly in ecological and environmental studies, where the principal focus is on practical and efficient sampling techniques. For instance, assume that an environmental protection agency needs to ensure that the fuel stations in certain metropolitan areas are dispensing gasoline in compliance with clean air regulations. It is quite possible that the chemical properties of the fuel can be ranked right after collection at the fuel pump by some crude field methods that are cheap and simple, whereas carrying the sample units to the laboratory and using laboratory procedures to measure those chemical properties is costly. For more details about RSS and its applications, see, for example, [2]. [3] suggested the median ranked set sampling (MRSS) for estimating the population mean. The MRSS consists of the following steps:
- Step 1: Select n random samples, each of size n units from the population and rank the units within each sample with respect to the variable of interest.
- Step 2: If the sample size n is odd, then from each sample select for measurement the $\left(\frac{n+1}{2}\right)$th smallest ranked unit (the median of the sample). If the sample size n is even, then select for measurement the $\left(\frac{n}{2}\right)$th smallest ranked unit from the first $\frac{n}{2}$ samples, and the $\left(\frac{n+2}{2}\right)$th smallest ranked unit from the second $\frac{n}{2}$ samples. The cycle can be repeated m times if needed to obtain a sample of size nm.
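The selection steps above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the function name `mrss_sample`, the representation of units as `(x, y)` pairs, and the choice to rank each set on the first coordinate are assumptions made for the example. Sets are drawn independently of one another, so a unit may appear in more than one set.

```python
import random


def mrss_sample(population, n, cycles=1, rng=None):
    """Draw a median ranked set sample of size n * cycles.

    population: list of (x, y) pairs (hypothetical layout); each set is
    ranked on x, and the selected unit is the one that would be measured.
    """
    if rng is None:
        rng = random.Random()
    measured = []
    for _ in range(cycles):
        for i in range(n):
            # Step 1: draw one set of n units and rank it on x.
            ranked = sorted(rng.sample(population, n), key=lambda u: u[0])
            # Step 2: choose the 1-based rank of the unit to measure.
            if n % 2 == 1:
                rank = (n + 1) // 2      # odd n: the median of every set
            elif i < n // 2:
                rank = n // 2            # even n: first n/2 sets
            else:
                rank = (n + 2) // 2      # even n: remaining n/2 sets
            measured.append(ranked[rank - 1])
    return measured


# Tiny populations whose sets are always the whole population,
# so the selected ranks are easy to check by hand.
odd_pop = [(1, 10), (2, 20), (3, 30)]
even_pop = [(1, 10), (2, 20), (3, 30), (4, 40)]
odd_sample = mrss_sample(odd_pop, 3, rng=random.Random(0))
even_sample = mrss_sample(even_pop, 4, rng=random.Random(0))
```

With these toy populations every set contains all units, so the odd case always returns the middle unit, while the even case returns the second-ranked unit for the first two sets and the third-ranked unit for the last two.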
[4, 5] suggested multistage median ranked set sampling for estimating the population mean and median, respectively. [6] considered the modified ratio estimator for the population mean using double MRSS. For more details of RSS and its applications, see [7–17].
[12] considered the median ranked set sampling scheme and built ratio-type estimators based on known quartiles of supplementary information. [18] extended the work of [12] and suggested new families of estimators. These families also use the known mean and quartiles of supplementary information. In this paper, some new estimators for the population mean of the random variable Y are proposed under the MRSS scheme. The basic idea is that when there exists an adequate relationship (sufficient correlation) between Y and X, the ranks and the second raw moment of the supplementary variable are also correlated with the variable of interest. In light of this, we utilize three forms of a single supplementary variable X, namely the original values, the ranks, and the second raw moment. The proposed three-fold utilization of the supplementary variable is intended to improve the estimation of the population mean μy. We therefore build estimators of μy that are based not only on the original values of the auxiliary variable X but also on its second raw moment and ranks. The appropriateness of the proposal is also shown empirically using real and artificial populations. The performance assessment reveals the superior performance of the proposed family in comparison with the estimators of [12, 18].
The rest of the paper is organized as follows. In Section 1.1, we present the notation and a concise review of the estimators used in MRSS when a supplementary variable is available. For every estimator, the theoretical properties of order $n^{-1}$ are derived. In Section 2, we extend the idea of [12, 18], in light of three-fold utilization of supplementary information of a single auxiliary variable, and propose three new estimators for estimating μy under MRSS with their theoretical properties. In Section 3.1, we perform a simulation study to compare, under different scenarios, the efficiency of the proposed estimators with that of competing estimators in the literature. In Section 3.4, the efficiency of the estimators is investigated using a real-life data set. Finally, the conclusion is provided in Section 4.
1.1 Preliminaries with respect to median ranked set sampling
Suppose a median ranked set sample of size n is drawn from a finite population ω consisting of N units. Let Y and X be the study and auxiliary variables, respectively. Further, let Z and V represent the squares and the ranks, respectively, of the values of the supplementary variable.
Let (Xi(1), Zi[1], Yi[1], Vi[1]), (Xi(2), Zi[2], Yi[2], Vi[2]), …, (Xi(n), Zi[n], Yi[n], Vi[n]) be the order statistics of Xi1, Xi2, …, Xin and the judgment order of Zi1, Zi2, …, Zin; Yi1, Yi2, …, Yin; Vi1, Vi2, …, Vin for (i = 1, 2, …, n), where (.) and [.] indicate that the ranking of X is perfect and the ranking of Y has error. For odd and even sample sizes, the units measured using MRSS are denoted by MRSSO and MRSSE, respectively.
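For odd sample size, let $\left(X_{i(\frac{n+1}{2})}, Z_{i[\frac{n+1}{2}]}, Y_{i[\frac{n+1}{2}]}, V_{i[\frac{n+1}{2}]}\right)$, $i = 1, 2, \ldots, n$, be the observed units by MRSSO. Let
$$\bar{x}_{(O)} = \frac{1}{n}\sum_{i=1}^{n} X_{i(\frac{n+1}{2})}, \quad \bar{z}_{[O]} = \frac{1}{n}\sum_{i=1}^{n} Z_{i[\frac{n+1}{2}]}, \quad \bar{y}_{[O]} = \frac{1}{n}\sum_{i=1}^{n} Y_{i[\frac{n+1}{2}]} \quad \text{and} \quad \bar{v}_{[O]} = \frac{1}{n}\sum_{i=1}^{n} V_{i[\frac{n+1}{2}]}$$
be the MRSS means of X, Z, Y and V, respectively. Further,
$$\frac{1}{n}\sigma^2_{x(\frac{n+1}{2})}, \quad \frac{1}{n}\sigma^2_{z[\frac{n+1}{2}]}, \quad \frac{1}{n}\sigma^2_{y[\frac{n+1}{2}]} \quad \text{and} \quad \frac{1}{n}\sigma^2_{v[\frac{n+1}{2}]}$$
are the variances of $\bar{x}_{(O)}$, $\bar{z}_{[O]}$, $\bar{y}_{[O]}$ and $\bar{v}_{[O]}$, respectively, where $\sigma^2_{x(\frac{n+1}{2})}$ is the variance of the $\left(\frac{n+1}{2}\right)$th order statistic of X in a set of size n, and $\sigma^2_{z[\frac{n+1}{2}]}$, $\sigma^2_{y[\frac{n+1}{2}]}$ and $\sigma^2_{v[\frac{n+1}{2}]}$ are the variances of the corresponding judgment-ordered values of Z, Y and V.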
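For even sample size, let $\left(X_{i(\frac{n}{2})}, Z_{i[\frac{n}{2}]}, Y_{i[\frac{n}{2}]}, V_{i[\frac{n}{2}]}\right)$, $i = 1, \ldots, \frac{n}{2}$, and $\left(X_{i(\frac{n+2}{2})}, Z_{i[\frac{n+2}{2}]}, Y_{i[\frac{n+2}{2}]}, V_{i[\frac{n+2}{2}]}\right)$, $i = \frac{n}{2}+1, \ldots, n$, denote the observed units by MRSSE. Further,
$$\bar{x}_{(E)} = \frac{1}{n}\left(\sum_{i=1}^{n/2} X_{i(\frac{n}{2})} + \sum_{i=n/2+1}^{n} X_{i(\frac{n+2}{2})}\right), \quad \bar{z}_{[E]} = \frac{1}{n}\left(\sum_{i=1}^{n/2} Z_{i[\frac{n}{2}]} + \sum_{i=n/2+1}^{n} Z_{i[\frac{n+2}{2}]}\right),$$
$$\bar{y}_{[E]} = \frac{1}{n}\left(\sum_{i=1}^{n/2} Y_{i[\frac{n}{2}]} + \sum_{i=n/2+1}^{n} Y_{i[\frac{n+2}{2}]}\right), \quad \bar{v}_{[E]} = \frac{1}{n}\left(\sum_{i=1}^{n/2} V_{i[\frac{n}{2}]} + \sum_{i=n/2+1}^{n} V_{i[\frac{n+2}{2}]}\right)$$
are the MRSS means of X, Z, Y and V, respectively. Also,
$$\frac{1}{2n}\left(\sigma^2_{x(\frac{n}{2})} + \sigma^2_{x(\frac{n+2}{2})}\right), \quad \frac{1}{2n}\left(\sigma^2_{z[\frac{n}{2}]} + \sigma^2_{z[\frac{n+2}{2}]}\right), \quad \frac{1}{2n}\left(\sigma^2_{y[\frac{n}{2}]} + \sigma^2_{y[\frac{n+2}{2}]}\right) \quad \text{and} \quad \frac{1}{2n}\left(\sigma^2_{v[\frac{n}{2}]} + \sigma^2_{v[\frac{n+2}{2}]}\right)$$
are the variances of $\bar{x}_{(E)}$, $\bar{z}_{[E]}$, $\bar{y}_{[E]}$ and $\bar{v}_{[E]}$, respectively, where $\sigma^2_{x(r)}$ is the variance of the $r$th order statistic of X in a set of size n, and $\sigma^2_{z[r]}$, $\sigma^2_{y[r]}$ and $\sigma^2_{v[r]}$ are the variances of the corresponding judgment-ordered values of Z, Y and V.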
In finite population sample surveys, the estimates of the target parameter μy can be improved by utilizing supplementary information, for instance, through the ratio, product and regression methods of estimation. Along these lines, some authors have extended part of these results to MRSS. In the following, we present a concise review of certain important contributions.
When a significant degree of linear relationship exists between the subject variable Y and the supplementary variable X, and the mean and quartiles of X are known in advance, [12] developed two ratio-type estimators of the mean under the MRSS design:
where μx is the population mean of the random variable X and (q1, q3) represent the lower and upper quartiles of the supplementary variable X. These estimators can be rewritten as
(1)
where j = (E, O) represents the even and odd sample selection. Further, (k = 1, 3) denotes the first and third quartiles, respectively. The bias and mean square error (MSE) expressions of order $n^{-1}$ of
are
(2)
where
for (a = 1, b = qk) and
.
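As a rough illustration of how a known quartile enters a ratio-type estimator of this kind, the sketch below assumes the form ȳ(μx + qk)/(x̄ + qk), which corresponds to the choice (a = 1, b = qk) mentioned above. The function name, the argument names and the numeric inputs are hypothetical and are not taken from [12].

```python
def quartile_ratio_estimate(y_bar, x_bar, mu_x, q_k):
    """Ratio-type estimate of the mean of Y using a known quartile q_k of X.

    Assumed form (an illustration, not quoted from the source):
        y_bar * (mu_x + q_k) / (x_bar + q_k)
    where y_bar and x_bar are MRSS sample means and mu_x is the known
    population mean of X.
    """
    denominator = x_bar + q_k
    if denominator == 0:
        raise ValueError("x_bar + q_k must be non-zero")
    return y_bar * (mu_x + q_k) / denominator


# Hypothetical MRSS sample means and known population values of X.
estimate = quartile_ratio_estimate(y_bar=10.0, x_bar=4.0, mu_x=5.0, q_k=1.0)
print(estimate)
```

When the sample mean of X falls below μx, the estimate is adjusted upward relative to ȳ, which is the desired correction when X and Y are positively correlated; when x̄ equals μx the estimate reduces to ȳ.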
The ratio estimators developed in [12] are most suitable for a positive linear relationship, i.e., (ρxy(j) > 0). Similarly, for a negative linear relationship, i.e., (ρxy(j) < 0), the product versions of these estimators can be developed by taking the reciprocal of the ratio-type component. To overcome this restriction on the sign of ρxy(j), [18] developed the usual regression estimator under MRSS, with its MSE expression, as given by
(3)
(4)
where bxy(j) is the regression coefficient. [19] constructed a difference-type estimator based on two tuning parameters, known as the generalized regression estimator, under the SRS design. By merging the ideas of [12, 19], [18] also introduced a difference-type estimator under MRSS as follows:
(5)
where k1(j) and k2(j) are suitably chosen constants. The bias and MSE expressions are given by
(6)
(7)
The optimum values of k1(j) and k2(j) are given by
(8)
(9)
[20] considered the ratio and product parts of the usual ratio and product estimators in exponential form and hence obtained improved estimates of the population parameter. These exponential estimators perform well subject to the corresponding sign restrictions, i.e., (ρxy(j) > 0) for the ratio-type and (ρxy(j) < 0) for the product-type estimator. Taking inspiration from [20] and
, [18] also developed exponential type estimators under MRSS as follows:
(10)
The bias and MSE expressions are as given below:
(11)
(12)
where
By substituting and
in
, we get minimum MSE, given by
(13)
where
(14)
(15)
2 Materials and methods
The appropriate use of supplementary information helps to increase the precision of an estimator both at the design stage and at the estimation stage, see [21–33]. In social, economic, and natural-science surveys, complete supplementary information is frequently available for the sampling frame. The basic concept is that when there exists an adequate relationship between (X, Y), the ranks and the second raw moment of X are also correlated with the variable of interest. Consequently, the ranked supplementary variable V (which is based on the ranks of X) can be treated as a type of supplementary variable. Similarly, the second raw moment auxiliary variable Z (which is based on the second raw moment of X) can be treated as another supplementary variable. Hence, this three-fold supplementary information can help to increase the efficiency of the estimators. In view of these ideas, we propose three new estimators of the finite population mean in the following sub-sections.
2.1 First proposed estimator
Taking motivation from [12, 18] and utilizing two out of three defined forms of auxiliary information (X, Z), we propose the following estimator as
(16)
To obtain the bias and the MSE, we define
If the sample size n is odd, we can write
If the sample size n is even, we can write
Expressing the proposed estimator
in terms of ε’s, we have
(17)
(18)
By applying the expectation to the above expression, we determine the theoretical expression of the bias of order $n^{-1}$ as
(19)
where
Taking the square and the expectation of (18), we obtain the MSE expressions of . After partial differentiation of the MSE, we get the optimum values of weights, i.e., k1, k2, and k3. By substituting the optimum values of weights, we determine the minimum MSE expressions. The expressions of optimum weight and the minimum MSE are as follows.
(20)
where
(21)
(22)
where
(23)
where
2.2 Second proposed estimator
Taking the motivation from and utilizing two out of three defined forms of auxiliary information, i.e., (X, V), we propose the following estimator.
(24)
(25)
To obtain the bias and the MSE, we define
If the sample size n is odd, we have
If the sample size n is even, we write
Accordingly, the bias and minimum MSE can be obtained from the expressions provided in the previous sub-section by replacing the second raw moment auxiliary variable Z with the corresponding ranked supplementary variable V. Hence, we have
(26)
where
(27)
where
For brevity, we omit the expressions of
and
. Interested readers may easily get the mathematical expressions of these quantities by simply replacing z with v, regarding the previous sub-section.
2.3 Third proposed estimator
Taking motivation from and
and utilizing all the three defined forms of auxiliary information, i.e., (X, V, Z), we propose the following estimator
(28)
where
(29)
(30)
where
(31)
The bias of
is
(32)
Taking the square and the expectation of (30), we get the MSE expressions of . Then, by partial differentiation of the MSE, we determine optimum values of weights, i.e., k1, k2, k3, and k4. By substituting the optimum values of weights, we obtain the minimum MSE expressions. The optimum weights and minimum MSE are as follows.
(33)
(34)
(35)
(36)
(37)
where
As can be seen, the optimum values of k1, k2, k3, k4 and the minimum MSE contain the quantities t1(O), t2(O), …, t7(O) for odd sample size and t1(E), t2(E), …, t7(E) for even sample size. The detailed expressions of t1(j), t2(j), …, t7(j) are as follows.
3 Results and discussion
3.1 Simulation study
When assessing the performance of newly proposed estimators, it is customary to derive the MSE-based conditions under which an estimator is more efficient than the others. However, these MSE-based conditions are generally hard to verify. Therefore, we intentionally refrain from deriving such conditions from the MSE expressions and rely instead on various simulation experiments in the current section. Later in this section, we also assess the performance using a real-life data set.
We perform the simulation study to evaluate the performance of the new estimators, i.e., over the existing ones, i.e.,
in MRSS. In these simulation experiments, we generate a population of size 10000 from a bivariate normal distribution
where we assume the values ρxy = {±0.95, ±0.75, ±0.50, ±0.35, ±0.25, ±0.15}. For more details, interested readers are referred to [18]. It is worth mentioning that only one form of the supplementary information, i.e., the supplementary variable X, is generated from the bivariate normal distribution. The remaining two forms of the supplementary information, i.e., (V, Z), are obtained by taking the ranks and the squares of X, respectively.
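The construction of the three forms of supplementary information can be sketched as follows. This is only an illustration under stated assumptions: the means and standard deviations of the bivariate normal distribution (5 and 1 for both variables), the seed, and all function names are hypothetical choices, not the settings of the study. Only the population size and the idea of deriving V and Z from X follow the description above.

```python
import math
import random


def make_population(size, rho, mu_x=5.0, mu_y=5.0, sd_x=1.0, sd_y=1.0, seed=0):
    """Generate (X, Y) from a bivariate normal with correlation rho.

    The default means and standard deviations are assumed values.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(size):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(mu_x + sd_x * z1)
        # Mixing z1 and z2 this way gives corr(X, Y) = rho.
        ys.append(mu_y + sd_y * (rho * z1 + math.sqrt(1 - rho ** 2) * z2))
    return xs, ys


def ranks(values):
    """Return 1-based ranks (1 = smallest); ties are not averaged."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for r, idx in enumerate(order, start=1):
        out[idx] = r
    return out


def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


x, y = make_population(10000, rho=0.75)
v = ranks(x)                # ranks of X
z = [xi * xi for xi in x]   # squares of X

print(pearson(x, y), pearson(v, y), pearson(z, y))
```

With a mean of X well away from zero, both the ranks and the squares of X remain strongly correlated with Y, which is the property the three-fold utilization relies on. If X were centred at zero, the squares would carry almost no linear information about Y.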
We consider two situations: the first is based on the true population descriptives, while the latter is based on the estimated population descriptives.
3.2 Simulation with known population descriptives
To determine the percentage relative mean square error (PRMSE) of the estimators concerned, we consider both even and odd sample sizes, in light of the preliminaries defined in Section 1.1. We consider n = 3, 5 as odd and n = 4, 6 as even sample sizes. Each MRSS sample, of even or odd size, is selected 5000 times and the average MSE is calculated. The PRMSE is calculated as:
In the current sub-section, a simulation design is based on the assumption that complete supplementary information is available. Thus, the true population parameters of supplementary information are considered here.
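The replication scheme described in this sub-section, in which MRSS samples are drawn repeatedly and the squared errors are averaged, can be sketched as below. The sketch is hedged in several ways: the population (size 2000, means 5, unit variances, ρ = 0.9), the helper names, and the two estimators compared (the plain MRSS mean and a classical ratio estimator with known μx) are illustrative assumptions and not the estimators or settings of the study. Fewer replications than the 5000 used in the study are run to keep the example quick.

```python
import math
import random
import statistics


def mrss_draw(pop, n, rng):
    """One MRSS cycle: n sets of n units ranked on x; keep each set's median unit."""
    picked = []
    for i in range(n):
        ranked = sorted(rng.sample(pop, n), key=lambda u: u[0])
        if n % 2 == 1:
            rank = (n + 1) // 2
        else:
            rank = n // 2 if i < n // 2 else (n + 2) // 2
        picked.append(ranked[rank - 1])
    return picked


def empirical_mse(estimator, pop, n, reps, seed=0):
    """Average squared error of an estimator of the mean of y over reps MRSS draws."""
    rng = random.Random(seed)
    mu_y = statistics.fmean(y for _, y in pop)
    total = 0.0
    for _ in range(reps):
        total += (estimator(mrss_draw(pop, n, rng)) - mu_y) ** 2
    return total / reps


# Hypothetical finite population of (x, y) pairs with correlation about 0.9.
gen = random.Random(1)
rho = 0.9
population = []
for _ in range(2000):
    z1, z2 = gen.gauss(0, 1), gen.gauss(0, 1)
    population.append((5 + z1, 5 + rho * z1 + math.sqrt(1 - rho ** 2) * z2))

mu_x = statistics.fmean(x for x, _ in population)  # treated as known


def sample_mean(sample):
    return statistics.fmean(y for _, y in sample)


def ratio_estimator(sample):
    x_bar = statistics.fmean(x for x, _ in sample)
    return sample_mean(sample) * mu_x / x_bar


# The same seed is used for both calls, so both estimators see the same samples.
mse_mean = empirical_mse(sample_mean, population, n=5, reps=2000)
mse_ratio = empirical_mse(ratio_estimator, population, n=5, reps=2000)
print(mse_mean, mse_ratio)
```

Using a common seed for all estimators makes the comparison paired, which reduces the Monte Carlo noise in the difference between the averaged MSE values.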
3.3 Simulation with unknown population descriptives
The simulation design described in the previous sub-section rests on the assumption that all population descriptives of both the supplementary and the subject variables that appear in the equations of the estimators are known, except for μy. Although this situation is of theoretical interest, its usefulness may be questionable in real studies where some of the descriptives of the supplementary variable are unknown. Thus, we have additionally investigated the performance of the proposed estimators under the realistic circumstances in which the aforementioned population descriptives are unknown and must be estimated because no reliable guess is available from past data, expert advice or a pilot study. In such a situation, the estimates of the target parameters are affected by an additional source of variability, and their behaviour may differ from the case where the population descriptives are assumed to be given. To shed light on the matter, we assess the effect on the efficiency of the reviewed and proposed estimators when all population descriptives, except for μx, are estimated on the basis of samples of different sizes. It is worth mentioning that the estimated optimum weights of the considered estimators are used in this situation.
3.4 Real life application
In this section, the estimators addressed in the previous sections are assessed using real-life data. A framework similar to that of the previous section is adopted here. The data concern the production of corn and the extent of agricultural land. Production, measured in quintals, is the target variable, and the extent of agricultural land, measured in hectares, is the auxiliary variable (Source: Istat, Italian Statistical Institute). The data set contains N = 101 observations with ρxy = 0.91, μy = 96.48, μx = 171.93, σx = 99.71, σy = 44.38, q1 = 80.61, q2 = 157.52, q3 = 352.13 and skewness = 0.4335.
3.5 Interpretation of numerical results
The results of the design-based simulation and the real-life application are given in Tables 1–5. From these tables, we notice that all proposed estimators , outperform
,
,
,
,
,
,
. Moreover,
is always better than
. No consistent improvement is found between
. However, both of these are better compared to their reviewed counterparts
. We see that the proposed estimators are more effective than the reviewed counterparts.
The PRMSE values decrease as the correlation coefficient values increase. Also, for fixed values of the correlation coefficient and the same estimator, the MSE values decrease as the sample size increases for both true and estimated parameters. This examination reveals the superiority of the proposed estimators over the reviewed estimators. Among all estimators whose PRMSE values are reported in Tables 1–5, it can be noted that the
is more efficient in terms of the PRMSE. These results pertain to the circumstances considered in the numerical study conducted for the purposes of the paper. Nevertheless, we expect that similar behaviour could hold in other settings.
4 Conclusion
In the current study, we have proposed some new estimators for the mean of a subject variable when supplementary information is available. We explore the three-fold utilization of a single supplementary variable under the MRSS design. Our class generalizes the usual regression estimator and the difference estimator of [19], due to [18], under the MRSS design. We have also derived the asymptotic theoretical properties, namely the bias and the minimum MSE. These properties help to assess any underestimation of the population mean of the subject variable, i.e., μy. The proposed estimators are not exhaustive, but they can act as a barrier against the proliferation of equivalent proposals that may appear in the future. The efficiency of the proposed estimators has been explored by comparing them with different reviewed estimators, based on theoretical and practical examples. For this purpose, numerical studies on real and artificial populations have been conducted through Monte Carlo simulation. These numerical studies show the superiority of the proposed estimators over the existing ones. Thus, the new proposals, based on three-fold utilization of a single supplementary variable, are recommended for survey practitioners. In future studies, the proposed three-fold utilization can be extended to two-stage MRSS in light of the work of [12, 18, 34].
References
- 1. McIntyre GA. A method for unbiased selective sampling, using ranked sets. Australian journal of agricultural research. 1952;3(4):385–390.
- 2. Chen Z, Bai Z, Sinha B. Ranked set sampling: theory and applications. New York: Springer; 2004.
- 3. Muttlak HA. Median ranked set sampling. J Appl Stat Sci. 1997;6:245–255.
- 4. Jemain AA, Al-Omari AI. Multistage median ranked set samples for estimating the population mean. Pakistan Journal of Statistics. 2006;22(3):195.
- 5. Jemain AA, Al-Omari AI, Ibrahim K. Multistage median ranked set sampling for estimating the population median. Journal of Mathematics and Statistics. 2007;3(2):58–64.
- 6. Jemain AA, Al-Omari A, Ibrahim K. Modified ratio estimator for the population mean using double median ranked set sampling. Pakistan Journal of Statistics. 2008 Jul;24(3):217–226.
- 7. Al-Nasser AD, Al-Omari AI. Information theoretic weighted mean based on truncated ranked set sampling. Journal of Statistical Theory and Practice. 2015 Jun;9(2):313–329.
- 8. Al-Omari AI. A new measure of entropy of continuous random variable. Journal of Statistical Theory and Practice. 2016 Oct;10(4):721–735.
- 9. Al-Omari AI. Estimation of mean based on modified robust extreme ranked set sampling. Journal of Statistical Computation and Simulation. 2011 Aug;81(8):1055–1066.
- 10. Al-Omari AI, Bouza CN. Review of ranked set sampling: modifications and applications. Investigación Operacional. 2014 Sep;35(3):215–235.
- 11. Al-Omari AI, Gupta S. Double quartile ranked set sampling for estimating population ratio using auxiliary information. Pakistan Journal of Statistics. 2014 Oct;30(4):513–535.
- 12. Al-Omari AI. Ratio estimation of the population mean using auxiliary information in simple random sampling and median ranked set sampling. Statistics and Probability Letters. 2012 Nov;82(11):1883–1890.
- 13. Bhushan S, Kumar A. Novel log type class of estimators under ranked set sampling. Sankhya B. 2022a May;84(1):421–447.
- 14. Bhushan S, Kumar A. On optimal classes of estimators under ranked set sampling. Communications in Statistics—Theory and Methods. 2022b Apr;51(8):2610–2639.
- 15. Bhushan S, Kumar A. Predictive estimation approach using difference and ratio type estimators in ranked set sampling. Journal of Computational and Applied Mathematics. 2022c Aug;410:e114214.
- 16. Zamanzade E, Al-Omari AI. New ranked set sampling for estimating the population mean and variance. Hacettepe Journal of Mathematics and Statistics. 2016 Dec;45(6):1891–1905.
- 17. Zamanzade E, Vock M. Parametric tests of perfect judgment ranking based on ordered ranked set samples. REVSTAT-statistical journal. 2018 Oct;16(4):463–474.
- 18. Koyuncu N. New difference-cum-ratio and exponential type estimators in median ranked set sampling. Hacettepe Journal of Mathematics and Statistics. 2016 Feb;45(1):207–225.
- 19. Rao TJ. On certain methods of improving ratio and regression estimators. Communications in Statistics-Theory and Methods. 1991 Jan;20(10):3325–3340.
- 20. Bahl S, Tuteja R. Ratio and product type exponential estimators. Journal of information and optimization sciences. 1991 Jan;12(1):159–164.
- 21. Abid M, Abbas N, Riaz M. Improved modified ratio estimators of population mean based on deciles. Chiang Mai Journal of Science. 2016a Jan;43(1):1311–1323.
- 22. Abid M, Abbas N, Zafar NH, Lin Z. Enhancing the mean ratio estimators for estimating population mean using non-conventional location parameters. Revista Colombiana de Estadistica. 2016b Jan;39(1):63–79.
- 23. Shahzad U, Perri PF, Hanif M. A new class of ratio-type estimators for improving mean estimation of nonsensitive and sensitive variables by using supplementary information. Communications in Statistics- Simulation and Computation. 2019 Oct;48(9):2566–2585.
- 24. Bulut H, Zaman T. An improved class of robust ratio estimators by using the minimum covariance determinant estimation. Communications in Statistics- Simulation and Computation. 2022 May;51(5):2457–2463.
- 25. Zaman T, Bulut H. Modified ratio estimators using robust regression methods. Communications in Statistics-Theory and Methods. 2019 Apr;48(8):2039–2048.
- 26. Zaman T, Bulut H. Modified regression estimators using robust regression methods and covariance matrices in stratified random sampling. Communications in Statistics-Theory and Methods. 2020 Jul;49(14):3407–3420.
- 27. Shahzad U, Hanif M. Some imputation based new estimators of population mean under non-response. Journal of Statistics and Management Systems. 2019 Nov;22(8):1381–1399.
- 28. Ali N, Ahmad I, Hanif M, Shahzad U. Robust-regression-type estimators for improving mean estimation of sensitive variables by using auxiliary information. Communications in Statistics—Theory and Methods. 2021 Feb;50(4):979–992.
- 29. Hanif M, Shahzad U. Estimation of population variance using kernel matrix. Journal of Statistics and Management Systems. 2019 Apr;22(3):563–586.
- 30. Zaman T. Improvement of modified ratio estimators using robust regression methods. Applied Mathematics and Computation. 2019 May;348:627–631.
- 31. Shahzad U, Alnoor NH, Hanif M, Sajjad I, Anas MM. Imputation based mean estimators in case of missing data utilizing robust regression and variance-covariance matrices. Communications in Statistics- Simulation and Computation. 2020a.
- 32. Shahzad U, Alnoor NH, Hanif M, Sajjad I, Anas MM. Quantile regression-ratio-type estimators for mean estimation under complete and partial auxiliary information. Scientia Iranica. 2020b.
- 33. Shahzad U, Ahmad I, Almanjahie I, Al-Noor NH, Hanif M. A new class of L-Moments based calibration variance Estimators. Computers Materials and Continua. 2021 Jan;66(3):3013–3028.
- 34. Mahdizadeh M, Zamanzade E. Confidence Intervals for Quantiles in Ranked Set Sampling. Iranian Journal of Science and Technology, Transactions A: Science. 2019 Dec;43(6):3017–3028.