Abstract
In this article, we have proposed a generalized estimator for mean estimation by combining the ratio and regression methods of estimation in the presence of auxiliary information using systematic sampling. We incorporated some robust parameters of the auxiliary variable to obtain precise estimates from the proposed estimator. The mathematical expressions for the bias and mean square error of the proposed estimator are derived under large sample approximation. Many other generalized ratio and product-type estimators are obtained from the proposed estimator using different choices of scalar constants. Some special cases are also discussed in which the proposed generalized estimator reduces to the usual mean, classical ratio, product, and regression type estimators. Mathematical conditions are obtained under which the proposed estimator is more precise than the competing estimators mentioned in this article. The efficiency of the proposed estimator is evaluated using four populations. The results showed that the proposed estimator is efficient and useful for survey sampling in comparison to the other existing estimators.
Citation: Shaheen N, Qureshi MN, Alamri OA, Hanif M (2023) Optimized inferences of finite population mean using robust parameters in systematic sampling. PLoS ONE 18(1): e0278619. https://doi.org/10.1371/journal.pone.0278619
Editor: Qichun Zhang, University of Bradford, UNITED KINGDOM
Received: June 30, 2022; Accepted: November 20, 2022; Published: January 23, 2023
Copyright: © 2023 Shaheen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
In survey sampling, it is well known that no single estimation technique always provides the best results for populations with different characteristics under different conditions. In many real-life situations, researchers prefer sampling techniques that give better estimates with limited time, cost, and effort. Compared with other sampling designs, systematic sampling is often considered a better choice, as it is easy to apply and can increase the precision of the estimators. In systematic sampling, the first unit is selected at random and the remaining units are then selected according to a fixed rule. Madow and Madow [1] first studied the theory of systematic sampling and considered it the most frequently used probability sampling design for population parameter estimation because of its simplicity. Cochran [2] reviewed the applications of systematic sampling and concluded that “apart from being easy to implement, systematic sampling provides more efficient estimators as compared to the simple random sampling or stratified sampling for certain types of populations”. Finney [3] and Zinger [4] discussed the application of systematic sampling for natural populations like forests. Cochran [5] cited many applications of systematic sampling in forestry, agriculture, and land surveys, and suggested that it provides implicit stratification, and hence better estimates, when the sampling frame is in some specific order.
In survey sampling, researchers collect information on variable(s) that are correlated (either positively or negatively) with the main variable of interest. Such auxiliary variable(s) are commonly used along with the main study variable to improve the efficiency of the estimator(s) of population parameter(s) such as the mean, total, variance, and proportion. Usually, ratio-type estimators are considered when the correlation between the study and the auxiliary variables is positive, and product-type estimators when it is negative. Moreover, many authors have used conventional and non-conventional parameters of the auxiliary variables to increase the precision of the estimators. Several authors have used systematic sampling when auxiliary information was available along with the study variable. Swain [6] suggested a ratio estimator whereas Shukla [7] proposed the product estimator using systematic sampling. Singh [8] suggested a ratio-cum-product type estimator in systematic sampling design. Srivastava and Jhajj [9] proposed a class of estimators for the population mean using multi-auxiliary variables whereas Kushwaha and Singh [10] recommended a family of almost unbiased ratio and product estimators in systematic sampling. Banarasi et al. [11] proposed a family of ratio, product, and difference-type estimators using systematic sampling. Singh and Singh [12] proposed unbiased ratio and product type estimators in systematic sampling. Recently, some estimators were proposed using the exponent of the auxiliary variable by Singh et al. [13], Singh and Jatwa [14], Singh and Solanki [15], Tailor et al. [16], Khan and Singh [17], Khan [18] and Qureshi et al. [19].
In this article, we have proposed a generalized mean estimator using the auxiliary information with the expectation that the proposed estimator will perform better than the competing estimators. We have used some robust parameters associated with the auxiliary variable such as coefficient of kurtosis (β2x), upper quartile (Q3x), mid-range (Mx), Hodges-Lehmann (Hx), tri-mean (Tx), Gini’s mean difference (Gx), Downton’s method (Dx), probability-weighted moments (Px) and deciles mean (DMx). The generalized estimator also produces a family of sub-generalized estimators and sub-families of ratio and product-type estimators. In Section 2, we have discussed sampling methodology, notations, and some associated estimators. In Section 3, we derived the expressions for bias and MSE of the proposed estimator. Many special cases have been discussed in which the proposed estimator reduces to ratio and product-type estimators. Mathematical comparisons of the proposed estimator with the existing estimators are given in the same Section. An extensive numerical study is conducted using two real and two artificial populations in Section 4. A brief discussion of the paper is given in Section 5.
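To make some of the robust parameters listed above concrete, the following Python sketch computes the mid-range, tri-mean, Hodges-Lehmann estimator, Gini's mean difference, and deciles mean of an auxiliary variable. It uses common textbook definitions, which are assumptions here and may differ in detail from those in Appendix E; in particular, the quantile convention is the default of Python's `statistics.quantiles`. The function names and the toy data are hypothetical.

```python
import statistics
from itertools import combinations, combinations_with_replacement

def mid_range(x):
    """Average of the smallest and largest values."""
    return (min(x) + max(x)) / 2

def tri_mean(x):
    """(Q1 + 2*Q2 + Q3) / 4, with quartiles from statistics.quantiles."""
    q1, q2, q3 = statistics.quantiles(x, n=4)
    return (q1 + 2 * q2 + q3) / 4

def hodges_lehmann(x):
    """Median of the Walsh averages (x_i + x_j) / 2 over all pairs i <= j."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(x, 2)]
    return statistics.median(walsh)

def gini_mean_difference(x):
    """Mean absolute difference over all distinct pairs of units."""
    N = len(x)
    total = sum(abs(a - b) for a, b in combinations(x, 2))
    return 2 * total / (N * (N - 1))

def decile_mean(x):
    """Average of the nine deciles D1, ..., D9."""
    return statistics.mean(statistics.quantiles(x, n=10))

# Hypothetical auxiliary data, for illustration only.
x_skewed = [1, 2, 4, 8]
x_symmetric = list(range(1, 10))

print(mid_range(x_skewed), hodges_lehmann(x_skewed), gini_mean_difference(x_skewed))
print(tri_mean(x_symmetric), decile_mean(x_symmetric))
```

For symmetric data the location measures coincide with the mean, while for skewed data they differ, which is why they are of interest when the auxiliary variable contains extreme values.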
2. The methodology of systematic sampling and classical estimators
Consider a population P = (P1, P2, …, Pj, …, PN) of N distinct units arranged in some specific order and labeled from 1 to N, where Pj denotes the jth unit. Suppose further that N is the product of two positive integers n and k, so that N = nk. Let Y be the main variable of interest and X an auxiliary variable, with population values y = (y1, y2, …, yj, …, yN) and x = (x1, x2, …, xj, …, xN) indexed by the unit labels. Let yij and xij be the values of the study variable and the auxiliary variable on the jth unit (j = 1, 2, 3, …, n) of the ith systematic sample (i = 1, 2, 3, …, k). To select a sample of size n, we draw a random number i between 1 and k and then take every kth unit thereafter, so that the sample consists of the units labeled i, i+k, i+2k, …, i+(n-1)k. Consequently, there are k possible samples in total, each of size n.
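The selection scheme just described can be sketched in a few lines of Python. This is an illustration only, assuming N = nk exactly; the function names and the toy population are hypothetical. It lists all k possible systematic samples and checks that the average of the k sample means equals the population mean, which is why the sample mean is unbiased under this design.

```python
import random

def systematic_samples(values, n):
    """Return all k = N / n systematic samples of size n (requires N = n * k)."""
    N = len(values)
    if n <= 0 or N % n != 0:
        raise ValueError("N must be a positive multiple of n")
    k = N // n
    # The sample with 1-based start i holds the units labeled i, i+k, ..., i+(n-1)k.
    return [[values[start + j * k] for j in range(n)] for start in range(k)]

def draw_systematic_sample(values, n, rng=random):
    """Draw one systematic sample by picking a random start between 1 and k."""
    k = len(values) // n
    start = rng.randint(1, k)  # random start, 1-based, both ends inclusive
    return systematic_samples(values, n)[start - 1]

# Hypothetical toy population with N = 12 and n = 3, hence k = 4.
y = [3, 7, 1, 9, 4, 8, 2, 6, 5, 10, 12, 11]
samples = systematic_samples(y, 3)
sample_means = [sum(s) / len(s) for s in samples]

print(samples)
print(sum(sample_means) / len(sample_means), sum(y) / len(y))
```

Because every unit belongs to exactly one of the k samples and each sample is equally likely, averaging the sample means over all starts reproduces the population mean.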
The means and variances for the study and the auxiliary variables for systematic samples may be obtained as
and
and
The usual mean estimator, which uses no auxiliary information, and its variance are given by
where
and
The ratio and product estimators proposed by Swain [6] and Shukla [7] under systematic sampling design are defined as
The MSE expressions for the estimators µr,sys and µp,sys are given by
where
where
and
Here ρy and ρx are the intra-class correlations of the study and the auxiliary variables, Cy and Cx are the corresponding coefficients of variation, and ρyx is the correlation coefficient between the study and the auxiliary variables.
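For reference, the quantities named in this section are commonly written as follows in the systematic sampling literature. This is a sketch in standard textbook notation, not a transcription of the authors' displays: the symbols θ, ρ*y, ρ*x, ρ* and K are assumptions about notation, and Cy and Cx are taken to be based on the population variances with divisor N − 1.

```latex
% Sample means of the i-th systematic sample (i = 1, ..., k)
\bar{y}_{sys} = \frac{1}{n}\sum_{j=1}^{n} y_{ij}, \qquad
\bar{x}_{sys} = \frac{1}{n}\sum_{j=1}^{n} x_{ij}

% Variance of the usual mean estimator
\operatorname{Var}(\bar{y}_{sys}) = \theta\,\bar{Y}^{2}\rho_{y}^{*}C_{y}^{2},
\qquad \theta = \frac{N-1}{nN}, \qquad \rho_{y}^{*} = 1 + (n-1)\rho_{y}

% Classical ratio (Swain) and product (Shukla) estimators
\mu_{r,sys} = \bar{y}_{sys}\,\frac{\bar{X}}{\bar{x}_{sys}}, \qquad
\mu_{p,sys} = \bar{y}_{sys}\,\frac{\bar{x}_{sys}}{\bar{X}}

% First-order mean square errors
\operatorname{MSE}(\mu_{r,sys}) \approx \theta\,\bar{Y}^{2}\rho_{x}^{*}
  \left[\rho^{*2}C_{y}^{2} + \left(1 - 2K\rho^{*}\right)C_{x}^{2}\right]

\operatorname{MSE}(\mu_{p,sys}) \approx \theta\,\bar{Y}^{2}\rho_{x}^{*}
  \left[\rho^{*2}C_{y}^{2} + \left(1 + 2K\rho^{*}\right)C_{x}^{2}\right]

% Auxiliary constants
\rho_{x}^{*} = 1 + (n-1)\rho_{x}, \qquad
\rho^{*} = \sqrt{\rho_{y}^{*}/\rho_{x}^{*}}, \qquad
K = \rho_{yx}\,\frac{C_{y}}{C_{x}}
```

The only difference between the two MSE expressions is the sign of the term in K, which is why the ratio estimator gains from a positive ρyx and the product estimator from a negative one.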
3. Proposed generalized estimator
In this section, we have proposed a generalized estimator using an auxiliary variable for the estimation of population mean by combining modified ratio and regression type estimators under systematic sampling as
(1)
With
and
where bsys is the sample regression coefficient between the study variable and the auxiliary variable, and v is a constant that yields ratio-type and product-type estimators when set to 1 and -1, respectively. Here λ is a constant chosen to minimize the MSE. Further, α (α ≠ 0) and γ may be real numbers or robust parameters (defined in Appendix E) of the auxiliary variable X.
Some notations are necessary to derive the bias and the MSE. Let
and
and
Using the above notations, the proposed generalized estimator in (1) may be written as
(2)
Simplifying and expanding the above equation using the Taylor series up to the first order, we have
where
and
After simplification, taking the expectation of the above equation gives
The mathematical expression of bias of the proposed estimator is
where
We rewrite (2) using a Taylor expansion up to the first order:
Squaring the above equation and taking expectations, we get
The first-order MSE of the proposed estimator is
(3)
3.1 Optimum choice for scalar “λ”
To minimize the MSE of the proposed estimator, we differentiate (3) with respect to λ and equate the derivative to zero, which gives the optimum value of the scalar λ as
The minimum MSE of the proposed estimator is given by
Its sample estimate can be obtained by replacing the unknown population quantities, including ρyx and λopt, with their sample estimates, such as ryx.
Note that the proposed estimator may reduce to some classical estimators using different values of bsys, v, α, γ, and λ as shown in Table 1.
3.2 Algebraic comparisons
In this sub-section, we make some algebraic comparisons to obtain the conditions under which the proposed estimator outperforms the competing estimators.
The proposed estimator performs better than the basic mean estimator if
or
The proposed estimator performs better than the usual ratio estimator if
or
The proposed estimator performs better than the product-type estimator if
or
4. Simulation study
To assess the proposed estimator and its sub-cases, a simulation study is carried out for each population separately. The statistics of all the populations are presented in Appendix B. The mathematical formulae to compute the absolute biases (ABs) and percentage relative efficiencies (PREs) for all the estimators are defined as
and
where
and
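As an illustration of how these two measures are typically computed from simulated estimates, the Python sketch below uses common conventions, which are assumptions here: AB is the absolute difference between the average of the simulated estimates and the true population mean, and PRE is 100 times the ratio of the simulated MSE of the usual mean estimator to that of the estimator under study. The function names and the numbers are hypothetical.

```python
def absolute_bias(estimates, true_mean):
    """Absolute difference between the average estimate and the true mean."""
    return abs(sum(estimates) / len(estimates) - true_mean)

def simulated_mse(estimates, true_mean):
    """Average squared deviation of the estimates from the true mean."""
    return sum((e - true_mean) ** 2 for e in estimates) / len(estimates)

def percent_relative_efficiency(estimates, baseline_estimates, true_mean):
    """PRE of an estimator relative to a baseline (assumed: the usual mean estimator)."""
    return 100.0 * simulated_mse(baseline_estimates, true_mean) / simulated_mse(estimates, true_mean)

# Hypothetical simulated estimates around a true mean of 10.
true_mean = 10
baseline = [8, 12, 9, 11]    # e.g. the usual mean estimator
candidate = [9, 11, 10, 11]  # e.g. an estimator using auxiliary information

print(absolute_bias(candidate, true_mean))
print(percent_relative_efficiency(candidate, baseline, true_mean))
```

Under this convention a PRE above 100 means the candidate has a smaller simulated MSE than the baseline.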
Sources of population
Four populations have been used to obtain the ABs and PREs of all the estimators, with 50,000 independent simulations for each. The sources of the populations are
- Source-I:
- Cochran [5]
- Y = Food Cost; X = No. of persons in a family
- N = 33 and n = 3.
- Source-II:
- Murthy [20]
- Y = Timber Volume; X = Strip-wise Length (cents)
- N = 176 and n = 24.
- Source-III:
- A bivariate population for (Y, X) is generated in R from the uniform distribution with parameters 2 and 50.
- Source-IV:
- A bivariate normal population of size N = 1000 (n = 200) is generated in R with a given mean vector and variance-covariance matrix.
The computation of the ABs and the PREs has been coded in R.
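A minimal sketch of such a simulation is given below, written in Python rather than R and not taken from the authors' code. It repeatedly draws a systematic sample with a random start, computes the usual mean estimator and the classical ratio estimator, and reports their ABs and PREs. The population (X uniform on 2 to 50, Y = 3X plus noise), the sizes N = 1000 and n = 50, the number of replications, and all names are hypothetical choices made for illustration; the proposed generalized estimator is not implemented here.

```python
import random

def simulate_ab_pre(y, x, n, reps=5000, seed=2023):
    """Monte Carlo AB and PRE of the mean and classical ratio estimators."""
    rng = random.Random(seed)
    N = len(y)
    k = N // n                      # sampling interval; assumes N = n * k
    Y_bar, X_bar = sum(y) / N, sum(x) / N
    estimates = {"mean": [], "ratio": []}
    for _ in range(reps):
        start = rng.randrange(k)    # 0-based random start
        idx = range(start, N, k)    # the start unit and every kth unit after it
        y_bar = sum(y[i] for i in idx) / n
        x_bar = sum(x[i] for i in idx) / n
        estimates["mean"].append(y_bar)
        estimates["ratio"].append(y_bar * X_bar / x_bar)

    def mse(est):
        return sum((e - Y_bar) ** 2 for e in est) / len(est)

    base = mse(estimates["mean"])   # baseline: the usual mean estimator
    return {
        name: {"AB": abs(sum(est) / len(est) - Y_bar),
               "PRE": 100.0 * (base / mse(est))}
        for name, est in estimates.items()
    }

# Hypothetical positively correlated population, for illustration only.
pop_rng = random.Random(1)
x = [pop_rng.uniform(2, 50) for _ in range(1000)]
y = [3.0 * xi + pop_rng.gauss(0, 1) for xi in x]

results = simulate_ab_pre(y, x, n=50)
for name, stats in results.items():
    print(name, stats)
```

With a strongly positive correlation between Y and X, as in this toy population, the ratio estimator is expected to show a PRE well above 100; a product-type estimator would be the natural comparison for a negatively correlated population.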
From the numerical illustrations presented in Table 2, it is noticed that the generalized estimator is more efficient than the competing estimators in terms of ABs and PREs for all populations. Many special cases of ratio-type estimators are computed for populations I-III and of product-type estimators for population IV (Appendix C). We have used v = 1 for populations I-III and v = -1 for population IV and observed that the ratio-type estimators performed well for the populations with positive correlation, whereas the product-type estimators performed better for the population in which the auxiliary variable is negatively correlated with the study variable. Overall, the results revealed that the efficiency of the proposed estimator is very high compared with the existing estimators for all populations.
5. Conclusion
Systematic sampling is often considered to be very useful for populations in disciplines such as agriculture, environmental science, ecology, forestry, and marine science. It is also applied outside the aforementioned fields, as it is easy to understand and simple to execute. In the present study, our primary concern is to suggest a generalized estimator for the estimation of the population mean using systematic sampling. The proposed estimator can produce several other estimators as special cases using different choices of defining parameters. We further suggested several efficient estimators as special cases of the proposed estimator using known parameters associated with the auxiliary variable. The conditions under which the proposed estimator provides more precise estimates than the existing estimators are also derived. We have carried out the numerical comparison using two real and two artificial populations. The proposed estimator along with its sub-estimators has shown greater efficiency than the sample mean, classical ratio, and product estimators for all populations. With v = 1 and v = -1, the proposed estimator performs better than the ratio and product estimators, respectively. On the basis of the theoretical comparisons and numerical findings, we suggest that the proposed estimator is very efficient and useful for mean estimation using systematic sampling.
Only a single auxiliary variable is considered in this research to propose the generalized estimator for mean estimation. In future work, multiple auxiliary variables could be incorporated to propose a more general estimator in the presence of measurement error and nonresponse using systematic sampling.
Supporting information
S1 Appendix. This file contains multiple tables.
https://doi.org/10.1371/journal.pone.0278619.s001
(DOCX)
Acknowledgments
The authors are deeply thankful to the Editor-in-Chief and the anonymous reviewers for their valuable comments, which improved the quality of the paper.
References
- 1. Madow W. G. and Madow L. H. (1944). On the theory of systematic sampling, Annals of Mathematical Statistics, 15, 1–24.
- 2. Cochran W. G. (1946). Relative accuracy of systematic and stratified random samples for a certain class of populations, The Annals of Mathematical Statistics, 17, 164–177.
- 3. Finney D.J. (1948). Random and systematic sampling in timber surveys. Forestry, 22, 64–99.
- 4. Zinger A. (1964). Systematic sampling in forestry, Biometrics, 20(3), 553–565.
- 5. Cochran W.G. (1977). Sampling Techniques. New York: Wiley.
- 6. Swain A.K.P.C. (1964). The use of systematic sampling ratio estimate, Journal of the Indian Statistical Association, 2, 160–164.
- 7. Shukla N. D. (1971). Systematic sampling and product method of estimation, in Proceeding of the all-India Seminar on Demography and Statistics (Varanasi, India, 1971).
- 8. Singh M. P. (1967). Ratio-cum-product method of estimation, Metrika, 12, 34–42.
- 9. Srivastava K. and Jhajj H. S. (1983). A class of estimators of the population mean using multi-auxiliary information, Calcutta Statistical Association Bulletin, 32, 47–56.
- 10. Kushwaha K. S. and Singh H. P. (1989). Class of almost unbiased ratio and product estimators in systematic sampling, Journal of the Indian Society of Agricultural Statistics, 41(2), 193–205.
- 11. Banarasi S., Kushwaha N. S. and Kushwaha K. S. (1993). A class of ratio, product and difference-type estimators in systematic sampling, Microelectronics Reliability, 33(4), 455–457.
- 12. Singh H. P. and Singh R. (1998). Almost unbiased ratio and product type estimators in systematic sampling, Questiio, 22(3), 403–416.
- 13. Singh H. P., Tailor R. and Jatwa N. K. (2011). Modified ratio and product estimators for population mean in systematic sampling, Journal of Modern Applied Statistical Methods, 10(2), 424–435.
- 14. Singh H. P. and Jatwa N. K. (2012). A class of exponential type estimators in systematic sampling, Economic Quality Control, 27(2), 195–208.
- 15. Singh H. P. and Solanki R. S. (2012). An efficient class of estimators for the population mean using auxiliary information in systematic sampling, Journal of Statistical Theory and Practice, 6(2), 274–285.
- 16. Tailor T., Jatwa N. and Singh H. P. (2013). A ratio-cum-product estimator of finite population mean in systematic sampling, Statistics in Transition, 14(3), 391–398.
- 17. Khan M. and Singh R. (2015). Estimation of population mean in chain ratio-type estimator under systematic sampling, Journal of Probability and Statistics, Article ID248374, 2 pages.
- 18. Khan M. (2015). A general class of exponential type estimator for population mean under systematic sampling using two auxiliary variables, Journal of Probability and Statistics, Article ID2374837, 6 pages.
- 19. Qureshi M.N., Khalil S. and Hanif M. (2018). Generalized Semi-Exponential type estimator in systematic sampling, 17(2), 283–290.
- 20. Murthy M.N. (1967). Sampling Theory and Methods. Statistical Publishing Society, Calcutta 35, India.