Optimized inferences of finite population mean using robust parameters in systematic sampling

  • Nazia Shaheen,

    Roles Formal analysis, Methodology, Visualization, Writing – original draft

    Affiliation Department of Statistics, National College of Business Administration and Economics, Lahore, Pakistan

  • Muhammad Nouman Qureshi,

    Roles Investigation, Software, Validation, Visualization, Writing – review & editing

    nqureshi633@gmail.com

    Affiliations Department of Statistics, National College of Business Administration and Economics, Lahore, Pakistan, School of Statistics, University of Minnesota, Minnesota, Minneapolis, United States of America

  • Osama Abdulaziz Alamri,

    Roles Conceptualization, Funding acquisition, Resources, Visualization

    Affiliation Department of Statistics, University of Tabuk, Tabuk, Saudi Arabia

  • Muhammad Hanif

    Roles Conceptualization, Investigation, Supervision, Validation, Writing – review & editing

    Affiliation Department of Statistics, National College of Business Administration and Economics, Lahore, Pakistan

Abstract

In this article, we propose a generalized estimator of the finite population mean that combines the ratio and regression methods of estimation in the presence of auxiliary information under systematic sampling. We incorporate some robust parameters of the auxiliary variable to obtain precise estimates from the proposed estimator. The mathematical expressions for the bias and mean square error of the proposed estimator are derived under large sample approximation. Many other generalized ratio and product-type estimators are obtained from the proposed estimator using different choices of scalar constants. Some special cases are also discussed in which the proposed generalized estimator reduces to the usual mean, classical ratio, product, and regression type estimators. Mathematical conditions are obtained under which the proposed estimator is more precise than the competing estimators mentioned in this article. The efficiency of the proposed estimator is evaluated using four populations. The results show that the proposed estimator is efficient and useful for survey sampling in comparison to the other existing estimators.

1. Introduction

In survey sampling, it is well known that no single estimation technique always provides the best results for populations with different characteristics under different situations. In many real-life situations, researchers are inclined to use sampling techniques that provide better estimation results with limited time, cost, and effort. Compared to other sampling designs, systematic sampling is often considered a better choice, as it is easy to apply and can increase the precision of the estimators. In systematic sampling, the first unit is selected at random and the remaining units are then selected according to a fixed rule. Madow and Madow [1] first studied the theory of systematic sampling and considered it the most frequently used probability sampling design for population parameter estimation because of its simplicity. Cochran [2] reviewed the applications of systematic sampling and concluded that “apart from being easy to implement, systematic sampling provides more efficient estimators as compared to the simple random sampling or stratified sampling for certain types of populations”. Finney [3] and Zinger [4] discussed the application of systematic sampling to natural populations such as forests. Cochran [5] cited many applications of systematic sampling in forestry, agriculture, and land surveys, and suggested that it can provide implicit stratification and produce better estimates when the sampling frame is in some specific order.

In survey sampling, researchers collect information on variable(s) that are correlated (either positively or negatively) with the main variable of interest. The use of such variable(s) along with the main study variable is very common to improve the efficiency of the estimator(s) of population parameter(s) such as the mean, total, variance, and proportion. Usually, ratio-type estimators are considered when the correlation between the study and the auxiliary variables is positive, and product-type estimators may be used when it is negative. Moreover, many authors have used conventional and non-conventional parameters of the auxiliary variables to increase the precision of the estimators. Several authors have used systematic sampling in situations where auxiliary information was available along with the study variable. Swain [6] suggested a ratio estimator whereas Shukla [7] proposed the product estimator using systematic sampling. Singh [8] suggested a ratio-cum-product type estimator in systematic sampling design. Srivastava and Jhajj [9] proposed a class of estimators for population mean using multi-auxiliary variables whereas Kushwaha and Singh [10] recommended a family of almost unbiased ratio and product estimators in systematic sampling. Banarasi et al. [11] proposed a family of ratio, product, and difference-type estimators using systematic sampling. Singh and Singh [12] proposed unbiased ratio and product type estimators in systematic sampling. Recently, some estimators were proposed using the exponent of the auxiliary variable by Singh et al. [13], Singh and Jatwa [14], Singh and Solanki [15], Tailor et al. [16], Khan and Singh [17], Khan [18] and Qureshi et al. [19].

In this article, we have proposed a generalized mean estimator using the auxiliary information with the expectation that the proposed estimator will perform better than the competing estimators. We have used some robust parameters associated with the auxiliary variable such as coefficient of kurtosis (β2x), upper quartile (Q3x), mid-range (Mx), Hodges-Lehmann (Hx), tri-mean (Tx), Gini’s mean difference (Gx), Downton’s method (Dx), probability-weighted moments (Px) and deciles mean (DMx). The generalized estimator also produces a family of sub-generalized estimators and sub-families of ratio and product-type estimators. In Section 2, we have discussed sampling methodology, notations, and some associated estimators. In Section 3, we derived the expressions for bias and MSE of the proposed estimator. Many special cases have been discussed in which the proposed estimator reduces to ratio and product-type estimators. Mathematical comparisons of the proposed estimator with the existing estimators are given in the same Section. An extensive numerical study is conducted using two real and two artificial populations in Section 4. A brief discussion of the paper is given in Section 5.
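To make some of the robust parameters listed above concrete, the following is a minimal Python sketch of five of them computed from the values of the auxiliary variable. It uses common textbook definitions, which are an assumption here: the exact formulas adopted in this article are those of Appendix E and may differ (for example in the quartile convention or in the form of Gini's mean difference). All function names are hypothetical.

```python
from itertools import combinations
from statistics import median, quantiles


def mid_range(x):
    """Average of the smallest and largest values."""
    return (min(x) + max(x)) / 2


def tri_mean(x):
    """Tukey's tri-mean (Q1 + 2*Q2 + Q3) / 4, using the 'inclusive' quartile rule."""
    q1, q2, q3 = quantiles(x, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


def hodges_lehmann(x):
    """Median of all pairwise averages (x_i + x_j) / 2 with i <= j."""
    xs = list(x)
    averages = [(xs[i] + xs[j]) / 2
                for i in range(len(xs)) for j in range(i, len(xs))]
    return median(averages)


def gini_mean_difference(x):
    """Classical pairwise form: mean absolute difference over all distinct pairs."""
    diffs = [abs(a - b) for a, b in combinations(x, 2)]
    return sum(diffs) / len(diffs)


def decile_mean(x):
    """Mean of the nine deciles D1, ..., D9 ('inclusive' rule)."""
    deciles = quantiles(x, n=10, method="inclusive")
    return sum(deciles) / len(deciles)


# Hypothetical auxiliary values, for illustration only.
x = [2, 4, 4, 5, 7, 9, 10, 12, 15]
for f in (mid_range, tri_mean, hodges_lehmann, gini_mean_difference, decile_mean):
    print(f.__name__, f(x))
```

The location-type quantities (mid-range, tri-mean, Hodges-Lehmann, deciles mean) are less sensitive to extreme observations than the ordinary mean, which is the motivation for calling them robust.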

2. The methodology of systematic sampling and classical estimators

Consider a population P = (P1, P2, …, Pj, …, PN) composed of N distinct elements arranged in some specific order and labeled from 1 to N, where Pj denotes the jth element of P. Further suppose that N is the product of two positive integers n and k, such that N = nk. Let Y be the main variable of interest and X be an auxiliary variable, with population values y = (y1, y2, …, yj, …, yN) and x = (x1, x2, …, xj, …, xN), respectively. Let yij and xij be the values of the study variable and the auxiliary variable on the jth (j = 1, 2, 3, …, k) unit in the ith (i = 1, 2, 3, …, n) systematic sample. To select a sample of size n, we draw a random number between 1 and k (say j) and then select every kth unit thereafter, so that the sample consists of the units labeled j, j+k, j+2k, …, j+(n-1)k. Consequently, there are k possible samples, each of size n.
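The selection rule described above can be sketched in a few lines of Python. This is only an illustration under the stated assumption N = nk; it uses 0-based positions (so the random start lies in 0, …, k-1 rather than 1, …, k), and the function names are hypothetical.

```python
import random


def systematic_sample_indices(N, n, start):
    """0-based positions start, start + k, ..., start + (n - 1) * k with k = N // n."""
    if n <= 0 or N % n != 0:
        raise ValueError("this sketch assumes N = n * k for positive integers n and k")
    k = N // n
    if not 0 <= start < k:
        raise ValueError("start must be one of 0, 1, ..., k - 1")
    return [start + i * k for i in range(n)]


def all_systematic_samples(N, n):
    """The k possible systematic samples, one for each random start."""
    return [systematic_sample_indices(N, n, start) for start in range(N // n)]


def draw_systematic_sample(N, n, rng=random):
    """Pick the random start uniformly from 0..k-1 and return the selected positions."""
    return systematic_sample_indices(N, n, rng.randrange(N // n))


# N = 33 and n = 3 give k = 11 possible samples of size 3.
samples = all_systematic_samples(33, 3)
print(len(samples), samples[0], samples[-1])
```

Because every unit belongs to exactly one of the k samples and each sample is chosen with probability 1/k, each unit has the same inclusion probability n/N.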

The means and variances for the study and the auxiliary variables for systematic samples may be obtained as and and

The mean estimator (without having the auxiliary information) along with the variance equation is given by where and

The ratio and product estimators proposed by Swain [6] and Shukla [7] under systematic sampling design are defined as

The expressions for MSE for the estimators µr,sys and µp,sys are given by where where and Here ρy and ρx are the intra-class correlations of the study and the auxiliary variables, Cy and Cx are their coefficients of variation, and ρyx is the correlation coefficient between the study and the auxiliary variables.
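As a numerical complement, the sketch below evaluates the sample mean together with the classical ratio and product estimators, taken here in their usual textbook forms (sample mean of Y times X̄ divided by the sample mean of X for the ratio estimator, and sample mean of Y times the sample mean of X divided by X̄ for the product estimator). These forms and all names in the code are assumptions for illustration. Since systematic sampling admits only k possible samples, the design-based bias and MSE can be obtained exactly by enumeration instead of through the first-order approximations.

```python
from statistics import mean


def systematic_samples(values, n):
    """All k = len(values) // n systematic samples (assumes len(values) == n * k)."""
    k = len(values) // n
    return [values[start::k] for start in range(k)]


def mean_estimator(ys, xs, X_bar):
    return mean(ys)


def ratio_estimator(ys, xs, X_bar):
    # Classical ratio form: sample mean of y scaled by X_bar / sample mean of x.
    return mean(ys) * X_bar / mean(xs)


def product_estimator(ys, xs, X_bar):
    # Classical product form: sample mean of y scaled by sample mean of x / X_bar.
    return mean(ys) * mean(xs) / X_bar


def exact_bias_mse(estimator, y, x, n):
    """Exact design-based bias and MSE over the k equally likely samples."""
    Y_bar, X_bar = mean(y), mean(x)
    estimates = [estimator(ys, xs, X_bar)
                 for ys, xs in zip(systematic_samples(y, n), systematic_samples(x, n))]
    bias = mean(estimates) - Y_bar
    mse = mean([(e - Y_bar) ** 2 for e in estimates])
    return bias, mse


# Hypothetical toy population with y exactly proportional to x (N = 12, n = 3, k = 4).
x = list(range(1, 13))
y = [2 * v for v in x]
for est in (mean_estimator, ratio_estimator, product_estimator):
    print(est.__name__, exact_bias_mse(est, y, x, 3))
```

In this toy population the ratio estimator reproduces the population mean in every sample, whereas the product estimator performs worse than the plain sample mean, in line with the usual guidance on positively correlated auxiliary variables.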

3. Proposed generalized estimator

In this section, we have proposed a generalized estimator using an auxiliary variable for the estimation of population mean by combining modified ratio and regression type estimators under systematic sampling as (1) with and where bsys is the regression coefficient between the sample observations of the study variable and the auxiliary variable, and v is a suitable constant that produces ratio-type and product-type estimators by taking the values 1 and -1, respectively. Here λ is a constant chosen to minimize the MSE. Further, α (α ≠ 0) and γ may be real numbers or different robust parameters (defined in Appendix E) of the auxiliary variable X.

Some notations are necessary to derive the bias and the MSE. Let and and

Using the above notations, the proposed generalized estimator in (1) may be written as (2)

Simplifying and expanding the above equation using the Taylor series up to the first order, we have where and

After simplification, we applied expectation to the above equation, and we get

The mathematical expression of bias of the proposed estimator is where

We re-write (2) by using Taylor expansion up to the first-order

On squaring and applying the expectation of the above equation, we get

The first-order MSE of the proposed estimator is (3)

3.1 Optimum choice for scalar “λ”

To minimize the MSE of the proposed estimator, we differentiate (3) with respect to λ and equate the derivative to zero, which gives the optimum value of the scalar λ as

The minimum MSE of the proposed estimator is given by

The estimated minimum MSE can be obtained by replacing , , ρyx and λopt with their estimates , , ryx and respectively.

Note that the proposed estimator may reduce to some classical estimators using different values of bsys, v, α, γ, and λ as shown in Table 1.

Table 1. Classical estimators as the family of the proposed estimator.

https://doi.org/10.1371/journal.pone.0278619.t001

3.2 Algebraic comparisons

In this sub-section, we make some algebraic comparisons to obtain the conditions under which the proposed estimator outperforms the competing estimators.

The proposed estimator performs better than the basic mean estimator if or

The proposed estimator performs better than the usual ratio estimator if or

The proposed estimator performs better than the product-type estimator if or

4. Simulation study

For the assessment of the proposed estimator and its sub-cases, a simulation study is carried out for each population separately. The statistics of all the populations are presented in Appendix B. The mathematical formulae to compute the absolute biases (ABs) and percentage relative efficiencies (PREs) for all the estimators are defined as and where and
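A minimal Python sketch of how these two performance measures can be computed from a set of simulated estimates is given here. It assumes the conventions that are usual in this literature: the AB is the absolute difference between the average of the simulated estimates and the true population mean, and the PRE of an estimator is 100 times the ratio of the empirical MSE of a baseline estimator (typically the sample mean) to its own empirical MSE. The function names are hypothetical.

```python
def absolute_bias(estimates, true_mean):
    """|average of the simulated estimates - true population mean|."""
    return abs(sum(estimates) / len(estimates) - true_mean)


def empirical_mse(estimates, true_mean):
    """Average squared deviation of the simulated estimates from the true mean."""
    return sum((e - true_mean) ** 2 for e in estimates) / len(estimates)


def percent_relative_efficiency(baseline_estimates, candidate_estimates, true_mean):
    """100 * MSE(baseline) / MSE(candidate); values above 100 favour the candidate."""
    return (100 * empirical_mse(baseline_estimates, true_mean)
            / empirical_mse(candidate_estimates, true_mean))


# Hypothetical simulated estimates around a true mean of 10.
baseline = [8, 12, 9, 11]
candidate = [9.5, 10.5, 9.8, 10.2]
print(absolute_bias(candidate, 10))
print(percent_relative_efficiency(baseline, candidate, 10))
```

Under this convention the baseline estimator has a PRE of exactly 100, so any estimator with a PRE above 100 is more efficient than the baseline for the population considered.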

Sources of population

Four populations have been taken to obtain the results of ABs and PREs of all the estimators using 50,000 simulations independently. The sources of the population are

  1. Source-I: Cochran [5]
     Y = Food Cost; X = No. of persons in a family
     N = 33 and n = 3.
  2. Source-II: Murthy [20]
     Y = Timber Volume; X = Strip-wise Length (cents)
     N = 176 and n = 24.
  3. Source-III: A bivariate population for (Y, X) is generated in R from the uniform distribution with the parameters 2 and 50.
  4. Source-IV: A bivariate normal population of size N = 1000 (n = 200) is generated in R with a mean vector and variance-covariance matrix: and

The following steps have been coded in R-language to get the results of ABs and the PREs.
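One way such a simulation loop could be organized is sketched here in Python rather than R. It is not the code used for Table 2: the toy population, the number of replications (2,000 instead of 50,000), and all names are hypothetical, and only the sample mean and the classical ratio and product estimators are included because the proposed estimator depends on the displayed formulas of Section 3.

```python
import random
from statistics import mean


def simulate(y, x, n, reps, seed=0):
    """Draw `reps` systematic samples with random starts and store the estimates."""
    rng = random.Random(seed)
    k = len(y) // n                 # assumes len(y) == n * k
    X_bar = mean(x)                 # population mean of x, assumed known
    draws = {"mean": [], "ratio": [], "product": []}
    for _ in range(reps):
        start = rng.randrange(k)    # random start among the k possible samples
        ys, xs = y[start::k], x[start::k]
        y_bar, x_bar = mean(ys), mean(xs)
        draws["mean"].append(y_bar)
        draws["ratio"].append(y_bar * X_bar / x_bar)
        draws["product"].append(y_bar * x_bar / X_bar)
    return draws


# Hypothetical population: y positively correlated with x, N = 120, n = 12.
pop_rng = random.Random(1)
x = [pop_rng.uniform(2, 50) for _ in range(120)]
y = [3 * xi + pop_rng.gauss(0, 5) for xi in x]

draws = simulate(y, x, n=12, reps=2000)
Y_bar = mean(y)
mse_by_estimator = {name: mean([(e - Y_bar) ** 2 for e in est])
                    for name, est in draws.items()}
for name, value in mse_by_estimator.items():
    print(name, round(value, 3))
```

The stored draws can then be summarized by the absolute bias and the percentage relative efficiency of each estimator with respect to the sample mean.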

From the numerical illustrations presented in Table 2, it is noticed that the generalized estimator is more efficient than the competing estimators based on the results of ABs and PREs for all populations. Many special cases of ratio-type ( and ) estimators are computed for populations I-III and product-type ( and ) estimators for population IV (Appendix C). We used v = 1 for populations I-III and v = -1 for population IV and observed that the ratio-type estimators performed well for the populations with positive correlation, whereas the product-type estimators performed better for the population in which the auxiliary variable is negatively correlated with the study variable. Overall, the results revealed that the efficiency of the proposed estimator is much higher than that of the existing estimators for all populations.

Table 2. The amount of ABs and PREs of different estimators.

https://doi.org/10.1371/journal.pone.0278619.t002

5. Conclusion

Systematic sampling is often considered very useful for populations in disciplines such as agricultural, environmental, ecological, forestry, and marine sciences. It is also applied outside the aforementioned fields, as it is easy to understand and simple to execute. In the present study, our primary concern is to suggest a generalized estimator for the estimation of population mean using systematic sampling. The proposed estimator can produce several other estimators as special cases using different choices of the defining parameters. We further suggested several efficient estimators as special cases of the proposed estimator using known parameters associated with the auxiliary variable. The conditions under which the proposed estimator provides more precise estimates than the existing estimators are also derived. We carried out the numerical comparison using two real and two artificial populations. The proposed estimator, along with its sub-estimators, has shown greater efficiency than the sample mean, classical ratio, and product estimators for all populations. The proposed estimator performs better than the ratio and product estimators for v = 1 and v = -1, respectively. On the basis of the theoretical comparisons and numerical findings, we suggest that the proposed estimator is efficient and useful for mean estimation using systematic sampling.

Only a single auxiliary variable is considered in this research to propose the generalized estimator for mean estimation. In future work, multiple auxiliary variables could be incorporated to propose a more general estimator in the presence of measurement error and nonresponse using systematic sampling.

Supporting information

S1 Appendix. This file contains multiple tables.

https://doi.org/10.1371/journal.pone.0278619.s001

(DOCX)

Acknowledgments

The authors are deeply thankful to the Editor in Chief and the anonymous reviewers for their valuable comments, which improved the quality of the paper.

References

  1. Madow W. G. and Madow L. H. (1944). On the theory of systematic sampling, Annals of Mathematical Statistics, 15, 1–24.
  2. Cochran W. G. (1946). Relative accuracy of systematic and stratified random samples for a certain class of populations, The Annals of Mathematical Statistics, 17, 164–177.
  3. Finney D. J. (1948). Random and systematic sampling in timber surveys, Forestry, 22, 64–99.
  4. Zinger A. (1964). Systematic sampling in forestry, Biometrics, 20(3), 553–565.
  5. Cochran W. G. (1977). Sampling Techniques. New York: Wiley.
  6. Swain A. K. P. C. (1964). The use of systematic sampling ratio estimate, Journal of the Indian Statistical Association, 2, 160–164.
  7. Shukla N. D. (1971). Systematic sampling and product method of estimation, in Proceeding of the all-India Seminar on Demography and Statistics, (Varanasi, India 1971).
  8. Singh M. P. (1967). Ratio-cum-product method of estimation, Metrika, 12, 34–42.
  9. Srivastava K. and Jhajj H. S. (1983). A class of estimators of the population mean using multi-auxiliary information, Calcutta Statistical Association Bulletin, 32, 47–56.
  10. Kushwaha K. S. and Singh H. P. (1989). Class of almost unbiased ratio and product estimators in systematic sampling, Journal of the Indian Society of Agricultural Statistics, 41(2), 193–205.
  11. Banarasi S., Kushwaha N. S. and Kushwaha K. S. (1993). A class of ratio, product and difference-type estimators in systematic sampling, Microelectronics Reliability, 33(4), 455–457.
  12. Singh H. P. and Singh R. (1998). Almost unbiased ratio and product type estimators in systematic sampling, Questiio, 22(3), 403–416.
  13. Singh H. P., Tailor R. and Jatwa N. K. (2011). Modified ratio and product estimators for population mean in systematic sampling, Journal of Modern Applied Statistical Methods, 10(2), 424–435.
  14. Singh H. P. and Jatwa N. K. (2012). A class of exponential type estimators in systematic sampling, Economic Quality Control, 27(2), 195–208.
  15. Singh H. P. and Solanki R. S. (2012). An efficient class of estimators for the population mean using auxiliary information in systematic sampling, Journal of Statistical Theory and Practice, 6(2), 274–285.
  16. Tailor T., Jatwa N. and Singh H. P. (2013). A ratio-cum-product estimator of finite population mean in systematic sampling, Statistics in Transition, 14(3), 391–398.
  17. Khan M. and Singh R. (2015). Estimation of population mean in chain ratio-type estimator under systematic sampling, Journal of Probability and Statistics, Article ID 248374, 2 pages.
  18. Khan M. (2015). A general class of exponential type estimator for population mean under systematic sampling using two auxiliary variables, Journal of Probability and Statistics, Article ID 2374837, 6 pages.
  19. Qureshi M. N., Khalil S. and Hanif M. (2018). Generalized Semi-Exponential type estimator in systematic sampling, 17(2): 283–290.
  20. Murthy M. N. (1967). Sampling Theory and Methods. Statistical Publishing Society. Calcutta 35, India.