Use of an efficient unbiased estimator for finite population mean

  • Javid Shabbir

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft

    javidshabbir@gmail.com

    Affiliation Department of Statistics, Quaid-i-Azam University, Islamabad, Pakistan

  • Ronald Onyango

    Roles Data curation, Funding acquisition, Resources, Software, Validation, Visualization, Writing – review & editing

    Affiliation Department of Applied Statistics, Financial Mathematics and Actuarial Science, Jaramogi Oginga Odinga University of Science and Technology, Bondo, Kenya

Abstract

In this study, we propose an improved unbiased estimator of the finite population mean that uses a single auxiliary variable and the rank of the auxiliary variable, adopting the Hartley-Ross procedure when some parameters of the auxiliary variable are known. Expressions for the bias and the mean square error (MSE) or variance of the estimators are obtained up to the first order of approximation. Four real data sets are used to assess the performance of the estimators and to support the theoretical findings. The proposed unbiased estimator outperforms all other considered estimators. It is also observed that the use of conventional measures contributes significantly to the efficiency of the estimators.

1. Introduction

In the literature, many researchers have constructed or modified ratio, product, and regression type estimators that use auxiliary information to estimate the finite population mean. Auxiliary information can be used at the survey stage, the design stage, the estimation stage, or at all stages to enhance the precision of the estimators by taking advantage of the correlation between the study variable and the auxiliary variable. In this study, we use the auxiliary variable as well as the rank of the auxiliary variable at the estimation stage to estimate the finite population mean. Hartley and Ross [1] pioneered the use of the ratio of the study variable to the auxiliary variable in estimating the population mean. Singh and Singh [2] suggested a Hartley-Ross type estimator for the case in which some parameters of the auxiliary variable are known in advance. Rao and Swain [3] slightly modified the idea of [1] and suggested a new estimator of the population mean. Singh et al. [4] used known population parameters of the auxiliary variable in their suggested estimator of the mean. Javed et al. [5] extended the estimator of [1] by using two auxiliary variables to estimate the population mean. Cekim and Kadilar [6, 7] modified the Hartley-Ross type estimator for mean estimation in simple and stratified sampling. Haq et al. [8] justified the dual use of the auxiliary variable in their proposed estimator. Zahid and Shabbir [9] used the dual auxiliary variable in estimating the mean of a sensitive variable under the randomized response technique (RRT). Irfan et al. [10] modified the existing ratio estimator by using dual auxiliary information for mean estimation. Irfan et al. [11] suggested a difference type exponential estimator based on the dual auxiliary variable for mean estimation. Recently, Ahmad et al. [12] suggested a difference type estimator using the dual auxiliary variable under non-response in simple random sampling.

Several estimators in the literature are biased, and consequently their variance or MSE tends to be inflated. This serious drawback motivated us to construct an unbiased estimator that performs better than the other estimators considered in the literature. Combining the ideas of [1] and [2], we suggest an improved unbiased estimator of the finite population mean.

In Section 2, we introduce some useful notation and symbols. Section 3 reviews the existing estimators in the literature. The proposed estimator is discussed in Section 4. The numerical results based on four real data sets are presented in Section 5. The conclusion is given in Section 6.

2. Symbols and notations

Consider a finite population $\Lambda = \{\Lambda_1, \Lambda_2, \ldots, \Lambda_N\}$ of $N$ units, from which a sample of $n$ units is drawn by simple random sampling without replacement (SRSWOR). Let $y_i$, $x_i$, and $r_i$ be the observed values of the study variable ($Y$), the auxiliary variable ($X$), and the rank of the auxiliary variable ($R$), respectively. Let $\bar{y}$, $\bar{x}$, and $\bar{r}$ be the sample means corresponding to the population means $\bar{Y}$, $\bar{X}$, and $\bar{R}$. Let $s_y^2$, $s_x^2$, and $s_r^2$ be the sample variances corresponding to the population variances $S_y^2$, $S_x^2$, and $S_r^2$. Let $C_y = S_y/\bar{Y}$, $C_x = S_x/\bar{X}$, and $C_r = S_r/\bar{R}$ be the coefficients of variation of $Y$, $X$, and $R$, respectively. Let $\rho_{yx} = S_{yx}/(S_y S_x)$, $\rho_{yr} = S_{yr}/(S_y S_r)$, and $\rho_{xr} = S_{xr}/(S_x S_r)$ be the correlation coefficients between their respective subscripts, where $S_{yx}$, $S_{yr}$, and $S_{xr}$ are the population covariances between their respective subscripts. The corresponding sample covariances are $s_{yx}$, $s_{yr}$, and $s_{xr}$.
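As a minimal sketch of these definitions, the following Python snippet computes the population means, coefficients of variation, and correlations for Y, X, and the ranks R of X. The small population and the helper name `describe_population` are illustrative assumptions, not taken from the paper. Ranks are assigned assuming no ties in X, and variances and covariances use the N − 1 divisor, which is the usual survey-sampling convention and is assumed here.

```python
import numpy as np

def describe_population(y, x):
    """Return means, CVs and correlations for Y, X and the ranks R of X."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Ranks 1..N of the auxiliary variable (assumes no tied x values).
    r = np.argsort(np.argsort(x)) + 1.0
    data = np.vstack([y, x, r])
    means = data.mean(axis=1)
    cov = np.cov(data)              # N - 1 divisor by default
    sd = np.sqrt(np.diag(cov))
    cv = sd / means
    corr = cov / np.outer(sd, sd)
    return {
        "means": dict(zip("yxr", means)),
        "cv": dict(zip("yxr", cv)),
        "rho_yx": corr[0, 1],
        "rho_yr": corr[0, 2],
        "rho_xr": corr[1, 2],
    }

# Hypothetical population of N = 8 units.
y = [12, 15, 9, 20, 31, 18, 25, 14]
x = [30, 41, 22, 55, 80, 47, 66, 35]
summary = describe_population(y, x)
print(summary["cv"], summary["rho_yx"], summary["rho_yr"], summary["rho_xr"])
```

For untied ranks, the coefficient of variation of R depends only on N, namely C_r = sqrt(N / (3(N + 1))). For N = 127 this gives about 0.5751, which agrees with the value of Cr listed for Population 4 in Section 5.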

We define the following relative error terms to derive bias and MSE or variance expressions.

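Let $\Psi_0 = (\bar{y} - \bar{Y})/\bar{Y}$, $\Psi_1 = (\bar{x} - \bar{X})/\bar{X}$, $\Psi_2 = (\bar{r} - \bar{R})/\bar{R}$, and $\Psi_3 = (s_{yx} - S_{yx})/S_{yx}$, such that $E(\Psi_i) = 0$ $(i = 0, 1, 2, 3)$, $E(\Psi_0^2) = \Upsilon C_y^2$, $E(\Psi_1^2) = \Upsilon C_x^2$, $E(\Psi_2^2) = \Upsilon C_r^2$, $E(\Psi_0\Psi_1) = \Upsilon C_{yx}$, $E(\Psi_0\Psi_2) = \Upsilon C_{yr}$, and $E(\Psi_1\Psi_2) = \Upsilon C_{xr}$, where $C_{yx} = \rho_{yx}C_yC_x$, $C_{yr} = \rho_{yr}C_yC_r$, $C_{xr} = \rho_{xr}C_xC_r$, and $\Upsilon = 1/n - 1/N$. The moments involving $\Psi_3$, namely $E(\Psi_3^2)$, $E(\Psi_0\Psi_3)$, $E(\Psi_1\Psi_3)$, and $E(\Psi_2\Psi_3)$, are expressed through the standardized mixed moments $\Delta_{lqs}$.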
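Moment identities of this kind can be checked numerically. The sketch below enumerates every SRSWOR sample from a small hypothetical population and compares E(Ψ0), E(Ψ0²), and E(Ψ0Ψ1) with 0, ΥCy², and ΥCyx. It assumes the usual definitions Ψ0 = (ȳ − Ȳ)/Ȳ, Ψ1 = (x̄ − X̄)/X̄, and Υ = 1/n − 1/N, with population variances and covariances computed using the N − 1 divisor. The data and function name are illustrative only.

```python
from itertools import combinations
import numpy as np

def relative_error_moments(y, x, n):
    """Exact E(psi0), E(psi0^2), E(psi0*psi1) over all SRSWOR samples of size n."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    N = len(y)
    Y_bar, X_bar = y.mean(), x.mean()
    psi0, psi1 = [], []
    for idx in combinations(range(N), n):   # every sample is equally likely
        idx = list(idx)
        psi0.append((y[idx].mean() - Y_bar) / Y_bar)
        psi1.append((x[idx].mean() - X_bar) / X_bar)
    psi0, psi1 = np.array(psi0), np.array(psi1)
    return psi0.mean(), (psi0 ** 2).mean(), (psi0 * psi1).mean()

# Hypothetical population of N = 8 units, samples of size n = 3.
y = [12, 15, 9, 20, 31, 18, 25, 14]
x = [30, 41, 22, 55, 80, 47, 66, 35]
N, n = len(y), 3
upsilon = 1 / n - 1 / N
C_y = np.std(y, ddof=1) / np.mean(y)
C_yx = np.cov(y, x)[0, 1] / (np.mean(y) * np.mean(x))   # rho_yx * C_y * C_x

e0, e00, e01 = relative_error_moments(y, x, n)
print(e0, e00, upsilon * C_y ** 2, e01, upsilon * C_yx)
```

Exhaustive enumeration is used instead of Monte Carlo so that the comparison is exact up to floating-point error; it is only practical for small N.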

3. Existing estimators

We now discuss some well-known estimators of the finite population mean.

  1. The usual sample mean estimator is $\bar{y}_0 = \bar{y}$, and its variance is given by $\mathrm{Var}(\bar{y}_0) = \Upsilon \bar{Y}^2 C_y^2$. (1)
  2. A general class of Hartley-Ross unbiased type estimators, is given by (2) where , , , , , , , j = 0, 1, 2; c and d are the known population parameters of the auxiliary variable which may be coefficient of variation (Cx), coefficient of skewness (β1x), coefficient of kurtosis (β2x) and correlation coefficient (ρyx).

Using the assumption and , an unbiased general estimator is given by (3)

The variance of , is given by (4) where , and .
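To illustrate the idea behind Hartley-Ross type estimators, the sketch below implements the classical estimator of [1] in its textbook form, ȳ_HR = q̄X̄ + [n(N − 1)/(N(n − 1))](ȳ − q̄x̄), where q_i = y_i/x_i and q̄ is the sample mean of these ratios. The symbol q is used here only to avoid a clash with the rank r_i, and the small population is hypothetical. Enumerating all SRSWOR samples shows that this estimator is exactly unbiased, whereas the mean-of-ratios estimator q̄X̄ that it corrects is not.

```python
from itertools import combinations
import numpy as np

def hartley_ross(y_s, x_s, X_bar, N):
    """Classical Hartley-Ross estimator of the population mean of y."""
    n = len(y_s)
    q_s = y_s / x_s                      # unit-level ratios y_i / x_i
    mean_of_ratios = q_s.mean() * X_bar  # biased starting point
    # Unbiased estimate of Y_bar - E[mean_of_ratios], i.e. of minus its bias.
    correction = n * (N - 1) / (N * (n - 1)) * (y_s.mean() - q_s.mean() * x_s.mean())
    return mean_of_ratios + correction

# Hypothetical population of N = 8 units, samples of size n = 3.
y = np.array([12, 15, 9, 20, 31, 18, 25, 14], dtype=float)
x = np.array([30, 41, 22, 55, 80, 47, 66, 35], dtype=float)
N, n = len(y), 3

hr, naive = [], []
for idx in combinations(range(N), n):
    idx = list(idx)
    hr.append(hartley_ross(y[idx], x[idx], x.mean(), N))
    naive.append((y[idx] / x[idx]).mean() * x.mean())

print("population mean         :", y.mean())
print("E[Hartley-Ross]         :", np.mean(hr))
print("E[mean of ratios * Xbar]:", np.mean(naive))
```

The same pattern, estimating the bias of a convenient estimator without bias and then removing it, is what the bias-corrected estimators later in this section rely on.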

Note:

  1. Put c = 1, d = 0, in (3), so j = 0, i.e. , and where , we get the usual Hartley-Ross estimator and its variance as: (5) and (6)
  2. Put c = Cx, d = B2(x), in (3), so j = 1, i.e. , , and , where , we get the [4] estimator with its variance, are given by: (7) and (8)
  3. Put c = Cx, d = ρyx, in (3), so j = 2, i.e. , , and where , we get another [4] estimator and its variance, is given by: (9) and (10)
  4. A difference type estimator using a single auxiliary variable with its ranks, is given by: (11) where di(i = 1, 2) are constants.
    The variance of , is given by where
    The minimum variance of , at optimum values of di(i = 1, 2) i.e. and , is given by (12) where
  5. [6] suggested the following unbiased estimator using the single auxiliary variable and is given by: where c and d are defined earlier i.e. t = −1, 0, 1; α = 0, 1 and Q is a constant whose value is to be estimated.
    For α = t = 1, the above estimator becomes: (13) .
    The minimum variance of at optimum values of Q i.e. , is given by (14) where ∇1 = ∇1a + ∇1b,
  6. [8] suggested an idea of using rank of the auxiliary variable in the following estimator, is given by (15) where Hi(i = 1,2,3) are constants; c and d are defined earlier.
    The bias of the estimator , is given by (16)

Since and are known, so replacing and Cyx by their consistent estimates and in (16), the estimated bias of becomes (17)

Subtracting from , we get an unbiased estimator by replacing Hi(i = 1,2,3) by Li(i = 1,2,3) which is considered by [10] as: (18) or (19)

Rewriting in terms of errors, we have (20)

Solving (20), the bias of becomes zero. Now squaring and taking expectation of (20), the variance of becomes: (21) where

The optimum values Li(i = 1,2,3) are , , .

The minimum variance of is given by (22) where

4. Proposed almost unbiased estimator

On the lines of [1, 8, 10], we propose the following alternative new unbiased estimator. This estimator is based on usual ratio, difference, and exponential ratio type estimators. The purpose is to construct an unbiased estimator that should be better than all considered estimators in estimating the finite population mean. (23) where Si(i = 1,2,3) are constants.

Rewriting (23) in terms of errors, we have (24)

From (24), the bias of , is given by (25)

The estimated bias of , is given by (26)

Subtracting estimated bias given in (26) from the proposed estimator given in (23), the unbiased proposed estimator becomes: (27) or (28)

Solving (28) in terms of errors, we have (29)

From (29), we have (30)

Squaring (29) and then taking expectation, we get the variance of as: (31) where

The optimum values of Si(i = 1,2,3) are , , , where

Substituting the optimum values of Si(i = 1,2,3) in (31), we get the minimum variance of , which is given by (32) where

5. Numerical example

We use the following 4 real data sets for a numerical study.

  1. Population 1: [source: [13]]
    Y = Number of tube wells, X = Net irrigated area.
    N = 69, n = 10, , , , , , , Cy = 0.8421, Cx = 0.8478, Cr = 0.5731, ρyx = 0.9224, ρyr = 0.7140, ρxr = 0.8193, Δ220 = 8.0922, Δ210 = 2.1398, Δ120 = 2.1183, Δ102 = 0.3920, β1x = 2.3808, β2x = 7.2159.
  2. Population 2: [source: [13]]
    Y = Number of tube wells, X = Number of tractors.
    N = 69, n = 10, , , , , , , Cy = 0.8421, Cx = 0.7969, Cr = 0.8654, ρyx = 0.9118, ρyr = 0.7409, ρxr = 0.8654, Δ220 = 6.7066, Δ210 = 1.9555, Δ120 = 1.8119, Δ102 = 0.3825, β1x = 1.855, β2x = 3.7632.
  3. Population 3: [source: [14]]
    This data set is based on Marmara region of Turkey in 2007.
    Y = Number of teachers, X = Number of classes.
    N = 127, n = 31, , , , , , , Cy = 1.2559, Cx = 1.1150, Cr = 0.5769, ρyx = 0.9789, ρyr = 0.8312, ρxr = 0.8516, Δ220 = 4.7079, Δ210 = 1.6136, Δ120 = 1.6171, Δ102 = 0.4233, β1x = 1.7205, β2x = 2.3149.
  4. Population 4: [source: [14]]
    This data set is based on Marmara region of Turkey in 2007.
    Y = Number of teachers, X = Number of students.
    N = 127, n = 31, , , , , , , Cy = 1.2559, Cx = 1.4654, Cr = 0.5751, ρyx = 0.9366, ρyr = 0.8240, ρxr = 0.7834, Δ220 = 4.7079, Δ210 = 1.5674, Δ120 = 1.7115, Δ102 = 0.4015, β1x = 2.1638, β2x = 4.5928.

The results based on Populations 1–4 are given in Tables 1–4, which report the results both without and with conventional measures. The percent relative efficiency (PRE) of each estimator with respect to the sample mean is computed as $\mathrm{PRE} = \frac{\mathrm{Var}(\bar{y}_0)}{\mathrm{Var}(\bar{y}_i)} \times 100$, where i = 0, HR, S1, S2, D, CK, I, P.
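A minimal sketch of the PRE calculation follows, assuming the usual definition PRE = 100 × Var(ȳ0)/Var(ȳi). As a worked case it uses a difference-type estimator based on the auxiliary variable and its ranks, for which the textbook minimum variance is ΥS_y²(1 − R²), with R² the squared multiple correlation of Y on X and R. Treating this as the form of expression (12) is an assumption, and the X-only comparison is added purely for illustration. Because ΥS_y² cancels in the ratio, only the three correlations listed for Population 1 are needed. The function names are hypothetical.

```python
def pre(var_baseline, var_estimator):
    """Percent relative efficiency with respect to the baseline estimator."""
    return 100.0 * var_baseline / var_estimator

def squared_multiple_correlation(rho_yx, rho_yr, rho_xr):
    """R^2 of Y on (X, R) computed from the pairwise correlations."""
    return (rho_yx**2 + rho_yr**2 - 2 * rho_yx * rho_yr * rho_xr) / (1 - rho_xr**2)

# Correlations listed for Population 1.
rho_yx, rho_yr, rho_xr = 0.9224, 0.7140, 0.8193

# Variances in units of Upsilon * S_y^2, which cancels in the PRE.
var_mean = 1.0
var_diff_x = 1.0 - rho_yx**2    # difference estimator using X only (illustrative)
var_diff_xr = 1.0 - squared_multiple_correlation(rho_yx, rho_yr, rho_xr)

print("PRE, X only     :", round(pre(var_mean, var_diff_x), 2))
print("PRE, X and ranks:", round(pre(var_mean, var_diff_xr), 2))
```

Adding the ranks can never lower the PRE in this setting, because the squared multiple correlation is at least as large as ρyx².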

Table 1. Results of different estimators for Population 1.

https://doi.org/10.1371/journal.pone.0270277.t001

Table 2. Results of different estimators for Population 2.

https://doi.org/10.1371/journal.pone.0270277.t002

Table 3. Results of different estimators for Population 3.

https://doi.org/10.1371/journal.pone.0270277.t003

Table 4. Results of different estimators for Population 4.

https://doi.org/10.1371/journal.pone.0270277.t004

In Tables 1–4, the proposed unbiased estimator outperforms all other estimators in all four populations, whereas the estimators of [4] in Populations 1–3 and the estimator of [1] in Population 4 perform poorly.

6. Conclusion

In this study, we have proposed an unbiased class of estimators of the finite population mean under simple random sampling, using a single auxiliary variable and the rank of the auxiliary variable. Expressions for the biases and MSEs or variances are obtained up to the first order of approximation. Four data sets are used for the numerical study. The proposed estimator outperforms all considered estimators in all four populations. The use of conventional measures, i.e. Cx, β1x, β2x, and ρyx, plays a significant role in increasing the efficiency of the estimators in Tables 1–4.

The estimators of [4] in Populations 1–3 and the estimator of [1] in Population 4 perform poorly, whereas the proposed unbiased estimator performs best among all considered estimators in all four populations.

Acknowledgments

The authors thank the referees for their valuable suggestions, which helped improve the manuscript.

References

  1. Hartley HO, Ross A. Unbiased ratio estimators. Nature. 1954; 174: 270–272.
  2. Singh R, Singh HR. A Hartley-Ross type estimator for finite population mean when the variables are negatively correlated. Metron. 1995; 43: 205–216.
  3. Rao TJ, Swain AKPC. A note on the Hartley-Ross unbiased ratio estimator. Communications in Statistics- Theory and Methods. 2014; 43(15): 3162–3169.
  4. Singh HP, Sharma B, Tailor R. Hartley-Ross type estimators for population mean using known parameters of auxiliary variable. Communications in Statistics- Theory and Methods. 2014; 43: 547–565.
  5. Javed M, Irfan M, Pang T. Hartley-Ross type unbiased estimators of population mean using two auxiliary variables. Scientia Iranica. 2019; 26(6): 3835–3845.
  6. Cekim HO, Kadilar C. New unbiased estimator with the help of Hartley-Ross type estimator. Pakistan Journal of Statistics. 2016; 32(4): 247–260.
  7. Cekim HO, Kadilar C. Hartley-Ross type unbiased estimator using the stratified sampling. Hacettepe Journal of Mathematics and Statistics. 2017; 46(2): 293–302.
  8. Haq A, Khan M, Hussain Z. A new estimator of finite population mean based on the dual use of the auxiliary information. Communications in Statistics- Theory and Methods. 2017; 46(9): 4425–4436.
  9. Zahid E, Shabbir J. Estimation of finite population mean for a sensitive variable using dual auxiliary information in the presence of measurement error. PLoS One. 2019; e02121111, 1–17.
  10. Irfan M, Javed M, Bhatti SH, Raza MA, Ahmed T. Almost unbiased optimum estimators for population mean using dual auxiliary information. Journal of King Saud University-Science. 2020; 32(6): 2835–2844.
  11. Irfan M, Javed M, Bhatti SH. Difference type-exponential estimators based on dual auxiliary information under simple random sampling. Scientia Iranica. 2022; 29(1): 343–354.
  12. Ahmad S, Hussain S, Aamir M, Khan F, Alshahrani MN, Alqawba M. Estimating of finite population mean using dual auxiliary variable for nonresponse using simple random sampling. Aims Mathematics. 2022; 793: 4592–4613.
  13. Singh R, Mangat NS. Elements of Survey Sampling. Kluwer Academic Publishers. 1996.
  14. Koyuncu N, Kadilar C. Family of estimators of population mean using two auxiliary variables in stratified sampling. Communications in Statistics- Theory and Methods. 2009; 38: 2398–2417.