Abstract
In this paper, we propose a generalized class of exponential-type estimators for estimating the finite population mean using two auxiliary attributes under simple random sampling and stratified random sampling. The bias and mean squared error (MSE) of the proposed class of estimators are derived up to the first order of approximation. Both theoretical comparisons and an empirical study are discussed. Four populations are used to support the theoretical findings. It is observed that the proposed class of estimators performs better than all other considered estimators in simple and stratified random sampling.
Citation: Ahmad S, Arslan M, Khan A, Shabbir J (2021) A generalized exponential-type estimator for population mean using auxiliary attributes. PLoS ONE 16(5): e0246947. https://doi.org/10.1371/journal.pone.0246947
Editor: Feng Chen, Tongji University, CHINA
Received: October 19, 2020; Accepted: January 28, 2021; Published: May 13, 2021
Copyright: © 2021 Ahmad et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: The authors received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
In survey sampling, auxiliary information is generally used to increase the precision of estimators by taking advantage of the correlation between the study variable y and the auxiliary variable x. When the relationship between y and x is positive, the ratio estimator gives efficient results, and when the relationship is negative, the product estimator performs better. Auxiliary attributes have also been studied by different authors under different setups. In some situations, auxiliary information can be quantified in the form of auxiliary proportions to obtain better precision. The ratio, product and regression methods of estimation are good examples in this context; under certain conditions, these methods are more efficient than the usual mean per unit estimator. Many authors have suggested or modified mixed-type estimators, i.e. combinations of regression-, ratio- and exponential-type estimators, and these mixed estimators outperform the individual estimators. For this reason, several authors have used auxiliary variables and auxiliary attributes at the estimation stage to increase the efficiency of the estimators. For example, the width of a tree can be used as a key auxiliary variable when estimating the average height of trees in a forest, and the breed of a dairy animal is an important auxiliary attribute when estimating average milk yield. Similarly, when estimating the mean hourly wages earned by individuals, auxiliary attributes such as education and marital status can be used.
Wynn [1] proposed an unbiased ratio estimator using auxiliary information in the form of a known population proportion of the auxiliary variable when the population is divided into two classes. Singh et al. [2] suggested a ratio-of-proportions estimator utilizing auxiliary variables. Naik and Gupta [3] introduced the idea of the point bi-serial correlation coefficient. Using this idea, many authors have used information on an auxiliary attribute to improve the precision of estimators. Solanki and Singh [4] suggested a class of estimators for the population mean of the study variable using the known population proportion of an auxiliary attribute. Their class of estimators is more general and includes the usual unbiased sample mean estimator; expressions for its bias and mean squared error (MSE) are obtained under a large-sample approximation. Malik and Singh [5] introduced an exponential-type estimator using two auxiliary attributes. Zeng et al. [6] proposed a Tobit model for the analysis of crash rates by injury severity, in which both the correlation across injury severities and the unobserved heterogeneity across road-segment observations are accommodated; the Tobit model is compared with multivariate random-parameters alternatives in a Bayesian context. In another study, Zeng et al. [7] carried out a multivariate spatio-temporal analysis of area-wide crash rates by injury severity. Likewise, Zeng et al. [8] investigated the relationship between zone-level daytime and nighttime crash frequencies and various factors related to traffic, network and land use, including VHT, average speed, road density, intersection density, network pattern and land use pattern.
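Consider a finite population U = {U1, U2,…,UN} of size N, from which a sample of size n is drawn by simple random sampling without replacement (SRSWOR). Let yi be the value of the study variable and ϕi the value of the auxiliary attribute for the ith unit, i.e. ϕi = 1 if the ith unit possesses the attribute and ϕi = 0 otherwise. Let $A = \sum_{i=1}^{N} \phi_i$ be the total number of units in the population possessing the attribute and $a = \sum_{i=1}^{n} \phi_i$ be the total number of units in the sample possessing the attribute. Let P = (A/N) be the proportion of units in the population and p = (a/n) be the proportion of units in the sample possessing the attribute. Let $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} y_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ be the population mean and the sample mean respectively. Let P1 and P2 be the population proportions of the two auxiliary attributes and p1 and p2 be the corresponding sample proportions, with indicators ϕ1i and ϕ2i. Let $S_y^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{Y})^2$ be the population variance of the study variable y. Let $S_{p_1}^2 = \frac{1}{N-1}\sum_{i=1}^{N}(\phi_{1i} - P_1)^2$ and $S_{p_2}^2 = \frac{1}{N-1}\sum_{i=1}^{N}(\phi_{2i} - P_2)^2$ respectively be the population variances of the auxiliary attributes p1 and p2. Let $C_y = S_y/\bar{Y}$ be the coefficient of variation of the study variable y. Let $C_{p_1} = S_{p_1}/P_1$ and $C_{p_2} = S_{p_2}/P_2$ be the coefficients of variation of the auxiliary attributes p1 and p2. Let $S_{yp_1} = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{Y})(\phi_{1i} - P_1)$ be the population covariance between the study variable y and the auxiliary attribute p1 and $S_{yp_2} = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{Y})(\phi_{2i} - P_2)$ be the population covariance between the study variable y and the auxiliary attribute p2. Let $S_{p_1p_2} = \frac{1}{N-1}\sum_{i=1}^{N}(\phi_{1i} - P_1)(\phi_{2i} - P_2)$ be the population covariance between the auxiliary attributes p1 and p2. Let $\rho_{yp_1} = S_{yp_1}/(S_y S_{p_1})$ be the population point bi-serial correlation coefficient between the study variable y and the auxiliary attribute p1. Let $\rho_{yp_2} = S_{yp_2}/(S_y S_{p_2})$ be the population point bi-serial correlation coefficient between the study variable y and the auxiliary attribute p2. Let $\rho_{p_1p_2} = S_{p_1p_2}/(S_{p_1} S_{p_2})$ be the phi-correlation coefficient between the auxiliary attributes p1 and p2. Let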
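$e_0 = (\bar{y} - \bar{Y})/\bar{Y}$, $e_1 = (p_1 - P_1)/P_1$ and $e_2 = (p_2 - P_2)/P_2$ be the error terms such that E(ei) = 0, (i = 0,1,2),
$E(e_0^2) = \lambda C_y^2$, $E(e_1^2) = \lambda C_{p_1}^2$, $E(e_2^2) = \lambda C_{p_2}^2$, $E(e_0 e_1) = \lambda \rho_{yp_1} C_y C_{p_1}$, $E(e_0 e_2) = \lambda \rho_{yp_2} C_y C_{p_2}$ and $E(e_1 e_2) = \lambda \rho_{p_1p_2} C_{p_1} C_{p_2}$, where $\lambda = \left(\frac{1}{n} - \frac{1}{N}\right)$.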
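The population quantities introduced above can be computed directly from unit-level data. The following is a minimal Python sketch, assuming the usual N − 1 divisor for variances and covariances; the function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def population_summary(y, phi):
    """Mean, proportion, coefficients of variation and point bi-serial correlation.

    y   : study-variable values for all N population units
    phi : 0/1 indicators of the auxiliary attribute for the same units
    """
    N = len(y)
    Y_bar = sum(y) / N          # population mean of y
    P = sum(phi) / N            # population proportion possessing the attribute
    # Variances and covariance use the N - 1 divisor (an assumption of this sketch).
    S_y2 = sum((yi - Y_bar) ** 2 for yi in y) / (N - 1)
    S_p2 = sum((fi - P) ** 2 for fi in phi) / (N - 1)
    S_yp = sum((yi - Y_bar) * (fi - P) for yi, fi in zip(y, phi)) / (N - 1)
    return {
        "Y_bar": Y_bar,
        "P": P,
        "C_y": math.sqrt(S_y2) / Y_bar,
        "C_p": math.sqrt(S_p2) / P,
        "rho_yp": S_yp / math.sqrt(S_y2 * S_p2),
    }

def sampling_factor(n, N):
    """lambda = 1/n - 1/N for SRSWOR with sample size n and population size N."""
    return 1.0 / n - 1.0 / N

# Hypothetical toy population: four units, two of which possess the attribute.
toy = population_summary([2, 4, 6, 8], [0, 0, 1, 1])
print(toy, sampling_factor(2, 4))
```

With two auxiliary attributes, the same function can be called once per attribute; the phi-correlation between the attributes is obtained by passing one indicator vector as `y` and reading `rho_yp`.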
Existing estimators in simple random sampling
We discuss the following estimators available in literature.
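1. The usual mean per unit estimator in simple random sampling is:
$$\bar{y}_0 = \frac{1}{n}\sum_{i=1}^{n} y_i \qquad (1)$$
The MSE or variance of $\bar{y}_0$ is given by:
$$MSE(\bar{y}_0) = \lambda \bar{Y}^2 C_y^2, \quad \lambda = \left(\frac{1}{n} - \frac{1}{N}\right) \qquad (2)$$
2. The ordinary ratio-type estimator is given by:
$$\bar{y}_R = \bar{y}\left(\frac{P_1}{p_1}\right) \qquad (3)$$
3. The usual product-type estimator is given by:
$$\bar{y}_P = \bar{y}\left(\frac{p_1}{P_1}\right) \qquad (4)$$
It is well known that $\bar{y}_R$ and $\bar{y}_P$ are more precise than the usual mean estimator $\bar{y}_0$ when $\rho_{yp_1} > \frac{C_{p_1}}{2C_y}$ and $\rho_{yp_1} < -\frac{C_{p_1}}{2C_y}$ respectively.
The bias and MSE of $\bar{y}_R$, to the first order of approximation, are given by:
$$Bias(\bar{y}_R) \cong \lambda \bar{Y}\left(C_{p_1}^2 - \rho_{yp_1} C_y C_{p_1}\right) \qquad (5)$$
and
$$MSE(\bar{y}_R) \cong \lambda \bar{Y}^2\left(C_y^2 + C_{p_1}^2 - 2\rho_{yp_1} C_y C_{p_1}\right) \qquad (6)$$
Similarly, the bias and MSE of $\bar{y}_P$ are given by:
$$Bias(\bar{y}_P) \cong \lambda \bar{Y} \rho_{yp_1} C_y C_{p_1} \qquad (7)$$
and
$$MSE(\bar{y}_P) \cong \lambda \bar{Y}^2\left(C_y^2 + C_{p_1}^2 + 2\rho_{yp_1} C_y C_{p_1}\right) \qquad (8)$$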
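4. Bahl and Tuteja [9] proposed exponential ratio-type and product-type estimators for estimating the finite population mean using information on a single auxiliary attribute:
$$\bar{y}_{exp(R)} = \bar{y}\exp\left(\frac{P_1 - p_1}{P_1 + p_1}\right) \qquad (9)$$
and
$$\bar{y}_{exp(P)} = \bar{y}\exp\left(\frac{p_1 - P_1}{p_1 + P_1}\right) \qquad (10)$$
The bias and MSE of $\bar{y}_{exp(R)}$, to the first order of approximation, are given by:
$$Bias(\bar{y}_{exp(R)}) \cong \lambda \bar{Y}\left(\frac{3}{8}C_{p_1}^2 - \frac{1}{2}\rho_{yp_1} C_y C_{p_1}\right) \qquad (11)$$
and
$$MSE(\bar{y}_{exp(R)}) \cong \lambda \bar{Y}^2\left(C_y^2 + \frac{1}{4}C_{p_1}^2 - \rho_{yp_1} C_y C_{p_1}\right) \qquad (12)$$
Similarly, the bias and MSE of $\bar{y}_{exp(P)}$ are given by:
$$Bias(\bar{y}_{exp(P)}) \cong \lambda \bar{Y}\left(\frac{1}{2}\rho_{yp_1} C_y C_{p_1} - \frac{1}{8}C_{p_1}^2\right) \qquad (13)$$
and
$$MSE(\bar{y}_{exp(P)}) \cong \lambda \bar{Y}^2\left(C_y^2 + \frac{1}{4}C_{p_1}^2 + \rho_{yp_1} C_y C_{p_1}\right) \qquad (14)$$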
5. Kumar and Bhougal [10] proposed an exponential ratio-product type estimator, given by:
(15)
where α is an unknown constant.
The bias and minimum MSE of $\bar{y}_{KB(RP)}$ are given by:
(16)
and
(17)
The optimum value of α is given by:
(18)
6. Singh and Kumar [11] suggested double ratio and double product type estimators, given by:
(19)
and
(20)
The bias and MSE of $\bar{y}_{SK(DR)}$ are given by:
(21)
and
(22)
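The first-order MSE expressions of the single-attribute estimators listed above can be evaluated numerically. The sketch below assumes the standard textbook forms for the mean, ratio and product estimators, and assumes that the Kumar and Bhougal [10] estimator is the mixture ȳ[α exp{(P1 − p1)/(P1 + p1)} + (1 − α) exp{(p1 − P1)/(p1 + P1)}], for which α = 1 and α = 0 give the exponential ratio and product estimators. The function names and the numerical inputs are hypothetical.

```python
def mse_mean(Y_bar, C_y, lam):
    """First-order MSE (variance) of the sample mean under SRSWOR."""
    return lam * Y_bar**2 * C_y**2

def mse_ratio(Y_bar, C_y, C_p, rho, lam):
    """First-order MSE of the ratio estimator y_bar * P1 / p1."""
    return lam * Y_bar**2 * (C_y**2 + C_p**2 - 2.0 * rho * C_y * C_p)

def mse_product(Y_bar, C_y, C_p, rho, lam):
    """First-order MSE of the product estimator y_bar * p1 / P1."""
    return lam * Y_bar**2 * (C_y**2 + C_p**2 + 2.0 * rho * C_y * C_p)

def mse_exp_mixture(Y_bar, C_y, C_p, rho, lam, alpha):
    """First-order MSE of the assumed exponential ratio-product mixture.

    alpha = 1 gives the exponential ratio estimator,
    alpha = 0 gives the exponential product estimator.
    """
    theta = 1.0 - 2.0 * alpha
    return lam * Y_bar**2 * (
        C_y**2 + theta**2 * C_p**2 / 4.0 + theta * rho * C_y * C_p
    )

def alpha_optimum(C_y, C_p, rho):
    """Value of alpha minimising mse_exp_mixture (set the derivative to zero)."""
    return 0.5 + rho * C_y / C_p

if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the ordering of the MSEs.
    Y_bar, C_y, C_p, rho, lam = 10.0, 0.5, 0.4, 0.8, 0.01
    a = alpha_optimum(C_y, C_p, rho)
    print("mean      :", mse_mean(Y_bar, C_y, lam))
    print("ratio     :", mse_ratio(Y_bar, C_y, C_p, rho, lam))
    print("product   :", mse_product(Y_bar, C_y, C_p, rho, lam))
    print("exp ratio :", mse_exp_mixture(Y_bar, C_y, C_p, rho, lam, 1.0))
    print("optimal   :", mse_exp_mixture(Y_bar, C_y, C_p, rho, lam, a))
```

Under this assumed form, substituting the optimum α reduces the MSE to λȲ²C_y²(1 − ρ²), the familiar regression-type bound for a single auxiliary attribute.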
Proposed class of estimators
Along the lines of Shukla et al. [12], we propose a generalized class of factor-type estimators of the population mean. The proposed estimator is given by:
(25)
where,
Substituting different values of Ki (i = 1,2,3,4) in (25), we can generate many more different types of estimators from our general proposed class of estimators (see Table 1).
Expressing the estimator given in Eq (25) in terms of errors, we have
(26)
where
and
To first order approximation, we have
(27)
Using (27), the bias and MSE of the proposed estimator are given by:
(28)
and
(29)
Differentiating Eq (29) with respect to σ1 and σ2 and equating the derivatives to zero, we get the optimum values of σ1 and σ2, i.e.
and
Substituting the optimum values σ1(opt) and σ2(opt) in (29), we get the minimum MSE of the proposed estimator, which is given by:
(30)
where
is the multiple correlation coefficient of y on p1 and p2.
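As a numerical illustration of how a multiple correlation coefficient enters a minimum MSE, the sketch below assumes that the minimum MSE takes the regression-type form λȲ²C_y²(1 − R²), with R² the squared multiple correlation of y on p1 and p2 written in terms of the three pairwise correlations. Both this form and the function names are assumptions of the sketch, not expressions quoted from the paper.

```python
def multiple_r2(rho_y1, rho_y2, rho_12):
    """Squared multiple correlation of y on two regressors.

    rho_y1, rho_y2 : correlations of y with the first and second auxiliary attribute
    rho_12         : correlation between the two auxiliary attributes (|rho_12| < 1)
    """
    return (rho_y1**2 + rho_y2**2 - 2.0 * rho_y1 * rho_y2 * rho_12) / (1.0 - rho_12**2)

def min_mse_two_attributes(Y_bar, C_y, lam, rho_y1, rho_y2, rho_12):
    """Assumed regression-type minimum MSE: lam * Y_bar^2 * C_y^2 * (1 - R^2)."""
    return lam * Y_bar**2 * C_y**2 * (1.0 - multiple_r2(rho_y1, rho_y2, rho_12))

# Hypothetical inputs: the second attribute adds information beyond the first.
print(min_mse_two_attributes(10.0, 0.5, 0.01, 0.8, 0.6, 0.5))
```

Under this assumption, a second attribute helps only when it carries information beyond the first: if ρ_yp2 = ρ_yp1 ρ_p1p2, then R² collapses to ρ²_yp1 and the single-attribute bound is recovered.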
Now by putting different values of Ki in Eq (25) some members of the proposed class of estimators can be obtained as:
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
Theoretical comparison
In this section, we compare the proposed generalized exponential-type estimator with the other estimators considered. The comparisons are given by:
Numerical comparison under simple random sampling
To assess the performance of the proposed generalized class of estimators relative to the other considered estimators under simple random sampling, we use the following data sets, which have been used earlier by many authors in the literature.
Population 1. [Source: Koyuncu and Kadilar [13]]
Let y be the number of teachers, p1 be the attribute indicating that the number of students in both primary and secondary schools in Turkey in 2007, for 923 districts in six regions, is greater than 11440.5, and p2 be the attribute indicating that the number of students in both primary and secondary schools in Turkey in 2008, for 923 districts in six regions, is greater than 333.1647. We use proportional allocation.
and
.
Population 2. [Source: Singh [14]]
Let y be the estimated number of fish caught by marine recreational fishermen in the year 1995, p1 be the proportion of fish caught greater than 1000 in 1993 and p2 be the proportion of fish caught greater than 2000 in 1994.
and
Population 3. [Source: www.pbs.gov.pk]
Let y be the tobacco area production in hectares during the year 2009, p1 be the proportion of farms with tobacco cultivation area greater than 500 hectares during the year 2007 and p2 be the proportion of farms with tobacco cultivation area greater than 800 hectares during the year 2008 for 47 districts of Pakistan.
and
Population 4. [Source: www.pbs.gov.pk]
Let y be the cotton production in hectares during the year 2009, p1 be the proportion of farms with cotton cultivation area greater than 37 hectares during the year 2007 and p2 be the proportion of farms with cotton cultivation area greater than 35 hectares during the year 2008 for 52 districts of Pakistan.
and
We use the following expression to obtain the Percentage Relative Efficiency (PRE):
$$PRE = \frac{MSE(\bar{y}_0)}{MSE(\bar{y}_i)} \times 100,$$
where i = 0, R, P, exp(R), exp(P), KB(RP), SK(DR), SK(DP) and the proposed class of estimators.
The results based on data sets 1–4 are given in Table 2. It is observed from Table 2 that the proposed estimator is more efficient than its competitors under SRS.
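The PRE computation can be sketched as follows, assuming the usual definition in which the MSE of the mean per unit estimator is divided by the MSE of the estimator under study and multiplied by 100. The function name and the MSE values are hypothetical and are not taken from Table 2.

```python
def pre(mse_reference, mse_estimator):
    """Percentage relative efficiency of an estimator against a reference estimator."""
    if mse_estimator <= 0:
        raise ValueError("MSE of the estimator must be positive")
    return 100.0 * mse_reference / mse_estimator

# Hypothetical MSE values for three estimators; "mean" is the reference.
mse_values = {"mean": 0.25, "ratio": 0.09, "product": 0.73}
for name, value in mse_values.items():
    print(name, round(pre(mse_values["mean"], value), 2))
```

A PRE above 100 indicates that the estimator is more efficient than the mean per unit estimator; a PRE below 100 indicates a loss of efficiency.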
Table 2 clearly shows that our generalized proposed class of estimators is better than all existing estimators. The product estimators and
perform poorly in all four populations because of negative correlation. It is also observed that
also performs poorly in populations 1, 3 and 4. From these results, we conclude that the generalized proposed class of estimators is more efficient than the other estimators. The PRE values of some members of the proposed family of estimators are given in Table 3.
Table 3 gives the PRE of the proposed family of estimators in simple random sampling. It is observed that some members of the proposed family of estimators perform poorly because of weak correlation between the study variable and the auxiliary attributes.
Existing estimators in stratified random sampling
Auxiliary information is used to reduce the MSE of various estimators of different population parameters, such as the mean, the variance, and the ratio of two population means or variances. To increase precision, we divide the population into homogeneous groups with respect to some characteristic of interest. Many statisticians have used auxiliary information in the estimation of population parameters in stratified random sampling to improve the efficiency of estimators.
Dalabehera and Sahoo [15] proposed different regression-type estimators in stratified random sampling with two auxiliary variables. Later, Kadilar and Cingi [16] proposed ratio-type estimators in stratified random sampling, obtaining efficient results by extending the estimators of Upadhyaya and Singh [17]. Singh and Kumar [11] used transformed variables in stratified random sampling. Koyuncu and Kadilar [13] also proposed ratio- and product-type estimators using two auxiliary variables in stratified sampling.
Let U = {U1, U2,…,UN} be a finite population of size N and let y, p1 and p2 respectively be the study variable and two auxiliary attributes associated with each unit Uj (j = 1,2,…,N). Assume that the population is stratified into L homogeneous strata, with the hth stratum containing Nh units, where h = 1,2,…,L, such that $\sum_{h=1}^{L} N_h = N$. A simple random sample of size nh is drawn without replacement from the hth stratum such that $\sum_{h=1}^{L} n_h = n$. Let $(y_{hi}, p_{1hi}, p_{2hi})$ be the observed values of y, p1 and p2 on the ith unit of the hth stratum, where i = 1,2,…,nh. Moreover, let $\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h$ and $\bar{Y} = \sum_{h=1}^{L} W_h \bar{Y}_h$ be the sample and population means of y respectively, where $\bar{y}_h$ and $\bar{Y}_h$ are the sample and population means of y in the hth stratum and $W_h = N_h/N$ is the known stratum weight.
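The stratified set-up can be illustrated with a short sketch that computes the stratum weights, a proportional allocation, the stratified sample mean and its variance. The variance expression Σ W_h² (1/n_h − 1/N_h) S²_yh is the standard one for stratified SRSWOR and is assumed here; the function names and the numbers are hypothetical.

```python
def stratum_weights(N_h):
    """W_h = N_h / N for each stratum."""
    N = sum(N_h)
    return [Nh / N for Nh in N_h]

def proportional_allocation(N_h, n):
    """n_h = n * N_h / N, rounded; the rounded sizes may not sum exactly to n."""
    N = sum(N_h)
    return [round(n * Nh / N) for Nh in N_h]

def stratified_mean(N_h, ybar_h):
    """Weighted mean of the stratum sample means."""
    return sum(w * yb for w, yb in zip(stratum_weights(N_h), ybar_h))

def var_stratified_mean(N_h, n_h, S2_yh):
    """Variance of the stratified mean under SRSWOR within each stratum."""
    W = stratum_weights(N_h)
    return sum(
        w**2 * (1.0 / nh - 1.0 / Nh) * s2
        for w, nh, Nh, s2 in zip(W, n_h, N_h, S2_yh)
    )

# Hypothetical two-stratum population.
N_h = [60, 40]
n_h = proportional_allocation(N_h, 10)
print(n_h, stratified_mean(N_h, [10.0, 20.0]), var_stratified_mean(N_h, n_h, [4.0, 9.0]))
```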
Let
such that E(eih) = 0, (i = 0,1,2),
where,
Now we discuss the same existing estimators in stratified random sampling.
1. The usual mean per unit estimator and its MSE in stratified random sampling, are given by:
(31)
and
(32)
2. The usual ratio estimator under stratified random sampling is given by:
(33)
3. The usual product estimator under stratified random sampling is given by:
(34)
The bias and MSE of the ratio estimator are given by:
(35)
and
(36)
Similarly, the bias and MSE of the product estimator are given by:
(37)
and
(38)
4. The Bahl and Tuteja [9] exponential ratio-type and product-type estimators in stratified random sampling are given by:
(39)
and
(40)
The bias and MSE of the exponential ratio-type estimator are given by:
(41)
and
(42)
Similarly, the bias and MSE of the exponential product-type estimator are given by:
(43)
and
(44)
5. Kumar and Bhougal [10] proposed an exponential ratio-product type estimator for the population mean in stratified random sampling, given by:
(45)
where αh is an unknown constant.
The bias and minimum MSE of this estimator are given by:
(46)
and
(47)
The optimum value of αh is given by:
6. Singh and Kumar [11] suggested double ratio and double product type estimators in stratified random sampling, given by:
(48)
and
(49)
The bias and MSE of the double ratio type estimator are given by:
(50)
and
(51)
Similarly, the bias and MSE of the double product type estimator are given by:
(52)
and
(53)
Proposed class of estimators
Along the lines of Shukla et al. [12], we propose a generalized class of estimators in stratified random sampling, given by:
(54)
where,
Substituting different values of Kih (i = 1,2,3,4) in (54), we can generate many more different types of estimators from our general proposed class of estimators, given in Table 4.
Expressing the estimator given in Eq (54) in terms of errors, we have
(55)
where
and
To first order approximation, we have
(56)
Taking the expectation of Eq (56), and squaring Eq (56) and then taking the expectation, to the first order of approximation, we get the bias and MSE:
(57)
and
(58)
Differentiating Eq (58) with respect to σ1h and σ2h and equating the derivatives to zero, we get the optimum values of σ1h and σ2h, i.e.
Substituting the optimum values σ1h(opt) and σ2h(opt) in Eq (58), we get the minimum MSE of the proposed estimator,
which is given by:
(59)
where
is the multiple correlation coefficient of yh on p1h and p2h.
Now, by putting different values of Kih in Eq (54), some members of the proposed class of estimators can be obtained as:
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
The bias and MSE of , are given by:
and
Numerical comparison under stratified random sampling
To assess the performance of the proposed generalized class of estimators relative to the other considered estimators under stratified random sampling, we use the following data sets, which have been used earlier by many authors.
Population 1. [Source: Kadilar and Cingi [16]]
Let y be the apple production amount in 1999, p1 be the proportion of units with more than 20,000 apple trees in 1999 and p2 be the proportion of units with an apple production amount greater than 25,000 in 1998.
In Table 5, the population size is 854 and the total sample size across the strata is 200. Table 5 also gives the coefficients of variation and the correlation coefficients; all correlations among the variables are positive.
Population 2. [Source: Sarndal et al. [18]]
Let y be the population in thousands during 1985, p1 be the population proportion during 1975 which is less than 60 and p2 be the proportion of number of seats in municipal council less than 100. We use proportional allocation.
In Table 6, the population size is 284 and the total sample size across the strata is 68. Table 6 also gives the sample means, coefficients of variation and correlation coefficients. Some of the correlations among the variables are positive and some are negative.
Population 3. [Source: Koyuncu and Kadilar [13]]
Let y be the number of teachers, p1 be the attribute indicating that the number of students in both primary and secondary schools in Turkey in 2007, for 923 districts in six regions, is less than 1000, and p2 be the attribute indicating that the number of students in both primary and secondary schools in Turkey in 2008, for 923 districts in six regions, is less than 200. We use proportional allocation.
In Table 7, the population size is 923 and the total sample size across the strata is 180. Table 7 also gives the sample means, coefficients of variation and correlation coefficients. All correlations among the variables are positive.
Population 4. [Source: Gerard et al. [19]]
Let y be the total taxation in Euros in 2001, p1 be the attribute indicating that the total taxable income in Euros in 2001 is less than its mean and p2 be the attribute indicating that the total average income in Euros in 2001 is less than its mean. We use proportional allocation.
In Table 8, the population size is 589 and the total sample size across the strata is 150. Table 8 also gives the sample means, coefficients of variation and correlation coefficients. All correlations among the variables are negative.
We use the following expression to obtain the Percentage Relative Efficiency(PRE):
where i = 0, Rh, Ph, exp(Rh), exp(Ph),
and
The PRE values of the different estimators for populations 1–4, computed with respect to the usual mean per unit estimator, are given in Table 9. Some members of the proposed class of estimators show poor performance because of negative correlation, particularly the product-type estimators. Overall, the proposed estimators outperform all other considered estimators. The PRE values of some members of the proposed class of estimators are given in Table 10.
From Table 10, we observe that some members of the proposed class of estimators perform poorly. This is expected, because the product-type estimators perform poorly.
Conclusion
In this paper, we proposed an improved class of estimators of the finite population mean that utilizes information on two auxiliary attributes under both simple random sampling (SRS) and stratified random sampling (StRS) schemes. The bias and MSE expressions of the proposed classes of estimators are obtained up to the first order of approximation. It can be seen, both theoretically and numerically, that the proposed class of estimators consistently performs better than the existing estimators considered here under SRS and StRS. Based on these findings, we recommend the proposed estimators for efficient estimation of the population mean in the presence of auxiliary attributes under SRS and StRS schemes.
References
- 1. Wynn HP. An Unbiased Estimator Under SRS of a Small Change in a Proportion. Stat. 1976;25: 225.
- 2. Singh G, Singh RK, Kaur P. Estimating Ratio of Proportions Using Auxiliary Information. Biometrical J. 1986;28: 637–643.
- 3. Naik VD, Gupta PC. A note on estimation of mean with known population proportion of an auxiliary character. Journal Indian Soc Agric Stat. 48: 151–158.
- 4. Solanki RS, Singh HP. Improved estimation of population mean using population proportion of an auxiliary character. Chil J Stat. 2013;4: 3–17.
- 5. Malik S, Singh R. An improved estimator using two auxiliary attributes. Appl Math Comput. 2013;219: 10983–10986.
- 6. Zeng Q, Guo Q, Wong SC, Wen H, Huang H, Pei X. Jointly modeling area-level crash rates by severity: a Bayesian multivariate random-parameters spatio-temporal Tobit regression. Transp A Transp Sci. 2019;15: 1867–1884.
- 7. Zeng Q, Wen H, Huang H, Pei X, Wong SC. A multivariate random-parameters Tobit model for analyzing highway crash rates by injury severity. Accid Anal Prev. 2017;99: 184–191. pmid:27914307
- 8. Zeng Q, Wen H, Wong SC, Huang H, Guo Q, Pei X. Spatial joint analysis for zonal daytime and nighttime crash frequencies using a Bayesian bivariate conditional autoregressive model. J Transp Saf Secur. 2020;12: 566–585.
- 9. Bahl S, Tuteja RK. Ratio and Product Type Exponential Estimators. J Inf Optim Sci. 1991;12: 159–164.
- 10. Kumar S, Bhougal S. Estimation of the Population Mean in Presence of Non-Response. Commun Stat Appl Methods. 2011;18: 537–548.
- 11. Singh HP, Kumar S. A regression approach to the estimation of the finite population mean in the presence of non-response. Aust New Zeal J Stat. 2008;50: 395–408.
- 12. Shukla D, Pathak S, Thakur NS. A transformed estimator for estimation of population mean with missing data in sample-surveys. J Curr Eng Res. 2.
- 13. Koyuncu N, Kadilar C. Family of estimators of population mean using two auxiliary variables in stratified random sampling. Commun Stat Methods. 2009;38: 2398–2417.
- 14. Singh S. Advanced Sampling Theory With Applications: How Michael "Selected" Amy. Springer Science & Business Media; 2003.
- 15. Dalabehera M, Sahoo LN. A new estimator with two auxiliary variables for stratified sampling. Statistica. 59: 101–107.
- 16. Kadilar C, Cingi H. Ratio estimators in stratified random sampling. Biometrical J. 2003;45: 218–225.
- 17. Upadhyaya LN, Singh HP. Use of transformed auxiliary variable in estimating the finite population mean. Biometrical J. 1999;41: 627–636. pmid:25855820
- 18. Särndal C-E, Swensson B, Wretman J. Model assisted survey sampling. Springer Science & Business Media; 2003.
- 19. Gérard M, Jayet H, Paty S. Tax interactions among Belgian municipalities: Do interregional differences matter? Reg Sci Urban Econ. 2010;40: 336–342.