Multivariate Multi-Objective Allocation in Stratified Random Sampling: A Game Theoretic Approach

We consider the problem of multivariate multi-objective allocation where no or limited information is available about the within-stratum variances. Results show that a game theoretic approach (based on weighted goal programming) can be applied to sample size allocation problems. We use a simulation technique to determine the payoff matrix and to solve a minimax game.


Introduction
The choice of a sampling plan is fundamental to any statistical study because it provides estimates of population parameters. Sample size allocation to each stratum is necessary in a stratified random sampling design. An optimum allocation can be applied to each characteristic only if sufficient information about stratum variability is available. Otherwise, the optimization technique can lead to misleading results because of limited information about cost and variance, and the precision (1/variance) and cost actually realized during implementation may differ from those planned. For more discussion, see [1-6].
In this study, we propose a multivariate game theoretic approach for the sample size allocation problem in stratified random sampling design. Many techniques are used for allocating a sample, such as proportional allocation and optimal allocation. These techniques are suitable when a sampling frame and other relevant information, such as the population variances, are available. The proposed technique is intended for cases where i) no or limited information about the population is available, ii) one cannot be optimistic that the sample results will hold on average, iii) there may be high variability among sampling units, or iv) one wants to deal with an adverse-case scenario regarding variation in the sample.
In such games the proportional allocation technique is computationally feasible and generally applied [7]. The within-stratum variance is vital for a game theoretic approach. In the univariate case, [8] formulated a minimax allocation problem which is a function of specified minimum upper bounds for each of the stratum variances.
[9] presented "a game theoretic formulation for the multivariate case, where the covariances between pairs of responses are supposed to be constant from stratum to stratum". Moreover, these strategies are functions of the stratum variances and covariances. [10] discussed an optimum allocation for a multivariate design that "minimizes the cost of obtaining estimates with smaller errors than previously specified numbers with confidence level". The author also showed that variance information could be useful for obtaining a nearly optimum allocation.
[11] obtained posterior variances by using prior information on both the mean and the variance. [12] and [13] proposed a "method of allocation in multivariate surveys where various stratum variances are assumed to be known". It minimizes the cost of obtaining estimates with variances smaller than those of its predecessor.
In the game theory literature, many authors have discussed various models of two-player games. A traveler's dilemma game (TDG) model on two coupled lattices was presented by [14], who investigated the effects of coupling on cooperation. A simulation study of this model indicates that cooperation behavior varies over lattices. A two-player game between a cooperator and a defector was discussed in [15], in which utility coupling was simulated on a weighted lattice. Another two-player game on square lattices, using different weights for the available strategies, was modeled in [16]. A risk aversion model was presented in [17] for the case where a player's participation is probabilistic. [18] modeled a two-player game that considers reputation and behavior diversity, which vary over the strategy space. Simulation results show that cooperation behavior is influenced by the reputation index.
Allocation in multivariate surveys must be optimum for all characteristics. Examples include allocations that minimize the cost vector or the variance functions, or that maximize the relative efficiency compared with other allocations. A detailed discussion is given in [19-32].
The second section explains the sampling notation. We set up a multi-objective game allocation problem in Section 3. Section 4 explains the methodology of our approach, and a discussion of the results is given in Section 5.

Population
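Suppose we have a population of size $N$ which is divided into $L$ mutually exclusive strata, where $N = \sum_{h=1}^{L} N_h$. Consider a data set $Y_{jhi}$ for $j = 1, 2, \ldots, Q$ characteristics and $h = 1, 2, \ldots, L$ strata with $i = 1, 2, 3, \ldots, N_h$ sampling units in the $h$th stratum. $\bar{Y}_{jh} = \frac{1}{N_h} \sum_{i=1}^{N_h} Y_{jhi}$ is the population mean of the $j$th characteristic in the $h$th stratum.
If $W_h = N_h / N$ is the weight of the $h$th stratum, the population variance $S_{jh}^2$ of the $j$th study variable in the $h$th stratum can be calculated as
\[ S_{jh}^2 = \frac{1}{N_h - 1} \sum_{i=1}^{N_h} \left( Y_{jhi} - \bar{Y}_{jh} \right)^2 . \]
We draw a simple random sample of size $n_h$ independently from each stratum such that $\sum_{h=1}^{L} n_h = n$. Let $\bar{y}_{jst}$ be the stratified estimator of the population mean $\bar{Y}_j$ of characteristic $j$, which is given as
\[ \bar{y}_{jst} = \sum_{h=1}^{L} W_h \bar{y}_{jh}, \qquad \bar{y}_{jh} = \frac{1}{n_h} \sum_{i=1}^{n_h} y_{jhi} . \]
The variance of $\bar{y}_{jst}$ is
\[ V(\bar{y}_{jst}) = \sum_{h=1}^{L} W_h^2 S_{jh}^2 \left( \frac{1}{n_h} - \frac{1}{N_h} \right) . \]
Ignoring the term independent of $n_h$, we have
\[ V(\bar{y}_{jst}) = \sum_{h=1}^{L} \frac{W_h^2 S_{jh}^2}{n_h} . \tag{1} \]
If our interest lies in the squared coefficient of variation instead of the variance, we can use the following expression:
\[ CV^2(\bar{y}_{jst}) = \frac{V(\bar{y}_{jst})}{\bar{Y}_j^2} . \]
Substituting the value of $V(\bar{y}_{jst})$ from Eq (1) in the above equation, we have
\[ CV^2(\bar{y}_{jst}) = \sum_{h=1}^{L} \frac{W_h^2 S_{jh}^2}{n_h \bar{Y}_j^2} . \tag{2} \]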
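As a small illustration of the notation, the sketch below computes the squared coefficient of variation of the stratified mean for one characteristic, using the standard expression $\sum_h W_h^2 S_{jh}^2 / n_h$ for the variance (finite population term ignored) divided by the squared population mean. The function name `squared_cv` and the data are assumptions made for this example only and are not the study data.

```python
from statistics import mean, variance

def squared_cv(strata, n_alloc):
    """Squared CV of the stratified mean for one characteristic.

    strata  : list of lists, population values of the characteristic per stratum
    n_alloc : list of sample sizes n_h, one per stratum
    """
    N = sum(len(s) for s in strata)
    weights = [len(s) / N for s in strata]            # W_h = N_h / N
    pop_mean = sum(w * mean(s) for w, s in zip(weights, strata))
    # variance() uses the N_h - 1 divisor, matching S_h^2
    var_st = sum(w ** 2 * variance(s) / n
                 for w, s, n in zip(weights, strata, n_alloc))
    return var_st / pop_mean ** 2

# Hypothetical population with two strata (illustrative values only).
strata = [[10, 12, 14, 16], [20, 22, 24, 26, 28, 30]]
print(squared_cv(strata, [2, 3]))
```

Increasing any $n_h$ lowers the value, which is why the allocation matters once the total sample size or budget is fixed.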

Game setting in a multi-objective allocation problem
We draw a simple random sample from all strata such that $\sum_{h=1}^{L} n_h = n$, assuming a finite population. The objective is to minimize some vector function of the coefficients of variation (CV) while allocating the sample to all strata. For a single characteristic, say $j$, the squared CV of the mean estimator is given by Eq (2).
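For reference, when the stratum standard deviations of a single characteristic are known, minimizing $\sum_h W_h^2 S_{jh}^2 / n_h$ subject to $\sum_h n_h = n$ gives the classical Neyman allocation, $n_h \propto W_h S_{jh}$. The following is a minimal sketch of that known-variance baseline; the function name and the numbers are illustrative assumptions and do not come from the study data.

```python
def neyman_allocation(weights, std_devs, n):
    """Real-valued n_h proportional to W_h * S_h, summing to n."""
    products = [w * s for w, s in zip(weights, std_devs)]
    total = sum(products)
    return [n * p / total for p in products]

# Hypothetical two-stratum case (values chosen only for illustration).
allocation = neyman_allocation(weights=[0.4, 0.6], std_devs=[2.0, 1.0], n=10)
print(allocation)
```

In practice the real-valued result has to be rounded to integers, and it is only available when the $S_{jh}$ are known, which is the situation the game theoretic approach is meant to avoid.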
In particular, an optimum allocation of a sample of size $n$ is a choice of the $n_h$ that minimizes Eq (2) subject to the restriction that $\sum_{h=1}^{L} n_h = n$. An optimum allocation can only be computed if $S_{jh}^2$ and $\bar{Y}_j$ are known [33]. Otherwise, we can use the unbiased estimators $s_{jh}^2 = \frac{1}{n_h - 1} \sum_{i=1}^{n_h} (y_{jhi} - \bar{y}_{jh})^2$, where $\bar{y}_{jh}$ is the sample mean of characteristic $j$ in stratum $h$, and $\bar{y}_{jst}$ of $S_{jh}^2$ and $\bar{Y}_j$, respectively. Let $z_{jh}$ denote the squared CV computed from the sample in this way, as in Eq (3).

Players: Sampler (player 1) and Adversary (player 2)
If we consider the sampler as player 1, then $z_{jh}$ from Eq (3) is his loss in a zero-sum game against the Adversary (player 2) for characteristic $j$ in stratum $h$. The sampler seeks an allocation that is a good strategy for playing this game, minimizing some vector $Z_h$ for all $h = 1, 2, \ldots, L$. The space of strategies (allocations) $\tilde{n} = [n_1, n_2, n_3, \ldots, n_L]$ available to the sampler is denoted by $\nu$. The Adversary then selects an independent sample from each stratum according to the strategy offered by the sampler. The objective of the Adversary is to choose the vector $Z_h = [z_{1h}, z_{2h}, z_{3h}, \ldots, z_{Qh}]$ from each stratum $h \in \{1, 2, 3, \ldots, L\}$ so as to maximize it. This is a seemingly natural way to proceed and may lead to interesting results. The Adversary's strategies are the solutions of a multi-objective goal program that maximizes the vector $Z_h$ within each stratum for a particular $n_h$, $h \in \{1, 2, 3, \ldots, L\}$. The Adversary's strategy space is denoted by $\Delta$.

Payoff matrix of Sampler (player 1)
While playing a zero-sum game, each player tries to optimize his gain or loss. The minimax idea is to minimize the possible loss in a worst-case (maximum loss) scenario. A minimax strategy may be a mixed strategy. Both players choose among alternative strategies and make simultaneous moves. The idea can also be extended to more complex games. The sampler would like to minimize the vector $Z = [z_1, z_2, z_3, \ldots, z_L]$, where $z_h$ is defined above.
The payoff of the sampler is the gain of the Adversary, which can be determined by the following multi-objective program: maximize $z_h = (z_{1h}, z_{2h}, z_{3h}, \ldots, z_{Qh})$. This can be equivalently written as a matrix $S_{\nu \times \Delta}$. Each row of $S$ represents the loss of the sampler for a possible allocation, and each column of $S$ represents the gain of the Adversary for a strategy offered by the sampler.
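To make the role of the payoff matrix concrete, the sketch below finds a pure minimax strategy in a small loss matrix whose rows are sampler allocations and whose columns are Adversary strategies. The matrix entries and function names are hypothetical and serve only to illustrate the row-maximum then minimum logic.

```python
def minimax_row(payoff):
    """Return (row index, value) of the row with the smallest maximum loss."""
    row_max = [max(row) for row in payoff]
    best = min(range(len(payoff)), key=row_max.__getitem__)
    return best, row_max[best]

def maximin_value(payoff):
    """Largest column minimum: the gain the column player can guarantee."""
    col_min = [min(col) for col in zip(*payoff)]
    return max(col_min)

# Hypothetical 3 x 3 loss matrix (rows: allocations, columns: Adversary strategies).
S = [[3, 5, 2],
     [4, 1, 6],
     [2, 3, 4]]
print(minimax_row(S), maximin_value(S))
```

A saddle point in pure strategies exists only when the two values coincide; otherwise the value of the game is attained in mixed strategies.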

Minimax game for allocation
Assume that the sampler and the Adversary each choose a strategy. This implies that the sampler will pick an allocation vector $\tilde{n} = [n_1, n_2, n_3, \ldots, n_L] \in \nu$ with $\sum_{h=1}^{L} n_h = n$, and the Adversary has to pick a sample of actual data according to Eq (6).
The Adversary will choose a strategy that maximizes $z_h = (z_{1h}, z_{2h}, z_{3h}, \ldots, z_{Qh})$ for all $h = 1, 2, \ldots, L$. Therefore, the sampler's objective is to minimize his maximum (worst) value within the available budget while allocating a sample of size $n$ to all strata. Obviously, a larger sample will produce better results if there is no restriction on the budget. The optimal program considers all possible choices of sample, where the Adversary can choose his strategies independently. In summary, the objective of the sampler under the budgetary condition is given by program Eq (7), for $h = 1, 2, \ldots, L$ and $\tilde{n} \in \nu$.
Theorem: In the game described above, i.e., $(\nu, \Delta, CV^2)$:
• A good strategy for the Adversary is …
• An optimal solution $Z$ exists in the allocation problem game, as described in program Eq (7), where $Z$ is the value of the game.
We allow our generic goal program to have $Q$ goals, indexed by $j = 1, \ldots, Q$. The $n_h$ are the decision variables. These are the factors which the decision maker(s) control and which determine the decisions to be made. Each goal has an achieved value, $z_{jh}$, on its underlying criterion; $z_{jh}$ is a function of the decision variables for the $j$th goal. The whole situation can be expressed as a goal program, which becomes a Weighted Goal Programming (WGP) model if $f_{1h}, f_{2h}, \ldots, f_{Qh}$ represent weighted functions in their respective priority. The WGP is formulated to maximize a composite objective function formed as a weighted sum of coefficients of variation in the respective strata.
The optimal strategy for the sampler is the $\tilde{n} = [n_1, n_2, n_3, \ldots, n_L] \in \nu$ obtained from model (7). This holds for any strategy that the sampler would choose, as the Adversary will sample from every stratum to maximize model (6). Therefore, it is a minimal sampling scheme.

Numerical Illustration
This idea of sample selection is applied to real data on Master of Philosophy inductions (Table 1) into the Department of Statistics, Quaid-e-Azam University, Islamabad, Fall 2014. Stratum 1 comprises inductees from other universities and stratum 2 comprises QAU graduate inductees. The data represent the 'test plus interview' marks and the 'academic record' marks. A stratified random sample is to be selected from the given data. The cost of selecting a sampling unit is Rs. 2000 from stratum 1 and Rs. 1000 from stratum 2 (an estimate of the traveling cost in local units, for sampling purposes only). Suppose we have a budget of Rs. 15000 only, and there is no initial cost of sampling, i.e., $C_0 = 0$.
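The budget restriction determines which allocations $(n_1, n_2)$ the sampler can offer. The sketch below enumerates them for the stated unit costs (Rs. 2000 and Rs. 1000) and budget (Rs. 15000). It assumes at least two units per stratum so that a sample variance can be computed, and it does not cap $n_h$ at the stratum sizes, since those are not restated here; both choices are assumptions of this example.

```python
def feasible_allocations(costs, budget, min_per_stratum=2):
    """All (n1, n2) with c1*n1 + c2*n2 <= budget and n_h >= min_per_stratum."""
    c1, c2 = costs
    result = []
    n1 = min_per_stratum
    while c1 * n1 + c2 * min_per_stratum <= budget:
        max_n2 = (budget - c1 * n1) // c2
        for n2 in range(min_per_stratum, max_n2 + 1):
            result.append((n1, n2))
        n1 += 1
    return result

allocations = feasible_allocations(costs=(2000, 1000), budget=15000)
# Allocations that spend the whole budget.
exhausting = [(n1, n2) for n1, n2 in allocations
              if 2000 * n1 + 1000 * n2 == 15000]
print(len(allocations), exhausting)
```

Under these assumptions the allocation (6, 3) is among those that spend the full budget.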

Computation of payoff matrix
We use model (6) to compute the payoff matrix. Let the two characteristics be the test and interview marks (T & I) and the academic record marks (Ac. Rec.). Both have equal importance because the total marks are the selection criterion. Model (6) is applied for $h = 1$ and $2$. We compute the payoff matrix of the sampler for the various combinations of $(n_1, n_2)$ that satisfy $\tilde{n} \in \nu$. The above formulation can be expressed as;

Subject to
The problem arises where the Adversary requires a sample of actual values to maximize the sum of $CV_h^2$ over all characteristics $j = 1, 2$ for $\tilde{n} \in \nu$. This is feasible under the given cost; however, we choose a simulation technique for this purpose. We have sampled more than twice the total number of possible samples, $N_h^{n_h}$. The population is known and finite, and sampling is done with replacement for the characteristic vector as well as for all possible sample sizes under the budgetary restriction. We are able to run a loop of at most $20 \times 10^6$ randomly selected samples. The simulation returns the maximum value of the sum of $CV_h^2$ for both characteristics $j = 1, 2$ over the whole simulation loop. Results are given in Table 2. Our simulation technique is different
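The simulation step can be sketched as follows: for one stratum and one sample size $n_h$, draw many samples with replacement and keep the largest equally weighted sum of squared CVs over the characteristics. This is a minimal sketch under stated assumptions: the per-characteristic term is taken as $s_{jh}^2 / (n_h \bar{y}_{jh}^2)$, which may differ from the exact expression used in the study (for example in the stratum weight or the mean used), and the function name, data, and number of draws are hypothetical.

```python
import random
from statistics import mean, variance

def worst_case_cv2(stratum_units, n_h, draws=10_000, seed=0):
    """Largest simulated sum over characteristics of s^2 / (n_h * ybar^2).

    stratum_units : list of tuples, one tuple of Q positive values per unit
    n_h           : sample size drawn with replacement (must be >= 2)
    """
    rng = random.Random(seed)
    q = len(stratum_units[0])
    worst = 0.0
    for _ in range(draws):
        sample = rng.choices(stratum_units, k=n_h)   # with replacement
        total = 0.0
        for j in range(q):
            values = [unit[j] for unit in sample]
            # Assumed form of the squared CV for characteristic j
            total += variance(values) / (n_h * mean(values) ** 2)
        worst = max(worst, total)
    return worst

# Hypothetical stratum with two characteristics per unit (not the study data).
units = [(60, 70), (55, 80), (72, 65), (48, 90), (66, 75)]
print(worst_case_cv2(units, n_h=3, draws=2000, seed=1))
```

Because the maximum is taken over random draws, repeated runs with different seeds can return different values, and longer loops can only increase the reported maximum.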

Solution of minimax game
For the outer segment of model (7), we can use any suitable goal programming technique discussed in [22, 23, 28-31, 34, 35]. Program Eq (7) can be expressed as a Weighted Goal Programming (WGP) model, where $(z_1^*, z_2^*)$ are the sums of the optimal Adversary's objectives for stratum 1 and stratum 2, respectively. The objective function can be written more precisely as (2). This program is run in the General Algebraic Modeling System (GAMS) to obtain the optimum results, which are highlighted in Table 2.
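For a small payoff table, the outer minimization can also be illustrated by direct enumeration rather than a solver. The sketch below assumes the composite objective is $\sum_h W_h^2 z_h^*$, in line with the weight vector $W_h^2$ mentioned for the minimax game; the payoff values and stratum weights are invented for illustration and are not the entries of Table 2.

```python
def best_allocation(payoff, weights):
    """Pick the allocation minimizing sum_h W_h^2 * z_h*.

    payoff  : dict mapping an allocation tuple to (z_1*, ..., z_L*)
    weights : stratum weights W_h
    """
    def score(alloc):
        return sum(w ** 2 * z for w, z in zip(weights, payoff[alloc]))
    best = min(payoff, key=score)
    return best, score(best)

# Hypothetical worst-case payoffs per allocation (illustrative values only).
payoff = {
    (2, 11): (0.090, 0.001),
    (4, 7): (0.050, 0.004),
    (6, 3): (0.030, 0.030),
}
print(best_allocation(payoff, weights=(0.4, 0.6)))
```

A different weight vector can change which allocation is selected, which is one reason the choice of weights matters for the outer problem.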

Discussion on results
For our example, we found that the total weighted variation (sum over characteristics) in the first stratum ranges from 0.0305 to 0.093 and is, as expected, lower for larger sample sizes. In the second stratum it ranges from 0.00105 to 0.0335. These results are based on simulation and may differ on another run [36]. We simulated the results for a large number of samples (more than 20 million in some cases). Comparing fluctuations in the two characteristics, the results fluctuate more in the first characteristic ('test plus interview' marks) than in the second ('academic record' marks) in both strata (see Table 2). The optimal value of this game is 0.01299275, with optimal sampling strategy (6, 3).
In the literature, sample selection is frequently discussed for the case where the sampling frame is known. Our methodology is suitable even if the sampling frame is unavailable. It addresses the adverse-case scenario, whereas the focus is generally on minimizing the estimates of variation. This sampling strategy shows another side of the picture.
The limitations of this study are as follows. First, we have chosen weighted goal programming to solve the inner maximization problem that determines the payoff matrix of the sampler. However, one can apply various other methods, such as lexicographic, extended lexicographic, and fuzzy programming and the value function technique. Results may be even more interesting for a different selection of weight criterion. Second, we use the standard weight vector $W_h^2$ to solve the minimax game; however, various other weight vectors may be used for the outer minimization problem.