A Model of Two-Way Selection System for Human Behavior

Two-way selection is a common phenomenon in nature and society. It appears in processes such as mate choice between men and women, contracting between job hunters and recruiters, and trading between buyers and sellers. In this paper, we propose a model of a two-way selection system and present its analytical solution for the expected total number of successful matches. We also derive a regular pattern: the matching rate tends toward inverse proportionality to either the ratio between the two sides or the ratio of the total number of states to the size of the smaller group. The proposed model is verified against empirical data from matchmaking fairs. The results indicate that the model predicts this typical real-world two-way selection behavior within a bounded error, so it is helpful for understanding the dynamical mechanisms of real-world two-way selection systems.


Introduction
Human-initiated systems always run in a complex way. In the past ten years, related work has mainly focused on the temporal and spatial distribution characteristics of human activity patterns. Because of the complexity of human behavior, many underlying mechanisms have not yet been discovered. Two-way selection among humans is one of the complicated but common phenomena of daily life. It happens in processes such as mate choice between men and women, contracting between job hunters and recruiters, and trading between buyers and sellers. In a sense, two-way selection can be regarded as the basis of many social relationships. Generally, the participants in a two-way selection process are first classified into two groups by their natural status. They then observe and study the characteristics of the people on the other side, and finally make their choices. For instance, in the case of marriage, one's appearance, personality, wealth, and sense of humor are commonly taken into consideration. Besides individual characters, impersonal factors also exert an influence, e.g. the number of members on each side and their ratio. How many characters are inspected and selected on deeply affects the result of a selection process. However, it is usually difficult to compare and distinguish these characteristics quantitatively, or even qualitatively, through traditional methods such as psychological tests and social surveys.
The well-known marriage game in statistical physics has been studied in [1-6]; its main novel concept is the stability of marriages. This viewpoint aims to find a stable matching between the two sets of men and women. In such a model, everyone in the two sets ends up married and the final marriage relationships are "stable". However, the internal mechanism of a two-way selection system can be modeled in another way: not all of the participants have to be matched in one trial of the process, i.e. some of them succeed in matching while the others do not. This mechanism could help with some social problems, such as predicting the total number of friendships or other gregarious relations [7][8][9][10][11][12][13][14][15]. In this paper, we present a model of two-way selection to investigate the factors influencing the matching rate. Data from matchmaking fairs are analyzed to support our model. Based on this model, a method for estimating the number of factors that affect people's decisions is also proposed.

The Model and Analytical Results
Our model of two-way selection is stated as follows: i) The system has two sets of agents, A and B, containing $k_1$ and $k_2$ agents, respectively. ii) The $i$th agent in set A (or set B) has its own character, denoted by $c_{Ai}$ (or $c_{Bi}$). Correspondingly, the character that the $i$th agent attempts to select is denoted by $s_{Ai}$ (or $s_{Bi}$). iii) Without loss of generality, the agents' characters are denoted by integers. Assume the characters have $n$ types, i.e. $c_{Ai}, c_{Bj}, s_{Ai}, s_{Bj} \in \Omega$, $\Omega = \{1, 2, \ldots, n\}$, $i \in \{1, 2, \ldots, k_1\}$, $j \in \{1, 2, \ldots, k_2\}$. In one trial of the model, $c_{Ai}$, $c_{Bj}$, $s_{Ai}$ and $s_{Bj}$ each pick an element of $\Omega$ following the uniform distribution. iv) The condition for a successful matching of two agents $A_i$ and $B_j$ is $c_{Ai} = s_{Bj}$ and $s_{Ai} = c_{Bj}$. That is, when agent $A_i$'s character meets agent $B_j$'s requirement and vice versa, agent $A_i$ and agent $B_j$ have a successful matching.
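The four rules above can be turned into a small Monte Carlo sketch. The function names `simulate_trial` and `mean_pairs` are illustrative, not taken from the paper, and the code assumes that each agent can join at most one pair, so a state shared by several agents contributes the smaller of the two compatible counts.

```python
import random
from collections import Counter


def simulate_trial(k1, k2, n, rng):
    """Run one trial of the model and return the number of matched pairs."""
    # Each agent draws (own character c, sought character s) uniformly from 1..n.
    a_states = Counter((rng.randint(1, n), rng.randint(1, n)) for _ in range(k1))
    b_states = Counter((rng.randint(1, n), rng.randint(1, n)) for _ in range(k2))
    # An A agent in state (c, s) is compatible with B agents in state (s, c).
    # Assumption: one partner per agent, so each state yields
    # min(count in A, count of the mirrored state in B) pairs.
    return sum(min(count, b_states[(s, c)]) for (c, s), count in a_states.items())


def mean_pairs(k1, k2, n, trials=200, seed=0):
    """Average number of matched pairs over independent trials."""
    rng = random.Random(seed)
    return sum(simulate_trial(k1, k2, n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # Example sizes chosen only for illustration.
    print(mean_pairs(100, 200, 10))
```

Grouping agents by state with a `Counter` avoids comparing every A agent with every B agent, which keeps one trial linear in $k_1 + k_2$.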
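For given $k_1$, $k_2$ and $n$, the expectation $E$ of the total number of matching pairs in the model is

$$E = n^2 \sum_{i=0}^{k_1} \sum_{j=0}^{k_2} \min(i,j) \binom{k_1}{i} \binom{k_2}{j} \left(\frac{1}{n^2}\right)^{i+j} \left(1-\frac{1}{n^2}\right)^{k_1+k_2-i-j}. \tag{1}$$

According to

$$\lim_{k\to\infty} \binom{k}{i} p^i (1-p)^{k-i} = \frac{\lambda^i}{i!} e^{-\lambda}, \qquad p = \frac{1}{n^2}, \quad \lambda = kp \ \text{held fixed},$$

Equation (1) can be approximated as

$$E \approx n^2 \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \min(i,j)\, \frac{\lambda_1^i}{i!} e^{-\lambda_1}\, \frac{\lambda_2^j}{j!} e^{-\lambda_2}, \tag{2}$$

where $\lambda_1 = k_1/n^2$ and $\lambda_2 = k_2/n^2$. When $n^2 \gg k_2 \geq k_1$, Equation (2) can be simplified as

$$E \approx \frac{k_1 k_2}{n^2}. \tag{3}$$

[...] Equation (2) can be simplified as [...] (5). Because in this case $k_2$ is very large, Equation (5) can be further simplified as

$$E \approx k_1. \tag{6}$$

Define

$$\eta = \frac{k_2}{k_1}, \qquad \xi = \frac{n^2}{k_1}, \qquad P = \frac{E}{(k_1+k_2)/2}, \tag{7}$$

where $\eta$ denotes the ratio of $k_2$ to $k_1$; $\xi$ denotes the ratio of $n^2$ to $k_1$; and $P$ denotes the estimated ratio of successful matching pairs to the average number of agents of the two types. Then Equation (2) can be transformed into

$$P \approx \frac{2\xi}{1+\eta} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \min(i,j)\, \frac{(1/\xi)^i}{i!} e^{-1/\xi}\, \frac{(\eta/\xi)^j}{j!} e^{-\eta/\xi}. \tag{8}$$

Equation (3) can be written as

$$P \approx \frac{2\eta}{(1+\eta)\,\xi}, \qquad 1 \leq \eta \ll \xi. \tag{9}$$

Equation (6) can be written as

$$P \approx \frac{2}{1+\eta}, \qquad 1 \leq \xi \ll \eta \ \text{or} \ \xi \ll 1 \leq \eta. \tag{10}$$

Figure 1 shows the comparison between the analytical predictions of (1) and the simulation results. Figure 2(a) shows the comparison between the analytical predictions of (9) and the simulation results under the condition $1 \leq \eta \ll \xi$, and displays a power-law relation with the exponent $-1$ between $P$ and $\xi$. Figure 2(b) shows the comparison between the analytical predictions of (10) and the simulation results under the condition $1 \leq \xi \ll \eta$, and displays a power-law relation with the exponent $-1$ between $P$ and $\eta$. Figure 2(d) shows the comparison between the analytical predictions of (10) and the simulation results under the condition $\xi \ll 1 \leq \eta$, and displays the same power-law relation as in Figure 2(b). The analytical predictions and the simulation results are consistent with each other, which supports the reliability of the analytical results.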
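As a numerical cross-check of the limiting behaviours described above, the expectation can be evaluated directly. This is a sketch under stated assumptions rather than the authors' procedure: it treats the number of A agents in a state and the number of B agents in the mirrored state as independent binomial counts $X$ and $Y$ with success probability $1/n^2$, and uses the identity $E[\min(X,Y)] = \sum_{m \geq 1} P(X \geq m)\,P(Y \geq m)$. The function names, and the two limiting expressions $2\eta/((1+\eta)\xi)$ and $2/(1+\eta)$ used for comparison, are written out here for illustration.

```python
import math


def binom_pmf(k, i, p):
    """P(X = i) for X ~ Binomial(k, p), computed in log space for stability."""
    if p >= 1.0:  # degenerate case n = 1: every agent is in the same state
        return 1.0 if i == k else 0.0
    log_pmf = (math.lgamma(k + 1) - math.lgamma(i + 1) - math.lgamma(k - i + 1)
               + i * math.log(p) + (k - i) * math.log1p(-p))
    return math.exp(log_pmf)


def tail_probs(k, p, m_max):
    """Return t with t[m] = P(X >= m) for m = 0..m_max (requires m_max <= k)."""
    tail, acc = [0.0] * (m_max + 1), 0.0
    for i in range(k, -1, -1):  # accumulate from the far tail downwards
        acc += binom_pmf(k, i, p)
        if i <= m_max:
            tail[i] = min(acc, 1.0)
    return tail


def expected_pairs(k1, k2, n):
    """Expected number of matched pairs, n^2 * E[min(X, Y)]."""
    p = 1.0 / n ** 2
    m_max = min(k1, k2)
    ta, tb = tail_probs(k1, p, m_max), tail_probs(k2, p, m_max)
    return n ** 2 * sum(ta[m] * tb[m] for m in range(1, m_max + 1))


def matching_rate(k1, k2, n):
    """P: matched pairs divided by the average size of the two groups."""
    return 2.0 * expected_pairs(k1, k2, n) / (k1 + k2)


if __name__ == "__main__":
    k1 = 100
    for k2, n in [(200, 100), (10000, 14)]:  # illustrative parameter choices
        eta, xi = k2 / k1, n ** 2 / k1
        print(eta, xi, matching_rate(k1, k2, n),
              2 * eta / ((1 + eta) * xi),  # limit for 1 <= eta << xi
              2 / (1 + eta))               # limit for xi << eta
```

The tail-sum identity replaces the double sum over $(i, j)$ with a single sum, so the cost grows linearly rather than quadratically with the group sizes.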
Consider a special case $k_1 = k_2 = k$, resulting in $\eta = 1$. On the one hand, Equation (9) can be simplified as

$$P \approx \frac{1}{\xi}. \tag{11}$$

The relation between $P$ and $\xi$ approximates a power law with the exponent $-1$, and this case is shown in Figure 2(c). Equation (10) can be simplified as $P \approx 1$, suggesting that almost all of the agents can match successfully under the condition $\xi \ll 1 \leq \eta$. On the other hand, because the condition $k_1 = k_2 = k$ results in $\lambda_1 = \lambda_2$, from (5) we can obtain

$$E \approx k - \sqrt{n^2/\pi}\,\sqrt{k}. \tag{12}$$

The second term of (12) is the number of agents of type A (or type B) that cannot successfully match. The larger $k$ is, the smaller $\sqrt{n^2/\pi}\,\sqrt{k}/k$ is. In reality, this is a result of fluctuation. The combinations of the "own" state and the "expecting" state of an agent have $n^2$ possibilities in the model. In theory, the expected number of appearances of each state is $k/n^2$. However, due to fluctuations, the actual frequency of almost every state deviates from $k/n^2$. As a result, some agents cannot successfully match. The number of times that each state appears obeys the binomial distribution. The fluctuation is closely related to the standard deviation; from the binomial distribution, the standard deviation equals $\sqrt{k(1-1/n^2)/n^2}$, which is directly proportional to $\sqrt{k}$. Thus, the number of agents that cannot successfully match is also proportional to $\sqrt{k}$. This explains the relationship between the second term of (12) and $\sqrt{k}$. From (12), we know that the proportionality coefficient is $\sqrt{n^2/\pi}$.
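One way to see where a coefficient of $\sqrt{n^2/\pi}$ can come from is the following sketch. It is not necessarily the derivation behind (12): it assumes $k \gg n^2$, so that the number $X$ of A agents in a given state and the number $Y$ of B agents in the mirrored state are both approximately normal with mean and variance $\lambda = k/n^2$, and it uses the standard fact that $\mathbb{E}|Z| = \sigma\sqrt{2/\pi}$ for $Z \sim \mathcal{N}(0, \sigma^2)$.

```latex
\begin{align*}
\min(X, Y) &= \tfrac{1}{2}\bigl(X + Y - |X - Y|\bigr), \\
X - Y &\approx \mathcal{N}(0,\, 2\lambda), \qquad \lambda = k/n^{2}, \\
\mathbb{E}\,|X - Y| &\approx \sqrt{2/\pi}\,\sqrt{2\lambda} = 2\sqrt{\lambda/\pi}, \\
E = n^{2}\,\mathbb{E}[\min(X, Y)] &\approx n^{2}\bigl(\lambda - \sqrt{\lambda/\pi}\bigr)
  = k - \sqrt{n^{2}/\pi}\,\sqrt{k}.
\end{align*}
```

Under these assumptions the unmatched count per state is half the typical difference between two equally distributed counts, which is why it scales with the standard deviation and hence with $\sqrt{k}$.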

The Verification Between the Model and Experimental Data
Figure 2. In the four sub-figures, the parameter $k_1 = 100$; the squares are the simulation data; the solid lines are analytical predictions. In (a), $\eta = 2$; $\xi$ is assigned values from 100 to 1000; the solid line is obtained from (9). In (b), $\xi = 2$; $\eta$ is assigned values from 10 to 1000; the solid line is obtained from (10). In (c), $\eta = 1$; $\xi$ is assigned values from 10 to 100; the solid line is obtained from (11). In (d), $\xi = 0.1$; $\eta$ is assigned values from 10 to 1000; the solid line is obtained from (10). doi:10.1371/journal.pone.0081424.g002

As mate choice between men and women is a typical real-world two-way selection system, eighty-two reported records of matchmaking fairs are analyzed to verify our model. Due to the uncertainty of approximation in these reports, we classify the data into three categories with specified possible ranges according to their descriptions: i) "nearly x" (possible range $0.95x \pm 0.05x$); ii) "about x" (possible range $1.00x \pm 0.05x$); iii) "over x" (possible range $1.05x \pm 0.05x$). The full list of the data records is shown in Table 1. All data of matchmaking fairs are collected from the websites shown in Table 2.
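The three reporting categories can be expressed as a small lookup. This is only an illustrative sketch: the function name `reported_range` and the example figure of 1000 participants are hypothetical, while the centre factors and the half-width of 0.05 are the ones listed in the text.

```python
# Centre of the possible range, as a multiple of the reported number x.
CENTER_FACTOR = {"nearly": 0.95, "about": 1.00, "over": 1.05}
HALF_WIDTH = 0.05  # every category uses a half-width of 0.05 * x


def reported_range(qualifier, x):
    """Return (low, high) for a report such as 'nearly x' or 'over x'."""
    center = CENTER_FACTOR[qualifier]
    return (center - HALF_WIDTH) * x, (center + HALF_WIDTH) * x


if __name__ == "__main__":
    # Hypothetical report: "nearly 1000 participants".
    print(reported_range("nearly", 1000))
```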
In our model, $n$ is an internal parameter that needs to be measured. Because a news report (described as an experiment below) generally includes only the total number of participants and the number of successful matching pairs, the male-female or female-male ratio $\eta$ defined in (7) should be estimated first. Under the assumption $k_1 \leq k_2$, the lower bound of $\eta$ is $\eta_{\min} = 1$, and once the total number of participants $K = k_1 + k_2$ and the number of matching couples $E$ are determined, the upper bound of $\eta$ in that experiment is known: $\eta \leq (K - E)/E$.
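The upper bound follows because the smaller group must contain at least as many people as there are couples, $k_1 \geq E$, so $k_2 = K - k_1 \leq K - E$. A minimal sketch of the calculation is given below; the function name and the example record (300 participants, 100 couples) are hypothetical.

```python
def eta_upper_bound(total_participants, matched_pairs):
    """Upper bound (K - E) / E on the group-size ratio eta = k2 / k1."""
    K, E = total_participants, matched_pairs
    if E <= 0 or 2 * E > K:
        # Each couple needs one person from each side, so 0 < E <= K / 2.
        raise ValueError("need 0 < E <= K / 2")
    return (K - E) / E


if __name__ == "__main__":
    # Hypothetical record: 300 participants, 100 matched couples.
    print(eta_upper_bound(300, 100))
```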
Let $N$ be the number of experiments, $\eta_{\max,i}$ be the upper bound of $\eta$ in the $i$th experiment, and $H = \{\eta_{\max,1}, \eta_{\max,2}, \ldots, \eta_{\max,N}\}$ be the set of all upper bounds. By processing the $N = 82$ experiments in Table 1, we obtain $\min(H) = 2$ and $\max(H) = 365.67$. Consider the least square criterion for fitting the model and the experimental data,

$$\min \sum_{i=1}^{N} \left[P_{e,i} - P_{t,i}(n, \eta)\right]^2,$$

where $P_{e,i}$ denotes the experimental data in the $i$th experiment and $P_{t,i}(n, \eta)$ denotes the corresponding theoretical value calculated by (8), and the reality that in a matchmaking fair the [...]. Finally we obtain the estimate $n = 15$.

Figure 3 shows the relationship between the experimental data and the analytical predictions of our model. The red curve and the olive curve are obtained from (8). The parameters of the red curve are $\eta = 1$, $n = 15$; the parameters of the olive curve are $\eta = \max(H) = 365.67$, $n = 15$. According to (7), when $k_1$ is equal to its minimum 1, $\xi$ takes the maximum value 225. The error bars on the ordinate $P$ of the round dots represent the ranges of the empirical data $P$ in Table 1. Because $k_1$ is unknown and $\xi$ is undetermined, the bound for $k_1$ in the $i$th experiment is $\min(k_{1,i}) = E_i \leq k_{1,i} \leq \max(k_{1,i}) = (k_{1,i} + k_{2,i})/2$, and the bound for the corresponding $\xi_i$ is $15^2/\max(k_{1,i}) \leq \xi_i \leq 15^2/\min(k_{1,i})$. Therefore, the ranges of the abscissa $\xi_i$ of the round dots are relatively wide, and the middle points lie at $\xi_i = \left(15^2/\max(k_{1,i}) + 15^2/\min(k_{1,i})\right)/2$. Figure 3 also shows that when $\xi$ is relatively small and the corresponding $k_1$ is large, all empirical data are enclosed between the two curves; when $\xi$ is relatively large and the corresponding $k_1$ is small, some empirical data are enclosed between the two curves, but others lie above the red curve, and the trend of the empirical data is opposite to the analytical predictions.
The possible reasons are as follows. On the one hand, organizers of some matchmaking fairs select only a few participants who meet their requirements from a large number of applicants, so it is easier for a participant to find the right man or woman. On the other hand, when the number of participants in a matchmaking fair is small, the participants understand the difficulty of finding an ideal partner and so compromise on a good-enough choice. For these two reasons, the fewer the participants are, the higher the matching probability $P$ is. Given these effects, the deviation of the experimental data from the model is acceptable.
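The horizontal range of each data point described above can be computed from the reported totals alone. The sketch below assumes the estimate $n = 15$ and the bounds stated in the text, namely that $k_1$ lies between the number of couples and half of all participants; the function name and the example record are hypothetical.

```python
def xi_range(total_participants, matched_pairs, n=15):
    """Return (xi_low, xi_high, xi_mid) for one record, with xi = n^2 / k1."""
    k1_min = matched_pairs            # every couple needs one member of group 1
    k1_max = total_participants / 2   # group 1 is the smaller group
    xi_low = n ** 2 / k1_max
    xi_high = n ** 2 / k1_min
    return xi_low, xi_high, (xi_low + xi_high) / 2


if __name__ == "__main__":
    # Hypothetical record: 300 participants, 100 matched couples.
    print(xi_range(300, 100))
```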

Conclusion
We propose a model of the two-way selection system and provide its analytical solution. Under several conditions, compact approximations are derived analytically and verified by the simulation results. In the model, the parameter $n$, which denotes the number of characters, directly determines the probability of a successful match. Because of its importance, we propose a rough method to estimate its value by fitting the empirical data collected via the Internet; the result is $n = 15$. Under some artificial assumptions, most of the experimental data fall into the range predicted by our model, so the model is helpful for understanding the dynamical mechanisms of real-world two-way selection systems and provides a starting point for researching their nature. We believe our model could enlighten readers in this rapidly developing field.