Entropy Based Modelling for Estimating Demographic Trends

Abstract

In this paper, an entropy-based method is proposed to forecast the demographic changes of countries. We formulate the estimation of future demographic profiles as a constrained optimization problem, anchored on the empirically validated assumption that the entropy of the age distribution increases over time. The proposed method involves three stages: 1) prediction of the age distribution of a country’s population based on an “age-structured population model”; 2) estimation of the age distribution of each individual household size with an entropy-based formulation based on an “individual household size model”; and 3) estimation of the number of households of each size based on a “total household size model”. The last stage is achieved by projecting the age distribution of the country’s population (obtained in stage 1) onto the age distributions of individual household sizes (obtained in stage 2). The effectiveness of the proposed method is demonstrated on real-world data, and the method is general and versatile enough to be extended to other time-dependent demographic variables.

Introduction

Predicting demographic trends (DT) [1] in the light of emerging complex processes [2] of the 21st century continues to be an important and open research topic. Understanding developments and changes in population is critical in helping governments target policies for the future and save money on education, public health, retirement, transportation and energy consumption, among others [3][4]. Specifically, DT refers to changes in the joint distribution of population over time, age or other demographic factors, such as household size, health measures, economic status, religious affiliation, education, marriage, etc. [5][6][7].

Forecasting DT is a challenging task and remains a fundamental concern in both basic and applied ecology [8]. The complexity lies in DT’s intricate connection to the heterogeneous activities of a large group of individuals, and DT is affected by observed and unobserved time-dependent factors [9][10]. Existing methods such as least-squares methods [11][12][13] and Bayesian inference [14], in spite of being the most extensively used procedures for estimation and prediction in various engineering problems, fail to capture the driving mechanisms of the complex processes that shape DT [15]. There is very little literature on building optimization models for understanding DT. Typical approaches involve incorporating factors such as environmental [16][17][18], demographic [19] and/or observer-related covariates [20]. However, data to support and verify such techniques are often not readily available [21]–[23], which suggests that building an optimization model constrained by limited data to characterise DT is fundamentally important and has many potential applications.

Entropy, a measure of the uncertainty in random variables, underlies methods that have been successfully applied to many modelling and estimation problems, as seen in [24][25][26][27]. In this paper, we introduce an entropy-based method to estimate DT. We build the model motivated by our empirical observation that the age distribution of the population follows an increasing entropy trend. The paradigm is based on minimizing an entropy-based objective function and incorporating parameters describing the historical trends into the constraints, so that the dynamic and intrinsic properties can be reflected. We illustrate this procedure by estimating the evolution of demographic distributions over ages and household sizes. Our work involves three modelling stages. First, an “age-structured population model” based on the Leslie matrix [28][29][30][31] is used to predict the age distribution of a country’s population. This makes it possible to model the demographic temporal distributions, as one usually needs to project the age distribution of the population onto other factors. Second, the age distribution of each household size is estimated with a proposed entropy-based model, in which we propose an entropy-formulated cost function and incorporate the DT into the constraint conditions. The model applied in this stage is called the “individual household size model”. Finally, the age distribution of the country’s population (obtained in stage 1) is projected onto the age distributions of individual household size types (obtained in stage 2), which we refer to as the “total household size model”. Note that our estimation does not rely on any observed determinant of the formation of households. The evolution of the household size is estimated based on the historical information and the entropy principle.

Compared with existing works [3][32], our method predicts DT with limited information [33]. The output is a joint distribution of age and other demographic variables over time. Its applications include policy analysis, economic forecasting and urban planning. For the purpose of illustration, we use population data from the US Census and predict the age DT for each household size in 2010, based on the historical data in 2000 and 2006. The remaining parts of the paper are organized as follows. Section 2 lists the definitions and notations which are used throughout the article. Section 3 presents the three stages for the estimation of DT. The simulation results based on US data are illustrated in Section 4, and we conclude the article in Section 5.

Methodology

Notations

In the following, we list the definitions and notations that will be used throughout the article:

  1. t: The year index.
  2. .T: Matrix transpose.
  3. : The estimation of a variable.
  4. Aupper: The upper bound age.
  5. i: The age index (i = 0, 1, …, Aupper).
  6. Pi(t): The population for the people at age i (older than i but younger than i + 1) in the year t.
  7. P(t) = [P0(t) … Pi(t) … PAupper(t)]T: The population vector for the people at all ages in the year t.
  8. Ni(m, t): The population for the male at age i in the year t.
  9. Ni(f, t): The population for the female at age i in the year t.
  10. N(m, t) = [N0(m, t) N1(m, t) … NAupper(m, t)]T: The population vector for the male at all ages.
  11. N(f, t) = [N0(f, t) N1(f, t) … NAupper(f, t)]T: The population vector for the female at all ages.
  12. Di(m, t): The death rate for a male at age i in the year t.
  13. Di(f, t): The death rate for a female at age i in the year t.
  14. Bi(t): The fertility rate for a female at age i in the year t.
  15. Ratiomf(t): The ratio of the number of newly born boys to girls in the year t.
  16. Immig(m, t): The male immigrants vector in the year t.
  17. Immig(f, t): The female immigrants vector in the year t.
  18. Emig(m, t): The male emigrants vector in the year t.
  19. Emig(f, t): The female emigrants vector in the year t.
  20. m0: Total number of household sizes.
  21. j: The household size index (j = 1, 2, …, m0).
  22. k0: Number of historical years’ data used in the individual household size Model (12).
  23. κ: An index applied on the historical data for the year tκ (κ = 0, 1, …, k0).
  24. Gn: The people in the age interval [0 An] where An is an upper bound age of this group.
  25. n: The group number index of Gn (n = 1, 2, …, n0) (as seen in Formula (10)).
  26. n0: Number of groups (Gn) in the individual household size Model (12).
  27. : The probability (percentage) that people at age i in household size j in the year t.
  28. pi(t) = [p0(t) … pAupper(t)]T: Age distribution of the population in the year t.
  29. Age distribution of household size j in the year t.
  30. Age distribution of household size j in the year tκ.
  31. Entropy(t): The entropy of the population distribution in the year t.
  32. : A ratio of people in group Gn to the population in household size j in Formula (10).
  33. : A parameter defined in Model (12).
  34. {.}i: The vector that contains the values of the variable {.} by changing subscript i.
  35. : A weight of the objective function in Model (12).
  36. ω1, …, ωk0: The weights defined in Model (12) and Formula (13).
  37. An error term in Formula (14).
  38. The upper bound of ξk(t + 1) and .
  39. H: The hessian matrix.
  40. xj(t): The number of household size j (j = 1, …, m0) in the year t.
  41. The vector contains the number of each household size.
  42. W: A weighting matrix in the total household size Model (16).
  43. τ: A parameter in the matrix W in the total household size model (Formula (18)).
  44. u: A small positive weight parameter in the total household size Model (16).
  45. Predicted weighting matrix by collecting the predicted age distributions of all household sizes.

Three stages for forecasting the demographic trends

Fig 1 summarizes the three stages for forecasting the DT. Stage 1: using an “age-structured population model” to predict the population in the year t + 1. Stage 2: using an “individual household size model” to estimate the age distribution for each household size j based on data in the historical years where the DT reflected in the previous years can be incorporated into the constraint conditions. Stage 3: Combining the results from Stages 1 and 2, and employing a “total household size model” to predict the number of each household size. We detail in the next subsections each of the three stages shown.

Fig 1. Illustration of the three stages for forecasting the demographic trends.

https://doi.org/10.1371/journal.pone.0137324.g001

Age-structured population model: for estimating age distribution of the population

We consider the population as a summation of all the organisms of the same group or species, who live in the same geographical area, and have the capability of inter-breeding. Quite frequently, the prediction of demographic temporal distributions is highly linked to the population’s age-structure. Demographic temporal distribution modeling is achievable using the “age-structured population model” since it allows projection of the age distribution into other factors.

Assumptions. We apply the Leslie matrix method [28]–[31] that assumes:

  1. There is no plague, disaster or war that will lead to abrupt changes in age specific death rate.
  2. Statistical variables such as birth rates and birth ratio change slowly and are predictable.
  3. The fertility rate for both local residents and immigrants is the same.
  4. All people who are older than Aupper are in the same age group. Here, we set Aupper = 90.

Problem formulation. We first consider the case without immigration and emigration. In the year t + 1, the number of people at age i + 1 is (1) where t and t + 1 denote the current year and the next year, respectively, and i = 0, 1, …, Aupper − 1 is the age index. When i = Aupper, we have (2) Let [i1, i2] be the age interval that a female has the ability to give birth. Then, P0(t + 1) = N0(m, t + 1) + N0(f, t + 1) and (3) where Ratiomf(t) is a ratio of the newly born boys (N0(m, t)) to the newly born girls (N0(f, t)) at year t. Let (4) be the vectors of the population, male population and female population, respectively, for ages between 0 and Aupper at year t.

Next, we extend the model to take immigration and emigration into account. Let Immig(m, t)/Emig(m, t) and Immig(f, t)/Emig(f, t) be the respective immigrants and emigrants vectors for males/females at year t. We obtain the “age-structured population model” as follows: (5) where A(t), B(t) and C(t) are the matrices constructed based on Eqs (2)–(4), and given by (6) (7) (8) Note that the population data we collected allow us to estimate the values of all the above parameters (such as the fertility rates and death rates). These parameters change slowly and are predictable, which supports the validity of our assumptions. Thus, the population distribution for the coming years can be predicted based on the age-structured population Model (5), and its estimation is denoted as for the year t + 1 as shown in Model (16) later.
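The cohort bookkeeping described above can be sketched in a few lines. This is a simplified, one-sex illustration and not the matrices A(t), B(t) and C(t) of Model (5): the function name `project_one_year`, the per-person fertility rates and the single net-migration vector are assumptions made only for this example.

```python
def project_one_year(pop, death_rate, fertility, migration=None):
    """Advance an age-indexed population vector by one year (one-sex sketch).

    pop[i]        : people aged i; the last entry pools everyone at or above
                    the upper bound age
    death_rate[i] : probability that a person aged i dies within the year
    fertility[i]  : expected births per person aged i within the year
                    (assumed here; the paper uses female fertility and a sex ratio)
    migration[i]  : net migrants aged i (immigrants minus emigrants), optional
    """
    n = len(pop)
    if migration is None:
        migration = [0.0] * n
    nxt = [0.0] * n
    # Newborns: births summed over all ages.
    nxt[0] = sum(f * p for f, p in zip(fertility, pop))
    # Survivors of age i move on to age i + 1.
    for i in range(n - 1):
        nxt[i + 1] = (1.0 - death_rate[i]) * pop[i]
    # The open-ended last group also keeps its own survivors.
    nxt[n - 1] += (1.0 - death_rate[n - 1]) * pop[n - 1]
    # Add net migration age by age.
    return [x + m for x, m in zip(nxt, migration)]


# Toy three-age example (all numbers are made up for illustration).
population_next = project_one_year(
    pop=[100, 80, 60],
    death_rate=[0.1, 0.2, 0.5],
    fertility=[0.0, 0.5, 0.0],
)
```

Iterating the function year by year gives a multi-year projection, provided the rates are updated or assumed constant in line with the slow-change assumption above.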

Individual household size model: for estimating age distribution for each household size

In this section, we will describe in detail our individual household size model that estimates the age distribution of each household size. The model is operated by minimizing an entropy based objective function and using the historical trends as constraints, where both the dynamic and intrinsic properties are reflected.

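Let pi(t) be the probability that a person is at age i in year t. We define an entropy function for year t as follows: $\mathrm{Entropy}(t) = -\sum_{i=0}^{A_{upper}} p_i(t) \ln p_i(t)$ (9) where $\sum_{i=0}^{A_{upper}} p_i(t) = 1$ and pi(t) ≥ 0.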
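As a minimal sketch of how such an entropy can be computed from raw age counts: the helper name `age_entropy` is hypothetical, and the natural logarithm is an assumption, since a different base only rescales the value.

```python
import math


def age_entropy(counts):
    """Shannon entropy of the age distribution implied by per-age head counts."""
    total = sum(counts)
    probs = [c / total for c in counts]
    # Terms with p = 0 contribute nothing (the limit of p * log p is 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


# A flatter age profile has a higher entropy than a concentrated one.
flat = age_entropy([25, 25, 25, 25])
peaked = age_entropy([70, 10, 10, 10])
```

A perfectly uniform distribution over n ages attains the maximum value log(n).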

Fig 2 plots the entropy of the age distribution based on the population data collected from six countries. In general, the entropy of the age distribution increases monotonically with respect to time in most countries. This observation suggests that we can estimate the age distribution of a particular household size based on entropy concepts. To this end, we divide the households into m0 types by size: i.e., 1 person per household, 2 persons per household, …, up to m0 persons per household.

Fig 2. The age distribution entropy of selected countries as a function of time.

Note that there is a slight decrease in the entropy of Indonesia’s population from 1990 to 2000, perhaps due to a difference in the statistical methods used in the two years (1990 vs 2000), as indicated in https://international.ipums.org/international/.

https://doi.org/10.1371/journal.pone.0137324.g002

Let j be the household size index and assume that we already have the age distributions for each household size j (j ∈ {1, …, m0}) in the years t, t − 1, …, tk0, which are denoted as for κ = 0,1, …, k0. Let represent the percentage of persons at age i in household size j in the year t + 1. This means we group the people whose ages are above 90 years together. Our objective is to estimate the age distribution in the year t + 1 based on the historical data.

We group the people from 0 to Aupper years old into n0 groups, i.e., the groups Gn for n = 1, …, n0, where n0 ≤ Aupper. The age interval for the group Gn is [0, An] and 0 < A1 < A2 < … < An0 = Aupper. It is easy to see that Gn−1 ⊂ Gn. Define as a parameter such that (10) which means that is a ratio of people in group Gn, i.e., in the age interval [0, An], to the population in household size j. Note that ∀j ∈ {1, …, m0}, αn0(t + 1) = 1 since An0 = Aupper.
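The group ratios described above are cumulative sums of an age distribution. The sketch below assumes that the ratio for group Gn is the sum of the age probabilities over ages 0 to An inclusive, which matches the verbal description; the name `cumulative_ratios` is hypothetical.

```python
def cumulative_ratios(age_dist, upper_ages):
    """Share of people aged 0..A_n (inclusive) for each group bound A_n.

    age_dist   : probabilities indexed by age, summing to 1
    upper_ages : increasing group bounds A_1 < A_2 < ... < A_n0
    """
    return [sum(age_dist[: a + 1]) for a in upper_ages]


# Toy distribution over ages 0..3 with two nested groups [0, 1] and [0, 3].
ratios = cumulative_ratios([0.2, 0.3, 0.1, 0.4], upper_ages=[1, 3])
```

Because the last bound equals the upper bound age, the last ratio is always 1, in line with the remark above.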

Let (11) be the parameter which reflects the percentage change of the ratio from the year t to the next year t + 1.

From here, we build an individual household size model to predict the age distribution for each household size type j where j = 1, 2, …, m0, by optimizing the following: (12)

Again, given that the entropy of the population is monotonically increasing with time, we can minimize an entropy based cost function under some constraints by employing the historical data. Compared with Eq (9), we omit the minus sign “−” such that the model becomes a minimization problem. The upper limit of such entropy as t = +∞ is a uniform distribution with a histogram function having a constant 1/Aupper magnitude. Essentially, there are two parts in this cost function where is a small positive weight parameter. The first part is the cross entropy distance (KL distance [34]) between and the historical data, and the second part is the relative entropy distance between and population distribution when t = +∞.
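To make the two distance terms concrete, the sketch below computes a Kullback-Leibler (KL) distance and checks the standard identity that the KL distance to a uniform distribution over n bins equals log(n) minus the entropy. The function names and the toy distribution are illustrative assumptions and do not reproduce the exact objective of Model (12).

```python
import math


def kl_divergence(p, q):
    """KL distance sum_i p_i * log(p_i / q_i); terms with p_i = 0 are skipped."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def entropy(p):
    """Shannon entropy with the natural logarithm."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


p = [0.5, 0.25, 0.25]
uniform = [1.0 / len(p)] * len(p)

# Distance of p to the uniform limit, computed two equivalent ways.
to_uniform = kl_divergence(p, uniform)
via_entropy = math.log(len(p)) - entropy(p)
```

The identity shows why pulling a distribution towards the uniform limit is the same as increasing its entropy.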

Note that we can never know the value of at the year t as we do not know . However, it can be estimated from the historical data as: (13) where ωκ for κ = 1, …, k0 + 1 are decreasing weights, which implies that the more recent data is more valued. Let be an error term of the estimation, then we have (14) the distribution of is known and bounded within . Usually can be assumed as a random variable uniformly distributed in . We now have that:

Theorem 1. The optimization problem defined in Model (12) is a strictly convex optimization problem.

Proof. Note that the Hessian matrix H of the objective function is given by: (15) Since for all i and j, it is easy to see that H is a positive definite matrix. On the other hand, the constraints of the optimization problem in Model (12) are linear, so the feasible domain is a convex set. Both the objective function and the feasible domain are convex; hence the problem is a convex optimization problem. Note that any local minimum of a convex optimization problem is also a global minimum [35][36][37][38].
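The positive definiteness step can be illustrated on a generic relative-entropy term. The sketch assumes an objective of the form below with a fixed reference distribution q and strictly positive p; it is meant as a worked example of the argument, not as a reproduction of Eq (15).

```latex
f(p) = \sum_{i} p_i \ln\frac{p_i}{q_i},
\qquad
\frac{\partial f}{\partial p_i} = \ln\frac{p_i}{q_i} + 1,
\qquad
\frac{\partial^2 f}{\partial p_i \,\partial p_k} =
\begin{cases}
  1/p_i, & i = k,\\
  0,     & i \neq k,
\end{cases}
\quad\Longrightarrow\quad
H = \operatorname{diag}\!\left(\frac{1}{p_1}, \frac{1}{p_2}, \dots\right) \succ 0
\quad \text{for } p_i > 0 .
```

A positively weighted sum of such terms has a Hessian that is again diagonal with positive entries, so strict convexity is preserved.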

Total household size model: for estimating the number of each household size

In this section, we build a total household size model to further estimate the number of each household size j for j = 1, 2, …, m0 based on the predicted age distribution of population and age distribution of each individual household size. Here, our objective is to estimate the number of household size j for j = 1, 2, …, m0 in the year t + 1.

Let xj(t) be the number of households with size j in the year t and denote that . We hope to estimate the vector . As mentioned, the first stage is to obtain the estimated total population distribution based on the current fertility rate and death rate. The second stage is then to obtain the estimated age distribution of each household type j denoted as . Now we estimate the household number distribution by solving the following total household size model: (16) where ∣∣.∣∣ is the L2 norm, X(t + 1) ≥ 0 means each component of X(t + 1) is nonnegative, and is a weighting matrix collected from the predicted age distributions of all household sizes: (17) and W is a diagonal weighting matrix for different household size types given by (18)

The above objective function contains two parts, with u being a small positive weight parameter. The first part is the distance between the estimated age distribution of the population and the aggregate of the age distributions over all household sizes. The other part is the weighted distance of the estimated X(t + 1) (denoted as ) to X(t). As there are j persons in the household size j, we construct a diagonal weighting matrix W with a given power τ > 0 in Eq (18). As shown in Theorem 2, the optimization problem in Eq (16) is also convex.
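A problem with this two-part structure can be solved with projected gradient descent, sketched below. The objective form ||b − A x||² + u ||W (x − x_prev)||², the choice W = diag(1^τ, …, m^τ), the meaning of the matrix A and the name `solve_household_counts` are all assumptions inferred from the description above, not the paper's exact Model (16).

```python
def solve_household_counts(A, b, x_prev, u=0.01, tau=0.5, steps=5000):
    """Projected gradient descent for
        min_x ||b - A x||^2 + u * ||W (x - x_prev)||^2   subject to x >= 0,
    with W = diag(1**tau, 2**tau, ..., m**tau).

    A[i][j] : assumed number of people aged i per household of size j + 1
    b[i]    : predicted number of people aged i
    x_prev  : household counts of the previous year
    """
    rows, m = len(A), len(x_prev)
    w2 = [(j + 1) ** (2 * tau) for j in range(m)]  # diagonal of W^T W
    # Step size from an upper bound on the gradient's Lipschitz constant.
    lipschitz = 2 * sum(a * a for row in A for a in row) + 2 * u * max(w2)
    step = 1.0 / lipschitz
    x = [float(v) for v in x_prev]
    for _ in range(steps):
        resid = [sum(A[i][j] * x[j] for j in range(m)) - b[i] for i in range(rows)]
        grad = [
            2 * sum(A[i][j] * resid[i] for i in range(rows))
            + 2 * u * w2[j] * (x[j] - x_prev[j])
            for j in range(m)
        ]
        # Gradient step followed by projection onto the nonnegative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(m)]
    return x


# Toy problem: three ages, two household sizes (all numbers are made up).
A = [[1.0, 0.0], [1.0, 2.0], [0.0, 2.0]]
b = [10.0, 18.0, 8.0]
x_est = solve_household_counts(A, b, x_prev=[9.0, 5.0], u=0.01)
```

Because the problem is convex, the iteration approaches the global minimizer; a dedicated nonnegative least-squares or quadratic programming solver would be the usual choice for realistic problem sizes.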

Theorem 2. The optimization problem defined in Model (16) is convex.

Proof. The proof is similar to that of Theorem 1. The Hessian matrix of the objective function in Eq (16) is (19) H is a positive definite matrix, and hence the theorem holds.

Simulations

In this section, we illustrate the procedure discussed above using US Census population data. We predict the demographic distribution in the year 2010 based on the historical data in years 2000 and 2006. The prediction is then compared with the actual Census data in the year 2010. We show that the method described here accurately captures the actual statistics. As mentioned, there are three stages in the estimation:

  1. Stage 1: Estimating the age distribution of the population by employing the age structure based population model in Section 3A.
  2. Stage 2: Estimating age distribution for each household size type by employing the individual household size model in Section 3B.
  3. Stage 3: Estimating the number of different household size type by employing the total household size model in Section 3C.

In Stage 1, we collect the population data from the US Census and obtain the values of all parameters that are required in Model (5). By solving this model, we obtain the estimate of the population in the year 2010 based on the data in the years 2000 and 2006, shown in Fig 3.

Fig 3. The predicted age distribution of US population for the year 2010 based on the population data in the years 2000 and 2006.

https://doi.org/10.1371/journal.pone.0137324.g003

In Stage 2, by letting ω = [0.95 0.025 0.025] and assuming that the error term bound , we divide the population into 9 groups (Gn, n = 1, 2, …, 9) and let An = n⋅10. By solving the individual household size Model (12), the age distributions for the household sizes j = 1 and j = 2, …, 7 are obtained in Figs 4–10, respectively. It is seen that the individual household size model accurately predicts the age distribution of all household sizes.

Fig 4. The estimated age distribution for household size type 1 in the year 2010 (x axis is the age index and y axis is the probability).

https://doi.org/10.1371/journal.pone.0137324.g004

Fig 5. The estimated age distributions for household size 2 in the year 2010 using US data.

https://doi.org/10.1371/journal.pone.0137324.g005

Fig 6. The estimated age distributions for household size 3 in the year 2010 using US data.

https://doi.org/10.1371/journal.pone.0137324.g006

Fig 7. The estimated age distributions for household size 4 in the year 2010 using US data.

https://doi.org/10.1371/journal.pone.0137324.g007

Fig 8. The estimated age distributions for household size 5 in the year 2010 using US data.

https://doi.org/10.1371/journal.pone.0137324.g008

Fig 9. The estimated age distributions for household size 6 in the year 2010 using US data.

https://doi.org/10.1371/journal.pone.0137324.g009

Fig 10. The estimated age distributions for household size 7 in the year 2010 using US data.

https://doi.org/10.1371/journal.pone.0137324.g010

In Stage 3, the number of households of each size is estimated by solving the total household size Model (16). As seen in Fig 11, the estimates are close to the real values, which again shows the accuracy of our proposed method.

Fig 11. The prediction of number of household size distribution for US in the year 2010.

https://doi.org/10.1371/journal.pone.0137324.g011

In addition, we also look at the cases when the error term bound . Let’s say, implying that is randomly distributed in the interval [−2 2]. By repeating the simulations many times over (200 times in our case), we can verify the robustness of our estimations. We take the household size 1 as an example. As seen in Fig 12, most of the real values of the age distribution are located inside the red area generated by the estimations.

Fig 12. The illustration of the robustness of the estimation compared with their real values for household size 1.

https://doi.org/10.1371/journal.pone.0137324.g012

To investigate how the parameters τ and u affect the performance of the estimation, we set different values for them and see how the error term changes, where ‖.‖1 denotes the L1 norm. Fig 13 shows the error term against different values of τ. We observe that when τ is around 0.5, the proposed algorithm achieves its best performance. This is reasonable since the diagonal elements of WT W are just the numbers of persons in each household size when τ = 0.5. On the other hand, Fig 14 shows how the parameter u impacts the estimation error. By setting different values for u in the interval [0 0.2], we observe that the algorithm provides the best fit when u ∈ [0.005 0.01].
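The remark on τ = 0.5 can be checked with a short calculation. It assumes that the diagonal of W holds the household sizes raised to the power τ, which is inferred from the description of W and not copied from Eq (18).

```latex
W = \operatorname{diag}\!\left(1^{\tau}, 2^{\tau}, \dots, m_0^{\tau}\right)
\;\Longrightarrow\;
W^{T} W = \operatorname{diag}\!\left(1^{2\tau}, 2^{2\tau}, \dots, m_0^{2\tau}\right),
\qquad
\tau = 0.5:\quad W^{T} W = \operatorname{diag}\left(1, 2, \dots, m_0\right).
```

Under this assumption, a change in the number of households of size j is penalised in proportion to the j persons that each such household contains.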

Discussion and Conclusions

In this paper, we have demonstrated a new method that estimates the development of age and household size distributions. The procedure consists of three models in three coupled stages: the age-structured population model in stage 1, where the age distribution of a country’s population is predicted; the individual household size model in stage 2, where the age distribution of each individual household size is estimated; and the total household size model in stage 3, where the number of households of each size is derived by projecting the age distribution of the total population onto the age distributions of individual household sizes. The procedure described here indicates that demographic trends can be accurately estimated using entropy as an optimisation variable, which we believe will be of interest to academics and practitioners alike. We have illustrated and validated the proposed method using US data. While we have considered age and household size distributions in this article, we note that the method is general and versatile enough to be extended to other time-dependent demographic variables.

Acknowledgments

This work is supported by the Integrated City Planning Programme, SERC, A*STAR (Grant no. 1325000001), and Complex Systems Programme, SERC, A*STAR (Grant no. 1224504056).

Author Contributions

Conceived and designed the experiments: GL DZ YX CM. Performed the experiments: GL SHK HYX NH. Contributed reagents/materials/analysis tools: GZ CM. Wrote the paper: GL DZ YX SHK HYX NH GZ CM.

References

  1. Lee R, The demographic transition: three centuries of fundamental change. The Journal of Economic Perspectives. 2003; 17:167–190.
  2. Gao ZK et al, Multiscale complex network for analyzing experimental multivariate time series. Europhysics Letters. 2015; 109:30005p1–30005p6.
  3. Swanson D and Hough G, An evaluation of persons per household (PPH) estimates generated by the American Community Survey: A demographic perspective. Population. 2012; 31:235–266.
  4. Xu HY, Kuo SH, Li G, Legara EFT, Zhao G and Monterola CP, Generalized cross entropy modeling for estimating joint distribution from incomplete information. A*STAR working paper. 2014.
  5. O’Neill BC, Dalton M, Fuchs R, Jiang L, Pachauri S and Zigova K, Global demographic trends and future carbon emissions. Proceedings of the National Academy of Sciences of the United States of America (PNAS). 2010; 107:17521–17526.
  6. Stephenson J, Newman K and Mayhew S, Population dynamics and climate change: what are the links? Journal of Public Health. 2010; 32:150–156. pmid:20501867
  7. Salvo JJ and Brown WA, Population estimates and the needs of local governments. Paper presented at U.S. Census Bureau Conference on Population Estimates: Meeting User Needs (Alexandria VA). 2006.
  8. Humbert JY, Mills LS, Horne JS and Dennis B, A better way to estimate population trends. Oikos. 2009; 118:1940–1946.
  9. Gao ZK and Jin ND, A directed weighted complex network for characterizing chaotic dynamics from time series. Nonlinear Analysis-Real World Applications. 2012; 13:947–952.
  10. Gao ZK, Fang PC, Ding MS and Jin ND, Multivariate weighted complex network analysis for characterizing nonlinear dynamic behavior in two-phase flow. Experimental Thermal and Fluid Science. 2015; 60:157–164.
  11. Csiszar I, Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems. Annals of Statistics. 1991; 19:2032–2066.
  12. Markovsky I and Huffel SV, Overview of total least-squares methods. Signal Processing. 2007; 87:2283–2302.
  13. Li G and Wen C, Identification of Wiener systems with clipped observations. IEEE Transactions on Signal Processing. 2012; 60:3845–3852.
  14. Aster RC, Borchers B and Thurber CH, Parameter estimation and inverse problems, Second Edition. Elsevier. 2012.
  15. Saadi S and Rahman A, Evidence of non-stationary bias in scaling by square root of time: Implications for Value-at-Risk. Journal of International Financial Markets, Institutions and Money. 2008; 18:272–289.
  16. Dennis B and Otten MRM, Joint effects of density dependence and rainfall on abundance of San Joaquin kit fox. The Journal of Wildlife Management. 2000; 64:388–400.
  17. Beyene J, Fallah S, Bull SB, Tritchler D, Chan V and Knight J, Modeling complex disease with demographic and environmental covariates and a candidate gene marker. Genetic Epidemiology. 2001; 21:423–428.
  18. Trawinski PR and Mackay DS, Identification of environmental covariates of West Nile virus vector mosquito population abundance. Vector Borne and Zoonotic Diseases. 2010; 10:515–526. pmid:20482343
  19. Owens J, Dickerson S and Macintosh DL, Demographic covariates of residential recycling efficiency. Environment and Behavior. 2000; 32:637–650.
  20. Link WA and Sauer JR, New approaches to the analysis of population trends in land birds: Comment. Ecology. 1997; 78:2632–2634.
  21. Fagan WF, Meir E, Prendergast J, Folarin A and Kareiva P, Characterizing population vulnerability for 758 species. Ecology Letters. 2001; 4:132–138.
  22. Inchausti P and Halley J, Investigating long-term ecological variability using the global population dynamics database. Science. 2001; 293:655–657. pmid:11474102
  23. Brook BW and Bradshaw CJA, Strength of evidence for density dependence in abundance time series of 1198 species. Ecology. 2006; 87:1445–1451. pmid:16869419
  24. Abbas AE, Entropy methods for joint distributions in decision analysis. IEEE Transactions on Engineering Management. 2006; 53:146–159.
  25. Phillips SJ, Anderson RP and Schapire RE, Maximum entropy modeling of species geographic distributions. Ecological Modelling. 2006; 190:231–259.
  26. Leyk S, Nagle NN and Buttenfield BP, Maximum entropy dasymetric modeling for demographic small area estimation. Geographical Analysis. 2013; 45:285–306.
  27. Salois MJ, Regional changes in the distribution of foreign aid: An entropy approach. Physica A. 2013; 392:2893–2902.
  28. Kot M, Elements of mathematical ecology. Cambridge: Cambridge University Press. 2001.
  29. Leslie PH, On the use of matrices in certain population mathematics. Biometrika. 1945; 33:183–212. pmid:21006835
  30. Leslie PH, Some further notes on the use of matrices in population mathematics. Biometrika. 1948; 35:213–245.
  31. Guckenheimer J, Oster GF and Ipaktchi A, The dynamics of density dependent population models. Journal of Mathematical Biology. 1977; 4:101–147.
  32. Smith SK, Nogle J and Cody S, A regression approach to estimating the average number of persons per household. Demography. 2002; 39:697–712.
  33. Miller DJ and Liu W, On the recovery of joint distributions from limited information. Journal of Econometrics. 2002; 107:259–274.
  34. Contreras-Reyes JE, Asymptotic form of the Kullback–Leibler divergence for multivariate asymmetric heavy-tailed distributions. Physica A. 2014; 395:200–208.
  35. Kuhn HW and Tucker AW, Nonlinear programming. Proceedings of 2nd Berkeley Symposium (Berkeley: University of California Press). 1951; 481–492.
  36. Li G, Ning N, Ramanathan K, He W, Pan L and Shi L, Behind the magical numbers: hierarchical chunking and the human working memory capacity. International Journal of Neural Systems. 2013; 23:1350019. pmid:23746292
  37. Li G, Wen C and Zhang A, Fixed point iteration in identifying bilinear models. Systems & Control Letters. 2015; 83:28–37.
  38. Li G, Wen C, Zheng WX and Zhao G, Iterative identification of block-oriented nonlinear systems based on biconvex optimization. Systems & Control Letters. 2015; 79:68–75.