Identifying the factors associated with cesarean section modeled with categorical correlation coefficients in partial least squares

  • Maryam Sadiq ,

    Contributed equally to this work with: Maryam Sadiq, Tahir Mehmood, Muhammad Aslam

    Roles Data curation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing

    Affiliation Department of Mathematics and Statistics, Riphah International University, Islamabad, Pakistan

  • Tahir Mehmood ,

    Contributed equally to this work with: Maryam Sadiq, Tahir Mehmood, Muhammad Aslam

    Roles Conceptualization, Investigation, Methodology, Software, Supervision, Validation, Writing – review & editing

    Affiliation School of Natural Sciences (SNS), National University of Sciences and Technology (NUST), Islamabad, Pakistan

  • Muhammad Aslam

    Contributed equally to this work with: Maryam Sadiq, Tahir Mehmood, Muhammad Aslam

    Roles Project administration, Supervision, Writing – review & editing

    Affiliation Department of Mathematics and Statistics, Riphah International University, Islamabad, Pakistan


28 Aug 2019: The PLOS ONE Staff (2019) Correction: Identifying the factors associated with cesarean section modeled with categorical correlation coefficients in partial least squares. PLOS ONE 14(8): e0221955.


Cesarean section (CS) is associated with maternal morbidity and mortality in developing countries. This study assesses the factors associated with CS in Pakistan using partial least squares (PLS) algorithms in which categorical factors are modeled. Nationally representative maternal data from the Pakistan Demographic and Health Survey (PDHS) conducted during 2012-2013 are used. Among the proposed correlation coefficient based PLS regression algorithms for categorical factors, Pearson’s Contingency Coefficient (CC) PLS coupled with loading weight (LW) appeared to be the most efficient method in terms of model performance and influential factor selection. Region of residence, type of place of residence, mother’s and her partner’s level of education, wealth index, year of birth, previous terminated pregnancy, use of contraception, prenatal care provided by a doctor and nurse/midwife/LHV (lady health visitor), assistance provided by a nurse/midwife/LHV, number of antenatal visits, size of child, antenatal care provided by government hospital, transport facility for medical care, baby birth status, mother’s age at first birth, preceding birth interval and vaccination of hepatitis B-1 and B-2 are found to be significantly associated with the CS delivery method. Correlation coefficient based PLS regression algorithms may serve more efficiently as a multivariate technique to treat high-dimensional categorical data.


Globally, cesarean section (CS) delivery rates have risen rapidly in recent decades [35, 57]. CS is a surgical technique adopted to prevent medical complications and maternal/fetal mortality during delivery [4]. High-quality maternal health care is a vital necessity for every woman across the world [40]. Unnecessary CS may result in an increased hazard of maternal as well as neonatal death [7]. The World Health Organization (WHO) evaluated the high CS rate in 2015. Weighing the necessity of CS against the need to avoid needless surgery, WHO recommended a CS rate of 5-15%, which saves maternal and neonatal lives in essential conditions while avoiding unnecessary CS surgery [48].

The Pakistan Demographic and Health Survey (PDHS) (2012-13) reported a CS rate of 39% among highly educated women and 34% among women in the highest wealth quintile in Pakistan. The final report summarized large rural-urban variation in CS rates and observed relatively higher rates for first births (23%), increased antenatal visits (30%) and births in a health facility (29%) [13]. The WHO report (2015) estimated the non-availability and deficiencies of various medical services in developing countries. The report further documented that the highest rates of maternal death are recorded predominantly in South Asia and Sub-Saharan Africa [47], reflecting the adverse maternal and neonatal health outcomes in these regions. Approximately 60% of maternal deaths occur in only 10 countries of the world, including Pakistan [47]. China has the highest CS rate among Asian countries [36], while a perceptible increase in CS rates has also been observed in some South Asian countries in recent years [9].

Trends in the rates and risk factors of CS have varied considerably over time, especially in Sub-Saharan Africa and South Asia. Regional disparities and disproportionate socioeconomic levels are reported as influential factors of CS [29]. Maternal morbidity is strongly associated with CS in developing countries [2], and elective CS without medical indication in particular is reported as a significant risk factor for a higher rate of this morbidity [44]. Houweling et al. (2007) examined poor-rich inequalities in maternal care using Demographic and Health Survey (DHS) data from 45 developing countries including Pakistan. They reported large poor-rich variation in CS rates within rural as well as urban regions [20]. Similarly, two other studies revealed lower CS rates among the poor in developing countries of Africa and South Asia [9, 34]. Another study investigated trends and inequalities in CS rates in Pakistan using data from the Pakistan Demographic and Health Surveys (PDHS) administered from 1990 to 2013. This study documented a significant association of CS with wealth index, education and urbanity of women [41]. Olusanya et al. (2009) analyzed data collected during a universal newborn hearing screening (UNHS) program in Nigeria. They established a significant association of parity, maternal age, maternal HIV-positive status, social class, lack of antenatal care and multiple gestations with a higher risk of emergency CS delivery [45].

Advances in public health generate high-dimensional data with many factors, some of which may be irrelevant or redundant. Analyzing such high-dimensional health data runs into the curse of dimensionality, which hampers effective interpretation of the fitted model. The curse of dimensionality refers to having few samples with many factors, which results in multicollinearity and overfitting [26, 27]. In recent years, partial least squares (PLS) based methods have received increasing attention as a multivariate approach for modeling multicollinear data. To improve model performance, a large number of modified PLS-based algorithms have been proposed. For instance, canonical-powered partial least squares (CPPLS) integrates PLS with canonical correlation analysis for classification and regression problems [23, 24]. Soft-threshold partial least squares is another version of PLS, introduced [52] by defining a soft threshold in the algorithm, and is closely related to sparse PLS [30]. Other modified PLS algorithms include orthogonal PLS (oPLS) [55], penalized PLS (pPLS) [32, 33], robust PLS (roPLS) [16, 22], kernel PLS (kPLS) [18], interval PLS (iPLS) [43], recursive PLS (rPLS) [19], quadratic PLS (qPLS) [60], generalized PLS (gPLS) [5], weighted PLS (wPLS) [21], genetic algorithm combined with partial least squares (gaPLS) [31], radial-based PLS (rbfPLS) [58] and distance-based PLS (dbPLS) [28]. Most PLS algorithms deal with factors measured on a continuous scale, and no specific algorithm has yet been presented to address categorical factors. The main objective of this study is to improve the PLS algorithm to specifically handle factors measured on a categorical scale. The secondary objective is to identify the significant factors associated with CS using the most efficient PLS algorithm.
To extend the PLS approach to specifically handle the factors measured on categorical scale, six PLS algorithms with modified loading weights established on categorical measures of association are proposed in this study. The model performance was compared with standard PLS and the algorithms were further used for selecting important factors of CS in Pakistan.

Materials and methods

Data set

The data set, comprising 39 factors and 1660 observations, is obtained from the Pakistan Demographic and Health Survey (PDHS) 2012-13. This survey was conducted by the National Institute of Population Studies (NIPS), Pakistan. The United States Agency for International Development (USAID) provided financial and technical assistance for the survey. The PDHS is part of the worldwide Demographic and Health Survey program, which is designed to collect data on fertility and family planning along with maternal and child health. The delivery method is taken as the response factor (y) with two categories: the cesarean section (CS) group and the vaginal delivery group, with equal numbers of observations.

Partial least squares (PLS): Standard form

Ordinary least squares (OLS) modeling is not appropriate here because of multicollinearity between factors; hence PLS is used as an alternative to OLS. Among the several variants of PLS, the orthogonal scores PLS algorithm is considered because of its simplicity and wide applicability in factor selection methods. The algorithm first centers the data, $X_0 = X - \mathbf{1}\bar{x}'$ and $y_0 = y - \mathbf{1}\bar{y}$. Following Naes and Helland [42], it assumes that A is the number of components used for prediction (where A ≤ p and p is the number of factors); then for a = 1, 2, …, A the algorithm runs:

  1. Loading weights are computed by $w_a = X_{a-1}' y_{a-1}$.
    The weights define the direction in the space spanned by $X_{a-1}$ of maximum covariance with $y_{a-1}$. Loading weights are normalized to have length equal to 1 by $w_a \leftarrow w_a / \lVert w_a \rVert$.
  2. Score vector $t_a$ is computed by $t_a = X_{a-1} w_a$.
  3. X-loadings $p_a$ are computed by regressing the factors in $X_{a-1}$ on the score vector: $p_a = X_{a-1}' t_a / (t_a' t_a)$.
    Similarly Y-loadings $q_a$ are computed by $q_a = y_{a-1}' t_a / (t_a' t_a)$.
  4. Deflate $X_{a-1}$ and $y_{a-1}$ by subtracting the contribution of $t_a$: $X_a = X_{a-1} - t_a p_a'$ and $y_a = y_{a-1} - t_a q_a$.
  5. If a < A return to 1. The loading weights, scores and loadings computed in each iteration of the algorithm are stored in the vectors/matrices $W = [w_1, w_2, \ldots, w_A]$, $T = [t_1, t_2, \ldots, t_A]$, $P = [p_1, p_2, \ldots, p_A]$ and $q = [q_1, q_2, \ldots, q_A]$.

The PLS estimators for the regression coefficients of the linear model $y = \alpha + X\beta + \epsilon$ are found by $\hat{\beta} = W(P'W)^{-1}q$ and $\hat{\alpha} = \bar{y} - \bar{x}'\hat{\beta}$.
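The five steps and the coefficient estimator can be sketched in plain Python as follows. This is a minimal, dependency-free illustration for a single response, not the authors' implementation: the function name `pls1`, the list-of-rows data layout and the toy data are assumptions made for the example. The matrix W(P'W)^{-1} is built column by column, which relies on P'W being upper triangular with a unit diagonal for this algorithm.

```python
import math

def pls1(X, y, A):
    """Orthogonal-scores PLS for one response; X is a list of rows (illustrative)."""
    n, p = len(X), len(X[0])
    x_bar = [sum(row[j] for row in X) / n for j in range(p)]
    y_bar = sum(y) / n
    # Centre the data: X_0 = X - 1 x_bar', y_0 = y - 1 y_bar
    Xa = [[row[j] - x_bar[j] for j in range(p)] for row in X]
    ya = [v - y_bar for v in y]
    W, T, P, q, R = [], [], [], [], []
    for _ in range(A):
        # Step 1: loading weights w = X'y, scaled to unit length
        w = [sum(Xa[i][j] * ya[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Step 2: score vector t = X w
        t = [sum(Xa[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Step 3: X-loadings and y-loading by regression on t
        pa = [sum(Xa[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        qa = sum(ya[i] * t[i] for i in range(n)) / tt
        # Step 4: deflate X and y by the contribution of t
        Xa = [[Xa[i][j] - t[i] * pa[j] for j in range(p)] for i in range(n)]
        ya = [ya[i] - t[i] * qa for i in range(n)]
        # Current column of R = W (P'W)^{-1}, using the earlier components
        r = list(w)
        for pk, rk in zip(P, R):
            c = sum(pk[j] * w[j] for j in range(p))
            r = [r[j] - c * rk[j] for j in range(p)]
        W.append(w); T.append(t); P.append(pa); q.append(qa); R.append(r)
    # beta = W (P'W)^{-1} q and alpha = y_bar - x_bar' beta
    beta = [sum(R[a][j] * q[a] for a in range(A)) for j in range(p)]
    alpha = y_bar - sum(x_bar[j] * beta[j] for j in range(p))
    return {"W": W, "T": T, "P": P, "q": q, "beta": beta, "alpha": alpha}

# Toy data (made up): a noise-free linear response in two factors
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 7]]
y = [2 * a - b + 3 for a, b in X]
fit = pls1(X, y, A=2)
print(fit["beta"], fit["alpha"])
```

With as many components as factors, the fit on this noise-free toy data should recover the generating coefficients (2 and -1, intercept 3) up to rounding, and the two score vectors should be orthogonal.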

Standard PLS works well for a quantitative response y and quantitative explanatory factors in X, but if the response and factors are qualitative, as in the current study, standard PLS may not be optimal. PLS loading weights play a key role in model building and can also be used to select influential factors. Loading weights reflect the correlation between the response y and the explanatory factors in X. If the data set is qualitative, then Cramer’s V, the Phi coefficient, Tschuprow’s T coefficient, the Contingency Coefficient, Yule’s Q and Yule’s Y are the recommended measures of correlation.

Cramer’s V (CV) PLS

Cramer’s V correlation coefficient, defined by Harald Cramér in 1946 [12], measures the association between nominal factors. It ranges from 0 to 1 and is used to define the PLS loading weights as

$$w_a = \sqrt{\frac{\chi^2}{n\,\min(r-1,\; c-1)}} \tag{1}$$

where χ2 is derived from Pearson’s chi-squared test, n is the total number of observations, and r and c denote the number of categories in the response and the factor respectively.

Phi coefficient (PC) PLS

The Phi correlation coefficient, also referred to as the mean square contingency coefficient [12], is used to define the PLS loading weights as

$$w_a = \sqrt{\frac{\chi^2}{n}} \tag{2}$$

Tschuprow’s T coefficient (TC) PLS

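Tschuprow’s T correlation coefficient [56] is a refined form of the Phi coefficient and is used to define the PLS loading weights as

$$w_a = \sqrt{\frac{\phi^2}{\sqrt{(r-1)(c-1)}}} \tag{3}$$

where r and c denote the number of categories in the response and the explanatory factor respectively and $\phi^2$ is the mean square contingency defined as

$$\phi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(\phi_{ij} - \phi_{i+}\phi_{+j})^2}{\phi_{i+}\phi_{+j}} \tag{4}$$

where $\phi_{ij}$ is the proportion of the sample in the (i, j)th cell of the r × c contingency table, and $\phi_{i+}$ and $\phi_{+j}$ are the corresponding row and column marginal proportions.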

Pearson’s contingency coefficient (CC) PLS

Pearson’s contingency coefficient [15] measures the strength of association between categorical factors, and is used to define the loading weights as

$$w_a = \sqrt{\frac{\chi^2}{\chi^2 + n}} \tag{5}$$
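As a rough illustration of the four chi-squared based measures, the sketch below computes Phi, Cramer’s V, Tschuprow’s T and Pearson’s contingency coefficient for one explanatory factor against the response, using their textbook definitions. The function name `chi2_association` and the toy labels are assumptions for the example. No continuity correction is applied, and both inputs are assumed to have at least two categories.

```python
import math
from collections import Counter

def chi2_association(factor, response):
    """Chi-squared based association measures for two categorical lists."""
    n = len(factor)
    rows, cols = Counter(factor), Counter(response)
    cells = Counter(zip(factor, response))
    # Pearson chi-squared statistic (no continuity correction)
    chi2 = 0.0
    for a, n_a in rows.items():
        for b, n_b in cols.items():
            expected = n_a * n_b / n
            chi2 += (cells[(a, b)] - expected) ** 2 / expected
    r, c = len(cols), len(rows)  # categories in the response and in the factor
    phi2 = chi2 / n              # mean square contingency
    return {
        "phi": math.sqrt(phi2),
        "cramers_v": math.sqrt(phi2 / min(r - 1, c - 1)),
        "tschuprow_t": math.sqrt(phi2 / math.sqrt((r - 1) * (c - 1))),
        "contingency_c": math.sqrt(chi2 / (chi2 + n)),
    }

# Toy labels (made up): a factor that separates the two delivery groups perfectly
factor = ["urban", "urban", "rural", "rural"]
response = ["CS", "CS", "vaginal", "vaginal"]
print(chi2_association(factor, response))
```

For a 2 × 2 table, Phi, Cramer’s V and Tschuprow’s T coincide, whereas the contingency coefficient stays below 1 even under perfect association. This is one reason the measures can yield different loading weights for the same data.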

Yule’s Q (YQ) PLS

Yule’s Q correlation coefficient [62] determines the strength of the relationship between the explanatory factor and the response. Yule’s Q based loading weights are defined as

$$w_a = \frac{OR - 1}{OR + 1} \tag{6}$$

where OR represents the odds ratio.

Yule’s Y (YY) PLS

Yule’s Y, or the coefficient of colligation [62], is a measure of association for qualitative data. Yule’s Y based loading weights are defined as

$$w_a = \frac{\sqrt{OR} - 1}{\sqrt{OR} + 1} \tag{7}$$
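Both Yule coefficients are functions of the odds ratio of a 2 × 2 table. The sketch below assumes a table of counts [[a, b], [c, d]], so that OR = ad/(bc), and uses the algebraically equivalent count form, which avoids dividing by zero when bc = 0. The function name and the counts are made up for illustration. How factors with more than two categories are reduced to an odds ratio is not covered here.

```python
import math

def yule_q_and_y(a, b, c, d):
    """Yule's Q and Y from the 2x2 counts [[a, b], [c, d]]."""
    ad, bc = a * d, b * c
    # Q = (OR - 1) / (OR + 1) rewritten with OR = ad / bc
    q = (ad - bc) / (ad + bc)
    # Y = (sqrt(OR) - 1) / (sqrt(OR) + 1) rewritten the same way
    y = (math.sqrt(ad) - math.sqrt(bc)) / (math.sqrt(ad) + math.sqrt(bc))
    return q, y

# Made-up counts with an odds ratio of 9
q_val, y_val = yule_q_and_y(30, 10, 10, 30)
print(q_val, y_val)
```

Both coefficients lie between -1 and 1 and share the sign of log(OR). Y is smaller in magnitude than Q for the same table, since Q = 2Y/(1 + Y²).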

Filter methods for factor selection in PLSR

In standard PLS a variety of factor selection methods exist [38, 53]. Here the following five filter methods for subset selection of influential explanatory factors are considered.

Loading weight (LW)

The loading weight rj, used as a measure for identifying important factors, is defined as [37]; (8)

Regression coefficients (RC)

The PLS estimator of the regression coefficients for the model is given by

$$\hat{\beta} = W(P'W)^{-1}q \tag{9}$$

Variable importance in projection (VIP)

Variable importance in projection, defined in [14, 59], accumulates the importance of each factor based on the loading weights. For factor j, the VIP measure is

$$V_j = \sqrt{\frac{p \sum_{a=1}^{A} \left[ SS_a \left( w_{aj} / \lVert w_a \rVert \right)^2 \right]}{\sum_{a=1}^{A} SS_a}} \tag{10}$$

where p is the number of factors, SSa denotes the sum of squares explained by the ath component and the importance of the jth factor is represented by the term (waj/‖wa‖)2. Hence, the VIP score Vj represents the contribution of the jth factor based on the variance explained by each component. If Vj is less than a defined threshold, the jth factor can be excluded, where the threshold ranges from 0 to ∞. A threshold between 0.83 and 1.21 is recommended [11], while Vj > 1 is a generally accepted threshold [14, 17].
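A small sketch of the VIP computation is given below. It assumes the loading weights, scores and y-loadings of a fitted single-response PLS model are available as lists, and takes SSa = qa² ta'ta, the usual choice for a single response. The function name and the numeric inputs are made up for illustration.

```python
import math

def vip_scores(W, T, q):
    """W: A loading-weight vectors (length p); T: A score vectors; q: A y-loadings."""
    A, p = len(W), len(W[0])
    # Sum of squares explained by each component (assumed q_a^2 * t_a't_a)
    ss = [q[a] ** 2 * sum(t * t for t in T[a]) for a in range(A)]
    scores = []
    for j in range(p):
        weighted = sum(
            ss[a] * W[a][j] ** 2 / sum(w * w for w in W[a]) for a in range(A)
        )
        scores.append(math.sqrt(p * weighted / sum(ss)))
    return scores

# Made-up two-component model with two factors
W = [[0.6, 0.8], [0.8, -0.6]]
T = [[1.0, 2.0, -3.0], [1.0, -1.0, 0.0]]
q = [2.0, 1.0]
vip = vip_scores(W, T, q)
print(vip)
```

By construction the squared VIP scores sum to p, so their mean square is 1. This is the motivation usually given for the Vj > 1 rule of thumb.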

Selectivity ratio (SR)

The selectivity ratio (SR) is the ratio between the explained variance (Ve) and the residual variance (Vr) of the ith factor on the target-projected component for the response. SR is defined as

$$SR_i = \frac{V_{e,i}}{V_{r,i}} \tag{11}$$

The threshold is SR > F(critical), where F(critical) is the critical value of the corresponding F-test. Hence, a factor with an SR value greater than the threshold is included in the model. The SR quantifies the contribution of each factor included in the model: the higher the SR, the more important the factor is for prediction. Factors with the lowest SR can be eliminated without affecting the performance [51].
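The ratio can be sketched as follows, assuming the common target-projection construction: the centered data are projected onto the normalized regression coefficient vector to obtain a single target score, each factor is regressed on that score, and the explained and residual sums of squares are compared. The function name, the inputs and this particular construction are assumptions for illustration and may differ in detail from the procedure in [51].

```python
import math

def selectivity_ratio(Xc, beta):
    """Xc: column-centred data as a list of rows; beta: regression coefficients."""
    n, p = len(Xc), len(Xc[0])
    norm = math.sqrt(sum(b * b for b in beta))
    # Target-projected score: t = Xc beta / ||beta||
    t = [sum(row[j] * beta[j] for j in range(p)) / norm for row in Xc]
    tt = sum(v * v for v in t)
    # Target-projected loadings: regression of each factor on t
    load = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
    ratios = []
    for j in range(p):
        explained = sum((t[i] * load[j]) ** 2 for i in range(n))
        residual = sum((Xc[i][j] - t[i] * load[j]) ** 2 for i in range(n))
        ratios.append(explained / residual)  # undefined if residual is exactly 0
    return ratios

# Made-up centred data with two factors and a made-up coefficient vector
Xc = [[-1.0, 1.0], [0.0, -2.0], [1.0, 1.0]]
sr = selectivity_ratio(Xc, beta=[1.0, 1.0])
print(sr)
```

Sums of squares are used in place of variances because the common divisor cancels in the ratio.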

Significance multivariate correlation (SMC)

The basic concept of significance multivariate correlation is to minimize the influence of irrelevant factors in the X-structure and enhance the importance of factors that contribute strongly to the response. SMC can be used for simulated as well as real data sets.

$$SMC = \frac{MS_{Regression}}{MS_{Residual}} \tag{12}$$

where MSRegression is the mean square regression and MSResidual is the mean square residual [54].


The CS data set contains 39 factors measured on 1660 samples (mothers). Cramer’s V and Phi correlation coefficients are used to detect the presence of multicollinearity in the nominal data. The correlograms shown in Fig 1 indicate strong correlation between 12 factors, while moderate correlation is observed between various other factors by both methods. Multicollinearity violates the assumption of linear independence; hence, logistic regression and generalized linear models become inappropriate for collinear data. Therefore, PLSR is used to deal with categorical data with high multicollinearity.

Fig 1. Correlogram by Cramer’s V correlation matrix is presented in upper panel while the lower panel represented the Phi correlation matrix.

Color intensity and the size of the circle are proportional to the strength of the correlation measure between factors.

The survey data may include some noisy samples, which should be identified and eliminated. For this, the standard PLS model was fitted to the data and the PLS scores from component 1 and component 2 were plotted, as presented in the upper panel of Fig 2. The women lying outside the red circle were considered outliers and were discarded from the data set for further analysis. Model fitting requires the samples to be independent; therefore, the PLS scores were clustered. For illustration, the lower panel of Fig 2 shows several samples (mothers) grouped in one cluster. The samples grouped in a cluster are correlated, hence only one member from each cluster should be considered. Since the samples (mothers) can be divided into two groups, namely the CS group and the vaginal delivery group, both groups were clustered separately through k-means and the optimum number of clusters was found. In this way, 100 women from the CS group and 100 women from the vaginal delivery group were selected by picking the centroid of each cluster.

Fig 2. The PLS scores from component 1 and component 2 were plotted in the upper panel.

Mothers lying outside the red circle were considered outliers. For illustration purposes, the visualized graph showing several samples (mothers) grouped in one cluster is presented in the lower panel.

After initial processing, 39 explanatory factors measured over 200 samples (mothers) were considered for further analysis. To obtain a reliable estimate of model performance, the data were split randomly into training data (70%) and test data (30%). The model was fitted on the training data, while the model performance was measured on the test data. To measure the reliability and accuracy of the different PLS models, both validation and calibration of the proposed methods were examined. Model validation over test data and model calibration over training data were measured for all PLS algorithms with and without filter factor selection methods, to compare the discriminant accuracy of the new and existing PLS methods. To remove the effect of randomness, the data were split 10 times; in each split the model was trained on the training data and evaluated by computing validation and calibration accuracy. Six PLS based models, called Cramer’s V PLS (CV-PLS), Phi Coefficient PLS (PC-PLS), Tschuprow’s T Coefficient PLS (TC-PLS), Pearson’s Contingency Coefficient PLS (CC-PLS), Yule’s Q PLS (YQ-PLS) and Yule’s Y PLS (YY-PLS), are proposed and compared with standard PLS through validation and calibration. Each PLS method is evaluated with five filter subset selection methods for factor selection: loading weights (LW), regression coefficients (RC), variable importance in projection (VIP), selectivity ratio (SR) and significance multivariate correlation (SMC). The validation accuracy of all PLS methods with and without factor selection methods is presented in the upper panel of Fig 3.
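The repeated 70/30 evaluation scheme can be sketched as below. The helper names (`repeated_holdout`, `accuracy`, `evaluate`) and the `fit`/`predict` callables are hypothetical placeholders for whichever PLS variant and filter are being assessed. Calibration accuracy is computed on the training part and validation accuracy on the test part of each split. The toy rule at the end only exercises the loop and is not a PLS model.

```python
import random

def repeated_holdout(n_samples, n_splits=10, train_frac=0.7, seed=0):
    """Yield (train_idx, test_idx) index lists for repeated random splits."""
    rng = random.Random(seed)
    n_train = round(train_frac * n_samples)
    for _ in range(n_splits):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[:n_train], idx[n_train:]

def accuracy(y_true, y_pred):
    """Share of samples whose predicted class equals the observed class."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)

def evaluate(X, y, fit, predict, **split_args):
    """Return one (calibration, validation) accuracy pair per split."""
    results = []
    for train, test in repeated_holdout(len(y), **split_args):
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        X_te, y_te = [X[i] for i in test], [y[i] for i in test]
        model = fit(X_tr, y_tr)
        cal = accuracy(y_tr, predict(model, X_tr))   # on training data
        val = accuracy(y_te, predict(model, X_te))   # on held-out data
        results.append((cal, val))
    return results

# Hypothetical stand-ins for a classifier, used only to exercise the loop
def fit_rule(X_train, y_train):
    return None  # nothing to learn for the fixed toy rule below

def predict_rule(model, X_new):
    return [1 if row[0] >= 100 else 0 for row in X_new]

X = [[i] for i in range(200)]
y = [1 if i >= 100 else 0 for i in range(200)]
results = evaluate(X, y, fit_rule, predict_rule, n_splits=10, train_frac=0.7, seed=1)
print(len(results), results[0])
```

Fixing the seed makes the ten splits reproducible, so different PLS variants and filters can be compared on identical partitions.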

Fig 3. The validation accuracy of PLS methods including, Cramer’s V PLS (CV-PLS), Phi coefficient PLS (PC-PLS), Tschuprow’s T coefficient PLS (TC-PLS), Pearson’s contingency coefficient PLS (CC-PLS), Yule’s Q PLS (YQ-PLS) and Yule’s Y PLS (YY-PLS) models against the filter subset selection methods including LW, RC, VIP, SR and SMC by using lattice plot is presented in the upper panel, while the calibration accuracy is presented in the lower panel.

The plot for standard PLS without filter method is presented as ‘None’ in Fig 3. It indicates that the average validation performance of five introduced PLS algorithms is higher than the standard PLS without any filter measure while PC-PLS exhibits similar performance as standard PLS. All newly introduced PLS regression algorithms combined with LW, VIP and SR filter methods also showed higher validation performance than standard PLS regression combined with these filter methods. Equal accuracy of CV-PLS, CC-PLS, YQ-PLS and standard PLS is observed for RC filter method. Interestingly, it is noted that only YQ-PLS combined with SMC showed lower performance than standard PLS combined with same filter method. The CC-PLS combined with LW showed highest validation accuracy in differentiating the two classes of mothers.

The calibration accuracy of all PLS methods combined with filter methods is presented in the lower panel of Fig 3. In terms of calibration performance, all proposed PLS algorithms, both without any filter method and combined with the RC and SR filter methods, handled categorical variables more accurately than standard PLS under the same factor selection conditions. The CC-PLS algorithm combined with the LW factor selection method appears to be the most efficient of all methods, having the highest median validation performance, and is hence considered for further analysis.

To strengthen these findings, an analysis of variance (ANOVA) test was conducted to assess the significance of PLS methods and factor selection measures in explaining the variation in accuracy of the models. The ANOVA results are presented in Table 1. They indicate that CC-PLS has ≈ 24% accuracy in differentiating the CS group, which is 2.39% more than standard PLS (p < 0.001). Similarly, the LW factor selection method has ≈ 30% accuracy for differentiating the CS group, which is 5.29% more than without a selection measure (p < 0.001). Hence CC-PLS coupled with LW is applied for further analysis of the CS group and selection of influential factors.

Table 1. ANOVA results showing the significance of PLS methods and factor selection measures in explaining the variation in accuracy of the models.

For modeling the data set, CC-PLS coupled with LW was executed; the coefficients of the extracted influential factors are presented in Table 2.

Table 2. CC-PLS coefficients of the influential factors extracted by coupling CC-PLS with LW.

After analysis, 20 influential factors which best differentiate the CS group and the vaginal delivery group were found. The negative association of region and type of place of residence with the CS group showed that for every additional unit in region and type of residence, the CS group decreased by an average of 0.250 and 0.237 units respectively. A significant positive association of mother’s education level with the CS method is observed, demonstrating a 0.114 unit increase in the CS group due to this factor. On the other hand, a negative association of the mother’s partner’s education level is observed. Wealth index and year of birth are observed to be positively associated with the CS group, showing an average increase of 0.056 units. The results further demonstrate that a unit change in earlier terminated pregnancy decreases the CS group by 0.09 units and contraceptive use increases the CS group by 0.107 units. The CS group is expected to decrease by 0.089 units for a unit change in prenatal care by a nurse/midwife/LHV, while a positive association of the size of the child at the time of birth with the delivery method is observed, showing a 0.199 unit change in the CS group for a unit increase in this factor. Furthermore, if assistance given by a nurse/midwife/LHV changes by one unit, the CS group decreases by 0.072 units. Prenatal care provided by a doctor increases the CS group by 0.092 units. Antenatal care provided by a government hospital is negatively associated with the CS group, and availability of a transport facility is positively associated with this group. Newborn birth status and preceding birth interval are found to be positively associated with the CS group. The CS group is predicted to increase by 0.102 units when the mother’s age at first birth goes up by one unit. The present study found that vaccination of Hepatitis B-1 and B-2 increases the CS group by 0.208 and 0.264 units respectively, but no previous study was found in this context.


This study identified the factors associated with CS using representative sample data extracted from the Pakistan Demographic and Health Survey (PDHS) 2012-13. The presence of multicollinearity prompted the use of PLS, one of the popular substitutes for linear regression. The data were processed to eliminate outliers and clustered through k-means before further analysis. The resulting sample was then split randomly into training and test data sets. Six PLS algorithms based on correlation coefficients are proposed to deal specifically with categorical factors and are compared with standard PLS to demonstrate the improvement in model building. The proposed algorithms include Cramer’s V PLS (CV-PLS), Phi Coefficient PLS (PC-PLS), Tschuprow’s T Coefficient PLS (TC-PLS), Pearson’s Contingency Coefficient PLS (CC-PLS), Yule’s Q PLS (YQ-PLS) and Yule’s Y PLS (YY-PLS). Furthermore, five well-known filter based subset factor selection measures were incorporated with each PLS algorithm and then compared with standard PLS to observe variation in the efficiency of the proposed and existing PLS algorithms with and without filter selection measures. The filter based subset factor selection measures considered in this study are loading weights (LW), regression coefficients (RC), variable importance in projection (VIP), selectivity ratio (SR) and significance multivariate correlation (SMC).

Validation and calibration accuracy is measured over 10 iterations to compare the performance of seven PLS algorithms with and without filter selection measures.

Regarding validation and calibration accuracy, two important facts are observed. Firstly, without any filter-based factor subset selection method, CV-PLS, TC-PLS, CC-PLS, YQ-PLS and YY-PLS showed improved validation performance compared to standard PLS when dealing with categorical factors. This improvement supports the application of the proposed PLS algorithms for model building with this type of data. PC-PLS, in contrast, performed the same as standard PLS on the validation data without a filter measure, which makes PC-PLS an alternative to standard PLS in the specific case of a categorical response factor. All proposed PLS algorithms showed higher accuracy than standard PLS on the calibration data without any filter measure, indicating increased reliability and accuracy of the proposed PLS algorithms. Secondly, and more significantly, increased efficiency is observed for all PLS algorithms combined with factor selection measures compared to the same algorithms without these measures, on validation as well as calibration data. Overall, the proposed PLS algorithms with and without factor selection measures enhanced the accuracy on validation and calibration data compared to standard PLS with and without these measures, respectively. For the current data set, the CC-PLS algorithm combined with the LW factor selection measure is observed to be the most efficient of all models, having the highest median validation accuracy.

CC-PLS coupled with LW was recommended for modeling the data set, and 20 influential factors are observed to identify the CS group. An association of region and type of place of residence with the CS group is observed for the present data. A study using the data of 150 countries consistently showed that developed regions have the highest rate of CS [8]. Another study conducted in Bangladesh showed that place of residence was an important predictor of CS for childbirth [25]. A significant association of mother’s and her partner’s education level with the CS group is identified. Along with parents’ education, wealth index and year of birth are also observed to be associated with the CS group. Previous studies showed that parents’ level of education and wealth index affected the CS rates [6, 10, 61].

Among factors related to pregnancy history, mother’s age at first birth, preceding birth interval, earlier terminated pregnancy and contraception were found to be associated with the CS group in the current study. Results of other studies that investigated the relationship of terminated pregnancy history, use of contraceptive methods, mother’s age and birth intervals with the CS ratio were consistent with the present study [1, 3, 49, 50]. Regarding maternal care factors, prenatal care provided by a doctor and nurse/midwife/LHV, assistance given by a nurse/midwife/LHV, antenatal care provided by a government hospital and availability of a transport facility to get medical help are found to be related to the CS group. Concerning child related factors, the present data established an association of newborn birth status and size of the child at the time of birth with the CS group. Several other studies pointed to the association of cesarean section with prenatal care, facilities and antenatal visits. Moreover, a significant association between the CS delivery method and newborn status, weight, size and head circumference was also reported previously [1, 39, 46, 49]. The present study found that vaccination of Hepatitis B-1 and B-2 is significantly associated with the CS group, but no previous investigation was found in this context.


The proposed PLS algorithms were a better choice than standard PLS regarding model performance and factor selection for categorical health data. This indicates that these correlation coefficient based algorithms produce models with superior interpretation potential. Using CC-PLS with LW, the factors identified as significant predictors of CS were consistent with other studies. Correlation coefficient based PLS regression algorithms therefore have potential as a multivariate technique in public health research to treat high-dimensional categorical data more efficiently.

Supporting information

S1 File. Variable description.

Complete description of response and explanatory factors including each category is presented.


S2 File. Notes on DHS data sets.

Information about data sets, questionnaires, codes and data files is presented.


S3 File. Minimal data set.

Original data set having first 50 observations is provided to replicate the study findings.



The authors would like to acknowledge DHS (Demographic and Health Surveys) who made their data available for free.


  1. Al Busaidi Ibrahim, Al-Farsi Yahya, Ganguly Shyam, and Gowri Vaidyanathan. Obstetric and non-obstetric risk factors for cesarean section in oman. Oman medical journal, 27(6):478, 2012. pmid:23226819
  2. Althabe Fernando, Sosa Claudio, Belizán José M, Gibbons Luz, Jacquerioz Frederique, and Bergel Eduardo. Cesarean section rates and maternal and neonatal mortality in low-, medium-, and high-income countries: an ecological study. Birth, 33(4):270–277, 2006. pmid:17150064
  3. Au H-K, Liu C-F, Tzeng C-R, and Chien L-W. Association between ultrasonographic parameters of cesarean scar defect and outcome of early termination of pregnancy. Ultrasound in Obstetrics & Gynecology, 47(4):506–510, 2016.
  4. Bailey Patsy, Lobis Samantha, Maine Deborah, and Fortney Judith A. Monitoring emergency obstetric care: a handbook. World Health Organization, 2009.
  5. Bastien Philippe, Esposito Vinzi Vincenzo, and Tenenhaus Michel. Pls generalised linear regression. Computational Statistics & data analysis, 48(1):17–46, 2005.
  6. Begum Tahmina, Rahman Aminur, Nababan Herfina, Dewan Md Hoque Emdadul, Khan Al Fazal, Ali Taslim, and Anwar Iqbal. Indications and determinants of caesarean section delivery: Evidence from a population-based study in matlab, bangladesh. PloS one, 12(11):e0188074, 2017. pmid:29155840
  7. Betrán Ana P, Merialdi Mario, Lauer Jeremy A, Bing-Shun Wang, Thomas Jane, Look Paul Van, and Wagner Marsden. Rates of caesarean section: analysis of global, regional and national estimates. Paediatric and perinatal epidemiology, 21(2):98–113, 2007. pmid:17302638
  8. Betrán Ana Pilar, Ye Jianfeng, Moller Anne-Beth, Zhang Jun, Gülmezoglu A Metin, and Torloni Maria Regina. The increasing trend in caesarean section rates: global, regional and national estimates: 1990-2014. PloS one, 11(2):e0148343, 2016. pmid:26849801
  9. Cavallaro Francesca L, Cresswell Jenny A, França Giovanny VA, Victora Cesar G, Barros Aluisio JD, and Ronsmans Carine. Trends in caesarean delivery by country and wealth quintile: cross-sectional surveys in southern asia and sub-saharan africa. Bulletin of the World Health Organization, 91:914–922D, 2013. pmid:24347730
  10. Cesaroni Giulia, Forastiere Francesco, and Perucci Carlo A. Are cesarean deliveries more likely for poorly educated parents? a brief report from italy. Birth, 35(3):241–244, 2008. pmid:18844650
  11. Chong Il-Gyo and Jun Chi-Hyuck. Performance of some variable selection methods when multicollinearity is present. Chemometrics and intelligent laboratory systems, 78(1-2):103–112, 2005.
  12. Cramér Harald. Mathematical methods of statistics (PMS-9), volume 9. Princeton university press, 2016.
  13. Pakistan Demographic. Health survey 2012–13. islamabad and calverton, ma: National institute of population studies and icf international; 2013, 2015.
  14. Eriksson Lennart, Johansson Erik, Kettaneh-Wold Nouna, and Wold S. Multi-and megavariate data analysis: principles and applications. Umetrics, 2001.
  15. Friendly Michael and SAS Institute. Visualizing categorical data. Sas Institute Cary, NC, 2000.
  16. Gil Juan A and Romera Rosario. On robust partial least squares (pls) methods. Journal of Chemometrics, 12(6):365–378, 1998.
  17. Gosselin Ryan, Rodrigue Denis, and Duchesne Carl. A bootstrap-vip approach for selecting wavelength intervals in spectral imaging applications. Chemometrics and Intelligent Laboratory Systems, 100(1):12–21, 2010.
  18. Helland Inge S. On the structure of partial least squares regression. Communications in statistics-Simulation and Computation, 17(2):581–607, 1988.
  19. Helland Kristian, Berntsen Hans E, Borgen Odd S, and Martens Harald. Recursive algorithm for partial least squares regression. Chemometrics and intelligent laboratory systems, 14(1-3):129–137, 1992.
  20. Houweling Tanja AJ, Ronsmans Carine, Campbell Oona MR, and Kunst Anton E. Huge poor-rich inequalities in maternity care: an international comparative study of maternity and child care in developing countries. Bulletin of the World Health Organization, 85:745–754, 2007. pmid:18038055
  21. Huang Xiaohong, Pan Wei, Han Xinqiang, Chen Yingjie, Miller Leslie W, and Hall Jennifer. Borrowing information from relevant microarray studies for sample classification using weighted partial least squares. Computational biology and chemistry, 29(3):204–211, 2005. pmid:15979040
  22. 22. Hubert Mia and Vanden Branden K. Robust methods for partial least squares regression. Journal of Chemometrics, 17(10):537–549, 2003.
  23. 23. Indahl Ulf. A twist to partial least squares regression. Journal of Chemometrics, 19(1):32–44, 2005.
  24. 24. Indahl Ulf G, Liland Kristian Hovde, and Tormod Næs. Canonical partial least squares—a unified pls approach to classification and regression problems. Journal of Chemometrics, 23(9):495–504, 2009.
  25. 25. SM Mostafa Kamal. Preference for institutional delivery and caesarean sections in bangladesh. Journal of health, population, and nutrition, 31(1):96, 2013.
  26. 26. Ron Kohavi and Dan Sommerfield. Feature subset selection using the wrapper method: Overfitting and dynamic search space topology. In KDD, pages 192–197, 1995.
  27. 27. Daphne Koller and Mehran Sahami. Toward optimal feature selection. Technical report, Stanford InfoLab, 1996.
  28. 28. Krishnan Anjali, Kriegeskorte Nikolaus, and Abdi Hervé. Distance-based partial least squares analysis. In New perspectives in Partial Least Squares and Related Methods, pages 131–145. Springer, 2013.
  29. 29. Kunst Anton E and Houweling Tanja. A global picture of poor-rich differences in the utilisation of delivery care. Safe motherhood strategies: a review of the evidence, 2001.
  30. 30. Lê Cao Kim-Anh, Rossouw Debra, Robert-Granié Christele, and Besse Philippe. A sparse pls for variable selection when integrating omics data. Statistical applications in genetics and molecular biology, 7(1), 2008. pmid:19049491
  31. 31. Leardi Riccardo. Genetic algorithms in feature selection. In Genetic algorithms in molecular modeling, pages 67–86. Elsevier, 1996.
  32. 32. Lindgren Fredrick, Geladi Paul, Rännar Stefan, and Wold Svante. Interactive variable selection (ivs) for pls. part 1: Theory and algorithms. Journal of Chemometrics, 8(5):349–363, 1994.
  33. 33. Lindgren Fredrik, Geladi Paul, Berglund Anders, Sjöström Michael, and Wold Svante. Interactive variable selection (ivs) for pls. part ii: Chemical applications. Journal of Chemometrics, 9(5):331–342, 1995.
  34. 34. Long Qian, Kempas Taina, Madede Tavares, Klemetti Reija, and Hemminki Elina. Caesarean section rates in mozambique. BMC pregnancy and childbirth, 15(1):253, 2015.
  35. 35. Lumbiganon Pisake, Laopaiboon Malinee, Gülmezoglu A Metin, Souza João Paulo, Taneepanichskul Surasak, Ruyan Pang, Attygalle Deepika Eranjanie, Shrestha Naveen, Mori Rintaro, Hinh Nguyen Duc, et al. Method of delivery and pregnancy outcomes in asia: the who global survey on maternal and perinatal health 2007–08. The Lancet, 375(9713):490–499, 2010.
  36. 36. Lumbiganon Pisake, Laopaiboon Malinee, Gülmezoglu A Metin, Souza JP, Taneepanichskul S, Ruyan P, Attygalle DE, Shrestha N, Mori R, Nguyen DH, et al. World health organization global survey on maternal and perinatal health research group. method of delivery and pregnancy outcomes in asia: the who global survey on maternal and perinatal health 2007-08. Lancet, 375(9713):490–499, 2010.
  37. 37. Martens Magni. Sensory and chemical quality criteria for white cabbage studied by multivariate data analysis. Lebensmittel-Wissenschaft+ Technologie = Food science+ technology, 1985.
  38. 38. Mehmood Tahir, Liland Kristian Hovde, Snipen Lars, and Sæbø Solve. A review of variable selection methods in partial least squares regression. Chemometrics and Intelligent Laboratory Systems, 118:62–69, 2012.
  39. 39. Mendoza-Sassi Raúl Andrés, Cesar Juraci Almeida, da Silva Patricia Rodrigues, Denardin Giovana, and Rodrigues Mariana Mendes. Risk factors for cesarean section by category of health service. Revista de saúde pública, 44(1):80–89, 2010. pmid:20140332
  40. 40. Miller Suellen, Abalos Edgardo, Chamillard Monica, Ciapponi Agustin, Colaci Daniela, Comandé Daniel, Diaz Virginia, Geller Stacie, Hanson Claudia, Langer Ana, et al. Beyond too little, too late and too much, too soon: a pathway towards evidence-based, respectful maternity care worldwide. The Lancet, 388(10056):2176–2192, 2016.
  41. 41. Mumtaz Sarwat, Bahk Jinwook, and Khang Young-Ho. Rising trends and inequalities in cesarean section rates in pakistan: Evidence from pakistan demographic and health surveys, 1990-2013. PloS one, 12(10):e0186563, 2017. pmid:29040316
  42. 42. Naes Tormod and Helland Inge S. Relevant components in regression. Scandinavian journal of statistics, pages 239–250, 1993.
  43. 43. Norgaard L, Saudland A, Wagner J, Pram Nielsen J, Munck L, and Balling Engelsen S. Interval partial least-squares regression (ipls): a comparative chemometric study with an example from near-infrared spectroscopy. Applied Spectroscopy, 54(3):413–419, 2000.
  44. 44. Oladapo Olufemi T, Lamina Mustafa A, and SULE-ODU Adewale O. Maternal morbidity and mortality associated with elective caesarean delivery at a university hospital in nigeria. Australian and New Zealand journal of obstetrics and gynaecology, 47(2):110–114, 2007. pmid:17355299
  45. 45. Olusanya Bolajoko O and Solanke Olumuyiwa A. Maternal and neonatal factors associated with mode of delivery under a universal newborn hearing screening programme in lagos, nigeria. BMC pregnancy and childbirth, 9(1):41, 2009.
  46. 46. O’Neill Sinéad M, Kearney Patricia M, Kenny Louise C, Henriksen Tine B, Lutomski Jennifer E, Greene Richard A, and Khashan Ali S. Caesarean delivery and subsequent pregnancy interval: a systematic review and meta-analysis. BMC pregnancy and childbirth, 13(1):165, 2013. pmid:23981569
  47. 47. World Health Organization et al. Trends in maternal mortality: 1990-2015: estimates from who, unicef, unfpa, world bank group and the united nations population division: executive summary. 2015.
  48. 48. World Health Organization et al. Who statement on caesarean section rates. Technical report World Health Organization, 2015.
  49. 49. Patel Roshni R, Peters Tim J, and Murphy Deirdre J. Prenatal risk factors for caesarean section. analyses of the alspac cohort of 12 944 women in england. International journal of epidemiology, 34(2):353–367, 2005.
  50. 50. Rajabi Abdolhalim, Maharlouei Najmeh, Rezaianzadeh Abbas, Rajaeefard Abdolreza, and Gholami Ali. Risk factors for c-section delivery and population attributable risk for c-section risk factors in southwest of iran: a prospective cohort study. Medical journal of the Islamic Republic of Iran, 29:294, 2015. pmid:26913257
  51. 51. Rajalahti Tarja, Arneberg Reidar, Berven Frode S, Myhr Kjell-Morten, Ulvik Rune J, and Kvalheim Olav M. Biomarker discovery in mass spectral profiles by means of selectivity ratio plot. Chemometrics and Intelligent Laboratory Systems, 95(1):35–48, 2009.
  52. 52. Sæbø Solve, Almøy Trygve, Aarøe Jørgen, and Aastveit Are H. St-pls: a multi-directional nearest shrunken centroid type classifier via pls. Journal of Chemometrics, 22(1):54–62, 2008.
  53. 53. Saeys Yvan, Inza Iñaki, and Larrañaga Pedro. A review of feature selection techniques in bioinformatics. bioinformatics, 23(19):2507–2517, 2007. pmid:17720704
  54. 54. Tran Thanh N, Afanador Nelson Lee, Buydens Lutgarde MC, and Blanchet Lionel. Interpretation of variable importance in partial least squares with significance multivariate correlation (smc). Chemometrics and Intelligent Laboratory Systems, 138:153–160, 2014.
  55. 55. Trygg Johan and Wold Svante. Orthogonal projections to latent structures (o-pls). Journal of chemometrics, 16(3):119–128, 2002.
  56. 56. Aleksandr Aleksandrovich Tschuprow and MA Kantorowitsch. Principles of the mathematical theory of correlation. Technical report, William Hodge, 1939.
  57. 57. Villar J, Valladares E, Wojdyla D, Zavaleta N, Carroli G, and Velazco . Global survey on maternal and perinatal health research group: Caesarean delivery rates and pregnancy outcomes: the 2005 who global survey on maternal and perinatal health in latin america. Lancet, 367:1819–1829, 2006.
  58. 58. Walczak B and Massart DL. The radial basis functions—partial least squares approach as a flexible non-linear regression technique. Analytica Chimica Acta, 331(3):177–185, 1996.
  59. 59. Wold S, Jonsson J, Sjörström M, Sandberg M, and Rännar S. Dna and peptide sequences and chemical processes multivariately modelled by principal component analysis and partial least-squares projections to latent structures. Analytica Chimica Acta, 277(2):239–253, 1993.
  60. 60. Wold Svante, Kettaneh-Wold Nouna, and Skagerberg Bert. Nonlinear pls modeling. Chemometrics and intelligent laboratory systems, 7(1-2):53–65, 1989.
  61. 61. Yaya Sanni, Uthman Olalekan A, Amouzou Agbessi, and Bishwajit Ghose. Disparities in caesarean section prevalence and determinants across sub-saharan africa countries. Global health research and policy, 3(1):19, 2018. pmid:29988650
  62. 62. Udny Yule G. On the methods of measuring association between two attributes. Journal of the Royal Statistical Society, 75(6):579–652, 1912.