Predictors of singleton preterm birth using multinomial regression models accounting for missing data: A birth registry-based cohort study in northern Tanzania

Background Preterm birth is a significant contributor to under-five and newborn deaths globally. Recent estimates indicate that Tanzania ranks tenth among the countries with the highest preterm birth rates in the world and accounts for 2.2% of all preterm births globally. Previous studies applied binary regression models to determine predictors of preterm birth by collapsing gestational age at birth to <37 weeks. To inform targeted interventions, this study aimed to determine predictors of preterm birth using multinomial regression models accounting for missing data. Methods We carried out a secondary analysis of cohort data from the KCMC zonal referral hospital Medical Birth Registry for 44,117 singleton deliveries between 2000 and 2015. KCMC is located in the Moshi Municipality, Kilimanjaro region, northern Tanzania. Data analysis was performed using Stata version 15.1. Assuming a nonmonotone pattern of missingness, data were imputed using a fully conditional specification (FCS) technique under the missing at random (MAR) assumption. Multinomial regression models with robust standard errors were used to determine predictors of moderately to late ([32,37) weeks of gestation) and very/extreme (<32 weeks of gestation) preterm birth. Results The overall proportion of preterm births among singleton births was 11.7%. Preterm birth rose significantly between 2000 and 2015, by 22.2% (95%CI 12.2%, 32.1%, p<0.001) for the moderately to late preterm category and by 4.6% (95%CI 2.2%, 7.0%, p = 0.001) for the very/extremely preterm category.
After imputation of missing values, higher odds of moderately to late preterm delivery were observed among adolescent mothers (OR = 1.23, 95%CI 1.09, 1.39), mothers with primary education level (OR = 1.28, 95%CI 1.18, 1.39), those referred for delivery (OR = 1.19, 95%CI 1.09, 1.29), and those with pre-eclampsia/eclampsia (OR = 1.77, 95%CI 1.54, 2.02), inadequate (<4) antenatal care (ANC) visits (OR = 2.55, 95%CI 2.37, 2.74), PROM (OR = 1.80, 95%CI 1.50, 2.17), abruption placenta (OR = 2.05, 95%CI 1.32, 3.18), or placenta previa (OR = 4.35, 95%CI 2.58, 7.33), as well as among those who delivered through caesarean section (CS) (OR = 1.16, 95%CI 1.08, 1.25), delivered a low birth weight (LBW) baby (OR = 8.08, 95%CI 7.46, 8.76), experienced perinatal death (OR = 2.09, 95%CI 1.83, 2.40), or delivered male children (OR = 1.11, 95%CI 1.04, 1.20). Maternal age, education level, abruption placenta, and CS delivery showed no statistically significant association with very/extremely preterm birth. The effects of inadequate (<4) ANC visits, placenta previa, LBW, and perinatal death were more pronounced for very/extremely preterm than for moderately to late preterm birth. Notably, markedly higher odds of very/extremely preterm birth were observed among LBW babies (OR = 38.34, 95%CI 31.87, 46.11). Conclusions Preterm birth has increased over time in northern Tanzania. Policy decisions should intensify efforts to improve maternal and child care throughout the course of pregnancy and childbirth towards preterm birth prevention. For a positive pregnancy outcome, interventions to increase the uptake and quality of ANC services should also be strengthened in Tanzania at all levels of care, where several interventions can easily be delivered to pregnant women, especially those at high risk of experiencing adverse pregnancy outcomes.

Introduction Every year, an estimated 15 million babies (11%) are born preterm (before 37 completed weeks of gestation) globally [1,2]; the majority (81.1%) of these births occur in Asia and sub-Saharan Africa (SSA) [1]. The rates of preterm birth in SSA are notably high in Nigeria (6.9%), Ethiopia (12.0%), and Tanzania (16.6%) [1]. Tanzania ranks tenth among the countries with the highest preterm birth rates in the world and accounts for 2.2% of all preterm births globally [1]. Country-specific estimates show that the proportion of preterm birth ranges from 12-13% in Mwanza region [3][4][5][6] to as high as 24% among HIV-infected women in Dar es Salaam [7].
Preterm birth is a syndrome with a variety of causes, which can be classified into two broad clinical sub-types: spontaneous preterm birth (spontaneous onset of labour or following prelabour premature rupture of membranes) and provider-initiated preterm birth (induction of labor or elective caesarean birth before 37 completed weeks of gestation for maternal or fetal indications, both "urgent" or "discretionary", or other non-medical reasons) [2,[8][9][10][11].
A higher risk of preterm birth is reported among women with a history of preterm delivery, those with low (≤24) or high (≥40) maternal age, short inter-pregnancy intervals (<24 months), low maternal body mass index (BMI), multiple pregnancies, maternal infections such as urinary tract infections, malaria, bacterial vaginosis, HIV and syphilis, and those with inadequate (<4) ANC visits [5,9,[12][13][14][15]. Stress and excessive physical work or long times spent standing, substance use such as smoking and excessive alcohol consumption, sex of the child (more among males compared to females), hypertensive disorders of pregnancy such as pre-eclampsia or eclampsia, placental abruption, cholestasis, fetal distress, fetal growth restriction, small for gestational age (a birth weight below the 10th percentile for the gestational age), and early induction of labor or cesarean birth (before 39 completed weeks of gestation), whether for medical or non-medical reasons, also increase the risk of preterm birth [2,5,9,[16][17][18].
Globally, preterm birth is a leading cause of death among children under five years of age [1,2,10,19]. SSA is one of the regions with the highest number of under-five deaths in the world [19,20]. In 2018, preterm birth complications accounted for 18% of deaths among children under the age of five and 35% of all newborn deaths globally [20]. Preterm birth also increases the risk of babies dying from other causes, especially neonatal infections [9]. Despite modern advances in obstetric and neonatal management, rates of preterm birth are on the rise in low-, middle- and high-income countries [1,2,21,22], while in many low- and middle-income countries, preterm newborns are reported to die because of a lack of adequate newborn care [1].
Despite substantial progress in improving child survival since 1990 [1,23], preterm birth remains a crucial issue for child mortality and for improving the quality of maternal and newborn care [1]. To increase child survival and reduce preterm birth complications, the World Health Organization (WHO) recommends essential care during childbirth and the postnatal period for every mother and baby (i.e. routine practice for safe childbirth before, during and after birth), provision of antenatal steroid injections, magnesium sulfate for prevention of cerebral palsy in the infant and child, kangaroo mother care, and antibiotics to treat newborn infections [2,24]. Tanzania has also adopted these strategies [25,26] and is one of the five countries where WHO is implementing a clinical trial on immediate kangaroo mother care (KMC) for preterm babies and babies weighing <2000 grams [2,26].
Epidemiologists are often interested in estimating the risk of adverse events originally measured on an interval scale (such as gestational age in weeks), but they often choose to divide the outcome into two or more categories in order to compute an estimate of effect (risk or odds ratio) [27]. In this study, we applied the multinomial logistic regression models, to show the effect of covariates on several preterm birth categories [2,22] to avoid the bias that might be introduced by performing a binary analysis. A number of previous studies to assess predictors of preterm birth collapsed all preterm birth categories and performed a binary regression analysis [6,7,12,18,[28][29][30][31][32][33]. This may introduce potential bias in estimating the effect of covariates on the risk of preterm birth due to a loss of information resulting from collapsing these categories. For a more focused care in the high-risk pregnancies, it is essential to estimate the risk factors for preterm birth, which may differ by the gestational age at birth.
Furthermore, missing data are common in epidemiological and clinical research [34]. Ignoring missing values in the analysis of such data potentially produces biased parameter estimates [34][35][36][37]. Sterne et al. [34] further indicated that "missing data in several variables often leads to exclusion of a substantial proportion of the original sample, which in turn causes a substantial loss of precision and power". Therefore, data analysis in this study accounted for missing data to obtain more precise parameter estimates.

Study design, setting and participants
We utilized secondary birth registry data from a prospective cohort of women who delivered singletons in the Kilimanjaro Christian Medical Center (KCMC) between the years 2000-2015.
A detailed description of the KCMC Medical birth registry is also available elsewhere [38][39][40][41][42][43]. Briefly, KCMC is one of the four zonal referral hospitals in the country and is located in the Moshi municipality, Kilimanjaro region, northern Tanzania. The centre primarily receives deliveries of women from the nearby communities, but also referral cases from within and outside the region. On average, the hospital has approximately 4000 deliveries per year [41,42,44].
The study population was singleton deliveries for women of reproductive age (15-49 years) recorded in the KCMC birth registry between 2000 and 2015, drawn from a total of 55,003 deliveries from 43,084 mothers. We excluded 3,316 multiple deliveries, 49 records missing hospital numbers (i.e. the unique identification number used to link mothers and their subsequent births), 791 observations with a mismatch between the dates of birth of children from the same mother or of unknown sequence (i.e. whether a singleton or multiple birth), and 6,730 deliveries with gestational age <20 weeks or >42 weeks. Data were, therefore, analyzed for 44,117 deliveries from 35,871 mothers (Fig 1).
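The exclusion steps above can be checked with simple arithmetic. The sketch below uses only the counts reported in the text; the dictionary keys and variable names are illustrative and do not come from the registry.

```python
# Counts are those reported in the text; labels and names are illustrative.
total_deliveries = 55_003

exclusions = {
    "multiple deliveries": 3_316,
    "missing hospital number": 49,
    "date mismatch or unknown birth sequence": 791,
    "gestational age <20 or >42 weeks": 6_730,
}

# Deliveries remaining after all exclusions are applied.
analysed = total_deliveries - sum(exclusions.values())

for reason, n in exclusions.items():
    print(f"excluded {n:>6,}  {reason}")
print(f"analysed {analysed:>6,}")
```

The remaining count matches the 44,117 deliveries reported as the analysis sample.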

Data collection methods
As described elsewhere [43], birth data at KCMC are recorded using a standardized questionnaire and collected by specially trained project midwives. The KCMC Medical birth registry collects prospective data for all mothers and their subsequent deliveries in the hospital's department of obstetrics and gynecology. Following informed consent, mothers were interviewed within the first 24 hours after birth following a normal delivery, or as soon as the mother had recovered from a complicated delivery. The questionnaire used for data collection is available elsewhere [45]. Although the printed questionnaires were in English, the project midwives performing the interviews were well versed in English, Swahili, and one other tribal language. Furthermore, additional information was extracted from patient files and antenatal cards to clarify prenatal information. Data were then transferred, entered and stored in a computerized database system at the birth registry located at the reproductive health unit of the hospital. A unique identification number was assigned to each woman at first admission and used to trace her medical records at later admissions. Access to the data analyzed in this study followed ethical approval granted on June 26, 2019.

Study variables and variable definitions
The response variable was preterm birth, defined as any birth before 37 completed weeks of gestation and further categorized based on gestational age as <28 weeks (extremely preterm), [28,32) weeks (very preterm), [32,37) weeks (moderate to late preterm), and ≥37 weeks (term) for a full-term pregnancy [2]. Gestational age was estimated from the date of the last menstrual period of the mother and recorded in completed weeks [4].
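As a minimal sketch of this outcome definition, the function below maps gestational age in completed weeks to the four categories. The function name and the returned labels are illustrative, and the range check reflects the study's exclusion of deliveries at <20 or >42 weeks; this is not the authors' own code.

```python
def preterm_category(ga_weeks):
    """Map gestational age (completed weeks) to the outcome categories in the text."""
    # Assumption: mirror the study's exclusion of deliveries at <20 or >42 weeks.
    if ga_weeks < 20 or ga_weeks > 42:
        raise ValueError("gestational age outside the 20-42 week analysis range")
    if ga_weeks < 28:
        return "extremely preterm"
    if ga_weeks < 32:
        return "very preterm"
    if ga_weeks < 37:
        return "moderate to late preterm"
    return "term"


for weeks in (27, 28, 31, 32, 36, 37):
    print(weeks, preterm_category(weeks))
```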

Statistical and computational analysis
Data were analyzed using Stata version 15.1 (StataCorp LLC, College Station, Texas, USA). The primary unit of analysis was singleton deliveries for women recorded in the KCMC Medical Birth Registry between the years 2000 and 2015. We summarized numeric variables using means and standard deviations, and categorical variables using frequencies and percentages. The Chi-square test was used to compare the proportion of preterm birth by participants' characteristics. We used multinomial logistic regression models to determine the predictors of preterm birth, as opposed to previous studies [4, 6, 7, 12, 18, 28-30, 32, 33, 48] that performed a binary regression analysis.
The multinomial/polytomous regression model is an extension of the logistic model for binary responses to accommodate multinomial responses, and it places no restriction on the ordinality of the response [27]. Let $Y_i$ denote a nominal response variable for the $i$th subject, with $Y_i = c$ indicating that the response occurs in category $c$, while $\Pr(Y_i = c)$ defines the probability that $Y_i = c$. The multinomial logit model can be written as

$$\Pr(Y_i = c) = \frac{\exp(Z_{ic})}{1 + \sum_{h=1}^{C} \exp(Z_{ih})}, \qquad c = 1, 2, \ldots, C,$$

with $\Pr(Y_i = 0) = 1 / \{1 + \sum_{h=1}^{C} \exp(Z_{ih})\}$ for the reference category. A nominal model to allow for any possible set of c − 1 response categories is written as

$$\log \frac{\Pr(Y_i = c)}{\Pr(Y_i = 0)} = Z_{ic},$$

where the multinomial logit $Z_{ic} = X'_{ic} \beta_c$. In this model, all of the effects $\beta_c$ vary across categories ($c = 1, 2, \ldots, C$), and comparisons are made to a reference category, in contrast to the ordinal regression model, which uses cumulative comparisons of the categories [49]. We used robust standard errors adjusted for clusters to account for nested observations/deliveries within mothers.
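To make the reference-category formulation concrete, the following sketch computes category probabilities from a set of multinomial logits. It assumes the standard baseline-category form of the model; all coefficient values are hypothetical and chosen only for illustration, and the function name is not from the source.

```python
import math


def multinomial_probs(x, betas):
    """Return [P(reference), P(category 1), ..., P(category C)].

    x     : covariate values, with a leading 1.0 for the intercept
    betas : one coefficient list per non-reference category
    """
    # Linear predictor (multinomial logit) for each non-reference category.
    z = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    denom = 1.0 + sum(math.exp(zc) for zc in z)
    return [1.0 / denom] + [math.exp(zc) / denom for zc in z]


# Hypothetical coefficients: [intercept, log-odds ratio for one binary covariate].
betas = [
    [-2.2, math.log(2.55)],  # category 1 vs reference
    [-4.0, math.log(4.00)],  # category 2 vs reference
]

p_unexposed = multinomial_probs([1.0, 0.0], betas)
p_exposed = multinomial_probs([1.0, 1.0], betas)

# The ratio of category-1-vs-reference odds recovers exp(beta) for that category.
odds_ratio = (p_exposed[1] / p_exposed[0]) / (p_unexposed[1] / p_unexposed[0])
print(p_unexposed, p_exposed, odds_ratio)
```

Each category has its own coefficient vector, which is why a covariate can show a stronger effect on one preterm category than on another.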
We performed preliminary analyses using binary and ordinal logistic regression models. A couple of variables did not satisfy the proportional odds (PO) assumption, hence the ordinal logistic regression model could not be used. The closest alternative that relaxes the PO assumption is the generalized ordered logistic regression model. However, we encountered non-convergence problems, especially with four outcome categories, as well as difficulties in interpreting the results appropriately. For instance, the order of gestational age categories is <28 weeks (extremely preterm), [28,32) weeks (very preterm), [32,37) weeks (moderate to late preterm), and 37+ weeks (term/normal). Assuming the variable is coded as 0 to 3 (with 0 being term birth), the first panel of coefficients will be interpreted as 0 vs. 1+2+3, then 0+1 vs 2+3, etc. [50]. This implies modeling the probability of delivering at a normal gestational age (category 0) compared to preterm (categories 1-3), the probability of delivering term and very preterm vs other preterm categories, etc. Similar interpretations apply even if preterm birth is coded from extremely preterm (0) to term (3). Such an interpretation could be somewhat misleading given the nature of this outcome and may not be appealing to clinicians or public health practitioners. Nevertheless, the choice of regression model often depends on the research question one would like to address. In this study, the multinomial regression model was the relevant choice for determining preterm birth predictors across different preterm birth categories, rather than performing a binary or an ordinal regression analysis.
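The difference between the two sets of comparisons can be listed explicitly. The sketch below enumerates the cumulative splits implied by a generalized ordered logit and the reference-category contrasts of a multinomial logit for an outcome coded 0 to 3; the function names are illustrative.

```python
def cumulative_splits(codes):
    """Comparisons in a (generalized) ordered logit: lower codes vs higher codes at each cut point."""
    return [(codes[:k], codes[k:]) for k in range(1, len(codes))]


def baseline_contrasts(codes, reference):
    """Comparisons in a multinomial logit: each category vs the reference category."""
    return [(c, reference) for c in codes if c != reference]


codes = [0, 1, 2, 3]  # outcome coded 0 to 3, with 0 as term birth

print(cumulative_splits(codes))      # 0 vs 1+2+3, 0+1 vs 2+3, 0+1+2 vs 3
print(baseline_contrasts(codes, 0))  # 1 vs 0, 2 vs 0, 3 vs 0
```

Only the second set compares each preterm category directly with term birth, which is the contrast of clinical interest here.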
As previously indicated, data analysis in this study accounted for missing values in the covariates. A description of how missing data were imputed is also reported in [43]. Data were imputed using multiple imputation, a commonly used method for handling missing data that accounts for the uncertainty associated with the missing values [34,37,51]. We assumed the data were missing at random (MAR), whereby the probability of data being missing does not depend on the unobserved data, conditional on the observed data [34][35][36][37]; hence the variables in the dataset were used to predict missingness [43]. We also assumed a nonmonotone pattern of missingness, in which some subject values were observed again after a missing value occurred [35,43,51]. Under a nonmonotone pattern of missingness, it is recommended to impute missing values using chained equations, a Markov chain Monte Carlo (MCMC)-type procedure also known as the fully conditional specification (FCS) [37, 51-55]. Furthermore, the FCS method allows simultaneous imputation of variables of different types, for example some continuous and others categorical.
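A small simulation can illustrate why the MAR assumption matters. In the sketch below, which uses entirely synthetic data and is not the authors' imputation procedure, the probability that `y` is missing depends only on a fully observed covariate `x`. The complete-case mean of `y` is then biased, whereas an estimate that conditions on `x` is not.

```python
import random

random.seed(5000)  # arbitrary seed, used only to make the sketch reproducible

n = 20_000
rows = []
for _ in range(n):
    x = 1 if random.random() < 0.5 else 0   # fully observed covariate
    y = 2.0 * x + random.gauss(0.0, 1.0)    # variable subject to missingness
    p_missing = 0.8 if x == 1 else 0.1      # MAR: depends on observed x only
    observed = random.random() >= p_missing
    rows.append((x, y, observed))

true_mean = 1.0  # E[y] = 2 * P(x = 1) = 2 * 0.5

# Complete-case estimate: ignores the missingness mechanism.
seen_y = [y for _, y, seen in rows if seen]
cc_mean = sum(seen_y) / len(seen_y)

# Estimate conditioning on x: stratum means weighted by the full-sample share of x.
adj_mean = 0.0
for level in (0, 1):
    share = sum(1 for x, _, _ in rows if x == level) / n
    ys = [y for x, y, seen in rows if seen and x == level]
    adj_mean += share * sum(ys) / len(ys)

print(f"true {true_mean:.2f}  complete-case {cc_mean:.2f}  adjusted {adj_mean:.2f}")
```

Multiple imputation rests on the same idea: observed variables that predict missingness are used to recover information about the unobserved values.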
To illustrate the FCS algorithm, we let $Y$ denote the fully observed outcome in this study, i.e., preterm birth, $X$ denote the partially observed covariates $X = (X_1, \ldots, X_p)$, and $W$ denote the fully observed covariates $W = (W_1, \ldots, W_q)$. Let $X^o$ and $X^m$ denote the vectors of observed and missing values of $X$ for $n$ subjects. For each partially observed covariate $X_j$, we posit an imputation model $f(X_j \mid X_{-j}, W, Y, \theta_j)$ with parameter $\theta_j$, where $X_{-j} = (X_1, \ldots, X_{j-1}, X_{j+1}, \ldots, X_p)$. This, according to [56], is typically a generalized linear model chosen according to the type of $X_j$ (e.g. continuous, binary, multinomial, or ordinal). Furthermore, a noninformative prior distribution $f(\theta_j)$ for $\theta_j$ is specified. We further let $x_j^o$ and $x_j^m$ denote the vectors of observed and missing values in $X_j$ for the $n$ subjects, and $y$ and $w$ denote the vector and matrix of fully observed values of $Y$ and $W$ across the $n$ subjects.
Let $x_j^{m(t)}$ denote imputations of the missing values $x_j^m$ at iteration $t$ and $x_j^{(t)} = (x_j^o, x_j^{m(t)})$ denote vectors of observed and imputed values at iteration $t$. Let $x_{-j}^{(t)} = (x_1^{(t)}, \ldots, x_{j-1}^{(t)}, x_{j+1}^{(t-1)}, \ldots, x_p^{(t-1)})$. The $t$th iteration of the algorithm consists of drawing from the following distributions (up to constants of proportionality) [56]:

$$\theta_j^{(t)} \sim f(\theta_j)\, f(x_j^o \mid x_{-j}^{(t)}, w, y, \theta_j),$$

$$x_j^{m(t)} \sim f(x_j^m \mid x_{-j}^{(t)}, w, y, \theta_j^{(t)}).$$

The FCS starts by calculating the posterior distribution $p(\theta \mid x^o)$ of $\theta$ given the observed data. This is followed by drawing a value $\theta^*$ from $p(\theta \mid x^o)$ given $(x_j^o, x_{-j}^{(t)}, w, y)$, which is the product of the prior $f(\theta_j)$ and the likelihood corresponding to fitting the imputation model for $X_j$ to subjects for whom $X_j$ is observed, using the observed and most recently imputed values of $X_{-j}$ [56]. Missing values in $X_j$ are then imputed from the imputation model using the parameter value drawn in the preceding step [56]. Finally, a value $x^*$ is drawn from the conditional posterior distribution of $x^m$ given $\theta = \theta^*$. The process is then repeated depending on the desired number of imputations [36, 53, 55, 56]. Within each imputation, there is an iterative estimation process until the distributions of the parameters governing the imputations have converged in the sense of becoming stable, although more cycles may be required depending on certain conditions such as the amount of missing observations in the data [55, 56]. Rubin's rule is then used to provide the final inference for $\hat{\theta}$ by averaging the estimates across $M$ imputations, given by [56]

$$\hat{\theta}_M = \frac{1}{M} \sum_{m=1}^{M} \hat{\theta}_m,$$

while the estimate of the variance of $\hat{\theta}_M$ is given by

$$\widehat{\mathrm{Var}}(\hat{\theta}_M) = \bar{U} + \left(1 + \frac{1}{M}\right) B, \qquad \bar{U} = \frac{1}{M} \sum_{m=1}^{M} \widehat{\mathrm{Var}}(\hat{\theta}_m), \qquad B = \frac{1}{M-1} \sum_{m=1}^{M} (\hat{\theta}_m - \hat{\theta}_M)^2,$$

which is a combination of the within-imputation ($\bar{U}$) and between-imputation ($B$) variances. Detailed descriptions of the implementation of the FCS/MICE algorithm in Stata are presented elsewhere [54, 57]. Maternal age and education level were imputed as ordinal variables, while maternal occupation, marital status, and BMI (because normal weight (18.5-24.9 kg/m$^2$) was the reference category) were imputed as multinomial variables [43]. The rest of the variables were binary, and so were imputed using the binomial distribution.
Preterm birth (the outcome in this study), parity, pre-eclampsia/eclampsia, anemia, malaria, systemic infections/sepsis, PROM, PPH, abruption placenta, placenta previa, and year of birth did not contain any missing values and were hence used as auxiliary variables in the imputation model. The imputation model generated 20 imputed datasets after 500 iterations (imputation cycles). A random seed of 5000 was specified so that the imputation results can be replicated each time the multiple imputation analysis is performed [51].
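The pooling step described above (Rubin's rules) can be sketched in a few lines. The function name is illustrative and the per-imputation estimates and variances are hypothetical values, not results from this study.

```python
import math


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and variances using Rubin's rules."""
    m = len(estimates)
    pooled = sum(estimates) / m                                    # average estimate
    within = sum(variances) / m                                    # within-imputation variance
    between = sum((e - pooled) ** 2 for e in estimates) / (m - 1)  # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log odds ratios and squared standard errors from M = 5 imputed datasets.
log_or = [0.92, 0.95, 0.90, 0.97, 0.93]
se_sq = [0.0014, 0.0015, 0.0013, 0.0016, 0.0014]

pooled, total_var = pool_rubin(log_or, se_sq)
print(f"pooled OR = {math.exp(pooled):.2f}, pooled SE (log scale) = {math.sqrt(total_var):.3f}")
```

Pooling is done on the log odds scale, and the result is exponentiated afterwards to report an odds ratio.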
We developed a multivariable analysis model by including all covariates in the multinomial logit analysis model [54], with standard errors adjusted for clusters (i.e., deliveries nested within mothers). We then performed stepwise regression, in which variables with p < 0.1 were retained in the model. The next steps entailed performing a series of adjusted analyses to test the effect of retaining and dropping variables in the multivariable model. Variables in the final model were evaluated at the p < 0.05 level of statistical significance. We used the AIC to compare the performance of non-nested models [58], and the likelihood ratio test to compare nested models. After the imputation of missing values, we obtained parameter estimates adjusted for the variability between imputations [54,57]. Before the analysis of imputed data, we first performed a complete case analysis using the multivariable multinomial regression model. The final model from this analysis was then compared to that from the multiply imputed datasets. We followed the recommendations suggested by Sterne et al. [34] for the reporting and analysis of missing data.
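The two model-comparison tools mentioned above can be sketched as follows. The log-likelihoods and parameter counts are hypothetical. The p-value formula is the closed form for a chi-square distribution with 2 degrees of freedom, which applies, for example, when one binary covariate is dropped from a model with two non-reference outcome categories.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_lik


def lr_test_df2(log_lik_reduced, log_lik_full):
    """Likelihood ratio test for nested models differing by two parameters."""
    stat = 2 * (log_lik_full - log_lik_reduced)
    p_value = math.exp(-stat / 2)  # chi-square survival function for df = 2
    return stat, p_value


# Hypothetical fitted models (values are illustrative only).
ll_full, k_full = -100.0, 10
ll_reduced, k_reduced = -103.0, 8

print(aic(ll_full, k_full), aic(ll_reduced, k_reduced))
stat, p_value = lr_test_df2(ll_reduced, ll_full)
print(f"LR statistic = {stat:.2f}, p = {p_value:.4f}")
```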

Ethical consideration
As described in [43], this study was approved by the Kilimanjaro Christian Medical University College Research Ethics and Review Committee (KCMU-CRERC) with approval number 2424. For practical reasons, since the interview was administered just after the woman had given birth, consent was given orally. The midwife-nurse gave every woman oral information about the birth registry, the data to be collected from them, and the use of the data for research purposes. Women were also informed about the intention to gather new knowledge, which would, in turn, benefit mothers and children in the future. Participation was voluntary and had no implications for the care women would receive. Following consent, mothers were free to decline to answer individual questions. For privacy and confidentiality, unique identification numbers were used both to identify mothers and to link them with child records. No person-identifiable information was held in any electronic database. Necessary measures were taken by midwives to ensure privacy during the interview process.

Maternal background characteristics by gestational age categories
The overall proportion of preterm birth in this study was 11.7%, of which 9.8% of children were born at [32,37) weeks (moderate to late preterm), 1.6% at [28,32) weeks (very preterm), and 0.4% at <28 weeks (extremely preterm) of gestation. The proportions of preterm birth differed significantly by maternal background and obstetric care characteristics (Tables 1 and 2, respectively). Among adolescent mothers (15-19 years), 12.3% delivered at [32,37) weeks and 1.8% at [28,32) weeks of gestation, which is similar to the proportions among older mothers (40+ years). The proportion of women who delivered at [32,37) weeks of gestation was 10.8% among rural residents, 11.0% among those with primary education level, 9.6% among those employed, and 9.6% among mothers who were married (Table 1).

Diseases and complications during pregnancy and delivery by gestational age categories
The diseases and complications during pregnancy and delivery by gestational age categories are shown in Table 2. There were statistically significant differences in the proportion of preterm birth categories by diseases and complications during pregnancy and delivery, except for anaemia, infections/sepsis and child's sex. A significantly higher proportion of deliveries at [32,37) weeks of gestation was observed among mothers who experienced placenta previa (39.6%), abruption placenta (37.3%), delivered an LBW baby (37.1%), experienced perinatal death (28.1%), had pre-eclampsia/eclampsia (24.3%), PROM (18.9%), <4 ANC visits (17.0%), or postpartum hemorrhage (14.8%). Also, the proportion of deliveries at [28,32) weeks of gestation was significantly higher among mothers with pre-eclampsia/eclampsia (6.2%), abruption

Distribution of missing values
Percentage distribution of missing values in this study are summarized in

Predictors of preterm birth
Due to the small number of deliveries, 161 (0.4%), at <28 weeks of gestation recorded at the KCMC Medical birth registry between 2000 and 2015, we combined this category with deliveries at [28,32) weeks of gestation, 714 (1.6%). This gives a total of 875 (2.0%) in the new <32 weeks (very/extremely preterm) category. Collapsing the categories increased statistical power and improved model performance, given the non-convergence of models with all three preterm birth categories.
Results before imputation of missing values.
Findings from the adjusted analysis of the multinomial regression model before imputation of missing values are shown in Table 4   Moreover, in the adjusted analysis, maternal age, referral status, pre-eclampsia/eclampsia, number of ANC visits, placenta previa, LBW, perinatal status, child's sex, and year of birth remained significantly associated with delivering at <32 weeks of gestation (very/extremely preterm). Notably, the odds of delivering at <32 weeks of gestation were nearly forty times higher (OR = 36.23, 95%CI 29.91, 43.89) among deliveries born with LBW compared to normal weight at birth. This is more than four times higher odds compared to the effect in the  1.06, 95%CI 1.04, 1.09), which is three-times higher than the effect in the [32,37) weeks of gestation. These results demonstrate the advantage of multinomial regression over simple binary regression models. We see that the effects of some covariates (LBW, inadequate ANC visits, placenta previa, and perinatal death) are more pronounced for the very/extremely preterm birth category than for the moderately to late preterm birth category (Table 4).
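Odds ratios and their confidence intervals are symmetric on the log scale, which gives a quick consistency check for a reported estimate. The sketch below applies this to the LBW estimate quoted above, assuming a Wald-type 95% interval (z = 1.96); the function name is illustrative.

```python
import math


def log_or_se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of a log odds ratio from a Wald-type confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported estimate for LBW and delivery at <32 weeks (complete case analysis).
or_lbw, ci_low, ci_high = 36.23, 29.91, 43.89

se_log_or = log_or_se_from_ci(ci_low, ci_high)
geometric_midpoint = math.sqrt(ci_low * ci_high)  # equals the OR if the CI is symmetric on the log scale

print(f"SE(log OR) = {se_log_or:.3f}, CI midpoint on OR scale = {geometric_midpoint:.2f}")
```

The geometric midpoint of the interval agrees with the reported odds ratio, as expected for an interval computed on the log scale.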

Results after imputation of missing values.
After imputation of missing values (in the covariates), the standard errors were relatively lower, while the coefficients (odds ratios) (Table 5) were either lower or higher compared to those in the complete case analysis

Discussion
Globally, the trends of preterm birth rate has been increasing over time [1,2,9,48]. Findings in the current study also revealed the rising trends of both moderate to late preterm (32 to <37 weeks of gestation) and very/extremely preterm birth (<32 weeks of gestation) between the years 2000-2015. A recent systematic review and modelling analysis revealed that Tanzania is among the top 10 countries (10 th position) with the highest preterm birth rate (16�6%) and contributed to 2.2% of the global preterm birth estimates [1]. Based on the estimates released seven years ago (2013) by Blencowe et. al., [9], Tanzania was not in the top 10 countries with the highest (>15%) preterm birth rates globally. By then, Malawi had the highest preterm birth rate (18%) in SSA and South East Asia [9,12].
Previous studies at the KCMC zonal referral hospital [4,5] and Bugando Medical Center in Mwanza region [6] reported the preterm birth rate of 14%; where [4] utilized cohort data between the years 2000-2008 while [5] and [6] conducted case-control studies. The rising trends and relatively high preterm birth rates in Tanzania are alarming, given the documented short-and long-term consequences, particularly an increased risk of recurrence in subsequent pregnancies, stillbirths, and neonatal mortality [4,11,12,19,32,59,60]. In fact, mothers who experienced perinatal death in this study were more likely to deliver preterm. The effect of perinatal death almost doubled in the very/ extremely preterm category.
Multiple imputation was performed to increase precision of parameter estimates, as it accounts for the uncertainty associated with missing data [34,35,37,51]. After the imputation of missing values, the standard errors are relatively lower and coefficients (odds ratios) were either lower or higher than those in the complete case analysis. Although the direction of associations remained the same, precision of parameters estimates is increased after imputation of missing data. It has been reported that "multiple imputation provides unbiased and valid estimates of associations based on information from the available data-ie, yielding estimates similar to those calculated from full data" [37]. Data analysts should consider accounting for missing data in their analysis using proper techniques to reduce the bias associated with simple analysis (such as analyzing available or complete cases) that ignore missing values [37,51,52].
Results from the imputed data revealed that adolescent (15-19 years) mothers and mothers aged 20-24 years had higher odds of delivering moderately to late preterm births (32 to <37 weeks) as well as very/extremely preterm (<32 weeks though this association was not statistically significant) compared to mothers aged 25-34 years. Our findings are consistent with previous studies [7,12,15,29,48]. Authors in these studies revealed that younger (<24 years) mothers are at increased risk of delivering preterm. A previous study in Canada indicated that women aged 20-24 years were more at risk of delivering spontaneous preterm birth [15]. However, authors in this study did not include adolescent mothers. Data from the Tanzania Demographic and Health Survey 2015/16 revealed the rising trends of teenage childbearing (15-19 years) from 23% in 2010 to 27% in 2015/16 [61]. Younger age at first pregnancy is a public health concern due to an increased risk of complications during pregnancy and child birth as well as maternal and neonatal mortality [15,61]. A systematic review and meta-analysis in SSA documented an association between adolescent child-bearing and an increased risk of low birth weight, pre-eclampsia/eclampsia, preterm birth and maternal and perinatal mortality [62]. Our findings suggests that interventions in Tanzania should emphasize on delayed age at first pregnancy and provision of adolescent and youth friendly sexual and reproductive health services [26,63,64], for positive pregnancy experiences.
Mothers referred for delivery at the KCMC zonal referral hospital were more likely to deliver preterm compared to those who had self-referred (normal clinic attendance). Similar findings has been reported elsewhere [43,62], where women referred for delivery are more likely to have more pregnancy-related complications such as pre-eclampsia, which increases the risk of preterm birth. Close clinical follow-up is recommended to this group of women during prenatal care to minimize pregnancy-related complications, such as preterm birth and associated consequences. Mothers with primary education compared to higher (college/university) education level had significantly higher odds of delivering moderately to late, but not very/extremely preterm. These findings were consistent to a meta-analysis of 12 European Cohorts, where poor health at birth was higher among babies born from mothers with low education levels [65]. Low socio-economic status, including low education level is reported to affect pregnancy outcomes and complications [60,66]. Policies and programs to improve maternal and child care in Tanzania should address health inequalities and prioritize the marginalized groups taking a multi-sectoral approach.
Furthermore, male children were more likely to be delivered preterm compared to females. This might be associated with shorter gestational duration for male compared to female fetuses [67]. A study in the UK found no significant relationship between fetal gender and the risk of preterm birth among women at high risk of delivering preterm (ie, with a history of miscarriage, preterm birth or cervical surgery) [18]. We also found that primiparous women were less likely to deliver preterm compared to multiparous. Findings from a meta-analysis using data from cohort studies in LMIC indicated that nulliparous, aged <18 years and parity �3 aged �35 years women were more likely to experience adverse neonatal outcomes, including preterm birth [63]. Other studies found no significant association between parity and the risk of preterm birth [5,12,32,68]. Despite that, interventions to improve maternal and child care should be delivered through out the course of woman's reproductive period.
Among the factors associated with the rise in trends of preterm birth is iatrogenic early delivery (i.e., following labour induction and/or caesarean delivery) carried out for fetal or maternal indications [69]. In this study, women who delivered moderately to late preterm were more likely to deliver through caesarean section (CS). It is possible that these women had other obstetric complications, such as a previous CS, severe pre-eclampsia/eclampsia, placenta previa, preterm premature rupture of membranes, and high birthweight, that contributed to CS delivery and hence preterm birth [70,71]. The odds of delivering both moderately to late and very/extremely preterm were high among mothers who had pre-eclampsia/eclampsia, placenta previa, or abruption placenta, as also reported elsewhere [5,21,60]. The effect of placenta previa on delivering very/extremely preterm was almost twice that on the moderately to late preterm birth category. These conditions are both risk factors for and common indications for preterm birth [48,60]. PROM increases the risk of preterm birth [9,21,48,72], which is consistent with the findings in this study. Previous studies have shown that PROM is among the common indications of spontaneous preterm birth [21,22,72].
LBW was associated with eight-fold higher odds of moderately to late preterm ([32,37) weeks of gestation) and nearly 40 times higher odds of very/extremely preterm (<32 weeks of gestation) birth. In fact, the proportions of moderately to late and very/extremely preterm birth were significantly higher among LBW deliveries than among normal birth weight deliveries (37.1% and 14.3% vs 6.4% and 0.4%, respectively) (results before imputation). Our findings agree with a previous case-control study in northern Tanzania, where LBW was associated with an over 34-fold higher risk of preterm delivery [5]. The observed association between LBW and preterm birth could be attributed to two factors: preterm birth is itself a risk factor for LBW (low birth weight but appropriate for gestational age), and intrauterine growth retardation or being small for gestational age. Literature shows that extremely preterm babies are more likely to be born with LBW, while newborns small for gestational age are at a higher risk of experiencing morbidity and mortality [73,74]. In this study, 81.2% (688/847) of very/extremely preterm newborns were also born with LBW, compared to 41.6% (1779/4279) of moderately to late preterm newborns (results before imputation). On the other hand, babies born preterm are at an increased risk of being born with LBW [44] and of experiencing perinatal and neonatal morbidity and mortality [20,43]. Care for LBW and preterm babies is a critical intervention for improving child survival. Special attention should be given to babies born with LBW at <32 weeks of gestation.
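The proportions quoted above can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the share of term births in each birth weight group is assumed to be the remainder after the two preterm categories, and the resulting crude relative odds are unadjusted and computed from rounded percentages, so they are not expected to match the adjusted, imputed estimates reported in this study.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


def crude_relative_odds(p_cat_exposed, p_ref_exposed, p_cat_unexposed, p_ref_unexposed):
    """Unadjusted odds of an outcome category versus the reference category,
    compared between an exposed and an unexposed group."""
    return (p_cat_exposed / p_ref_exposed) / (p_cat_unexposed / p_ref_unexposed)


# Counts quoted in the text (results before imputation).
lbw_among_very_extreme = percent(688, 847)
lbw_among_moderate_late = percent(1779, 4279)

# Percentages of each preterm category within each birth weight group,
# as quoted in the text (results before imputation).
lbw = {"moderate_late": 37.1, "very_extreme": 14.3}
nbw = {"moderate_late": 6.4, "very_extreme": 0.4}

# Assumption: the remainder of each group was delivered at >=37 weeks.
lbw_term = 100.0 - sum(lbw.values())
nbw_term = 100.0 - sum(nbw.values())

crude_moderate_late = crude_relative_odds(
    lbw["moderate_late"], lbw_term, nbw["moderate_late"], nbw_term
)
crude_very_extreme = crude_relative_odds(
    lbw["very_extreme"], lbw_term, nbw["very_extreme"], nbw_term
)

print(lbw_among_very_extreme, lbw_among_moderate_late)
print(round(crude_moderate_late, 1), round(crude_very_extreme, 1))
```

Even in this crude form, the relative odds are larger for the very/extremely preterm category than for the moderately to late category, which is the same ordering seen in the adjusted estimates.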
According to the WHO recommendations, antenatal care visits remain a critical entry point where high-risk pregnancies can be identified and managed [24,72,75]. We found that women with inadequate (<4) ANC visits were more likely to deliver moderately to late and very/extremely preterm. Similar findings were also reported in other studies [5,6,48,76]. However, these studies estimated the association between the number of ANC visits and overall preterm birth (<37 weeks of gestation), whereas our study showed different risk patterns in two sub-categories of preterm birth (<32 and [32,37) weeks of gestation). In Tanzania, just over half (51%) of pregnant women had at least four ANC visits during their last pregnancy [61]. Considering the current WHO recommendation of eight or more visits [75], different strategies are needed to promote health care seeking behaviors among pregnant women and the provision of quality ANC services at all levels of care. The timing and number of ANC visits are as important as the content and quality of care [77].
In this study, we applied multinomial regression models with two categories of preterm birth (<32 and [32,37) weeks of gestation) due to the rarity of cases in the <28 gestational weeks category. Collapsing the categories in this way increased statistical power. Nevertheless, it is also possible that extremely premature deliveries are under-reported in the KCMC Medical Birth Registry. Despite the low accuracy of gestational age estimation based on the date of the last menstrual period (LMP) [9,10,60], it remains the most widely used method in resource-limited settings like Tanzania. Even where ultrasound is available, this method "requires skilled technicians, equipment and for maximum accuracy, first-trimester antenatal clinic attendance" [9], which is still a challenge in Tanzania [61]. There are alternative gestational age estimation methods, such as a combination of ultrasound and LMP [9,10,60], but questions remain about the feasibility and applicability of these options in resource-limited settings.
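As an illustration of the outcome coding described above, the following minimal sketch maps gestational age in completed weeks to the three outcome levels implied by the cut-offs in the text. The function name and the category labels are hypothetical, and treating births at 37 weeks or later as a single reference category is an assumption made for this example rather than a description of the registry's actual variable coding.

```python
def classify_gestational_age(weeks):
    """Map gestational age at birth (in weeks) to a three-level outcome.

    Cut-offs follow the text: <32 weeks, [32,37) weeks, and >=37 weeks.
    Labels are illustrative, not the registry's own codes.
    """
    if weeks < 0:
        raise ValueError("gestational age cannot be negative")
    if weeks < 32:
        return "very/extremely preterm"
    if weeks < 37:
        return "moderately to late preterm"
    # Assumption: everything at or beyond 37 weeks forms the reference category.
    return "term"


# Example: one value on each side of the two cut-offs.
for ga in (27, 31, 32, 36, 37, 40):
    print(ga, classify_gestational_age(ga))
```

Note that the <28 weeks group, which was too sparse to model separately, falls into the "very/extremely preterm" level under this coding.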
Another limitation of this study is that it was hospital-based, utilizing the KCMC Medical Birth Registry data from the KCMC zonal referral hospital in northern Tanzania, and hence may suffer from referral bias. Nearly a quarter of all women were referred for delivery during the study period. This may affect the generalizability of the results. Nevertheless, this is the only birth registry in the country (and potentially one of the few in SSA) providing critical information for pregnancy monitoring, administrative, and research purposes. Such registries allow for routine and inter-generational linkage and analysis of mother-child records. The KCMC hospital and its partners should promote routine data quality checks and resolve data quality and reporting challenges to ensure sustainable operation of the birth registry for current and future use.

Conclusion
The findings from this study support other studies showing improved precision of parameter estimates after imputation of missing values and rising trends in preterm birth rates. The multinomial regression models allowed for the simultaneous assessment of predictors of different preterm birth categories, as opposed to binary regression analysis. Policy decisions should intensify efforts to improve maternal and child care throughout the course of pregnancy and childbirth, towards prevention of preterm birth. Interventions to increase the uptake and quality of ANC services should also be strengthened in Tanzania at all levels of care, where several interventions can easily be delivered to pregnant women [75], especially those at high risk of experiencing adverse pregnancy outcomes. The number of ANC visits is as important as the content of care [77].