On the development of a semi-nonparametric generalized multinomial logit model for travel-related choices

A semi-nonparametric generalized multinomial logit model, formulated using orthonormal Legendre polynomials to extend the standard Gumbel distribution, is presented in this paper. The resulting semi-nonparametric function can represent a probability density function for a large family of multimodal distributions. The model has a closed-form log-likelihood function that facilitates model estimation. The proposed method is applied to model commute mode choice among four alternatives (auto, transit, bicycle and walk) using travel behavior data from Argau, Switzerland. Comparisons between the multinomial logit model and the proposed semi-nonparametric model show that violations of the standard Gumbel distribution assumption lead to considerable inconsistency in parameter estimates and model inferences.


Introduction
The Gumbel distribution (also referred to as the Type-I extreme value distribution) plays a central role in discrete choice models for travel demand analysis [1]. This can be attributed to two major reasons. First, the Gumbel distribution closely resembles the normal distribution, which is often the preferred distribution to characterize the random disturbance term in an econometric model that accounts for the effect of unobserved factors. Second, when the Gumbel distribution is assumed for the random components of utility functions, applying the microeconomic utility maximization principle yields a closed-form likelihood function. With a closed-form likelihood function, maximum likelihood estimation (MLE) methods can be applied with ease to estimate model coefficients consistently and efficiently. Due to these appealing features of the Gumbel distribution, the Multinomial Logit (MNL) model is widely applied in practice and preferred over its counterpart that is based on the assumption of a normally distributed random error component (i.e., the Multinomial Probit or MNP model) [2][3][4]. In the context of discrete-continuous choice behaviors, the Multiple Discrete-Continuous Extreme Value (MDCEV) model [5][6][7][8][9] based on the Gumbel distribution has a neat closed-form log-likelihood expression, while models based on the normal distribution assumption do not have this feature [10][11][12][13][14][15][16][17]. However, according to the theory of maximum likelihood estimation, the consistency and efficiency of maximum likelihood estimators depend on the validity of the distributional assumption made on the random error term. It is therefore important to ensure that the distributional assumptions on the random error terms are valid when applying the MLE method to estimate the coefficients of a discrete choice model. Methods to test for violations of the normal distribution are currently available in the economic literature [18].
Recently, the authors developed a practical method to test the validity of the distributional assumption on the random disturbance term in an MNL model and obtained significant statistical evidence to reject the standard Gumbel distribution assumption in a very commonly encountered empirical setting dealing with long distance travel mode choice [19]. That finding motivates this study, which aims to develop and present the formulation for a Semi-nonparametric Generalized Multinomial Logit Model (SGMNL) for travel-related choices. The objective of this study is to generalize the MNL model by relaxing the assumption of a Gumbel distribution using a semi-nonparametric approach, and then to demonstrate the efficacy of the approach by applying the generalized model to an empirical setting of travel mode choice. It should be noted that this generalization differs fundamentally from other extensions of the MNL that have yielded the Nested Logit, Cross-nested Logit, Heteroskedastic Logit or Multinomial Probit models [20]. Those models are generalized extensions that continue to employ unimodal Gumbel or normal marginal distributions, whereas the semi-nonparametric model presented in this paper allows the marginal error distribution to have multiple modes. Thus, the proposed model provides the ability to examine potential bias in model coefficients, marginal effects and elasticities that may arise when the assumption of a unimodal distribution, such as the standard Gumbel distribution, is violated for the random components of utility functions.
Discrete choice models are widely used in transportation planning practice to predict travel mode choice behavior; the choice of transport mode has important implications for traffic congestion, energy consumption and air pollution. The study of mode choice behavior and its determinants can help transportation planning professionals design alternatives and implement policies that enhance sustainability, livability, and public health while reducing delays due to congestion. There are a number of recent studies in the literature that have focused on a study of travel mode choice behavior. For example, Shen et al. (2016) found that proximity to metro stations has a significant positive effect on the choice of rail transit as a primary commuting mode [4]. Ding et al. (2017) applied an integrated structural equation model and discrete choice model to investigate how the built environment affects travel mode. In their model system, they account for the mediating effects of car ownership and travel distance, thereby capturing both the direct and indirect effects of built environment attributes on travel mode choice [2]. Ding et al. (2014) proposed a cross-classified multilevel probit model of travel mode choice [21]. Comparisons with a traditional mode choice model not only revealed the effects of residential and workplace location on tour-based commute mode choice behavior, but also revealed the presence of spatial heterogeneity across home location and workplace in mode choice behavior. In this paper, a semi-nonparametric choice modeling method is proposed and applied to model commute mode choice among four alternatives (auto, transit, bicycle and walk) using data from Argau, Switzerland. The proposed approach is motivated by the desire to offer a more flexible and robust methodological framework for activity-travel behavior analysis.
The remainder of the paper is organized as follows. In Section 2, the literature on semi-nonparametric choice models is reviewed. In Section 3, the orthonormal Legendre polynomial is introduced and then applied to extend the standard Gumbel distribution, thus enabling the development and formulation of the Semi-nonparametric Generalized Multinomial Logit Model (SGMNL). In Section 4, data used for the empirical study is described, and empirical estimation results are presented and discussed. Finally, conclusions and directions for future research are presented in the last section.

Literature review
As early as the time when McFadden initially proposed the MNL model [22], econometricians have been questioning the validity of the distributional assumption on the error term in random utility functions [23]. When a violation of the standard Gumbel distribution assumption is found, alternative modelling approaches may be explored to overcome the ill-effects. Adopting an alternative parametric distribution for random utilities may prove to be a solution; for example, the Weibull or logistic distribution recently proposed in the literature [24,25] could serve as appropriate distributional assumptions on the random error term. In addition, a generalized multinomial logit model or a discrete-continuous choice model that allows heteroscedastic variance may also prove to be superior to the standard MNL and MDCEV model [26,27]. However, all of these alternative distributions are unimodal in nature and therefore cannot capture potential multimodalities in random errors.
Concerns about the adverse effects of violations of distributional assumptions on the random error components have motivated the development of semi-parametric and semi-nonparametric choice models. The semi-parametric choice model employs the kernel density method to estimate the distribution of random errors, and therefore does not rely on any parametric distributional assumptions [28][29][30][31][32]. The semi-nonparametric (SNP) choice model, on the other hand, is developed based on a polynomial approximation of a probability density function (PDF) that takes a flexible form [33]. Because the likelihood function has an explicit analytical expression, the SNP choice modeling method appears to be more widely applied in practice than the semi-parametric approach [34][35][36][37].
Similar to a binary probit model, the SNP binary choice model formulation also starts with a random utility $U$, which can be expressed as $U = V + \varepsilon$, where $V$ is the systematic component and $\varepsilon$ is the random component. If a dummy variable $y$ indicates whether an alternative is chosen or not, then $P(y = 1) = P(U > 0) = P(V + \varepsilon > 0) = P(\varepsilon > -V)$. The probability density function of $\varepsilon$ takes the form given in Eq (1), in which $\varphi(\varepsilon)$ represents the PDF of the standard normal distribution and is referred to as the "a priori distribution". The denominator of Eq (1) ensures that $\int_{-\infty}^{+\infty} f(\varepsilon)\,d\varepsilon = 1$. Eq (1) can be extended to obtain the choice probability. To evaluate this probability value, recursion formulas may be applied to derive the indefinite integral $\int \varepsilon^{i+j}\varphi(\varepsilon)\,d\varepsilon$. The above SNP choice model is limited to a binary choice situation due to its computational complexity in the context of a multinomial choice situation.
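For orientation, the standard Gallant and Nychka construction that the paragraph above describes can be sketched as follows. This is the textbook form from the SNP literature rather than a verbatim copy of Eqs (1) to (3); the coefficient symbols $\gamma_i$ and the integration variable $t$ are notation assumed here.

```latex
% SNP density with a standard normal prior (gamma_i are assumed symbols)
f(\varepsilon) =
  \frac{\left(\sum_{i=0}^{K} \gamma_i \varepsilon^{i}\right)^{2} \varphi(\varepsilon)}
       {\int_{-\infty}^{+\infty} \left(\sum_{i=0}^{K} \gamma_i t^{i}\right)^{2} \varphi(t)\,dt}

% Expanding the square produces the terms epsilon^{i+j} * phi(epsilon)
\left(\sum_{i=0}^{K} \gamma_i \varepsilon^{i}\right)^{2} \varphi(\varepsilon)
  = \sum_{i=0}^{K} \sum_{j=0}^{K} \gamma_i \gamma_j\, \varepsilon^{i+j} \varphi(\varepsilon)

% Binary choice probability
P(y = 1) = P(\varepsilon > -V) = \int_{-V}^{+\infty} f(\varepsilon)\,d\varepsilon

% Recursion for the indefinite integral (integration by parts, n >= 2)
\int \varepsilon^{n} \varphi(\varepsilon)\,d\varepsilon
  = -\varepsilon^{n-1} \varphi(\varepsilon)
    + (n-1) \int \varepsilon^{n-2} \varphi(\varepsilon)\,d\varepsilon
```

The recursion follows from $\varphi'(\varepsilon) = -\varepsilon\,\varphi(\varepsilon)$ and terminates at $\int \varphi\,d\varepsilon = \Phi(\varepsilon)$ and $\int \varepsilon\varphi\,d\varepsilon = -\varphi(\varepsilon)$.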

Modeling methodology
The orthonormal Legendre polynomial on the unit interval can be defined recursively as:
$$L_0(x) = 1, \qquad L_1(x) = \sqrt{3}\,(2x - 1) \tag{4}$$
$$L_n(x) = a_n (2x - 1)\,L_{n-1}(x) + b_n L_{n-2}(x), \qquad n \ge 2 \tag{5}$$
In Eq (5), $a_n = \sqrt{4n^2 - 1}\,/\,n$ and $b_n = -\dfrac{(n-1)\sqrt{2n+1}}{n\sqrt{2n-3}}$. The advantage of using this polynomial is that it ensures orthonormality on the unit interval:
$$\int_0^1 L_m(x)\,L_n(x)\,dx = \begin{cases} 1, & m = n \\ 0, & m \ne n \end{cases} \tag{6}$$
According to Gallant and Nychka [33], the prior distribution in the semi-nonparametric approach can be a distribution other than the standard normal distribution. In this paper, the orthonormal Legendre polynomial is used to construct a semi-nonparametric (SNP) probability density function that extends the standard Gumbel distribution as follows:
$$f(x) = \frac{\left[1 + \sum_{k=1}^{K} \delta_k L_k\big(G(x)\big)\right]^2 g(x)}{1 + \sum_{k=1}^{K} \delta_k^2} \tag{7}$$
where $g(x) = \exp(-e^{-x}) \cdot \exp(-x)$, $G(x) = \exp(-e^{-x})$, $\delta_k$ are scalar parameters and $K$ represents the total number of polynomials. Using Eq (6), it can be shown that $\int_{-\infty}^{+\infty} f(x)\,dx = 1$. As $f(x)$ is non-negative, it qualifies as a probability density function. Fig 1 compares the semi-nonparametric probability densities when the number of polynomials is 1 ($K = 1$) and the parameter $\delta_1$ takes a value of -2, 0, 1 or 2. When $\delta_1$ is 0, the distribution reduces to a standard Gumbel distribution, as shown by the red curve. When $\delta_1$ takes a value of -2, 1 or 2, the distributions are bimodal, although the secondary peak in the distribution is rather flat when $\delta_1$ is equal to -2 or 1. Fig 2 compares the semi-nonparametric probability densities when the number of polynomials is 2 ($K = 2$) and two scalar parameters $\delta_1$ and $\delta_2$ are involved. With two polynomials, and where the highest power term of $G(x)$ increases to 2, the SNP function represented in Eq (7) can generate a more flexible probability density distribution. It can be seen that, when $\delta_1$ is 2 and $\delta_2$ is -2, the distribution exhibits two modes with almost equal probability densities. When $\delta_1$ is 0 and $\delta_2$ is 2, the distribution shows three modes. It may further be expected that, when the number of polynomials ($K$) or the highest power term of $G(x)$ increases, the SNP function with a flexible form can effectively represent the probability density function for a large family of distributions with multiple modes.
Such flexibility allows for a better representation of the distribution of the error term in a random utility function of a choice model, and therefore provides the ability to obtain more consistent estimates of model coefficients.
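The behavior of the extended density can be checked numerically. The sketch below assumes that the SNP density has the form $f(x) = [1 + \sum_k \delta_k L_k(G(x))]^2 g(x) / (1 + \sum_k \delta_k^2)$, with the orthonormal Legendre polynomials built from the recursion coefficients $a_n = \sqrt{4n^2-1}/n$ and $b_n = -(n-1)\sqrt{2n+1}/(n\sqrt{2n-3})$; these forms are a reading of the text, and the function names are hypothetical.

```python
import math

def legendre_orthonormal(n_max, u):
    """Return [L_0(u), ..., L_n_max(u)] for the orthonormal Legendre
    polynomials on [0, 1] (assumed three-term recursion)."""
    vals = [1.0]
    if n_max >= 1:
        vals.append(math.sqrt(3.0) * (2.0 * u - 1.0))
    for n in range(2, n_max + 1):
        a_n = math.sqrt(4.0 * n * n - 1.0) / n
        b_n = -(n - 1.0) * math.sqrt(2.0 * n + 1.0) / (n * math.sqrt(2.0 * n - 3.0))
        vals.append(a_n * (2.0 * u - 1.0) * vals[n - 1] + b_n * vals[n - 2])
    return vals

def snp_gumbel_pdf(x, deltas):
    """Assumed SNP density: squared Legendre series in G(x) times the
    standard Gumbel pdf, normalized by 1 + sum(delta_k^2)."""
    G = math.exp(-math.exp(-x))          # standard Gumbel CDF
    g = G * math.exp(-x)                 # standard Gumbel PDF
    L = legendre_orthonormal(len(deltas), G)
    series = 1.0 + sum(d * L[k + 1] for k, d in enumerate(deltas))
    return series ** 2 * g / (1.0 + sum(d * d for d in deltas))

def integrate(f, lo, hi, steps=20000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

if __name__ == "__main__":
    for deltas in ([0.0], [2.0], [2.0, -2.0], [0.0, 2.0]):
        area = integrate(lambda x: snp_gumbel_pdf(x, deltas), -10.0, 30.0)
        print(deltas, round(area, 6))
```

The integration range of [-10, 30] is an arbitrary truncation that covers essentially all of the probability mass for the parameter values discussed above.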

Simplifying the semi-nonparametric (SNP) probability density function (PDF)
Following Gallant and Nychka [33], it is possible to employ the SNP PDF in Eq (7) to construct random components in utility functions so that multiple modes may be accommodated in their distributions. Before the choice probability can be derived, the SNP PDF needs to be simplified. Using Eqs (4) and (5), it is possible to write the polynomial in a general form as:
$$L_n(x) = \sum_{k=0}^{n} c_{n,k}\,x^k$$
where $c_{n,k}$ is a constant coefficient for the term $x^k$ in the $n$th polynomial. When $k > n$, $c_{n,k} = 0$. Then, $L_0 = 1$ and $L_1 = ax + b$. When $n \ge 2$, as per Eq (5),
$$L_n(x) = a_n (2x - 1) \sum_{k=0}^{n-1} c_{n-1,k}\,x^k + b_n \sum_{k=0}^{n-2} c_{n-2,k}\,x^k$$
Then, it is possible to write $L_n(x) = \sum_{k=0}^{n} c_{n,k}\,x^k$, in which
$$c_{n,0} = -a_n c_{n-1,0} + b_n c_{n-2,0}; \qquad c_{n,k} = a_n\,(2c_{n-1,k-1} - c_{n-1,k}) + b_n c_{n-2,k},\ 0 < k < n; \qquad c_{n,n} = 2 a_n c_{n-1,n-1} \tag{10}$$
When $n$ = 0 or 1, define $c_{0,0} = 1$, $c_{1,0} = b$, and $c_{1,1} = a$. For any integer $n$ ($n \ge 2$), the recursion equations (10) can be applied to compute the coefficients $c_{i,j}$, and all of the $c_{i,j}$ values form a lower triangular matrix, called the "c" matrix in this paper. Table 1 provides an example of such a "c" matrix when $n$ reaches 6. With the "c" matrix, the general form of the orthonormal Legendre polynomial (given the $n$ value) may be obtained. For example, when $n = 4$, the fourth row vector of coefficients in the "c" matrix can be extracted to write the polynomial as $L_4(x) = 3x^0 - 60x^1 + 270x^2 - 420x^3 + 210x^4$.
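The recursion for the "c" matrix can be sketched in a few lines. The starting values $a = 2\sqrt{3}$ and $b = -\sqrt{3}$ for $L_1 = ax + b$, and the closed forms used for $a_n$ and $b_n$, are assumptions here; they were chosen because they reproduce the $L_4$ coefficients quoted above. The function name is hypothetical.

```python
import math

def legendre_c_matrix(n_max):
    """Lower triangular matrix c with L_n(x) = sum_k c[n][k] * x**k."""
    c = [[0.0] * (n_max + 1) for _ in range(n_max + 1)]
    c[0][0] = 1.0
    if n_max >= 1:
        c[1][0] = -math.sqrt(3.0)        # b in L_1 = a*x + b (assumed)
        c[1][1] = 2.0 * math.sqrt(3.0)   # a in L_1 = a*x + b (assumed)
    for n in range(2, n_max + 1):
        a_n = math.sqrt(4.0 * n * n - 1.0) / n
        b_n = -(n - 1.0) * math.sqrt(2.0 * n + 1.0) / (n * math.sqrt(2.0 * n - 3.0))
        # Constant term
        c[n][0] = -a_n * c[n - 1][0] + b_n * c[n - 2][0]
        # Interior terms, 0 < k < n
        for k in range(1, n):
            c[n][k] = a_n * (2.0 * c[n - 1][k - 1] - c[n - 1][k]) + b_n * c[n - 2][k]
        # Leading term
        c[n][n] = 2.0 * a_n * c[n - 1][n - 1]
    return c

if __name__ == "__main__":
    c = legendre_c_matrix(6)
    print([round(v, 6) for v in c[4][:5]])
```

Up to floating-point rounding, the row for $n = 4$ agrees with the coefficients 3, -60, 270, -420 and 210 given in the text.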
After the "c" matrix is generated, $\delta_0$ is defined as 1, so that the numerator of the SNP probability density function in Eq (7) can be rewritten as a polynomial in $G(x)$ multiplied by $g(x)$ [Eq (11)], with the coefficients of that polynomial given by Eq (12). Essentially, the SNP PDF in Eq (7) is simplified to:
$$f(x) = \sum_{m=0}^{M} \xi_m\,[G(x)]^m\,g(x) \tag{13}$$
where $\xi_m$ is a function of the parameters $\delta_k$, and $M$ (= 2K) is the highest power of $G(x)$ in the formula. The relationship between $\xi_m$ and $\delta_k$ is described by Eqs (11) and (12). Because $g(x)$ is the derivative of $G(x)$, the cumulative distribution function (CDF) of the extended probability density function may be formulated as:
$$F(x) = \sum_{m=0}^{M} \frac{\xi_m}{m+1}\,[G(x)]^{m+1}$$
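The conversion from $\delta$ to $\xi$ amounts to squaring a polynomial in $G(x)$ and normalizing it. The sketch below is one way to carry this out under the assumption that the simplified density is $f(x) = \sum_m \xi_m G(x)^m g(x)$ with $\delta_0 = 1$ and normalizing constant $\sum_{k=0}^{K}\delta_k^2$; it is not a transcription of Eqs (11) and (12), and all function names are hypothetical.

```python
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (low to high power)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def legendre_coeffs(n_max):
    """Power-series coefficients of the orthonormal Legendre polynomials on [0, 1]."""
    polys = [[1.0], [-math.sqrt(3.0), 2.0 * math.sqrt(3.0)]]
    for n in range(2, n_max + 1):
        a_n = math.sqrt(4.0 * n * n - 1.0) / n
        b_n = -(n - 1.0) * math.sqrt(2.0 * n + 1.0) / (n * math.sqrt(2.0 * n - 3.0))
        nxt = [a_n * v for v in poly_mul([-1.0, 2.0], polys[n - 1])]
        for k, v in enumerate(polys[n - 2]):
            nxt[k] += b_n * v
        polys.append(nxt)
    return polys[: n_max + 1]

def delta_to_xi(deltas):
    """Convert (delta_1..delta_K) into (xi_0..xi_M) with M = 2K."""
    full = [1.0] + list(deltas)                  # delta_0 = 1
    series = [0.0] * len(full)
    for d, poly in zip(full, legendre_coeffs(len(deltas))):
        for k, v in enumerate(poly):
            series[k] += d * v
    norm = sum(d * d for d in full)
    return [v / norm for v in poly_mul(series, series)]

def snp_pdf(x, xi):
    G = math.exp(-math.exp(-x))
    return G * math.exp(-x) * sum(x_m * G ** m for m, x_m in enumerate(xi))

def snp_cdf(x, xi):
    G = math.exp(-math.exp(-x))
    return sum(x_m * G ** (m + 1) / (m + 1) for m, x_m in enumerate(xi))

if __name__ == "__main__":
    xi = delta_to_xi([2.0, -2.0])
    print(len(xi) - 1, [round(v, 4) for v in xi])
    print(round(snp_cdf(30.0, xi), 6))
```

A useful sanity check is that $\sum_m \xi_m/(m+1) = 1$, which is the value of the CDF as $x \to +\infty$.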

Derivation of choice probabilities and likelihood function
Suppose there are $J$ alternatives in the choice set and their random utility functions are $U_1, U_2, \ldots, U_J$. Let the utility $U_j$ be expressed as the sum of the systematic component $V_j$ and the random component $\varepsilon_j$ (i.e., $U_j = V_j + \varepsilon_j$). Assume that the $\varepsilon_j$ independently follow the extended distribution, with semi-nonparametric PDF and CDF given as:
$$f_j(\varepsilon_j) = \sum_{m_j=0}^{M_j} \xi_{j,m_j}\,[G(\varepsilon_j)]^{m_j}\,g(\varepsilon_j), \qquad F_j(\varepsilon_j) = \sum_{m_j=0}^{M_j} \frac{\xi_{j,m_j}}{m_j+1}\,[G(\varepsilon_j)]^{m_j+1}$$
The subscript $j$ is added to allow $\varepsilon_j$ in various random utilities to have different SNP distributions. In addition, three Lemmas, whose proofs are furnished in S1 Appendix, are used in the subsequent derivation of choice probabilities. Based on the utility maximization principle, the alternative with the highest utility is chosen, where $y$ is a categorical choice variable indicating the specific alternative that is chosen. Then,
$$P(y = 1) = P(\varepsilon_2 < V_{12} + \varepsilon_1,\ \varepsilon_3 < V_{13} + \varepsilon_1,\ \ldots,\ \varepsilon_J < V_{1J} + \varepsilon_1) = \int_{-\infty}^{+\infty} f_1(\varepsilon_1) \prod_{j=2}^{J} F_j(V_{1j} + \varepsilon_1)\,d\varepsilon_1$$
where $V_{1j} = V_1 - V_j$. Let the integral part in the formula be defined as "Int". Lemma 3 in S1 Appendix provides its value, and by substituting "Int" into the choice probability expression, an elegant closed-form equation for the choice probability may be obtained. The derivation above is shown for the case when $y = 1$, but can be generalized without loss of generality to the situation where $y = k$. The log-likelihood function over the entire sample may be formulated as:
$$LL = \sum_{i=1}^{N} \sum_{k=1}^{J} I(y_i = k)\,\ln P(y_i = k) \tag{18}$$
where $I(\cdot)$ is an indicator function; the subscript $i$ is the index for an observed choice in the sample and $N$ is the sample size. The log-likelihood function can be maximized to estimate model coefficients in the systematic component $V_j$ as well as parameters in the vector $\delta_j$ that have been incorporated into $\xi_{j,m_j}$. When all $M_j = 0$, $P(y = k)$ reduces to the choice probability of the standard MNL model.
Estimation results for the standard MNL model are presented in the first part of Table 2. Both level of service (LOS) attributes and commuters' demographic and socioeconomic attributes are included as explanatory variables in the utility functions.
Travel times, including auto in-vehicle time, transit in-vehicle time, and bicycle and walk times, exhibit significantly negative coefficients in the respective utility functions. Transit service frequency takes a significantly positive coefficient, indicating that a high service frequency would increase the propensity of commuters to use transit. Model coefficients associated with demographic and socioeconomic attributes show that female commuters are less likely to use auto and bicycle modes. Low-income commuters are more likely to use transit or bicycle modes, while high-income commuters are less likely to use the transit mode. Commuters with a lower education level are less likely to use auto than those with a high education level. Older commuters are less likely to use public transit. All of the estimation results are behaviorally intuitive and consistent with expectations. The model's log-likelihood value at convergence is -2495.646, corresponding to an adjusted likelihood ratio index of 0.1923 for the overall goodness-of-fit measure of the model.
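To make the choice probability derived in this section concrete, the sketch below evaluates $P(y = k) = \int f_k(\varepsilon)\prod_{j \ne k} F_j(V_k - V_j + \varepsilon)\,d\varepsilon$ in two ways: by direct numerical integration, and through a closed form obtained here by substituting $u = G(\varepsilon)$ and using $G(c + \varepsilon) = G(\varepsilon)^{\exp(-c)}$. The closed form is a derivation under the assumed simplified PDF and CDF, not a transcription of the paper's equation, and the helper `xi_k1` (the $\xi$ coefficients for $K = 1$) as well as the utility values are hypothetical.

```python
import math
from itertools import product

def xi_k1(delta):
    """xi coefficients for K = 1: square of (1 - sqrt(3)*d) + 2*sqrt(3)*d*u, normalized."""
    s3 = math.sqrt(3.0)
    norm = 1.0 + delta * delta
    return [(1.0 - s3 * delta) ** 2 / norm,
            4.0 * s3 * delta * (1.0 - s3 * delta) / norm,
            12.0 * delta * delta / norm]

def pdf(x, xi):
    G = math.exp(-math.exp(-x))
    return G * math.exp(-x) * sum(v * G ** m for m, v in enumerate(xi))

def cdf(x, xi):
    G = math.exp(-math.exp(-x))
    return sum(v * G ** (m + 1) / (m + 1) for m, v in enumerate(xi))

def prob_numeric(k, V, xis, lo=-10.0, hi=30.0, steps=20000):
    """Trapezoid-rule evaluation of the defining integral for P(y = k)."""
    def integrand(e):
        val = pdf(e, xis[k])
        for j in range(len(V)):
            if j != k:
                val *= cdf(V[k] - V[j] + e, xis[j])
        return val
    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return total * h

def prob_closed(k, V, xis):
    """Closed form: nested sums over (m_1, ..., m_J), one level per alternative."""
    total = 0.0
    for ms in product(*(range(len(xi)) for xi in xis)):
        coef = xis[k][ms[k]]
        denom = ms[k] + 1.0
        for j, m in enumerate(ms):
            if j != k:
                coef *= xis[j][m] / (m + 1.0)
                denom += (m + 1.0) * math.exp(V[j] - V[k])
        total += coef / denom
    return total

if __name__ == "__main__":
    V = [0.5, -0.2, 0.1, -1.0]                        # hypothetical utilities
    xis = [xi_k1(-0.5), xi_k1(1.0), [1.0], [1.0]]     # Gumbel for last two
    for k in range(4):
        print(k, prob_closed(k, V, xis), prob_numeric(k, V, xis))
```

When every alternative uses `[1.0]` (all $M_j = 0$), each term collapses to $e^{V_k}/\sum_j e^{V_j}$, the MNL probability. The nested loop also illustrates why the number of summation levels grows with the number of alternatives.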

Data and modeling procedure
Next, the proposed SGMNL (semi-nonparametric generalized multinomial logit) model is estimated to relax the standard Gumbel distribution for random components in modal utility functions. First, consider the specification in which $K_j$ is set at 1, where $K$ is the number of polynomials in Eq (7) and $j$ is an index for travel mode (i.e., $j$ = 1, 2, 3 or 4). When $K_1 = 1$, it is found that the log-likelihood value improves from -2495.646 to -2488.037. As the current model nests the original MNL model, the likelihood ratio chi-square test may be applied to show that the improvement is statistically significant [i.e., (2495.646 - 2488.037) × 2 = 15.22 > 3.84, the critical chi-square value for one degree of freedom at a 95% confidence level]. This result strongly rejects the assumption of a standard Gumbel distribution for the random component in the auto utility function.
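The likelihood ratio arithmetic above can be reproduced with a short script. The log-likelihood values are those reported in the text; the helper names are hypothetical, and the chi-square tail probabilities are written out only for one and two degrees of freedom, where simple closed forms exist.

```python
import math

def lr_statistic(ll_restricted, ll_unrestricted):
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (ll_unrestricted - ll_restricted)

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are supported in this sketch")

if __name__ == "__main__":
    ll_mnl = -2495.646        # MNL log-likelihood reported in the text
    ll_sgmnl_11 = -2488.037   # log-likelihood with K_1 = 1, from the text
    stat = lr_statistic(ll_mnl, ll_sgmnl_11)
    print("LR statistic:", round(stat, 3))
    print("p-value (1 df):", chi2_sf(stat, 1))
```

The same two functions apply to the later comparisons between nested specifications, with the degrees of freedom set to the number of added $\delta$ parameters.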
Model estimation results are presented in the second part of Table 2 and denoted as "SGMNL-11". In this model, the signs of explanatory variable coefficients do not change from those obtained in the standard MNL model, but the magnitudes of coefficients in the auto utility function are found to differ. As expected, the alternative specific constant in the auto utility function changes substantially from -0.0919 to 0.9242 because the expectation of the new SNP distribution is very different from the expectation of the standard Gumbel distribution (Euler constant ≈ 0.577), and the alternative specific constant reflects this difference. An interesting finding is that the significance level of the single coefficient $\delta_{1,1}$ (as indicated by the t-statistic) is not as strong as that implied by the $\chi^2$ test for the overall model fit. However, it should be noted that the likelihood ratio test should be applied to determine whether a semi-nonparametric choice model form is more appropriate because the significance of multiple coefficients, and their contribution to overall goodness-of-fit, needs to be tested on most occasions.
After one significant coefficient $\delta_{1,1}$ is found for the first utility function, $K_2$ in the second utility function is then set to 1 and $\delta_{2,1}$ is estimated. Estimation results for this model, denoted as "SGMNL-21", are presented in the third part of Table 2. It can be seen that, after $\delta_{2,1}$ is introduced in the model specification, $\delta_{1,1}$ becomes insignificant but $\delta_{2,1}$ becomes highly significant as indicated by the t-statistics. The likelihood ratio test indicates that the model "SGMNL-21" with additional coefficient $\delta_{2,1}$ is significantly better than the model "SGMNL-11", which does not include parameter $\delta_{2,1}$ [(2488.037 - 2472.741) × 2 ≈ 30.59 > 3.84].
The likelihood ratio test also shows that "SGMNL-21" is significantly better than the regular MNL model specification [(2495.646 - 2472.741) × 2 ≈ 45.81 > 5.99, the critical $\chi^2$ value corresponding to two degrees of freedom at a 95% confidence level]. Given that both "SGMNL-11" and "SGMNL-21" performed significantly better than the regular MNL model, both $\delta_{1,1}$ and $\delta_{2,1}$ should be retained in the SNP model. A comparison of coefficient estimates shows considerable differences across the "SGMNL-21", "SGMNL-11", and "MNL" models, particularly for the transit utility functions. This is consistent with the notion that the introduction of $\delta_{1,1}$ and $\delta_{2,1}$ will change the expectation and standard deviation of random components; both alternative specific constants and coefficients of explanatory variables change accordingly.
When $\delta_{3,1}$ or $\delta_{4,1}$ for bicycle and walk modes are introduced, no significant improvement is observed. In the interest of brevity, those estimation results are not presented here. The modeling effort now moves to the second stage, where the $K$ value is increased to 2 and the coefficients $\delta_{1,2}$, $\delta_{2,2}$, $\delta_{3,2}$ and $\delta_{4,2}$ are introduced into the model one by one. In this stage, it is found that only the introduction of $\delta_{2,2}$ in the transit utility function significantly improves the overall model fit ($\chi^2$ test value = 6.57 > 3.84) while all other $\delta$ values do not. A final model estimation effort is performed, in which the $K$ value is increased to 3 and parameter $\delta_{2,3}$ is introduced in the model. The maximum likelihood estimation procedure fails to converge, indicating that the sample of 2,756 choice observations may not be sufficient to support model estimation where the $K$ value is increased to 3. Thus, the final best model is considered to be that which adopts a $K$ value of 2 and introduces parameter $\delta_{2,2}$, in addition to parameters $\delta_{1,1}$ and $\delta_{2,1}$ introduced in "SGMNL-21". This final model is designated "SGMNL-22". If its model coefficients are compared with those in "SGMNL-21", there is no substantial difference observed, except for the alternative specific constant and the coefficient associated with the "high-income" dummy variable in the transit utility function. As this is considered the final model, all subsequent comparisons are conducted between the MNL model and the final "SGMNL-22" model.
Fig 3 shows the probability densities of the random components in the "SGMNL-22" model. Eqs (11) and (12) are used to convert the estimated $\delta$ values to $\xi$ values and then Eq (13) is used to compute the probability densities based on $\xi$ values. The green curve represents the standard Gumbel distribution for random components in bicycle and walk mode utility functions (i.e., e3 and e4 in Fig 3). The blue curve represents the distribution of the random component in the auto utility function.
The coefficient $\delta_{1,1}$ not only reduces the variance of the distribution of the random component but also shifts its mode towards the negative side by about 0.6 units. This helps explain why the alternative specific constant in the auto utility of the "SGMNL-22" model is substantially more positive than that in the MNL model. The positive alternative specific constant offsets the negative expectation of the new random component. The lower variance of the error distribution for the auto utility may be due to the existence of fewer unspecified or unobserved random factors associated with auto mode choice than with other mode choices. The distribution of the random component in the transit utility function (i.e., e2) presents an interesting pattern in the context of this study. With the inclusion of parameters $\delta_{2,1}$ and $\delta_{2,2}$ in the model (both of which are significant), "e2" depicts a bimodal distribution as shown by the red curve. The major mode on the right side is located near 0.6 and the minor one on the left side is near -1.2 on the coordinate axis. Based on this finding, it may be conjectured that there are two key groups of commuters mixed in the sample. One group of commuters has a positive attitude and inclination towards using transit and is associated with the major mode of the distribution. Meanwhile, a smaller group of commuters has a negative attitude towards transit and comprises the distribution near the minor mode. Although the exact source of the bimodal distribution is uncertain, the proposed SNP modeling method depicts the existence of such a phenomenon and exposes the potential limitation of using conventional MNL choice models that are based on unimodal distributional assumptions. Capturing the bimodal distribution in the choice model can help realize more consistent coefficient estimates and reliable policy sensitivities.

A comparison of aggregate marginal effects and elasticities
Coefficients in choice models usually do not directly reflect the impact of an explanatory variable on choice probabilities, particularly when the standard deviations of random components are scaled up or down, as in the transit or auto utility in the SGMNL model estimated in this study. To better understand differences in model sensitivity between MNL and SGMNL models, marginal effects and elasticities are computed and compared. In this subsection, aggregate marginal effects (AME) and aggregate elasticities (AE) with respect to level of service (LOS) variables are computed by finite differences based on Eqs (20) and (21). In these equations, $P$ represents the choice probability expression of the MNL or SGMNL model, and $x_i$ represents a vector of explanatory variables except the one (i.e., $z_i$) whose marginal effect or elasticity is being computed. $\Delta$ takes a value of 0.01 in this study as it is found that such a small interval provides sufficiently accurate estimates for "AME" and "AE" in both MNL and SGMNL models. Table 3 presents a comparison of computed "AME" and "AE" values between MNL and SGMNL-22 models. Relative differences in "AME" and "AE" are found to be considerable, which is consistent with the notion that maximum likelihood estimators are inconsistent when distributional assumptions are violated. Such differences have important policy implications for transportation planning and management. For example, suppose a transportation authority intends to shift commuters from the auto mode to the transit mode by increasing transit service frequency. In predicting the number of commute drivers who will shift from auto to transit in response to the transit improvement, the conventional MNL model underestimates the magnitude of the elasticity with respect to transit service frequency by about 25% (-0.082 vs -0.110).
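A finite-difference computation of this kind can be sketched as follows. The definitions used here are assumptions, since the exact forms of Eqs (20) and (21) are not reproduced: the AME is taken as the sample average of $[P(x_i, z_i + \Delta) - P(x_i, z_i)]/\Delta$, and the AE as the relative change in the summed probabilities when $z_i$ is scaled by $(1 + \Delta)$, divided by $\Delta$. The utility coefficients, the three-alternative MNL and the sample records are all hypothetical.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit probabilities (numerically stable softmax)."""
    top = max(utilities)
    expv = [math.exp(v - top) for v in utilities]
    total = sum(expv)
    return [e / total for e in expv]

def probs(person):
    """Choice probabilities [auto, transit, bicycle]; coefficients are hypothetical."""
    v_auto = -0.05 * person["auto_time"]
    v_transit = -0.5 - 0.04 * person["transit_time"] + 0.03 * person["transit_freq"]
    v_bike = -1.0 - 0.06 * person["bike_time"]
    return mnl_probs([v_auto, v_transit, v_bike])

def aggregate_effects(sample, var, alt, delta=0.01):
    """Return (AME, AE) of variable `var` on the probability of alternative `alt`."""
    base = [probs(p)[alt] for p in sample]
    plus_abs = [probs({**p, var: p[var] + delta})[alt] for p in sample]
    plus_rel = [probs({**p, var: p[var] * (1.0 + delta)})[alt] for p in sample]
    ame = sum(a - b for a, b in zip(plus_abs, base)) / (len(sample) * delta)
    ae = (sum(plus_rel) - sum(base)) / (delta * sum(base))
    return ame, ae

if __name__ == "__main__":
    sample = [  # hypothetical commuters
        {"auto_time": 5.0, "transit_time": 8.0, "transit_freq": 6.0, "bike_time": 12.0},
        {"auto_time": 15.0, "transit_time": 20.0, "transit_freq": 4.0, "bike_time": 40.0},
        {"auto_time": 10.0, "transit_time": 12.0, "transit_freq": 10.0, "bike_time": 25.0},
    ]
    for alt, name in enumerate(["auto", "transit", "bicycle"]):
        print(name, aggregate_effects(sample, "transit_freq", alt))
```

Because the routine only needs a function that returns choice probabilities, `probs` could be swapped for an SGMNL probability function to compare the two model forms on the same sample.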

A comparison of disaggregate marginal effects and elasticities
The "AME" or "AE" presented in the previous subsection provide sample sensitivity to explanatory variables at the aggregate level and show how a level of service (LOS) variable, for example, affects market shares of alternatives based on the assumption that the sample is randomly drawn and can therefore represent the population shares well. However, aggregate measures of effects mask an important difference between MNL and SGMNL models. The MNL model has the IIA (Independence of Irrelevant Alternatives) property while the SGMNL model does not have this property. In order to illustrate this important difference between the two models, disaggregate marginal effects and elasticities are computed and compared for a specific individual commuter who is a 40-year-old male with medium-level income and education level above middle school. The multimodal transportation level of service variables for this individual's commute are as follows: auto in-vehicle time is 5 minutes; transit in-vehicle time is 8 minutes; transit service frequency is 6 times per hour; bicycle travel time is 12 minutes; and walk travel time is 35 minutes. Given these input variables for this specific commuter, both MNL and SGMNL-22 models are applied to compute choice probabilities of alternative travel modes. Results are shown in Table 4. There is a substantial difference in the choice probability of transit mode between the two models. The computations show that the MNL model returns a transit choice probability that is higher than that provided by the SGMNL-22 model by 41.8%, presumably because the MNL model does not capture and reflect the bimodal distribution of the random component in the transit utility function. Table 4 also presents a comparison of predicted means of market shares (i.e., $\sum_{i=1}^{N} P_i / N$) over the entire sample. An appealing property of the MNL model is that it can replicate the observed sample shares perfectly using alternative specific constants in utility functions [1].
The SGMNL model does not have this feature, but the greatest difference occurs in the transit share where the relative difference is found to be only 1.5%, which is quite reasonable and acceptable.
The IIA property, which is a key feature of the MNL model, also manifests in the form of equal cross-elasticities [40]. Formulations similar to those expressed in Eqs (20) and (21) are applied to compute disaggregate marginal effects and elasticities with respect to LOS variables. The only difference is that the equations are applied to the specific individual commuter as opposed to all of the commuters in the sample. Results of the computations are presented in Table 5. It can be seen that cross-elasticities are equal in the MNL model, which reflects its IIA property. However, with unequal variances in auto and transit utilities in the SGMNL model, cross-elasticities for auto and transit choice probabilities are not equal, thus demonstrating that the SGMNL model does not possess the IIA property. However, because the random components in bicycle and walk utilities have equal variance, cross-elasticities for these two alternatives are still equal and therefore the IIA property holds for the bicycle and walk modes even in the case of the SGMNL model. This is similar to the situation where two alternatives belong to the same nest in a nested logit model.

A comparison of changes in transit choice probability in response to a service frequency improvement
To further illustrate the policy implications of alternative model forms, changes in transit choice probability predicted by the two models in response to a service frequency improvement are compared for the specific individual commuter considered previously. The result of this comparison is presented in Fig 4. Relative to the SGMNL model, the MNL model overestimates the transit choice probability when the service frequency is low (<18 per hour) but underestimates it when the service frequency is high (≥18 per hour). A service frequency of 18 transit vehicles per hour is quite high, reflecting a headway of just over three minutes. Given that most real-world transit services operate at frequencies less than 18 vehicles per hour, it appears that the MNL model is likely to overestimate the transit choice probability relative to the SGMNL model. In this particular example, when the service frequency is very low (≤4 per hour), the relative difference between the predicted transit choice probabilities computed from the MNL and SGMNL models can exceed 50%.

Conclusions
In this paper, a semi-nonparametric generalized multinomial logit (SGMNL) model is formulated and developed by applying orthonormal Legendre polynomials to extend the standard Gumbel distribution that lies at the core of multinomial logit models applied in practice. The semi-nonparametric function with flexible forms can represent a probability density function for a large family of multimodal distributions. Unlike the existing semi-nonparametric modeling method which is applied to binary choice situations in the econometric literature, the proposed method allows for modeling multinomial choices, which are typically encountered in travel-related choice behavior analysis and travel demand modeling. The advantage of the proposed method is that the formulation results in a closed-form likelihood function and standard maximum likelihood estimation methods can be applied for parameter estimation. Thus, the model estimation procedure is computationally efficient and free from simulation-based complexity or errors. The proposed modeling method is applied to an empirical setting of commute travel mode choice among four alternatives (auto, transit, bicycle and walk), based on travel survey and network skim (level of service) data from the Canton of Argau in Switzerland. It is found that the distribution of the random component in the auto utility function is similar to a Gumbel distribution, but has substantially smaller variance. More notably, the random component in the transit utility function follows a bimodal distribution, which indicates a significant departure from and violation of the assumption of a Gumbel distribution. Unequal variances accommodated in the formulation allow the semi-nonparametric model to be free of the limitations of the IIA property that are inherent to the multinomial logit model.
The semi-nonparametric model specifications are found to offer superior goodness-of-fit when compared with the MNL model. The violation of the standard Gumbel distribution assumption in the multinomial logit model leads to inconsistent coefficient estimates, marginal effects, elasticities and choice probabilities. In the empirical context considered in this study, the multinomial logit model is found to overestimate the predicted transit choice probability relative to the semi-nonparametric model for transit service scenarios commonly encountered in the real world.
A few limitations of the proposed method and directions for future research are worthy of note. First, it may be challenging to directly apply the proposed method to model choice behaviors in the context of a large choice set (e.g. [41]). The likelihood function, depicted in Eq (18), involves multiple levels of summations and the number of levels is dependent on the number of alternatives in the choice set. Thus, the computational complexity will increase geometrically with an increase in the number of alternatives in the choice set. Future research should focus on reducing computational complexity in the context of large choice sets. Second, the proposed model is developed based on the assumption that random components in utility functions are mutually independent. However, this assumption may not hold in empirical settings. In future research, there may be the potential to introduce correlations in joint semi-nonparametric distributions and develop nested or cross-nested versions of the proposed semi-nonparametric multinomial choice model. Third, it is uncertain whether the empirical results of this study, in which the random component of the transit utility is found to follow a bimodal distribution, are valid in different geographical and modal contexts. Conducting studies similar to this one in different contexts would help shed light on the generalizability of results reported in this paper.