Investigating influence factors of traffic violations at signalized intersections using data gathered from traffic enforcement camera

To effectively reduce traffic violations that often cause severe crashes at signalized intersections, it is urgent and essential to explore their contributing factors. This study investigated the influence factors of wrong-way driving (WWD), red-light-running (RLR), violating traffic markings (VTM), and driving in the inaccurate oriented lane (DIOL) at signalized intersections by using data collected from traffic enforcement cameras in Hohhot, China. To this end, an ordinary multinomial logit model was developed. To account for the unobserved heterogeneity between observations, a random effects multinomial logit model was proposed as well. The marginal effects of explanatory variables were then computed. The outcomes showed that non-local vehicles were more likely to commit WWD and VTM than local vehicles. WWD and RLR frequently occurred in the daytime and evening (6:00–23:59), and on most days within a week. RLR and DIOL mainly happened in June and July. The left-turn lane ratio significantly increased RLR and DIOL. Cloudy, partly cloudy, and rainy days increased WWD and VTM. Temperatures from 21 to 30 degrees centigrade were associated with higher likelihoods of RLR and DIOL. Based on these findings, intervention measures that target different vehicle types and consider temporal factors, road conditions, and weather conditions were recommended to reduce WWD, RLR, VTM, and DIOL at signalized intersections.


Introduction
Intersection safety has become an international concern. In Australia, almost 33% of major casualties are caused by intersection crashes [1]. In China, approximately 30% of total road fatalities happen at intersections [2]. Such serious intersection-related crashes result from complex interactions between road user behaviors, vehicle factors, geometric road characteristics, and environmental factors [3,4], and are primarily attributed to traffic violations, especially at signalized intersections. For example, red-light-running (RLR) is the primary cause of accidents at signalized intersections. Specifically, RLR resulted in 4,227 severe injury crashes and 789 fatalities, according to the data gathered from January 2012 to October 2012 in China [5]. If the traffic violations at signalized intersections could be decreased or controlled efficiently, the corresponding severe injuries and fatalities would be reduced accordingly. However, without a comprehensive analysis of causal factors, any effort to implement countermeasures to prevent or mitigate traffic violations may be misguided. To effectively mitigate signalized intersection traffic violations, the first critical step is therefore to identify their significant influence factors.
In addition to RLR, other traffic violations commonly take place at signalized intersections, such as wrong-way driving (WWD), violating traffic markings (VTM), and driving in the inaccurate oriented lane (DIOL). WWD is usually defined as a vehicle intentionally or unintentionally traveling against the direction of traffic flow on physically divided facilities such as freeways, expressways, and their ramps [24]. VTM includes illegally changing lanes, driving over lane markings, making a U-turn where it is prohibited, and so on. DIOL refers to the situation in which a vehicle travels in a lane whose designated direction does not match its movement through the intersection [25]. For instance, a vehicle in the protected left-turn lane travels straight through the intersection.
As to WWD, the existing literature focuses on its related crashes, injuries, and interventions on freeways and their access ramps [24,26]. In practice, WWD also often occurs on urban roads and at intersections, particularly at signalized intersections with channelized approaches. Nevertheless, to the authors' knowledge, no studies have explored the contributing factors of WWD at signalized intersections. As for VTM and DIOL, only very few studies have examined their influence factors. Based on self-reported questionnaires, Wang et al. [27] suggested that attitude, subjective norms, and perceived behavioral control affected lane change violations at urban intersections. Using video data collected from traffic enforcement cameras at signalized intersections, Fu et al. [25] classified the behaviors of DIOL and explored their influence factors. They stated that the number of vehicles in the queue, percentage of large vehicles, time period, traffic volume, and lighting conditions influenced this violation.
It is worth noting that more possible influence factors of RLR, particularly road and environment factors, need to be further explored. Furthermore, the effects of driver demography, vehicle characteristics, road, environment, and weather conditions on WWD, VTM, and DIOL at signalized intersections remain unclear. In addition, some previous studies were based on self-reported data [28], observational data [21], and questionnaires [29]. These data are somewhat subjective, which may introduce bias into the analysis results. Besides, several studies explored the influence factors of traffic violations based on crash data [30][31][32]. The problem is that some traffic violations cause crashes, whereas others do not. Hence, the outcomes of crash data-based studies may magnify or reduce the effects of various factors on traffic violations. To address these issues, it is vital to study the contributing factors of specific traffic violations using objective and comprehensive data collected from traffic control and management tools, such as traffic enforcement cameras, since video analysis [33,34] is economical and practicable.
Accordingly, this study, which is an extension of the authors' previous study [25], employed the data obtained from traffic enforcement cameras to examine the influences of road geometric characteristics, traffic enforcement and management measures, vehicle and temporal attributes, and weather conditions on traffic violations at signalized intersections. In our dataset, there were four traffic violations, namely RLR, WWD, VTM, and DIOL. Therefore, the ordinary multinomial logit model (OMLM) was developed to uncover the effects of contributing factors on traffic violations. To account for the unobserved heterogeneity between observations, a random effects multinomial logit model (REMLM) was proposed as well. In addition, the marginal effects of independent variables were used to quantify the extent to which a particular factor affected traffic violations.

Intersection selection
In order to adequately reveal the effects of influence factors on traffic violations at signalized intersections, the following criteria were adopted:
i. Selected intersections should be located in different areas of the city [5].
ii. The layout of each signalized intersection should be different.
iii. The traffic management and enforcement facilities should be set.
iv. There should be channelization at each intersection approach.
Accordingly, four signalized intersections in Hohhot, China, were selected. Their layouts are illustrated in Fig 1. The specific characteristics of these intersections are presented in Table 1 and Fig 2. It is observed that I2 and I4 have the smallest and largest number of lanes at each approach, respectively, while the ratios of left-turn lanes and enforcement cameras at I2 are higher than those at I4. However, the ratios of channelization and guardrail at I4 are higher than those at I2.

Data collection and processing
A total of 13,008 records of traffic violations from May 1st to July 31st, 2018, were collected from traffic enforcement cameras, which saved and recorded the traffic violations in the form of screenshots and corresponding documents at the selected signalized intersections. The records included license plate number, vehicle type, occurrence position, occurrence time, and specific behavior of traffic violation. In the dataset, there were only four types of traffic violations, including WWD, RLR, VTM, and DIOL.
In order to mine more possible influence factors from the collected dataset, the information mentioned above was further processed. The vehicle type was classified into small cars and other types. By ownership, vehicles were divided into local and non-local vehicles. Based on the occurrence time, temporal factors were derived and categorized into time of day, day of week, and month.
Moreover, the weather conditions were collected from the 2345 Weather Forecast website (http://tianqi.2345.com/wea_history/53463.htm) by using the occurrence location and time of traffic violations. The weather status and temperature (here, the average temperature) were obtained. The weather status consisted of sunny, cloudy, partly cloudy, and rainy. The temperature was categorized into three groups: less than 10 degrees centigrade, from 11 to 20 degrees centigrade, and between 21 and 30 degrees centigrade.
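The derivation of temporal and temperature categories described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names, category labels, and the handling of boundary temperatures (for example, exactly 10 degrees, which the text does not assign to a group) are assumptions.

```python
from datetime import datetime

# Hard-coded English names avoid any locale dependence.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def time_of_day(ts: datetime) -> str:
    """Map a timestamp to the four time-of-day groups used in the text."""
    if ts.hour < 6:
        return "early morning"   # 0:00-5:59
    if ts.hour < 12:
        return "morning"         # 6:00-11:59
    if ts.hour < 18:
        return "afternoon"       # 12:00-17:59
    return "evening"             # 18:00-23:59


def temperature_group(avg_temp_c: float) -> str:
    """Bin an average temperature (degrees centigrade) into three groups.

    Assumption: boundary values go to the lower group (10 -> "up to 10").
    """
    if avg_temp_c <= 10:
        return "up to 10"
    if avg_temp_c <= 20:
        return "11 to 20"
    if avg_temp_c <= 30:
        return "21 to 30"
    raise ValueError("temperature outside the groups described in the text")


def derive_factors(ts: datetime, avg_temp_c: float) -> dict:
    """Build the categorical temporal and temperature factors for one record."""
    return {
        "time_of_day": time_of_day(ts),
        "day_of_week": DAYS[ts.weekday()],
        "month": ts.month,
        "temperature": temperature_group(avg_temp_c),
    }


# Hypothetical record: a violation at 17:30 on 15 June 2018, average 24 degrees.
example = derive_factors(datetime(2018, 6, 15, 17, 30), 24.0)
```

Each derived field is categorical, which matches the later statement that all independent variables entered the models as indicator variables.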

Characteristics of traffic violations by vehicle conditions
The number of traffic violations by vehicle type and ownership at each signalized intersection is shown in Table 2. It is observed that small cars and local vehicles accounted for a higher proportion of traffic violations than other vehicle types and non-local vehicles at the four signalized intersections. Besides, both small cars and other types of vehicles at I1 committed more RLR, as did local and non-local vehicles at I1. Small cars, local vehicles, and non-local vehicles at I2, I3, and I4 committed more VTM than other violations. Meanwhile, for local and non-local vehicles, DIOL at I3 was relatively common.

Characteristics of traffic violations by temporal distributions
The number of traffic violations by temporal distributions at the four signalized intersections is presented in Fig 3. As for time of day, traffic violations at the four intersections mainly occurred in the morning (6:00-11:59) and afternoon (12:00-17:59), and rarely happened in the early morning (0:00-5:59). With regard to day of week, traffic violations at I1 were primarily concentrated on Tuesday and the weekend. Traffic violations at I3 frequently occurred on Thursday and Friday, whereas those at I2 and I4 were evenly distributed within a week. As to month, the distribution of traffic violations differed considerably across the four signalized intersections. The number of traffic violations at I1 differed little across the three months. At I3, it was relatively stable in the first two months but greatly reduced in July. The number of traffic violations at I2 decreased from month to month. Conversely, the number of traffic violations at I4 increased from month to month, especially in June and July. This may be related to the fact that I4 has more approaches in each direction.

Characteristics of traffic violations by weather conditions
The number of traffic violations under various weather conditions is shown in Fig 4. In regard to weather status, most violations happened on partly cloudy days, followed by rainy, sunny, and cloudy days. With respect to temperature, most drivers committed violations at temperatures between 21 and 30 degrees centigrade, followed by temperatures from 11 to 20 degrees centigrade and less than 10 degrees centigrade. Moreover, traffic violations primarily occurred at I4 regardless of the weather status and temperature. In contrast, the numbers of traffic violations at I2 and I3 were similar under each condition, while the number at I1 was the smallest.

Methodology
In this study, the OMLM was developed to investigate the influence factors of signalized intersection traffic violations, with DIOL as the reference level. The OMLM is expressed as follows [35,36]:
$$P_n(Y = i) = \frac{\exp\left(\alpha_i + \sum_{k} \beta_{ik} x_{ik}\right)}{\sum_{j \in I} \exp\left(\alpha_j + \sum_{k} \beta_{jk} x_{jk}\right)}$$
where $Y$ denotes the dependent variable with $I$ categories; $P_n(Y = i)$ is the probability of observation $n$ having the discrete outcome $i$, $i \in I$; $\alpha_i$ represents the intercept corresponding to the outcome $i$; $x_{ik}$ is the $k$th independent variable corresponding to the outcome $i$; and $\beta_{ik}$ denotes the $k$th regression parameter corresponding to outcome $i$, which is estimated by using maximum likelihood estimation.
The OMLM can be rewritten as:
$$\log\left[\frac{P_n(Y = i)}{P_n(Y = I)}\right] = \alpha_i + \sum_{k} \beta_{ik} x_{ik}$$
where $P_n(Y = I)$ is the probability of observation $n$ having the baseline category $I$. Although ordinary multinomial logit models have been widely applied in past years, they have some limitations, such as not considering observed and unobserved heterogeneity in parameter effects [37,38]. Since the data were collected at different approaches of four signalized intersections, the random effects multinomial logit model (REMLM) was adopted to account for the unobserved heterogeneity between observations. The REMLM is expressed as [39]:
$$\log\left[\frac{P_n(Y = i)}{P_n(Y = I)}\right] = \alpha_i + \sum_{k} \beta_{ik} x_{ik} + v_{ni}$$
where $v_{ni}$ is the random effect between observations, assumed to be distributed as $N(0, \sigma_{vi}^2)$. The parameters in the REMLM were estimated by using Markov Chain Monte Carlo sampling.
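To make the baseline-category formulation concrete, the following minimal sketch turns linear predictors into outcome probabilities, with and without a random effect. The coefficient values, variable count, and standard deviations are hypothetical and are not the estimates reported in this study; the sketch only evaluates the model and does not perform estimation.

```python
import math
import random

BASELINE = "DIOL"  # reference level, as in the text


def linear_predictor(alpha, betas, x):
    """alpha_i + sum_k beta_ik * x_k for one non-baseline outcome."""
    return alpha + sum(b * v for b, v in zip(betas, x))


def outcome_probabilities(utilities, baseline=BASELINE):
    """Convert log-odds relative to the baseline into probabilities.

    The baseline has a linear predictor of zero, so its term is exp(0) = 1.
    """
    expo = {outcome: math.exp(u) for outcome, u in utilities.items()}
    expo[baseline] = 1.0
    total = sum(expo.values())
    return {outcome: e / total for outcome, e in expo.items()}


# Hypothetical parameters: (intercept, [coefficients]) per non-baseline outcome.
params = {
    "WWD": (-2.0, [0.3, -0.1]),
    "RLR": (0.5, [-0.2, 0.4]),
    "VTM": (1.0, [0.1, 0.2]),
}
x = [1, 0]  # two indicator variables for one observation

# Ordinary multinomial logit.
utilities = {o: linear_predictor(a, b, x) for o, (a, b) in params.items()}
probs = outcome_probabilities(utilities)

# Random effects variant: add v_ni ~ N(0, sigma_i^2) to each linear predictor.
rng = random.Random(0)
sigma = {"WWD": 0.5, "RLR": 0.5, "VTM": 0.5}  # hypothetical standard deviations
utilities_re = {o: u + rng.gauss(0.0, sigma[o]) for o, u in utilities.items()}
probs_re = outcome_probabilities(utilities_re)
```

In both cases the log of each probability ratio against the baseline recovers the corresponding linear predictor, which is the rewritten form of the model.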
Stata 15 was used to run the OMLM and REMLM. In both models, the independent variables included vehicle factors, temporal factors, road conditions, traffic enforcement and management facilities, and weather conditions. All these independent variables were categorical variables. Descriptions of independent variables are presented in Table 3.
Moreover, the McFadden pseudo $R^2$ ($\rho^2$) and AIC (Akaike Information Criterion) were adopted to compare the OMLM and REMLM. The McFadden $\rho^2$ is given by [36]:
$$\rho^2 = 1 - \frac{LL(\beta)}{LL(0)}$$
where $LL(\beta)$ is the log-likelihood at convergence with parameters, and $LL(0)$ is the log-likelihood at convergence without parameters.
The AIC is defined as:
$$AIC = -2LL(\beta) + 2q$$
where $q$ is the number of estimated parameters. A smaller AIC value indicates a better-fitted model [40].
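As a small worked illustration of these two fit measures, the sketch below computes them from log-likelihood values. The log-likelihoods and parameter counts are hypothetical placeholders chosen only to show the arithmetic; they are not the values from Tables 4-6.

```python
def mcfadden_rho2(ll_model: float, ll_null: float) -> float:
    """McFadden pseudo R-squared: 1 - LL(beta) / LL(0)."""
    return 1.0 - ll_model / ll_null


def aic(ll_model: float, n_params: int) -> float:
    """Akaike Information Criterion: -2 * LL(beta) + 2 * q."""
    return -2.0 * ll_model + 2.0 * n_params


# Hypothetical log-likelihoods for two competing models on the same data.
ll_null = -3000.0
model_a = {"ll": -2600.0, "q": 40}   # e.g. a model without random effects
model_b = {"ll": -2450.0, "q": 43}   # e.g. a model with random effects

rho2_a = mcfadden_rho2(model_a["ll"], ll_null)
rho2_b = mcfadden_rho2(model_b["ll"], ll_null)
aic_a = aic(model_a["ll"], model_a["q"])
aic_b = aic(model_b["ll"], model_b["q"])

# The preferred model has the larger rho-squared and the smaller AIC.
preferred = "b" if (aic_b < aic_a and rho2_b > rho2_a) else "a"
```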
In addition, the marginal effects of independent variables were calculated to interpret the influences of estimated coefficients on the probabilities of all categories of the dependent variable. In this study, all independent variables were indicator variables. Hence, the marginal effects were calculated as the difference in the estimated probabilities with the indicator variable varying from zero to one, while all other variables were held at their means [36]. The marginal effects are given by:
$$ME_{x_{ik}}^{P_n(Y = i)} = P_n(Y = i \mid x_{ik} = 1) - P_n(Y = i \mid x_{ik} = 0)$$
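As a rough illustration of this calculation, the sketch below computes the marginal effect of one indicator variable as the difference in predicted probabilities between its values one and zero, with the other variable held at its mean. All coefficients and means are hypothetical. Because the outcome probabilities sum to one, the effects across outcomes sum to zero, and an outcome with a positive coefficient can still have a negative marginal effect.

```python
import math


def probabilities(x, params, baseline="DIOL"):
    """Multinomial logit probabilities with a baseline category."""
    expo = {baseline: 1.0}  # baseline linear predictor is zero
    for outcome, (alpha, betas) in params.items():
        expo[outcome] = math.exp(alpha + sum(b * v for b, v in zip(betas, x)))
    total = sum(expo.values())
    return {outcome: e / total for outcome, e in expo.items()}


def marginal_effect(k, x_mean, params):
    """P(Y=i | x_k=1) - P(Y=i | x_k=0), other variables at their means."""
    x_one = list(x_mean)
    x_one[k] = 1.0
    x_zero = list(x_mean)
    x_zero[k] = 0.0
    p_one = probabilities(x_one, params)
    p_zero = probabilities(x_zero, params)
    return {outcome: p_one[outcome] - p_zero[outcome] for outcome in p_one}


# Hypothetical (intercept, [coefficients]) for the non-baseline outcomes.
params = {
    "WWD": (0.0, [0.1, -0.3]),
    "RLR": (0.0, [2.0, 0.2]),
    "VTM": (0.0, [0.0, 0.5]),
}
x_mean = [0.4, 0.6]  # hypothetical sample means of two indicator variables

effects = marginal_effect(0, x_mean, params)
```

Here the coefficient of the first variable for WWD is positive relative to the baseline, yet the WWD probability falls because RLR gains a much larger share.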

Results of model estimation
The estimation results of the OMLM and REMLM for WWD, RLR, and VTM are presented in Table 4, Table 5, and Table 6, respectively. According to the McFadden pseudo R² and AIC values listed in Tables 4-6, the REMLM outperformed the OMLM (5009.102 vs 5340.518). All the estimated parameters included in both models were statistically significant at the 90% confidence level at least. The two models shared similar significant factors, including vehicle factor, temporal factors, road characteristics, and weather conditions. It is noteworthy that the effect of channelization ratio on traffic violations was not ignored, although it was only significant in the OMLM. The channelization ratio of 100% increased the probabilities of WWD and VTM, while it decreased the probability of RLR.
Table 7 lists the marginal effects of each explanatory variable for the four types of signalized intersection traffic violations in the REMLM. If a vehicle was a local vehicle, the probabilities of it committing RLR and DIOL increased by 2.7% and 3.7%, respectively, whereas the probabilities of it committing WWD and VTM decreased by 0.4% and 5.9%, respectively. With respect to time of day, in the morning (6:00-11:59), the probabilities of WWD and RLR increased by 1.2% and 5.1%, respectively, while the probabilities of VTM and DIOL decreased by 5.7% and 0.5%, respectively. Similarly, in the afternoon (12:00-17:59), the probabilities of WWD and RLR increased by 1.3% and 6.5%, respectively, while the probabilities of VTM and DIOL decreased by 6% and 1.7%, respectively. In the evening (18:00-23:59), the probabilities of WWD and RLR rose by 1.2% and 5.9%, respectively, while the probabilities of VTM and DIOL declined by 5.4% and 1.7%, respectively. In terms of day of week, Tuesday, Wednesday, Thursday, and Saturday were related to higher likelihoods of WWD and RLR, but lower likelihoods of VTM and DIOL. Friday and Sunday were associated with a higher probability of RLR, but lower probabilities of WWD, VTM, and DIOL.
With regard to monthly differences, in June, the probabilities of vehicles committing RLR and DIOL increased by 4.1% and 1.3%, respectively, whereas the probabilities of vehicles committing WWD and VTM decreased by 0.5% and 5%, respectively. Similarly, in July, the likelihoods of vehicles committing RLR and DIOL increased by 5.1% and 3.9%, respectively, whereas the likelihoods of vehicles committing WWD and VTM decreased by 0.3% and 8.7%, respectively.
As for specific road conditions, the left-turn lane ratio of 26% increased the probabilities of WWD, RLR, and DIOL by 1.3%, 29.4%, and 12.7%, respectively, whereas it decreased the probability of VTM by 43.4%. The left-turn lane ratio of 28% increased the likelihoods of RLR and DIOL by 21.5% and 5.5%, respectively, while it decreased the likelihoods of WWD and VTM by 1.2% and 25.8%, respectively. The left-turn lane ratio of 40% increased the probabilities of RLR and DIOL by 16.9% and 13.7%, respectively, whereas it decreased the probabilities of WWD and VTM by 0.2% and 30.4%, respectively.
In terms of weather status, cloudy and rainy days increased the probabilities of WWD, VTM, and DIOL, while they reduced the probability of RLR. Partly cloudy days, however, increased the likelihoods of WWD and VTM, while they reduced the probabilities of RLR and DIOL.
As to temperature conditions, at temperatures of less than 10 degrees centigrade, the likelihood of vehicles committing WWD increased by 8.8%, while the likelihoods of vehicles committing RLR, VTM, and DIOL decreased by 2.7%, 1.4%, and 4.7%, respectively. In contrast, at temperatures from 21 to 30 degrees centigrade, the likelihoods of WWD and VTM decreased by 0.3% and 0.9%, respectively, whereas the probabilities of RLR and DIOL increased by 0.8% and 0.4%, respectively.
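Since the probabilities of the four violation types sum to one, the marginal effects of any single indicator should sum to approximately zero across WWD, RLR, VTM, and DIOL. The short sketch below applies this consistency check to the percentage-point values quoted in the text above. The dictionary keys and the rounding tolerance are choices made here for illustration.

```python
# Marginal effects (percentage points) as quoted in the text, per violation type.
reported = {
    "local vehicle":       {"WWD": -0.4, "RLR": 2.7,  "VTM": -5.9,  "DIOL": 3.7},
    "morning":             {"WWD": 1.2,  "RLR": 5.1,  "VTM": -5.7,  "DIOL": -0.5},
    "afternoon":           {"WWD": 1.3,  "RLR": 6.5,  "VTM": -6.0,  "DIOL": -1.7},
    "evening":             {"WWD": 1.2,  "RLR": 5.9,  "VTM": -5.4,  "DIOL": -1.7},
    "June":                {"WWD": -0.5, "RLR": 4.1,  "VTM": -5.0,  "DIOL": 1.3},
    "July":                {"WWD": -0.3, "RLR": 5.1,  "VTM": -8.7,  "DIOL": 3.9},
    "left-turn ratio 26%": {"WWD": 1.3,  "RLR": 29.4, "VTM": -43.4, "DIOL": 12.7},
    "left-turn ratio 28%": {"WWD": -1.2, "RLR": 21.5, "VTM": -25.8, "DIOL": 5.5},
    "left-turn ratio 40%": {"WWD": -0.2, "RLR": 16.9, "VTM": -30.4, "DIOL": 13.7},
    "temperature < 10":    {"WWD": 8.8,  "RLR": -2.7, "VTM": -1.4,  "DIOL": -4.7},
    "temperature 21-30":   {"WWD": -0.3, "RLR": 0.8,  "VTM": -0.9,  "DIOL": 0.4},
}

# Four values rounded to one decimal can be off by at most 0.05 each,
# so the sum can deviate from zero by up to 0.2 from rounding alone.
TOLERANCE = 0.2 + 1e-9

sums = {factor: sum(effects.values()) for factor, effects in reported.items()}
inconsistent = [f for f, s in sums.items() if abs(s) > TOLERANCE]
```

An empty `inconsistent` list means every quoted set of effects is compatible with the sum-to-zero property up to rounding.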

Discussions
The outcomes of this study showed that local vehicle, time of day, day of week, month, left-turn lane ratio, channelization ratio, weather status, and temperature had varying influences on the four types of traffic violations at signalized intersections.
Unlike previous studies [41,42], our data showed that vehicle type did not have a significant effect on traffic violations at signalized intersections. However, vehicle ownership was found to impact signalized intersection traffic violations significantly. Compared with local vehicles, non-local vehicles were more likely to commit WWD and VTM. This may be ascribed to non-local drivers' unfamiliarity with road conditions, or to a lack of traffic control devices such as signs and pavement markings [43].
It was found that time of day affected signalized intersection traffic violations to a certain extent. Drivers were more likely to commit WWD and RLR in the morning (6:00-11:59), afternoon (12:00-17:59), and evening (18:00-23:59). During nighttime hours, poor lighting conditions and a lack of signage and pavement markings are probably the causes of WWD [44].
Furthermore, this finding supports the uneven distribution of traffic violations over time of day reported in the literature [41]. The differences in RLR distribution over time of day in this study are consistent with the findings of one previous study [12]. In the morning (6:00-11:59) and afternoon (12:00-17:59), this is probably because sunlight reduces the visibility of the color of signal lights, which leads to RLR [41]. In the evening (18:00-23:59), poor visibility may also cause RLR. Additionally, RLR is more likely to happen in the evening (18:00-23:59) than in the morning (6:00-11:59) [15].

PLOS ONE
The findings of this study indicated that there were obvious differences in the distribution of the four types of traffic violations over day of week. Consistent with previous studies [21,41,45], RLR was more likely to occur on the weekend. Moreover, RLR was more likely to happen from Tuesday to Friday than on Monday. The frequent occurrence and detection of RLR may be on account of mild penalties and weak enforcement. Similarly, WWD frequently took place on Tuesday, Wednesday, Thursday, and Saturday.
Interestingly, RLR and DIOL were more likely to occur in June and July. This was probably related to high temperatures in these two months. Furthermore, our results also revealed that the likelihoods of vehicles committing RLR and DIOL at temperatures from 21 to 30 degrees centigrade were higher than those at temperatures between 11 and 20 degrees centigrade. Drivers may become irritable at high temperatures and impatient while waiting for green lights at signalized intersections. Thus, they may either run red lights or drive in an inaccurate oriented lane to pass through the signalized intersection. A number of studies have verified the significant effects of negative emotions on traffic violations [22,46,47].
Similarly, weather status was found to significantly impact the likelihoods of WWD, RLR, VTM, and DIOL. On cloudy, partly cloudy, and rainy days, the likelihoods of vehicles committing WWD and VTM increased. These findings are in accord with the existing literature [22,41], which shows that bad weather significantly increases traffic violations. It was also found that the left-turn lane ratio was associated with traffic violations at signalized intersections. Exclusive left-turn lanes are commonly used as a traffic engineering measure to reduce conflicts with through traffic [48]. In our study, however, the left-turn lane ratio increased RLR and DIOL, whereas it decreased WWD and VTM. A possible explanation is that, as the left-turn lane ratio increases, drivers have more lane-specific conditions and opportunities to run red lights and drive in the inaccurate oriented lanes.
In addition, the estimation results of the OMLM indicated that the channelization ratio influenced traffic violations at signalized intersections. In other words, increasing the number of lanes at the entry of signalized intersection approaches could increase WWD and VTM, while decreasing RLR. When the number of lanes in one traveling direction increases on the approach to the intersection, drivers may cross the solid line when they find themselves traveling in a wrongly oriented lane, or may drive over the solid line. Likewise, in such a situation, drivers may commit WWD.
It should be noted that, for some factors, the relationships between traffic violations and influence factors reflected by the estimated coefficients in the REMLM are opposite to those reflected by the marginal effects. The reason is that the REMLM needs to set one type of traffic violation as the baseline category, so that the coefficients reflect the changes in each outcome probability relative to the baseline category probability. In other words, the estimated coefficients may magnify, reduce, or even reverse the apparent effect of a single factor. However, such issues can be avoided by calculating the marginal effects of influence factors [49]. Accordingly, the marginal effects of influence factors in the REMLM were employed to interpret the relationships between traffic violations and contributing factors.
The findings of the current study imply that traffic violation interventions at signalized intersections should target different vehicle types and consider temporal factors, road conditions, and weather conditions. Non-local vehicles should be provided with more guidance at intersection approaches, such as traffic marking systems and sound intelligent traffic guidance sign systems, together with education and awareness programs delivered through mass media [43]. Besides, visual intervention also has a significant influence on drivers' behavior [50]. Since WWD and RLR frequently happen in the daytime and evening (6:00-23:59), and on most days within one week, violators should be confronted with tougher penalties and enforcement.
Additional measures targeting RLR and DIOL during June and July are also needed, such as increasing the number of traffic police officers at signalized intersections. As the left-turn lane ratio significantly increases RLR and DIOL, an appropriate number of left-turn lanes should be considered during intersection layout design. In addition, the channelization ratio (i.e., increasing the number of lanes at intersection approaches) was found to increase WWD and VTM. Hence, if the number of lanes at an intersection approach is increased, WWD dynamic warning signs [51], improved pavement markings [26], and guardrail installation at the corresponding intersection exit are recommended to prevent WWD; lane direction arrows ahead of the intersection, smart traffic markings [25], and colored pavement [52] are suggested to deter VTM.
Moreover, since adverse weather was found to increase WWD and VTM, additional measures to improve legibility on cloudy, partly cloudy, and rainy days are required as well. Hot weather worsens drivers' mood and draws attention away from driving-related information [22], and then induces unsafe driving behaviors, such as RLR and DIOL in this study. Accordingly, educational programs regarding the consequences of negative emotional responses to hot weather should be conducted to target RLR and DIOL violators. Furthermore, psychological interventions should teach these drivers how to effectively regulate emotions in hot weather and how to comply with traffic lights and lane guidance regulations.

Conclusions
In this paper, a total of 13,008 records of traffic violations in Hohhot from May 1st to July 31st, 2018, were collected from traffic enforcement cameras. In our dataset, there were four traffic violations, namely RLR, WWD, VTM, and DIOL. After a preliminary analysis of the characteristics of traffic violations at signalized intersections by road geometric characteristics, traffic enforcement and management measures, vehicle and temporal attributes, and weather conditions, the ordinary multinomial logit model (OMLM) was developed to uncover the effects of contributing factors on traffic violations. To account for the unobserved heterogeneity between observations, a random effects multinomial logit model was proposed as well, which outperformed the former. In addition, the marginal effects of independent variables were used to quantify the extent to which a particular factor affected traffic violations. The major conclusions obtained from this study are summarized as follows:
i. Non-local vehicles were more likely to commit WWD and VTM than local vehicles.
ii. WWD and RLR frequently occurred in the daytime and evening (6:00-23:59), and on most days within a week.
iii. RLR and DIOL mainly happened in June and July.
iv. The left-turn lane ratio significantly increased RLR and DIOL.
v. The cloudy, partly cloudy, and rainy days obviously increased WWD and VTM.
vi. The temperature from 21 to 30 degrees centigrade was apparently associated with the higher likelihoods of RLR and DIOL.
vii. Some intervention measures, targeting different vehicle types and considering temporal factors, road, and weather conditions, were recommended to reduce WWD, RLR, VTM, and DIOL at signalized intersections.
However, there are some limitations in the present study. Although the dataset used in the current study was gathered from traffic enforcement cameras at signalized intersections, it contains very limited information. First, some other possible traffic, road, and environmental factors of these four types of traffic violations cannot be explored. This can be addressed with richer data. Second, our dataset does not include some other intersection traffic violations, such as mobile phone usage, failure to wear a seatbelt, and pedestrian violations. Whether the factors identified by this study also influence these violations cannot be determined. Future research will detect these violations and appraise their corresponding influence factors based on data collected by novel technologies and methods, such as video sensors [53]. Third, because the data were collected from only four signalized intersections in Hohhot, it is difficult to uncover the relationship between traffic violations and spatial factors. Fourth, since the data only cover May 1st to July 31st, the influences of weather conditions on traffic violations over the whole year cannot be investigated. Last but not least, in the current research, each traffic violation type was not subdivided for further analysis. The relationships between left-turn lane ratio and left-turn RLR and through RLR need to be investigated separately. VTM needs further investigation into lane change violations and driving over lane lines. Moreover, DIOL can be classified into nine categories [25]. Consequently, the influence factors of type-specific traffic violations at signalized intersections merit further study.