Lane-keeping ability evaluation for driving skill tests: A multi-indicator fusion approach

Mengmeng Duan; Hao Wu; Shulin Zhang; Huiqing Jin; Susu Liu

doi:10.1371/journal.pone.0329257

Abstract

Traditional driver skill tests primarily assess whether candidates meet specific standards in prescribed tasks, which often fails to fully reflect their overall driving performance in real-world scenarios. This can lead to suboptimal driving outcomes. Lane-keeping ability is a key indicator for evaluating a driver’s overall competence, as it reflects their proficiency in vehicle control, road environment perception, and emergency handling. However, due to the complex and varied factors influencing lane-keeping ability, there is currently a lack of effective methods for assessing this skill during driver skill tests. To address this gap, this paper proposes a multi-indicator fusion (MIF) method for evaluating lane-keeping ability in driver skill tests. First, to accommodate real-world lane-keeping scenarios in driver skill tests, multidimensional indicators representing lane-keeping ability are extracted from real low-speed naturalistic driving data, considering both lateral and longitudinal safety and stability. Next, by analyzing the distribution characteristics of these indicators using the K-means clustering method, groups of indicators with similar characteristics are identified. The Youden index, boxplots, and statistical measures are then employed to determine the threshold values for each indicator, enhancing the accuracy of the evaluation. Finally, a comprehensive evaluation model for lane-keeping ability is constructed using the Analytic Hierarchy Process (AHP) based on a combination of subjective and objective weightings. The rationality and feasibility of the proposed MIF-based lane-keeping assessment method for driver skill tests were validated using naturalistic driving data. This study provides valuable reference points for assessing lane-keeping ability in the context of future autonomous driving environments.

Citation: Duan M, Wu H, Zhang S, Jin H, Liu S (2025) Lane-keeping ability evaluation for driving skill tests: A multi-indicator fusion approach. PLoS One 20(8): e0329257. https://doi.org/10.1371/journal.pone.0329257

Editor: Gen Li, Nanjing Forestry University, CHINA

Received: February 7, 2025; Accepted: July 15, 2025; Published: August 6, 2025

Copyright: © 2025 Duan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are available at the following public repository: https://github.com/HaoWu997/AD4CHE_dataset_V1.0.git.

Funding: This work was supported by Anhui Provincial University Outstanding Young Research Project (2022AH030161), Anhui Postdoctoral Scientific Research Program (No.2025B1094), and Anhui Provincial Universities-Key Laboratory of Traffic Information and Safety Open Project (JTX202302).

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Driver’s skill tests have long been regarded as a fundamental component in ensuring road traffic safety [1,2]. Their effectiveness directly impacts overall traffic safety levels and plays a crucial role in accident prevention. It has been demonstrated that severe traffic accidents have occurred as a result of the driving behaviors of untrained drivers [3–5]. Traditionally, driver skill tests have primarily focused on evaluating a driver’s performance in specific, predefined tasks, assessing whether they meet the established standards. However, this testing approach mainly targets individual skills and often fails to comprehensively reflect a driver’s overall ability and adaptability in real-world driving conditions. As a result, even though a driver may pass the road test, their performance and safety awareness in the complex and dynamic real-world driving environment may still be insufficient, leading to potential traffic safety risks. Therefore, we believe that finding a way to comprehensively assess a driver’s overall driving ability during the road test has become a critical issue that needs urgent attention.

Typically, a driver’s driving process involves stages such as car-following, lane-changing, and free driving. Therefore, most scholars currently focus on driving behavior and driving ability during different stages of the driving process, exploring the factors that influence driver behavior and driving ability. For example, Ge et al. found that a driver’s stress levels and personality significantly affect their driving behavior [6]. Zhang et al. revealed the relationship between different driving styles and vehicle-following characteristics [7], while another study by Zhang et al. used LSTM to explore the interaction between car-following and lane-changing behaviors, highlighting their impact on driving performance [8]. Chen et al. investigated the response of the new follower to a lane-changing maneuver in order to accurately describe vehicle movements in large-scale traffic flow [9]. In addition, safety issues have always been a key focus for many researchers. Shin et al. studied the heterogeneity between vehicle types and driving behaviors, analyzing safe driving behaviors during lane-changing maneuvers [10]. Tan et al. constructed a risk field-based driving behavior model to quantify the safety driving risks during lane-changing and car-following processes [11].

With the development of advanced traffic technologies, recent years have seen a growing focus among scholars on research in the field of intelligent transportation systems (ITS). Several studies have highlighted the impact of advanced technologies on driving behavior and abilities. For example, Wu et al. developed a trajectory control method for CAVs, aiming to enhance the vehicle’s control capabilities in terms of speed and acceleration [12]. Wang et al. applied the Acceleration-Based Collision Criticality Metric for real-time safety capability assessment in autonomous vehicles [13]. Zhang et al. studied safety performance indicators, focusing on speed and acceleration, for Connected and Automated Vehicles (CAVs) at freeway crash hotspots [14]. Chai et al. explored the significance of driver assistance in mitigating distracted driving [15]. Papadoulis et al. examined the impact of autonomous vehicles on highway traffic safety [16]. Additionally, some researchers have investigated methods to enhance driving abilities through advanced technologies. For instance, Robert proposed a data-driven approach to predict driver behavior intentions. Zhao et al. considered the impact of tunnel brightness and noise on drivers and proposed a lane-change model for autonomous vehicles in tunnels based on V2X technology [17]. Schöner et al. introduced a method for safety scoring based on driving style influences, including vehicle distances, time headways, and time to collision, providing potential benefits for the safe operation of autonomous vehicles [18].

Lane-keeping is considered a key factor influencing road traffic safety [19]. Research by Utriainen et al. has also demonstrated the potential safety improvements that lane keeping can bring [20]. As a result, more studies aim to enhance driving safety by improving lane-keeping ability. For example, Chen et al. proposed a lane-keeping control method for autonomous vehicles based on Human-Simulated Intelligent Control (HSIC) to improve the robustness of lane control [21]. Although a few studies focus on the lateral lane-keeping levels of vehicles [22], safety factors in the longitudinal dimension are rarely evaluated, as most studies implicitly treat them as kinematic constraints imposed by the driver to control the distance from the vehicle ahead and avoid collisions. Some studies also evaluate drivers’ driving skills by testing their performance in real-world scenarios. Song et al. assessed and tested key lane-keeping scenarios to validate drivers’ skills [23]. In addition, Zhang et al. proposed an evaluation method in the whole parameter space of a logical scenario [24].

Overall, although research on driving behavior and driving ability has made some progress, most studies have primarily focused on the normal driving behavior of licensed drivers, who are often classified as skilled drivers. However, due to ethical and regulatory issues, data on the driving behavior of unlicensed drivers is scarce, and there is less research on drivers without a license. Currently, driver skill tests are mainly limited to assessing whether a driver can meet basic driving standards in specific tasks. The test content is relatively simple and cannot effectively reflect a driver’s comprehensive driving skills. A driver’s lane-keeping ability is related to their reaction speed, judgment, and vehicle dynamic control. Excellent lane-keeping ability can reflect a driver’s overall driving ability to some extent and has a potential impact on ensuring road traffic safety [25]. However, numerous factors affect a driver’s lane-keeping ability, so it must be considered from multiple dimensions to avoid limitations and bias in evaluating driving performance. Currently, there is a lack of effective methods to quantitatively assess lane-keeping performance from multiple perspectives, preventing a comprehensive evaluation of a driver’s lane-keeping abilities. To fill this gap, this work considers both the lateral and longitudinal dimensions of lane-keeping and proposes a method based on multi-indicator fusion (MIF) to quantify a driver’s lane-keeping performance and thereby assess their lane-keeping ability. The method provides a reference for driver skill test evaluation and addresses the limited coverage of drivers’ comprehensive ability in existing driving skill test programs.

The main contributions of this study are as follows:

(1) We extracted a dataset that aligns with the lane-keeping characteristics in driving skill tests and applied the K-means clustering algorithm and the Spearman correlation coefficient for data analysis. From both the lateral and longitudinal dimensions, we identified 10 key metrics to evaluate lane-keeping ability in driving skill tests.
(2) We determined the optimal thresholds for each indicator by integrating the Youden index, boxplot techniques, and statistical methods.
(3) We employed the AHP to determine the subjective weights and the entropy weight method to establish the objective weights. The coefficient of variation method was then utilized to derive the combined subjective and objective weights, and a comprehensive assessment model for evaluating drivers’ lane-keeping ability was developed.

The rest of this paper is organized as follows: In Section 2, the extraction and processing of the Aerial Dataset for China Congested Highway and Expressway (AD4CHE) are introduced. Section 3 presents a method for determining the evaluation indicators of lane-keeping ability during driver skill tests. In Section 4, we first determined the thresholds and weights of each indicator and then introduced the method for constructing a specific road test lane-keeping ability evaluation model. In Section 5, we discuss the practical value of this work in traditional driver skill tests and future autonomous driving tests. Finally, in Section 6, we draw our conclusions and discuss future research directions.

2. Dataset extraction and processing

Because low-speed driving is a key feature of driver skill tests and is highly correlated with driving characteristics in congested areas, we use the dataset from the AD4CHE project [26]. Selecting segments from this dataset makes it more likely that the extracted samples meet the research requirements, which avoids the loss of evaluation model accuracy caused by sparse data. This dataset was collected via drones and contains traffic data from congested highways and expressways in China. It includes 5.12 hours of aerial footage covering a total distance of 6,540.7 kilometers across four different cities in China, with information on 53,761 vehicles.

The structured data within the dataset includes various parameters such as vehicle position, speed, acceleration, and lane information. Additionally, the dataset provides newly added parameters like lane angle, direction, yaw rate, and vehicle offset, which are valuable for studying subtle behavioral changes in human drivers. Fig 1 shows the actual road environment from one of the aerial data segments. This dataset can be applied to the extraction and analysis of typical driving scenarios, such as lane-keeping, cut-ins, and lane changes. For this work on assessing lane-keeping ability during drive skill tests, we selected the lane-keeping data and processed it to meet the research requirements for evaluating lane-keeping ability.

Fig 1. An example of an AD4CHE segment.

https://doi.org/10.1371/journal.pone.0329257.g001

2.1 Data extraction

To ensure that the data meets the requirements for evaluating lane-keeping ability during driver skill tests, we first performed data cleaning to correct any errors in the source dataset. This process included identifying and rectifying anomalies, such as abnormal speed values, acceleration values, and lane offset values. Due to the randomness in the data collected by AD4CHE aerial photography, there may be a few observed sample values that differ significantly from the majority of the samples. Therefore, it is necessary to remove outliers. The study primarily uses the quartile method to eliminate outliers. Specifically, the 25th percentile $Q_1$, 50th percentile $Q_2$, and 75th percentile $Q_3$ of the data are identified, and the interquartile range (IQR) of the data sequence is calculated as in equation (1):

$$\mathrm{IQR} = Q_3 - Q_1 \tag{1}$$

The study defines data outside the range $[Q_1 - 1.5\,\mathrm{IQR},\ Q_3 + 1.5\,\mathrm{IQR}]$ as outliers and removes them.
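The quartile-based cleaning step can be illustrated with a minimal sketch. It assumes the conventional fences Q1 − 1.5·IQR and Q3 + 1.5·IQR (the same definition used for the boxplot in Section 4.1.2); the function names and the sample speed values are hypothetical and are not taken from AD4CHE.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    # method="inclusive" interpolates linearly between order statistics.
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the samples that lie inside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


# Hypothetical speed samples in km/h; 95.0 plays the role of an anomaly.
speeds_kmh = [12.0, 14.5, 15.0, 15.5, 16.0, 17.0, 18.5, 95.0]
cleaned = remove_outliers(speeds_kmh)
print(cleaned)
```

The multiplier `k` is exposed as a parameter because the appropriate fence width may differ between indicators.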

Next, to align with the characteristics of the road test scenarios in our study, we focused on extracting low-speed lane-keeping scenarios from the dataset. The specific extraction criteria are as follows:

(a) Lane-Keeping Scenario: Only segments where the vehicle is driving straight within the lane, without lane changes or line crossings, were selected.
(b) Low Speed: Vehicle speed was restricted to less than 30 km/h to facilitate the analysis of low-speed traffic flow characteristics.
(c) Time Duration: Segments with a duration of more than 10 seconds were selected to ensure that the segment represents a stable lane-keeping scenario, thereby enhancing the accuracy of subsequent analyses.
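The three criteria above can be sketched as a simple filter. This is only an illustration: the `Segment` container, its field names, and the 10 Hz frame rate in the example are hypothetical, and a constant lane identifier is used as a simplified stand-in for "no lane changes or line crossings".

```python
from dataclasses import dataclass


@dataclass
class Segment:
    vehicle_id: int
    speeds_kmh: list      # per-frame speed in km/h
    lane_ids: list        # per-frame lane identifier
    frame_rate_hz: float  # frames per second (hypothetical value below)


def is_low_speed_lane_keeping(seg, max_speed_kmh=30.0, min_duration_s=10.0):
    """Apply criteria (a)-(c): same lane, speed < 30 km/h, duration > 10 s."""
    duration_s = len(seg.speeds_kmh) / seg.frame_rate_hz
    stays_in_lane = len(set(seg.lane_ids)) == 1                    # (a), simplified
    low_speed = all(v < max_speed_kmh for v in seg.speeds_kmh)     # (b)
    long_enough = duration_s > min_duration_s                      # (c)
    return stays_in_lane and low_speed and long_enough


segments = [
    Segment(1, [20.0] * 120, [3] * 120, 10.0),            # 12 s, one lane, slow
    Segment(2, [20.0] * 120, [3] * 60 + [4] * 60, 10.0),  # changes lane
    Segment(3, [20.0] * 80, [3] * 80, 10.0),              # only 8 s
    Segment(4, [35.0] * 120, [3] * 120, 10.0),            # too fast
]
kept = [s.vehicle_id for s in segments if is_low_speed_lane_keeping(s)]
print(kept)
```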

The specific extraction process is shown in Fig 2.

Fig 2. Data extraction flow chart.

https://doi.org/10.1371/journal.pone.0329257.g002

After the selection process, a dataset containing 2926 low-speed lane-keeping segments was obtained. Fig 3 presents an example illustrating the time-series variation of four parameters within one of these lane-keeping segments.

Fig 3. Changes in certain lane-keeping parameters.

https://doi.org/10.1371/journal.pone.0329257.g003

2.2 Data processing

To evaluate drivers’ lane-keeping ability during drive skill tests, we identified 15 indicators that can be used for evaluation by reviewing and summarizing key aspects of driver performance in both lateral and longitudinal operations, as shown in Table 1.

Table 1. Preliminarily determined lane-keeping ability evaluation indicators.

https://doi.org/10.1371/journal.pone.0329257.t001

The aforementioned indicators include TTC, THW, DHW, etc., which describe the relationship between the ego vehicle and the lead vehicle during lane-keeping, as well as indicators such as speed, acceleration, and lateral offset from the lane centerline that describe the state of the ego vehicle.

Some indicators for lane-keeping capability evaluation can be directly provided, such as THW and DHW. However, some indicators require further explanation, such as risk degree, risk rate, and TLC.

The risk rate and risk degree are related to the vehicle’s speed and acceleration. They can be used to describe the risk of collision with the lead vehicle during lane-keeping, which is crucial for maintaining lane safety.

First, the following distance threshold at time is calculated based on the speed and acceleration of both the ego vehicle and the lead vehicle at time . It can be computed using equation (2).

(2)

where and represent the actual speed and acceleration values of the ego vehicle at time , respectively; represents the actual speed of the ego vehicle; represents the reaction time of the ego vehicle.

Next, compare the actual following distance at time with the minimum following distance , define the function , and , satisfies the function (3).

(3)

The risk rate is represented by the proportion of the time periods with risk relative to the total driving duration, as shown in equation (4).

(4)

The risk degree is characterized by the sum of the ratios of the difference between the real-time safe distance and the actual safe distance to the real-time safe distance during periods of risk occurrence. This is calculated using equation (5).

(5)

In the equation, represents the data sampling interval; represents the actual following distance at time .
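As a rough illustration of the two definitions above, the following sketch computes the risk rate and risk degree from two equally sampled series: the actual following distance and the following distance threshold. It follows the prose description only. The function and argument names are hypothetical, uniform sampling is assumed (so the sampling interval cancels in the risk rate), and whether the sampling interval also scales the risk degree sum is not assumed here.

```python
def risk_indicators(actual_gap_m, safe_gap_m):
    """Return (risk_rate, risk_degree) for equally sampled distance series.

    A sample is treated as risky when the actual following distance is
    smaller than the following distance threshold at that time step.
    """
    if not actual_gap_m or len(actual_gap_m) != len(safe_gap_m):
        raise ValueError("series must be non-empty and of equal length")

    risky = [a < s for a, s in zip(actual_gap_m, safe_gap_m)]

    # Risk rate: share of the driving duration spent in a risky state.
    risk_rate = sum(risky) / len(risky)

    # Risk degree: sum over risky samples of (threshold - actual) / threshold.
    risk_degree = sum(
        (s - a) / s
        for a, s, r in zip(actual_gap_m, safe_gap_m, risky)
        if r
    )
    return risk_rate, risk_degree


# Hypothetical distances in metres.
actual = [10.0, 8.0, 6.0, 12.0]
threshold = [9.0, 9.0, 8.0, 10.0]
rate, degree = risk_indicators(actual, threshold)
print(rate, degree)
```

A segment that never falls below the threshold yields a risk rate and risk degree of zero, which is consistent with the concentration of both indicators at 0 reported for the dataset.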

TLC represents the time remaining before the vehicle crosses the lane boundary and can be used to describe the lateral safety margin during lane-keeping. It is one of the key indicators of lane-keeping performance, as shown in Fig 4. Equation (6) illustrates the relationship between the time to the left and right boundaries and the vehicle’s state.

Fig 4. Lane-Keeping Parameter Diagram.

https://doi.org/10.1371/journal.pone.0329257.g004

(6)

In the equation, represents the vehicle’s lateral movement speed; represents the vehicle’s lateral acceleration; represents the distance between the vehicle’s left boundary and the left lane line; represents the distance between the vehicle’s right boundary and the right lane line.
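For illustration, one common second-order way to compute a time to line crossing from these quantities is to solve d = v·t + ½·a·t² for the smallest positive t. The sketch below implements that formulation for a single boundary; it is an assumption and may differ from equation (6). The function name, the sign convention (speed and acceleration positive toward the boundary considered), and the example values are hypothetical.

```python
import math


def time_to_line_crossing(distance_m, lateral_speed, lateral_accel=0.0, eps=1e-9):
    """Smallest t > 0 with lateral_speed*t + 0.5*lateral_accel*t**2 == distance_m.

    Returns math.inf when the vehicle never reaches the boundary under
    constant lateral acceleration.
    """
    if distance_m <= 0.0:
        return 0.0  # already on or beyond the lane line
    if abs(lateral_accel) < eps:
        # Constant lateral speed: only a motion toward the line can cross it.
        return distance_m / lateral_speed if lateral_speed > eps else math.inf
    disc = lateral_speed ** 2 + 2.0 * lateral_accel * distance_m
    if disc < 0.0:
        return math.inf  # vehicle drifts back before reaching the line
    root = math.sqrt(disc)
    candidates = [
        (-lateral_speed + root) / lateral_accel,
        (-lateral_speed - root) / lateral_accel,
    ]
    positive = [t for t in candidates if t > eps]
    return min(positive) if positive else math.inf


# Hypothetical values: 0.375 m to the line, drifting at 1 m/s while
# decelerating laterally at 1 m/s^2.
tlc = time_to_line_crossing(0.375, 1.0, -1.0)
print(tlc)
```

Under this formulation, a per-segment minimum TLC could be taken as the smaller of the left-boundary and right-boundary values over all frames.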

To more intuitively represent the distribution of the parameters, the study further presents the data using histograms, as shown in Fig 5.

Fig 5. Distribution charts of lane-keeping segments.

https://doi.org/10.1371/journal.pone.0329257.g005

The distribution histograms of lane-keeping evaluation indicators shown in Fig 5 reveal that most indicators exhibit distribution shapes that deviate from a normal distribution, with noticeable skewness and kurtosis. For example, the average THW displays a right-skewed distribution with a kurtosis greater than 3, indicating that a small number of extreme values greater than the mean significantly affect the SD. Similarly, the maximum LCO shows a left-skewed distribution, with 20% of drivers managing to keep the deviation within 0.26 m, which is considered a relatively safe range. Additionally, the histograms of risk rate and risk degree exhibit their highest bars at 0, followed by a rapid decline, mainly because the majority of scenarios in the dataset are safe. These results indicate that the distribution characteristics of the extracted data effectively meet the needs of the research scenario.

To determine the threshold values for lane-keeping evaluation indicators in driving tests, the work applies the K-means clustering method for classification and clustering analysis of various lane-keeping indicators. K-means clustering is an unsupervised learning algorithm with the core idea of assigning data points to K predetermined clusters (where K is the number of clusters specified in advance) and minimizing the squared distance between data points and their respective cluster centers through an iterative optimization process. The ultimate goal of the algorithm is to ensure that each data point is assigned to the cluster center closest to it, thereby achieving optimal classification of lane-keeping data. The specific implementation steps are as follows:

(1) Initialize Centroids: Randomly select K data points as the initial cluster centers.
(2) Assign Data Points: Assign each data point to the cluster corresponding to the nearest centroid.
(3) Update Centroids: For each cluster, calculate the mean of all data points within the cluster and set it as the new centroid.
(4) Iteratively repeat steps (2) and (3) until the centroids no longer change significantly or a predefined number of iterations is reached.
(5) Convergence: After the algorithm converges, each data point is assigned to a cluster, resulting in K clusters.
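The steps above can be sketched for a single indicator (one-dimensional data) as follows. This is a minimal illustration, not the implementation used in the study: the function names, the fixed random seed, and the sample values are hypothetical. The function also returns the sum of squared errors (SSE), which is the quantity plotted in an elbow analysis.

```python
import random


def _nearest(value, centroids):
    """Index of the centroid closest to value (squared distance)."""
    return min(range(len(centroids)), key=lambda j: (value - centroids[j]) ** 2)


def kmeans_1d(values, k, max_iter=100, tol=1e-9, seed=0):
    """Lloyd's algorithm on scalar data. Returns (centroids, labels, sse)."""
    rng = random.Random(seed)
    centroids = rng.sample(list(values), k)                # step (1)
    for _ in range(max_iter):                              # step (4)
        labels = [_nearest(v, centroids) for v in values]  # step (2)
        new_centroids = []
        for j in range(k):                                 # step (3)
            members = [v for v, lab in zip(values, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            new_centroids.append(sum(members) / len(members) if members else centroids[j])
        shift = max(abs(a - b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    labels = [_nearest(v, centroids) for v in values]      # step (5)
    sse = sum((v - centroids[lab]) ** 2 for v, lab in zip(values, labels))
    return centroids, labels, sse


# Hypothetical indicator values with three visible groups.
lco = [0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
sse_by_k = {k: kmeans_1d(lco, k)[2] for k in range(1, 5)}
for k, sse in sse_by_k.items():
    print(k, round(sse, 3))
```

Because the initial centroids are drawn at random, the algorithm can settle in a local optimum; in practice several restarts with different seeds are commonly compared.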

In the K-means clustering method, selecting an appropriate value for K is a critical issue, as it significantly impacts the determination of threshold values for lane-keeping evaluation indicators in driving tests. In this study, different K values were tested, and the optimal K value was selected through a combination of the silhouette coefficient, the elbow method, and the minimum sample criterion, allowing for the accurate determination of the K value for clustering data.

First, this study determines the optimal number of clusters by computing the silhouette coefficient S of the lane-keeping indicators for different numbers of clusters. The silhouette coefficient S reflects the quality of the clustering, with values ranging from −1 to 1. The closer the value is to 1, the better the clustering performance. The silhouette coefficients S for the indicators across different clusters are shown in Fig 6.

Fig 6. Silhouette coefficients for different clusters of indicators.

https://doi.org/10.1371/journal.pone.0329257.g006

As shown in Fig 6, the silhouette coefficient S for the lane-keeping evaluation indicators mostly reaches its maximum value when the number of clusters is set to 2. Additionally, the silhouette coefficients are relatively high across all cluster numbers (ranging from 0.6 to 0.9), indicating good clustering performance.
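To make the silhouette coefficient concrete, the sketch below computes the mean silhouette for scalar data from a given cluster assignment, using the standard definition s = (b − a) / max(a, b), where a is the mean distance to the other members of the same cluster and b is the smallest mean distance to another cluster. The function name and the sample data are hypothetical.

```python
def silhouette_score_1d(values, labels):
    """Mean silhouette coefficient for scalar data and integer labels."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(abs(v - u) for u in other) / len(other)
            for name, other in clusters.items()
            if name != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


data = [0.0, 1.0, 10.0, 11.0]
good = silhouette_score_1d(data, [0, 0, 1, 1])  # compact, well separated
bad = silhouette_score_1d(data, [0, 1, 0, 1])   # clusters interleaved
print(round(good, 3), round(bad, 3))
```

A well-separated assignment gives a value close to 1, whereas an assignment that mixes the groups gives a negative value.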

The elbow plot is used to determine the appropriate number of clusters by plotting the trend of the sum of squared errors (SSE) for different numbers of clusters. Typically, as the number of clusters increases, the SSE gradually decreases, but after a certain number of clusters, the rate of decrease slows significantly, forming an “elbow” shape. This point indicates the optimal number of clusters. Fig 7 shows the elbow plot of the sum of squared errors for the lane-keeping evaluation indicators across different numbers of clusters.

Fig 7. Elbow plot of lane-keeping indicators.

https://doi.org/10.1371/journal.pone.0329257.g007

As shown in Fig 7, the sum of squared errors for each lane-keeping indicator decreases gradually as the number of clusters increases. The decrease is particularly significant when the number of clusters is 2 or 3, after which the reduction slows down. This indicates that in the elbow plot, the lane-keeping evaluation indicators achieve optimal clustering performance when the number of clusters is 2 or 3.

Based on the analysis results of the silhouette coefficient in Fig 6 and the SSE in Fig 7, further analysis was conducted by combining the minimum sample size for each lane-keeping indicator under different numbers of clusters (with the number of clusters being 2, 3, or 4). The results are shown in Fig 8.

Fig 8. Minimum sample size for different numbers of clusters.

https://doi.org/10.1371/journal.pone.0329257.g008

As shown in Fig 8, the minimum sample size for each lane-keeping evaluation indicator is largest when the number of clusters is 2, and it decreases progressively, reaching the smallest size when the number of clusters is 4.

Considering the parameter variation characteristics of lane-keeping indicators in Figs 6, 7, and 8, although the clustering performance is better with 2 clusters, limiting the analysis to only 2 clusters may result in a loss of detailed information, leading to an oversimplification of the data. On the other hand, when the number of clusters is set to 3, the clustering performance is only slightly inferior to that of 2 clusters. Moreover, compared to 2 clusters, the 3-cluster division better captures intermediate state samples within the data, more fully reflecting the data’s variability and hierarchical structure. Therefore, the final number of clusters for each evaluation indicator was determined to be 3. The clustering results of the indicators are shown in Fig 9.

Fig 9. Lane-keeping clustering results for indicators.

https://doi.org/10.1371/journal.pone.0329257.g009

From Fig 9, it can be seen that setting the number of clusters for lane-keeping ability indicators to 3 achieves a better clustering result. This is beneficial for determining the thresholds for lane-keeping ability assessment.

3. Lane-keeping ability evaluation indicator

Because the 15 lane-keeping evaluation indicators summarized from the literature may overlap in the features they capture, this work employs the Spearman correlation coefficient method to analyze the correlations between the indicators. The aim is to identify highly correlated indicators so that redundant ones can be removed, thereby avoiding repetitive evaluation during the assessment process. The Spearman correlation coefficient method performs well in handling tied ranks and outliers in sequences. We use the correlation coefficient $\rho$ to represent the correlation between data set $X$ and data set $Y$, as shown in equation (7).

$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^{2}}{n\left(n^{2}-1\right)} \tag{7}$$

In this equation, $d_i$ represents the rank difference between column $X$ and column $Y$ for the $i$-th sample; $n$ denotes the length of each column.

This paper calculated the correlation coefficients among the 15 indicators, resulting in the Spearman correlation matrix shown in Fig 10. To determine the correlation between two parameters, we used a threshold value of 0.7 as the standard [34]. When the correlation coefficient between two parameters exceeds 0.7, it indicates a strong correlation between them.
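The screening step can be sketched as follows: rank each indicator column (ties receive their average rank), compute the Pearson correlation of the ranks, and list the pairs whose coefficient exceeds 0.7. The function names and sample values are hypothetical, and comparing the absolute value of the coefficient with the threshold is an assumption made here.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman coefficient as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def strongly_correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, rho) for every pair with |rho| > threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = spearman(columns[a], columns[b])
            if abs(rho) > threshold:
                pairs.append((a, b, rho))
    return pairs


# Hypothetical per-segment indicator values.
indicators = {
    "thw_mean": [1.2, 1.5, 0.9, 2.0, 1.7],
    "thw_min": [0.8, 1.0, 0.5, 1.6, 1.1],
    "speed_sd": [0.9, 0.2, 0.5, 0.4, 0.7],
}
pairs = strongly_correlated_pairs(indicators)
print(pairs)
```

For each flagged pair, one of the two indicators would then be dropped; which one to keep is a modelling decision that the sketch does not make.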

Fig 10. Spearman correlation matrix.

https://doi.org/10.1371/journal.pone.0329257.g010

Fig 10 shows that the correlation coefficient between the THW minimum and the average THW is 0.80, between the DHW SD and the THW SD is 0.71, between the minimum DHW and the average DHW is 0.91, and between the risk rate and risk degree is 0.77. These results indicate a strong correlation between the THW minimum and average THW, the DHW SD and THW SD, the minimum DHW and mean DHW, and the risk rate and risk degree. Consequently, the minimum THW, average DHW, minimum DHW, and risk degree indicators were ultimately excluded. The selected indicators will be used in the subsequent construction of the evaluation model and will provide a reference standard for lane-keeping evaluation in drive skill tests.

4. Driver lane-keeping ability evaluation model based on MIF

Through the correlation analysis of low-speed lane-keeping data from AD4CHE, we further identified the indicators that impact a driver’s lane-keeping ability. These indicators are TTC^-1, average THW, risk rate, THW SD, DHW SD, speed SD, acceleration SD, maximum LCO, minimum TLC, minimum LD, and SDLP. To comprehensively assess a driver’s lane-keeping ability in driver skill tests, we first determine the threshold values for each indicator. Considering the skewed distribution of the data, this study employs the boxplot, the Youden index, and the 20%−80% percentiles to determine the thresholds, identify outliers and skewed distributions, and ensure that the threshold setting is not overly affected by extreme data or imbalanced distributions. The study then constructs an integrated evaluation model for lane-keeping in driver skill tests based on the AHP and the entropy weight method, and finally proposes a threshold-based method for evaluating a driver’s lane-keeping ability in driver skill tests.

4.1 Setting thresholds for evaluation indicators

This work utilizes a combination of the Youden (Y) index, boxplot, and the 20%−80% quantile method to comprehensively determine the optimal thresholds for the evaluation indicators. Relying solely on the Youden index and boxplot to extract outliers for threshold determination may be biased, especially when data distribution is irregular, potentially leading to unreasonable results, such as a negative minimum THW. To ensure the rationality and logical consistency of the data, the optimal thresholds for each indicator were determined by integrating the results from the Youden index and boxplot methods, along with the 20th and 80th percentiles of each indicator. The final optimal thresholds were established by taking the union of these values through logical judgment.

4.1.1 Youden Index.

The Youden Index (Y) [35] is a statistical measure used to evaluate the overall diagnostic performance of a binary classification test or model. It considers both the True Positive Rate (TPR) and the True Negative Rate (TNR), and quantifies the test’s discriminatory ability by providing a single value. The Y index ranges from −1 to 1, where a value of 1 indicates that the test has perfect discriminatory ability (i.e., both TPR and TNR reach their maximum values). A value of 0 suggests that the test performance is no better than random selection, and a value less than 0 indicates that the test performance is even worse than random selection. Generally, a higher value of the Y index signifies stronger discriminatory ability of the test, with the corresponding threshold being the optimal discrimination threshold [36]. The Youden Index can be calculated using Equations (8)-(10).

$$TPR = \frac{TP}{TP + FN} \tag{8}$$

$$TNR = \frac{TN}{TN + FP} \tag{9}$$

$$Y = TPR + TNR - 1 \tag{10}$$

In the formula, $TP$ represents the number of positive samples correctly classified as positive by the model or test; $TN$ represents the number of negative samples correctly classified as negative by the model or test; $FP$ represents the number of negative samples incorrectly classified as positive by the model or test; $FN$ represents the number of positive samples incorrectly classified as negative by the model or test; $TPR$ indicates the proportion of actual positive samples correctly classified as positive by the model or test; $TNR$ indicates the proportion of actual negative samples correctly classified as negative by the model or test.

To calculate the maximum Y value for each indicator in this study, the ROC curve was used to traverse all possible thresholds of the continuous variables based on the clustering results of various parameters extracted from lane-keeping segments. For each threshold, the values of TP, TN, FN, and FP were computed, and the threshold corresponding to the maximum Youden index was determined. The specific discrimination thresholds for the remaining indicators are shown in Table 2.
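The threshold search described above can be sketched as a scan over all candidate thresholds of one indicator. This is an illustration under stated assumptions: the binary labels stand in for the cluster-derived classes, a sample is predicted positive when its value is greater than or equal to the threshold, and the function name and data are hypothetical.

```python
def youden_optimal_threshold(values, labels):
    """Return (threshold, Y) maximizing Y = TPR + TNR - 1.

    labels: 1 for positive samples, 0 for negative samples.
    A sample is predicted positive when value >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_y, best_thr = float("-inf"), None
    for thr in sorted(set(values)):
        tp = sum(1 for v, lab in zip(values, labels) if v >= thr and lab == 1)
        fp = sum(1 for v, lab in zip(values, labels) if v >= thr and lab == 0)
        fn = positives - tp
        tn = negatives - fp
        y = tp / (tp + fn) + tn / (tn + fp) - 1.0
        if y > best_y:  # ties keep the smallest threshold
            best_y, best_thr = y, thr
    return best_thr, best_y


# Hypothetical indicator values with partially overlapping classes.
values = [1, 2, 3, 4, 5, 6]
labels = [0, 0, 1, 0, 1, 1]
threshold, y_index = youden_optimal_threshold(values, labels)
print(threshold, round(y_index, 3))
```

For indicators where smaller values indicate the positive class, the comparison direction would need to be reversed.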

Table 2. Optimal discrimination thresholds for each indicator based on Y index.

https://doi.org/10.1371/journal.pone.0329257.t002

By comparing the optimal discrimination thresholds corresponding to the maximum Y values for each indicator with the clustering results, it is evident that the discrimination thresholds for each indicator are very close to the boundary values of the clusters. This indicates that the optimal discrimination thresholds calculated based on Y values possess excellent discriminative ability and are also more precise.

4.1.2 Boxplot.

A boxplot is a visual representation of data, where Q1 is the first quartile, Q3 is the third quartile (the median of the values between the overall median and the highest value), and the interquartile range (IQR) is defined as Q3 - Q1. Typically, outliers are defined as data points lower than Q1 - 1.5IQR or higher than Q3 + 1.5IQR. Therefore, based on the boxplot, outliers for each indicator are identified, and the values of Q1 - 1.5IQR and Q3 + 1.5IQR for the lane-keeping parameter are used as the passing thresholds. Furthermore, by analyzing how the threshold value of each indicator affects the assessment of the driver’s lane-keeping ability, a single discriminative threshold is ultimately determined. The optimal discrimination thresholds for each evaluation indicator, calculated using the boxplot method, are shown in Table 3.

Table 3. Optimal discrimination thresholds for each indicator based on boxplot.

https://doi.org/10.1371/journal.pone.0329257.t003
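The boxplot bounds can be computed directly from the quartiles. The sketch below is illustrative only: the function name `tukey_fences` and the sample data are hypothetical, and the quartiles use the "inclusive" interpolation of Python's `statistics.quantiles`, which may differ slightly from the quartile convention used in the study.

```python
import statistics


def tukey_fences(data, k=1.5):
    """Return (Q1 - k*IQR, Q3 + k*IQR) for the given sample."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Hypothetical, strictly positive indicator values.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
lower, upper = tukey_fences(sample)
print(lower, upper)
```

Here Q1 = 3 and Q3 = 7, so the lower bound is 3 − 1.5 × 4 = −3 even though every observation is positive. A small first quartile combined with a wide IQR is enough to push the lower bound below zero.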

From Table 3, it can be observed that, compared with the thresholds calculated using the Y values, the optimal discrimination thresholds obtained through outlier detection fluctuate to varying degrees. For instance, the thresholds for DHW SD and minimum TLC are even negative. This is due to the inherent characteristics of the data. For example, the DHW SD indicator, as shown in the frequency distribution histogram in Fig 5, has a wide range between the first and third quartiles and a relatively small first quartile, so the lower bound Q1 − 1.5 × IQR becomes negative. Therefore, future research will further screen and exclude such data and use multiple cross-validation methods to obtain more reliable and accurate thresholds.

4.1.3 20%−80% percentiles.

By using statistical methods to calculate the mean, median, 20th percentile, 80th percentile, minimum, maximum, standard deviation, kurtosis, and skewness of each lane-keeping evaluation indicator, a more comprehensive understanding of the data’s central tendency, distribution shape, variability, and potential outliers can be achieved. This approach helps in obtaining more accurate thresholds. The statistical values for each indicator are shown in Table 4.

Table 4. Statistical values of each indicator.

https://doi.org/10.1371/journal.pone.0329257.t004
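A compact way to obtain the statistics listed above for one indicator is sketched below. The function name `describe` and the sample data are hypothetical. Two conventions are assumed because the text does not state them: percentiles use the "inclusive" interpolation of `statistics.quantiles`, and kurtosis is reported as excess kurtosis (zero for a normal distribution) computed from population moments.

```python
import statistics


def describe(data):
    """Summary statistics for one indicator (population moments)."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    # n=5 yields the 20th, 40th, 60th and 80th percentiles.
    quintiles = statistics.quantiles(data, n=5, method="inclusive")
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "median": statistics.median(data),
        "p20": quintiles[0],
        "p80": quintiles[3],
        "min": min(data),
        "max": max(data),
        "sd": sd,
        "skewness": m3 / sd ** 3,
        "kurtosis": m4 / sd ** 4 - 3.0,
    }


# Hypothetical indicator values, symmetric around 6.
stats = describe([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
print(stats)
```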

In summary, both the Y value-based threshold determination and the boxplot-based outlier detection method have certain limitations. Therefore, we further combined their results with the 20th and 80th percentiles of each indicator and determined the optimal thresholds by taking the union of these values based on logical judgment. For example, for the DHW SD, the threshold calculated using the Y value is 4.29 m, the outlier method yields 5.79 m, and the 80th percentile of the statistical distribution is 3.358 m. Considering that the dataset mainly consists of licensed drivers with better driving abilities, 5.79 m was selected as the threshold for the DHW SD in lane-keeping ability evaluation. Additionally, we excluded clearly unreasonable indicator thresholds. For the maximum TTC⁻¹, the threshold calculated using the Y value is 0.029/s, the outlier detection method gives 0.040/s, and the 80th percentile of the statistical distribution is 0.022/s. All three values are significantly lower than the 0.5/s near-collision threshold commonly proposed in the literature [27]. Since the thresholds determined by all three methods are too low, they were deemed unreasonable and excluded. Therefore, the final optimal thresholds for the 10 selected indicators are shown in Table 5.

Table 5. Optimal discrimination thresholds for lane-keeping indicators.

https://doi.org/10.1371/journal.pone.0329257.t005

4.2 Determination of weights for evaluation indicators

Weights are numerical values used to measure the significance of each indicator within the overall evaluation, describing the importance of individual factors within the factor system. Determining the relative importance of different evaluation indicators allows for a more accurate comprehensive evaluation of a driver’s road test ability. There are many methods for determining indicator weights, generally categorized into subjective weighting methods and objective weighting methods. Subjective weighting is primarily based on the decision-expert’s subjective judgment of the importance of each attribute, with the original data obtained through expert experience and judgment. Objective weighting is primarily determined based on the degree of correlation between indicators or the amount of information provided by each indicator.

To overcome the limitations of using a single weighting method and to leverage the strengths of both approaches, this study combines subjective and objective weighting. The combined approach reduces subjective bias and objective one-sidedness, so that the determined weights reflect both subjective and objective information and represent the driving ability assessed in the road test more accurately and comprehensively.

4.2.1 Subjective weights based on the AHP.

The AHP, introduced by Thomas L. Saaty of the University of Pittsburgh in 1977 [37], is a method that combines qualitative and quantitative judgments to describe objective evaluations. This study establishes a hierarchical evaluation framework for assessing drivers’ lane-keeping ability in driving skill tests. Based on the characteristics of the evaluation indicators and the final evaluation goals, the problem is developed into a multi-level analysis structure model. This approach scientifically and rationally determines the subjective weights of indicators at each level, enhancing the objectivity and credibility of the evaluation system and providing a more accurate theoretical foundation for subsequent driver training and testing.

The steps for determining the subjective weights of indicators using AHP are as follows.

(1) Construct a hierarchical model

The hierarchical structure model typically includes three levels from top to bottom: the goal hierarchy, the criteria hierarchy, and the indicator hierarchy. The goal hierarchy represents the highest level in the hierarchy, which in this study is the evaluation of drivers’ lane-keeping ability in drive skill tests. The criteria hierarchy consists of the criteria for assessing the quality of the options. In this study, safety and stability are added as judgment criteria, so the criteria hierarchy includes longitudinal safety, longitudinal stability, lateral safety, and lateral stability. The indicator hierarchy consists of the specific influencing factors, which are the 10 determined evaluation indicators. Therefore, this study constructs a hierarchical structure evaluation model for drivers’ lane-keeping ability, divided into four dimensions with a total of 10 indicators, as shown in Table 6.

Table 6. Hierarchical structure model for driver lane-keeping ability.

https://doi.org/10.1371/journal.pone.0329257.t006

(2) Constructing the fuzzy judgment matrix

The construction of the fuzzy judgment matrix is a key step in the AHP. This study uses the consistent matrix method proposed by Saaty et al. to construct the judgment matrix, which involves comparing factors pairwise rather than all at once. By using relative scales, this method aims to minimize difficulties in comparing factors with different properties and enhance accuracy, as shown in Table 7.

Table 7. The 9-level scale method.

https://doi.org/10.1371/journal.pone.0329257.t007

In constructing the judgment matrix, the value of each element is determined by comparing elements pairwise under a given criterion from the previous level. Therefore, for the criteria hierarchy elements longitudinal safety, longitudinal stability, lateral safety, and lateral stability under the goal hierarchy of assessing drivers’ lane-keeping ability, we can construct a 4×4 judgment matrix A.

$$A = (a_{ij})_{4 \times 4} = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix} \tag{11}$$

where $a_{ij}$ represents the relative importance of element $i$ with respect to element $j$ from the perspective of assessing drivers’ lane-keeping ability. That is, $a_{ij} = w_i / w_j$, where $w_i$ represents the weight of the $i$-th criterion in the criteria hierarchy with respect to the goal hierarchy of drivers’ lane-keeping ability. Therefore, the judgment matrix A has the following properties: $a_{ij} > 0$, $a_{ji} = 1 / a_{ij}$, and $a_{ii} = 1$.

Based on the above principles, the criteria hierarchy judgment matrix is constructed as shown in Table 8.

Table 8. Criteria hierarchy judgment matrix.

https://doi.org/10.1371/journal.pone.0329257.t008

Ideally, the judgment matrix should be completely consistent. In practice, however, pairwise comparisons may introduce estimation errors because of the limitations of the evaluator’s knowledge and experience. Therefore, a consistency check is further required.

a) Determine the eigenvector of the judgment matrix A, which represents the relative weights of each factor.

Normalize each column of the judgment matrix A: $\bar{a}_{ij} = a_{ij} / \sum_{k=1}^{n} a_{kj}$.

Sum the rows of the normalized matrix: $\bar{w}_i = \sum_{j=1}^{n} \bar{a}_{ij}$.

Normalize the vector $\bar{W} = (\bar{w}_1, \ldots, \bar{w}_n)^T$: $w_i = \bar{w}_i / \sum_{k=1}^{n} \bar{w}_k$, which gives $W = (w_1, \ldots, w_n)^T$. The elements of $W$ are the ranking weights of the factors at one level with respect to a given element of the level above; here they are the ranking weights of the four elements in the criteria hierarchy with respect to drivers’ lane-keeping ability in the goal hierarchy. Thus $W$ is the sought eigenvector, which is also the result of the hierarchical single ranking of the pairwise comparison matrix.

Calculate the maximum eigenvalue of the pairwise comparison matrix: $\lambda_{\max} = \frac{1}{n} \sum_{i=1}^{n} \frac{(AW)_i}{w_i}$, where $(AW)_i$ is the $i$-th component of the vector $AW$.

The calculated eigenvector and weight results for the criteria hierarchy are shown in Table 9 ($\lambda_{\max} = 4.047$).

Table 9. Feature vector and weight results for the criterion hierarchy.

https://doi.org/10.1371/journal.pone.0329257.t009
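The column-normalization procedure for deriving weights from a judgment matrix can be sketched as follows. The function name `ahp_weights` and the example matrix are hypothetical; the matrix is a deliberately simple, perfectly consistent reciprocal matrix and is not the judgment matrix of Table 8.

```python
def ahp_weights(matrix):
    """Approximate the principal eigenvector and largest eigenvalue of a
    pairwise comparison matrix by column normalization and row summation."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    row_sums = [sum(row) for row in normalized]
    total = sum(row_sums)
    weights = [r / total for r in row_sums]
    # lambda_max = (1/n) * sum_i (A W)_i / w_i
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    return weights, lambda_max


# Hypothetical 4x4 reciprocal matrix (a_ji = 1 / a_ij, a_ii = 1).
A = [
    [1.0, 2.0, 1.0, 2.0],
    [0.5, 1.0, 0.5, 1.0],
    [1.0, 2.0, 1.0, 2.0],
    [0.5, 1.0, 0.5, 1.0],
]
weights, lambda_max = ahp_weights(A)
print(weights, lambda_max)
```

Because this example matrix is perfectly consistent, the largest eigenvalue equals the matrix order (4). Expert judgments are rarely perfectly consistent, so in practice the eigenvalue exceeds the order slightly, as with the value 4.047 reported in the text.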

(3) Hierarchical ranking and consistency check

b) Consistency check

Since the pairwise comparison matrix is estimated and may contain errors, it is necessary to perform a consistency check. The consistency index (CI) is calculated using formula (12).

$$CI = \frac{\lambda_{\max} - n}{n - 1} \tag{12}$$

where $n$ is the order of the judgment matrix. $CI = 0$ indicates perfect consistency in the pairwise comparison matrix. The larger the CI value, the greater the degree of inconsistency in the matrix. Generally, if $CI \le 0.1$, the consistency of the matrix is considered acceptable; otherwise, the pairwise comparisons should be revised. The larger the dimension of the matrix, the worse the consistency tends to be, so a modification value, the random consistency index RI, is introduced, as shown in Table 10 [38], and the more reasonable consistency ratio CR is chosen as the indicator for measuring matrix consistency.

Table 10. RI value table (Part).

https://doi.org/10.1371/journal.pone.0329257.t010

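$$CR = \frac{CI}{RI} \tag{13}$$

where CI is the consistency index and RI is the modification value from Table 10. Thus, if $CR \le 0.1$, the consistency of the pairwise comparison matrix is considered acceptable; otherwise, the pairwise comparisons should be revised [39].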

Consistency checks of the weights were performed, and the results are shown in Table 11.

Table 11. Consistency check results.

https://doi.org/10.1371/journal.pone.0329257.t011

The results show that the maximum eigenvalue is 4.047. According to Table 11, the corresponding RI is 0.89. Thus, CR < 0.1 and the consistency check is passed, which demonstrates that the weight determination is reasonable and that there is no need to modify the judgment matrix.
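The reported figures can be checked with a short calculation. The sketch below uses the standard AHP definitions CI = (λmax − n)/(n − 1) and CR = CI/RI and interprets the value 0.89 quoted in the text as the random consistency index RI for a 4×4 matrix; the function name `consistency_ratio` is hypothetical.

```python
def consistency_ratio(lambda_max, n, ri):
    """Return (CI, CR) for a pairwise comparison matrix of order n."""
    ci = (lambda_max - n) / (n - 1)
    return ci, ci / ri


ci, cr = consistency_ratio(lambda_max=4.047, n=4, ri=0.89)
print(round(ci, 4), round(cr, 4))
```

With these inputs CI is about 0.0157 and CR is about 0.0176, well below the 0.1 acceptance limit.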

(4) Overall ranking and consistency check

To determine the final weights of each element in the hierarchical model, it is necessary to perform a comprehensive ranking, which involves calculating the relative importance of all factors with respect to the highest level (the goal level). This process is carried out from top to bottom, progressing sequentially from the highest level to the lowest. The specific representation is as follows.

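Assume $A$ is the upper level, including $m$ factors $A_1, A_2, \ldots, A_m$ whose overall ranking weights are $a_1, a_2, \ldots, a_m$. The weight coefficients for the hierarchical single ranking of the factors $B_1, B_2, \ldots, B_n$ of hierarchy $B$ relative to $A_j$ are $b_{1j}, b_{2j}, \ldots, b_{nj}$. Then the overall importance of the B hierarchy in the total ranking is given by $b_i = \sum_{j=1}^{m} a_j b_{ij}$. That is, the overall importance of each element in hierarchy $B$ is the sum of its single-ranking weights, weighted by the importance of the elements in the upper hierarchy $A$. Its specific representation is shown in Table 12.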

Table 12. Overall importance of hierarchy B.

https://doi.org/10.1371/journal.pone.0329257.t012

The overall importance (total ranking) must also undergo consistency checking, which proceeds from the highest level to the lowest. For example, the consistency index for the relative importance of the factors in hierarchy $B$ with respect to $A_j$ is denoted as $CI_j$, and the corresponding modification value is $RI_j$. The random consistency ratio for the overall importance of hierarchy B, denoted as $CR$, can be calculated by equation (14), where $a_j$ is the overall weight of $A_j$. When $CR < 0.1$, the hierarchical overall importance is considered satisfactory. Otherwise, the values in the pairwise comparison matrix need to be adjusted.

$$CR = \frac{\sum_{j=1}^{m} a_j \, CI_j}{\sum_{j=1}^{m} a_j \, RI_j} \tag{14}$$

We ultimately obtain the comprehensive weights of lane-keeping ability evaluation indicators based on hierarchical analysis, as shown in Table 13.

Table 13. Subjective weights of each indicator determined based on AHP.

https://doi.org/10.1371/journal.pone.0329257.t013
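The total ranking and its consistency check follow the usual AHP aggregation, sketched below. All numbers are hypothetical and chosen only for illustration: two upper-level elements with three lower-level elements each, and 0.58 used as an assumed RI for a 3×3 matrix. The function names are also illustrative.

```python
def total_ranking(upper_weights, local_weights):
    """Overall weight of each lower-level element: b_i = sum_j a_j * b_ij.
    local_weights[i][j] is the single-ranking weight of lower element i
    under upper element j (0 when i does not belong to j)."""
    return [sum(a * b for a, b in zip(upper_weights, row)) for row in local_weights]


def total_consistency_ratio(upper_weights, ci, ri):
    """Overall CR = sum_j a_j * CI_j / sum_j a_j * RI_j."""
    numerator = sum(a * c for a, c in zip(upper_weights, ci))
    denominator = sum(a * r for a, r in zip(upper_weights, ri))
    return numerator / denominator


a = [0.6, 0.4]  # hypothetical weights of two upper-level elements
b = [
    [0.5, 0.0], [0.3, 0.0], [0.2, 0.0],  # elements under the first
    [0.0, 0.6], [0.0, 0.3], [0.0, 0.1],  # elements under the second
]
overall = total_ranking(a, b)
cr_total = total_consistency_ratio(a, ci=[0.02, 0.01], ri=[0.58, 0.58])
print(overall, cr_total)
```

When each lower-level element belongs to exactly one upper-level element, as in the four-dimension, ten-indicator structure described here, the sum reduces to a single product of the criterion weight and the indicator's local weight.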

4.2.2 Objective weight determination based on the entropy weight method.

The entropy weight method is an objective weighting method that determines objective weights based on the amount of information contained in the indicators. If the entropy of an evaluation indicator is low, it indicates that the variability of the indicator value is high, meaning it provides more information and plays a more significant role in the evaluation of drivers’ lane-keeping ability during drive skill tests. Consequently, its weight should be higher. Conversely, if the entropy of an evaluation indicator is high, it suggests that the variability of the indicator value is low, indicating that it provides less information and plays a less critical role in the evaluation of lane-keeping ability, and thus, its weight should be lower. Therefore, this paper determines the objective weights of each evaluation indicator for lane-keeping based on information entropy, providing a basis for the comprehensive evaluation of drivers’ lane-keeping ability through multi-indicator fusion.

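We assume there are $m$ segments of drivers’ lane-keeping data and $n$ evaluation indicators, which together form the original data evaluation matrix $X$.

$$X = (x_{ij})_{m \times n} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix} \tag{15}$$

where $x_{ij}$ is the value of the $j$-th indicator in the $i$-th segment.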

Therefore, the steps for determining the objective weights of multiple indicators based on the entropy weight method are as follows:

(1) Data Normalization

First, the 10 evaluation indicators identified in this work are classified into positive and negative indicators. Positive indicators are those where higher values indicate better characteristics, including the average THW, minimum TLC, and minimum LD. Negative indicators are those where lower values indicate better characteristics, including risk rate, THW SD, DHW SD, speed SD, acceleration SD, maximum LCO, and SDLP. The specific classification of positive and negative indicators, along with some statistical values, is shown in Table 14.

Table 14. Statistical values of each evaluation indicator.

https://doi.org/10.1371/journal.pone.0329257.t014

From Table 14, it can be observed that the scales of the evaluation indicators differ, and some indicators (such as THW and risk rate) have significantly different maximum values, which would greatly affect the weight distribution. Therefore, it is necessary to first standardize the data of each indicator, converting it to the range [0,1].

Let the standardized matrix be denoted as $X'$, and let its elements be denoted as $x'_{ij}$.

Positive indicator: $x'_{ij} = \dfrac{x_{ij} - \min(X_j)}{\max(X_j) - \min(X_j)}$

Negative indicator: $x'_{ij} = \dfrac{\max(X_j) - x_{ij}}{\max(X_j) - \min(X_j)}$

In the formulas, $\min(X_j)$ is the minimum value of the evaluation indicator $X_j$ and $\max(X_j)$ is the maximum value of the evaluation indicator $X_j$. They can be found in Table 14.

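(2) The information entropy represents the amount of information contained in the indicator, which indicates the importance of the attribute. It can be calculated using formula (16).

$$E_j = -\frac{1}{\ln m} \sum_{i=1}^{m} p_{ij} \ln p_{ij} \tag{16}$$

In the formula, $p_{ij}$ represents the proportion of the current element in the current feature column, that is, $p_{ij} = x'_{ij} / \sum_{i=1}^{m} x'_{ij}$, where $x'_{ij}$ is the standardized value of the $j$-th indicator in the $i$-th segment. If $p_{ij} = 0$, then $p_{ij} \ln p_{ij}$ is taken as 0. The information entropy for each evaluation indicator is calculated as shown in Table 15.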

Table 15. Entropy of each evaluation indicator.

https://doi.org/10.1371/journal.pone.0329257.t015

where $X_1$ to $X_{10}$ refer to the indicators corresponding to Table 2.

Therefore, the weight of each indicator can be calculated using formula (17). The specific values can be found in Table 16.

Table 16. Objective weights of each evaluation indicator.

https://doi.org/10.1371/journal.pone.0329257.t016

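$$w_j = \frac{1 - E_j}{\sum_{j=1}^{n} (1 - E_j)} \tag{17}$$

where $E_j$ is the information entropy of the $j$-th indicator and $n$ is the number of evaluation indicators.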
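The normalization, entropy, and weighting steps can be combined into a short sketch. It assumes the common form of the entropy weight method, with min-max normalization and the 1/ln(m) scaling that keeps each entropy in [0, 1]. The function names and the two sample columns are hypothetical.

```python
import math


def min_max(column, positive=True):
    """Scale a column to [0, 1]; reverse the direction for negative indicators."""
    lo, hi = min(column), max(column)
    if hi == lo:
        raise ValueError("constant column cannot be normalized")
    if positive:
        return [(x - lo) / (hi - lo) for x in column]
    return [(hi - x) / (hi - lo) for x in column]


def entropy_weights(columns):
    """columns: list of normalized indicator columns, each of length m.
    Returns (entropies, weights)."""
    m = len(columns[0])
    entropies = []
    for col in columns:
        total = sum(col)
        if total == 0:
            raise ValueError("column sums to zero")
        e = 0.0
        for x in col:
            p = x / total
            if p > 0:  # p * ln(p) is taken as 0 when p == 0
                e -= p * math.log(p)
        entropies.append(e / math.log(m))
    redundancy = [1.0 - e for e in entropies]
    s = sum(redundancy)
    return entropies, [d / s for d in redundancy]


# Hypothetical data: one positive and one negative indicator, four segments.
col_a = min_max([1.0, 2.0, 3.0, 4.0], positive=True)
col_b = min_max([10.0, 10.5, 10.2, 30.0], positive=False)
entropies, weights = entropy_weights([col_a, col_b])
print(entropies, weights)
```

In this example the first column is spread more unevenly after normalization, so its entropy is lower and it receives the larger weight, in line with the reasoning given for the method.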

From Table 16, it can be observed that the highest objective weight is assigned to X₈, which represents the minimum TLC for lateral safety, with a weight of 0.29. This is followed by the average THW, representing longitudinal safety, with a weight of 0.23. The weights of the THW SD, DHW SD, the minimum LD, and SDLP are the smallest, each being 0.04. Additionally, a comparison with the subjective weights reveals that both the objective and subjective weights for the minimum TLC are relatively high, while the weights for the DHW SD are the lowest in both objective and subjective evaluations.

4.2.3 Subjective and objective combined weights based on AHP-entropy.

The linear weighting method is used to combine the determined subjective and objective weights. Specifically, the combined weights for the indicators of drivers’ lane-keeping ability during drive skill tests are calculated from the entropy weights and the AHP weights using the coefficient of variation method [40], as shown in formula (18). This approach mitigates the limitations of relying on a single method, making the final weights for the evaluation indicators more reasonable and giving a more accurate, objective, and comprehensive reflection of drivers’ lane-keeping ability.

$$w_j = \alpha \, w_j^{s} + (1 - \alpha) \, w_j^{o} \tag{18}$$

In the formula, $w_j^{s}$ represents the subjective weights obtained from the AHP; $w_j^{o}$ represents the objective weights obtained from the entropy weight method; $w_j$ denotes the combined weights; and $\alpha$ is the proportion of subjective weights in the combined weights, which can be calculated using formula (19) based on the coefficient of variation method.

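$$\alpha = \frac{n}{n - 1} \left[ \frac{2}{n} \left( 1 \cdot p_1 + 2 \cdot p_2 + \cdots + n \cdot p_n \right) - \frac{n + 1}{n} \right] \tag{19}$$

In the formula, $\alpha$ is the proportion of subjective weights in the combined weights, $(p_1, p_2, \ldots, p_n)$ represents the vector of subjective weights sorted in ascending order, and $n$ is the number of evaluation indicators. The final weight coefficients determined based on AHP-Entropy are shown in Table 17.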

Table 17. Results of subjective and objective combined weighting.

https://doi.org/10.1371/journal.pone.0329257.t017
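The linear combination of subjective and objective weights can be sketched as below. The share of the subjective weights is computed with a difference-coefficient expression that is commonly paired with this combination scheme. Its exact correspondence to formula (19) is an assumption, as are the function names and the weight vectors, which are hypothetical and not taken from Tables 13, 16, or 17.

```python
def subjective_share(subjective):
    """Share alpha of the subjective weights, from an assumed
    difference-coefficient form applied to the ascending-sorted weights."""
    p = sorted(subjective)
    n = len(p)
    g = (2.0 / n) * sum((k + 1) * pk for k, pk in enumerate(p)) - (n + 1.0) / n
    return n / (n - 1.0) * g


def combine(subjective, objective, alpha):
    """Linear combination: alpha * subjective + (1 - alpha) * objective."""
    return [alpha * s + (1.0 - alpha) * o for s, o in zip(subjective, objective)]


# Hypothetical weight vectors, each summing to 1.
subjective = [0.4, 0.3, 0.2, 0.1]
objective = [0.1, 0.2, 0.3, 0.4]
alpha = subjective_share(subjective)
combined = combine(subjective, objective, alpha)
print(alpha, combined)
```

Under this form, equal subjective weights give a share of 0 and a fully concentrated weight vector gives a share of 1, so the more the expert weights differentiate between indicators, the more they contribute. Because both inputs sum to 1, the combined weights also sum to 1 without renormalization.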

From Table 17, it is evident that both the subjective and objective weights of the average THW, representing longitudinal safety, and the TLC minimum, representing lateral safety, are relatively high. This underscores their significance in evaluating lane-keeping ability. Although the objective weight of the risk rate is comparatively lower, it is rated highest in terms of subjective importance. Therefore, the use of combined weights results in a more comprehensive and balanced evaluation, mitigating the potential bias of relying on a single weight. Moreover, the consistent weighting across longitudinal and lateral dimensions suggests that these factors are equally important in the context of driver skill tests. Additionally, the higher weights assigned to safety indicators over stability indicators highlight the critical role of safety in the lane-keeping process.

4.3 Threshold-based grading evaluation model for drivers’ lane-keeping ability

Based on the determined thresholds for each indicator and the combined weights, a threshold-based evaluation model for drivers’ lane-keeping ability during drive skill tests is constructed.

(20)

In the formula, the total quantified score of drivers’ lane-keeping ability during drive skill tests is computed from the combined subjective and objective weight of each indicator and the value of each indicator after normalization to the [0,1] range using the thresholds applied to naturalistic driving data; the passing score for each indicator is obtained from its threshold. The passing scores for each indicator and the total quantified threshold score of the model are calculated according to the above formula and are shown in Table 18.

Table 18. Full scores and passing scores for various lane-keeping indicators.

https://doi.org/10.1371/journal.pone.0329257.t018
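A threshold-based score of this kind can be illustrated with a small sketch. The exact form of formula (20) is not reproduced here, so the code assumes a weighted sum of normalized indicator values scaled to a full score of 100, with the passing score obtained by applying the same sum to the normalized threshold values. The function name, weights, and values are all hypothetical.

```python
def lane_keeping_score(weights, normalized_values, full_score=100.0):
    """Assumed scoring rule: full_score * sum_j w_j * x_j."""
    if len(weights) != len(normalized_values):
        raise ValueError("weights and values must have the same length")
    return full_score * sum(w * x for w, x in zip(weights, normalized_values))


weights = [0.5, 0.3, 0.2]             # hypothetical combined weights
threshold_values = [0.3, 0.2, 0.25]   # normalized value of each indicator at its threshold
driver_values = [0.8, 0.6, 0.4]       # normalized values for one driving segment

passing_score = lane_keeping_score(weights, threshold_values)
driver_score = lane_keeping_score(weights, driver_values)
passed = driver_score > passing_score
print(passing_score, driver_score, passed)
```

Because the decision is made on the aggregate score, a segment can still pass when a single indicator falls below its own threshold, provided the other indicators compensate.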

From the results in Table 18, it can be seen that if the comprehensive score across all indicators in the drivers’ lane-keeping ability evaluation is greater than 25.24, the driver is considered to have passed the lane-keeping ability test. The evaluation of drivers’ lane-keeping ability in the naturalistic driving dataset is shown in Fig 11.

Fig 11. Comprehensive evaluation results of drivers’ lane-keeping ability based on natural driving data.

https://doi.org/10.1371/journal.pone.0329257.g011

As shown in Fig 11, the driver scores follow an approximately normal distribution. Of the 2,926 driving segments, 99.56% fall within the passing range, with the highest score being 80.1, and only 13 were rated as failing.

5. Conclusion

Current driver skill tests primarily assess whether drivers meet specific standards in individual tasks, so they evaluate only single skills and cannot comprehensively reflect a driver’s overall competence. Given the importance of lane-keeping ability in assessing overall driving skill, this study investigates an MIF-based lane-keeping ability evaluation method for driver skill tests, aiming to provide a reference for enhancing the current driver skill testing framework. Specifically, the work first analyzes naturalistic driving data of human driving behavior in typical low-speed lane-keeping scenarios. Ten indicators for assessing driver lane-keeping ability were extracted from both the lateral and longitudinal dimensions. Next, a combination weighting model of subjective and objective weights using the AHP and the entropy weight method was employed to determine the weight of each indicator. The Youden index, boxplot, and statistical values of the indicators were analyzed to determine the threshold for each indicator. Based on the determined weights and thresholds, a comprehensive evaluation model was constructed. Finally, the model was applied to naturalistic driving data to evaluate driver lane-keeping ability. The evaluation results support the rationality and feasibility of the proposed method for assessing driver lane-keeping ability. This work provides a comprehensive evaluation method for driver skill testing that can help improve driver skill training. Moreover, it can also contribute to the testing of lane-keeping assistance system (LKAS) functionalities in high-level autonomous vehicles, thereby improving the safety and reliability of LKAS technology.

This study also has several limitations. The validation of our MIF-based lane-keeping ability evaluation method is primarily based on existing naturalistic driving datasets. During actual testing, long-tail events are possible in which a driver passes the lane-keeping ability evaluation despite exhibiting an abnormal value in a specific indicator. The scalability and adaptability of this method in complex real-world environments still need to be improved. With the gradual advancement of autonomous driving technology and the increasing maturity of driver assistance systems, driving skill test vehicles will make extensive use of advanced information technologies. In the future, we will focus on the real-time data collection capabilities of high-level autonomous vehicles and explore real-time lane-keeping ability assessment methods for driver skill tests. This will provide scientific guidance for evaluating the safety and reliability of autonomous vehicles in real-road operations and ultimately help explore methods to ensure road traffic safety at the source.

Acknowledgments

We thank the editors and the reviewers for their valuable comments and suggestions.

References

1. Tronsmoen T. Associations between driver training, determinants of risky driving behaviour and crash involvement. Safety Science. 2010;48(1):35–45.
2. Xu J, Liu J, Sun X, Zhang K, Qu W, Ge Y. The relationship between driving skill and driving behavior: Psychometric adaptation of the Driver Skill Inventory in China. Accid Anal Prev. 2018;120:92–100. pmid:30103100
3. Adanu EK, Bullard C, Dzinyela R, Jones S. Unbelted and unlicensed: An investigation of crash clusters and associated injury severities. Journal of Transportation Safety & Security. 2025:1–26.
4. Møller M, Janstrup KH. Crash involvement among unlicensed 17 year old drivers before and after licensing at 17 was allowed. Accid Anal Prev. 2021;156:106109. pmid:33905895
5. Bates L, Alexander M, Webster J. The link between dangerous driving and other criminal behaviour: a scoping review. Safer Communities. 2022;21(2):137–56.
6. Ge Y, Qu W, Jiang C, Du F, Sun X, Zhang K. The effect of stress and personality on dangerous driving behavior among Chinese drivers. Accid Anal Prev. 2014;73:34–40. pmid:25171523
7. Zhang Y, Chen X, Wang J, Zheng Z, Wu K. A generative car-following model conditioned on driving styles. Transportation Research Part C: Emerging Technologies. 2022;145:103926.
8. Zhang X, Sun J, Qi X, Sun J. Simultaneous modeling of car-following and lane-changing behaviors using deep learning. Transportation Research Part C: Emerging Technologies. 2019;104:287–304.
9. Chen K, Knoop VL, Liu P, Li Z, Wang Y. Modeling the impact of lane-changing’s anticipation on car-following behavior. Transportation Research Part C: Emerging Technologies. 2023;150:104110.
10. Shin YW, Kim DK, Kim EJ. Impact of driver behavior and vehicle type on safety of vehicle platoon under lane change situation. Transportation Research Record. 2023;2677(5):40–50.
11. Tan H, Lu G, Wang Z, Hua J, Liu M. A unified risk field-based driving behavior model for car-following and lane-changing behaviors simulation. Simulation Modelling Practice and Theory. 2024;136:102991.
12. Wu H, Zhao H, Guo Y, Chen Y, Duan M, Zhan X, et al. Trajectory optimisation method for CAVs considering energy–efficiency balance. Proceedings of the Institution of Civil Engineers-Transport. 2025;178(4):1–11.
13. Wang C, Popp C, Winner H. Acceleration-based collision criticality metric for holistic online safety assessment in automated driving. IEEE Access. 2022;10:70662–74.
14. Zhang H, Hou N, Zhang J, Li X, Huang Y. Evaluating the safety impact of connected and autonomous vehicles with lane management on freeway crash hotspots using the surrogate safety assessment model. Journal of Advanced Transportation. 2021;1:5565343.
15. Chai W, Wang J, Chen J. Rethinking the evaluation of driver behavior analysis approaches. IEEE Transactions on Intelligent Transportation Systems. 2024.
16. Papadoulis A, Quddus M, Imprialou M. Evaluating the safety impact of connected and autonomous vehicles on motorways. Accid Anal Prev. 2019;124:12–22. pmid:30610995
17. Zhao H, Wu H, Lu N, Zhan X, Xu E, Yuan Q. Lane Changing in a Vehicle-to-Everything Environment: Research on a Vehicle Lane-Changing Model in the Tunnel Area by Considering the Influence of Brightness and Noise Under a Vehicle-to-Everything Environment. IEEE Intelligent Transportation Systems Magazine. 2022;15(2):225–37.
18. Schöner H-P, Pretto P, Sodnik J, Kaluza B, Komavec M, Varesanovic D, et al. A safety score for the assessment of driving style. Traffic Inj Prev. 2021;22(5):384–9. pmid:33881358
19. Sternlund S. Traffic safety potential and effectiveness of lane keeping support. Sweden: Chalmers Tekniska Hogskola. 2020.
20. Utriainen R, Pöllänen M, Liimatainen H. The safety potential of lane keeping assistance and possible actions to improve the potential. IEEE Transactions on Intelligent Vehicles. 2020;5(4):556–64.
21. Chen J, Sun D, Zhao M, Li Y, Liu Z. A new lane keeping method based on human-simulated intelligent control. IEEE Transactions on Intelligent Transportation Systems. 2021;23(7):7058–69.
22. Aydin MM. A new evaluation method to quantify drivers’ lane keeping behaviors on urban roads. Transportation Letters. 2023;12(10):738–49.
23. Song R, Li X, Zhao X, Liu M, Zhou J, Wang FY. Identifying critical test scenarios for lane keeping assistance system using analytic hierarchy process and hierarchical clustering. IEEE Transactions on Intelligent Vehicles. 2023;8(10):4370–80.
24. Zhang P, Zhu B, Zhao J, Fan T, Sun Y. Performance evaluation method for automated driving system in logical scenario. Automotive Innovation. 2022;5(3):299–310.
25. Ghasemzadeh A, Ahmed MM. Utilizing naturalistic driving data for in-depth analysis of driver lane-keeping behavior in rain: Non-parametric MARS and parametric logistic regression modeling approaches. Transportation Research Part C: Emerging Technologies. 2018;90:379–92.
26. Zhang Y, Wang C, Yu R, Wang L, Quan W, Gao Y, et al. The AD4CHE dataset and its application in typical congestion scenarios of traffic jam pilot systems. IEEE Transactions on Intelligent Vehicles. 2023;8(5):3312–23.
27. Quante L, Zhang M, Preuk K, Schießl C. Human performance in critical scenarios as a benchmark for highly automated vehicles. Automotive Innovation. 2021;4:274–83.
28. Biswas RK, Friswell R, Olivier J, Williamson A, Senserrick T. A systematic review of definitions of motor vehicle headways in driver behaviour and performance studies. Transportation Research Part F: Traffic Psychology and Behaviour. 2021;77:38–54.
29. Zhang J, Lu H, Sun J. Improved driver clustering framework by considering the variability of driving behaviors across traffic operation conditions. Journal of Transportation Engineering, Part A: Systems. 2022;148(7):04022033.
30. Yu R, Long X, Quddus M, Wang J. A Bayesian Tobit quantile regression approach for naturalistic longitudinal driving capability assessment. Accid Anal Prev. 2020;147:105779. pmid:32980786
31. Li K, Yang X, Luo Y, Li H. Road geometry perception without accurate positioning and lane information. IET Intelligent Transport Systems. 2022;16(7):940–57.
32. Godthelp H, Milgram P, Blaauw GJ. The development of a time-related measure to describe driving strategy. Human Factors. 1984;26(3):257–68.
33. Verster JC, Roth T. Effects of central nervous system drugs on driving: speed variability versus standard deviation of lateral position as outcome measure of the on-the-road driving test. Hum Psychopharmacol. 2014;29(1):19–24. pmid:24375715
34. Iversen GR, Gergen M. Statistics: basic concepts and methods. Higher Education Press. 2000.
35. Yan Y, Wang H, Wan L, Yuan H, Liu G. Driver Load Model Under Different Tunnel Lateral Widths Based on Factor Analysis and Entropy Method. China Journal of Highway and Transport. 2023;36(2):190–202.
36. Guo Y, Su Y, Fu R, Yuan W. Influence of lane-changing maneuvers on passenger comfort of intelligent vehicles. China Journal of Highway and Transport. 2022;35(5):221–30.
37. Saaty TL. A scaling method for priorities in hierarchical structures. Journal of Mathematical Psychology. 1977;15(3):234–81.
38. Franek J, Kresta A. Judgment scales and consistency measure in AHP. Procedia Economics and Finance. 2014;12:164–73.
39. Brunelli M. Introduction to the analytic hierarchy process. Springer. 2014.
40. Li K. A study on comprehensive assessment of heavy metal pollution in water body sediments based on combination weighting and grey cloud model. Dalian University of Technology. 2019.

