
ECE-VDTDA: A robust and computationally efficient collision avoidance system for driver assistance in foggy weather

  • Naeem Raza ,

    Contributed equally to this work with: Naeem Raza, Muhammad Asif Habib, Mudassar Ahmad

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft

    Affiliations Department of Computer Science, National University of Modern Languages, Islamabad, Faisalabad Campus, Faisalabad, Punjab, Pakistan, Department of Computer Science, National Textile University, Faisalabad, Punjab, Pakistan

  • Muhammad Asif Habib ,

    Contributed equally to this work with: Naeem Raza, Muhammad Asif Habib, Mudassar Ahmad

    Roles Conceptualization, Funding acquisition, Methodology, Supervision

    maabid@imamu.edu.sa

    Affiliation College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia

  • Abdullah M. Albarrak ,

    Roles Conceptualization, Project administration, Writing – review & editing

    ‡ These authors also contributed equally to this work.

    Affiliation College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia

  • Mudassar Ahmad ,

    Contributed equally to this work with: Naeem Raza, Muhammad Asif Habib, Mudassar Ahmad

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Department of Computer Science, National Textile University, Faisalabad, Punjab, Pakistan

  • Alaa Eldeen Sayed Ahmed ,

    Roles Conceptualization, Methodology, Project administration, Writing – review & editing

    ‡ These authors also contributed equally to this work.

    Affiliation College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia

  • Muhammad Yasir ,

    Roles Conceptualization, Formal analysis, Methodology, Software, Writing – review & editing

    ‡ These authors also contributed equally to this work.

    Affiliation Department of Computer Science, University of Engineering and Technology Lahore, Faisalabad Campus, Faisalabad, Punjab, Pakistan

  • Habib Ur Rahman ,

    Roles Conceptualization, Data curation, Investigation, Methodology, Resources, Writing – review & editing

    ‡ These authors also contributed equally to this work.

    Affiliation Department of Software Engineering, The University of Lahore, Lahore, Punjab, Pakistan

  • Muhammad Ahsan Latif

    Roles Conceptualization, Formal analysis, Methodology, Software, Validation, Writing – review & editing

    ‡ These authors also contributed equally to this work.

    Affiliation Department of Computer Science, University of Agriculture, Faisalabad, Punjab, Pakistan

Abstract

Advanced Driver Assistance Systems (ADAS) and Collision Avoidance Systems (CAS) are the primary modules of modern human-centric and autonomous driving applications, such as forward and rear-end collision warnings. To enhance the performance of ADAS and CAS in foggy weather, an Efficient and Cost-Effective Vehicle Detection and Tracking with Driver Assistance (ECE-VDTDA) system is proposed. The proposed ECE-VDTDA system comprises vehicle detection, tracking, and driver assistance modules. An optimized SimYOLO-V5s_WIOU vehicle detection algorithm is proposed, based on the SimSPPF module, the baseline You Only Look Once (YOLO) algorithm (YOLO-V5s), and the Wise Intersection Over Union (WIOU) localization loss function. State-of-the-art Deep-SORT, Strong-SORT, and optimized Deep-SORT algorithms are utilized for vehicle tracking. The vehicle detection and tracking performance of the ECE-VDTDA system is rigorously evaluated on the DAWN, Foggy Driving, Foggy Cityscapes, BDD100K, web-collected, and self-collected foggy weather datasets. The optimized SimYOLO-V5s_WIOU algorithm outperformed the baseline YOLO-V5s on the Foggy Driving dataset with a 17.45% increase in mAP50, and on the Foggy Cityscapes dataset with 0.32%, 1.05%, 1.58%, 2%, and 0.54% increases in the multiclass mAP50, mAP50-95, F1 score, precision, and recall scores, respectively. Furthermore, the SimYOLO-V5s_WIOU algorithm also outperformed state-of-the-art methods and enables the Deep-SORT, Strong-SORT, and optimized Deep-SORT vehicle tracking algorithms to track vehicles with high confidence. The driver assistance module of the ECE-VDTDA system helps prevent imminent road collisions in foggy weather by estimating distance, speed, and time-to-collision and by issuing timely collision warnings. The experimental results demonstrate the robustness and computational efficiency of the proposed ECE-VDTDA system.

1 Introduction

Road traffic crashes significantly increase global fatalities, traffic congestion, and property damage, necessitating the development, testing, and real-world implementation of efficient and cost-effective solutions for driver assistance. Vehicle detection enables a plethora of use cases, including vehicle tracking, counting, speed and distance estimation, traffic surveillance, traffic sign and anomaly detection, intelligent transportation systems, and collision detection and avoidance [1].

  • Road traffic crashes can be effectively minimized by utilizing Advanced Driver Assistance Systems (ADAS) and Collision Avoidance Systems (CAS) with high-accuracy algorithms. Currently available ADAS and CAS systems have several limitations in terms of supported crash use-cases and false and missed detections, and they adapt poorly to complex and, specifically, foggy weather conditions. Only real-time and robust road perception and prediction techniques can help ADAS systems become proactively and promptly aware of potential road collisions and generate appropriate collision warnings to avoid them efficiently [2].
  • Traditional CAS and ADAS systems, primarily based on ultrasonic or LiDAR sensors, have limitations in detection range, accuracy, and cost [3,4]. Some advanced CAS systems integrate diverse sensors, such as cameras, Light Detection and Ranging (LiDAR), Radio Detection and Ranging (RADAR), and Vehicle-to-Everything (V2X) communication technologies, along with machine learning algorithms, to autonomously detect and respond to potential road hazards, thereby avoiding collisions. These systems face challenges, including sensor noise, low accuracy in adverse weather conditions, communication delays, and security risks. Significant improvements in sensor detection accuracy, support from edge computing technologies, and decision-making algorithms are still required for safer and autonomous driving [5].
  • Vehicle detection and tracking are key components of vision-based speed and distance estimation methods, as they provide vehicle positions across frames, which requires camera calibration. Efficient vehicle speed estimation helps enforce speed limits in ITS and CAS systems [6]. The You Only Look Once (YOLO) algorithm is widely used for vision-based vehicle speed and distance estimation. In one study, the nano, small, and medium models of the YOLO-V8 algorithm, together with a Region of Interest (ROI), were utilized for precise and efficient detection across bidirectional lanes; the speed estimation is based on image or video frame coordinates [7].
  • Warning-based CAS systems can be classified into two categories: Forward Collision Warning (FCW) and Rear-end Collision Warning (RCW) systems [8]. Generally, collision avoidance involves two main steps: path planning and path tracking. In the path planning phase, collision-free path trajectories are generated, and in the path tracking phase, the generated collision-free path is accurately followed [9]. Decision-making for accurate trajectory generation, prediction, and path planning is essential for safely navigating roads and avoiding collisions in autonomous vehicles. Static and dynamic object detections, directional information, positions, velocities, and time headway are the basic parameters for future path planning and trajectory generation.

The generation of trajectories through deep-learning algorithms requires data from either cost-effective cameras or high-cost LiDAR sensors. Robust path planning through the Potential Field (PF) algorithm and trajectory generation is affected by misinterpretation of object occlusion and intention, sensor noise, overfitting, and errors in data collection methods [10–12]. Surrogate Safety Measures (SSMs) are the state-of-the-art safety indicators used for proactively assessing road safety and potential crash risks in ADAS and CAS systems. Several SSM indicators based on the temporal positions of the conflicting vehicles are Time-To-Collision (TTC), time exposed TTC, time integrated TTC, modified TTC, post-encroachment time, time to accident, anticipated collision time, headway, and crash index [13–15]. A collision avoidance scenario can be handled by braking and by steering [16]. Vehicle driving scenarios are also classified as autonomous driving, cooperative driving, and autonomous and cooperative driving [17–19]. The detailed design and functioning of the driving scenarios, the application-specific use cases of driver assistance, and the collision avoidance scenarios are summarized in Table 1. An abstract-level overview of the proposed Efficient and Cost-Effective Vehicle Detection and Tracking with Driver Assistance (ECE-VDTDA) system is illustrated in Fig 1, and vehicle detection use cases are highlighted in Fig 2.

The main contributions of the presented research work are:

  • Robust and Computationally Efficient ECE-VDTDA System: The Efficient and Cost-Effective Vehicle Detection and Tracking (ECE-VDT) with Driver Assistance (ECE-VDTDA) system is proposed for collision avoidance and driver assistance in foggy weather.
  • SimYOLO-V5s_WIOU Vehicle Detection Algorithm: The optimized SimYOLO-V5s_WIOU algorithm is proposed for vehicle detection based on the Simplified Spatial Pyramid Pooling Fast (SimSPPF) module, the baseline YOLO-V5s [20,21] model, and the Wise Intersection Over Union (WIOU) [22] localization loss function.
  • Vehicle Tracking Algorithm: Baseline Deep-SORT [23,24], optimized Deep-SORT [25], and Strong-SORT [26,27] algorithms are utilized for vehicle tracking in the ECE-VDT module of the proposed ECE-VDTDA system.
  • Robust Performance Evaluation: The vehicle detection performance is evaluated on the diverse Foggy Driving (FD) [28,29], Vehicle Detection in Adverse Weather Nature (DAWN) [30,31], and Foggy Cityscapes (FC) [28,32] foggy weather image datasets. The vehicle tracking performance is evaluated on the diverse BDD100K [33], web-collected [34], and self-collected [35] foggy weather video datasets.
  • Speed, Distance, and Time-To-Collision (TTC) Estimations: Vision-based Vehicle Distance Estimation (VDE), Vehicle Speed Estimation (VSE), and Time-To-Collision (TTC) estimation methods are proposed. The bounding box height is considered the primary factor in these estimations.
  • Collision Alerts: The VDE, VSE, and TTC estimations based on threshold levels enable the ECE-VDTDA system to generate collision alerts/warnings for enhanced driver assistance to avoid imminent road collisions in foggy weather.

The overall organization of the paper is as follows: The Introduction section discusses the importance of collision avoidance and driver assistance systems, the Literature review section presents the working of the You Only Look Once (YOLO) algorithm, as well as the State-Of-The-Art (SOTA) literature with related works highlighted. The Materials and methods section provides the design and implementation details of the proposed ECE-VDTDA system along with distance, speed, and TTC estimations. The Results and discussion section analyzes and compares the performance of the optimized SimYOLO-V5s_WIOU vehicle detection algorithm, optimized Deep-SORT and Strong-SORT SOTA vehicle tracking algorithms, VDE, VSE, and TTC estimations, and collision alerts. Finally, the Conclusion and future work section concludes the overall contribution of the proposed ECE-VDTDA system and provides directions for future work.

Literature review

The state-of-the-art, single-stage, deep learning You Only Look Once (YOLO) algorithm is a Convolutional Neural Network (CNN) whose convolutional layers form a Fully Convolutional Network (FCN) style architecture. Initially, 24 convolutional layers with two fully connected layers were used in the design of YOLO Version 1 (YOLO-V1). A super-fast variant, Fast YOLO, was also proposed, comprising nine convolutional layers. YOLO is largely invariant to the image size. The YOLO algorithm looks at the whole image only once and provides its predictions. Fundamentally, the YOLO algorithm divides the input image into an S × S grid of cells. Each grid cell predicts B bounding boxes, each with five values, where bx and by are the coordinates of the center of the bounding box, bw represents its width, and bh represents its height. Mathematically, the bounding box elements are expressed in Eqs (1) and (2). A confidence score is associated with each bounding box. Each grid cell also provides C class probabilities for the object classes being detected. For the on-road scenario, object classes can be vehicles, pedestrians, road signs, traffic lights, etc. The predictions of YOLO are encoded as a tensor of size S × S × (B × 5 + C) [36].

(1) b_x = \sigma(t_x) + c_x, \quad b_y = \sigma(t_y) + c_y
(2) b_w = p_w e^{t_w}, \quad b_h = p_h e^{t_h}
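As a minimal illustration of this prediction layout (the grid size S, boxes per cell B, and class count C below are example values only, not the settings used in this work):

# Minimal sketch of the YOLO output encoding: S x S grid cells, each predicting
# B boxes (x, y, w, h, confidence) plus C class probabilities.
S, B, C = 7, 2, 4          # example values only (4 classes: car, bus, truck, motorcycle)
tensor_shape = (S, S, B * 5 + C)
print(tensor_shape)        # (7, 7, 14)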

A vision-based, efficient, and reliable framework for the ADAS system is proposed, utilizing the YOLO algorithm for vehicle detection and collision avoidance. The system supports driver assistance in urban and autonomous driving conditions on highways. It also supports Ultra-Fast Lane Detection (UFLD) for maintaining a safe lane to avoid potential collision risks [4]. A vision-based FCW framework is proposed, utilizing the YOLO-V5s algorithm for vehicle detection and the Kalman Filter (KF) for vehicle tracking. The FCW framework also supports driver assistance by incorporating leading vehicle speed and distance estimation as well as threshold-based TTC warnings. The experiments are performed on the BDD100K and DAWN datasets [37].

Attention Mechanism (AM) based variants, AMYOLO-V5s, were proposed for efficient and cost-effective vehicle detection in foggy weather. The vehicle detection performance of the AMYOLO-V5s variants was evaluated on Google Colab and a local workstation system by utilizing the state-of-the-art Foggy Driving (FD) and Vehicle Detection in Adverse Weather Nature (DAWN) datasets [38]. Another updated study proposed efficient and cost-effective SimYOLO-V5s variants based on the SimSPPF module and diverse localization loss functions, such as Complete Intersection Over Union (CIOU), Distance Intersection Over Union (DIOU), Efficient Intersection Over Union (EIOU), Generalized Intersection Over Union (GIOU), and SCYLLA Intersection Over Union (SIOU). They evaluated the vehicle detection performance of the SimYOLO-V5s variants on Google Colab and a local workstation system by utilizing the state-of-the-art FD, DAWN, and Foggy Cityscapes (FC) datasets. They also utilized optimized Deep-SORT and Strong-SORT algorithms for vehicle tracking and evaluated the real-time performance on foggy videos and video sequences from the Berkeley DeepDrive (BDD100K) dataset [25]. The vehicle detection performance of the YOLO-V11 algorithmic models was also evaluated on Google Colab by utilizing the FD and DAWN foggy datasets [39]. Another study focused on a social Vehicle-to-Everything (V2X) communication framework based on Software-Defined Networking (SDN) and 5G cellular infrastructure. Their framework highlighted the strength of Vehicle-To-Vehicle (V2V), Vehicle-To-Infrastructure (V2I), Vehicle-To-Network (V2N), and Vehicle-To-Pedestrian (V2P) communication scenarios for effective driver assistance, traffic surveillance, and ITS [40]. A similar study considered a cellular and millimeter-wave communication model for vehicular cloud computing [41]. The YOLO and Deep-SORT algorithms are widely used for detection and counting tasks [42–44]. A vision-based vehicle detection and speed estimation method was proposed, utilizing a monocular camera and the YOLO-V6 algorithm. The performance is primarily evaluated on the BrnoCompSpeed dataset, focusing on detection accuracy in terms of recall, precision, and mAP scores, as well as speed in terms of FPS. Mean and median errors in estimating speed in km/h are also presented [6]. Another study utilizes the YOLO-V5s and Deep-SORT algorithms for speed estimation, employing a camera and RADAR in a multi-sensor methodology. They evaluated the performance on a vehicle re-identification dataset [45]. A distance estimation method based on the YOLO-V8 algorithm was proposed, and its performance was evaluated on the PASCAL VOC dataset [46]. A vision- and deep-learning-based distance estimation method running on Raspberry Pi and Radxa Zero boards was proposed for cost-effective ADAS [47]. Another study proposed an object distance estimation method for ADAS using camera optics and image-based measurements [48].

Materials and methods

The Efficient and Cost-Effective Vehicle Detection and Tracking with Driver Assistance (ECE-VDTDA) system

The Efficient and Cost-Effective Vehicle Detection and Tracking (ECE-VDT) with Driver Assistance (ECE-VDTDA) system is proposed, comprising the ECE-VDT system followed by a driver assistance module. The input images/video stream from the vehicle camera passes through the ECE-VDT system for effective and efficient vehicle detection and tracking. Vehicle detection is achieved through the designed and implemented SimYOLO-V5s_WIOU algorithm, and vehicle tracking through the optimized Deep-SORT algorithm. Using the tracking-by-detection methodology, the Deep-SORT algorithm employs the efficient and cost-effective SimYOLO-V5s_WIOU algorithm to track on-road vehicles. The ECE-VDT system is implemented and evaluated on the DAWN, FD, and FC datasets, and tracking is implemented and evaluated on foggy videos. The ECE-VDT system provides vehicle detection and tracking information for four types of vehicles: cars, buses, trucks, and motorcycles. The driver assistance module is designed to provide the driver with warnings of imminent collisions; the driver is responsible for taking preventive measures to avoid collisions with on-road vehicles. The collision warning or alert levels can be modeled as safe, warning, braking, steering, and pre-crash for human-centric and autonomous vehicles. The ECE-VDTDA system and its components are illustrated in Fig 3.
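At a high level, each frame is detected, tracked, and then assessed for collision risk. The sketch below illustrates this per-frame loop with purely illustrative placeholder objects for the detector, tracker, and driver assistance module; none of these names come from the actual implementation.

# Minimal sketch of the ECE-VDTDA processing loop (illustrative placeholders only).
import cv2

def run_ece_vdtda(video_path, detector, tracker, assistant):
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        detections = detector(frame)                 # SimYOLO-V5s_WIOU: boxes, scores, classes
        tracks = tracker.update(detections, frame)   # optimized Deep-SORT: track IDs
        alerts = assistant.evaluate(tracks)          # distance, speed, TTC -> warnings
        for track_id, alert in alerts.items():
            print(f"track {track_id}: {alert}")
    cap.release()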

ECE-VDTDA system: Vehicle detection

To achieve high inference speed with competitive detection accuracy, the state-of-the-art You Only Look Once (YOLO) algorithm is selected and optimized for efficient and cost-effective vehicle detection in foggy weather. The state-of-the-art small model of the YOLO Version 5 (YOLO-V5s) algorithm is utilized as the base model for evaluating vehicle detection performance. The baseline YOLO-V5s model is optimized by introducing the Simplified Spatial Pyramid Pooling Fast (SimSPPF) module in the backbone network, and the Wise Intersection Over Union (WIOU) localization loss function [22] is utilized for enhanced Bounding Box Regression (BBR) by the detection head in the proposed optimized SimYOLO-V5s_WIOU variant. The original SimYOLO-V5s algorithm and its variants were proposed in previous work [25]. Five variants of SimYOLO-V5s were proposed in that work, based on five diverse localization loss functions: Complete Intersection Over Union (CIOU), Distance Intersection Over Union (DIOU), Efficient Intersection Over Union (EIOU), Generalized Intersection Over Union (GIOU), and SCYLLA Intersection Over Union (SIOU). These variants achieved state-of-the-art performance in both speed and accuracy for detecting vehicles in foggy weather and were termed SimYOLO-V5s_CIOU, SimYOLO-V5s_DIOU, SimYOLO-V5s_EIOU, SimYOLO-V5s_GIOU, and SimYOLO-V5s_SIOU. In this research work, the SimYOLO-V5s_WIOU variant of the state-of-the-art SimYOLO-V5s algorithm is proposed. The architectural diagram of the optimized SimYOLO-V5s_WIOU algorithm and its modules is illustrated in Fig 4. The SimYOLO-V5s_WIOU algorithm consists of a feature extraction backbone, a feature fusion and aggregation neck, and a final detection head. The backbone consists of convolutional layers, Cross-Stage Partial (CSP) bottleneck C3 layers, and the Simplified Spatial Pyramid Pooling Fast (SimSPPF) module. The feature fusion and aggregation neck consists of the Path Aggregation Network (PANET), comprising convolutional layers, CSP bottleneck C3 layers, up-sampling, and concatenation connections. The final detection head performs detection on three scales to handle small, medium, and large objects. The detection head produces objectness, classification, and localization scores, and the corresponding loss functions are based on them. The SimYOLO-V5s_WIOU algorithm for vehicle detection is based on diverse anchor boxes. The details of these anchor boxes, based on image width, height, area, aspect ratio, and map size, are summarized in Table 2.

Table 2. SimYOLO-V5s_WIOU anchor boxes grouped by detection layer, stride, and object size (input size 640×640).

https://doi.org/10.1371/journal.pone.0342186.t002

Fig 4. Architectural diagram of SimYOLO-V5s_WIOU vehicle detection algorithm.

https://doi.org/10.1371/journal.pone.0342186.g004

IOU-Based loss functions.

Intersection Over Union (IOU) is the classical Bounding Box Regression (BBR) localization loss function that measures the overlap between the anchor box and the Ground Truth (GT) box. The relationship of overlap between the anchor box and the Ground Truth (GT) is illustrated in Fig 5. The IOU loss function attempts to balance the learning of small and large objects for enhanced object detection [49]. Mathematically, the IOU loss functions (IOU, DIOU, CIOU, EIOU, SIOU) are expressed in Eqs (3)–(18).

(3) IOU = \frac{|B \cap B^{gt}|}{|B \cup B^{gt}|}
(4) L_{IOU} = 1 - IOU
(5) L = 1 - IOU + \mathcal{R}(B, B^{gt})
Fig 5. Bounding boxes and intersection over union with central points (Red) and smallest enclosing box (blue) [22].

https://doi.org/10.1371/journal.pone.0342186.g005

The Distance Intersection Over Union (DIOU) loss function [50] performs BBR based on the normalized distance between the central points of the bounding boxes. Mathematically, the DIOU loss function is defined as

(6) L_{DIOU} = 1 - IOU + \frac{\rho^2(b, b^{gt})}{c^2}

The Complete Intersection Over Union (CIOU) loss function [50] enhances the DIOU loss function by incorporating the aspect ratio of bounding boxes. Mathematically, the CIOU loss function is defined as

(7) L_{CIOU} = 1 - IOU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v
(8) \alpha = \frac{v}{(1 - IOU) + v}

Where v measures the consistency of the aspect ratios of the two boxes and α is a positive trade-off parameter.

(9) v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2

The Efficient Intersection Over Union (EIOU) loss function [51] is the enhanced version of the DIOU loss function. In EIOU, the penalty for violating the distance metric is increased. Mathematically, the EIOU loss function is defined as

(10) L_{EIOU} = 1 - IOU + \frac{\rho^2(b, b^{gt})}{c^2} + \frac{\rho^2(w, w^{gt})}{c_w^2} + \frac{\rho^2(h, h^{gt})}{c_h^2}

The SCYLLA Intersection Over Union (SIOU) loss function [52] considers the angle cost, distance cost, and shape cost for BBR. Mathematically, the SIOU loss function is defined as

(11) L_{SIOU} = 1 - IOU + \frac{\Delta + \Omega}{2}
(12) \Lambda = 1 - 2\sin^2\left(\arcsin(x) - \frac{\pi}{4}\right), \quad x = \frac{c_h}{\sigma} = \sin\alpha

Where Λ introduces the angle cost. Λ = 0 when the center points are aligned with the x-axis or y-axis, and Λ = 1 when they are aligned at 45 degrees to the x-axis.

(13) \Delta = \sum_{t=x,y}\left(1 - e^{-\gamma \rho_t}\right)

Where Δ introduces the distance cost.

(14) \rho_x = \left(\frac{b^{gt}_{c_x} - b_{c_x}}{c_w}\right)^2
(15) \rho_y = \left(\frac{b^{gt}_{c_y} - b_{c_y}}{c_h}\right)^2
(16) \gamma = 2 - \Lambda

Where Ω introduces the shape cost.

(17) \Omega = \sum_{t=w,h}\left(1 - e^{-\omega_t}\right)^{\theta}
(18) \omega_w = \frac{|w - w^{gt}|}{\max(w, w^{gt})}, \quad \omega_h = \frac{|h - h^{gt}|}{\max(h, h^{gt})}
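For illustration only, a small dependency-free Python sketch of the IOU, DIOU, and CIOU losses (Eqs (3)–(9)) on axis-aligned, corner-format boxes follows; it re-implements the published formulas for didactic purposes and is not the training code used in this work.

import math

def iou_losses(box, gt):
    """Boxes as (x1, y1, x2, y2) with positive size. Returns (L_IOU, L_DIOU, L_CIOU)."""
    # intersection and union areas
    ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
    ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_b = (box[2] - box[0]) * (box[3] - box[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_b + area_g - inter + 1e-9)
    # squared diagonal of the smallest enclosing box and squared center distance
    cx1, cy1 = min(box[0], gt[0]), min(box[1], gt[1])
    cx2, cy2 = max(box[2], gt[2]), max(box[3], gt[3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2 + 1e-9
    rho2 = ((box[0] + box[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 + \
           ((box[1] + box[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    # aspect-ratio consistency term v and trade-off alpha for CIOU
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)
    return 1 - iou, 1 - iou + rho2 / c2, 1 - iou + rho2 / c2 + alpha * v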

WIOU loss functions.

The Wise Intersection Over Union (WIOU) loss function [22], based on the standard IOU loss function and a dynamic non-monotonic Focusing Mechanism (FM), was proposed for BBR/localization. The dynamic non-monotonic FM primarily focuses on the outlier degree to effectively allocate gradients and improve the quality of anchor boxes, thereby enhancing the generalization performance of the object detection model. WIOU focuses on ordinary-quality anchor boxes to maintain a balance between high-quality and low-quality anchor boxes, thereby improving localization and detection performance. The mathematical expressions for the WIOU loss function and its variants are provided in Eqs (19)–(25). The balance is achieved by assigning a small gradient gain to high-quality anchor boxes (small outlier degree β) and a small gradient gain to low-quality anchor boxes (large outlier degree β), so that ordinary-quality anchor boxes receive the largest gain.

(19) \beta = \frac{L_{IOU}^{*}}{\overline{L_{IOU}}} \in [0, +\infty)

Where β represents the outlier degree. A small outlier degree β corresponds to the high-quality anchor boxes. Three variants of the WIOU loss function for BBR were proposed as V1, V2, and V3.

WIOU Version 1 (WIOU-V1) is based on a distance attention mechanism. Mathematically, the WIOU-V1 loss function is defined as

(20) L_{WIOUv1} = R_{WIOU} \, L_{IOU}
(21) R_{WIOU} = \exp\left(\frac{(x - x_{gt})^2 + (y - y_{gt})^2}{(W_g^2 + H_g^2)^{*}}\right)

WIOU Version 2 (WIOU-V2) is based on monotonic FM. Mathematically, the WIOU-V2 loss function is defined as

(22) L_{WIOUv2} = \left(\frac{L_{IOU}^{*}}{\overline{L_{IOU}}}\right)^{\gamma} L_{WIOUv1}

The gradient gain r is defined as r = \left(L_{IOU}^{*} / \overline{L_{IOU}}\right)^{\gamma}, so that WIOU-V2 can equivalently be written as

(23) L_{WIOUv2} = r \, L_{WIOUv1}

Where \overline{L_{IOU}} represents the exponential running average of L_{IOU} with momentum m.

WIOU Version 3 (WIOU-V3) is based on dynamic non-monotonic FM. Mathematically, the WIOU-V3 loss function is defined as

(24) L_{WIOUv3} = r \, L_{WIOUv1}

Where β represents the outlier degree, r is the gradient gain, and α and δ are hyperparameters.

(25) r = \frac{\beta}{\delta \, \alpha^{\beta - \delta}}
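A compact sketch of the WIOU-V3 computation (Eqs (19)–(25)) is given below; the α and δ defaults follow values reported in the WIOU paper and are assumptions here, and the inputs (per-anchor IOU loss, center-distance term, enclosing-box term, and running mean) are expected to be computed elsewhere.

import math

def wiou_v3_loss(l_iou, rho2, c2_star, l_iou_mean, alpha=1.9, delta=3.0):
    """Sketch of WIOU-V3: distance-attention term times the IOU loss, scaled by a
    non-monotonic gradient gain r computed from the outlier degree beta."""
    r_wiou = math.exp(rho2 / c2_star)              # WIOU-V1 attention term (Eq 21)
    beta = l_iou / max(l_iou_mean, 1e-9)           # outlier degree (Eq 19)
    r = beta / (delta * alpha ** (beta - delta))   # gradient gain (Eq 25)
    return r * r_wiou * l_iou                      # Eq 24 applied to the V1 loss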

The workings of the SimSPPF module and the SimYOLO-V5s_WIOU algorithm are presented in algorithmic form in Algorithms 1 and 2, respectively.

Algorithm 1 SimSPPF.

Require: Input feature map X with C channels

Ensure: Enriched multi-scale output feature map O

1: function SimConv(X, k, s, p)
2:   Apply 2D convolution on X with kernel size k, stride s, padding p
3:   Apply batch normalization
4:   Apply SiLU activation
5:   return activated feature map Y
6: end function
7: Step 1: Channel Reduction
8:   X′ ← SimConv(X, 1, 1, 0)   (reduce channels to C/2)
9: Step 2: Multi-Scale Pooling
10:  Y1 ← MaxPool(X′)
11:  Y2 ← MaxPool(Y1)
12:  Y3 ← MaxPool(Y2)
13: Step 3: Feature Fusion
14:  Z ← Concatenate(X′, Y1, Y2, Y3)
15:  O ← SimConv(Z, 1, 1, 0)   (fuse back to C channels)
16: return O
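A minimal PyTorch sketch of Algorithm 1 follows, assuming the standard SPPF-style structure with three cascaded 5×5 max-poolings; the kernel size and class/argument names are assumptions rather than the exact implementation.

import torch
import torch.nn as nn

class SimConv(nn.Module):
    """Conv + BatchNorm + SiLU, as used inside SimSPPF (sketch)."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class SimSPPF(nn.Module):
    """Simplified SPPF: channel reduction, three cascaded max-poolings, fusion."""
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        c_mid = c_in // 2
        self.cv1 = SimConv(c_in, c_mid, 1, 1)
        self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
        self.cv2 = SimConv(c_mid * 4, c_out, 1, 1)

    def forward(self, x):
        x = self.cv1(x)                      # Step 1: reduce channels to C/2
        y1 = self.pool(x)                    # Step 2: multi-scale pooling
        y2 = self.pool(y1)
        y3 = self.pool(y2)
        return self.cv2(torch.cat((x, y1, y2, y3), dim=1))  # Step 3: fuse back to C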

Algorithm 2 SimYOLO-V5s_WIOU.

Require: Input image I

Ensure: Set of detection boxes B, scores S, classes C

1: Input Preprocessing:
   Resize I to the network input size (640 × 640) and normalize pixel values
2: Feature Extraction (Backbone - CSPDarknet):
   F ← Backbone(I)
3: Feature Aggregation (Neck - SimSPPF + PANet):
   F′ ← PANet(SimSPPF(F))
4: Prediction Head:
   For each of the three detection scales, predict:
      • Bounding box coordinates (localization)
      • Objectness score (object presence)
      • Class probabilities (classification)
5: Loss Computation:
    Localization Loss: WIOU loss (Eqs (19)–(25))
    Objectness Loss: binary cross-entropy
    Classification Loss: binary cross-entropy
6: Post-processing:
   Filter predictions by confidence threshold
   Apply Non-Maximum Suppression (NMS)
7: return B, S, C

ECE-VDTDA system: Vehicle tracking

Simple Online and Realtime Tracking (SORT) [53] is a state-of-the-art, online, simple, and effective algorithm for high-accuracy and high-precision Multi-Object Tracking (MOT) at high Frames Per Second (FPS) rates. In the SORT algorithm, the Kalman Filter (KF) performs state estimation and prediction of the objects in the image space. The Hungarian algorithm performs frame-by-frame data association with the help of an association metric that measures the overlap of bounding boxes. A high number of identity switches (ID switches) and the missed detection/tracking of objects under occlusion are the major drawbacks of the SORT algorithm. To overcome these challenges, the state-of-the-art Deep-SORT [23] algorithm utilizes a more informed deep association metric that combines the object's appearance and motion features. The Deep-SORT algorithm is a state-of-the-art, real-time, online, efficient, and easy-to-implement algorithm. It also utilizes a Convolutional Neural Network (CNN) pre-trained offline on a large person re-identification dataset. The CNN enhances the robustness of the Deep-SORT algorithm against frequent ID switches and object occlusions over longer periods. Deep-SORT performs measurement-to-track association in image space by utilizing nearest-neighbor queries. The working of the state-of-the-art Deep-SORT algorithm for Multiple Object Tracking (MOT) is presented in algorithmic form in Algorithm 3. The state vector in Deep-SORT at time t is defined in Eq (26):

(26) \mathbf{x} = [u, v, \gamma, h, \dot{u}, \dot{v}, \dot{\gamma}, \dot{h}]^{T}

Where:

  • (u, v): center of the bounding box in image coordinates
  • γ: aspect ratio
  • h: height of the bounding box
  • (u̇, v̇, γ̇, ḣ): velocities of the respective state components
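A small numerical sketch of this eight-dimensional state under a constant-velocity Kalman prediction step follows; the example values and unit time step are illustrative, and the noise covariances are omitted for brevity.

import numpy as np

# Sketch of the 8-dimensional Deep-SORT state (Eq 26) with a constant-velocity
# transition model; dt = 1 frame.
dt = 1.0
F = np.eye(8)
F[:4, 4:] = dt * np.eye(4)        # position components advance by velocity * dt

x = np.array([320., 240., 0.5, 80., 0., 0., 0., 0.])  # [u, v, gamma, h, velocities]
x_pred = F @ x                     # Kalman predict step (mean only)
print(x_pred)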

Algorithm 3 Deep-SORT: Kalman filtering for MOT.

1: Initialize track set T ← ∅
2: for each frame t do
3:   for each track k ∈ T do
4:    Predict state x_k using the Kalman filter
5:    Increment age: a_k ← a_k + 1
6:   end for
7:   Obtain detections D_t from the detector
8:   Compute cost matrix C_ij using motion + appearance metrics
9:   Perform assignment: associate tracks with detections (Hungarian algorithm)
10:   for each matched pair (k, j) do
11:   Update the Kalman filter of track k with detection d_j
12:   Reset age a_k ← 0
13:   end for
14:   for each unmatched track k do
15:    if a_k > A_max then
16:     Remove track k from T
17:    end if
18:   end for
19:   for each unmatched detection j do
20:    Initialize new track k with tentative status
21:    Add k to T
22:   end for
23:   for each tentative track k do
24:    if associated for n_init consecutive frames then
25:     Promote to confirmed track
26:    else if not associated within n_init frames then
27:     Delete track k
28:    end if
29:   end for
30: end for

The motion information metric is based on the Mahalanobis distance between the i-th track's predicted distribution in measurement space and the j-th bounding box detection dj.

(27) d^{(1)}(i, j) = (d_j - y_i)^{T} S_i^{-1} (d_j - y_i)

The gating decision is handled using a binary variable that equals 1 if the association of the i-th track with the j-th detection is admissible under the Mahalanobis threshold t(1) = 9.4877 for the given metric.

(28) b_{i,j}^{(1)} = \mathbb{1}\left[d^{(1)}(i, j) \le t^{(1)}\right]

The appearance information metric is handled by using the smallest cosine distance between the appearance descriptors of the i-th track and the j-th bounding box detection dj.

(29) d^{(2)}(i, j) = \min\left\{1 - r_j^{T} r_k^{(i)} \;\middle|\; r_k^{(i)} \in \mathcal{R}_i\right\}

The corresponding gating decision is handled using a binary variable indicating whether the association of the i-th track with the j-th detection is admissible.

(30) b_{i,j}^{(2)} = \mathbb{1}\left[d^{(2)}(i, j) \le t^{(2)}\right]

The motion and appearance information metrics are combined as

(31) c_{i,j} = \lambda \, d^{(1)}(i, j) + (1 - \lambda) \, d^{(2)}(i, j)

The final association admissibility is handled as

(32) b_{i,j} = \prod_{m=1}^{2} b_{i,j}^{(m)}
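As a minimal sketch of how the gated motion and appearance metrics can be fused into an assignment cost matrix, consider the following; the λ and appearance-threshold values are illustrative placeholders rather than the exact settings used in this work.

import numpy as np

def combined_cost(d_motion, d_appearance, lam=0.0, t1=9.4877, t2=0.2):
    """Fuse Mahalanobis (motion) and cosine (appearance) distance matrices per
    Eq (31) and gate inadmissible pairs per Eqs (28), (30), and (32).
    Inputs are (num_tracks x num_detections) arrays."""
    cost = lam * d_motion + (1.0 - lam) * d_appearance
    gate = (d_motion <= t1) & (d_appearance <= t2)
    return np.where(gate, cost, 1e5)   # forbid gated pairs with a large cost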

The working of the ECE-VDT module of the ECE-VDTDA system is illustrated in Fig 6. The output of the ECE-VDT module shows the vehicle detection scores of the diverse vehicle classes on top of the bounding boxes, along with the tracking IDs. This output enables human-centric and autonomous vehicles to focus on a particular vehicle throughout the tracking period, facilitating imminent road collision detection and avoidance. The state-of-the-art Strong-SORT [26,27] algorithm, an improved version of the Deep-SORT algorithm, is also utilized to evaluate the vehicle tracking performance empowered by the robust and computationally efficient SimYOLO-V5s_WIOU vehicle detection algorithm in foggy weather. Strong-SORT enhances the Deep-SORT algorithm by upgrading its detection and embedding components and by introducing inference boosting techniques. In place of Deep-SORT's vanilla Kalman filter, motion cost, and vanilla matching, Strong-SORT employs the YOLO-X detection model and a BoT appearance extractor with Exponential Moving Average (EMA) feature updating, Enhanced Correlation Coefficient (ECC) camera-motion compensation, and improved Kalman filtering and matching costs. Strong-SORT additionally utilizes an Appearance-Free Link model (AFLink) for enhancing the global association, along with Gaussian-Smoothed Interpolation (GSI) for minimizing missed detections.

Fig 6. Vehicle detection and tracking in the ECE-VDTDA system.

https://doi.org/10.1371/journal.pone.0342186.g006

ECE-VDTDA system: Driver assistance and collision avoidance

Driver assistance is one of the essential areas of research in modern-day human-centric and autonomous vehicles. Vehicle drivers are the primary actors in human-centric vehicles, making decisions regarding steering, acceleration, navigation, and braking. The constant attention and interaction of human-centric vehicle drivers are required even with supportive driver assistance applications. Numerous ADAS applications introduced by researchers for driver assistance include lane-keeping assistance, lane-departure warning, forward or rear-end collision warnings, blind-spot warnings, cruise control, and safe parking, among others. Efficient and cost-effective ADAS systems primarily require single or multiple vehicle camera sensors. Depending on the application, camera sensors can operate as a single monocular camera, a binocular pair, or in a stereo arrangement for depth perception. Real-time vision-based vehicle detection and tracking assist drivers of human-centric vehicles in becoming aware of on-road vehicles for safe driving maneuvers. More specifically, forward collision warnings help drivers of human-centric vehicles to maintain a safe speed and distance from the leading vehicles on the road. Such collision warnings can only be generated by utilizing the strength of a vision-based, efficient, and cost-effective vehicle detection and tracking system. The functioning of the designed and implemented ECE-VDTDA system therefore depends on the input(s) from the ECE-VDT system: more accurate and timely information from the ECE-VDT system results in fewer false positives and more stable imminent collision warnings for drivers. The algorithmic working of the proposed ECE-VDTDA system is presented in Algorithm 4. The vehicle detection module in ECE-VDT efficiently and cost-effectively detects the leading vehicles on the road, with confidence scores displayed above each vehicle. The vehicle tracking module uses the vehicle detection module's predictions and confidence scores to track the leading vehicles. The classical Kalman filter algorithm for state estimation empowers the vehicle tracking module in ECE-VDT, and the Hungarian algorithm for data association assigns a track ID to each leading vehicle; the assigned track IDs are displayed above each vehicle. The ECE-VDT system is primarily personalized for cars, buses, and trucks, since highways and expressways mainly carry these vehicle types. Integrating driver assistance functionality with ECE-VDT further extends the system's capabilities, warning human-centric drivers to avoid imminent road traffic collisions. The driver assistance system requires distance and speed estimations for the CAS system to generate alerts. The notifications or alerts can be based on visual, auditory, or tactile signals. The central CAS use case in this research work is forward collision warning. The overall working and the modules are illustrated in the flow chart in Fig 7.

Algorithm 4 ECE-VDTDA system (Collision alerts for collision avoidance and driver assistance).

1: Start
2: Phase-1: System Initialization
3: Initialize(camera_params, SimYOLO-V5s_WIOU model, Deep-SORT tracker, hyperparameters, ROI, logging, hardware)
4: dict_info ← {}, alert_dict ← {}, initial_frame_info ← {}
5: Phase-2: Vehicle Detection
6: for each frame f do
7:   f ← Preprocess(f)
8:   detections ← SimYOLO-V5s_WIOU(f)
9:   detections ← FilterByConfidenceAndROI(detections)
10: end for
11: Phase-3: Vehicle Tracking
12: tracks ← Deep-SORT(detections, f)
13: for each tracked vehicle do
14:   id ← track identifier, bbox ← tracked bounding box
15:   Update(dict_info, initial_frame_info, id)
16: end for
17: Phase-4: Driver Assistance (Collision Avoidance)
18: for each tracked id with lifetime ≥ 20 frames do
19:   h ← BoundingBoxHeight(id)
20:   distance ← EstimateDistance(h)   (Eq (34))
21:   speed ← EstimateSpeed(h)   (Eq (33))
22:   ttc ← EstimateTTC(distance, speed)   (Eqs (35) and (36))
23:   risk ← CollisionRisk(ttc)   (Eq (37))
24:   alert ← ThresholdAlert(distance, speed, ttc)   (Table 3)
25:   alert_dict[id] ← alert
26: end for
27: for each tracked vehicle o in tracks do
28:   if o.id ∈ alert_dict and Center(o.bbox) ∈ ROI then
29:    AnnotateFrame(f, alert_dict[o.id])
30:    TriggerAlert(alert_dict[o.id])
31:    if alert indicates collision risk then
32:     Driver can Apply Safe_Braking() or
33:     Driver can Apply Safe_Steering() or
34:     Driver can Maintain Safe_Speed() or
35:     Driver can Maintain Safe_Distance()
36:    end if
37:   else
38:    AnnotateFrame(f, "track id: o.id")
39:   end if
40: end for
41: Phase-5: Adaptive System Tuning
42: TuneDetectionThresholds(road_condition, weather, traffic_density)
43: UpdateTrackerParams(based on occlusions, ID-switches)
44: OptimizeModelParameters()
45: End.

Several assumptions are considered for estimating the speed and distance of leading vehicles, including a flat road, constant acceleration, and a small distance between the leading vehicle and the ego vehicle.

Vehicle Speed Estimation (VSE).

In this research work, the vision-based leading Vehicle Speed Estimation (VSE) methodology is utilized. The VSE process operates in a continuous loop, processing the video frame by frame. The process starts with efficient and cost-effective vehicle detection followed by vehicle tracking. Vehicle detection is achieved by the optimized SimYOLO-V5s_WIOU algorithm, and vehicle tracking by the optimized Deep-SORT algorithm. The bounding boxes are scaled for better localization. The speed estimation is directly based on the bounding box height (in pixels) for all vehicle detection and tracking classes. The relationship between the bounding box height and the speed estimate is mathematically expressed in Eq (33).

(33)

This ensures that larger (closer) bounding boxes and smaller (farther) bounding boxes result in a relatively constant speed value. The collision warnings for driver assistance based on speed estimation are configured with threshold levels, which are summarized in Table 3. The track IDs, along with speed and collision alert warnings, are displayed on the vehicles' bounding boxes for better visualization. For a vehicle with a speed above 60, an imminent collision alert is generated; for a speed between 53 and 60, an attention required alert is generated; and for a speed below 53, a safe alert is generated for effective driver assistance.
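A minimal sketch of this speed-based alert logic follows; the numeric thresholds are those stated above, and the speed scale is whatever Eq (33) produces, which is not fully specified here.

def speed_alert(speed):
    """Map an estimated leading-vehicle speed to the alert levels of Table 3."""
    if speed > 60:
        return "IMMINENT COLLISION"
    elif speed >= 53:
        return "ATTENTION REQUIRED"
    return "SAFE"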

Table 3. Threshold levels and warnings/alerts for vehicle speed, distance, and TTC estimations in the ECE-VDTDA system.

https://doi.org/10.1371/journal.pone.0342186.t003

Vehicle Distance Estimation (VDE).

In this research work, the vision-based leading Vehicle Distance Estimation (VDE) methodology is utilized. The VDE process operates in a continuous loop, processing the video frame by frame. The process starts with efficient and cost-effective vehicle detection followed by vehicle tracking. Vehicle detection is achieved by the optimized SimYOLO-V5s_WIOU algorithm, and vehicle tracking by the optimized Deep-SORT algorithm. The bounding boxes are scaled for better localization. The distance estimate (in meters) is inversely proportional to the bounding box height (in pixels) for all vehicle detection and tracking classes. Larger bounding boxes correspond to smaller distances, matching the real-world setting of close vehicles that must be handled to avoid imminent road collisions, and vice versa; the mapping is controlled by a scaling factor k1. The relationship of the bounding box height and the scaling factor k1 with the distance is mathematically expressed in Eq (34).

(34) D = \frac{k_1}{h_{bbox}}

The collision warnings for driver assistance based on distance estimation are configured with threshold levels, which are summarized in Table 3. The track IDs, along with distance and collision alert warnings, are displayed on the vehicles' bounding boxes for better visualization. For a vehicle at a distance of less than 10 meters, an imminent collision alert is generated; for a distance between 11 and 50 meters, an attention required alert is generated; and for a distance above 50 meters, a safe alert is generated for effective driver assistance.
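A minimal sketch of Eq (34) and the distance-based alert thresholds follows; the scaling factor k1 is a camera-dependent calibration value assumed to be known, and the function names are illustrative.

def estimate_distance(bbox_height_px, k1):
    """Sketch of Eq (34): distance (m) inversely proportional to box height (px)."""
    return k1 / max(bbox_height_px, 1e-6)

def distance_alert(distance_m):
    """Alert levels from Table 3: <10 m imminent, 11-50 m attention, >50 m safe."""
    if distance_m < 10:
        return "IMMINENT COLLISION"
    elif distance_m <= 50:
        return "ATTENTION REQUIRED"
    return "SAFE"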

Time-To-Collision (TTC) estimation.

Time-To-Collision (TTC), a widely recognized Surrogate Safety Measure (SSM) indicator, is utilized for collision avoidance and driver assistance in the proposed ECE-VDTDA system. TTC assists both human drivers and autonomous vehicles in taking timely preventive actions by assessing collision risk. TTC is straightforward to compute and extensively used as a reliable early-warning indicator for imminent collisions. It indicates the remaining time before the ego and leading vehicles collide, based on their relative speed and distance. In the proposed system, TTC is estimated in seconds and displayed on the leading vehicles' bounding boxes. A higher TTC enables human-centric vehicle drivers or autonomous vehicles to travel safely, perform longitudinal or lateral driving maneuvers, and adjust speed and distance profiles more effectively. Other SSM indicators, while valuable in broader traffic safety analyses, were not considered in this study because they either involve cumulative exposure (e.g., TET, TIT) or require complex multi-vehicle interaction modeling (e.g., PET, CI), which is beyond the real-time focus of the proposed system. This study focuses on real-time vision-based forward collision risk estimation using continuous frame-by-frame distance and speed estimation, and TTC provides a direct, threshold-based measure of imminent collision risk without requiring additional assumptions [13–15]. The proposed ECE-VDTDA system's collision avoidance and driver assistance module utilizes vision-based distance and speed estimations to compute the TTC. Eqs (35) and (36) present the TTC formulations derived from the estimated speed and distance profiles. Specifically, Eq (35) expresses the fundamental relationship of TTC as the ratio of distance to speed (more specifically, relative speed), assuming that the vehicles continue to move at a constant speed. This formulation is applicable to common driving scenarios, including car-following, head-on, and rear-end approaches [54–56]. The TTC estimation based on distance and speed is mathematically expressed in Eq (35).

(35) TTC = \frac{D}{v_{rel}}

Where the relative speed is assumed to be a fixed constant of 5 m/s (18 km/h). Eq (35) represents a simplified TTC formulation that does not account for angular collisions. This is because angular collision detection and avoidance require not only the inter-vehicle distance and relative speed, but also detailed information such as vehicle velocities in different directions, trajectory patterns, and path predictions, along with detailed mathematical formulations that fall outside the scope of this study. By tying TTC directly to the calculated distance, the warning system's thresholds for both metrics are automatically synchronized. This is a deliberate design choice to provide a stable, consistent warning output rather than relying on noisy, frame-to-frame speed calculations. The TTC estimation based on speed is mathematically expressed in Eq (36).

(36)

Where the factor k2 is used to adjust the TTC estimation in seconds. The collision warnings for driver assistance based on TTC estimation are configured with threshold levels. Eq (36) is introduced as an approximation for scenarios where the ego vehicle approaches a reference distance threshold at a nearly constant speed. In this context, the fixed distance represents the predefined safety margin (e.g., the minimum safe following distance), while k2 is a scaling factor that converts the distance-to-speed ratio into time units consistent with the system's frame rate, measurement scale, and unit normalization. Eq (36) is used by the proposed ECE-VDTDA system as a normalized and computationally lightweight metric for early-warning decisions when the distance is a fixed threshold or when the system monitors how quickly the vehicle approaches a critical boundary. The threshold levels are summarized in Table 3. The track IDs, along with distance, speed, and collision alert warnings, are displayed on the vehicles' bounding boxes for better visualization. For a vehicle with a TTC of less than 2 seconds, an imminent collision alert is generated; for a TTC between 2 and 10 seconds, an attention required alert is generated; and for a TTC above 10 seconds, a safe alert is generated for effective driver assistance.
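A minimal sketch of the constant-speed TTC of Eq (35) and the TTC-based alert thresholds follows; the function and variable names are illustrative.

def estimate_ttc(distance_m, relative_speed_mps):
    """Sketch of Eq (35): TTC = distance / relative speed (constant-speed assumption)."""
    return float("inf") if relative_speed_mps <= 0 else distance_m / relative_speed_mps

def ttc_alert(ttc_s):
    """Alert levels from Table 3: <2 s imminent, 2-10 s attention, >10 s safe."""
    if ttc_s < 2:
        return "IMMINENT COLLISION"
    elif ttc_s <= 10:
        return "ATTENTION REQUIRED"
    return "SAFE"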

Collision risk.

The collision risk is directly linked with the TTC estimation: the smaller the TTC, the higher the collision risk, and vice versa. Mathematically, the collision risk based on the TTC is expressed in Eq (37):

(37) Collision\ Risk \propto \frac{1}{TTC}

Eq (37), derived from [56,57], is used in the ECE-VDTDA system's collision avoidance and driver assistance module as a deterministic surrogate risk indicator, widely adopted in real-time ADAS applications to reflect the severity of an imminent collision, rather than as a formal probabilistic crash-risk estimate, which would require statistical frameworks such as extreme value theory. The collision risk continuously changes as the TTC changes, which in turn changes as the leading vehicle's speed or distance changes. The proposed ECE-VDTDA system offers an efficient and cost-effective method for predicting the leading vehicles on the road, thereby enhancing the accuracy of these estimations. A high collision risk requires human-centric driver intervention to take preventive measures and avoid imminent collisions.

Experimental setup

A diverse set of experiments was performed to evaluate the efficiency and cost-effectiveness of the ECE-VDTDA system and modules. The experiments are performed on a cloud-based Google Colab system and a local workstation system equipped with a Graphics Processing Unit (GPU) and various computer vision libraries and packages. An NVIDIA Tesla T4 GPU with 15 GB of memory is used for the experiments on Google Colab, whereas an NVIDIA GeForce GTX 1050 Ti GPU with 4 GB of dedicated memory is used for the experiments on the local workstation system. An input image size of 640 × 640 was used throughout the experiments. Training is performed for 300 epochs on Google Colab and 100 epochs on the local workstation system. The Stochastic Gradient Descent (SGD) optimization algorithm, with a batch size of 16 for training and 1 for validation, and the default training, validation, detection, and tracking hyperparameters, is employed for the experiments with YOLO-V5s. Vehicle tracking is performed with the Deep-SORT and Strong-SORT algorithms using default configuration parameters. A confidence threshold of 0.25, an NMS IOU threshold of 0.45, a maximum age of 30, an NN budget of 100, and a maximum of 1000 detections are used. The collision avoidance and driver assistance functions require vision-based speed, distance, and Time-To-Collision (TTC) estimations.
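For reference, the settings above can be gathered into a single configuration object; the sketch below uses illustrative key names rather than the project's actual configuration files.

# Sketch of the experimental configuration described above (illustrative keys only).
config = {
    "img_size": 640,
    "epochs_colab": 300, "epochs_workstation": 100,
    "optimizer": "SGD",
    "batch_size_train": 16, "batch_size_val": 1,
    "conf_threshold": 0.25, "nms_iou_threshold": 0.45,
    "tracker": {"max_age": 30, "nn_budget": 100, "max_det": 1000},
    "gpu_colab": "NVIDIA Tesla T4 (15 GB)",
    "gpu_workstation": "NVIDIA GeForce GTX 1050 Ti (4 GB)",
}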

Datasets.

The vehicle detection performance of the ECE-VDTDA system, specifically the optimized SimYOLO-V5s_WIOU, was evaluated on diverse foggy weather image datasets. The state-of-the-art Foggy Driving (FD) [28,29], Vehicle Detection in Adverse Weather Nature (DAWN) [30,31], and Foggy Cityscapes [29,32] image datasets are utilized. The FD dataset comprises 101 real foggy weather road vehicle detection images. These images were pre-processed, self-annotated, and customized using the YOLO Label [58] tool, and split into training and validation images with an 80:20 ratio. The DAWN dataset comprises 1000 diverse weather road vehicle detection images, including 300 foggy images. The foggy images were pre-processed, customized, and split into training and validation images with an 80:20 ratio. The FC dataset comprises 5000 synthetic foggy weather road vehicle detection images. These images were pre-processed, self-annotated, and customized using the YOLO Label tool, and split into training, validation, and test images with a 60:10:30 ratio. The visualization of the FD, DAWN, and FC datasets’ vehicle detection classes, instances, and labels distribution is illustrated in Fig 8.

Fig 8. FD, DAWN, and FC datasets class-wise labels distribution.

https://doi.org/10.1371/journal.pone.0342186.g008

The vehicle tracking performance of the ECE-VDTDA system, specifically the optimized Deep-SORT algorithm, as well as collision avoidance and driver assistance capabilities, was evaluated using diverse BDD100K datasets [33], web-collected [34], and self-collected [35] foggy weather video datasets. DAWN, FD, FC, BDD100K, and web-collected datasets used in this research are publicly available and were collected, processed, and analyzed in accordance with the terms and conditions specified by their original providers. In addition, the details of the self-collected [35] dataset have been included in the manuscript to support reproducibility and ensure transparency of the experimental setup. Furthermore, all publicly available datasets (DAWN, FD, FC, BDD100K, and web-collected) used in this study do not require permits. The self-collected [35] dataset was recorded in publicly accessible locations where no special permits or field-site approvals are required for non-invasive visual data collection.

Performance metrics.

A diverse set of state-of-the-art performance metrics was utilized to evaluate the efficiency and cost-effectiveness of the ECE-VDTDA system and modules.

a) Precision (P): The fraction of correctly predicted positive objects over all objects predicted as positive, including false positives, as expressed in Eq (38).

(38) P = \frac{TP}{TP + FP}

b) Recall (R): The sensitivity, i.e., the fraction of correctly predicted positive objects over all ground-truth objects, including false negatives, as expressed in Eq (39).

(39) R = \frac{TP}{TP + FN}

c) Precision versus Recall Curve: It maps the relationship between the precision and recall scores of an object detector, and its optimal point lies near (1.0, 1.0) of the precision-recall curve.

d) F1 score: Harmonic mean or a balanced measure of precision and recall. An enhanced F1 score for object detection requires higher precision and recall scores, as expressed in Eq (40).

(40) F1 = \frac{2 \times P \times R}{P + R}

e) Average Precision (AP): It is the area under the interpolated precision-recall curve, measuring the precision scores at various recall (r) scores. AP is mathematically expressed in Eq (41).

(41) AP = \int_{0}^{1} p(r) \, dr

f) Mean Average Precision (mAP): It is the mean value of the AP over N classes for multi-class object detection. Mainly, mAP50 is reported at an IOU threshold of 0.50. mAP is mathematically expressed in Eq (42).

(42) mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i

g) Frame Per Second (FPS): It represents the number of frames processed per second to meet the real-time inference speed of the object detection algorithms. FPS is based on the overall processing time required by the algorithm. Higher FPS helps object detection algorithms to achieve high computational efficiency and real-time performance [59]. Mathematically, FPS is expressed as Eq (43):

(43) FPS = \frac{\text{Number of frames processed}}{\text{Total processing time (s)}}

h) GPU Memory Utilization (GB): It represents the total GPU memory the vehicle detection model utilizes during training. It depends on the model size, images, and batch size, etc.

i) Training Time (Hours): It represents the total training time required by the model to complete the training epochs. It depends on the model size, number of images, epochs, batch size, etc.
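A small sketch computing these metrics from raw counts follows (Eqs (38)–(43)); per-class APs are assumed to be computed beforehand from the interpolated precision-recall curves, and the per-frame FPS form is the equivalent of Eq (43) expressed in milliseconds.

def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (Eqs (38)-(40))."""
    p = tp / (tp + fp + 1e-9)
    r = tp / (tp + fn + 1e-9)
    f1 = 2 * p * r / (p + r + 1e-9)
    return p, r, f1

def mean_average_precision(ap_per_class):
    """mAP as the mean of per-class APs (Eq (42))."""
    return sum(ap_per_class) / len(ap_per_class)

def frames_per_second(total_time_ms_per_frame):
    """FPS from the per-frame processing time in milliseconds (per-frame form of Eq (43))."""
    return 1000.0 / total_time_ms_per_frame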

Results and discussion

ECE-VDTDA system: Vehicle detection

The proposed ECE-VDTDA system utilizes an optimized SimYOLO-V5s_WIOU algorithm for efficient and cost-effective vehicle detection in foggy weather conditions. The vehicle detection performance of the optimized SimYOLO-V5s_WIOU algorithm is evaluated using diverse metrics across various FD, DAWN, and FC foggy weather vehicle detection datasets.

SimYOLO-V5s_WIOU vehicle detection performance on the FD dataset.

The optimized SimYOLO-V5s_WIOU algorithm offers a multiclass vehicle detection mAP50 score of 47.3%, and a multiclass vehicle detection mAP50-95 score of 24% on the FD dataset. The SimYOLO-V5s_WIOU overall precision score is 26.7%, the recall score is 56.7%, and the balanced F1 score is 40%. SimYOLO-V5s_WIOU detects cars with a mAP50 score of 68.3%, buses with a mAP50 score of 37.4%, and trucks with a mAP50 score of 36.1%. When compared with the baseline YOLO-V5s algorithm, SimYOLO-V5s_WIOU outperforms as an effective and efficient vehicle detection algorithm in recall, multiclass mAP50, multiclass mAP50-95, and F1 scores, as well as detecting buses and trucks. Moreover, SimYOLO-V5s_WIOU also outperforms in pre-processing time, inference time, and post-processing time, and offers higher FPS compared to baseline YOLO-V5s. SimYOLO-V5s_WIOU requires less training time with moderately higher GPU memory requirements compared to baseline YOLO-V5s. The comparison of vehicle detection accuracy and speed results on the FD dataset are summarized in Table 4 and Table 5, respectively.

Table 4. Comparison of vehicle detection performance (accuracy) of baseline YOLO-V5s and SimYOLO-V5s_WIOU on the FD, DAWN, and FC datasets.

https://doi.org/10.1371/journal.pone.0342186.t004

Table 5. Comparison of vehicle detection performance (speed) of the baseline YOLO-V5s and SimYOLO-V5s_WIOU on the FD, DAWN, and FC datasets.

https://doi.org/10.1371/journal.pone.0342186.t005

In comparison with State-Of-The-Art (SOTA) methods, the efficiency and cost-effectiveness of the optimized SimYOLO-V5s_WIOU algorithm is evident from the results summarized in Tables 6 and 7, which show that SimYOLO-V5s_WIOU outperforms in multiclass mAP50, mAP50-95, and F1 scores, and also detects trucks with a higher mAP50 score on the FD dataset. However, it also achieves competitive results when compared with the SOTA in recall, detecting cars and buses. Moreover, SimYOLO-V5s_WIOU also outperforms in pre-processing time, inference time, and post-processing time, and offers higher FPS compared to SOTA.

Table 6. Comparison of vehicle detection performance of SOTA and SimYOLO-V5s_WIOU on FD dataset.

https://doi.org/10.1371/journal.pone.0342186.t006

Table 7. Comparison of vehicle detection performance (speed) of SimYOLO-V5s_WIOU with SOTA on the FD dataset.

https://doi.org/10.1371/journal.pone.0342186.t007

SimYOLO-V5s_WIOU vehicle detection performance on the DAWN dataset.

The optimized SimYOLO-V5s_WIOU algorithm offers a multiclass vehicle detection mAP50 score of 73.4%, and a multiclass vehicle detection mAP50-95 score of 39.4% on the DAWN dataset. The SimYOLO-V5s_WIOU overall precision score is 77.5%, the recall score is 67%, and the balanced F1 score is 68%. SimYOLO-V5s_WIOU detects cars with a mAP50 score of 87.2%, buses with a mAP50 score of 30.8%, trucks with a mAP50 score of 76.2%, and motorcycles with a mAP50 score of 99.5%. When compared with the baseline YOLO-V5s algorithm, SimYOLO-V5s_WIOU shows competitive results and underperforms in precision, recall, mAP50, mAP50-95, and F1 scores. However, SimYOLO-V5s_WIOU outperforms in pre-processing time, inference time, and post-processing time, and offers higher FPS compared to baseline YOLO-V5s. SimYOLO-V5s_WIOU requires less training time with moderately higher GPU memory requirements compared to baseline YOLO-V5s. The comparison of vehicle detection accuracy and speed results on the DAWN dataset is summarized in Table 4 and Table 5, respectively.

In comparison to SOTA methods, the efficiency and cost-effectiveness of the optimized SimYOLO-V5s_WIOU algorithm is evident from the results of accuracy metrics as summarized in Table 8 on the DAWN dataset, which show that SimYOLO-V5s_WIOU outperforms in multiclass mAP50, mAP50-95, and precision scores, and also detects motorcycles and trucks with a higher mAP50 score on the DAWN dataset. However, it also achieves competitive results when compared with the SOTA in recall, F1 score, and detecting cars and buses. However, SimYOLO-V5s_WIOU also shows competitive results in terms of speed metrics, as summarized in Table 9, including pre-processing time, inference time, post-processing time, and FPS, compared to SOTA.

Table 8. Comparison of vehicle detection performance of state-of-the-art and SimYOLO-V5s_WIOU on DAWN dataset.

https://doi.org/10.1371/journal.pone.0342186.t008

Table 9. Comparison of vehicle detection performance (speed) of SimYOLO-V5s_WIOU with state-of-the-art on the DAWN dataset.

https://doi.org/10.1371/journal.pone.0342186.t009

SimYOLO-V5s_WIOU vehicle detection performance on the FC dataset.

The optimized SimYOLO-V5s_WIOU algorithm achieves a multiclass vehicle detection mAP50 score of 61.8%, and a multiclass vehicle detection mAP50-95 score of 38.4% on the FC dataset. The SimYOLO-V5s_WIOU overall precision score is 75.6%, the recall score is 55.5%, and the balanced F1 score is 64%. SimYOLO-V5s_WIOU detects cars with a mAP50 score of 82.3%, buses with a mAP50 score of 62.9%, trucks with a mAP50 score of 58.3%, and motorcycles with a mAP50 score of 43.6%. When compared with the baseline YOLO-V5s algorithm, SimYOLO-V5s_WIOU outperforms in precision, recall, mAP50, mAP50-95, and F1 scores. Moreover, SimYOLO-V5s_WIOU also outperforms in detecting motorcycles in foggy weather, and also shows competitive results in detecting cars, buses, and trucks in foggy weather. SimYOLO-V5s_WIOU outperforms the baseline in terms of pre-processing time and inference time, and offers higher FPS compared to baseline YOLO-V5s. Additionally, SimYOLO-V5s_WIOU requires less training time compared to baseline YOLO-V5s. The comparison of vehicle detection accuracy and speed results on the FC dataset is summarized in Table 4 and Table 5, respectively. The proposed optimized SimYOLO-V5s_WIOU algorithm is also compared with SimYOLO-V5s variants proposed in the previous work on the larger FC dataset. It is evident from the results summarized in Table 10 that SimYOLO-V5s_WIOU outperformed all the previous SimYOLO-V5s variants with a precision score of 75.6%. It also shows competitive results in mAP50, mAP50-95, and F1 scores compared to SimYOLO-V5s variants in detecting vehicles in foggy weather on the FC dataset. Additionally, the proposed SimYOLO-V5s_WIOU algorithm is also compared with SimYOLO-V5s variants in terms of pre-processing, inference, and NMS post-processing time, FPS rate, and training time. The results are summarized in Table 11. The visualization of the vehicle detection performance of baseline YOLO-V5s and optimized SimYOLO-V5s_WIOU is compared and presented in Figs 9 and 10. It is evident from the visualization results of both Set-I Fig 9 and Set-II Fig 10 that the optimized SimYOLO-V5s_WIOU outperforms the baseline YOLO-V5s in correctly detecting vehicle classes, reducing miss and false detections, and achieving higher mAP50 scores. Objectness, classification, and localization loss functions, precision, recall, mAP50, and mAP50-95 scores of SimYOLO-V5s_WIOU vehicle detection algorithm are shown in Fig 11.

Fig 9. Visual comparison of vehicle detection performance of baseline YOLO-V5s and optimized SimYOLO-V5s_WIOU on set-I.

https://doi.org/10.1371/journal.pone.0342186.g009

Fig 10. Visual comparison of vehicle detection performance of baseline YOLO-V5s and optimized SimYOLO-V5s_WIOU on set-II.

https://doi.org/10.1371/journal.pone.0342186.g010

Fig 11. Loss, P, R, and mAP graphs of SimYOLO-V5s_WIOU on FD, DAWN, FC datasets.

https://doi.org/10.1371/journal.pone.0342186.g011

Table 10. Comparison of vehicle detection performance (accuracy) of SimYOLO-V5s_WIOU with state-of-the-art SimYOLO-V5s variants on the FC dataset.

https://doi.org/10.1371/journal.pone.0342186.t010

Table 11. Comparison of vehicle detection performance (speed) of SimYOLO-V5s_WIOU with state-of-the-art SimYOLO-V5s variants on the FC dataset.

https://doi.org/10.1371/journal.pone.0342186.t011

ECE-VDTDA system: Vehicle tracking

The vehicle tracking performance of the ECE-VDTDA system is evaluated on diverse video datasets, including BDD100K, web-collected, and self-collected sequences. State-of-the-art Deep-SORT, Strong-SORT, and optimized Deep-SORT algorithms are used for the vehicle tracking task, driven by the optimized SimYOLO-V5s_WIOU vehicle detection algorithm. The ECE-VDT performance with the Deep-SORT trackers on a BDD100K video sequence of 1213 frames is summarized in Table 12. The results show that the combination of optimized SimYOLO-V5s_WIOU and optimized Deep-SORT outperforms the combinations of baseline YOLO-V5s with baseline Deep-SORT and with optimized Deep-SORT, achieving a pre-processing time of 0.5 ms, an inference time of 11 ms, and a post-processing time of 1.6 ms; it remains competitive with the combinations of the other SimYOLO-V5s variants and optimized Deep-SORT. The ECE-VDT performance with Strong-SORT on the same BDD100K sequence of 1213 frames is summarized in Table 13. The combination of optimized SimYOLO-V5s_WIOU and Strong-SORT outperforms baseline YOLO-V5s with Strong-SORT, achieving a pre-processing time of 0.4 ms, an inference time of 14.8 ms, and a post-processing time of 1.7 ms, and remains competitive with the combinations of the other SimYOLO-V5s variants and Strong-SORT. The robustness of the optimized SimYOLO-V5s_WIOU detector with the baseline Deep-SORT, optimized Deep-SORT, and Strong-SORT trackers is further evaluated on a self-collected foggy weather video sequence of 10213 frames; the comparative results are summarized in Table 14. The results indicate that the optimized SimYOLO-V5s_WIOU detector improves the performance of the vehicle tracking algorithms, particularly in terms of processing speed. Among the three trackers, Deep-SORT switches track IDs more frequently than Strong-SORT, whereas the processing and track-update times of Strong-SORT are higher than those of Deep-SORT. The vehicle detection and tracking performance of the ECE-VDT module of the ECE-VDTDA system on the BDD100K, web-collected, and self-collected video datasets is visualized in Fig 12.
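To make the reported timing figures concrete, the sketch below outlines how a per-frame detect-then-track loop and its FPS rate can be measured. It is a minimal outline, not the ECE-VDT implementation: `detector` and `tracker` are hypothetical stand-ins for the trained SimYOLO-V5s_WIOU model and the optimized Deep-SORT tracker, and only the loop structure and throughput bookkeeping are shown.

```python
import time
import cv2

def measure_tracking_fps(video_path, detector, tracker, conf_thres=0.5):
    """Run a detect-then-track loop over a video and return the average FPS.

    Assumptions: `detector(frame)` returns detections with .box, .score, .cls;
    `tracker.update(detections, frame)` returns tracks with persistent IDs.
    """
    cap = cv2.VideoCapture(video_path)
    frame_times = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        start = time.perf_counter()
        detections = [d for d in detector(frame) if d.score >= conf_thres]
        tracks = tracker.update(detections, frame)  # assign track IDs per vehicle
        frame_times.append(time.perf_counter() - start)
    cap.release()
    return len(frame_times) / sum(frame_times) if frame_times else 0.0

# With the Table 12 timing components (0.5 + 11 + 1.6 ms ≈ 13.1 ms per frame),
# the corresponding throughput is roughly 1000 / 13.1 ≈ 76 FPS.
```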

Table 12. Comparison of vehicle detection and tracking performance of the ECE-VDT system with SOTA on the BDD100K video sequence of 1213 frames.

https://doi.org/10.1371/journal.pone.0342186.t012

Table 13. Comparison of vehicle detection and tracking performance of the ECE-VDT system with SOTA on the BDD100K video sequence of 1213 frames.

https://doi.org/10.1371/journal.pone.0342186.t013

Table 14. Comparison of vehicle detection and tracking performance of the ECE-VDT system with SOTA on the self video sequence of 10213 frames.

https://doi.org/10.1371/journal.pone.0342186.t014

Fig 12. Visualization of vehicle detection and tracking performance of the ECE-VDTDA system.

https://doi.org/10.1371/journal.pone.0342186.g012

ECE-VDTDA system: Driver assistance and collision avoidance

The efficient and cost-effective vehicle detection and tracking performance of the ECE-VDT module enables the driver assistance and collision avoidance functionality of the proposed ECE-VDTDA system. In the proposed ECE-VDTDA system, a detection is counted as a True Positive (TP) when SimYOLO-V5s_WIOU correctly localizes a vehicle with an IOU ≥ 0.5 and the optimized Deep-SORT tracker consistently associates it with the correct track ID. These verified detections are then used for the distance, speed, and TTC calculations. A False Positive (FP) occurs when SimYOLO-V5s_WIOU detects a vehicle where none exists, when the IOU with the ground truth is < 0.5, or when the optimized Deep-SORT tracker produces an incorrect association (identity mismatch). FP cases are excluded from the TTC computations and are used only for evaluating precision, recall, and mAP. The distance, speed, and TTC thresholds and estimations are used to generate collision alerts.
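A minimal sketch of this TP/FP rule is given below, assuming axis-aligned boxes in (x1, y1, x2, y2) pixel coordinates and a ground-truth track ID available for each matched vehicle; the function and variable names are illustrative, not the system's actual identifiers.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def classify_detection(pred_box, pred_track_id, gt_box, gt_track_id, iou_thres=0.5):
    """TP only if localization (IOU >= 0.5) and track-ID association both hold.

    Only TPs feed the distance, speed, and TTC estimators; FPs are used solely
    for precision, recall, and mAP evaluation.
    """
    if gt_box is None:
        return "FP"                       # detection where no vehicle exists
    if iou(pred_box, gt_box) < iou_thres:
        return "FP"                       # poor localization
    if pred_track_id != gt_track_id:
        return "FP"                       # identity mismatch from the tracker
    return "TP"
```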

These estimations are an integral part of collision avoidance and driver assistance in the ECE-VDTDA system. Experiments for distance, speed, and TTC estimation are performed on diverse web-collected, BDD100K, and self-collected video datasets. The distance-only, speed-only, distance-TTC, and speed-TTC alerts, along with vehicle track IDs, are displayed on the vehicles' bounding boxes. The process of distance, speed, and TTC estimation is discussed in the research methodology section. The ECE-VDTDA system's distance-TTC and speed-TTC estimation performance on the web-collected, BDD100K, and self-collected video datasets is shown in Fig 13. Figs 13A-13C show the processing time, FPS, and distance-TTC estimations and alerts over the video frames, whereas Figs 13D-13F show the processing time, FPS, and speed-TTC estimations and alerts. The distance-TTC relationship is more consistent across the video frames, whereas the speed-TTC relationship fluctuates more. The processing speed of the ECE-VDTDA system is also competitive, remaining above 30 FPS and reaching 70 FPS or higher, which demonstrates its suitability for real-world settings. The visualization of the ECE-VDTDA system's performance is provided in Fig 14. Figs 14A-14C show the distance-TTC estimations and alerts over the video frames, whereas Figs 14D-14F show the speed-TTC estimations and alerts. The alerts are displayed on the vehicles' bounding boxes in different colors according to the severity of the potential collision: red indicates an imminent road collision warning, yellow indicates a collision alert intended to gain the driver's attention, and white indicates a safe zone in which the driver can drive comfortably without requiring focused attention. The ECE-VDTDA system's performance sets a strong foundation for further exploring the design, implementation, real-world testing, and deployment of collision avoidance and driver assistance in modern human-centric and autonomous vehicles.
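As an illustration of how such alerts can be derived, the sketch below computes TTC as the current inter-vehicle gap divided by the closing speed and maps it to the three alert colors. The 2 s and 4 s thresholds are assumptions for illustration only, not the pre-defined thresholds used by the ECE-VDTDA system.

```python
def time_to_collision(distance_m: float, closing_speed_mps: float) -> float:
    """TTC in seconds; infinite when the gap is not closing."""
    if closing_speed_mps <= 0:
        return float("inf")
    return distance_m / closing_speed_mps

def alert_color(ttc_s: float, critical_ttc: float = 2.0, caution_ttc: float = 4.0) -> str:
    """Map TTC to the bounding-box alert colors (thresholds are illustrative)."""
    if ttc_s <= critical_ttc:
        return "red"      # imminent collision warning
    if ttc_s <= caution_ttc:
        return "yellow"   # collision alert to gain the driver's attention
    return "white"        # safe zone

# Example: a lead vehicle 25 m ahead, closing at 36 km/h (10 m/s) -> TTC = 2.5 s -> "yellow".
print(alert_color(time_to_collision(25.0, 10.0)))
```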

Fig 13. Comparative analysis of processing time, FPS, distance(m)-TTC(s), and speed(km/h)-TTC(s) alerts on web, BDD100K, and self-collected diverse weather datasets for collision avoidance and driver assistance.

https://doi.org/10.1371/journal.pone.0342186.g013

Fig 14. Visual comparison of distance(m)-TTC(s) and speed(km/h)-TTC(s) estimations and alerts on web, BDD100K, and self-collected diverse weather datasets for collision avoidance and driver assistance.

https://doi.org/10.1371/journal.pone.0342186.g014

Conclusion, limitations and future works

Conclusion

Emerging human-centric and autonomous driving applications, such as Forward Collision Warning (FCW) and Rear-end Collision Warning (RCW) for Advanced Driver Assistance Systems (ADAS) and Collision Avoidance Systems (CAS), require efficient and cost-effective solutions. In this research work, a vision-based, robust, and computationally efficient vehicle detection, tracking, distance, speed, and Time-To-Collision (TTC) estimation system, Efficient and Cost-Effective Vehicle Detection and Tracking with Driver Assistance (ECE-VDTDA), is proposed for collision avoidance and driver assistance in foggy weather conditions. The vehicle detection performance of the baseline YOLO-V5s algorithm is enhanced by incorporating the SimSPPF module into the backbone and adopting the Wise Intersection Over Union (WIOU) localization loss function; these optimized design changes yield the proposed efficient and cost-effective vehicle detection algorithm, SimYOLO-V5s_WIOU. The performance of SimYOLO-V5s_WIOU is evaluated on the diverse FD, DAWN, and FC foggy weather datasets and compared with baseline YOLO-V5s and state-of-the-art methods, which it outperforms in both detection accuracy and speed. The vehicle tracking performance of baseline Deep-SORT, optimized Deep-SORT, and Strong-SORT is enhanced and evaluated in combination with the baseline YOLO-V5s and SimYOLO-V5s_WIOU detection algorithms on diverse BDD100K, web-collected, and self-collected foggy weather video datasets. The ECE-VDT module, comprising the SimYOLO-V5s_WIOU vehicle detection and optimized Deep-SORT vehicle tracking algorithms, provides a robust foundation for the collision avoidance and driver assistance module, which estimates distance, speed, and TTC against pre-defined thresholds and issues collision warnings/alerts. Distance and speed estimations are linked with TTC separately, and their performance is compared quantitatively and visually on the diverse video datasets. A detection is counted as a True Positive (TP) only when SimYOLO-V5s_WIOU localizes a vehicle with an IOU ≥ 0.5 and optimized Deep-SORT associates it with the correct track ID; such verified detections feed the distance, speed, and TTC calculations, while False Positives (a detection with no matching vehicle, an IOU < 0.5, or an identity mismatch) are excluded from the TTC computations and used only for evaluating precision, recall, and mAP. The overall functioning of the proposed ECE-VDTDA system establishes a strong foundation for FCW and RCW applications in ADAS and CAS systems.

Limitations

Several limitations are noted in the design and implementation of the proposed ECE-VDTDA system. Predefined threshold levels, scaling factors, and bounding box heights are the primary settings for the vision-based distance, speed, and TTC estimations in this research work. A formal probabilistic crash-risk estimate is not included in this study, as such an analysis requires advanced statistical frameworks, such as extreme value theory. Furthermore, real-world deployment parameters, such as camera installation and adjustment, mounting angle, Field of View (FOV), and Region of Interest (ROI), may need to be further considered and evaluated in the design and implementation of the proposed ECE-VDTDA system.
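For context, the sketch below shows the kind of bounding-box-height-based distance estimate that these predefined settings control, using the standard pinhole relation; the focal length and assumed vehicle height are illustrative values, not the system's calibrated parameters.

```python
def estimate_distance_m(bbox_height_px: float,
                        focal_length_px: float = 700.0,
                        vehicle_height_m: float = 1.5) -> float:
    """Pinhole-camera distance estimate from a vehicle's bounding-box height.

    distance ≈ focal_length_px * vehicle_height_m / bbox_height_px; the focal
    length and assumed real vehicle height act as the scaling factors noted above.
    """
    if bbox_height_px <= 0:
        return float("inf")
    return focal_length_px * vehicle_height_m / bbox_height_px

# Example: a 42-px-tall bounding box -> roughly 700 * 1.5 / 42 = 25 m.
print(estimate_distance_m(42))
```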

Future works

These highlighted limitations are the primary motivation for further fine-tuning the performance of the ECE-VDTDA system. In future work, actual camera parameters will also be incorporated for distance, speed, and TTC estimations. More quantitative observations and measurements will also be considered in the design of the ECE-VDTDA system.

Acknowledgments

We extend our sincere gratitude to all supervisors, colleagues, collaborators, and reviewers whose constructive insights and support contributed significantly to the completion of this research.

References

  1. Adam MA, Tapamo JR. Survey on image-based vehicle detection methods. World Electric Vehicle Journal. 2025;16(6):303.
  2. Li J, He Y, Li Y, Huang H, Wu D, Jin J. Vehicle real-time collision risk prediction: a multi-modal learning approach for diverse urban road scenarios based on a large-scale near-crash event dataset. Engineering Applications of Artificial Intelligence. 2025;157:111299.
  3. Raeesi H, Khosravi A, Sarhadi P. Collision avoidance for autonomous vehicles using reachability-based trajectory planning in highway driving. Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering. 2024;239(4):1003–20. https://doi.org/10.1177/09544070231222053
  4. Chandolu SB, Medidhi NM, Satyala C, Mohammad Z, Sharma AR, Venigalla PP. Computer vision-based multi-modal advanced driver assistance system. In: 2025 International Conference on Inventive Computation Technologies (ICICT). 2025. p. 638–45. https://doi.org/10.1109/icict64420.2025.11004869
  5. Ramesh G, K M KR, Devadiga MT, Manohara M, Boloor S, Sowjanya N, et al. A survey on vehicle collision avoidance systems: innovations, challenges, and future prospects. In: 2025 International Conference on Artificial Intelligence and Data Engineering (AIDE). 2025. p. 466–71.
  6. Lian H, Li M, Li T, Zhang Y, Shi Y, Fan Y, et al. Vehicle speed measurement method using monocular cameras. Sci Rep. 2025;15(1):2755. pmid:39843469
  7. Delmo JAB. Deep learning-based vehicle speed estimation in bidirectional traffic lanes. Procedia Computer Science. 2025;252:222–30.
  8. Yadav GK, Kancharla T, Nair S. Real time vehicle detection for rear and forward collision warning systems. In: International Conference on Advances in Computing and Communications. 2011. p. 368–77.
  9. Lee H, Choi S. Development of collision avoidance system in slippery road conditions. IEEE Trans Intell Transport Syst. 2022;23(10):19544–56.
  10. Hagenus J, Mathiesen FB, Schumann JF, Zgonnikov A. A survey on robustness in trajectory prediction for autonomous vehicles. arXiv preprint. 2024.
  11. Lin P, Javanmardi E, Nakazato J, Tsukada M. Occlusion-aware path planning for collision avoidance: leveraging potential field method with responsibility-sensitive safety. In: 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC). 2023. p. 2561–7. https://doi.org/10.1109/itsc57777.2023.10422621
  12. Qin P, Liu F, Guo Z, Li Z, Shang Y. Hierarchical collision-free trajectory planning for autonomous vehicles based on improved artificial potential field method. Transactions of the Institute of Measurement and Control. 2023;46(4):799–812.
  13. Hasain NM, Ahmed MA. Traffic safety evaluation using surrogate safety measures in the context of Indian mixed traffic: a critical review. IATSS Research. 2025;49(2):201–19.
  14. Wang W, Zhang L, Yan B, Cheng Y. Development of a surrogate safety measure for evaluating rear-end collision risk perception. Transportation Research Record. 2025.
  15. Tak S, Kim S, Lee D, Yeo H. A comparison analysis of surrogate safety measures with car-following perspectives for advanced driver assistance system. Journal of Advanced Transportation. 2018;2018:1–14.
  16. Kovaceva J, Murgovski N, Kulcsar B, Wymeersch H, Bargman J. Critical zones for comfortable collision avoidance with a leading vehicle. arXiv preprint. 2023. https://arxiv.org/abs/2303.14709
  17. Leonardi S, Distefano N. Exploring knowledge and perceptions of Advanced Driver Assistance Systems (ADAS): results of a southern Italian survey. Transportation Research Interdisciplinary Perspectives. 2025;31:101426.
  18. de Winkel KN, Christoph M. Rethinking advanced driver assistance system taxonomies: a framework and inventory of real-world safety performance. Transportation Research Interdisciplinary Perspectives. 2025;29:101336.
  19. Gulino M-S, Vichi G, Cecchetto F, Di Lillo L, Vangi D. A combined comfort and safety-based approach to assess the performance of advanced driver assistance functions. Eur Transp Res Rev. 2025;17(1).
  20. Ultralytics. YOLO-V5 network architecture. Accessed 2023. https://docs.ultralytics.com/yolov5/tutorials/architecture_description/#1-model-structure
  21. Ultralytics. GitHub repository. Accessed 2023. https://github.com/ultralytics/yolov5
  22. Tong Z, Chen Y, Xu Z, Yu R. Wise-IoU: bounding box regression loss with dynamic focusing mechanism. arXiv preprint arXiv:2301.10051. 2023.
  23. Wojke N, Bewley A, Paulus D. Simple online and realtime tracking with a deep association metric. In: 2017 IEEE International Conference on Image Processing (ICIP). 2017. p. 3645–9. https://doi.org/10.1109/icip.2017.8296962
  24. Real-time multi-object tracker using YOLOv5 and Deep SORT. [cited 2023 Jan 10]. https://github.com/mikel-brostrom/Yolov5-DeepSort-Pytorch
  25. Raza N, Habib MA, Imran HMS, Qayum A, Perveen S, Jabbar S, et al. An efficient and cost-effective vehicle detection and tracking system for collision avoidance in foggy weather. IEEE Access. 2025;13:126525–56.
  26. Du Y, Zhao Z, Song Y, Zhao Y, Su F, Gong T, et al. StrongSORT: make DeepSORT great again. IEEE Trans Multimedia. 2023;25:8725–37.
  27. StrongSORT. [cited 2024 Jan 10]. https://github.com/dyhBUPT/StrongSORT
  28. Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, et al. The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. p. 3213–23.
  29. Sakaridis C, Dai D, Van Gool L. Semantic foggy scene understanding with synthetic data. Int J Comput Vis. 2018;126(9):973–92.
  30. Kenk MA, Hassaballah M. DAWN: vehicle detection in adverse weather nature dataset. arXiv preprint. 2020. https://arxiv.org/abs/2008.05402
  31. DAWN Dataset. Mendeley Data, V3. [cited 2023]. doi:10.17632/766ygrbt8y.3
  32. Foggy Cityscapes Dataset. [cited 2024 Jan 10]. https://www.cityscapes-dataset.com/downloads/
  33. Yu F, Chen H, Wang X, Xian W, Chen Y, Liu F, et al. BDD100K: a diverse driving dataset for heterogeneous multitask learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020. p. 2636–45.
  34. Illinois Department of Transportation. Driving in foggy weather. https://www.youtube.com/watch?v=fPtVum5s72A
  35. Raza N, Razzaq A, Noman W, Rehman A, Soofi AA. A deep learning based real-time vehicle detection and collision avoidance system for low-visibility foggy environment. In: 4th International Conference on Communication, Computing and Digital Systems (C-CODE). 2025. p. 1–6.
  36. Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. p. 779–88.
  37. Miao G, Wang W, Tang J, Li F, Liang Y. Vehicle collision warning based on combination of the YOLO algorithm and the Kalman filter in the driving assistance system. Journal of Advanced Transportation. 2025;2025(1).
  38. Raza N, Habib MA, Ahmad M, Abbas Q, Aldajani MB, Latif MA. Efficient and cost-effective vehicle detection in foggy weather for edge/fog-enabled traffic surveillance and collision avoidance systems. Computers, Materials & Continua. 2024;81(1).
  39. Raza N, Ahmad M, Habib MA. Assessment of efficient and cost-effective vehicle detection in foggy weather. In: 2024 18th International Conference on Open Source Systems and Technologies (ICOSST). IEEE; 2024. p. 1–6. https://doi.org/10.1109/icosst64562.2024.10871157
  40. Raza N, Jabbar S, Han J, Han K. Social vehicle-to-everything (V2X) communication model for intelligent transportation systems based on 5G scenario. In: Proceedings of the 2nd International Conference on Future Networks and Distributed Systems. 2018. p. 1–8. https://doi.org/10.1145/3231053.3231120
  41. Raza S, Ahmed M, Ahmad H, Mirza MA, Habib MA, Wang S. Task offloading in mmWave based 5G vehicular cloud computing. J Ambient Intell Human Comput. 2022;14(9):12595–607.
  42. Khiem NM, Van Thanh T, Dung NH, Takahashi Y. A novel approach combining YOLO and DeepSORT for detecting and counting live fish in natural environments through video. PLoS One. 2025;20(6):e0323547. pmid:40498782
  43. Dou H, Chen S, Xu F, Liu Y, Zhao H. Analysis of vehicle and pedestrian detection effects of improved YOLOv8 model in drone-assisted urban traffic monitoring system. PLoS One. 2025;20(3):e0314817. pmid:40100905
  44. Liu Y, Zhou H, Zhao M. Research on target detection based on improved YOLOv7 in complex traffic scenarios. PLoS One. 2025;20(5):e0323410. pmid:40388486
  45. Luo Z, Bi Y, Yang X, Li Y, Yu S, Wu M, et al. Enhanced YOLOv5s + DeepSORT method for highway vehicle speed detection and multi-sensor verification. Front Phys. 2024;12.
  46. Khow ZJ, Tan Y-F, Karim HA, Rashid HAA. Improved YOLOv8 model for a comprehensive approach to object detection and distance estimation. IEEE Access. 2024;12:63754–67.
  47. Guerrero-Contreras G, Balderas-Díaz S, Díaz-Gomez A, Medina-Bulo I, Domínguez-Jiménez JJ. Cost-effective ADAS for inter-vehicle distance estimation using computer vision and deep learning. In: 2025 21st International Conference on Intelligent Environments (IE). IEEE; 2025. p. 1–8.
  48. Thombre A, Rai AK, Dumka L, Agarwal A. Object distance estimation from a single moving camera for advanced driver assistance system. In: 2024 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT). 2024. p. 1–6. https://doi.org/10.1109/conecct62155.2024.10677218
  49. Yu J, Jiang Y, Wang Z, Cao Z, Huang T. UnitBox: an advanced object detection network. In: Proceedings of the 24th ACM International Conference on Multimedia. 2016. p. 516–20.
  50. Zheng Z, Wang P, Liu W, Li J, Ye R, Ren D. Distance-IoU loss: faster and better learning for bounding box regression. AAAI. 2020;34(07):12993–3000.
  51. Zhang Y-F, Ren W, Zhang Z, Jia Z, Wang L, Tan T. Focal and efficient IOU loss for accurate bounding box regression. Neurocomputing. 2022;506:146–57.
  52. Gevorgyan Z. SIoU loss: more powerful learning for bounding box regression. arXiv preprint. 2022. https://arxiv.org/abs/2205.12740
  53. Bewley A, Ge Z, Ott L, Ramos F, Upcroft B. Simple online and realtime tracking. In: 2016 IEEE International Conference on Image Processing (ICIP). 2016. p. 3464–8. https://doi.org/10.1109/icip.2016.7533003
  54. Jurecki RS, Stańczyk TL, Jaśkiewicz MJ. Driver's reaction time in a simulated, complex road incident. Transport. 2014;32(1):44–54.
  55. Jurecki RS, Stańczyk TL. Analyzing driver response times for pedestrian intrusions in crash-imminent situations. In: 2018 XI International Science-Technical Conference Automotive Safety. 2018. p. 1–7.
  56. Kilicarslan M, Zheng JY. Predict vehicle collision by TTC from motion using a single video camera. IEEE Trans Intell Transport Syst. 2019;20(2):522–33.
  57. Shim J, Yu J, Lee K. Integrated risk grid map for collision avoidance and mitigation maneuvers of autonomous vehicle. IEEE Access. 2025.
  58. YOLO Label: image annotation tool in YOLO format. Accessed 2023. https://github.com/developer0hye/Yolo_Label
  59. Hou P, Chen L. Rain-fog detection via spatial adaptive deformable network with multi-scale feature preservation and task-aware dynamic calibration. Eng Res Express. 2025;7(3):035248.
  60. Cai J, Gao Y, Tang J. Robustness benchmark evaluation and optimization for real-time vehicle detection under multiple adverse conditions. Applied Sciences. 2025;15(9):4950.
  61. Bayramov E, Istenes Z. Weather-informed vision enhancement for autonomous vehicles in adverse conditions. Pollack. 2025;20(3):88–95.