Enhanced YOLO Algorithm for Robust Object Detection in Challenging Nighttime and Blurry, Low Vision

DOI: 10.4018/979-8-3693-0639-0.ch017

Abstract

In today's computer vision systems, object detection has seen explosive growth. Detecting objects under challenging conditions such as low illumination or misty nights remains difficult, especially for one-stage detectors, for which few improved solutions are available. The proposed approach builds on existing one-stage models and excels at detecting objects in partially visible and nighttime environments. It segments objects using bounding boxes and tracks them in motion pictures. To detect objects in a low-light environment, an RGB camera image is first enhanced: the low-illuminated input undergoes dehazing and grayscale-conversion techniques to produce a better-lit image, which is then processed by the popular one-stage object detection algorithm YOLOv8. Video input of fast-moving vehicles is also supported; YOLO-ODDT can efficiently handle frame rates ranging from 5 frames per second to 160 frames per second, and it surpasses renowned object detectors in both speed and accuracy.
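
The abstract names dehazing and grayscale conversion but not a specific algorithm. As a minimal sketch of that preprocessing idea, the OpenCV snippet below converts to grayscale and applies CLAHE (contrast-limited adaptive histogram equalization) as a simple stand-in for the dehazing step; the function name is illustrative, not from the chapter:

```python
import cv2

def enhance_low_light(image_bgr):
    """Grayscale conversion plus CLAHE as a simple dehazing stand-in."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(gray)
    # Replicate the single channel so the result can feed a 3-channel detector.
    return cv2.cvtColor(enhanced, cv2.COLOR_GRAY2BGR)
```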

Introduction

The advancement of object detection in computer vision systems owes much to the development of convolutional neural networks (CNNs). Integrating object detection techniques into computer vision applications demands accurate detection, yet most detection models still struggle under low-light, dark, and adverse weather conditions, where images often appear blurry and dark.

Object detectors fall into two primary categories: one-stage and two-stage models. One-stage models, exemplified by YOLOv4, MobileNet SSD, and SqueezeNet, balance efficiency and accuracy through a straightforward structure and swift operation, combining classification and regression in a single pass. Two-stage models such as R-CNN, Mask R-CNN, and Mask Refined R-CNN prioritize accuracy at the expense of processing speed. YOLOv8, the latest addition to the YOLO series, introduces a decoupled head that separates regression from classification, improving the balance between detection accuracy and efficiency over previous models.
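
For orientation (this is not the chapter's own code), a one-stage detector such as YOLOv8 can be exercised in a few lines with the ultralytics package; the weight file and image path below are placeholders:

```python
from ultralytics import YOLO

# Load a pretrained YOLOv8 model (nano variant shown; any YOLOv8 weight works).
model = YOLO("yolov8n.pt")

# A single forward pass performs both localization and classification.
results = model.predict("street_at_night.jpg", conf=0.25)

for r in results:
    for box in r.boxes:
        print(r.names[int(box.cls)], float(box.conf), box.xyxy.tolist())
```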

In short, although CNNs have advanced object detection, difficult imaging conditions remain a problem. One-stage models trade some accuracy for efficiency, two-stage models trade speed for accuracy, and YOLOv8's decoupled head seeks to optimize this balance.

This study refines the stem layer of YOLOv8 to address the difficulty of accurately detecting objects under blurry, poorly lit conditions, yielding the proposed YOLO-ODDT architecture. The modified stem layer is illustrated in Figure 1.
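
The preview does not specify the exact stem design, so the following PyTorch block is a hypothetical sketch of what a convolutional stem in this style might look like; the class name and channel sizes are assumptions, and the real YOLO-ODDT stem is the one defined by Figure 1:

```python
import torch
import torch.nn as nn

class EnhancedStem(nn.Module):
    """Hypothetical stem block: Conv-BN-SiLU with stride 2, halving spatial
    resolution while expanding channels. The actual YOLO-ODDT stem differs."""
    def __init__(self, in_channels: int = 3, out_channels: int = 32):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels,
                              kernel_size=3, stride=2, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.conv(x)))

# A 640x640 RGB input becomes a 32-channel 320x320 feature map.
features = EnhancedStem()(torch.randn(1, 3, 640, 640))
```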

The main contributions of the proposed method are as follows:

1. Precision enhancement of object segmentation within bounding boxes, particularly those with significant overlap. The model's efficacy has been evaluated across various datasets, including Nightjars, MS COCO, and ImageNet, demonstrating promising outcomes in detecting objects in low-light and blurred conditions.

2. In the second algorithmic phase, the system computes object distances and estimates real-time object counts from segmentation masks. Leveraging the masks, it precisely recognizes objects and promptly gauges their proximity to the observer, with significant potential for applications ranging from object tracking to autonomous navigation (a heuristic sketch follows this list).

3. In the third phase, the algorithm employs a specialized variant of non-maximum suppression (NMS) called Diag-NMS, which uses the diagonal length of bounding boxes to measure box similarity and control intersection over union (IoU). This tailored modification is more resilient, improving the differentiation and retention of bounding boxes for targets with substantial overlap. Diag-NMS thus yields a more precise and efficient detection mechanism, particularly when multiple targets closely coexist (see the second sketch after this list).
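
The chapter's distance formula is not reproduced in this preview. As a loose illustration of the second phase, this sketch counts detections from an ultralytics result object and ranks them by a crude, assumed proximity proxy (normalized box height, on the heuristic that nearer objects subtend taller boxes); both the function and the heuristic are illustrative assumptions:

```python
import numpy as np

def count_and_rank_by_proximity(result, frame_height):
    """Count detected objects and rank them nearest-first by a height proxy."""
    boxes = result.boxes.xyxy.cpu().numpy()   # (N, 4) as x1, y1, x2, y2
    count = len(boxes)
    # Assumed proxy: taller boxes (relative to the frame) are likely closer.
    proximity = (boxes[:, 3] - boxes[:, 1]) / frame_height
    order = np.argsort(-proximity)            # indices, nearest first
    return count, order, proximity
```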
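The exact Diag-NMS formulation is likewise not given here. One plausible reading, in the spirit of DIoU-NMS, penalizes each candidate pair's IoU by the squared ratio of the distance between box centers to the diagonal of the smallest box enclosing both, so heavily overlapping but spatially distinct boxes survive suppression. The NumPy sketch below implements that reading and should be taken as an assumption, not the chapter's code:

```python
import numpy as np

def diag_nms(boxes, scores, thresh=0.5):
    """Greedy NMS with a diagonal-based (DIoU-style) penalty.

    boxes: (N, 4) array of x1, y1, x2, y2; scores: (N,) confidences.
    """
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        if rest.size == 0:
            break
        # Plain IoU between the kept box and the remaining candidates.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        # Diagonal of the smallest enclosing box and center-to-center distance.
        diag2 = ((np.maximum(boxes[i, 2], boxes[rest, 2]) - np.minimum(boxes[i, 0], boxes[rest, 0])) ** 2
                 + (np.maximum(boxes[i, 3], boxes[rest, 3]) - np.minimum(boxes[i, 1], boxes[rest, 1])) ** 2)
        dist2 = (((boxes[i, 0] + boxes[i, 2]) - (boxes[rest, 0] + boxes[rest, 2])) ** 2
                 + ((boxes[i, 1] + boxes[i, 3]) - (boxes[rest, 1] + boxes[rest, 3])) ** 2) / 4.0
        penalized = iou - dist2 / np.maximum(diag2, 1e-9)
        order = rest[penalized <= thresh]
    return keep
```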

Figure 1.

Object detection model overview

