Manuscript Title:

VOICE-ASSISTED REAL-TIME OBJECT DETECTION USING THE YOLOV4-TINY ALGORITHM FOR THE VISUALLY CHALLENGED

Author:

ZAEEM NAZIR, MUHAMMAD WASEEM IQBAL, KHALID HAMID, HAFIZ ABDUL BASIT MUHAMMAD, M. ASHRAF NAZIR, QURRA-TUL-ANN, NAZIM HUSSAIN

DOI Number:

10.17605/OSF.IO/APQYH

Published: 2023-02-10

About the author(s)

1. ZAEEM NAZIR - Ph.D. Scholar, Department of Computer Science, Superior University, Lahore, Pakistan.
2. MUHAMMAD WASEEM IQBAL - Ph.D., Associate Professor, Department of Software Engineering, Superior University, Lahore, Pakistan.
3. KHALID HAMID - Ph.D. Scholar, Department of Computer Science, Superior University, Lahore, Pakistan, and Lecturer at NCBA & E University, East Canal Campus, Lahore.
4. HAFIZ ABDUL BASIT MUHAMMAD - Ph.D. Scholar, Department of Computer Science, Superior University, Lahore, Pakistan, and Lecturer at Minhaj University, Lahore.
5. M. ASHRAF NAZIR - Ph.D. Scholar, Department of Computer Science, Superior University, Lahore, Pakistan, and Lecturer at GC University, Lahore.
6. QURRA-TUL-ANN - Government College University, Lahore, Pakistan.
7. NAZIM HUSSAIN - Lecturer, Department of Computer Science, Government College University, Lahore, Pakistan.

Full Text : PDF

Abstract

Visual impairment is a growing problem worldwide. The World Health Organization estimates that 284 million individuals suffer from near or distance vision impairment. The goal of the proposed work is to create an Android application for blind persons that works with a smartphone and a white cane. The primary distinction between the proposed system and existing ones is the use of the state-of-the-art "You Only Look Once: Unified, Real-Time Object Detection" approach; YOLOv4-tiny runs roughly twice as fast as comparable algorithms. To recognise the objects in front of a visually impaired person in real time, the YOLOv4-tiny algorithm was trained on both a custom dataset and the COCO dataset. The system then determines how far each detected object is from the person and produces an audio output. The camera is initialised using the OpenCV library, after which it captures frames and feeds them to the system; the project is implemented in Python 3. The system employs the YOLOv4-tiny algorithm, trained on both the custom dataset and the COCO dataset, to recognise the objects in front of the user and estimate their distance. Text-to-speech conversion then turns each detection into an audio segment that tells the user the object's name and its distance from them. With this information, the user can visualise the objects around them, and the proposed method also helps the user avoid colliding with nearby objects, keeping them safe from harm. The full system is packaged as an Android application, in which the user can choose between an internal and an external camera: the internal camera is the one built into the Android phone, which is attached to the white cane, while the external camera is an ESP32-CAM. Objects are detected in the real-time video from the mobile phone camera or the ESP32-CAM. When the user starts the object-detection procedure, the application opens the camera and feeds in the live footage; the YOLOv4-tiny algorithm processes each frame, detecting objects and calculating their distance from the user. The audio system then converts each object's label and distance into speech, which the user hears through the smartphone's speaker.
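As a concrete illustration of the pipeline the abstract describes, the following is a minimal Python 3 sketch. It assumes the standard OpenCV DNN API for YOLOv4-tiny inference, a pinhole-camera model for distance estimation, and the pyttsx3 library for text-to-speech; the file names, class list, and calibration constants are illustrative placeholders, not the authors' actual artefacts.

    # Minimal sketch of the detect-then-speak pipeline, assuming OpenCV's DNN
    # module and pyttsx3 for TTS; calibration values below are hypothetical.
    import cv2
    import pyttsx3

    # Hypothetical pinhole-model calibration:
    # distance = (known object width * focal length in pixels) / detected pixel width
    KNOWN_WIDTH_CM = {"person": 45.0, "chair": 50.0}  # assumed average widths
    FOCAL_LENGTH_PX = 600.0                           # assumed focal length

    # Load YOLOv4-tiny weights (trained on COCO plus a custom dataset, per the paper)
    net = cv2.dnn.readNet("yolov4-tiny.weights", "yolov4-tiny.cfg")
    model = cv2.dnn_DetectionModel(net)
    model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

    with open("coco.names") as f:
        class_names = [line.strip() for line in f]

    tts = pyttsx3.init()
    # 0 = internal phone/laptop camera; an ESP32-CAM stream URL could be passed instead
    cap = cv2.VideoCapture(0)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        class_ids, scores, boxes = model.detect(frame, confThreshold=0.5, nmsThreshold=0.4)
        for class_id, score, (x, y, w, h) in zip(class_ids, scores, boxes):
            label = class_names[int(class_id)]
            # Pinhole-model distance estimate; falls back to a generic width
            known_w = KNOWN_WIDTH_CM.get(label, 40.0)
            distance_cm = (known_w * FOCAL_LENGTH_PX) / max(int(w), 1)
            tts.say(f"{label} at {distance_cm / 100:.1f} metres")
            tts.runAndWait()

    cap.release()

The pinhole model is one simple way to estimate distance from a single camera; the paper does not specify its exact distance-calculation method, so this part of the sketch should be read as an assumption rather than the authors' implementation.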


Keywords

Machine Learning, YOLOv4-Tiny, Visual Impairment, Real-Time Object Detection, Data Mining, Hypotenuse, STEM-Based Smartphone.