Manuscript Title:

CYBERBULLYING DETECTION FROM BANGLA TEXTS EXTRACTED IN SOCIAL MEDIA IMAGES USING MACHINE LEARNING ALGORITHMS

Authors:

MD. TOFAEL AHMED, SAMINA TASNIM SINTHY, PINTU CHANDRA PAUL, KASHMI SULTANA

DOI Number:

DOI:10.5281/zenodo.15515445

Published : 2025-05-23

About the author(s)

1. MD. TOFAEL AHMED - Department of Information and Communication Technology, Comilla University, Bangladesh.
2. SAMINA TASNIM SINTHY - Department of Information and Communication Technology, Comilla University, Bangladesh.
3. PINTU CHANDRA PAUL - Department of Information and Communication Technology, Comilla University, Bangladesh.
4. KASHMI SULTANA - Department of Information and Communication Technology, Comilla University, Bangladesh.

Abstract

As the number of social media users grows, so does the rate of cyberbullying in Bangladesh. Cyberbullying can take several forms, one of which is text embedded in images. To address this problem, we used optical character recognition (OCR) to extract text from images and then applied natural language processing (NLP) and machine learning to detect cyberbullying in the extracted text. We collected 1003 images from Facebook and extracted their textual data using OCR. The texts were then preprocessed, and features were extracted and used to train four machine learning algorithms. The performance analysis covered accuracy, precision, recall, F1-score, training and prediction time, and ROC area. On the test data, the Decision Tree algorithm achieved the highest accuracy at 94%, followed by Linear SVC (94%), Random Forest (90%), and KNN (88%).
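
To make the evaluation metrics named above concrete, the following is a minimal sketch of how accuracy, precision, recall, and F1-score could be computed for a binary bullying / non-bullying classifier. It is illustrative only and is not the authors' code: the function name `binary_metrics`, the label encoding (1 = bullying, 0 = not bullying), and the toy labels are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard binary classification metrics.

    Hypothetical helper: `positive` marks the bullying class
    (assumed here to be encoded as 1).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no true positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels, invented for illustration (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four metrics evaluate to 0.75. In practice, a library such as scikit-learn provides equivalent metric functions; the hand-written version here only shows the underlying arithmetic.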


Keywords

Cyberbullying, OCR, Machine Learning, Image, Social Media.