SPAM DETECTION BASED ON FUSION OF SPAMMER BEHAVIOR FEATURES AND LINGUISTIC FEATURES

AMNA IQBAL, MUHAMMAD YOUNAS, RAMZAN TALIB, MUHAMMAD MURAD KHAN, BUSHRA ZAFAR

DOI Number:

DOI:10.17605/OSF.IO/CVM6N

Published: 2023-07-23

About the author(s)

1. AMNA IQBAL - Ph.D. Candidate, Computer Science, Government College University Faisalabad, Pakistan.
2. MUHAMMAD YOUNAS - Assistant Professor, Computer Science Department, Government College University Faisalabad, Pakistan.
3. RAMZAN TALIB - Professor and Chairman of the Department of Computer Science, Government College University Faisalabad (GCUF), Pakistan.
4. MUHAMMAD MURAD KHAN - Assistant Professor, Department of Computer Science, Government College University Faisalabad, Pakistan.
5. BUSHRA ZAFAR - Assistant Professor, Department of Computer Science, Government College University Faisalabad (GCUF), Pakistan.

Abstract

E-commerce sites, forums, and blogs have become popular platforms for people to share their views. Reviews have emerged as a crucial source of information for potential customers, influencing their purchasing decisions. Spam reviews, however, are deliberately written to gain profit or fame or to defame businesses or individuals; this practice is known as review spamming. Spam review detection is increasingly addressed with machine learning (ML) techniques, but it is a more challenging task in multilingual communities. Spammer behavior features and linguistic features often exhibit complex relationships that influence the nature of spam reviews, and building a unified representation of these features is a further challenge. Various deep learning approaches have been proposed for spam review detection, including neural networks such as the Convolutional Neural Network (CNN). These methods specialize in extracting features but do not effectively capture the dependencies between features. This paper proposes Spam Review Detection using the Fusion Gradient Boosting Model (SRD-FGBM), which fuses spammer behavior features with linguistic features to automatically detect and classify spam reviews. Fusion enables the proposed model to learn the interactions between the features during training, allowing it to capture complex relationships and make predictions based on both types of features. The proposed model shows promising results, achieving 94.3% accuracy.
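The fusion idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SRD-FGBM implementation: the abstract does not list the feature set or the model configuration, so the feature names (reviews per day, rating deviation, word count, exclamation ratio, uppercase ratio), the toy reviews, and the helper functions below are all hypothetical. The sketch assumes early fusion by concatenating the two feature groups and uses a simplified gradient boosting loop over depth-1 decision stumps. Stumps keep the example short, but they yield an additive model; capturing interactions between behavior and linguistic features would require deeper trees.

```python
import math

def linguistic_features(text):
    """Toy linguistic features (hypothetical, not the paper's feature set)."""
    words = text.split()
    n = max(len(words), 1)
    exclaim_ratio = text.count("!") / n
    upper_ratio = sum(w.isupper() for w in words) / n
    return [float(len(words)), exclaim_ratio, upper_ratio]

def fuse(behavior, linguistic):
    """Early fusion (assumed): concatenate both feature groups into one vector."""
    return list(behavior) + list(linguistic)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Pick the single-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:] if best else None

def fit_gbm(X, y, rounds=50, lr=0.5):
    """Simplified gradient boosting on log-loss: each stump fits y - p."""
    p = sum(y) / len(y)
    f0 = math.log(p / (1 - p))  # initial log-odds; assumes both classes are present
    scores = [f0] * len(X)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature has two distinct values to split on
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        scores = [s + lr * (lm if row[j] <= t else rm) for s, row in zip(scores, X)]
    return f0, lr, stumps

def predict_proba(model, x):
    """Return the estimated probability that a fused feature vector is spam."""
    f0, lr, stumps = model
    score = f0 + sum(lr * (lm if x[j] <= t else rm) for j, t, lm, rm in stumps)
    return sigmoid(score)

# Hypothetical toy data: ([reviews_per_day, rating_deviation], review text, label)
data = [
    ([9.0, 2.5], "BEST product EVER!!! Buy NOW!!!", 1),
    ([7.0, 2.0], "AMAZING!!! Must buy!!!", 1),
    ([8.0, 1.8], "Great great great!!! TOP seller!!!", 1),
    ([1.0, 0.3], "The battery lasts about two days with normal use.", 0),
    ([0.5, 0.4], "Decent build quality, though the strap feels cheap.", 0),
    ([1.5, 0.2], "Arrived on time and works as described.", 0),
]
X = [fuse(behavior, linguistic_features(text)) for behavior, text, _ in data]
y = [label for _, _, label in data]
model = fit_gbm(X, y)

new_review = fuse([6.0, 2.2], linguistic_features("WOW!!! Best deal!!!"))
print(f"spam probability: {predict_proba(model, new_review):.3f}")
```

Because both feature groups sit in one fused vector, each boosting round is free to split on whichever feature, behavioral or linguistic, best reduces the remaining error. A production setup would more likely rely on an established gradient boosting library with deeper trees and a held-out test set.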

Keywords

Review Spamming, Linguistic Features, Spammer Behavior Features, Classification, Feature Engineering, SVM, Gradient Boosting Model (GBM), Fusion.