Manuscript Title:

PSAT-BASED SENTIMENT ANALYSIS: FOR TEXT AND DATA MINING

Author:

IRFAN ALI KANDHRO, FAYYAZ ALI, ASADULLAH KEHAR, RAJA SOHAIL AHMED LARIK, MUHAMMAD SAMEER AHMED SHEIKHA, JAVERIA TAUFIQ

DOI Number:

DOI:10.17605/OSF.IO/ABXN4

Published: 2022-04-23

About the author(s)

1. IRFAN ALI KANDHRO - Department of Computer Science, Sindh Madressatul Islam University, Karachi, Pakistan.
2. FAYYAZ ALI - Department of Software Engineering, Sir Syed University of Engineering and Technology, Karachi, Sindh, Pakistan.
3. ASADULLAH KEHAR - Institute of Computer Science, Shah Abdul Latif University, Khairpur, Sindh, Pakistan.
4. RAJA SOHAIL AHMED LARIK - School of Computer Science and Engineering, Nanjing University of Science & Technology, 210094, P.R. China.
5. MUHAMMAD SAMEER AHMED SHEIKHA - Department of Computer Science, Sindh Madressatul Islam University, Karachi, Pakistan.
6. JAVERIA TAUFIQ - Department of Computer Science, Sindh Madressatul Islam University, Karachi, Pakistan.


Abstract

In this paper, we present a preprocessing and sentiment analysis (SA) tool (PSAT) for analyzing the sentiments of the public, cricket audiences, former cricketers, and other sports personalities regarding New Zealand's abandoned tour of Pakistan, which was called off citing security concerns. The paper focuses on cleaning textual data and analyzing the sentiments it expresses. The data is collected from social media sites, including Facebook and Twitter. The user can choose a topic and specify their preferences; the model then uses recent related tweets to detect the polarity of the issue (negative, positive, both, or neutral) and displays the findings. Around 3000 Arabic tweets were randomly selected and evenly labelled to train the program. We offer a novel technique that applies a combination of parameters to the sentiment analysis of cricket-related tweets and comments. These parameters are (1) the time of the tweets, (2) preprocessing methods such as stemming and retweet handling, (3) whitespace removal, and (4) capitalization. The PSAT tool is combined with the Naive Bayes classifier, a family of classification algorithms based on Bayes' theorem. The PSAT tool achieves an accuracy of approximately 75% and an F1 score of 69%. According to our experiment, the Naive Bayes machine learning approach is the most accurate at predicting topic polarity. The tool is well suited to intermediate and advanced users and can assist them in determining the ideal parameter combinations for sentiment analysis.
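
The pipeline described in the abstract (text cleaning, a Naive Bayes classifier, and evaluation by accuracy and F1 score) can be illustrated with a minimal sketch. This is not the PSAT implementation: the function names, the crude suffix stripper standing in for a real stemmer, the Laplace smoothing, and the four example tweets are all assumptions made for illustration, and the toy data is invented rather than taken from the paper's dataset.

```python
import math
import re
from collections import Counter, defaultdict

def crude_stem(token):
    # Toy suffix stripper (assumption); a real stemmer would be used in practice.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    # Hypothetical cleaning steps loosely following the abstract.
    text = re.sub(r"^RT\s+", "", text)              # drop a leading retweet marker
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # drop URLs and mentions
    text = text.lower()                             # normalize capitalization
    text = re.sub(r"[^a-z\s]", " ", text)           # keep letters only
    return [crude_stem(t) for t in text.split()]    # split() collapses whitespace

class NaiveBayes:
    # Multinomial Naive Bayes with add-one (Laplace) smoothing.
    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)   # log prior
            total_words = sum(self.word_counts[label].values())
            for w in tokens:
                if w in self.vocab:                 # ignore unseen words
                    count = self.word_counts[label][w]
                    score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, positive):
    # F1 = harmonic mean of precision and recall for one class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented toy training data (not from the paper's dataset).
TRAIN = [
    ("RT @fan: Very disappointed, the tour was cancelled", "negative"),
    ("Sad and angry about the abandoned series", "negative"),
    ("Great support from the fans, proud of the team", "positive"),
    ("Happy to see cricket returning, great news", "positive"),
]
model = NaiveBayes().fit([preprocess(t) for t, _ in TRAIN], [l for _, l in TRAIN])
print(model.predict(preprocess("So disappointed and sad")))
```

With only two classes and four documents, the sketch shows the mechanics rather than realistic performance; the four polarity classes and the time-of-tweet parameter mentioned in the abstract are not modeled here.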


Keywords

Sentiment Analysis, Cleaning, Tool, PSAT, Naïve Bayes, Confusion Matrix, F1 Score.