Anomaly Detection in Social Media Conversation Using Natural Language Processing with LSTM and Naïve Bayes Models

Babatunde A.N., Shuaib B.M., Kadri A.F., Isiaka O.S., Abdulrahman T.A., Ismail S.I. & Oke A.A.

Date

2024

Publisher

Journal of the Faculty of Computational Sciences & Informatics, Academic City University College, Accra, Ghana

Abstract

Social media platforms are now widely used, but their growth has been accompanied by a rise in problematic and offensive material, with serious implications for online safety. To tackle this issue, this paper presents a hybrid method that combines Natural Language Processing (NLP) for data preprocessing with Long Short-Term Memory (LSTM) and Naïve Bayes (NB) models to detect anomalies in the conversational style of users on social media. NLP techniques were used to process raw tweet data, taken from a Kaggle dataset containing 24,783 instances with 6 features, into a format suitable for analysis. After data cleaning and transformation, the Naïve Bayes and LSTM models were trained to detect abnormal or problematic messages such as hate speech and offensive language. Naïve Bayes served as a fast, probabilistic baseline for text classification and was complemented by the LSTM model's ability to capture deep contextual understanding and the sequential patterns that occur in conversation. Model performance was evaluated with standard machine learning metrics. The hybrid LSTM-NB model gave the best result, with an accuracy of 99.2%, compared with test scores of 90% for NB and 95% for LSTM. This result highlights the benefits of combining NLP with both traditional machine learning and deep learning techniques for detecting anomalies in online conversations. The paper aims to contribute to the NLP and AI literature by offering an economical model for making digital communication safer.
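As a rough illustration of the Naïve Bayes baseline described in the abstract, the following minimal sketch cleans tweet text and trains a multinomial Naïve Bayes classifier with add-one smoothing. It is not the authors' implementation: the cleaning steps, the names `clean_tweet` and `NaiveBayesText`, the label names, and the toy messages are all assumptions made for illustration, and the LSTM and hybrid stages are not shown because the abstract does not specify how they are combined.

```python
import math
import re
from collections import Counter, defaultdict


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return text.split()


class NaiveBayesText:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = clean_tweet(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in clean_tweet(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                # Smoothed log likelihood of the token given the class.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy messages, not drawn from the paper's Kaggle dataset.
train_texts = [
    "I hate you, you are awful @user1",
    "you are awful and stupid",
    "Have a great day everyone! http://example.com",
    "what a great and lovely morning",
]
train_labels = ["offensive", "offensive", "normal", "normal"]

model = NaiveBayesText().fit(train_texts, train_labels)
print(model.predict("you are so awful"))
```

A classifier like this looks only at word frequencies and ignores word order, which is the gap the abstract says the LSTM is meant to fill by modelling sequential context.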

URI

https://kwasuspace.kwasu.edu.ng/handle/123456789/7417

Collections

Scholarly Publication
