Anomaly Detection in Social Media Conversation Using Natural Language Processing with LSTM and Naïve Bayes Models
Date
2024
Publisher
Journal of The Faculty of Computational Sciences & Informatics Academic City University College Accra, Ghana
Abstract
While social media platforms are now widely used, they have also seen a rise in problematic and
offensive material, which has serious implications for online safety. To tackle this issue,
this paper presents a hybrid method that combines Natural Language Processing (NLP) for data
preprocessing with Long Short-Term Memory (LSTM) and Naïve Bayes (NB) models to detect anomalies in
the conversational style of users on social media. NLP techniques were used to process raw tweet data,
taken from a Kaggle dataset containing 24,783 instances with 6 features, into a format suitable for
analysis. After data cleaning and transformation, Naïve Bayes and LSTM models were trained to detect
abnormal or problematic messages such as hate speech and offensive language. Naïve Bayes served as a
fast, probabilistic baseline for text classification, and was complemented by the LSTM model's ability
to capture deep contextual understanding and the sequential patterns that occur in conversation. The model’s
performance metrics followed standard machine learning practice, and the best result was
achieved by the hybrid LSTM-NB model, with an accuracy of 99.2%, while NB alone reached 90% and
LSTM alone 95% on the test set. This result highlights the benefits of combining NLP
with both traditional machine learning and deep learning techniques for
detecting anomalies in online conversations. This paper aims to contribute to the NLP and AI literature by
offering an economical model for making digital communication safer.
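
As an illustration of the first two stages described in the abstract (tweet cleaning followed by a Naïve Bayes baseline), the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the cleaning rules, the dataset's column names, or how the LSTM and NB outputs are combined, so the `clean_tweet` helper, the `NaiveBayes` class, the label names, and the four toy training sentences are all hypothetical and chosen only for demonstration. The LSTM and the hybrid step are deliberately left out.

```python
import math
import re
from collections import Counter, defaultdict


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, the RT marker, and non-letters.

    These cleaning rules are an assumption; the abstract does not list them.
    """
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)          # user mentions
    text = re.sub(r"\brt\b", " ", text)        # retweet marker
    text = re.sub(r"[^a-z\s]", " ", text)      # digits, punctuation, emoji
    return text.split()


class NaiveBayes:
    """Multinomial Naive Bayes over token lists, with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(count / n_docs)  # log prior
            for w in tokens:
                if w in self.vocab:  # words never seen in training are skipped
                    p = (self.word_counts[label][w] + 1) / (total + len(self.vocab))
                    score += math.log(p)  # log likelihood
            scores[label] = score
        return max(scores, key=scores.get)


# Toy data invented for this sketch; it is not drawn from the Kaggle dataset.
train = [
    ("I hate you, you are awful", "offensive"),
    ("you are a stupid idiot @bob", "offensive"),
    ("RT @amy: what a lovely day http://t.co/abc", "normal"),
    ("have a great day my friend", "normal"),
]
model = NaiveBayes().fit(
    [clean_tweet(text) for text, _ in train],
    [label for _, label in train],
)

print(model.predict(clean_tweet("You are AWFUL!!!")))
print(model.predict(clean_tweet("what a great day")))
```

With this toy training set the two calls classify the first message as `offensive` and the second as `normal`. A real baseline on the 24,783-instance dataset would also need a held-out test split before any accuracy figure could be compared with those reported above.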