Indian Journal of Engineering

Volume 20, Issue 53, January - June, 2023

Enhanced accuracy for SMS spam detection using One Dimensional Ternary Patterns (1D-TP) and firefly algorithm

Aiyeniko O¹, Aro TO²♦, Olukiran OA³, Alfa AA², Umoru LC², Owonipa A⁴

¹Department of Computer Sciences, Lagos State University, Lagos State, Nigeria
²Department of Computer Science, Confluence University of Science and Technology, Osara, Kogi State, Nigeria
³Department of Computer Engineering, Ladoke Akintola University, Ogbomoso, Oyo State, Nigeria
⁴Division of Computer Science and Mathematics, University of Stirling, United Kingdom

♦Corresponding author
Department of Computer Science, Confluence University of Science and Technology, Osara, Kogi State, Nigeria

ABSTRACT

The Short Message Service (SMS) is a widely used mobile communication channel; its appeal is attributable to factors such as easy delivery, low cost and convenience. However, unwanted messages, referred to as SMS spam, have been identified as one of the major problems for users and mobile service providers. This paper developed an SMS spam detection model by optimising the One-Dimensional Ternary Pattern (1D-TP) feature extraction algorithm with a robust optimisation algorithm known as the firefly algorithm. The model was implemented in a Python environment because of its strengths in data/text analysis and classification. The accuracy of the optimised 1D-TP was assessed using five selected learning algorithms: Artificial Neural Network (ANN), Decision Tree (C4.5), Naïve Bayes (NB), K-Nearest Neighbour (KNN) and Support Vector Machine (SVM). SMS spam detection accuracy was evaluated on three datasets: Kaggle SMS Spam, British English SMS Corpora and SMS Spam Corpus v0.1. Results showed the effectiveness of the firefly algorithm: the best accuracy, 93.94%, was recorded on the Kaggle SMS Spam dataset using the NB classifier with 𝛽 = 0 for upper features. On the other two datasets, the best accuracy was 92.96% on the British English SMS Corpora dataset using NB with 𝛽 = 1 for lower features, and 91.97% on the SMS Spam Corpus v0.1 dataset using NB with 𝛽 = 4 for upper features. The improvement was reflected in a reduction in the level of misclassification.

Keywords: Accuracy, Firefly Algorithm, SMS Spam, One Dimensional Ternary Pattern
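
To illustrate the upper and lower features and the threshold 𝛽 mentioned in the abstract, the following is a minimal Python sketch of one common formulation of 1D-TP, not the authors' implementation. It assumes each message is treated as a sequence of UTF-8 byte values, that each centre value is compared with four neighbours on either side, and that the resulting 8-bit upper and lower patterns are collected into histograms. The function name `one_d_ternary_patterns` and the parameter `half_window` are hypothetical, and the firefly optimisation step is not shown.

```python
from typing import List, Tuple


def one_d_ternary_patterns(
    text: str, beta: int = 1, half_window: int = 4
) -> Tuple[List[int], List[int]]:
    """Return (upper_hist, lower_hist) of 1D ternary patterns for a message.

    Assumed formulation: a neighbour contributes a 1 to the upper pattern
    when it exceeds centre + beta, and a 1 to the lower pattern when it is
    below centre - beta; values within the band contribute 0 to both.
    """
    codes = list(text.encode("utf-8"))      # message as a 1D integer signal
    n_bits = 2 * half_window                # one bit per neighbour
    upper_hist = [0] * (1 << n_bits)
    lower_hist = [0] * (1 << n_bits)

    # Slide over every position that has a full window on both sides.
    for i in range(half_window, len(codes) - half_window):
        centre = codes[i]
        neighbours = codes[i - half_window:i] + codes[i + 1:i + 1 + half_window]
        upper = 0
        lower = 0
        for value in neighbours:
            upper = (upper << 1) | int(value > centre + beta)
            lower = (lower << 1) | int(value < centre - beta)
        upper_hist[upper] += 1
        lower_hist[lower] += 1

    return upper_hist, lower_hist


# Example: histograms for one short message with beta = 0.
upper_features, lower_features = one_d_ternary_patterns("abcdefghi", beta=0)
```

In this sketch either histogram could serve as the feature vector passed to a classifier such as NB; a larger 𝛽 widens the band of neighbours treated as equal to the centre, which changes both histograms.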

Indian Journal of Engineering, 2023, 20(53), e4ije1004

DOI: https://doi.org/10.54905/disssi/v20i53/e4ije1004

Published: 3 March 2023