Indian Journal of Engineering


Volume 20, Issue 53, January - June, 2023

Enhanced accuracy for SMS spam detection using One-Dimensional Ternary Patterns (1D-TP) and firefly algorithm

Aiyeniko O1, Aro TO2♦, Olukiran OA3, Alfa AA2, Umoru LC2, Owonipa A4

1Department of Computer Sciences, Lagos State University, Lagos State, Nigeria
2Department of Computer Science, Confluence University of Science and Technology, Osara, Kogi State, Nigeria
3Department of Computer Engineering, Ladoke Akintola University, Ogbomoso, Oyo State, Nigeria
4Division of Computer Science and Mathematics, University of Stirling, United Kingdom

♦Corresponding author
Department of Computer Science, Confluence University of Science and Technology, Osara, Kogi State, Nigeria

ABSTRACT

The Short Message Service (SMS) is a widely used mobile communication channel; its popularity is attributable to factors such as easy delivery, low cost and convenient use. However, unwanted messages, referred to as SMS spam, have been identified as one of the major problems for users and mobile service providers. This paper develops an SMS spam detection model by optimizing the One-Dimensional Ternary Pattern (1D-TP) feature extraction algorithm with a robust optimization algorithm known as the firefly algorithm. The model was implemented in a Python environment because of its strong support for data/text analysis and classification. The accuracy of the optimized 1D-TP was evaluated using five selected learning algorithms: Artificial Neural Network (ANN), Decision Tree (C4.5), Naïve Bayes (NB), K Nearest Neighbour (KNN) and Support Vector Machine (SVM). SMS spam detection accuracy was evaluated on three datasets: Kaggle SMS Spam, British English SMS Corpora and SMS Spam Corpus v0.1. Results showed the effectiveness of the firefly algorithm: the best accuracy, 93.94%, was recorded on the Kaggle SMS Spam dataset using the NB classifier with 𝛽 = 0 for upper features. On the other two datasets, the best accuracy was 92.96% on British English SMS Corpora using NB with 𝛽 = 1 for lower features, and 91.97% on SMS Spam Corpus v0.1 using NB with 𝛽 = 4 for upper features. The improvement appeared in the output as a reduction in the level of misclassification.
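The abstract does not spell out how the 1D-TP features are computed, so the following is only a minimal sketch of one common ternary-pattern formulation, not the authors' implementation. It assumes each character is represented by its code point, that each character is compared with `p` neighbours on either side, and that a neighbour maps to +1 when it exceeds the centre by more than 𝛽, to -1 when it falls below the centre by more than 𝛽, and to 0 otherwise. The "upper" code keeps the +1 positions and the "lower" code keeps the -1 positions. The function names, the parameter `p`, and the strict inequalities are assumptions made for illustration; the firefly-based optimization step is not shown.

```python
from collections import Counter


def ternary_codes(text, p=4, beta=1):
    """Return (upper, lower) 1D-TP style codes for each interior character.

    Assumption: characters are compared by code point, with p neighbours
    taken on each side of the centre character.
    """
    values = [ord(ch) for ch in text]
    upper, lower = [], []
    for i in range(p, len(values) - p):
        center = values[i]
        neighbours = values[i - p:i] + values[i + 1:i + p + 1]
        up = low = 0
        for v in neighbours:
            diff = v - center
            # Upper pattern bit: neighbour is above the centre by more than beta.
            up = (up << 1) | (1 if diff > beta else 0)
            # Lower pattern bit: neighbour is below the centre by more than beta.
            low = (low << 1) | (1 if diff < -beta else 0)
        upper.append(up)
        lower.append(low)
    return upper, lower


def histogram(codes, p=4):
    """Count how often each of the 2**(2*p) possible codes occurs."""
    counts = Counter(codes)
    return [counts.get(k, 0) for k in range(2 ** (2 * p))]


if __name__ == "__main__":
    message = "WIN a FREE prize now!!!"  # made-up example message
    up, low = ternary_codes(message, p=4, beta=1)
    upper_features = histogram(up, p=4)
    lower_features = histogram(low, p=4)
    print(len(upper_features), len(lower_features))
```

Under these assumptions, either histogram (upper or lower) could serve as the fixed-length feature vector passed to a classifier such as NB, and 𝛽 is the kind of parameter the abstract reports as tuned per dataset.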

Keywords: Accuracy, Firefly Algorithm, SMS Spam, One Dimensional Ternary Pattern

Indian Journal of Engineering, 2023, 20(53), e4ije1004
DOI: https://doi.org/10.54905/disssi/v20i53/e4ije1004

Published: 3 March 2023


© The Author(s) 2023. Open Access. This article is licensed under a Creative Commons Attribution License 4.0 (CC BY 4.0).