The Short Message Service (SMS) is a widely used mobile communication channel; its appeal stems from factors such as easy delivery, low cost and convenient use. However, unwanted messages, referred to as SMS spam, have been identified as one of the major problems for users and mobile service providers. This paper develops an SMS spam detection model by optimising the One-Dimensional Ternary Pattern (1D-TP) feature extraction algorithm with a robust optimisation algorithm known as the firefly algorithm. The model was implemented in Python because of its support for data/text analysis and classification. The accuracy of the optimised 1D-TP was evaluated using five learning algorithms: Artificial Neural Network (ANN), Decision Tree (C4.5), Naïve Bayes (NB), K-Nearest Neighbour (KNN) and Support Vector Machine (SVM). SMS spam detection accuracy was evaluated on three datasets: Kaggle SMS Spam, British English SMS Corpora and SMS Spam Corpus v0.1. The results show the effectiveness of the firefly algorithm. The best accuracy, 93.94%, was recorded on the Kaggle SMS Spam dataset using the NB classifier with 𝛽 = 0 for upper features. On the other two datasets, the best accuracy was 92.96% on the British English SMS Corpora dataset using NB with 𝛽 = 1 for lower features, and 91.97% on the SMS Spam Corpus v0.1 dataset using NB with 𝛽 = 4 for upper features. The improvement is reflected in a reduction in the level of misclassification.
Keywords: Accuracy, Firefly Algorithm, SMS Spam, One-Dimensional Ternary Pattern