A Hybrid Spam Detection Method Based on Unstructured Datasets

Olga Angelopoulou, Shao Y, Trovati Marcello, Shi Q, Asimakopoulou E, Bessis Nik

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)
75 Downloads (Pure)


The identification of non-genuine or malicious messages poses a variety of challenges due to the continuous changes in the techniques utilised by cyber-criminals. In this article, we propose a hybrid detection method that combines image and text spam recognition techniques. The image component is based on sparse representation-based classification, which draws on both global and local image features, together with a dictionary learning technique that yields separate spam and ham sub-dictionaries. The textual component analyses the semantic properties of documents to assess their level of maliciousness, which allows us to distinguish between meta-spam and real spam. Experimental results show the accuracy and potential of our approach.
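As a rough illustration of the sparse representation-based classification idea mentioned in the abstract, the sketch below codes a feature vector over a concatenated spam/ham dictionary and assigns the class whose sub-dictionary gives the smaller reconstruction residual. It is not the authors' implementation: the dictionary learning step and the global and local image features are not reproduced, the sparse solver is a plain orthogonal matching pursuit chosen here for brevity, and the names `omp`, `src_classify`, `D_spam` and `D_ham` are hypothetical.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit (assumed solver, not from the paper).

    Approximates y using at most n_nonzero columns (atoms) of D.
    """
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    x = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    x[support] = coef
    return x


def src_classify(D_spam, D_ham, y, n_nonzero=3):
    """Label y by the sub-dictionary with the smaller reconstruction residual."""
    D = np.hstack([D_spam, D_ham])
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    x = omp(D, y, n_nonzero)
    n_spam = D_spam.shape[1]
    # Reconstruct y using only the coefficients belonging to each class.
    r_spam = np.linalg.norm(y - D[:, :n_spam] @ x[:n_spam])
    r_ham = np.linalg.norm(y - D[:, n_spam:] @ x[n_spam:])
    return "spam" if r_spam < r_ham else "ham"


# Toy dictionaries (illustrative only): spam atoms span the first three
# coordinates, ham atoms span the last three.
D_spam = np.eye(6)[:, :3]
D_ham = np.eye(6)[:, 3:]
sample = np.array([2.0, 0.0, 0.5, 0.0, 0.0, 0.0])
print(src_classify(D_spam, D_ham, sample))  # prints "spam"
```

In this toy setting the sample lies entirely in the span of the spam atoms, so its spam residual is zero. With learned dictionaries the two residuals would both be non-zero and the comparison between them carries the decision.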
Original language: English
Pages (from-to): 233-243
Number of pages: 11
Journal: Soft Computing
Issue number: 1
Early online date: 21 Dec 2015
Publication status: Published - 1 Jan 2017


  • Image spam, Text spam, Semantic networks, Classification, Subclass Discriminant Analysis, Feature Selection, Sparse Representation

