Towards effective malware clustering: reducing false negatives through feature weighting and the Lp metric

Renato Cordeiro De Amorim, Peter Komisarczuk

    Research output: Chapter in Book/Report/Conference proceeding > Chapter (peer-reviewed) > peer-review

    Abstract

    In this paper we present a novel method to reduce the incidence of false negatives in the clustering of malware detected during drive-by-download attacks. Our method uses a high-interaction client honeypot, Capture-HPC, to acquire behavioural system and network data, and then applies cluster analysis to that data. It addresses several issues in clustering: (i) finding the number of clusters in the dataset, (ii) finding good initial centroids, and (iii) determining the relevance of each feature at each cluster. Our method applies partitional clustering based on Minkowski Weighted K-Means (using the Lp metric) and anomalous pattern initialization. We have performed various experiments on a dataset containing the behaviour of 17,000 possibly infected websites gathered from sources of malicious URLs. We find that our method produces a smaller within-cluster variance and fewer false negatives than other popular clustering algorithms such as K-Means and Ward's method.
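
To illustrate item (iii), the sketch below shows one common formulation of the cluster-specific feature weighting used in Minkowski Weighted K-Means. It is not taken from the chapter: the function names, the `eps` smoothing constant, and the toy data are assumptions made for illustration, and the exact variant used by the authors may differ. A feature with low dispersion inside a cluster receives a high weight for that cluster.

```python
import numpy as np

def minkowski_weighted_distance(x, centroid, weights, p):
    """p-th power of the weighted Minkowski distance:
    sum over features v of w_v^p * |x_v - c_v|^p."""
    return float(np.sum((weights ** p) * np.abs(x - centroid) ** p))

def update_feature_weights(points, centroid, p, eps=1e-6):
    """Feature weights for a single cluster (assumes p > 1).

    D_v = sum_i |x_iv - c_v|^p is the dispersion of feature v in the cluster;
    eps (an assumed smoothing constant) avoids division by zero.
    w_v = 1 / sum_u (D_v / D_u)^(1 / (p - 1)), so the weights sum to 1.
    """
    dispersion = np.sum(np.abs(points - centroid) ** p, axis=0) + eps
    ratios = (dispersion[:, None] / dispersion[None, :]) ** (1.0 / (p - 1.0))
    return 1.0 / ratios.sum(axis=1)

# Toy cluster (hypothetical data): feature 0 varies widely, feature 1 barely.
points = np.array([[0.0, 0.0], [2.0, 0.2]])
centroid = points.mean(axis=0)  # for p = 2 the Minkowski centre is the mean
weights = update_feature_weights(points, centroid, p=2.0)
```

In a full clustering loop, these weights would be recomputed for every cluster after each assignment step, and the distance function above would replace the squared Euclidean distance of standard K-Means.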
    Original language: English
    Title of host publication: Case Studies in Secure Computing
    Subtitle of host publication: Achievements and Trends
    Editors: Biju Issac, Nauman Israr
    Publisher: CRC Press
    Pages: 295-310
    ISBN (Print): 9781482207064
    Publication status: Published - Sept 2014
