University of Hertfordshire

Partitional Clustering of Malware Using K-Means

Research output: Chapter in Book/Report/Conference proceeding, Chapter (peer-reviewed)

  • Renato Cordeiro De Amorim
  • Peter Komisarczuk
Original language: English
Title of host publication: Cyberpatterns
Subtitle of host publication: Unifying Design Patterns with Security and Attack Patterns
Publisher: Springer
Pages: 223-233
ISBN (Electronic): 978-3-319-04447-7
ISBN (Print): 978-3-319-04446-0
Publication status: Published - May 2014

Abstract

This paper describes a novel method for clustering datasets that contain malware behavioural data. Our method transforms the data into a standardised data matrix that can be used with any clustering algorithm, finds the number of clusters in the data set, and includes an optional visualization step for high-dimensional data using principal component analysis. Our clustering method deals well with categorical data, and it is able to cluster the behavioural data of 17,000 websites, acquired with Capture-HPC, in less than 2 min.
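
The first part of the pipeline summarised above (categorical records turned into a standardised numeric matrix, then partitioned with K-Means) might look roughly like the sketch below. It is an illustration under stated assumptions, not the chapter's implementation: the behaviour labels such as "write" and "reg_set" are hypothetical, the standardisation is assumed to be one-hot encoding followed by centring each column on its mean, the farthest-point initialisation is a substitute chosen only to keep the example deterministic, and the number of clusters is fixed by hand. The cluster-count estimation and the PCA visualization step are not shown.

```python
def one_hot_standardise(records):
    """Turn rows of categorical values into a centred numeric matrix."""
    # One binary column per (feature position, category) pair seen in the data.
    columns = sorted({(j, v) for row in records for j, v in enumerate(row)})
    matrix = [[1.0 if row[j] == v else 0.0 for (j, v) in columns]
              for row in records]
    # Assumed standardisation: subtract each column's mean.
    n = len(matrix)
    means = [sum(row[c] for row in matrix) / n for c in range(len(columns))]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(data, k, max_iter=100):
    """Plain K-Means with a deterministic farthest-point initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from its nearest already-chosen centroid (an assumption made
    # for reproducibility, not taken from the chapter).
    centroids = [list(data[0])]
    while len(centroids) < k:
        far = max(data, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in data]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(data, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical behavioural records: (file activity, registry/process activity).
records = [
    ["write", "reg_set"], ["write", "reg_set"], ["write", "reg_del"],
    ["read", "none"], ["read", "none"], ["read", "proc_new"],
]
matrix = one_hot_standardise(records)
labels, centroids = k_means(matrix, k=2)
```

On this toy input the three "write" records and the three "read" records end up in separate clusters. Because the output of `one_hot_standardise` is an ordinary numeric matrix, any other clustering algorithm could be applied to it in place of `k_means`.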

ID: 9822819