Units of software are represented as points in a multidimensional space by calculating 12 measures of software complexity for each unit. Two large sets of commercial software are thereby represented as 2236 and 4456 12-dimensional vectors respectively. These two sets of vectors are then clustered by a variety of competitive neural networks. It is found that the software does not fall into any simple set of clusters, but that a complex pattern of clustering emerges. These clusters give a view of the structural similarity of units of code in the data sets.
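The general idea of clustering 12-dimensional metric vectors with a competitive network can be illustrated with a minimal sketch. The abstract does not say which competitive networks or which 12 complexity measures were used, so the code below assumes the simplest variant, plain winner-take-all learning with Euclidean distance, and runs on synthetic data rather than the commercial data sets. All function names, parameters, and values here are hypothetical and chosen only for illustration.

```python
import random


def winner(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean distance)."""
    return min(
        range(len(weights)),
        key=lambda k: sum((w - xi) ** 2 for w, xi in zip(weights[k], x)),
    )


def train_competitive(vectors, n_units, epochs=20, lr=0.1, seed=0):
    """Winner-take-all competitive learning (an assumed stand-in, not the report's networks)."""
    rng = random.Random(seed)
    # Initialise each unit's weights from a randomly chosen input vector.
    weights = [list(v) for v in rng.sample(vectors, n_units)]
    order = list(range(len(vectors)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x = vectors[i]
            w = weights[winner(weights, x)]
            # Only the winning unit moves, a fraction lr of the way toward the input.
            for d in range(len(x)):
                w[d] += lr * (x[d] - w[d])
    return weights


def assign(weights, vectors):
    """Label each vector with the index of its winning unit."""
    return [winner(weights, x) for x in vectors]


# Synthetic stand-in for the metric vectors: two well-separated groups of
# 50 vectors each, with 12 made-up "complexity measures" per software unit.
_rng = random.Random(1)
data = [[_rng.uniform(-0.5, 0.5) for _ in range(12)] for _ in range(50)]
data += [[10.0 + _rng.uniform(-0.5, 0.5) for _ in range(12)] for _ in range(50)]

weights = train_competitive(data, n_units=2)
labels = assign(weights, data)
print("cluster sizes:", [labels.count(k) for k in range(2)])
```

On real metric vectors the measures would normally be rescaled to comparable ranges first, since a Euclidean winner rule is otherwise dominated by whichever measure has the largest numeric range. The number of units is also a free choice here, and the abstract's finding that no simple set of clusters emerges suggests that no single value would be obviously right.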
|Name||UH Computer Science Technical Report|
|Publisher||University of Hertfordshire|