A text annotation method based on semantic sequences

J. Bao, C. Lyon, P.C.R. Lane

    This paper presents a text annotation method based on semantic sequences for labelling a single document or a cluster of documents. The basic idea underlying the semantic sequence approach is to find locally frequent meanings, using an ontology such as WordNet, and to use them as the labels of a document. The ontology is also used to measure the semantic similarity of labels, which in turn indicates the similarity between documents. Further, a text clustering method based on four natural rules is introduced to cluster documents and label each cluster. This method does not require a pre-defined number of clusters, as partitioning clustering methods do, and avoids the need to set appropriate levels, as hierarchical clustering methods require.
    Original language: English
    Journal: Proceedings of the Seventh International Workshop on Computational Semantics
    Publication status: Published - 2007


    • semantic sequences
    • text annotation
    • WordNet
    • clustering
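
The idea of "locally frequent meanings" described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a hypothetical toy word-to-concept table (`TOY_ONTOLOGY`) in place of WordNet, a fixed sliding window, and a plain Jaccard overlap in place of the ontology-based label similarity; the function names and parameters are illustrative only.

```python
from collections import Counter

# Hypothetical toy ontology mapping words to a single concept.
# It stands in for WordNet lookups and is an assumption of this sketch.
TOY_ONTOLOGY = {
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
    "dog": "animal", "cat": "animal", "horse": "animal",
}


def locally_frequent_meanings(tokens, window=5, min_count=2):
    """Return concepts seen at least `min_count` times within some
    run of `window` consecutive tokens."""
    concepts = [TOY_ONTOLOGY.get(t.lower()) for t in tokens]
    labels = set()
    # Slide the window over the text; short texts get a single window.
    for start in range(max(1, len(concepts) - window + 1)):
        counts = Counter(
            c for c in concepts[start:start + window] if c is not None
        )
        labels.update(c for c, n in counts.items() if n >= min_count)
    return labels


def label_similarity(a, b):
    """Jaccard overlap of two label sets (a simple stand-in for an
    ontology-based similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


doc1 = "the car passed a truck near the dog".split()
doc2 = "a bus and a car".split()
doc3 = "dog chased cat".split()

labels1 = locally_frequent_meanings(doc1)
labels2 = locally_frequent_meanings(doc2)
labels3 = locally_frequent_meanings(doc3)
```

In this sketch, documents whose label sets overlap score as more similar, so `doc1` and `doc2` (both about vehicles) would be grouped before either is grouped with `doc3`. A fuller version would replace the toy table and the Jaccard measure with WordNet lookups and an ontology-based similarity.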


