Using sociolinguistic inspired features for gender classification of web authors

Vasiliki Simaki, Christina Aravantinou, Iosif Mporas, Vasileios Megalooikonomou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

In this article we present a methodology for classification of text from web authors, using sociolinguistic-inspired text features. The proposed methodology uses a baseline feature set based on text mining, which is combined with text features that quantify results from theoretical and sociolinguistic studies. Two combination approaches were evaluated, and the results indicated a significant improvement in both cases. The best-performing combination approach reached an accuracy of 84.36%, measured as the percentage of correctly classified web posts.
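The abstract does not say what the two combination approaches are, so the following is only a minimal sketch under stated assumptions. It illustrates one common way to combine feature sets (concatenating a baseline vector with a sociolinguistic one) and the accuracy measure as described, the percentage of correctly classified posts. The function names, feature values, and labels are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def combine_features(baseline: Sequence[float], socio: Sequence[float]) -> List[float]:
    """Concatenate two feature vectors for one post.

    Assumption: feature-level concatenation is used here purely for
    illustration; the paper's actual combination approaches are not
    described in the abstract.
    """
    return list(baseline) + list(socio)


def accuracy_percent(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Return the percentage of posts whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one post is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical toy data: two baseline features plus one sociolinguistic feature.
combined = combine_features([0.12, 0.40], [0.75])

# Hypothetical labels for four posts; three of the four predictions match.
score = accuracy_percent(["f", "m", "f", "m"], ["f", "m", "m", "m"])
```

Under this definition, the reported 84.36% means that roughly 84 of every 100 evaluated web posts received the correct label.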

Original language: English
Title of host publication: Text, Speech, and Dialogue - 18th International Conference, TSD 2015, Proceedings
Editors: Václav Matoušek, Pavel Král
Publisher: Springer Nature
Pages: 587-594
Number of pages: 8
ISBN (Print): 9783319240329
Publication status: Published - 1 Jan 2015
Externally published: Yes
Event: 18th International Conference on Text, Speech and Dialogue, TSD 2015 - Pilsen, Czech Republic
Duration: 14 Sept 2015 to 17 Sept 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9302
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Text, Speech and Dialogue, TSD 2015
Country/Territory: Czech Republic
City: Pilsen
Period: 14/09/15 to 17/09/15

Keywords

  • Gender identification
  • Sociolinguistics
  • Text classification algorithms
