18th Annual Computer Security Applications Conference
December 9-13, 2002
Las Vegas, Nevada

Gender-Preferential Text Mining of E-mail Discourse

Malcolm Corney, Alison Anderson and George Mohay
Queensland University of Technology

Olivier de Vel
Defence Science and Technology Organisation

This paper investigates mining e-mail text documents to attribute the gender of their author. We used an extended set of e-mail document features that are predominantly independent of topic content, such as style markers, structural characteristics and gender-preferential language features, together with a Support Vector Machine learning algorithm. Experiments using a corpus of e-mail documents generated by a large number of authors of both genders gave promising results for author gender categorisation.
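
As a rough illustration of the three feature families named above, the following sketch computes a handful of topic-independent features from a single e-mail body. It is not the paper's feature set: the function name `extract_features`, the suffix list, the greeting pattern and the apology pattern are all hypothetical choices made for this example. In a full pipeline, vectors like these would be passed to a Support Vector Machine classifier.

```python
import re

# Hypothetical suffix list, chosen for illustration only (not from the paper).
ADJ_ADV_SUFFIXES = ("able", "ful", "ive", "less", "ly", "ous")


def extract_features(body: str) -> dict[str, float]:
    """Return a few topic-independent features for one e-mail body."""
    lines = body.splitlines()
    words = re.findall(r"[A-Za-z']+", body)
    n_words = max(len(words), 1)  # avoid division by zero on empty input
    n_lines = max(len(lines), 1)
    return {
        # Style markers: how the author writes, regardless of topic.
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "short_word_ratio": sum(len(w) <= 3 for w in words) / n_words,
        # Structural characteristics: layout of the message.
        "blank_line_ratio": sum(not ln.strip() for ln in lines) / n_lines,
        "has_greeting": float(
            bool(re.match(r"\s*(hi|hello|dear)\b", body, re.I))
        ),
        # Assumed proxies for gender-preferential language.
        "suffix_word_ratio": sum(
            w.lower().endswith(ADJ_ADV_SUFFIXES) for w in words
        ) / n_words,
        "apology_count": float(
            len(re.findall(r"\b(sorry|apolog\w*)\b", body, re.I))
        ),
    }
```

Each feature is normalised by word or line count where that makes sense, so messages of different lengths remain comparable.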

Keywords: computer forensics, authorship attribution, email, data mining
