Automatic Lyrics-based Music Genre Classification in a Multilingual Setting

Howard, Sam, Silla Jr, Carlos N., Johnson, Colin G. (2011) Automatic Lyrics-based Music Genre Classification in a Multilingual Setting. In: Proceedings of the Thirteenth Brazilian Symposium on Computer Music. (KAR id:33266)


A large amount of research has been undertaken on the classification of lyrics into genres, but most of this work has featured solely English lyrics. This study investigates the implications of classifying a multilingual database and the effectiveness of a number of techniques and algorithms for doing so. Part of this involves the creation of a high-quality dataset for use in this research. This paper finds that there are significant challenges in preprocessing multilingual text, and that traditional techniques such as stemming and stop-word removal may actually do more harm than good in such circumstances. It also finds that classes with a strong language bias may be classified more accurately than those spanning multiple languages.
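
One plausible way stop-word removal can hurt multilingual text is sketched below. This is an illustration only, not the paper's experiment: the stop-word lists are tiny, hand-picked, hypothetical examples, and the function name `remove_stop_words` is an assumption made for this sketch. It shows that when stop lists from several languages are merged, a function word in one language (German "die", Spanish "sin") can coincide with a content word in another.

```python
# Tiny, hand-picked stop-word lists for illustration only
# (hypothetical; not the lists used in the paper).
STOP_WORDS = {
    "en": {"the", "to", "in", "a", "i", "for", "my"},
    "de": {"die", "der", "und", "in"},
    "es": {"sin", "el", "y", "son"},
}


def remove_stop_words(text, languages):
    """Drop every token found in the stop list of any of the given languages."""
    stop = set().union(*(STOP_WORDS[lang] for lang in languages))
    return [tok for tok in text.lower().split() if tok not in stop]


english_lyric = "born to sin ready to die"

# Filtering with the English list alone keeps the content words.
english_only = remove_stop_words(english_lyric, ["en"])

# Filtering with the merged list also deletes "sin" and "die",
# which are function words in Spanish and German respectively.
combined = remove_stop_words(english_lyric, ["en", "de", "es"])

print(english_only)  # ['born', 'sin', 'ready', 'die']
print(combined)      # ['born', 'ready']
```

Applying a separate list per language would avoid this collision, but it assumes the language of each lyric is known in advance, which may not hold for a mixed-language collection.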

Item Type: Conference or workshop item (Paper)
Subjects: Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming
Divisions: Faculties > Sciences > School of Computing
Depositing User: Colin Johnson
Date Deposited: 21 Feb 2013 18:54 UTC
Last Modified: 01 Aug 2019 10:35 UTC

