Automatic Lyrics-based Music Genre Classification in a Multilingual Setting

Howard, Sam, Silla Jr, Carlos N., Johnson, Colin G. (2011) Automatic Lyrics-based Music Genre Classification in a Multilingual Setting. In: Proceedings of the Thirteenth Brazilian Symposium on Computer Music. (KAR id:33266)

Language: English

Considerable research has addressed the classification of lyrics into genres, but most of it has used only English lyrics. This study investigates the implications of classifying a multilingual lyrics database and the effectiveness of a number of techniques and algorithms for doing so. Part of this work involves creating a high-quality dataset for use in the research. The paper finds that preprocessing multilingual text poses significant challenges, and that traditional techniques such as stemming and stop-word removal may do more harm than good in such circumstances. It also finds that classes with a strong language bias may be classified more accurately than classes containing multiple languages.

Item Type: Conference or workshop item (Paper)
Subjects: Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming,
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Colin Johnson
Date Deposited: 21 Feb 2013 18:54 UTC
Last Modified: 16 Nov 2021 10:10 UTC