Kent Academic Repository

Is GPT-4 good enough to evaluate jokes?

Goes, Fabricio, Sawicki, Piotr, Grześ, Marek, Brown, Dan, Volpe, Marco (2023) Is GPT-4 good enough to evaluate jokes? In: International Conference for Computational Creativity, Waterloo, Canada (In press) (KAR id:101552)

Abstract

In this paper, we investigate the ability of large language models (LLMs), specifically GPT-4, to assess the funniness of jokes in comparison to human ratings. We use a dataset of jokes annotated with human ratings and explore different system descriptions in GPT-4 to imitate human judges with various types of humour. We propose a novel method to create a system description using many-shot prompting, providing numerous examples of jokes and their evaluation scores. Additionally, we examine the performance of different system descriptions when given varying amounts of instructions and examples on how to evaluate jokes. Our main contributions include a new method for creating a system description in LLMs to evaluate jokes and a comprehensive methodology to assess LLMs' ability to evaluate jokes using rankings rather than individual scores.
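To illustrate what comparing rankings rather than individual scores can look like, here is a minimal Python sketch. The abstract does not name the ranking metric the authors use, so Spearman's rank correlation is an assumption made purely for illustration, and the function names `average_ranks` and `spearman` are hypothetical.

```python
from typing import Sequence


def average_ranks(scores: Sequence[float]) -> list[float]:
    """Rank scores from 1 (lowest) upward; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        # Extend the run [start, end] over all items tied with order[start].
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks.

    Assumption: this metric is chosen for illustration only; it is not
    confirmed to be the one used in the paper.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / (var_a * var_b) ** 0.5
```

Called as `spearman(human_scores, model_scores)` on two lists of ratings for the same jokes, this returns a value between -1 and 1. A rank-based measure of this kind rewards a model for ordering jokes the way human raters do, even when its absolute scores sit on a different scale.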

Item Type: Conference or workshop item (Poster)
Uncontrolled keywords: Creativity; GPT-4; LLMs; NLP
Subjects: Q Science > Q Science (General) > Q335 Artificial intelligence
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Funders: University of Kent (https://ror.org/00xkeyj56)
Depositing User: Piotr Sawicki
Date Deposited: 05 Jun 2023 17:28 UTC
Last Modified: 05 Nov 2024 13:07 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/101552 (The current URI for this page, for reference purposes)

University of Kent Author Information

Sawicki, Piotr.


Grześ, Marek.

Creator's ORCID: https://orcid.org/0000-0003-4901-1539
