A Logical Approach to the Lemmatisation of Computational Lexica

Mills, Jon (1999) A Logical Approach to the Lemmatisation of Computational Lexica. In: VI Simposio Internacional de Comunicación Social, 25-28 Jan 1999, Santiago de Cuba, Cuba. (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided)

Abstract

Lemmatisation is a crucial part of compiling a computational lexicon: it is the process that determines the selection and presentation of lemmata. It is a non-trivial task, involving a good deal more than deinflection to identify a baseform that can serve as a headword. The lemma has three functions: to uniquely identify the lexical unit, to locate it in the system, and to describe its form. The computational lexicographer is confronted by a number of problems. Homographs need to be distinguished. Several variants of the baseform may exist, from which a preferred form has to be chosen. Compounds may be written solid, hyphenated, or as two words. A way has to be found to treat multi-word lexemes. It may be necessary to give some very common affixes main-entry status. A decision has to be made whether to treat derivatives as main entries with cross-references to the baseform or to regroup them under the baseform. A solution is suggested in which the preferred baseform, together with a number of other distinguishers, may be satisfactorily employed to fulfil all the functions of the lemma. Next, it is shown how these elements may be placed within a logical framework to implement computational lexica. The model is then extended to deal with problems of asymmetry in interlingual lemmatisation.
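
The idea of a lemma built from a preferred baseform plus further distinguishers can be illustrated with a minimal Python sketch. It is not the model described in the paper, whose full text is not available here. The class names `Lemma` and `Lexicon`, the choice of part of speech and a homograph index as distinguishers, and the English sample entries are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lemma:
    """Hypothetical lemma key: preferred baseform plus distinguishers."""
    baseform: str        # preferred spelling chosen among the variants
    pos: str             # part of speech, one possible distinguisher (assumed)
    homograph: int = 1   # index separating homographs of the same form and pos (assumed)


@dataclass
class Lexicon:
    """Hypothetical lexicon: entries keyed by Lemma, plus a variant index."""
    entries: dict = field(default_factory=dict)   # Lemma -> description
    variants: dict = field(default_factory=dict)  # (form, pos) -> list of Lemma

    def add(self, lemma, gloss, variant_forms=()):
        # Store the entry under its unique key, then index the preferred
        # baseform and every variant spelling so each leads to the lemma.
        self.entries[lemma] = gloss
        for form in (lemma.baseform, *variant_forms):
            self.variants.setdefault((form, lemma.pos), []).append(lemma)

    def lookup(self, form, pos):
        # Return every lemma reachable from this written form and pos.
        return list(self.variants.get((form, pos), []))


# Invented sample data: two homographs and one compound with three spellings.
lexicon = Lexicon()
lexicon.add(Lemma("bank", "noun", 1), "financial institution")
lexicon.add(Lemma("bank", "noun", 2), "side of a river")
lexicon.add(Lemma("teapot", "noun"), "pot for tea",
            variant_forms=("tea-pot", "tea pot"))
```

In this sketch the frozen dataclass serves the identifying function, since two homographs differ only in their index, while the variant index maps solid, hyphenated, and two-word spellings of a compound to a single preferred baseform.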

Item Type: Conference or workshop item (Paper)
Subjects: T Technology
P Language and Literature
P Language and Literature > P Philology. Linguistics
Divisions: Faculties > Humanities > School of European Culture and Languages
Depositing User: Jon Mills
Date Deposited: 24 May 2016 11:49 UTC
Last Modified: 28 May 2019 13:44 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/8339 (The current URI for this page, for reference purposes)