Mills, Jon (1999) A Logical Approach to the Lemmatisation of Computational Lexica. In: VI Simposio Internacional de Comunicación Social, 25-28 Jan 1999, Santiago de Cuba, Cuba. (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:8339)
Abstract
Lemmatisation is a crucial part of the compilation of a computational lexicon; it is the process which determines the selection and presentation of lemmata. Lemmatisation is a non-trivial task; it consists of a good deal more than deinflection to identify a baseform that can serve as a headword. The lemma has three functions: to uniquely identify the lexical unit, to locate it in the system, and to describe its form. The computational lexicographer is confronted by a number of problems. Homographs need to be distinguished. Several variants of the baseform may exist from which a preferred form will have to be chosen. Compounds may be written as solid, hyphenated or as two words. A way has to be found to treat multi-word lexemes. It may be necessary to give some very common affixes main-entry status. A decision has to be made whether to treat derivatives as main entries with cross reference to the baseform or regroup them under the baseform. A solution is suggested in which the preferred baseform together with a number of other distinguishers may be satisfactorily employed to fulfil all the functions of the lemma. Next it is shown how these elements may be placed within a logical framework to implement computational lexica. The model is then extended to deal with problems of asymmetry in interlingual lemmatisation.
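The idea of a lemma built from a preferred baseform plus further distinguishers can be illustrated with a minimal Python sketch. This is not the logical framework of the paper, whose full text is unavailable here. The class names `Lemma` and `Lexicon`, the choice of part of speech and a homograph index as distinguishers, and the English sample entries are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Lemma:
    """Hypothetical lemma key: preferred baseform plus distinguishers."""
    baseform: str       # preferred baseform chosen from the variants
    pos: str            # part of speech (assumed distinguisher)
    homograph: int = 1  # index separating homographs (assumed distinguisher)


@dataclass
class Lexicon:
    """Toy lexicon: unique lemma keys plus an index from written forms."""
    entries: Dict[Lemma, str] = field(default_factory=dict)
    forms: Dict[str, List[Lemma]] = field(default_factory=dict)

    def add(self, lemma: Lemma, gloss: str, variants: Iterable[str] = ()) -> None:
        # The lemma must identify the lexical unit uniquely.
        if lemma in self.entries:
            raise ValueError(f"duplicate lemma: {lemma}")
        self.entries[lemma] = gloss
        # Index the preferred baseform and every variant spelling.
        for form in (lemma.baseform, *variants):
            bucket = self.forms.setdefault(form, [])
            if lemma not in bucket:
                bucket.append(lemma)

    def lookup(self, form: str, pos: Optional[str] = None) -> List[Lemma]:
        # Return every lemma reachable from a written form,
        # optionally narrowed by part of speech.
        return [l for l in self.forms.get(form, []) if pos is None or l.pos == pos]


# Illustrative entries (assumed, not taken from the paper).
lex = Lexicon()
lex.add(Lemma("bank", "noun", 1), "financial institution")
lex.add(Lemma("bank", "noun", 2), "side of a river")
lex.add(Lemma("teapot", "noun"), "vessel for tea", variants=["tea-pot", "tea pot"])

print(lex.lookup("bank"))     # two homographs, told apart by their index
print(lex.lookup("tea-pot"))  # variant spelling resolves to the preferred baseform
```

In this sketch the two senses of "bank" share a baseform and are kept apart only by the homograph index, while the solid, hyphenated and two-word spellings of the compound all resolve to one preferred form. Multi-word lexemes, affixes with main-entry status and the grouping of derivatives are left out.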
Item Type: | Conference or workshop item (Paper) |
---|---|
Subjects: | T Technology; P Language and Literature; P Language and Literature > P Philology. Linguistics |
Divisions: | Divisions > Division of Arts and Humanities > School of Culture and Languages |
Depositing User: | Francis Mills |
Date Deposited: | 24 May 2016 11:49 UTC |
Last Modified: | 05 Nov 2024 09:40 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/8339 (The current URI for this page, for reference purposes) |