Semantic disambiguation using Distributional Semantics
Thesis title in Czech: | Semantic disambiguation using Distributional Semantics |
---|---|
Thesis title in English: | Semantic disambiguation using Distributional Semantics |
Key words: | - |
English key words: | WORD SENSE DISAMBIGUATION, VECTOR SPACE MODEL, PRAGUE DEPENDENCY TREEBANK |
Academic year of topic announcement: | 2010/2011 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
Supervisor: | RNDr. Jiří Hana, Ph.D. |
Author: | hidden |
Date of registration: | 10.12.2010 |
Date of assignment: | 14.01.2011 |
Date and time of defence: | 10.05.2012 00:00 |
Date of electronic submission: | 13.04.2012 |
Date of submission of printed version: | 13.04.2012 |
Date of proceeded defence: | 10.05.2012 |
Opponents: | doc. Mgr. Barbora Vidová Hladká, Ph.D. |
Guidelines |
The goal of this thesis is to combine Distributional Semantics as used in Natural Language Processing (e.g. Schütze 1998) with traditional propositional semantics, as suggested for example by E. Hovy (2010), and to apply the combination to a task of automatic categorization (for example, lemma disambiguation on the Prague Dependency Treebank).
E. Hovy's semantics combines traditional propositional semantics, based on symbolic logic, with the statistical word distribution information of Distributional Semantics. The core resource is a single lexico-semantic lexicon in which concepts are organized as tensors encoding the strength of their relations to other concepts. Using these relation strengths, the appropriateness of a term in a particular context can be determined and used for a variety of tasks, including term disambiguation. Distributional Semantics has strong cognitive plausibility, as shown for example by its ability to predict human brain activity associated with the meanings of nouns (Mitchell et al. 2008). The result of this thesis should be a system performing automatic categorization using Hovy's semantics, for example a system for lexical disambiguation tested on the Prague Dependency Treebank. Lexical disambiguation is the process of determining the correct meaning of a word from its context (e.g. determining whether 'bank' refers to a financial institution or to a river bank). |
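The 'bank' example can be illustrated with a minimal sketch of context-based disambiguation. It is not Hovy's tensor-based lexicon: it assumes a much simpler setup in the spirit of context-vector approaches, where each sense is represented by the summed word counts of a few labelled contexts and a new context is assigned to the sense with the highest cosine similarity. The training sentences, the stopword list, and all function names are invented for illustration.

```python
from collections import Counter
from math import sqrt

# Toy sense-labelled contexts for the ambiguous word "bank" (invented data).
TRAINING = {
    "institution": [
        "she deposited money in the bank account",
        "the bank approved the loan and the credit",
    ],
    "river": [
        "they sat on the grassy bank of the river",
        "the river overflowed its bank after the rain",
    ],
}

# Hypothetical stopword list; the target word itself is also excluded.
STOPWORDS = {"the", "of", "in", "on", "and", "its", "a", "to", "she", "they", "bank"}


def context_vector(sentence):
    """Bag-of-words count vector of a context, without stopwords."""
    return Counter(w for w in sentence.lower().split() if w not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def sense_vectors(training):
    """Sum the context vectors of each sense into one prototype per sense."""
    prototypes = {}
    for sense, sentences in training.items():
        total = Counter()
        for s in sentences:
            total.update(context_vector(s))
        prototypes[sense] = total
    return prototypes


def disambiguate(sentence, prototypes):
    """Return the sense whose prototype is most similar to the context."""
    ctx = context_vector(sentence)
    return max(prototypes, key=lambda sense: cosine(ctx, prototypes[sense]))


PROTOTYPES = sense_vectors(TRAINING)
print(disambiguate("he walked along the bank of the river in the rain", PROTOTYPES))
print(disambiguate("the bank raised the loan money", PROTOTYPES))
```

A realistic system would replace the raw counts with weighted co-occurrence statistics gathered from a large corpus; the sketch only shows how relation strengths between a context and each sense can drive the choice.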
References |
Hovy, Eduard (2010). Distributional Semantics and the Lexicon. Keynote speech at COLING 2010.
Landauer, Thomas K. and Dumais, Susan T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104(2), 211-240.
Mitchell, Tom M.; Shinkareva, Svetlana V.; Carlson, Andrew; Chang, Kai-Min; Malave, Vicente L.; Mason, Robert A.; Just, Marcel Adam (2008). Predicting human brain activity associated with the meanings of nouns. Science, 320, 1191-1195.
Schütze, Hinrich (1998). Automatic word sense discrimination. Computational Linguistics, 24(1), 97-123.
Evert, Stefan and Lenci, Alessandro (2009). Distributional Semantic Models. Course at ESSLLI 2009, Bordeaux, July 27-31, 2009.
Lin, Dekang (1998). Automatic retrieval and clustering of similar words. In Proceedings of the 17th International Conference on Computational Linguistics (COLING-ACL 1998), pages 768-774, Montreal, Canada. |