Word sense disambiguation with Svenskt OrdNät


Ambiguous words are a problem in information retrieval (IR), and word sense disambiguation (WSD) is used to address it. Few studies combining information retrieval and word sense disambiguation have been conducted with Swedish words. The purpose of this thesis is twofold. The first purpose was to examine Swedish information retrieval and disambiguation in the query phase. The second was to compare disambiguation under automatic and manual expansion. We chose a number of topics from the GP_HDINF test collection in Query Performance Analyser (QPA). Each topic had to have more than ten relevant documents so that expansion would be possible. The rules of the automatic expansion required us to choose relations in the following order: synonyms, hyponyms, hypernyms. If no such relations existed, the topic was rejected. This left us with 14 topics. We made a baseline query with inflections of the Swedish words. The baseline query was expanded twice: once automatically, using the sense that the Lesk algorithm chose from the Swedish WordNet, and once manually by the authors. We compared precision and recall for the baseline with precision and recall for both the automatic and the manual expansions. Our study shows that the Lesk algorithm achieves 60% correct disambiguation and that manual expansion performs better than automatic expansion. However, the difference between automatic WSD and manual WSD is negligible, and we suggest using automatic WSD to overcome the problems in IR because it saves the user a lot of time.
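
The automatic expansion step described above can be illustrated with a minimal sketch. It assumes a simplified Lesk variant (pick the sense whose gloss shares the most words with the query context) and a tiny hand-made sense inventory. The inventory, the English example words, the stopword list, and the function names `simplified_lesk` and `expansion_terms` are all hypothetical and are not taken from the thesis or from Svenskt OrdNät; only the relation order (synonyms, hyponyms, hypernyms) and the rejection rule come from the abstract.

```python
# Hypothetical toy sense inventory; NOT real Svenskt OrdNät data.
INVENTORY = {
    "bank": [
        {
            "gloss": "financial institution that accepts deposits and lends money",
            "synonyms": ["depository"],
            "hyponyms": ["savings bank"],
            "hypernyms": ["institution"],
        },
        {
            "gloss": "sloping land beside a river or lake",
            "synonyms": [],
            "hyponyms": [],
            "hypernyms": ["slope"],
        },
    ]
}

# Small illustrative stopword list (an assumption, not from the thesis).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "in", "on", "to", "from", "beside"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    return {w.strip(".,;:!?").lower() for w in text.split()} - {""}


def simplified_lesk(word, context, inventory):
    """Return the sense whose gloss overlaps most with the context words.

    Ties are resolved in favour of the sense listed first.
    Returns None if the word has no senses in the inventory.
    """
    context_tokens = tokenize(context) - STOPWORDS - {word}
    best_sense, best_overlap = None, -1
    for sense in inventory.get(word, []):
        gloss_tokens = tokenize(sense["gloss"]) - STOPWORDS
        overlap = len(gloss_tokens & context_tokens)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


def expansion_terms(sense):
    """Pick expansion terms in the order synonyms, hyponyms, hypernyms.

    Returns (relation_name, terms), or None when the sense has none of
    these relations, in which case the topic would be rejected.
    """
    for relation in ("synonyms", "hyponyms", "hypernyms"):
        if sense.get(relation):
            return relation, sense[relation]
    return None


money_sense = simplified_lesk("bank", "deposit money in a bank account", INVENTORY)
river_sense = simplified_lesk("bank", "fishing from the bank of the river", INVENTORY)
money_expansion = expansion_terms(money_sense)
river_expansion = expansion_terms(river_sense)
```

In this toy run the first query selects the financial sense and is expanded with its synonym, while the second selects the river sense, which has neither synonyms nor hyponyms and therefore falls through to its hypernym.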

Authors

Jens Christiansson; Zeina Zimmerman

University and department

Högskolan i Borås/Institutionen Biblioteks- och informationsvetenskap (BHS)

Level:

This is a D-level thesis (D-uppsats).
