Klassifikation på webben (Classification on the Web)


The thesis concerns the use of classification schemes for organising resources in subject-based hierarchical search services on the web. The aim is to investigate the kinds of classification in use, which fall into two groups: established schemes and new schemes. The investigation draws on the literature and on the website of each search service. The prospects for using classification in a web environment are discussed and divided into three major groups: browsing, control and cross-classification. One question at issue is how well an established scheme organises web resources, and whether and in what way its classification differs from that of new schemes. The BUBL LINK search service uses the Dewey Decimal Classification (DDC) and exemplifies the application of an established classification scheme on the web. Yahoo! represents a non-established scheme. The schemes, including the DDC, are presented in order to convey the essential theoretical principles behind them. The classification schemes of BUBL LINK and Yahoo! are then analysed to identify the basic characteristics of each, which are illustrated with examples from the classification and compared. The thesis ends with a discussion in which flexibility, updating, notation, verbal description, subject analysis and control are identified as important aspects. Which classification to use depends heavily on the aim and target group of the particular search service. BUBL LINK and Yahoo! are two representative examples of how classification can be used on the web.
Nr 76. Erik Åkesson: Nyhetssöktjänster på webben: En utvärdering av News Index, Excite News Search och Ananova (News search engines on the Web: An evaluation of News Index, Excite News Search and Ananova)

The purpose of this study is to examine the retrieval performance of three search engines that specialize in retrieving news articles: News Index, Excite News Search and Ananova. Thirty questions, grouped into the three categories politics, economy and sports, were used, and the first twenty documents retrieved for each question were examined. The questions were designed to be as current as possible, and the searches on the different search engines were performed as close together in time as possible. Precision was determined for each question, for each category and for the combined categories. Precision was measured as an average intended to favour search engines that place their relevant documents early in the ranked list. The relevance of the retrieved documents was evaluated on a three-grade scale: irrelevant articles and duplicates were given 0 points, partially relevant documents 0.5 points, and documents judged highly relevant 1 point. The results show surprisingly high precision for two of the search engines, Excite and News Index, with the former performing slightly better than the latter. Ananova performed considerably worse than the other two. One possible reason for the high precision observed is the relatively low complexity of the retrieved documents compared with web pages in general. A notable result when comparing the categories of questions was that all three search engines performed considerably worse in the "economy" category. Possible reasons for this, apart from a higher number of duplicates, are a shortage of relevant articles for the questions in this category and possible differences between the categories in either the documents retrieved or the web pages publishing them.
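The graded-relevance precision measure described in the abstract can be illustrated with a short Python sketch. The abstract does not state the exact averaging formula, so the version below is an assumption: it averages precision over every cutoff from 1 to 20, which is one common way to reward relevant documents that appear early in the ranked list. The function names are hypothetical and do not come from the thesis.

```python
from typing import Sequence


def graded_precision_at_k(scores: Sequence[float], k: int) -> float:
    """Sum of graded relevance scores in the top k results, divided by k.

    Scores follow the three-grade scale from the abstract:
    0 (irrelevant or duplicate), 0.5 (partially relevant), 1 (highly relevant).
    If fewer than k documents were retrieved, the missing ones count as 0.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(scores[:k]) / k


def rank_weighted_precision(scores: Sequence[float], cutoff: int = 20) -> float:
    """Mean of precision at every cutoff 1..cutoff.

    ASSUMPTION: the thesis only says the average favours early relevant
    documents; this particular formula is an illustrative choice.
    """
    total = sum(graded_precision_at_k(scores, k) for k in range(1, cutoff + 1))
    return total / cutoff


# Two hypothetical result lists with the same relevant documents,
# placed early in one list and late in the other.
early = [1, 1, 0, 0]
late = [0, 0, 1, 1]
print(graded_precision_at_k(early, 4), graded_precision_at_k(late, 4))
print(rank_weighted_precision(early, 4), rank_weighted_precision(late, 4))
```

Both lists have the same plain precision at cutoff 4, but the averaged measure ranks the list with early relevant documents higher, which is the behaviour the abstract describes.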

Author

Maria Lindén

University and department

Högskolan i Borås/Institutionen Biblioteks- och informationsvetenskap (BHS)

Level:

This is a D-level thesis (D-uppsats).
