Finn, A., Kushmerick, N., & Smyth, B. (2001). Fact or fiction: Content classification for digital libraries. Joint DELOS-NSF Workshop on Personalisation and Recommender Systems in Digital Libraries (Dublin).
Abstract
The World-Wide Web (WWW) is a vast repository of information, much of which is valuable but very often hidden from the user. The anarchic nature of the WWW presents unique challenges for information extraction and categorization. We view the WWW as a valuable resource for gathering information for digital libraries. In this paper we describe the process of extracting and classifying information from the WWW for the purpose of integrating it into digital libraries. Our efforts focus on ways to automatically classify news articles according to whether they present opinions or reported facts. We describe and evaluate a system in development that automatically classifies and recommends Web news articles from the sports and politics domains.