Finn, A. (2006). A Multi-Level Boundary Classification Approach to Information Extraction. PhD thesis (University College Dublin). pdf
Abstract
Information Extraction (IE) is the process of identifying a set of pre-defined relevant items in text documents. We investigate the application of Machine Learning classification techniques to this problem. In particular, we use Support Vector Machines and several different feature sets to build a set of classifiers for IE. We show that this approach is competitive with current state-of-the-art IE algorithms based on specialized learning algorithms.
Finn, A. & Kushmerick, N. (2006). Learning to classify documents according to genre. To appear in the Journal of the American Society for Information Science and Technology (JASIST), Special Issue on Computational Analysis of Style, volume 57, number 11, September 2006. pdf
ELIE is a tool for adaptive information extraction from text. It also provides a number of other text-processing tools, such as POS tagging, chunking, a gazetteer, and stemming.
Finn, A. & Kushmerick, N. (2003). Learning to classify documents according to genre. IJCAI-03 Workshop on Computational Approaches to Style Analysis and Synthesis (Acapulco). pdf, postscript
Finn, A. & Kushmerick, N. (2003). Active learning strategies for information extraction. Poster submission (rejected) to International Joint Conference on Artificial Intelligence (Acapulco). pdf
Finn, A. (2002). Machine learning for genre classification. MSc thesis (University College Dublin). postscript
Finn, A., Kushmerick, N. & Smyth, B. (2002). Genre classification and domain transfer for information filtering. In Proc. European Colloquium on Information Retrieval Research (Glasgow). postscript