Finn, A. (2006). A Multi-Level Boundary Classification Approach to Information Extraction. PhD thesis (University College Dublin). pdf
Abstract
Information Extraction (IE) is the process of identifying a set of pre-defined relevant items in text documents. We investigate the application of Machine Learning classification techniques to the problem of IE. In particular, we use Support Vector Machines and several different feature-sets to build a set of classifiers for IE. We show that this approach is competitive with current state-of-the-art IE algorithms based on specialized learning algorithms.
Finn, A. & Kushmerick, N. (2006). Learning to classify documents according to genre. To appear in the Journal of the American Society for Information Science and Technology (JASIST), Special Issue on Computational Analysis of Style, volume 57, number 11, September 2006. pdf
ELIE is a tool for adaptive information extraction from text. It also provides a number of other text processing tools, e.g. POS tagging, chunking, gazetteer, and stemming.
Finn, A. & Kushmerick, N. (2003). Learning to classify documents according to genre. IJCAI-03 Workshop on Computational Approaches to Style Analysis and Synthesis (Acapulco). pdf, postscript
Finn, A. & Kushmerick, N. (2003). Active learning strategies for information extraction. Poster submission (rejected) to the International Joint Conference on Artificial Intelligence (Acapulco). pdf
Finn, A. (2002). Machine learning for genre classification. MSc thesis (University College Dublin). postscript
Finn, A., Kushmerick, N. & Smyth, B. (2002). Genre classification and domain transfer for information filtering. In Proc. European Colloquium on Information Retrieval Research (Glasgow). postscript