NSF ITR: Unified Graphical Models of Information Extraction and Data Mining with Application to Social Network Analysis

Principal Investigators:

Andrew McCallum, PI
[email protected]

David Jensen, co-PI
[email protected]

Center for Intelligent Information Retrieval (CIIR) and Knowledge Discovery Laboratory (KDL)

Computer Science Department
140 Governors Drive
University of Massachusetts
Amherst, MA 01003-9264

This project aims to improve the ability to mine information previously locked in unstructured natural language text. The research focuses on developing novel statistical models for information extraction and data mining that are so tightly integrated that the boundaries between them disappear, yielding a unified framework for extraction and mining. The new algorithms will be applied to the creation of two large-scale databases with useful, publicly available website front-ends: one concerning scientific research, the other government information. Mining these databases will enable insight into government efficiency and the flow of scientific ideas.

This project is supported in part by the Central Intelligence Agency, the National Security Agency, and the National Science Foundation under NSF grant #IIS-0326249.