Machine Learning and Friends Lunch

Integrating Probabilistic Extraction Models and Data Mining to Discover Relations and Patterns in Text

Abstract

Extracting relations from text is one of the most difficult of the currently defined information extraction tasks. In order for automatic relation extraction systems to obtain human-level performance, they must be able to incorporate the relational patterns inherent in the data (for example, that one's sister is likely one's mother'sdaughter, or that children are likely to attend the college of their parents). Hand-coding such knowledge can be time-consuming andinadequate. Additionally, there may exist many interesting, unknown relational patterns that both improve extraction performance and provide insight into textual data. We describe a probabilistic extraction model that provides mutual benefits to both ``top-down'' relational pattern discovery and ``bottom-up'' relation extraction.

This is work done as part of an internship at Google New York last semester.

Back to ML Lunch home