SEER: Auto-Generating Information Extraction Rules from User-Specified Examples

Paper URL: http://dl.acm.org/citation.cfm?doid=3025453.3025540

Paper abstract: Time-consuming and complicated best describe the current state of the Information Extraction (IE) field. Machine learning approaches to IE require large collections of labeled datasets that are difficult to create and use obscure mathematical models, occasionally returning unwanted results that are unexplainable. Rule-based approaches, while resulting in easy-to-understand IE rules, are still time-consuming and labor-intensive. SEER combines the best of these two approaches: a learning model for IE rules based on a small number of user-specified examples. In this paper, we explain the design behind SEER and present a user study comparing our system against a commercially available tool in which users create IE rules manually. Our results show that SEER helps users complete text extraction tasks more quickly, as well as more accurately.

Summary (translated from Japanese):

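Support for annotating documents to produce machine-learning data. Once the user annotates the first few examples, the system automatically learns extraction rules from them and then handles the remaining annotation on its own.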

(80 characters in the original Japanese)
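To make the idea of learning a rule from a few highlighted examples concrete, here is a minimal sketch. It is not SEER's actual algorithm, which the abstract does not describe in detail. It assumes a toy rule language in which runs of digits and whitespace are generalized and all other text is kept literal. The function names `generalize`, `learn_rule`, and `extract` are hypothetical.

```python
import re


def generalize(example: str) -> str:
    """Turn one user-highlighted example into a regex (toy assumption:
    digits and whitespace are generalized, everything else stays literal)."""
    parts = []
    # Split the example into runs of digits, whitespace, or other characters.
    for run in re.findall(r"\d+|\s+|[^\d\s]+", example):
        if re.fullmatch(r"\d+", run):
            parts.append(r"\d+")
        elif re.fullmatch(r"\s+", run):
            parts.append(r"\s+")
        else:
            parts.append(re.escape(run))
    return "".join(parts)


def learn_rule(examples):
    """Combine the generalized examples into one rule (an alternation)."""
    patterns = {generalize(e) for e in examples}
    return "|".join(sorted(patterns))


def extract(rule: str, text: str):
    """Apply the learned rule to new text and return every match."""
    return [m.group(0) for m in re.finditer(rule, text)]


# Two highlighted examples are enough to cover a third, unseen mention.
rule = learn_rule(["15 percent", "7 percent"])
text = "Revenue grew 15 percent, costs rose 7 percent, and staff by 3 percent."
print(rule)                 # \d+\s+percent
print(extract(rule, text))  # ['15 percent', '7 percent', '3 percent']
```

The learned rule stays human-readable, which is the property the abstract attributes to rule-based approaches. A real system would need a richer rule language and a way to rank competing candidate rules.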

Presentation slides: