Exploratory analysis of unstructured text is a difficult task, particularly when defining and extracting domain-specific concepts. We present iSeqL, an interactive tool for the rapid construction of customized text mining models through sequence labeling. With iSeqL, analysts engage in an active learning loop, labeling text instances and iteratively assessing trained models by viewing model predictions in the context of both individual text instances and task-specific visualizations of the full dataset. To build suitable models with limited training data, iSeqL leverages transfer learning and pre-trained contextual word embeddings within a recurrent neural architecture. Through case studies and an online experiment, we demonstrate the use of iSeqL to quickly bootstrap models sufficiently accurate to perform in-depth exploratory analysis. With less than an hour of annotation effort, iSeqL users are able to generate stable outputs over custom extracted entities, including context-sensitive discovery of phrases that were never manually labeled.
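The abstract describes feeding pre-trained contextual word embeddings into a recurrent architecture for sequence labeling. The sketch below is a minimal illustration of that general pattern, not the authors' implementation: it assumes the contextual embeddings (e.g., from ELMo or BERT) are precomputed and frozen, and trains only a bidirectional LSTM tagger and a per-token classifier over an assumed BIO tag set; all dimensions and names are illustrative.

# Illustrative sketch, assuming precomputed contextual embeddings of shape
# [batch, seq_len, emb_dim]; not the iSeqL codebase.
import torch
import torch.nn as nn

class RecurrentTagger(nn.Module):
    def __init__(self, emb_dim=1024, hidden_dim=256, num_tags=3):  # BIO tags (assumed)
        super().__init__()
        # Pre-trained embeddings arrive as frozen inputs; only the recurrent
        # layer and classifier are fit to the analyst's labels.
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, contextual_embeddings):
        # contextual_embeddings: [batch, seq_len, emb_dim]
        hidden, _ = self.lstm(contextual_embeddings)
        return self.classifier(hidden)  # per-token tag scores

# Usage example: score a batch of 2 sentences, 10 tokens each.
model = RecurrentTagger()
scores = model(torch.randn(2, 10, 1024))
print(scores.shape)  # torch.Size([2, 10, 3])

In an active learning loop like the one the paper describes, a small model of this kind would be retrained as the analyst labels more instances, with predictions surfaced back to the interface for review.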
BibTeX
@inproceedings{2020-iseql,
title = {iSeqL: Interactive Sequence Learning},
author = {Shrivastava, Akshat and Heer, Jeffrey},
booktitle = {ACM Intelligent User Interfaces},
year = {2020},
url = {https://idl.uw.edu/papers/iseql},
doi = {10.1145/3377325.3377503}
}