Most information extraction methods focus on binary relations expressed
within single sentences. In high-value domains, however, $n$-ary relations are
in high demand (e.g., drug-gene-mutation interactions in precision oncology).
Such relations often involve entity mentions that are far apart in the
document, yet existing work on cross-sentence relation extraction is generally
confined to small text spans (e.g., three consecutive sentences), which
severely limits recall. In this paper, we propose a novel multiscale neural
architecture for document-level $n$-ary relation extraction. Our system
combines representations learned over various text spans throughout the
document and across the subrelation hierarchy. Widening the system's purview to
the entire document maximizes potential recall. Moreover, by integrating weak
signals across the document, multiscale modeling increases precision, even in
the presence of noisy labels from distant supervision. Experiments on
biomedical machine reading show that our approach substantially outperforms
previous $n$-ary relation extraction methods.
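To make the "integrating weak signals across the document" idea concrete, below is a minimal sketch of multiscale evidence pooling, our illustration rather than the authors' exact architecture. It assumes a candidate relation tuple has relation logits at several scales (sentence, paragraph, document); pooling with LogSumExp within each scale and then across scales lets one strong mention dominate while weaker co-occurrences still contribute. All tensor values and names here are hypothetical.

```python
import torch

def logsumexp_pool(scores: torch.Tensor) -> torch.Tensor:
    """Smooth-max pooling: dominated by the strongest signal,
    but still accumulates weak evidence from other candidates."""
    return torch.logsumexp(scores, dim=0)

# Hypothetical relation logits for one drug-gene-mutation candidate,
# one tensor per scale (each entry scores one text span at that scale
# in which the entity mentions co-occur).
sentence_scores = torch.tensor([0.2, -1.0, 0.5])   # three sentence-level spans
paragraph_scores = torch.tensor([1.1, 0.3])        # two paragraph-level spans
document_scores = torch.tensor([0.8])              # one whole-document span

# Combine evidence within each scale, then across scales.
per_scale = torch.stack([
    logsumexp_pool(s)
    for s in (sentence_scores, paragraph_scores, document_scores)
])
final_logit = logsumexp_pool(per_scale)
print(final_logit)  # single score for the candidate relation
```

LogSumExp is a common choice for this kind of multi-instance aggregation because it is differentiable and, unlike hard max pooling, does not discard weak supporting signals, which is what makes document-wide evidence useful under noisy distant-supervision labels.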