Research

Towards Annotating and Creating Summary Highlights at Sub-sentence Level

Highlighting is a powerful tool for picking out and emphasizing important content. Creating summary highlights at the sub-sentence level is particularly desirable because sub-sentences are more concise than whole sentences. They are also better suited than individual words and phrases, which can lead to disfluent, fragmented summaries. In this paper we seek to generate summary highlights by annotating summary-worthy sub-sentences and teaching classifiers to do the same. We frame the task as jointly selecting important sentences and identifying the single most informative textual unit within each. This formulation dramatically reduces the complexity involved in sentence compression. Our study provides new benchmarks and baselines for generating highlights at the sub-sentence level. (Workshop on New Frontiers in Summarization, EMNLP 2019) [paper]
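
The joint formulation can be read as a two-step selection. Below is a minimal sketch, assuming trained scoring functions for sentences and for sub-sentence units; the function names and the segmentation interface are illustrative stand-ins, not the paper's implementation.

```python
from typing import Callable, List, Tuple

def highlight(
    sentences: List[str],
    units_of: Callable[[str], List[str]],       # segments a sentence into sub-sentence units
    sentence_score: Callable[[str], float],     # importance of a whole sentence
    unit_score: Callable[[str, str], float],    # informativeness of a unit in its sentence
    k: int = 3,
) -> List[Tuple[str, str]]:
    """Select the k most important sentences, then the single best unit in each."""
    top = sorted(sentences, key=sentence_score, reverse=True)[:k]
    top.sort(key=sentences.index)  # restore document order
    return [
        (s, max(units_of(s), key=lambda u: unit_score(s, u)))
        for s in top
    ]
```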

Guiding Extractive Summarization with Question-Answering Rewards

Highlighting while reading is a natural way for people to track the salient content of a document. It would be highly desirable to teach an extractive summarizer to do the same. However, a major obstacle to the development of a supervised summarizer is the lack of ground-truth data. Manual labeling of extraction units is cost-prohibitive, and automatically acquiring labels by aligning human abstracts with source documents often yields inferior results. In this paper we describe a novel framework that guides a supervised, extractive summarization system with question-answering rewards. We argue that a quality summary should serve as a surrogate for the document in answering important questions, and that question-answer pairs can be conveniently obtained from human abstracts. The system learns to promote summaries that are informative, fluent, and perform competitively on question answering. Our results compare favorably with those reported by strong summarization baselines, as evaluated by automatic metrics and human assessors. (NAACL 2019) [paper] [code] [slides]
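
To make the reward idea concrete, here is a hedged sketch of scoring a candidate summary against abstract-derived question-answer pairs. The answer_with QA model, the fluency proxy, and the mixing weight alpha are illustrative assumptions, not the paper's exact formulation.

```python
from typing import Callable, List, Tuple

def qa_reward(
    summary: str,
    qa_pairs: List[Tuple[str, str]],            # (question, answer) pairs from the abstract
    answer_with: Callable[[str, str], str],     # QA model: (summary, question) -> answer
    fluency: Callable[[str], float],            # e.g., length-normalized LM log-probability
    alpha: float = 0.7,                         # illustrative trade-off weight
) -> float:
    """Reward summaries that answer abstract-derived questions and read fluently."""
    correct = sum(
        answer_with(summary, q).strip().lower() == a.strip().lower()
        for q, a in qa_pairs
    )
    qa_score = correct / max(len(qa_pairs), 1)
    return alpha * qa_score + (1 - alpha) * fluency(summary)
```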

Dynamic Transfer Learning for Named Entity Recognition

State-of-the-art named entity recognition (NER) systems have improved continuously over the past several years through neural architectures. However, many tasks, NER included, require large sets of annotated data to achieve such performance. In particular, we focus on NER from clinical notes, one of the most fundamental and critical problems for medical text analysis. Our work centers on effectively adapting these neural architectures to low-resource settings using parameter transfer methods. We complement a standard hierarchical NER model with a general transfer learning framework consisting of parameter sharing between the source and target tasks, and achieve scores significantly above the baseline architecture. These sharing schemes, however, require an exponential search over tied parameter sets to find an optimal configuration. To avoid this exhaustive search, we propose Dynamic Transfer Networks (DTN), a gated architecture that learns the appropriate parameter sharing scheme between source and target datasets. DTN achieves the improvements of the optimized transfer learning framework in a single training run, effectively removing the need for exponential search. (AAAI 2019 Workshop on Health Intelligence) [paper]
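
The gating idea can be sketched in a few lines of PyTorch: rather than hand-picking which layers to tie, a learned gate interpolates between a shared (source-tied) transformation and a target-specific one. Module names, activations, and sizes below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GatedShare(nn.Module):
    def __init__(self, hidden: int):
        super().__init__()
        self.shared = nn.Linear(hidden, hidden)    # parameters tied with the source task
        self.private = nn.Linear(hidden, hidden)   # target-task-specific parameters
        self.gate = nn.Linear(2 * hidden, hidden)  # decides how much to share, per unit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = torch.tanh(self.shared(x))
        p = torch.tanh(self.private(x))
        g = torch.sigmoid(self.gate(torch.cat([s, p], dim=-1)))
        return g * s + (1 - g) * p  # learned mixture replaces the exponential search
```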

Reinforced Extractive Summarization

We investigate a new training paradigm for extractive summarization. Traditionally, human abstracts are used to derive gold-standard labels for extraction units. However, these labels are often inaccurate because human abstracts and source documents cannot be easily aligned at the word level. In this paper we convert human abstracts into a set of Cloze-style comprehension questions. System summaries are encouraged to preserve salient source content useful for answering the questions and to share common words with the abstracts. We use reinforcement learning to explore the space of possible extractive summaries and introduce a question-focused reward function to promote concise, fluent, and informative summaries. Our experiments show that the proposed method is effective, surpassing state-of-the-art systems on the standard summarization dataset. (ACL SRW 2018) [paper] [poster]
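
A minimal sketch of the Cloze conversion, assuming candidate answer spans have already been chosen from the abstract sentence; the blanking heuristic below is an illustration, and the paper may select answer spans differently.

```python
import re
from typing import List, Tuple

def make_cloze(sentence: str, answers: List[str]) -> List[Tuple[str, str]]:
    """Blank each candidate answer in the sentence, yielding (question, answer) pairs."""
    pairs = []
    for ans in answers:
        blanked, n = re.subn(re.escape(ans), "_____", sentence, count=1)
        if n:
            pairs.append((blanked, ans))
    return pairs

# e.g. make_cloze("The senate passed the bill on Monday.", ["senate", "Monday"])
# -> [("The _____ passed the bill on Monday.", "senate"),
#     ("The senate passed the bill on _____.", "Monday")]
```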

Question Effectiveness

In this work we build computational models that learn to discriminate effective questions from ineffective ones. Armed with such a capability, future systems can evaluate the quality of questions and suggest more effective question wording. We created a large-scale, real-world dataset containing over 400,000 questions collected from Reddit "Ask Me Anything" threads. Each thread resembles an online press conference where questions compete with one another for the host's attention. This dataset enables the development of a class of computational models for predicting whether a question will be answered. We develop a new convolutional neural network architecture with variable-length context and demonstrate its efficacy by comparing it with state-of-the-art baselines and human judges. (FLAIRS 2017) [paper] [dataset]
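
A minimal PyTorch sketch of a convolutional classifier over question tokens, predicting whether a question will be answered. The vocabulary size, filter widths, and dimensions are illustrative choices, not the paper's configuration; multiple filter widths approximate variable-length context windows.

```python
import torch
import torch.nn as nn

class QuestionCNN(nn.Module):
    def __init__(self, vocab: int = 50_000, dim: int = 100, filters: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim, padding_idx=0)
        # Several filter widths stand in for variable-length context windows.
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, filters, kernel_size=w) for w in (2, 3, 4)
        )
        self.out = nn.Linear(filters * 3, 1)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:   # tokens: (batch, seq_len)
        x = self.embed(tokens).transpose(1, 2)                  # (batch, dim, seq_len)
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.out(torch.cat(pooled, dim=1)).squeeze(-1)   # answerability logit
```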
