This is a very early sentence compression paper from Hongyan Jing, a student of Kathleen McKeown. It was really interesting for me to read because I’ve looked at a bunch of recent work on the same topic.

To shorten a sentence, Jing does the following:

Jing evaluates her method using a custom metric, the “success rate”, which measures the degree of overlap between the computer's compression decisions and a human annotator's.
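The paper's exact formulation isn't reproduced here, but an overlap-style success rate can be sketched as per-token keep/drop agreement with a human annotator (the function name and token-level framing below are my assumption, not Jing's code):

```python
def success_rate(system_keep, human_keep):
    """Fraction of per-token keep/drop decisions on which the system
    agrees with the human annotator. A sketch of an overlap-style
    'success rate'; Jing's actual metric operates over parse-tree
    components and may differ in detail."""
    assert len(system_keep) == len(human_keep)
    agree = sum(s == h for s, h in zip(system_keep, human_keep))
    return agree / len(system_keep)

# Each entry: True = keep the token, False = drop it.
system = [True, True, False, True, False]
human  = [True, False, False, True, False]
print(success_rate(system, human))  # 4 of 5 decisions agree -> 0.8
```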

One striking thing is the deep similarity between this work and later approaches to sentence compression. The core concerns in Jing (2000), Filippova and Strube (2008), and Clarke and Lapata (2010) are much the same: retain the “important” portions of a source sentence while preserving grammaticality. The techniques for automatically meeting these criteria have shifted, but in many ways the basic ideas are the same. Of course, recent sentence compression papers use neural networks to pursue these aims in a “data-driven” fashion.

Based on 20 years of research, the sentence compression agenda in NLP seems to be: more data + better modeling = closer automated replication of human summarization decisions (as measured by ROUGE, Pyramid, or some other metric).
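For concreteness, metrics like ROUGE reduce "replicating human decisions" to n-gram overlap with a reference. A minimal ROUGE-1 recall sketch (my simplification; the official toolkit adds stemming, stopword handling, and multi-reference support):

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Unigram recall: fraction of reference tokens covered by the
    candidate. A minimal sketch of ROUGE-1, not the official scorer."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())

# 3 of the reference's 6 tokens are covered -> 0.5
print(rouge1_recall("the cat sat", "the cat sat on the mat"))
```

The point of the metric is exactly the paradigm described above: score a system by how closely its output matches a human-written gold standard.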

This paradigm is common across NLP. But for summarization, the approach often annoys me: it seems obvious that different kinds of users will have different information needs. Query-focused summarization is supposed to address this issue, but I’m sort of dubious that there even exists a “gold standard”, perfect summary for a given query. There seem to be deep limitations to this “annotate and model” paradigm, at least for sentence compression.