2.18 Text Summarization

In this lesson, we see how to use models that summarize texts.

What is Text Summarization

Text summarization is a task whose goal is generating a concise and precise summary of long texts, without losing the overall meaning.

Text Summarization Approaches

Broadly, there are two approaches to summarizing texts in NLP:

  • Extraction-based summarization: A subset of words or sentences that represent the most important points is pulled from the long text and combined to make a summary. The results may not be grammatically accurate.

  • Abstraction-based summarization: Deep learning techniques (mainly sequence-to-sequence models) are applied to paraphrase and shorten the original document, much as a human would. Since abstractive models can generate new phrases and sentences that capture the most important information from the source text, they can help overcome the grammatical inaccuracies of extraction-based techniques.

Although abstractive methods tend to produce more fluent, human-like summaries, developing them requires complex deep learning techniques and sophisticated language modeling. As such, extractive text summarization approaches are still widely used.
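
To make the extraction-based approach concrete, here is a minimal sketch that scores each sentence by the average frequency of its words and keeps the top-scoring sentences in their original order. It is an illustrative simplification rather than a production method: the tiny stopword list, the regex-based sentence splitter, and the function names `split_sentences` and `extractive_summary` are our own assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists)
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are",
             "it", "that", "for", "on", "with", "as", "by"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, copied verbatim from the text."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))  # word frequencies over the whole document

    def score(sentence):
        tokens = tokenize(sentence)
        # Average frequency, so long sentences are not favored automatically
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


doc = "Cats sleep a lot. Cats chase mice and cats climb trees. Dogs bark."
print(extractive_summary(doc, num_sentences=2))
```

Because the selected sentences are copied verbatim and simply concatenated, the result can read as disjointed, which is the weakness of extraction-based methods mentioned above.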

Text Summarization Datasets

An example dataset for the summarization task is the Gigaword dataset, which contains ~4 million articles with their headlines; each headline is treated as the summary of its article. Another example is the Extreme Summarization (XSum) dataset, which consists of ~200k BBC news articles covering a variety of domains, each accompanied by a one-sentence summary.

Text Summarization Metrics

Text summarization performance is commonly measured with the ROUGE metrics. ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics and a software package designed for evaluating automatic summarization, and it can also be used for machine translation. The metrics compare an automatically produced summary or translation against high-quality, human-produced reference summaries or translations.
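
As a rough illustration of what such a comparison involves, the sketch below computes ROUGE-N as clipped n-gram overlap between a candidate summary and a single reference. It is a simplified version under our own assumptions: whitespace tokenization, lowercasing, no stemming, and a hypothetical function name `rouge_n`. The official ROUGE package applies additional preprocessing, so its scores may differ.

```python
from collections import Counter


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: precision, recall and F1 over overlapping n-grams."""

    def ngrams(text):
        tokens = text.lower().split()  # naive whitespace tokenization
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count per n-gram (clipped matches)
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_n("The cat sat on the mat", "The cat is on the mat", n=1)
print(scores)
```

Recall here is the fraction of reference n-grams that also appear in the candidate, which is why ROUGE is described as recall-oriented; setting `n=2` gives the stricter bigram variant.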

Read this article for a step-by-step guide on how to compute ROUGE scores.

Text Summarization with Python

Let’s see how to use pre-trained models to summarize texts.

Install and Import Libraries

We install and import the necessary libraries.

pip install transformers
from transformers import pipeline

Try Text Summarization

Then, we download a pre-trained text summarization model and load it into a summarization pipeline. The default model is sshleifer/distilbart-cnn-12-6, a distilled BART model fine-tuned on the CNN/Daily Mail news summarization dataset.

# download pre-trained text summarization model
model_summarizer = pipeline("summarization")

Let’s test the model with an excerpt from the article “General availability of Azure OpenAI Service expands access to large, advanced AI models with added enterprise benefits”.

# text from https://azure.microsoft.com/en-us/blog/general-availability-of-azure-openai-service-expands-access-to-large-advanced-ai-models-with-added-enterprise-benefits/
text = "Large language models are quickly becoming an essential platform for people " \
"to innovate, apply AI to solve big problems, and imagine what’s possible. Today, " \
"we are excited to announce the general availability of Azure OpenAI Service as part " \
"of Microsoft’s continued commitment to democratizing AI, and ongoing partnership with " \
"OpenAI. With Azure OpenAI Service now generally available, more businesses can apply " \
"for access to the most advanced AI models in the world—including GPT-3.5, Codex, " \
"and DALL•E 2—backed by the trusted enterprise-grade capabilities and AI-optimized " \
"infrastructure of Microsoft Azure, to create cutting-edge applications. Customers " \
"will also be able to access ChatGPT—a fine-tuned version of GPT-3.5 that has been " \
"trained and runs inference on Azure AI infrastructure—through Azure OpenAI Service soon."

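# the pipeline returns a list of dicts; the summary is stored under "summary_text"
resp = model_summarizer(text)[0]
print(resp["summary_text"])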
Microsoft announces general availability of Azure OpenAI Service. GPT-3.5,
Codex, and DALL•E 2 models are among the most advanced AI models in the world.
Customers will also be able to access ChatGPT, a fine-tuned version of GPT that
has been trained and runs inference on Azure AI infrastructure.

As you can see from the results, the sshleifer/distilbart-cnn-12-6 model performs abstraction-based summarization: it rephrases and condenses the source text rather than copying whole sentences verbatim.
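
The call above relies on the pipeline's default generation settings. The sketch below shows one way to control the summary length and wrap the output for display. `max_length`, `min_length`, and `do_sample` are generation arguments that the summarization pipeline forwards to the model; the specific values, the 80-column wrapping, and the helper names `summarize` and `load_summarizer` are illustrative assumptions rather than part of the lesson's original code.

```python
import textwrap


def summarize(summarizer, text, max_length=60, min_length=20, width=80):
    """Run a summarization callable and return the summary wrapped to `width` columns."""
    result = summarizer(
        text,
        max_length=max_length,  # upper bound on summary length, in tokens
        min_length=min_length,  # lower bound on summary length, in tokens
        do_sample=False,        # deterministic decoding instead of sampling
    )
    return textwrap.fill(result[0]["summary_text"].strip(), width=width)


def load_summarizer(model_name="sshleifer/distilbart-cnn-12-6"):
    """Build a summarization pipeline with the model named explicitly."""
    # Imported lazily so that `summarize` can be used with any compatible callable
    from transformers import pipeline
    return pipeline("summarization", model=model_name)
```

With the `text` variable defined earlier, `print(summarize(load_summarizer(), text))` would download the model on first use and print a wrapped summary. Naming the model explicitly also keeps the behavior stable if the library's default summarization model changes.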

Code Exercises


How are text summarization models typically categorized?

  1. Extraction-based and Abstraction-based.

  2. Rule-based and Statistical.

  3. Heuristic and Neural Network.

  4. Frequency-based and Semantic-based.

What’s the commonly used set of metrics for evaluating text summarization models?

  1. Perplexity

  2. BLEU

  3. F-score

  4. Word Mover’s Distance

  5. ROUGE

Questions and Feedback

Have questions about this lesson? Would you like to exchange ideas? Or would you like to point out something that needs to be corrected? Join the NLPlanet Discord server and interact with the community! There’s a specific channel for this course called practical-nlp-nlplanet.