Exploring Few-Shot Learning For Summarization

In this post, we will explore few-shot learning for summarization.

The first step is to list the requirements and the questions we are trying to answer :

  • How useful is few-shot learning?
  • What are the advantages of using few-shot learning over unsupervised learning?
  • Which datasets can be used?

Few-Shot Learning of an Interleaved Text Summarization Model by Pretraining with Synthetic Data

  • Paper : https://arxiv.org/pdf/2103.05131.pdf
  • About the dataset used :
    • AMI dataset
    • PubMed dataset
  • Model :
    • Hierarchical encoder-decoder architecture.
  • Metrics :
    • ROUGE
  • Baseline comparison :
    • Transfer learning
    • Human evaluation
  • Learnings for our own proposed model :
    • Phrases in interleaved texts play a role similar to visual patterns in images; therefore, attending to phrases is more relevant for thread recognition than attending to whole posts.
    • Hence, hierarchical attention over phrases is also important (a generic sketch of two-level attention follows this list).
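
To make the hierarchical-attention idea concrete, here is a generic two-level attention sketch in PyTorch. This is not the paper's architecture; the module names, dimensions, and pooling scheme are assumptions for illustration only.

```python
# Generic sketch of hierarchical (two-level) attention pooling, NOT the paper's exact model.
# Words are pooled into utterance/phrase vectors, which are then pooled into a thread vector.
import torch
import torch.nn as nn


class AttentionPool(nn.Module):
    """Weighted average over a sequence of vectors using a learned scoring layer."""

    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):  # x: (batch, seq_len, dim)
        weights = torch.softmax(self.score(x), dim=1)
        return (weights * x).sum(dim=1)  # (batch, dim)


class HierarchicalEncoder(nn.Module):
    """Word-level attention inside each utterance, then utterance-level attention over the thread."""

    def __init__(self, dim=128):
        super().__init__()
        self.word_pool = AttentionPool(dim)
        self.utterance_pool = AttentionPool(dim)

    def forward(self, word_embeddings):  # (batch, n_utterances, n_words, dim)
        b, n_utt, n_words, dim = word_embeddings.shape
        utterance_vectors = self.word_pool(word_embeddings.view(b * n_utt, n_words, dim))
        utterance_vectors = utterance_vectors.view(b, n_utt, dim)
        return self.utterance_pool(utterance_vectors)  # one vector per thread: (batch, dim)


# Quick shape check with random embeddings: 2 threads, 5 utterances, 12 words each.
encoder = HierarchicalEncoder(dim=128)
dummy = torch.randn(2, 5, 12, 128)
print(encoder(dummy).shape)  # torch.Size([2, 128])
```

A decoder would then attend over the utterance (or phrase) vectors while generating the summary, which is the intuition behind attending to phrases rather than whole posts.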

Few-Shot Learning for Opinion Summarization

  • Paper : https://arxiv.org/pdf/2004.14884.pdf
  • About the dataset used:
    • Amazon dataset
    • Yelp dataset
  • Model :
    • Encoder-decoder architecture for unsupervised training
    • Novelty reduction through regularization
    • Summary adaptation
  • Metrics :
    • ROUGE (a minimal scoring sketch follows this list)
  • Baseline comparison :
    • Transfer learning
    • Human evaluation
  • Learnings for our own proposed model :
    • Summaries generated by unsupervised learning are prone to containing a significant amount of information that is unsupported by the reviews.
    • Also, since the models are trained mostly on subjectively written reviews, they tend to generate summaries in the same writing style.
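
Since both papers report ROUGE, here is a minimal scoring sketch using the rouge_score package; the package choice and the example reference/prediction strings are assumptions, not taken from either paper.

```python
# Minimal ROUGE sketch (assumes the `rouge_score` package: pip install rouge-score).
from rouge_score import rouge_scorer

# Hypothetical reference and system summaries, for illustration only.
reference = "the meeting agreed to ship the remote control with a rubber case"
prediction = "the group decided on a rubber case for the remote control"

# Score ROUGE-1, ROUGE-2, and ROUGE-L, the variants typically reported for summarization.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, prediction)

for name, result in scores.items():
    print(f"{name}: P={result.precision:.3f} R={result.recall:.3f} F1={result.fmeasure:.3f}")
```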

Other reference papers :

  • Multimodal Few-Shot Learning with Frozen Language Models
    • Paper : https://arxiv.org/pdf/2106.13884.pdf
  • Learning to Summarize with Human Feedback
    • Blog : https://openai.com/blog/learning-to-summarize-with-human-feedback/
  • Direct multimodal few-shot learning of speech and images
    • Paper : https://arxiv.org/pdf/2012.05680.pdf

Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation

  • Paper : https://aclanthology.org/I17-1047.pdf
  • About the paper :
    • Conversations grounded in images
    • For natural question and response generation
    • Proposes retrieval approaches
  • About the IGC dataset :
    • Images : 4,222
    • Utterances : 25,332
  • Process of data collection :
    • QA dialogues from the VQG dataset.
    • Photo gallery selection via a crowdsourcing platform.
    • The IGC-Twitter dataset consists of
      • 250K quadruples of visual context, textual context, question, and response.
      • This dataset was further refined.
  • Learnings for our own dataset :
    • The IGC-Twitter dataset can be used for training.
    • Also, the paper outlines the process of collecting conversation data from Twitter.
    • However, the IGC dataset itself is small; it can still prove useful for testing.

Towards Building Large Scale Multimodal Domain-Aware Conversation Systems

  • Paper : https://arxiv.org/pdf/1704.00200.pdf
  • About the dataset :
    • Dialogues : 150,000
  • Process of data collection :
    • Domain knowledge curation
      • Crawling
      • Parsing
      • Taxonomical filtering
      • Relevant fashion attribute identification
      • Creating fashion profiles
    • Gathering multimodal dialogs :
      • Crowdsourcing
  • Learnings for our own dataset :
    • A useful dataset with multiple images
    • Can be used for summarization
    • Lays out the details of data collection

Situated Interactive MultiModal Conversations (SIMMC)

  • This dataset is also along similar lines.

Other important papers or datasets :

  • RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes
    • Paper : https://arxiv.org/pdf/1809.00812.pdf
  • Ubuntu dialogue corpus
    • http://dataset.cs.mcgill.ca/ubuntu-corpus-1.0/
  • Conversational AI datasets
    • https://www.topbots.com/conversational-ai-datasets/
  • The Dialogue Dodecathlon
    • Paper : https://arxiv.org/pdf/1911.03768.pdf

What can be incorporated to make the process of dataset generation simpler?

  • As some papers have already pointed out, creating our own dataset can be laborious and demanding.
  • In addition, I believe Twitter conversations are a very good place to start.
  • Filtering for posts that contain images can be really helpful (a rough collection sketch follows this list).
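
A rough sketch of what filtering Twitter for image-based posts could look like, assuming Twitter API v2 access via tweepy; the bearer token, query, and field choices below are assumptions, and operators such as has:images may require elevated API access.

```python
# Rough sketch: collecting image-based tweets with tweepy (Twitter API v2).
# The query, fields, and access level are assumptions for illustration only.
import tweepy

client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")

# `has:images` keeps only tweets with image attachments; `-is:retweet` drops retweets.
response = client.search_recent_tweets(
    query="summarization has:images -is:retweet lang:en",
    expansions=["attachments.media_keys"],
    media_fields=["url", "type"],
    tweet_fields=["conversation_id", "created_at"],
    max_results=100,
)

for tweet in response.data or []:
    # conversation_id groups a tweet with its replies, which is what we need
    # to reconstruct conversation threads later.
    print(tweet.conversation_id, tweet.text[:80])
```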

How to get the ground truth summaries?

  • There is still no shortcut for obtaining ground-truth summaries.
  • Unsupervised learning to the rescue! A minimal bootstrap sketch follows.
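
One way to bootstrap without references is to generate silver-standard summaries with a pretrained model and refine them later. Below is a minimal sketch using a Hugging Face summarization pipeline; the model choice and example text are placeholders, not something proposed by the papers above.

```python
# Minimal sketch: silver-standard summaries from a pretrained model
# (model choice and input text are assumptions for illustration only).
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

conversation = (
    "User A: The battery on this camera dies after about an hour of recording. "
    "User B: Same here, I had to buy two spares. "
    "User C: Mine lasts longer, but only if I turn off the screen."
)

# Use the generated summary as a noisy reference until human-written summaries are collected.
summary = summarizer(conversation, max_length=60, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```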