CLIP: Connecting Text and Images


CLIP is trained on a wide variety of images paired with the natural language supervision that is abundantly available on the internet. By design, the network can be instructed in natural language to perform a wide variety of classification benchmarks without directly optimizing for any benchmark's performance.

  • Paper link: https://openai.com/blog/clip/
  • Model: Pre-training + zero-shot prediction.

Summary

The method uses an abundantly available source of supervision: the text paired with images found across the internet. This data is used to create the following proxy training task for CLIP: given an image, predict which of a set of 32,768 randomly sampled text snippets was actually paired with it in the dataset.
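
This proxy task amounts to a symmetric contrastive objective over image-text similarities within a batch. Below is a minimal PyTorch sketch of that idea, with random tensors standing in for encoder outputs and a small batch of 8 standing in for the paper's 32,768 candidates; the function name, feature dimension, and temperature value are illustrative assumptions, not CLIP's exact implementation.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_features, text_features, temperature=0.07):
    """Symmetric cross-entropy over image-text similarities (CLIP-style sketch)."""
    # L2-normalize embeddings so dot products become cosine similarities
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # Pairwise similarity logits: [batch, batch]
    logits = image_features @ text_features.t() / temperature

    # The i-th image matches the i-th text; all other snippets are negatives
    targets = torch.arange(logits.size(0))

    loss_i2t = F.cross_entropy(logits, targets)      # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image direction
    return (loss_i2t + loss_t2i) / 2

# Stand-in encoder outputs for a batch of 8 image-text pairs (hypothetical dim 512)
image_feats = torch.randn(8, 512)
text_feats = torch.randn(8, 512)
print(clip_contrastive_loss(image_feats, text_feats))
```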

Important: at test time CLIP acts as a zero-shot classifier. The candidate class names are embedded as natural-language prompts, and the prompt whose embedding best matches the image embedding is chosen as the prediction.
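
A minimal sketch of this zero-shot prediction step, assuming the openai/CLIP package (`pip install git+https://github.com/openai/CLIP.git`); the image path, label set, and prompt template here are placeholders for illustration.

```python
import torch
import clip
from PIL import Image

# Load a pre-trained CLIP model (ViT-B/32 checkpoint)
model, preprocess = clip.load("ViT-B/32", device="cpu")

# Build a "classifier" purely from class names via a prompt template
labels = ["dog", "cat", "car"]  # hypothetical label set
prompts = clip.tokenize([f"a photo of a {c}" for c in labels])

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # hypothetical image path

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(prompts)

    # Cosine similarity between image and each prompt, softmaxed into class probabilities
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(dict(zip(labels, probs[0].tolist())))
```

Because the classifier is built from text alone, swapping in a new label set requires no retraining: only the prompt list changes.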