The NLP Cypher | 12.26.21

AI Summer is Out Forever

Merry Christmas 🎄 for those celebrating. And Happy New Year!

Even OpenAI is feeling the holiday spirit: they open-sourced their photorealistic GLIDE model several days ago.

The release includes three notebooks:

  • The text2im notebook shows how to use GLIDE (filtered) with classifier-free guidance to produce images conditioned on text prompts.
  • The inpaint notebook shows how to use GLIDE (filtered) to fill in a masked region of an image, conditioned on a text prompt.
  • The clip_guided notebook shows how to use GLIDE (filtered) + a filtered noise-aware CLIP model to produce images conditioned on text prompts.
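For intuition on the classifier-free guidance mentioned in the text2im notebook, here is a minimal sketch of the guidance step in plain Python. It is not taken from the GLIDE codebase: the function name and the toy numbers are made up for illustration, and a real model would predict noise tensors rather than short lists. The idea is to run the model twice, once with the text prompt and once without, then push the result away from the unconditional prediction and toward the conditioned one.

```python
def classifier_free_guidance(eps_cond, eps_uncond, guidance_scale):
    """Combine conditional and unconditional noise predictions.

    guidance_scale = 0 returns the unconditional prediction,
    guidance_scale = 1 returns the conditional prediction,
    and values above 1 extrapolate further toward the prompt.
    """
    return [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


# Hypothetical stand-ins for a diffusion model's two noise predictions.
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [1.0, 1.0, 0.0]

guided = classifier_free_guidance(eps_cond, eps_uncond, guidance_scale=3.0)
print(guided)
```

Larger guidance scales usually trade sample diversity for closer adherence to the prompt.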

Parallel Inference with Adapters

A new feature in the adapters library for running inference with several adapters simultaneously. (Not sure if "parallizing" is a real word; I just made it up.)

Colab of the Week 🎉🥳

SetFit: Outperforming GPT-3 in Few-Shot Text-Classification


No more Transformer Diagrams 😂

Abhishek maps boring model diagrams to code for building intuition!

AGI and the Gov’t Apathy


Periodic Table of NLP Tasks

Streamlit demo…


JellyFish is a library for approximate & phonetic matching of strings.

Algos used…

For string comparison:

  • Levenshtein Distance
  • Damerau-Levenshtein Distance
  • Jaro Distance
  • Jaro-Winkler Distance
  • Match Rating Approach Comparison
  • Hamming Distance

For phonetic encoding:

  • American Soundex
  • Metaphone
  • NYSIIS (New York State Identification and Intelligence System)
  • Match Rating Codex
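To make two of the string-comparison algorithms above concrete, here is a small pure-Python sketch of Levenshtein and Hamming distance. These are textbook versions written for illustration, not JellyFish's own implementation, and the library's handling of edge cases (for example, strings of unequal length in Hamming distance) may differ.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("classic Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


print(levenshtein_distance("kitten", "sitting"))
print(hamming_distance("karolin", "kathrin"))
```

Levenshtein distance works on strings of any length, while the classic Hamming distance only counts substitutions, which is why it is restricted to equal-length inputs here.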

New Speech Models from Microsoft on 🤗 Hub


… a model pruning toolkit for pre-trained language models.

Deep Learning in NLP YouTube Lectures

Play the Shannon Game With Language Models

A new summarization evaluation metric called the Shannon Score is proposed. It performs the Shannon Game with a language model.
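A rough sketch of how a score like this could be computed, with a loud caveat: the formula below is my reading of the idea, not something stated above, so check it against the paper before relying on it. The assumption is that a language model scores the document three ways (on its own, given the summary, and given the document itself as an upper bound) and the score is the fraction of the possible improvement that the summary achieves. The function names and the log-likelihood numbers are hypothetical.

```python
def information_difference(logp_doc: float, logp_doc_given_context: float) -> float:
    """How much a context raises the model's log-likelihood of the document."""
    return logp_doc_given_context - logp_doc


def shannon_score(logp_doc: float,
                  logp_doc_given_summary: float,
                  logp_doc_given_doc: float) -> float:
    """Assumed form: improvement from the summary, normalised by the
    improvement from conditioning on the full document itself."""
    upper_bound = information_difference(logp_doc, logp_doc_given_doc)
    if upper_bound == 0:
        raise ValueError("conditioning on the document gave no improvement")
    return information_difference(logp_doc, logp_doc_given_summary) / upper_bound


# Made-up log-likelihoods, standing in for values a language model would produce.
score = shannon_score(
    logp_doc=-100.0,
    logp_doc_given_summary=-70.0,
    logp_doc_given_doc=-40.0,
)
print(score)
```

Under this assumed form, a summary that helps the model as much as the full document would score close to 1, and an uninformative one would score close to 0.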




Next Level


Papers to Read 📚

Repo Cypher 👨‍💻

A collection of recently released repos that caught our 👁

PECOS — Predictions for Enormous and Correlated Output Spaces

PECOS is a machine learning framework for fast learning and inference on problems with large output spaces, such as extreme multi-label ranking (XMR) and large-scale retrieval.

Connected Papers 📈

Exploring Neural Models for Query-Focused Summarization

A systematic exploration of neural approaches to query summarization, considering two general classes of methods: two-stage extractive-abstractive solutions and end-to-end models.

Connected Papers 📈

Randomised Controlled Trial Abstract Result Tabulator

RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into tables through jointly extracting intervention, outcome and outcome measure entities and their relations.

Connected Papers 📈




Subscribe to the NLP Cypher newsletter for the latest in NLP & ML code/research. 🤟

Ricky Costa
