The Titan’s Goblet | Cole

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

The NLP Cypher | 12.26.21

Merry Christmas 🎄 for those celebrating. And Happy New Year!

Even OpenAI is feeling the holiday spirit: they open-sourced their photorealistic GLIDE model several days ago.

Includes three notebooks:

  • The text2im notebook shows how to use GLIDE (filtered) with classifier-free guidance to produce images conditioned on text prompts.
  • The inpaint notebook shows how to use GLIDE (filtered) to fill in a masked region of an image, conditioned on a text prompt.
  • The clip_guided notebook shows how to use GLIDE (filtered) + a filtered noise-aware CLIP model to produce images conditioned on text prompts.

Parallel Inference with Adapters

A new feature in the adapters library for running inference with several adapters simultaneously. (Not sure if "parallelizing" is a real word; I just made it up.)

Colab of the Week 🎉🥳

SetFit: Outperforming GPT-3 in Few-Shot Text-Classification

Colab

No more Transformer Diagrams 😂

Abhishek maps boring model diagrams to code for building intuition!

AGI and the Gov’t Apathy

lol

Periodic Table of NLP Tasks

Streamlit demo…

JellyFish

JellyFish is a library for approximate & phonetic matching of strings.

Algos used…

For string comparison:

  • Levenshtein Distance
  • Damerau-Levenshtein Distance
  • Jaro Distance
  • Jaro-Winkler Distance
  • Match Rating Approach Comparison
  • Hamming Distance
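
To build some intuition for two of the distances listed above, here is a minimal pure-Python sketch of Hamming and Levenshtein distance. It does not use JellyFish itself; the function names `hamming_distance` and `levenshtein_distance` are chosen for illustration, and the library's own implementations may behave differently on edge cases (for example, strings of unequal length).

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Assumption: this sketch follows the classic definition,
        # which only covers strings of equal length.
        raise ValueError("classic Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # previous[j] is the distance between the prefix of a handled so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein_distance("kitten", "sitting"))
    print(hamming_distance("karolin", "kathrin"))
```

Keeping only two rows of the dynamic-programming table holds memory at O(len(b)) instead of O(len(a) * len(b)), which is usually enough when only the distance, not the edit script, is needed.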

For phonetic encoding:

  • American Soundex
  • Metaphone
  • NYSIIS (New York State Identification and Intelligence System)
  • Match Rating Codex
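
As a rough illustration of what a phonetic encoding does, here is a small pure-Python sketch of American Soundex. It is written from the standard description of the algorithm, not taken from JellyFish; the function name `soundex` and the choice to ignore non-ASCII characters are assumptions of this sketch, and the library's output may differ on unusual input.

```python
# Consonant groups that share a Soundex digit.
_CODES = {}
for _letters, _digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return a 4-character American Soundex code, or "" for empty input."""
    # Assumption: anything that is not an ASCII letter is ignored.
    letters = [c for c in name.upper() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                       # the first letter is kept as-is
    last_code = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                          # H and W do not separate equal codes
        code = _CODES.get(ch, "")             # vowels and Y map to ""
        if code and code != last_code:
            result += code
            if len(result) == 4:
                break
        last_code = code                      # a vowel resets the run of equal codes
    return result.ljust(4, "0")               # pad short codes with zeros


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Ashcraft", "Tymczak"):
        print(name, soundex(name))
```

Names that sound alike, such as "Robert" and "Rupert", end up with the same code, which is what makes the encoding useful for fuzzy name matching.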

New Speech Models from Microsoft on 🤗 Hub

TextPruner

… a model pruning toolkit for pre-trained language models.

Deep Learning in NLP YouTube Lectures

Play the Shannon Game With Language Models

The paper proposes a new summarization evaluation metric, the Shannon Score, which plays the Shannon Game with a language model.

Paper: https://arxiv.org/pdf/2103.10918.pdf

FakeYou

attrs

Next Level

Demo

Papers to Read 📚

https://arxiv.org/pdf/2112.12731.pdf
https://arxiv.org/pdf/2112.10508.pdf
https://arxiv.org/pdf/2112.04426.pdf
https://arxiv.org/pdf/2112.11739.pdf

Repo Cypher 👨‍💻

A collection of recently released repos that caught our 👁

PECOS — Predictions for Enormous and Correlated Output Spaces

PECOS is a machine learning framework for fast learning and inference on problems with large output spaces, such as extreme multi-label ranking (XMR) and large-scale retrieval.

Connected Papers 📈

Exploring Neural Models for Query-Focused Summarization

A systematic exploration of neural approaches to query summarization, considering two general classes of methods: two-stage extractive-abstractive solutions and end-to-end models.

Connected Papers 📈

Randomised Controlled Trial Abstract Result Tabulator

RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into tables through jointly extracting intervention, outcome and outcome measure entities and their relations.

Connected Papers 📈

Ricky Costa

Subscribe to the NLP Cypher newsletter for the latest in NLP & ML code/research. 🤟