NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER
The NLP Cypher | 12.12.21
Magnus
Is Moore’s Law Finito?
NeurIPS Research Papers by Institution
Here’s a collection of NeurIPS papers from your favorite big-tech companies and academic institutions.
A New and Blazing Fast WordPiece Tokenizer
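For context, here’s a minimal sketch of the classic greedy longest-match-first WordPiece segmentation that the linked work speeds up. The toy vocabulary is made up; the paper’s LinMaxMatch algorithm produces the same output in linear time using a trie with failure links.

# Minimal sketch of WordPiece's greedy longest-match-first segmentation.
# The vocabulary and "##" continuation prefix follow BERT's convention.
VOCAB = {"un", "##aff", "##able", "aff", "able"}

def wordpiece(word, vocab=VOCAB, unk="[UNK]"):
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:                      # longest match first
            cand = word[start:end]
            if start > 0:
                cand = "##" + cand              # continuation prefix
            if cand in vocab:
                piece = cand
                break
            end -= 1
        if piece is None:
            return [unk]                        # no match: unknown token
        pieces.append(piece)
        start = end
    return pieces

print(wordpiece("unaffable"))  # ['un', '##aff', '##able']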
GLaM | 1.2 Trillion Param Sparse Model
“The Generalist Language Model (GLaM) [is] a trillion weight model that can be trained and served efficiently (in terms of computation and energy use) thanks to sparsity, and achieves competitive performance on multiple few-shot learning tasks. GLaM’s performance compares favorably to a dense language model, GPT-3 (175B), with significantly improved learning efficiency across 29 public NLP benchmarks in seven categories, spanning language completion, open-domain question answering, and natural language inference tasks.”
GLaM vs. GPT-3 on NLG and NLU Tasks
Awesome Takeaway:
This large sparse model is competitive with dense counterparts while training on much less data and consuming less energy.
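To see why sparsity buys this, here’s a minimal sketch of top-2 expert routing, the mechanism behind GLaM-style mixture-of-experts layers (dimensions and gating details are simplified assumptions; the real model adds load balancing and shards experts across devices). Only two experts run per token, so parameters grow with the expert count while per-token compute stays roughly flat.

import torch
import torch.nn.functional as F

# Illustrative top-2 mixture-of-experts routing, not GLaM's implementation.
n_experts, d_model, d_ff = 8, 16, 32
experts = [torch.nn.Sequential(
    torch.nn.Linear(d_model, d_ff), torch.nn.ReLU(),
    torch.nn.Linear(d_ff, d_model)) for _ in range(n_experts)]
gate = torch.nn.Linear(d_model, n_experts)

def moe_layer(x):                                # x: (tokens, d_model)
    scores = F.softmax(gate(x), dim=-1)
    top_w, top_i = scores.topk(2, dim=-1)        # top-2 experts per token
    top_w = top_w / top_w.sum(-1, keepdim=True)  # renormalize the pair
    out = torch.zeros_like(x)
    for t in range(x.size(0)):
        for w, i in zip(top_w[t], top_i[t]):     # only 2 of 8 experts run
            out[t] += w * experts[int(i)](x[t])
    return out

print(moe_layer(torch.randn(4, d_model)).shape)  # torch.Size([4, 16])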
Information Extraction from Scanned Receipts: Fine-tuning LayoutLM on SROIE
An OCR demo with LayoutLM fine-tuned for information extraction on receipt data.
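A minimal sketch of the setup using the Hugging Face LayoutLM classes. The OCR words, bounding boxes, and the five-label scheme below are placeholder assumptions; the demo itself fine-tunes on the SROIE annotations.

import torch
from transformers import LayoutLMTokenizer, LayoutLMForTokenClassification

tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMForTokenClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased", num_labels=5)  # label count assumed

words = ["TOTAL", "23.90"]                            # from an OCR engine
boxes = [[338, 852, 412, 880], [430, 852, 502, 880]]  # 0-1000 normalized

encoding = tokenizer(" ".join(words), return_tensors="pt")
# one box per wordpiece; [CLS] and [SEP] get dummy boxes
token_boxes = [[0, 0, 0, 0]]
for word, box in zip(words, boxes):
    token_boxes += [box] * len(tokenizer.tokenize(word))
token_boxes += [[1000, 1000, 1000, 1000]]

outputs = model(input_ids=encoding["input_ids"],
                attention_mask=encoding["attention_mask"],
                bbox=torch.tensor([token_boxes]))
pred = outputs.logits.argmax(-1)                      # one label id per token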
AI Predictions Survey
http://www.pwc.com/us/en/tech-effect/ai-analytics/ai-predictions.html
Improving GitHub Search
Gopher — DeepMind’s Language Model
GauGAN2 | Photorealistic Text 2 Image
Transformers From Scratch
“I procrastinated a deep dive into transformers for a few years. Finally the discomfort of not knowing what makes them tick grew too great for me. Here is that …”
https://e2eml.school/transformers.html
PyTorch | Julia (but not exactly like Julia)
“When trying to predict how PyTorch would itself get disrupted, we used to joke a bit about the next version of PyTorch being written in Julia. This was not very serious: a huge factor in moving PyTorch from Lua to Python was to tap into Python’s immense ecosystem (an ecosystem that shows no signs of going away) and even today it is still hard to imagine how a new language can overcome the network effects of Python.”
Text Generation Decoding Tutorial | Top-K and Top-P
One of the most intuitive tutorials out there.
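The two knobs in one snippet, using Hugging Face’s generate() (the model choice and parameter values here are just illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("The meaning of life is", return_tensors="pt").input_ids
output = model.generate(
    input_ids,
    do_sample=True,   # sample instead of greedy/beam search
    top_k=50,         # keep only the 50 most likely next tokens
    top_p=0.95,       # then keep the smallest set covering 95% probability
    max_length=40,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Top-K caps the candidate pool regardless of the distribution’s shape; Top-P (nucleus sampling) adapts the cutoff to the model’s confidence, which is why the two are often combined.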
Punctuation Model
Attention Neural Networks Slides
Slides
https://www.dropbox.com/s/rahrg6s7w4vud9f/lecture12_attention_neural_networks.pdf?dl=0
Code
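If you just want the core equation from the slides in runnable form, here’s a minimal scaled dot-product attention (shapes are arbitrary):

import torch
import torch.nn.functional as F

# Each query attends to all keys; softmax weights mix the values.
def attention(q, k, v):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # (..., n_q, n_k)
    weights = F.softmax(scores, dim=-1)            # rows sum to 1
    return weights @ v                             # weighted sum of values

q = torch.randn(1, 3, 8)   # 3 queries of dimension 8
k = torch.randn(1, 5, 8)   # 5 key/value pairs
v = torch.randn(1, 5, 8)
print(attention(q, k, v).shape)  # torch.Size([1, 3, 8])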
Lemmatization in spaCy
spaCy’s new lemmatizer is super accurate and blows XLM-RoBERTa out of the water! This blog post presents its inner workings, benchmarks, and quick-start snippets. 😎
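The everyday API looks like this; the post’s trainable lemmatizer plugs into the same pipeline, so token.lemma_ is all you touch (the model choice below is an assumption):

import spacy

# Lemmas via the standard spaCy pipeline; a trainable lemmatizer slots in
# as a pipeline component and is read the same way.
nlp = spacy.load("en_core_web_sm")
doc = nlp("The striped bats were hanging on their feet.")
print([(token.text, token.lemma_) for token in doc])
# e.g. ('bats', 'bat'), ('were', 'be'), ('hanging', 'hang'), ...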
Awesome Papers 📚
Repo Cypher 👨‍💻
A collection of recently released repos that caught our 👁
Coqui TTS
TTS is a library for advanced text-to-speech generation. It comes with pretrained models, tools for measuring dataset quality, and is already used in 20+ languages.
Causal Distillation for Language Models
Distillation library that uses a third objective that encourages the student to imitate the causal computation process of the teacher through interchange intervention training (IIT).
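A conceptual sketch of a single interchange intervention, the operation at the heart of IIT. This is not the repo’s code: the tiny networks and the choice to swap half the hidden units are assumptions for illustration.

import torch
import torch.nn.functional as F

class TwoLayer(torch.nn.Module):
    def __init__(self, d=16, n_classes=4):
        super().__init__()
        self.l1 = torch.nn.Linear(d, d)
        self.l2 = torch.nn.Linear(d, n_classes)

    def forward(self, x, swap=None):
        h = torch.relu(self.l1(x))
        if swap is not None:  # interchange intervention: overwrite units
            h = torch.cat([swap, h[:, swap.size(1):]], dim=1)
        return self.l2(h), h

teacher, student = TwoLayer(), TwoLayer()
base, source = torch.randn(8, 16), torch.randn(8, 16)

with torch.no_grad():
    _, t_h = teacher(source)                    # teacher hidden on source
    t_logits, _ = teacher(base, swap=t_h[:, :8])  # counterfactual output

_, s_h = student(source)                        # student hidden on source
s_logits, _ = student(base, swap=s_h[:, :8])

# student imitates the teacher's *counterfactual* output distribution
iit_loss = F.kl_div(F.log_softmax(s_logits, -1),
                    F.softmax(t_logits, -1), reduction="batchmean")
iit_loss.backward()

The point of the third objective: the student isn’t just matching the teacher’s outputs, it’s matching how those outputs change under surgical edits to the teacher’s internal states.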
NL-Augmenter
NL-Augmenter augments text datasets in several ways, including randomizing names and numbers, changing style/syntax, paraphrasing, and KB-based paraphrasing.
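For flavor, here’s one such perturbation written from scratch; this sketches the idea, not NL-Augmenter’s interface.

import random
import re

# Randomize every number in a text (digit count of the replacement may
# differ from the original; a real transformation would preserve format).
def randomize_numbers(text, rng=random.Random(0)):
    return re.sub(r"\d+",
                  lambda m: str(rng.randint(0, 10 ** len(m.group()) - 1)),
                  text)

print(randomize_numbers("Flight 370 departs at 9 from gate 22."))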
Deepparse
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning.
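Quick start, following the README (the model name and exact output format may vary by version):

from deepparse.parser import AddressParser

# Pretrained multinational address tagger; "bpemb" selects the
# subword-embedding model variant.
address_parser = AddressParser(model_type="bpemb")
parsed = address_parser("350 rue des Lilas Ouest Québec Québec G1L 1B6")
print(parsed)  # tagged components: StreetNumber, StreetName, City, ...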
CALVIN — A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
A simulated benchmark to learn long-horizon language-conditioned tasks. The aim is to make it possible to develop agents that can solve many robotic manipulation tasks over a long horizon, from onboard sensors, and specified only via human language.
Hashformers
Library for hashtag segmentation: the task of automatically inserting the missing spaces between the words in a hashtag.
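The task in miniature, as a toy dynamic-programming segmenter over a made-up vocabulary (hashformers itself scores candidate splits with transformer language models):

# Recover "we need a national park" from "#weneedanationalpark".
VOCAB = {"we", "need", "a", "national", "park"}

def segment(hashtag, vocab=VOCAB):
    text = hashtag.lstrip("#").lower()
    best = {0: []}                       # best split of text[:i]
    for i in range(1, len(text) + 1):
        for j in range(i):
            if j in best and text[j:i] in vocab:
                best[i] = best[j] + [text[j:i]]
                break
    return " ".join(best.get(len(text), [text]))

print(segment("#weneedanationalpark"))  # we need a national park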