Weekly NLP News Cypher
10.25.19
T5, Google’s New Transformer
Facebook’s RoBERTa Distilled by Hugging Face
Multiprocessing vs. Threading
Fine-Tuning BERT, a Tutorial
Microsoft’s UniLM AI Improves Summarization
T5 | The New SOTA Transformer from Google
A new entrant in the transformer school of hard knocks was unveiled yesterday by Google: T5. The new transformer set a new SOTA on the SuperGLUE leaderboard with a score of 88.9, just 0.9 points shy of human performance.
The model comes in 5 sizes:
- T5-Small (60 million params)
- T5-Base (220 million params)
- T5-Large (770 million params)
- T5-3B (3 billion params)
- T5-11B (11 billion params)
Github:
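If you want to poke at the smaller checkpoints, here's a minimal sketch of running T5 in its text-to-text format with Hugging Face's transformers library. It assumes a transformers release that ships T5 support and uses the "t5-small" checkpoint name; this isn't Google's own training code, just a quick way to kick the tires.

```python
# Minimal sketch: run T5 in its text-to-text format via Hugging Face transformers.
# Assumes a transformers version with T5 support and the "t5-small" checkpoint.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 frames every task as text-in, text-out, so the task is chosen with a text prefix.
text = "summarize: T5 reframes every NLP task as feeding text in and getting text out."
input_ids = tokenizer.encode(text, return_tensors="pt")

summary_ids = model.generate(input_ids, max_length=30, num_beams=4, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```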
Facebook AI’s RoBERTa Distilled by Hugging Face
Smaller models are easier to deploy and cost less $$ in cloud compute.
“95% of RoBERTa-base's performance on GLUE, twice as fast as RoBERTa while being 35% smaller.” — Hugging Face
Below are the results on the GLUE dev sets:
Github:
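Trying the distilled checkpoint takes a few lines with the transformers library. A minimal sketch, assuming a release that includes the "distilroberta-base" weights:

```python
# Minimal sketch: load the distilled RoBERTa checkpoint and encode a sentence.
# Assumes a transformers release that includes the "distilroberta-base" weights.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModel.from_pretrained("distilroberta-base")

inputs = tokenizer("Distillation trades a little accuracy for a lot of speed.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Last hidden states: one vector per token, ready for a downstream classifier head.
print(outputs[0].shape)  # (batch_size, sequence_length, hidden_size)
```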
Multiprocessing vs. Threading
Understanding the difference between multiprocessing and threading is important when deploying machine learning models. FloydHub's new article goes in-depth:
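The short version: Python threads share one interpreter and fight over the GIL on CPU-bound work, while processes sidestep the GIL at the cost of extra memory and inter-process communication. A minimal sketch of the difference on a CPU-bound task (timings will vary by machine):

```python
# Minimal sketch: CPU-bound work scales with processes, not threads, because of the GIL.
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n):
    # Busy-loop summation to keep a core occupied.
    return sum(i * i for i in range(n))

def run(executor_cls, label):
    start = time.time()
    with executor_cls(max_workers=4) as executor:
        list(executor.map(cpu_bound, [5_000_000] * 4))
    print(f"{label}: {time.time() - start:.2f}s")

if __name__ == "__main__":
    run(ThreadPoolExecutor, "threads")     # roughly serial: threads contend for the GIL
    run(ProcessPoolExecutor, "processes")  # near-linear speedup on a multi-core machine
```

For I/O-bound work (say, calling out to a model server over HTTP), the picture flips and threads are usually the cheaper choice.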
Fine-Tuning BERT, a Tutorial
Chris McCormick's blog shows us how to use Hugging Face's PyTorch library to fine-tune BERT for sentence classification:
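The tutorial covers the full data prep and evaluation; below is only a rough sketch of the core training loop. It assumes the transformers package (class names may differ slightly from the older pytorch-pretrained-bert naming the tutorial uses), and the tiny in-line dataset is just a stand-in for real data.

```python
# Minimal sketch: fine-tune BERT for binary sentence classification with Hugging Face.
# Assumes the transformers package; the two-example dataset is a placeholder.
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

sentences = ["the movie was great", "the movie was terrible"]
labels = torch.tensor([1, 0])
encodings = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):
    optimizer.zero_grad()
    outputs = model(**encodings, labels=labels)
    loss = outputs[0]  # cross-entropy loss over the two classes
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```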
Microsoft’s UniLM AI Improves Summarization
Microsoft's new model, UniLM, is pre-trained with unidirectional, sequence-to-sequence, and bidirectional prediction objectives, which helps improve performance on several NLP tasks, including summarization. Code and pre-trained models can be found here:
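The trick is that a single shared Transformer is pre-trained under all three objectives, and the objectives differ only in the self-attention mask that controls which tokens can see which. A rough illustrative sketch of those masks (not code from the UniLM repo):

```python
# Illustrative sketch of UniLM-style self-attention masks (1 = may attend, 0 = blocked).
# Not taken from the microsoft/unilm repo; it only shows how the three objectives differ.
import torch

def bidirectional_mask(n):
    # BERT-style: every token sees every other token.
    return torch.ones(n, n)

def unidirectional_mask(n):
    # GPT-style: each token sees only itself and the tokens to its left.
    return torch.tril(torch.ones(n, n))

def seq2seq_mask(src_len, tgt_len):
    # Source tokens attend bidirectionally among themselves;
    # target tokens attend to the whole source plus their own left context.
    n = src_len + tgt_len
    mask = torch.zeros(n, n)
    mask[:, :src_len] = 1  # everyone may see the source
    mask[src_len:, src_len:] = torch.tril(torch.ones(tgt_len, tgt_len))  # causal on target
    return mask  # source rows never see target columns (left at zero)

print(seq2seq_mask(3, 2))
```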
This is a weekly round-up of NLP News and Code drops from Techies worldwide.
Follow us on Twitter for more NLP News, Code & Demos: @Quantum_Stat