Weekly NLP News Cypher

10.25.19

--

T5, Google’s New Transformer

Facebook’s RoBERTa Distilled by Hugging Face

Multiprocessing vs. Threading

Fine-Tuning BERT, a Tutorial

Microsoft’s UniLM AI Improves Summarization

T5 | The New SOTA Transformer from Google

A new entrant in the transformer school of hard knocks, called T5, was unveiled yesterday by Google. The new transformer achieved SOTA performance on the SuperGLUE leaderboard with a score of 88.9, just 0.9 points away from human performance.

The model comes in 5 sizes:

  • T5-Small (60 million params)
  • T5-Base (220 million params)
  • T5-Large (770 million params)
  • T5-3B (3 billion params)
  • T5-11B (11 billion params)

[Image: SuperGLUE leaderboard]
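
For a quick feel of the text-to-text setup, here is a minimal sketch of loading one of these checkpoints and running inference. It assumes the Hugging Face transformers port of T5 ("t5-small") and a recent transformers version, rather than Google's official TensorFlow codebase linked below:

```python
# Minimal sketch: load a T5 checkpoint and run text-to-text inference.
# Assumes the Hugging Face transformers port of T5 ("t5-small"); the
# official release lives in Google's TensorFlow repo linked below.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 frames every task as text-to-text, so the task is named in the prompt.
inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_length=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```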

Github:

Facebook AI’s RoBERTa Distilled by Hugging Face

Smaller models are easier to deploy and cost less $$ in cloud compute.

“95% of RoBERTa-base's performance on GLUE, twice as fast as RoBERTa while being 35% smaller.” — Hugging Face

Below are the dev-set results on GLUE:

[Image: GLUE dev-set results | Hugging Face]
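
For illustration, a minimal sketch of pulling the distilled checkpoint and encoding a sentence, assuming the "distilroberta-base" checkpoint on the Hugging Face model hub and a recent transformers version:

```python
# Minimal sketch: pull the distilled checkpoint from the Hugging Face
# model hub and encode a sentence. "distilroberta-base" is the checkpoint
# name Hugging Face published for this release; a recent transformers
# version is assumed.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModel.from_pretrained("distilroberta-base")

inputs = tokenizer("Distillation trades a little accuracy for a lot of speed.",
                   return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 768)
print(hidden.shape)
```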

Github:

Multiprocessing vs. Threading

Understanding the difference between multiprocessing and threading is important when deploying machine learning models. FloydHub’s new article goes in-depth:
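
As a toy illustration (not taken from the FloydHub article), the snippet below times the same CPU-bound job under a thread pool vs. a process pool; because of Python's GIL, only the process pool should show a real speed-up:

```python
# Toy benchmark: the same CPU-bound job under threads vs. processes.
# Threads share one GIL, so only the process pool should show a real
# speed-up for CPU-bound work; for I/O-bound work, threads are enough.
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n: int) -> int:
    return sum(i * i for i in range(n))

def timed(executor_cls, label: str) -> None:
    start = time.time()
    with executor_cls(max_workers=4) as pool:
        list(pool.map(cpu_bound, [2_000_000] * 4))
    print(f"{label}: {time.time() - start:.2f}s")

if __name__ == "__main__":  # guard required for multiprocessing spawn
    timed(ThreadPoolExecutor, "threads")
    timed(ProcessPoolExecutor, "processes")
```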

Fine-Tuning BERT, a Tutorial

Chris McCormick’s blog shows us how to use Hugging Face’s PyTorch library to fine-tune BERT for sentence classification:
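
The gist of the approach, sketched below (not McCormick's exact code, and assuming a recent transformers version): load a pre-trained BERT with a classification head and train it on labeled sentences.

```python
# Gist of the approach (not McCormick's exact code): load pre-trained BERT
# with a classification head and take one training step on a toy batch.
# Assumes a recent version of the transformers library.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                      num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

batch = tokenizer(["the movie was great", "the movie was terrible"],
                  padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])

model.train()
loss = model(**batch, labels=labels).loss  # cross-entropy over the 2 classes
loss.backward()
optimizer.step()
optimizer.zero_grad()
```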

Microsoft’s UniLM AI Improves Summarization

Microsoft’s new model, UniLM, handles unidirectional, sequence-to-sequence, and bidirectional prediction, which helps improve performance on several NLP tasks. Code and pre-trained models can be found here:
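
As a rough illustration (not code from the UniLM repo), the three prediction modes can be thought of as different self-attention masks applied to one shared Transformer:

```python
# Rough illustration (not code from the UniLM repo): the three prediction
# modes amount to different self-attention masks over one shared Transformer.
import torch

def unilm_masks(src_len: int, tgt_len: int):
    n = src_len + tgt_len
    # Bidirectional (BERT-style): every token sees every token.
    bidirectional = torch.ones(n, n)
    # Unidirectional (GPT-style): each token sees only itself and the past.
    unidirectional = torch.tril(torch.ones(n, n))
    # Sequence-to-sequence: source tokens see the whole source; target
    # tokens see the whole source plus the target prefix generated so far.
    seq2seq = torch.zeros(n, n)
    seq2seq[:src_len, :src_len] = 1
    seq2seq[src_len:, :src_len] = 1
    seq2seq[src_len:, src_len:] = torch.tril(torch.ones(tgt_len, tgt_len))
    return bidirectional, unidirectional, seq2seq

for name, mask in zip(["bidirectional", "unidirectional", "seq2seq"],
                      unilm_masks(src_len=3, tgt_len=2)):
    print(name, mask, sep="\n")
```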

This is a weekly round-up of NLP News and Code drops from Techies worldwide.

Follow us on Twitter for more NLP News, Code & Demos: @Quantum_Stat

www.quantumstat.com

Written by Ricky Costa

Subscribe to the NLP Cypher newsletter for the latest in NLP & ML code/research. 🤟
