NLP News Cypher | 01.05.20

Where Eagles Dare…

Ricky Costa


“Before the first frost, a brilliant flash of blue-green light lit the snow and reminded us that winter was almost here.” — GPT-2 (1st iteration)

It’s 2020, can you believe it? It’s been 19 years since the monolith created a baby in space!

sup homie

But seriously, it is 2020, and in my opinion, one of the best outcomes from the recent advancements in NLP/deep learning is the ease of fine-tuning and inference with only a couple lines of code:

@jmcimula twitter
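To make "a couple lines of code" concrete, here is a minimal sketch using the Hugging Face transformers `pipeline` helper. The choice of library and task is my assumption (it is not necessarily what the tweet above shows), and `format_prediction` and `classify` are hypothetical helper names made up for this example.

```python
def format_prediction(prediction):
    """Render one pipeline result dict as 'LABEL (score)'."""
    return f"{prediction['label']} ({prediction['score']:.2f})"


def classify(texts):
    """Run sentiment analysis on a list of strings.

    Assumes `pip install transformers` plus a backend such as PyTorch.
    The import is done lazily because the package is heavy and the
    default model is downloaded on first use.
    """
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    # The pipeline returns one dict per input with 'label' and 'score' keys.
    return [format_prediction(p) for p in classifier(texts)]
```

Calling `classify(["Happy New Year!"])` would give back one formatted label per input string. The actual inference is still just the two lines inside `classify`.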

Happy New Year!

This Week:

AI Recap For the New Year

The Common Voice

GPT-2 for the Twitter

The Italian BERT

RASA and the Community

Where is AI Going?

Keras for OCR

Yann Goes Deep

AI Recap For the New Year

If you need a recap of all things deep learning, check out this repo, which covers everything from preprocessing to transfer learning in notebooks.

Top models and libraries in use today:

The Common Voice

For those looking to dive into the speech-enabled app world, check out Mozilla’s amazing set of audio datasets (multilingual too!).

GPT-2 for the Twitter

If you are looking to have your GPT-2 text generator fine-tuned on the text of a Twitter account, you first need to have your data arranged in the appropriate format. Max’s repo gives us this, and afterwards you can use his other repo, GPT-2 Simple, to generate the text!
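To give a feel for what "the appropriate format" might look like, here is a rough sketch that cleans a handful of tweets and writes them to a single-column CSV, one tweet per row. This is not the code from Max's repo. The cleaning rules (skip retweets, strip URLs and leading reply mentions), the `tweets` header, and the function names are all my own assumptions for illustration, so check the repos for the real behavior.

```python
import csv
import io
import re

URL_RE = re.compile(r"https?://\S+")
# One or more @mentions at the very start of a tweet (typical for replies).
LEADING_MENTIONS_RE = re.compile(r"^(?:@\w+\s+)+")


def clean_tweet(text):
    """Return a cleaned tweet, or None if it should be skipped."""
    if text.startswith("RT @"):
        return None  # assumption: retweets are not the account's own voice
    text = URL_RE.sub("", text)
    text = LEADING_MENTIONS_RE.sub("", text)
    text = " ".join(text.split())  # collapse runs of whitespace
    return text or None


def tweets_to_csv(tweets, out_file):
    """Write cleaned tweets as a one-column CSV and return how many were kept."""
    writer = csv.writer(out_file)
    writer.writerow(["tweets"])  # assumed header name
    kept = 0
    for raw in tweets:
        cleaned = clean_tweet(raw)
        if cleaned is not None:
            writer.writerow([cleaned])
            kept += 1
    return kept


# Made-up sample data, written to an in-memory buffer instead of a real file.
sample = [
    "RT @someone: not my words",
    "@friend check this out https://example.com/abc",
    "Fine-tuning   GPT-2 is fun",
    "https://example.com/only-a-link",
]
buffer = io.StringIO()
kept_count = tweets_to_csv(sample, buffer)
```

In practice you would pass an open file instead of the buffer, then hand the resulting CSV to GPT-2 Simple for fine-tuning. Its README documents the exact calls.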

GPT-2-simple GitHub:

The Italian BERT

In one of our Cyphers back in November, I joked about how Mr. Di Sipio wanted an Italian BERT:

Well, we have one now! Turns out it’s called GilBERTo (Sorry BERTini)! And its architecture is based on RoBERTa:

GitHub:

RASA and the Community

RASA, the open-source conversational AI platform, now shows how peeps are deploying dialogue systems (aka el chatbots) with its framework. Cool page to see how people are managing the chatbot hype.

Where is AI Going?

Top brass share their thoughts on AI’s path. NLP gets a big shout-out.

For what it’s worth, in 2020 I expect to see more multi-modal learning research (merging pictures/video and text) and newer datasets. I find there is not enough entropy in raw text to model the world. In addition, expect to see more deployments in languages other than English and more symbolic/connectionist integration (aka deep knowledge graphs).

Let’s see what they say:

Keras for OCR

Hey, remember OCR? (pip install pytesseract) Well, text detection matters if you want to convert images of text into digital text. Check out this wonderful repo, which uses a Keras implementation of a Convolutional Recurrent Neural Network (CRNN).

In addition, its performance is robust 👇!


Every Sunday we do a weekly round-up of NLP news and code drops from researchers around the world.

If you enjoyed this article, help us out and share with friends or social media!

For complete coverage, follow our twitter: @Quantum_Stat

www.quantumstat.com
