NLP News Cypher | 11.24.19
Turkey, Set, Go…
This Week:
🤗 and the French Connection
A Montage of Cheat Sheets
Walmart Woos Siri (Into Buying Groceries)
Yann LeCun Welcomes SONY to the Party!
You Stay Classy San Diego…
New Cross-Lingual Question Answering Benchmark
BERT’s Big Squeeze
NLU is Hard!
🤗 and the French Connection
The French RoBERTa, aka CamemBERT, is now part of Hugging Face’s Transformers library.
The transformer achieves state-of-the-art (SOTA) results on several NLP downstream tasks: part-of-speech tagging, dependency parsing, named-entity recognition, and natural language inference in French.
Paper:
This ex-CERN dude wants an Italian BERT 😂😂
A Montage of Cheat Sheets
Need a cheat sheet for data science or ML? Thanks to this fellow, the biggest payload of cheat sheets in the galaxy, covering several programming languages and use cases, is easily accessible on GitHub.
Check it out:
Walmart Woos Siri (Into Buying Groceries)
One of the biggest retailers on planet Earth is turning to conversational AI. This past week, Walmart announced a partnership with Apple’s Siri for peeps looking to buy groceries online; the service is called Walmart Voice Order. This comes after McDonald’s September acquisition of its own conversational AI company, Apprente. You could say that conversational AI is driving the most M&A 💰💰💰 at the moment.
Press Release:
Yann LeCun Welcomes SONY to the Party!
Why?
Reason: SONY just joined the other tech giants in the race to build AI (cough*SKYNET*cough) by opening its own AI research lab.
“With this move, the Japanese consumer electronics giant intends to go head-to-head with Google and Facebook, competing for AI talent and projects, and targeting a much bigger role in an ever-accelerating global AI race.”
— eetimes.com
in other words…
You Stay Classy San Diego!
AI is already being used by news organizations. This past week, a huge survey was released on how news organizations use and feel about AI technology in the newsroom. Below is a list of the most active areas:
• News-gathering: sourcing of information, story idea generation, identifying trends, investigations, event or issue monitoring, extracting information or content.
• News production: content creation, editing, packaging for different formats and platforms, text, image and video creation, re-purposing content for different audiences.
• News distribution: personalization, marketing, finding audiences, understanding user behavior, monetization/subscriptions.
Report:
New Cross-Lingual Question Answering Benchmark
Facebook AI released a new benchmark, called MLQA, for extractive cross-lingual QA across several languages. A cool aspect is that the questions are built in parallel across languages, so you can measure the delta in your model’s performance from one language to another.
Metadata:
MLQA contains 12,000 QA instances in English and more than 5,000 in each of six other languages: Arabic, German, Hindi, Spanish, Vietnamese, and Simplified Chinese.
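To make the "delta across languages" idea concrete, here is a minimal sketch of how one might score a model per language and compare each language against English. It assumes a SQuAD-style token-overlap F1 with plain whitespace tokenization, which is a simplification (it would not work as-is for Chinese, for example). The function names and the toy examples are hypothetical and are not taken from MLQA or its evaluation script.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer span."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; only one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def per_language_f1(examples):
    """Average F1 per language; each example is (language, prediction, gold)."""
    totals, counts = Counter(), Counter()
    for lang, prediction, gold in examples:
        totals[lang] += token_f1(prediction, gold)
        counts[lang] += 1
    return {lang: totals[lang] / counts[lang] for lang in counts}


def delta_vs_reference(scores, reference="en"):
    """Score difference of every other language relative to the reference."""
    return {
        lang: score - scores[reference]
        for lang, score in scores.items()
        if lang != reference
    }


# Hypothetical toy data: the same question answered in three languages.
toy_examples = [
    ("en", "the eiffel tower", "eiffel tower"),
    ("de", "eiffelturm", "der eiffelturm"),
    ("es", "torre", "la torre eiffel"),
]

scores = per_language_f1(toy_examples)
deltas = delta_vs_reference(scores)
print(scores)
print(deltas)
```

A negative delta means the model does worse in that language than in English on the same parallel questions.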
Article:
BERT’s Big Squeeze
Expensive VRAM is not cool! So let’s squeeze BERT down and keep most of its performance. Better yet, let’s collect all of the research attempting to compress BERT and list it for everyone to see! 😱
Mitch Gordon’s web page indexes research papers on BERT compression methods such as pruning, weight factorization, knowledge distillation, weight sharing, and quantization. He also notes whether the compression occurred at the pre-training or downstream stage.
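For a feel of what two of these methods do, here is a toy sketch of magnitude pruning and symmetric 8-bit quantization applied to a plain list of weights. It is an illustration under simplifying assumptions, not the method of any paper in the index: real BERT compression works on full weight tensors and usually involves retraining or calibration. All function names and the toy weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric linear quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float weights."""
    return [q * scale for q in quantized]


# Hypothetical toy weights standing in for one row of a weight matrix.
toy_weights = [0.8, -0.05, 0.3, -1.27, 0.01, 0.6]

pruned = magnitude_prune(toy_weights, sparsity=0.5)
quantized, scale = quantize_int8(toy_weights)
restored = dequantize(quantized, scale)

print(pruned)
print(quantized)
print(restored)
```

Pruning saves space by making weights zero, while quantization stores every weight in fewer bits; the restored values differ from the originals by at most half a quantization step.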
Table:
NLU is Hard!
Ever wanted to know how non-techies think about the performance of chatbots? Below are the most common mistakes encountered by citizens worldwide. Turns out NLG is tough, but NLU is tougher.
This is a weekly roundup of NLP news and code drops from researchers worldwide.
Follow our Twitter for complete coverage: @Quantum_Stat