NLP News Cypher | 11.10.19

Echoes from EMNLP and GPT-2 Strikes Back

--

Wow, what a week it was. The EMNLP conference gave us many treats to chew on such as the growing popularity of cross-lingual learning and the continued adoption of knowledge graphs in language models.

Because of all this action, this week’s Cypher will be a bit longer than usual.

🤯 EMNLP 2019 🤯

New QA Leaderboard Attempting to Mitigate SQuAD Problems

GPT-2 Doesn’t Bring Armageddon

Chollet’s New Formulation of Intelligence

Unsupervised Cross-lingual Representation Learning

Compute Growth Goes Hyperbolic

What were some of the top keywords in EMNLP papers?

🤯EMNLP 2019🤯

Stephen Mayhew et al. were live-tweeting during the conference (thank you) and sharing all the action. Here are a few threads that caught our eye:

1. Chris Manning discusses the GQA dataset, which takes natural language questions generated from scene graphs (based on the Visual Genome project), and shares new leaderboard results from his Neural State Machine paper, to be presented at NeurIPS next month. Full thread and link to GQA below:

2. The Allen Institute's Matt Gardner shares his slides on the limitations of the reading comprehension task in NLP. He proposes an open reading benchmark that can evaluate multiple reading comprehension problems (e.g. sentence-level linguistic structure, Discrete Reasoning Over Paragraphs, question-based coreference resolution, Reasoning Over Paragraph Effects in Situations, time, grounding, and others) all at once. Slides:

Allen Institute

3. If you haven't heard of GNNs (Graph Neural Networks), you should get familiar. Below is the presentation "Graph Neural Networks for Natural Language Processing" by Shikhar Vashishth, Naganand Yadati, and Partha Talukdar. Warning: it's 315 slides long. Great work!

Github: https://github.com/svjan5/GNNs-for-NLP
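The core operation these slides build on is message passing over a graph. Here is a minimal sketch of one GCN-style layer with NumPy, on a toy token graph; the adjacency matrix and dimensions are illustrative assumptions, not taken from the tutorial:

```python
import numpy as np

np.random.seed(0)

# Toy graph over 4 tokens, e.g. edges from a dependency parse (illustrative)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

A_hat = A + np.eye(4)                      # add self-loops so a node keeps its own features
D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # row-normalize by node degree

H = np.random.randn(4, 8)   # node (token) feature vectors
W = np.random.randn(8, 8)   # learnable weight matrix

# One message-passing step: average neighbor features, transform, apply ReLU
H_next = np.maximum(0, D_inv @ A_hat @ H @ W)
print(H_next.shape)  # (4, 8)
```

Stacking a few such layers lets each token's representation absorb information from multi-hop neighbors in the graph, which is the basic idea behind applying GNNs to NLP structures like dependency trees and knowledge graphs.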

4. My prayers were answered when Michael Galkin summarized all the knowledge graph insights from EMNLP. I won't even bother discussing it since he did such a great job in the column below. Part 2 is dropping soon!

New QA Leaderboard Attempting to Mitigate SQuAD Problems

IBM Research introduces TechQA, a new leaderboard for enterprise question answering systems based on questions posted in IBM DeveloperWorks. A sobering insight we already knew:

“Natural Questions was created by harvesting users’ questions of Google’s search engine and then finding answers by using turkers. When a SQuAD system is tested on the Natural Questions leaderboard the F measure drops dramatically to 6% (on short answers — it is 2% for a SQuAD v1.1 system) illustrating the brittleness of SQuAD trained systems.”

GPT-2 Doesn’t Bring Armageddon

OpenAI finally unveiled their 1.5-billion-parameter transformer to the world. They also released a model for detecting AI-written text, which thinks everything I write is AI-generated. 😂😂

…55 minutes later, Adam King, the creator of Talktotransformer.com, put GPT-2 up. 🧐🧐

… 80 minutes later, Hugging Face put it up…🧐🧐

Me:

Chollet’s New Formulation of Intelligence

François Chollet (of Keras fame) dropped his thesis on defining and measuring intelligence, along with a new evaluation dataset called ARC (the Abstraction and Reasoning Corpus). Apparently he had been working on this for the past two years.

“ARC can be seen as a general artificial intelligence benchmark, as a program synthesis benchmark, or as a psychometric intelligence test. It is targeted at both humans and artificially intelligent systems that aim at emulating a human-like form of general fluid intelligence.”
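Each ARC task gives a handful of input/output grid demonstrations, and the solver must induce the transformation and apply it to a held-out test grid. A minimal sketch of what that looks like, using an invented toy task (the grids and the "mirror" rule here are illustrative assumptions, not real ARC data):

```python
# A tiny ARC-style task: each pair maps an input grid to an output grid,
# with cells as integers 0-9. The hidden rule here is "mirror each row".
task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 4], [0, 5]], "output": [[4, 3], [5, 0]]},
    ],
    "test": [
        {"input": [[6, 0], [0, 7]]},
    ],
}

def mirror(grid):
    """Candidate program: flip each row left-to-right."""
    return [row[::-1] for row in grid]

# A solver must induce the rule from the few train pairs alone...
assert all(mirror(p["input"]) == p["output"] for p in task["train"])

# ...then apply it to the unseen test input
print(mirror(task["test"][0]["input"]))  # → [[0, 6], [7, 0]]
```

The point of the benchmark is that each task uses a different rule, and only a few demonstrations are given, so memorizing training data doesn't help; the system needs something closer to program synthesis or fluid reasoning.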

Unsupervised Cross-lingual Representation Learning

Last week we shared a picture we took during Sebastian Ruder's talk at NYU. In his blog post, he shares some of the slides he used and more:

Compute Growth Goes Hyperbolic

Nothing to see here, move along…

This column is a weekly round-up of NLP news and code drops from researchers worldwide.

Follow us on Twitter for more Code & Demos: @Quantum_Stat

www.quantumstat.com


Written by Ricky Costa

Subscribe to the NLP Cypher newsletter for the latest in NLP & ML code/research. 🤟
