NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

The NLP Cypher | 10.17.21

Phase Transition

Ricky Costa
4 min read · Oct 17, 2021

David is killing it!

Welcome back, NLP peeps! Do you miss the old days? The old internet days of dial-up modems and static websites, you know… a time of innocence when developers were innovating the backbone of the internet at hyper speed?

Well, we are very much going through that right now via the Web 3.0 revolution. Cryptocurrencies usually get all of the attention, but there is something else at play, and it involves the entire web.

You see, the current internet (web 2.0) sucks. 🤷‍♂️

The internet, being the world’s hive mind, has over time created single points of failure, due to centralization among its players and the way the HTTP protocol is structured.

What does that mean?

Here are some examples of single points of failure: on the nation-state level, a country’s leader can shut down the country’s internet and silence an opposing populace. On the company level, if a company doesn’t agree with your point of view, it can cancel you and your access to its services, and, sadly, you can be filtered out of the digital system.

The server in the middle problem…

The HTTP protocol is a location-based protocol, meaning a URL points to an IP address where the files of interest are located. And if that server goes down and you depend on that website to pay your rent… this can happen 👇

As you can tell, these servers are super fragile, and the consequences of an outage scale non-linearly. We need to bypass these servers, but how do we do that?

Well, the web can be turned into a P2P (peer-to-peer) network. My local computer can be a node, and your computer can be a node.

(And when we add content, it gets shared among the nodes and hashed into a content ID. The protocol is now content-based, not location-based.)

This approach lets us bypass the central-server problem and download content from all of the nodes simultaneously, which can be much faster than pulling from a single server. Content can be pinned on this network, and no single owner can take it down. That means content can be up FOREVER. Like this meme:
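The content-addressing idea can be sketched in a few lines. This is a simplified illustration, not a real IPFS implementation: the `content_id`, `add`, and `get` helpers and the in-memory `store` are hypothetical names for this example, and real content IDs (CIDs) encode more than a bare SHA-256 digest.

```python
import hashlib

# Minimal sketch of content addressing: the address IS the hash of the
# content, so any node holding identical bytes can serve the same request.
def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

store = {}  # hypothetical node-local store: content ID -> bytes

def add(data: bytes) -> str:
    cid = content_id(data)
    store[cid] = data
    return cid

def get(cid: str) -> bytes:
    return store[cid]

cid = add(b"hello, decentralized web")
assert get(cid) == b"hello, decentralized web"
# Identical content always maps to the same ID, wherever it is stored.
assert add(b"hello, decentralized web") == cid
```

Because the lookup key is derived from the bytes themselves, moving the file to a different node never breaks the link, which is exactly what a location-based URL cannot guarantee.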

Long story short: if you haven’t yet, initialize your first P2P node and let’s decentralize, for humanity’s sake.

ok ok back to NLP…

Is this the End of Python’s GIL Problem? 👀

Meanwhile this dev…
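For context on why the GIL is a “problem”: only one thread executes Python bytecode at a time, so CPU-bound threads interleave rather than run in parallel. A quick sketch (the `count_primes` workload is a made-up example, not from the linked article):

```python
import threading

# CPU-bound work: count primes in a range.
def count_primes(lo, hi):
    def is_prime(n):
        if n < 2:
            return False
        return all(n % d for d in range(2, int(n ** 0.5) + 1))
    return sum(1 for n in range(lo, hi) if is_prime(n))

results = []

def worker(lo, hi):
    results.append(count_primes(lo, hi))

# Two threads split the range. The answer is correct, but under the GIL
# they take turns on one core instead of running truly in parallel.
threads = [threading.Thread(target=worker, args=(i * 5000, (i + 1) * 5000))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results))
```

Removing the GIL would let threads like these actually use multiple cores; today the usual workaround is `multiprocessing`, which pays a process-spawn and serialization cost.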

Alias-Free Generative Adversarial Networks (StyleGAN3)

It’s really good

This person doesn’t exist:

Google Cloud Just Got Spot Instances 👨‍💻

(surprise)

8-bit Optimizers via Block-wise Quantization

Tim Dettmers (ex-🤗) stunts on YouTube.
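The core trick of block-wise quantization can be shown in miniature: split a tensor into small blocks and quantize each block to 8 bits with its own absmax scale, so an outlier in one block doesn’t wreck the precision of the others. This is a pure-Python illustration of the idea, not the bitsandbytes implementation (which uses dynamic quantization maps and CUDA kernels):

```python
# Block-wise 8-bit quantization sketch: each block gets its own scale.
def quantize_blockwise(values, block_size=4):
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    quantized, scales = [], []
    for block in blocks:
        scale = max(abs(v) for v in block) or 1.0   # per-block absmax
        scales.append(scale)
        quantized.append([round(v / scale * 127) for v in block])  # int8 range
    return quantized, scales

def dequantize_blockwise(quantized, scales):
    out = []
    for block, scale in zip(quantized, scales):
        out.extend(q / 127 * scale for q in block)
    return out

weights = [0.01, -0.02, 0.03, 0.05, 2.0, -1.5, 0.7, 0.9]
q, s = quantize_blockwise(weights)
restored = dequantize_blockwise(q, s)
# The small-magnitude first block stays precise even though the second
# block contains much larger values.
assert all(abs(w - r) < 0.01 for w, r in zip(weights, restored))
```

With a single global scale, the 2.0 outlier would force a coarse step size onto the 0.01-scale weights; per-block scales are what make 8-bit optimizer states viable.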

Airtable Open Sourced Alternative

SSH Tunneling

FYI, this is how some peeps hack (the hacking part is not what the article is about ☠ ).
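For reference, a basic local port forward looks like this (hostnames and the port choice here are made-up placeholders):

```shell
# Forward local port 8080 to port 80 on an internal host, tunneled
# through an SSH server reachable as user@gateway.example.com.
ssh -L 8080:internal.example.com:80 user@gateway.example.com

# While the tunnel is open, http://localhost:8080 reaches the internal host.
```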

Free Course on Vector Similarity Search and FAISS

FAISS tutorials are rare, so this is a jewel. 💎
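To ground what the course covers: vector similarity search, at its simplest, is “rank stored vectors by similarity to a query.” Here is a brute-force cosine-similarity baseline in plain Python; FAISS exists to make this fast at scale with indexes (flat, IVF, HNSW, PQ), and this sketch is only the naive version, not FAISS itself:

```python
import math

# Cosine similarity between two equal-length vectors.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Exhaustive k-nearest search: score every stored vector, keep the top k.
def search(query, vectors, k=2):
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine(query, vectors[i]),
                    reverse=True)
    return ranked[:k]

db = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(search([1.0, 0.05], db))  # indices of the two most similar vectors
```

Brute force is exact but O(n) per query; the approximate indexes the course teaches trade a little recall for orders-of-magnitude speedups.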

FastAPI and GraphQL w/ Code

State of TV Pre-Phase Transition

Me thinks this is proof we are in a simulation. 😂

Fast WordPiece Tokenization

A New (and Really Fast) Tokenizer for TensorFlow.

“Happy to announce a fast tokenizer which is 8.2x faster than Hugging Face tokenizers and 5.1x faster than TensorFlow Text. Accepted to EMNLP 2021 as oral (“Fast WordPiece Tokenizer”), and integrated into Google products. Being open-sourced in TensorFlow soon.” — Denny Zhou | Google Brain

paper: https://arxiv.org/pdf/2012.15524.pdf
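For the curious, the baseline being sped up is classic greedy longest-match-first WordPiece: repeatedly take the longest vocabulary match from the current position, prefixing continuations with `##`. The paper’s contribution (LinMaxMatch) makes this linear-time; the sketch below is only the classic quadratic version, with a toy vocabulary invented for illustration:

```python
# Toy vocabulary; "##" marks word-continuation pieces.
VOCAB = {"un", "##aff", "##able", "affable", "hello"}

def wordpiece(word, vocab=VOCAB, unk="[UNK]"):
    pieces, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while end > start:                  # try the longest span first
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece        # continuation marker
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk]                    # no piece fits: unknown word
        pieces.append(match)
        start = end
    return pieces

print(wordpiece("unaffable"))  # ['un', '##aff', '##able']
```

The inner longest-match scan is what makes the naive algorithm worst-case quadratic in word length, and precomputing failure links (as in Aho–Corasick) is the rough intuition behind the paper’s linear-time variant.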

Philosophical Foundations of Machine Intelligence

Quantum Stat



Written by Ricky Costa

Subscribe to the NLP Cypher newsletter for the latest in NLP & ML code/research. 🤟
