NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER
The NLP Cypher | 10.17.21
Phase Transition
Welcome back NLP peeps! Do you miss the old days? The old internet days of dial-up modems and static websites, you know… a time of innocence when developers were innovating the backbone of the internet at hyper speed?
Well, we are very much going through that right now via the Web 3.0 revolution. Cryptocurrencies usually get all of the attention, but there is something else at play, and it involves the entire web.
You see, the current internet (web 2.0) sucks. 🤷‍♂️
The internet, being the world’s hive mind, has over time created single points of failure due to centralization among its players and the way the HTTP protocol is structured.
What does that mean?
Here are some examples of single points of failure: On the nation-state level, a country’s leader can shut down the country’s internet and silence an opposing populace. On the company level, if a company doesn’t agree with your point of view, it can cancel you and your access to its services, and sadly, you can be filtered out of the digital system.
The server in the middle problem…
The HTTP protocol is a location-based protocol, meaning a URL points to an IP address where the files of interest are located. And if that server goes down and you depend on that website to pay your rent… this can happen 👇
As you can tell, these servers are super fragile, and the consequences of an outage scale non-linearly. We need to bypass these servers, but how do we do that?
Well, the web can be turned into a P2P (peer-to-peer) network. My local computer can be a node, and your computer can be a node.
(And when we add content, it gets shared among the nodes and hashed into a content ID. The protocol is now content-based, not location-based.)
This approach bypasses the central-server problem and lets us download content from all of the nodes simultaneously, which can be much faster than pulling from a single server. Content can be pinned on this network, and no single owner can take it down. That means content can stay up FOREVER. Like this meme:
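The content-ID idea can be sketched in a few lines of Python. This is a toy illustration only: real protocols like IPFS use Merkle DAGs and multihash CIDs, and the `content_id`, `add`, and `fetch` helpers here are made-up names for the sketch.

```python
import hashlib

# Toy content-addressed storage (illustrative, NOT the real IPFS protocol):
# content is keyed by the hash of its own bytes, so the "address" never
# depends on which server holds it -- any node with the bytes can serve them.

def content_id(data: bytes) -> str:
    """Derive a content ID by hashing the bytes themselves."""
    return hashlib.sha256(data).hexdigest()

# Two independent "nodes", each just a local store of {content_id: bytes}.
node_a: dict[str, bytes] = {}
node_b: dict[str, bytes] = {}

def add(node: dict, data: bytes) -> str:
    cid = content_id(data)
    node[cid] = data
    return cid

def fetch(nodes: list, cid: str) -> bytes:
    """Ask every node; any copy is authentic because the ID *is* the hash."""
    for node in nodes:
        if cid in node:
            data = node[cid]
            assert content_id(data) == cid  # verify integrity on retrieval
            return data
    raise KeyError(cid)

cid = add(node_a, b"hello, decentralized web")
add(node_b, b"hello, decentralized web")  # same bytes -> same ID on another node
print(fetch([node_b, node_a], cid))       # served by node_b, node_a never needed
```

Same bytes always hash to the same ID, which is why a pinned file survives any single node going dark.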
Long story short: if you haven’t yet, initialize your first P2P node and let’s decentralize, for humanity’s sake.
ok ok back to NLP…
Is this the End of Python’s GIL Problem? 👀
Meanwhile this dev…
Alias-Free Generative Adversarial Networks (StyleGAN3)
It’s really good
this person doesn’t exist:
Google Cloud Just Got Spot Instances 👨‍💻
(surprise)
8-bit Optimizers via Block-wise Quantization
Tim Dettmers (ex-🤗) stunts on YouTube.
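The core trick behind the paper can be sketched roughly like this. It's a NumPy toy assuming a simple linear 8-bit scheme; the actual bitsandbytes library uses dynamic quantization maps and fused GPU kernels, and the function names below are made up for the sketch.

```python
import numpy as np

# Toy block-wise 8-bit quantization (illustrative only). Splitting the tensor
# into blocks and scaling each block independently keeps a single outlier
# from destroying precision across the whole tensor.

def quantize_blockwise(x: np.ndarray, block_size: int = 4096):
    flat = x.ravel()
    pad = (-len(flat)) % block_size
    blocks = np.pad(flat, (0, pad)).reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True)  # one scale per block
    scales[scales == 0] = 1.0                           # avoid divide-by-zero
    q = np.round(blocks / scales * 127).astype(np.int8) # map to [-127, 127]
    return q, scales, len(flat)

def dequantize_blockwise(q, scales, n):
    return (q.astype(np.float32) / 127 * scales).ravel()[:n]

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000).astype(np.float32)
q, scales, n = quantize_blockwise(x)
x_hat = dequantize_blockwise(q, scales, n)
print(np.abs(x - x_hat).max())  # per-element error bounded by scale / 254
```

Optimizer states stored this way take a quarter of the memory of fp32, which is the whole point of the paper.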
Airtable Open Sourced Alternative
SSH Tunneling
FYI, this is how some peeps hack (the hacking part is not what the article is about ☠ ).
Free Course on Vector Similarity Search and FAISS
FAISS tutorials are rare, so this is a jewel. 💎
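For a feel of what the course covers: exact L2 search is conceptually just the brute-force NumPy sketch below. FAISS's `IndexFlatL2` computes the same thing, only heavily optimized; the `search` helper here is a made-up name for illustration.

```python
import numpy as np

# Brute-force exact L2 nearest-neighbor search -- what FAISS's IndexFlatL2
# does under the hood, minus the SIMD/GPU optimization.

def search(database: np.ndarray, queries: np.ndarray, k: int):
    # Squared L2 distance via the expansion ||a - b||^2 = ||a||^2 - 2ab + ||b||^2
    d2 = (
        (queries ** 2).sum(axis=1, keepdims=True)
        - 2 * queries @ database.T
        + (database ** 2).sum(axis=1)
    )
    idx = np.argsort(d2, axis=1)[:, :k]  # k smallest distances per query
    return np.take_along_axis(d2, idx, axis=1), idx

rng = np.random.default_rng(0)
xb = rng.standard_normal((1000, 64)).astype("float32")  # database vectors
xq = xb[:5] + 0.001                                     # queries near rows 0..4
dist, ids = search(xb, xq, k=3)
print(ids[:, 0])  # the nearest neighbor of each query
```

Brute force is fine at this scale; FAISS's approximate indexes (IVF, HNSW, PQ) are what make it viable at hundreds of millions of vectors, which is what the course and tutorials dig into.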
FastAPI and GraphQL w/ Code
State of TV Pre-Phase Transition
Me thinks this is proof we are in a simulation. 😂
Fast WordPiece Tokenization
A New (and Really Fast) Tokenizer for TensorFlow.
“Happy to announce a fast tokenizer which is 8.2x faster than Hugging Face tokenizers and 5.1x faster than TensorFlow Text. Accepted to EMNLP 2021 as oral (“Fast WordPiece Tokenizer”), and integrated into Google products. Being open-sourced in TensorFlow soon.” -Denny Zhou | Google Brain
paper: https://arxiv.org/pdf/2012.15524.pdf
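For context, classic WordPiece is the greedy longest-match-first loop sketched below (toy vocabulary and a made-up helper name, not the paper's code). The paper's contribution, LinMaxMatch, produces the same tokenization but in time linear in the input length instead of quadratic.

```python
# Baseline greedy longest-match-first WordPiece. The paper's LinMaxMatch
# algorithm returns identical output without re-scanning substrings.

VOCAB = {"un", "##aff", "##able", "##ably", "aff", "able", "[UNK]"}  # toy vocab

def wordpiece(word: str, vocab=VOCAB) -> list[str]:
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:                # try the longest substring first
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece      # "##" marks a word-internal piece
            if piece in vocab:
                cur = piece
                break
            end -= 1                      # shrink and retry
        if cur is None:
            return ["[UNK]"]              # no piece matched -> unknown token
        pieces.append(cur)
        start = end
    return pieces

print(wordpiece("unaffable"))  # -> ['un', '##aff', '##able']
```

The inner shrink-and-retry loop is the quadratic part the paper eliminates with a precomputed trie plus failure links, which is where the 5–8x speedups come from.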