Hugging Face Raises Series B!
📣 We are so excited to announce our $40M series B led by Lee Fixel at Addition with participation from Lux Capital, A.Capital Ventures, and betaworks!
Thank you to all our open source contributors, pull requesters, issue openers, notebook creators, model architects, tweeting supporters & community members all over the world 🌎!
We couldn't do what we do & be where we are - in a field dominated by big tech - without you! 🙏🏻
Check us out on TechCrunch and VentureBeat!
We partnered with Amazon SageMaker to enable faster training of Transformers in your AWS cloud! 🔥
Head to our blog for walkthroughs, documentation and sample notebooks showing you how to use the new Hugging Face Deep Learning Containers (DLCs) with the SageMaker Python SDK to train models with PyTorch and TensorFlow.
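Curious what that looks like? Here's a minimal sketch with the new HuggingFace estimator (the script name, IAM role, and S3 path are placeholders, and the version pins may differ for your setup):

```python
from sagemaker.huggingface import HuggingFace

# Minimal sketch: launch a Transformers training job on SageMaker.
# "train.py", the role, and the S3 path below are placeholders.
huggingface_estimator = HuggingFace(
    entry_point="train.py",            # your Transformers training script
    instance_type="ml.p3.2xlarge",     # single-GPU instance
    instance_count=1,
    role="<your-sagemaker-execution-role>",
    transformers_version="4.4",        # versions matching the launch DLCs
    pytorch_version="1.6",
    py_version="py36",
    hyperparameters={"epochs": 3, "model_name": "distilbert-base-uncased"},
)

# Start training on data previously uploaded to S3 (placeholder path)
huggingface_estimator.fit({"train": "s3://<your-bucket>/train"})
```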
Also new in 🤗 Transformers:
1️⃣ 🌐 Multilingual w/ M2M100
2️⃣ mBART-50
3️⃣ 🎤 Speech w/ Wav2Vec2-XLSR
4️⃣ Quantization w/ I-BERT
5️⃣ 🥇 SOTA NLU w/ DeBERTa-v2
Not to mention:
⚙️ TF models support XLA & AMP
➡️ Trainer supports SageMaker Model Parallelism
💾 Tokenized datasets now 4x (!) smaller
🔥 Simpler from_csv/from_json/from_text loading (see the sketch below)
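A quick taste of those simpler loaders (the file names are made up):

```python
from datasets import Dataset

# Sketch of the simplified one-line loaders; file names are placeholders.
csv_ds = Dataset.from_csv("reviews.csv")    # CSV in one line
json_ds = Dataset.from_json("data.jsonl")   # JSON Lines files
text_ds = Dataset.from_text("corpus.txt")   # one example per line
```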
📈 New datasets, including Common Voice, SST, and more:
🎤 Common Voice: speech data in 60 languages!
👩🏻‍🎤 Fashion-MNIST for CV.
🚀 Shout out to the 800+ users who are already sharing and hosting their datasets on the Hub!
🐍 Any dataset can be loaded with one line of Python (see the sketch below).
👉🏻 Check out the full list here
📝 Learn how to add yours here
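For instance, grabbing the Turkish split of Common Voice really is a one-liner (the "tr" config is just one of the many languages available):

```python
from datasets import load_dataset

# One line of Python: load the Turkish Common Voice training split.
common_voice_tr = load_dataset("common_voice", "tr", split="train")
print(common_voice_tr[0]["sentence"])  # transcription of the first clip
```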
Dark Mode is Here!
🌗 🌘 🌑 Get your equipment because it's getting very dark in here... The long-awaited dark mode is now available on Hugging Face 🚀
To try it out, activate it under Theme in your user settings (you have to be a registered user).
🔎 Looking for a sneak peek of AutoNLP in action?
Check out this exclusive preview video by Abhishek Thakur that shows just how easy it is to train models using AutoNLP!
FairScale just released support for ZeRO-DP3 and ZeRO-offload (to make large model fine-tuning easier), and you can already start playing with it in 🤗 Transformers!
This is still highly experimental, so expect a few (maybe a lot of) rough edges. The PR gives a few examples; refer to the master documentation for more information.
Want to know more about what ZeRO-DP and ZeRO-offload are? You're in luck! Sylvain Gugger gave a talk about the topic at PyDataMTL.
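If you want to experiment, a rough sketch of opting in through the Trainer looks like this (the exact option strings are our reading of the integration; double-check the PR and docs):

```python
from transformers import TrainingArguments

# Experimental sketch: FairScale's fully sharded ZeRO-DP3 with CPU offload.
# Requires `pip install fairscale` and a distributed launch, e.g.
#   python -m torch.distributed.launch --nproc_per_node=2 train.py
training_args = TrainingArguments(
    output_dir="output",
    per_device_train_batch_size=8,
    fp16=True,                        # mixed precision pairs well with ZeRO
    sharded_ddp="zero_dp_3 offload",  # ZeRO-DP3 + ZeRO-offload via FairScale
)
```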
DeBERTa-v2 beats the human baseline on SuperGLUE and reaches a crazy 91.7% dev accuracy on the MNLI task. It even beats T5 while being 10x smaller!
DeBERTa-v2 was contributed by Pengcheng He from Microsoft Research.
Try it directly on the Hub or in 🤗 Transformers by installing from source!
DeBERTa will be available from pypi/anaconda as soon as v4.4.0 is out!
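In the meantime, here's a minimal sketch of trying the MNLI checkpoint from the Hub (requires the source install mentioned above):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Sketch: score an MNLI-style premise/hypothesis pair with DeBERTa-v2.
model_id = "microsoft/deberta-v2-xlarge-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("A man is playing guitar.", "A person plays an instrument.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])  # e.g. ENTAILMENT
```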
RAG is a new NLP model that uses external documents to augment its knowledge. The RAG model by Aleksandra Piktus, Patrick Lewis, and more Facebook AI colleagues leverages external knowledge sources like Wikipedia to have direct and dynamic access to information at inference time.
🚀 The new Ray integration with RAG:
- Speeds up retrieval calls by 2x
- Improves the scalability of fine-tuning
📝 Check out our newest guest post by Amog Kamsetty and the Ray team on training a Retrieval Augmented Generation Model.
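Want to poke at RAG itself? Here's a minimal sketch using the small dummy index that ships with the facebook/rag-token-nq checkpoint (the full Wikipedia index is far bigger; requires `datasets` and `faiss` installed):

```python
from transformers import RagRetriever, RagTokenForGeneration, RagTokenizer

# Sketch: answer a question with RAG using the small dummy index.
# Set use_dummy_dataset=False to download the full Wikipedia index.
tokenizer = RagTokenizer.from_pretrained("facebook/rag-token-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
)
model = RagTokenForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)

inputs = tokenizer("who wrote the origin of species", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```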
We’ve added a script to 🤗 Transformers that allows you to train a text classifier with nothing but a set of specified class names and some unlabeled data!
The script generates proxy labels for your data with our zero-shot classification pipeline (sketched below) and performs knowledge distillation by training a smaller student model 💪
The result is an efficient classifier that speeds up inference by 100x or more compared to zero-shot classification 🚀
📕 Walkthrough Colab notebook
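For reference, the proxy-labeling step relies on the zero-shot pipeline, which looks roughly like this (the input text and candidate labels are example placeholders):

```python
from transformers import pipeline

# Sketch of the zero-shot classification step used to create proxy labels;
# the candidate labels are example placeholders.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
prediction = classifier(
    "The battery drains within an hour of unplugging.",
    candidate_labels=["hardware issue", "software issue", "billing question"],
)
print(prediction["labels"][0])  # best proxy label for distillation
```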
Translate text to or between 50 languages with mBART-50 from Facebook AI!
🇺🇳 One-to-Many model: translate from English to 49 other languages
↔️ Many-to-Many model: translation between any pair of 50 languages
Check out all the mBART-50 models
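Here's a minimal sketch of many-to-many translation (French to English; the language codes follow the model card's convention):

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

# Sketch: French -> English with the many-to-many mBART-50 checkpoint.
model_id = "facebook/mbart-large-50-many-to-many-mmt"
model = MBartForConditionalGeneration.from_pretrained(model_id)
tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang="fr_XX")

encoded = tokenizer("Le chat dort sur le canapé.", return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"],  # target language
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```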
🔥 Brought to you by UC Berkeley, I-BERT is the first quantized model on the 🤗 Model Hub! Everything in I-BERT is integer-only, and it brings you a 4x speed-up with TensorRT!
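A minimal sketch of loading it for a forward pass; note the checkpoint name below is our assumption of the published one, so double-check on the Hub (the TensorRT speed-up also requires the full quantization and deployment pipeline, which this snippet doesn't show):

```python
from transformers import AutoTokenizer, IBertModel

# Sketch: load I-BERT and run a forward pass. The checkpoint name is an
# assumption; check the Hub for the published I-BERT models.
tokenizer = AutoTokenizer.from_pretrained("kssteven/ibert-roberta-base")
model = IBertModel.from_pretrained("kssteven/ibert-roberta-base")

inputs = tokenizer("Integer-only inference, here we come!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```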