We've just launched our Inference API beta, which lets you run fast inference on any of the 3,000+ models made available by the community.
It is an optimized and accelerated version of the open-access API that powers our free inference widgets, available on all of our model pages.
➡️ To subscribe, you will need to create or join an organization and head over to huggingface.co/pricing
If you need faster (GPU) inference, large volumes of requests, and/or a dedicated endpoint, let us know at email@example.com
You can find documentation about the API here.
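If you want to try it programmatically, here is a minimal sketch of a request using Python's `requests` library; the model id and token below are placeholders you would swap for your own.

```python
import requests

API_TOKEN = "YOUR_API_TOKEN"  # from your huggingface.co account settings
MODEL_ID = "distilbert-base-uncased-finetuned-sst-2-english"  # any hosted model id

API_URL = f"https://api-inference.huggingface.co/models/{MODEL_ID}"
headers = {"Authorization": f"Bearer {API_TOKEN}"}

# Send the input text and read back the JSON prediction.
response = requests.post(API_URL, headers=headers, json={"inputs": "I love this!"})
print(response.json())
```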
The new 0.9.0 version of the 🤗 Tokenizers library brings a lot of improvements:
- More robust alignment tracking
- Better error messages
- Many bug fixes
But most importantly, this new release brings full support for the Unigram algorithm 🎉. You can convert your SentencePiece models to 🤗 Tokenizers and start using all the features you love. We also support training Unigram tokenizers from scratch. Did someone say Byte-level Unigram? 🤫
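As a taste, here is a minimal sketch of training a Unigram tokenizer from scratch; the corpus file and vocabulary size are placeholder values for illustration.

```python
from tokenizers import SentencePieceUnigramTokenizer

# Train a Unigram tokenizer on your own text files (paths and sizes are illustrative).
tokenizer = SentencePieceUnigramTokenizer()
tokenizer.train(files=["my_corpus.txt"], vocab_size=8000)

# Encode a sentence and inspect the learned sub-word pieces.
encoding = tokenizer.encode("Unigram tokenization, now in 🤗 Tokenizers!")
print(encoding.tokens)

# Save the full tokenizer (model, normalizer, pre-tokenizer) to a single JSON file.
tokenizer.save("unigram.json")
```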
Also, the library now has proper documentation! Go check it out here!
We've just released RAG, the first retrieval-augmented model in the library, in collaboration with Facebook AI. Retrieval augmentation is a new paradigm that empowers models to efficiently find new information in a text corpus like Wikipedia at inference time, rather than trying to fit all of this knowledge into a huge fixed parameter set. The method lets the model excel at a number of tasks, including question answering and question generation. You can try out our demo for both of these settings here or check out the model docs.
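If you'd rather poke at it in code, here is a rough sketch along the lines of the model docs, using the small dummy retrieval index so it runs without downloading the full Wikipedia index:

```python
from transformers import RagTokenizer, RagRetriever, RagTokenForGeneration

tokenizer = RagTokenizer.from_pretrained("facebook/rag-token-nq")
# `use_dummy_dataset=True` swaps in a tiny index so the example stays lightweight.
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
)
model = RagTokenForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)

inputs = tokenizer.prepare_seq2seq_batch(
    "who holds the record in 100m freestyle", return_tensors="pt"
)
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```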
🤗 Transformers welcomes its first-ever end-to-end multimodal transformer and demo. LXMERT is the current state-of-the-art model for visual question answering (answering textual questions about a given image).
The above GIF demonstrates the capabilities of the version of the model pre-trained on the VQA dataset.
Check out our colab notebook to play with the model using your own questions and images.
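For a feel of the API, here is a rough sketch of calling LXMERT from 🤗 Transformers. In real use the `visual_feats`/`visual_pos` tensors come from an object detector (as in the notebook); the random tensors below are placeholders that only illustrate the expected shapes.

```python
import torch
from transformers import LxmertTokenizer, LxmertForQuestionAnswering

tokenizer = LxmertTokenizer.from_pretrained("unc-nlp/lxmert-vqa-uncased")
model = LxmertForQuestionAnswering.from_pretrained("unc-nlp/lxmert-vqa-uncased")

inputs = tokenizer("What color is the cat?", return_tensors="pt")
visual_feats = torch.randn(1, 36, 2048)  # 36 region features per image (placeholder)
visual_pos = torch.rand(1, 36, 4)        # normalized bounding boxes (placeholder)

outputs = model(
    **inputs, visual_feats=visual_feats, visual_pos=visual_pos, return_dict=True
)
# Index of the predicted answer in the VQA answer vocabulary.
print(outputs.question_answering_score.argmax(-1).item())
```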
The latest 🤗 Transformers release includes yet another type of model: the Funnel-Transformer (paper). This model combines the classic transformer architecture with a feature widely used in CNNs for computer vision: pooling. After a given block of layers, the hidden states are pooled and the sequence length is cut in half.
These pooling steps, each bypassed by a residual skip connection, allow Funnel-Transformers to be deeper than other transformers at a lower computational cost. The design speeds up inference without hurting performance on tasks that only require a summary of the sentence (such as classification), while a decoder that upsamples the sequence back to its original length lets the model handle token-level tasks. The model also reaches state-of-the-art performance on some tasks, such as predicting masked tokens.
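To see the length reduction in practice, here is a small sketch comparing the encoder-only variant (pooled, shorter output, suited to classification) with the full model whose decoder restores the original sequence length; the checkpoint names below are the small ones from the model hub.

```python
import torch
from transformers import FunnelTokenizer, FunnelBaseModel, FunnelModel

tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/small")
inputs = tokenizer("Pooling shrinks the sequence as we go deeper.", return_tensors="pt")

encoder_only = FunnelBaseModel.from_pretrained("funnel-transformer/small-base")
full_model = FunnelModel.from_pretrained("funnel-transformer/small")

with torch.no_grad():
    pooled = encoder_only(**inputs, return_dict=True).last_hidden_state
    upsampled = full_model(**inputs, return_dict=True).last_hidden_state

# Original length vs. pooled encoder length vs. decoder-upsampled length.
print(inputs["input_ids"].shape[-1], pooled.shape[1], upsampled.shape[1])
```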
The new 1.1.0 release of 🤗 Datasets adds support for Windows as well as a number of cool datasets, thanks to help from our amazing contributors:
- A new, debiased subset of Winogrande
- OpenWebText, an open-source effort to reproduce OpenAI's WebText dataset used to train GPT-2, which is also needed to reproduce ELECTRA
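Loading them follows the usual `load_dataset` pattern; the sketch below assumes the debiased subset is exposed as the `winogrande_debiased` configuration, and note that OpenWebText is a very large download.

```python
from datasets import load_dataset

# The debiased subset is a configuration of the "winogrande" dataset.
winogrande = load_dataset("winogrande", "winogrande_debiased", split="train")
print(winogrande[0])

# OpenWebText is large, so the first call will download and prepare it for a while.
openwebtext = load_dataset("openwebtext", split="train")
print(openwebtext)
```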
Finally, we’ve added documentation for the ElasticSearch integration in 🤗 Datasets. It lets you easily add a fast text search engine to browse your datasets. You can find more information on how to use it in the documentation.
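As a quick sketch of what that looks like (assuming an Elasticsearch server is already running locally on the default port), you can index a text column and query it with `get_nearest_examples`:

```python
from datasets import load_dataset

squad = load_dataset("squad", split="validation")

# Build a full-text index over the "context" column on the local Elasticsearch server.
squad.add_elasticsearch_index("context", host="localhost", port="9200")

# Retrieve the 5 passages that best match a free-text query.
scores, examples = squad.get_nearest_examples("context", "machine learning", k=5)
print(examples["title"])
```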