Accelerating Transformers in Production

About this event

Lewis Tunstall will talk about optimizing transformer models for production. He will cover knowledge distillation and weight quantization, as well as frameworks such as ONNX Runtime and Hugging Face Optimum.

Speaker

Lewis Tunstall

Lewis Tunstall is a Machine Learning Engineer at Hugging Face, where he is responsible for implementing the tooling for large-scale evaluation of the 10,000+ models and 1,000+ datasets hosted on the Hugging Face Hub. He recently published the book “Natural Language Processing with Transformers”.