Collaborative pretraining and recycling finetuned models
About this Event
This talk will discuss our recent advances in recycling finetuned models and collaborative pretraining. We will describe how to harness the data and computation invested in one or more finetuned models to collaboratively improve the pretrained model they originated from, once or repeatedly. The talk will also touch on our initial understanding of how and why fusing several models by weight averaging works. All of these are small steps towards evolving pretrained models that we create together as a community. Join us, check out the best models, and feel free to contact me with questions.
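For readers unfamiliar with the idea, here is a minimal sketch of what "fusing models by weight averaging" means: each parameter of the fused model is the element-wise mean of the corresponding parameters across several finetuned checkpoints that share the same architecture. The checkpoint names below are hypothetical placeholders, and this is an illustrative sketch, not the ColD-fusion implementation presented in the talk.

```python
# Minimal sketch: fuse finetuned models by averaging their weights.
# Assumes all checkpoints were finetuned from the same base model,
# so their state dicts share identical keys and shapes.
import torch
from transformers import AutoModel

model_names = [
    "org/finetuned-model-a",  # hypothetical checkpoint names
    "org/finetuned-model-b",
    "org/finetuned-model-c",
]

# Load the parameter dictionaries of all finetuned models.
state_dicts = [AutoModel.from_pretrained(n).state_dict() for n in model_names]

# Average each parameter tensor element-wise across the models.
fused = {
    key: torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    for key in state_dicts[0]
}

# Load the averaged weights into a fresh copy of the architecture;
# the result can serve as an improved starting point for further finetuning.
fused_model = AutoModel.from_pretrained(model_names[0])
fused_model.load_state_dict(fused)
```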
Speakers
Leshem Choshen currently leads the ColD-fusion challenge at IBM, which aims to collaboratively pretrain models by recycling finetuned ones. He received the postdoctoral Fulbright fellowship as well as the IAAI and Blavatnik best Ph.D. awards. With broad NLP and ML interests, he has also worked on reinforcement learning, evaluation, and understanding how neural networks learn. In parallel, he participated in Project Debater, creating a machine that could hold a formal debate, culminating in a Nature cover and a live debate.