
Highlights of Interspeech 2022


https://europe.naverlabs.com/blog/highlights-of-interspeech-2022/


Another important line of research was the attempt to scale up very large speech-to-text models. Notably, the Meta AI authors assert in the very title of this paper that scaling ASR improves zero- and few-shot learning: a «universal English ASR model» with 10B parameters was trained on 4.5 million hours of English speech drawn from 10 different sources. This is supervised or semi-supervised (pseudo-labeling) ASR, and the model is shown to generalize well to novel domains and styles of speech. Interesting work, but unfortunately the authors mentioned during the poster session that the model will NOT be open-sourced, because a subset of the training data is under non-permissive licensing.

https://www.isca-speech.org/archive/pdfs/interspeech_2022/zheng22d_interspeech.pdf
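As an aside, the semi-supervised recipe mentioned above boils down to pseudo-labeling: a seed ASR model transcribes unlabeled audio, and only high-confidence hypotheses are kept as extra training data. Below is a minimal Python sketch of that idea, not the paper's actual pipeline; the `Utterance` type, the `transcribe` callable and its `(hypothesis, confidence)` return value are hypothetical placeholders for whatever model interface a real system exposes.

```python
# Minimal sketch of confidence-filtered pseudo-labeling for semi-supervised ASR.
# The model interface (a transcribe() callable returning a hypothesis and a
# confidence score) is a hypothetical stand-in, not Meta AI's implementation.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Utterance:
    audio_path: str
    text: Optional[str] = None  # None while the audio is unlabeled

def pseudo_label(
    unlabeled: list[Utterance],
    transcribe: Callable[[str], tuple[str, float]],  # hypothetical: (hypothesis, confidence)
    min_confidence: float = 0.9,
) -> list[Utterance]:
    """Transcribe unlabeled audio with a seed ASR model and keep only
    high-confidence hypotheses as new pseudo-labeled training examples."""
    labeled: list[Utterance] = []
    for utt in unlabeled:
        hypothesis, confidence = transcribe(utt.audio_path)
        if confidence >= min_confidence:
            labeled.append(Utterance(utt.audio_path, hypothesis))
    return labeled
```

In practice this loop is repeated: the model is retrained on the union of supervised and pseudo-labeled data, then used to relabel the remaining audio, with the confidence threshold controlling the trade-off between label noise and data volume.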
