- Date and time: Thursday, November 25, 1:30 PM
- Format: Join via Zoom meeting
- Link: https://sogang-ac-kr.zoom.us/j/87139168511?pwd=a0JHcnNMSXVuZ1VUaW5LeUZGdEVtQT09
- Meeting ID: 871 3916 8511
- Passcode: 327698
- Title: Overview of speech enhancement and low-latency on-device streaming ASR
- Summary:
The first part of this talk introduces end-to-end speech denoising schemes based on LSTM or Transformer layers for robust speech recognition. Starting from conventional mask estimation schemes, it moves on to newer multi-task denoising training. The second part covers the design of low-latency on-device ASR models based on the RNN-T architecture, and compares the trade-offs between latency and WER across different delay-constraining schemes.
- Speaker profile:
Jaeyoung Kim received his Ph.D. from Stanford University in 2014 and joined Samsung's San Diego office to conduct research on signal processing and speech recognition. In 2019, he joined Amazon, where he worked on wake-word models to improve Alexa recognition. In 2020, he moved to Google, where he began working on on-device ASR modeling and later focused on self-supervised speech training to reduce the need for supervised training datasets.
- Hosted by: Laboratory of Prof. Hyung-Min Park (박형민)