#Date and time: Thursday, November 25, 1:30 PM
#Format: Join via Zoom meeting
- Link: https://sogang-ac-kr.zoom.us/j/87139168511?pwd=a0JHcnNMSXVuZ1VUaW5LeUZGdEVtQT09
- Meeting ID: 871 3916 8511
- Passcode: 327698
#Title: Overview of speech enhancement and low-latency on-device streaming ASR
#Summary:
The first part of this talk will introduce end-to-end speech denoising schemes
based on LSTM or Transformer layers for robust speech recognition. It will
cover approaches ranging from conventional mask estimation schemes to newer
multi-task denoising training. The second part will discuss the design of
low-latency on-device ASR models based on the RNN-T architecture, comparing
the trade-offs between latency and WER across different delay-constraining
schemes.
#Profile:
Jaeyoung Kim received his Ph.D. from Stanford University in 2014 and joined
Samsung's San Diego office to conduct research on signal processing and speech
recognition. In 2019, he joined Amazon and worked on wake-word models to
improve Alexa recognition. In 2020, he moved to Google, where he began working
on on-device ASR modeling and later focused on self-supervised speech training
to reduce the need for supervised training datasets.
#Host: Laboratory of Prof. Hyung-Min Park (박형민)