I received my Ph.D. degree in Computer Science and Information Engineering from National Taiwan University, Taipei, Taiwan. I am currently a Postdoctoral Researcher at the Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan. I was previously a Research Assistant at the same institute and also worked as an Applied Scientist Intern at Amazon in California, USA. I was awarded the Gold Prize for the best non-intrusive systems and 1st place in the Hearing Industry Research Consortium student prizes at the Clarity Prediction Challenge, part of the 2nd Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2022).
My research interests include deep learning, speech processing, and deep learning-based non-intrusive speech assessment models. Please see the following link for an up-to-date list of publications.
Selected Publications
R. E. Zezario, Y.-W. Chen, S.-W. Fu, Y. Tsao, H.-M. Wang and C.-S. Fuh, “A Study on Incorporating Whisper for Robust Speech Assessment,” arXiv 2309.12766, 2023. (Top performance on Track 3 of the VoiceMOS Challenge 2023) [pdf] [github]
R. E. Zezario, S.-W. Fu, F. Chen, C. -S. Fuh, H. -M. Wang and Y. Tsao, “Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 54-70, 2023. [pdf] [code]
R. E. Zezario, S.-W. Fu, F. Chen, C. -S. Fuh, H. -M. Wang and Y. Tsao, “MTI-Net: A Multi-Target Speech Intelligibility Prediction Model,” INTERSPEECH 2022, pp. 5463-5467, 2022. [pdf] [code]
R. E. Zezario, F. Chen, C. -S. Fuh, H. -M. Wang and Y. Tsao, “MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids,” INTERSPEECH 2022, pp. 3944-3948, 2022. (Gold Prize for the best non-intrusive systems at Clarity Prediction Challenge 2022) [pdf] [code]
R. E. Zezario, C. -S. Fuh, H. -M. Wang and Y. Tsao, “Speech Enhancement with Zero-Shot Model Selection,” 2021 29th European Signal Processing Conference (EUSIPCO), pp. 491-495, 2021. [pdf] [code]
R. E. Zezario, S. -W. Fu, C. -S. Fuh, Y. Tsao and H. -M. Wang, “STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model,” 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 482-486, 2020. [pdf] [code]
C. Yu*, R. E. Zezario*, S.-S. Wang, J. Sherman, Y.-Y. Hsieh, X. Lu, H.-M. Wang, and Y. Tsao, “Speech Enhancement Based on Denoising Autoencoder With Multi-Branched Encoders,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2756-2769, 2020. (* equal contribution) [pdf]
R. E. Zezario, T. Hussain, X. Lu, H. -M. Wang and Y. Tsao, “Self-Supervised Denoising Autoencoder with Linear Regression Decoder for Speech Enhancement,” ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6669-6673, 2020. [pdf]
R. E. Zezario, J. W. C. Sigalingging, T. Hussain, J. -C. Wang and Y. Tsao, “Comparative Study of Masking and Mapping Based on Hierarchical Extreme Learning Machine for Speech Enhancement,” 2019 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 1-2, 2019. [pdf]
R. E. Zezario, S.-W. Fu, X. Lu, H.-M. Wang, and Y. Tsao, “Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric,” INTERSPEECH 2019, pp. 3168-3172, 2019. [pdf] [code]
R. E. Zezario, J. Huang, X. Lu, Y. Tsao, H. Hwang and H. Wang, “Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement,” 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 373-377, 2018. [pdf]
C. -Y. Hsu, R. E. Zezario, J. -C. Wang, C. -W. Ho, X. Lu and Y. Tsao, “Incorporating local environment information with ensemble neural networks to robust automatic speech recognition,” 2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 1-5, 2016. [pdf]