Ryandhimas Zezario

I am currently a Postdoctoral Researcher at the Research Center for Information Technology Innovation, Academia Sinica, in Taipei, Taiwan. I received a Ph.D. degree in Computer Science and Information Engineering from National Taiwan University, Taipei, Taiwan.

I serve as a reviewer for leading journals and conferences, including IEEE/ACM TASLP, IEEE SPL, IEEE J-STSP, IEEE ICASSP, Interspeech, IEEE ASRU, IEEE SLT, IEEE ICME, and Speech Communication.

My research interests include deep learning, speech processing, speech recognition, and non-intrusive speech assessment. Please see the following link for an up-to-date list of publications.

Selected Publications

R. E. Zezario, “Non-Intrusive Intelligibility Prediction for Hearing Aids: Recent Advances, Trends, and Challenges,” to appear in 2025 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). [pdf]

R. E. Zezario, D. A. M. G. Wisnu, H.-M. Wang, and Y. Tsao, “Speech Intelligibility Assessment with Uncertainty-Aware Whisper Embeddings and sLSTM,” to appear in 2025 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). [pdf]

D. A. M. G. Wisnu, R. E. Zezario, S. Rini, H.-M. Wang, and Y. Tsao, “Improving Perceptual Audio Aesthetic Assessment via Triplet Loss and Self-Supervised Embeddings,” to appear in IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2025. (Third Place on the Track 2 - AudioMOS Challenge 2025) [pdf]

W. Ren, Y.-C. Lin, W.-C. Huang, R. E. Zezario, S.-W. Fu, S.-F. Huang, E. Cooper, H. Wu, H.-Y. Wei, H.-M. Wang, H.-Y. Lee, and Y. Tsao, “HighRateMOS: Sampling-Rate Aware Modeling for Speech Quality Assessment,” to appear in IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2025. (Top Performance on the Track 3 - AudioMOS Challenge 2025) [pdf]

S. Ahmed, R. E. Zezario, H.-G. Yuan, A. Hussain, H.-M. Wang, W.-H. Chung, and Y. Tsao, “NeuroAMP: A Novel End-to-end General Purpose Deep Neural Amplifier for Personalized Hearing Aids,” to appear in IEEE Transactions on Artificial Intelligence. [pdf]

R. E. Zezario, S. M. Siniscalchi, F. Chen, H.-M. Wang, and Y. Tsao, “Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids,” INTERSPEECH, pp. 5473-5477, 2025. [pdf]

S. Ahmed, R. E. Zezario, N. Saleem, A. Hussain, H.-M. Wang, and Y. Tsao, “A Study on Speech Assessment with Visual Cues,” INTERSPEECH, pp. 5418-5422, 2025. [pdf]

R. E. Zezario, D. A. M. G. Wisnu, H.-M. Wang, and Y. Tsao, “A Study on Zero-Shot Non-Intrusive Speech Intelligibility for Hearing Aids Using Large Language Models,” ICCE-TW, 2025. [pdf]

R. E. Zezario, S. M. Siniscalchi, H.-M. Wang, and Y. Tsao, “A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5, 2025. [pdf]

D. A. M. G. Wisnu, S. Rini, R. E. Zezario, H.-M. Wang, and Y. Tsao, “HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids,” in IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 1877-1892, 2025. [pdf]

W.-C. Huang, S.-W. Fu, E. Cooper, R. E. Zezario, T. Toda, H.-M. Wang, J. Yamagishi, and Y. Tsao, “The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction,” IEEE Workshop on Spoken Language Technology (SLT), pp. 803-810, 2024. [pdf]

R. E. Zezario, F. Chen, C.-S. Fuh, H.-M. Wang, and Y. Tsao, “Non-Intrusive Speech Intelligibility Prediction for Hearing Aids using Whisper and Metadata,” INTERSPEECH 2024, pp. 3844-3848, 2024. [pdf]

R. E. Zezario, Y.-W. Chen, S.-W. Fu, Y. Tsao, H.-M. Wang, and C.-S. Fuh, “A Study on Incorporating Whisper for Robust Speech Assessment,” IEEE International Conference on Multimedia and Expo (ICME), pp. 1-6, 2024. (Top Performance on the Track 3 - VoiceMOS Challenge 2023) [pdf] [dataset] [github]

R. E. Zezario, B.-R. B. Bai, C.-S. Fuh, H.-M. Wang, and Y. Tsao, “Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 831-835, 2024. [pdf]

R. E. Zezario, S.-W. Fu, F. Chen, C.-S. Fuh, H.-M. Wang, and Y. Tsao, “Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 54-70, 2023. (One of the IEEE Signal Processing Society's top 25 downloaded articles, Sep. 2022 - Sep. 2023) [pdf] [code]

R. E. Zezario, S.-W. Fu, F. Chen, C.-S. Fuh, H.-M. Wang, and Y. Tsao, “MTI-Net: A Multi-Target Speech Intelligibility Prediction Model,” INTERSPEECH 2022, pp. 5463-5467, 2022. [pdf] [code]

R. E. Zezario, F. Chen, C.-S. Fuh, H.-M. Wang, and Y. Tsao, “MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids,” INTERSPEECH 2022, pp. 3944-3948, 2022. (Gold Prize for the best non-intrusive systems at Clarity Prediction Challenge 2022) [pdf] [code]

R. E. Zezario, C.-S. Fuh, H.-M. Wang, and Y. Tsao, “Speech Enhancement with Zero-Shot Model Selection,” European Signal Processing Conference (EUSIPCO), pp. 491-495, 2021. [pdf] [code]

R. E. Zezario, S.-W. Fu, C.-S. Fuh, Y. Tsao, and H.-M. Wang, “STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model,” Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 482-486, 2020. [pdf] [code]

C. Yu*, R. E. Zezario*, S.-S. Wang, J. Sherman, Y.-Y. Hsieh, X. Lu, H.-M. Wang, and Y. Tsao, “Speech Enhancement Based on Denoising Autoencoder With Multi-Branched Encoders,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2756-2769, 2020. (* equal contribution) [pdf]

R. E. Zezario, T. Hussain, X. Lu, H.-M. Wang, and Y. Tsao, “Self-Supervised Denoising Autoencoder with Linear Regression Decoder for Speech Enhancement,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6669-6673, 2020. [pdf]

R. E. Zezario, S.-W. Fu, X. Lu, H.-M. Wang, and Y. Tsao, “Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric,” INTERSPEECH 2019, pp. 3168-3172, 2019. [pdf] [code]

R. E. Zezario, J. Huang, X. Lu, Y. Tsao, H. Hwang, and H. Wang, “Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement,” Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 373-377, 2018. [pdf]

C.-Y. Hsu, R. E. Zezario, J.-C. Wang, C.-W. Ho, X. Lu, and Y. Tsao, “Incorporating local environment information with ensemble neural networks to robust automatic speech recognition,” International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 1-5, 2016. [pdf]