About Me
I'm a Research Scientist on the Seed/Doubao team at TikTok/ByteDance, focusing on speech synthesis. I completed my M.S. in Computer Science at Columbia University.
My research interests include deep generative modeling, self-supervised representation and transfer learning, zero-shot learning, and knowledge distillation. I'm especially interested in neural end-to-end learning for audio and natural language processing tasks.
Publications
Seed-TTS: A family of high-quality versatile speech generation models
Seed Team, ByteDance
arXiv:2406.02430, Jun. 2024 (🔊 Demos)
VoiceShop: A unified speech-to-speech framework for zero-shot voice editing
Philip Anastassiou*, Zhenyu Tang*, Kainan Peng, Dongya Jia, Jiaxin Li, Ming Tu, Yuping Wang, Yuxuan Wang, Mingbo Ma
(*equal contribution)
arXiv:2404.06674, Apr. 2024 (🔊 Demos)