News
- 2025.05: “ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling” [arXiv] is accepted by ICML 2025.
- 2025.04.25: 👋 We release the technical report of Kimi-Audio. [Code][Model][Paper].
- 2024.05: The paper “UniAudio: Towards Universal Audio Generation with Large Language Models” is accepted at ICML 2024.
- 2024.04: The paper “InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt” is accepted by IEEE/ACM Transactions on Audio, Speech, and Language Processing.
- 2023.02: Our new work on prompt-based expressive TTS, “InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt”, is available [Demo][arXiv].
- 2022.02: The paper introducing DiffGAN-TTS is available on [arXiv].
- 2021.09: The DiffSVC paper will appear at ASRU 2021.
- 2021.05: Our new work on singing voice conversion with the denoising diffusion probabilistic model (DDPM) is available [Demo][arXiv].
- 2021.03: Our FastSVC paper has been accepted as an oral paper at ICME 2021.