A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A fast, local neural text-to-speech system
Build local voice agents with open-source models
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models