Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
A high-throughput and memory-efficient inference and serving engine for LLMs
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Ultralytics YOLO 🚀
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Stable Diffusion web UI
We write your reusable computer vision tools. 💜
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Open standard for machine learning interoperability
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
SoftVC VITS Singing Voice Conversion
Easy Docker setup for Stable Diffusion with user-friendly UI
Run the official Stable Diffusion releases in a Docker container with txt2img, img2img, depth2img, pix2pix, upscale4x, and inpaint.