🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal domains, for both inference and training.
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.
Faster Whisper transcription with CTranslate2
Voice-to-text with push-to-talk for Wayland compositors
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
OpenAI Whisper ASR Webservice API
Whisper.net. Speech to text made simple using Whisper Models
A speech to text IBus engine using VOSK