Voicebox.sh: Open-source voice cloning powered by Qwen3-TTS. Create natural-sounding speech from text that closely replicates the source voice.

Voicebox is a local-first voice cloning studio with DAW-like features for professional voice synthesis. Think of it as a local, free, and open-source alternative to ElevenLabs: download models, clone voices, and generate speech entirely on your machine.

Unlike cloud services that lock your voice data behind subscriptions, Voicebox gives you complete privacy, professional tools, and native performance. Download a voice model, clone any voice from a few seconds of audio, and compose multi-voice projects with studio-grade editing tools.

Voicebox is optimized for fast local inference, with Metal acceleration on Mac and CUDA acceleration on Windows/Linux.
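The platform split above can be sketched as a small selection routine. This is a minimal illustration, not Voicebox's actual code: the function name `pick_backend`, the `has_cuda` flag, and the returned labels are all hypothetical, and the CPU fallback is an assumption. It uses only the Python standard library.

```python
import platform
from typing import Optional


def pick_backend(system: Optional[str] = None, has_cuda: bool = False) -> str:
    """Return a hypothetical backend label: 'metal', 'cuda', or 'cpu'."""
    # Default to the operating system reported by the running interpreter.
    system = system or platform.system()
    if system == "Darwin":  # macOS
        return "metal"
    if system in ("Windows", "Linux") and has_cuda:
        return "cuda"
    # Assumed fallback when no GPU acceleration is available.
    return "cpu"


if __name__ == "__main__":
    print(pick_backend())
```

In a real application the `has_cuda` flag would come from the inference runtime's own device check rather than being passed in by hand.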

Voicebox is powered by Alibaba’s Qwen3-TTS model for high voice quality and accuracy.

Create multi-voice narratives with a timeline-based editor. Arrange tracks, trim clips, and mix conversations.
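To make the timeline idea concrete, here is a minimal sketch of how tracks and trimmed clips might be modeled. The `Clip` and `Track` classes and the `project_length` helper are hypothetical names chosen for illustration; they do not describe Voicebox's internal data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Clip:
    voice: str               # which cloned voice speaks this clip
    start: float             # position on the timeline, in seconds
    duration: float          # length of the generated audio, in seconds
    trim_start: float = 0.0  # seconds cut from the beginning
    trim_end: float = 0.0    # seconds cut from the end

    @property
    def length(self) -> float:
        # Audible length after trimming, never negative.
        return max(0.0, self.duration - self.trim_start - self.trim_end)

    @property
    def end(self) -> float:
        return self.start + self.length


@dataclass
class Track:
    name: str
    clips: List[Clip] = field(default_factory=list)


def project_length(tracks: List[Track]) -> float:
    """Return the time at which the last clip on any track ends."""
    return max((c.end for t in tracks for c in t.clips), default=0.0)
```

Under this model, arranging a conversation means placing clips from different voices on separate tracks at different `start` times, and trimming only changes how much of each clip is audible.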

Combine multiple voice samples for higher quality and more natural-sounding results.
