🐸 Coqui TTS with VITS Speed Control

Neural text-to-speech with pitch-preserving speed adjustment

🚀 Setup Instructions

1. Install Dependencies:

pip install TTS flask flask-cors torch soundfile numpy pydub librosa

2. Run the Enhanced Server:

python coqui_tts_server.py

3. Server will start at: https://tts-gcp.arthur.digital/


📦 Available Models

Select a model to load (VITS models support speed control):

🎭 Voice Selection

Choose a voice/speaker for the selected model:

đŸŽ›ī¸ VITS Speed Control

Adjust speech speed while preserving natural pitch. VITS models expose a length_scale parameter that scales the predicted duration of each sound, so speed changes come from the model itself rather than from resampling the audio.
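As a rough illustration of how a speed multiplier could be turned into a length_scale value, here is a minimal sketch. It assumes the inverse relationship (larger length_scale means longer durations and therefore slower speech) and clamps to the 0.5x-1.5x range this page exposes. The function name `speed_to_length_scale` is hypothetical and may not match what coqui_tts_server.py actually does.

```python
def speed_to_length_scale(speed: float, lo: float = 0.5, hi: float = 1.5) -> float:
    """Map a speed multiplier (1.0 = normal) to a VITS-style length_scale.

    Assumption: length_scale stretches predicted durations, so it is taken
    here as the reciprocal of the requested speed. The lo/hi bounds mirror
    the 0.5x-1.5x range described on this page.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    clamped = min(max(speed, lo), hi)  # keep the value inside the supported range
    return 1.0 / clamped


if __name__ == "__main__":
    for speed in (0.5, 0.9, 1.0, 1.5):
        print(f"{speed:.2f}x -> length_scale {speed_to_length_scale(speed):.3f}")
```

With this mapping, the recommended 0.90x setting corresponds to a length_scale of about 1.111. How the value is then handed to the loaded model depends on the server code, which is not shown here.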

🚀 Speech Speed 0.90x
Fine-Tuned Speed Range: 0.5x (very slow) → 1.0x (normal) → 1.5x (fast)
Sweet Spots: 0.85x-0.95x for slightly slower speech, 0.75x-0.85x for clearer speech
Recommended: Start with 0.90x for subtle slowing
Override voice selection with custom speaker ID
Upload 3-10 seconds of clear speech for voice cloning (works with XTTS v2 model)
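
Before uploading a reference clip for voice cloning, it can help to check that it falls inside the 3-10 second window. The sketch below does this with Python's standard `wave` module. It assumes an uncompressed PCM WAV file, and the helper names are hypothetical rather than part of the server.

```python
import wave

MIN_SECONDS = 3.0   # lower bound stated on this page
MAX_SECONDS = 10.0  # upper bound stated on this page


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_usable_reference_clip(path: str) -> bool:
    """True if the clip length is within the 3-10 second window."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

This checks length only. It says nothing about how clear the speech is, which still has to be judged by ear.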

đŸŽĩ Generated Audio