F5-TTS Vietnamese - Text-to-Speech

❗ Model Limitations

May read numbers, dates, and special characters incorrectly
Speech rhythm may be inconsistent for some texts
Works best with clear, well-articulated reference audio
Maximum 1000 words per request
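
One way to work within the 1000-word limit is to split long input into several requests on the client side. The sketch below is a minimal illustration, not part of the service: the helper name split_into_requests is hypothetical, and it assumes the limit counts whitespace-separated words, which the page does not specify.

```python
import re

MAX_WORDS = 1000  # per-request limit listed under Model Limitations


def split_into_requests(text, max_words=MAX_WORDS):
    """Split text into chunks of at most max_words words.

    Assumption: a "word" is a whitespace-separated token. Chunks break
    at sentence ends where possible; a single sentence longer than the
    limit is cut at the word boundary.
    """
    sentences = re.split(r"(?<=[.!?…])\s+", text.strip())
    chunks, current = [], []
    for sentence in sentences:
        words = sentence.split()
        # Hard-split any sentence that exceeds the limit on its own.
        while len(words) > max_words:
            if current:
                chunks.append(" ".join(current))
                current = []
            chunks.append(" ".join(words[:max_words]))
            words = words[max_words:]
        # Start a new chunk when this sentence would overflow the current one.
        if len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each returned chunk can then be sent as gen_text in its own request, and the resulting audio clips concatenated in order.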

📡 API Documentation

Use the following endpoints to integrate the service into your application:

POST /api/synthesize

Synthesize speech from a reference audio clip and the text to speak:

curl -X POST http://localhost:5000/api/synthesize \
  -F "ref_audio=@sample.wav" \
  -F "gen_text=Xin chào, đây là giọng nói tổng hợp" \
  -F "speed=1.0"

Response:

{
  "success": true,
  "audio": "base64_encoded_audio_data",
  "spectrogram": "base64_encoded_image_data",
  "sample_rate": 24000,
  "message": "Speech synthesized successfully"
}
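
The same request can be made from Python, with the base64 fields decoded back into files. This is a rough sketch under several assumptions: it uses the third-party requests package, the function names synthesize and save_outputs are hypothetical, and it assumes "audio" decodes to a complete WAV file and "spectrogram" to a PNG image. The page states neither format, nor the shape of an error response.

```python
import base64
from pathlib import Path

API_URL = "http://localhost:5000/api/synthesize"


def synthesize(ref_audio_path, gen_text, speed=1.0, url=API_URL, timeout=300):
    """POST the same multipart form fields as the curl example; return the JSON body."""
    import requests  # third-party: pip install requests

    with open(ref_audio_path, "rb") as ref:
        resp = requests.post(
            url,
            files={"ref_audio": ref},
            data={"gen_text": gen_text, "speed": str(speed)},
            timeout=timeout,
        )
    resp.raise_for_status()
    return resp.json()


def save_outputs(payload, audio_path="output.wav", image_path="spectrogram.png"):
    """Decode the base64 fields and write them to disk; return the sample rate.

    Assumption: a failed request sets "success" to false and explains
    why in "message". The file extensions are guesses.
    """
    if not payload.get("success"):
        raise RuntimeError(payload.get("message", "synthesis failed"))
    Path(audio_path).write_bytes(base64.b64decode(payload["audio"]))
    Path(image_path).write_bytes(base64.b64decode(payload["spectrogram"]))
    return payload["sample_rate"]
```

Typical use would be save_outputs(synthesize("sample.wav", "Xin chào, đây là giọng nói tổng hợp")), which writes both files to the working directory.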

GET /api/health

Check if the service is running:

curl http://localhost:5000/api/health

GET /api/info

Get model information:

curl http://localhost:5000/api/info
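
Both GET endpoints can also be polled from a script, for example before submitting a batch of synthesis requests. The sketch below uses only the Python standard library. The helper names are hypothetical, and it assumes /api/health answers with HTTP 200 when the service is up and that /api/info returns a JSON body; the page does not show either response.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"


def service_is_up(base_url=BASE_URL, timeout=5.0):
    """Return True if GET /api/health answers with HTTP 200, else False."""
    try:
        with urllib.request.urlopen(base_url + "/api/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses
        # (urllib.error.URLError and HTTPError both subclass OSError).
        return False


def get_model_info(base_url=BASE_URL, timeout=5.0):
    """Return the parsed body of GET /api/info (assumed to be JSON)."""
    with urllib.request.urlopen(base_url + "/api/info", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling service_is_up() first avoids uploading reference audio to a server that is not ready.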
