Minimax Speech 02 HD is a high-definition text-to-speech model with emotional control and voice customization. It produces natural-sounding speech with adjustable speed, pitch, volume, and emotional tone.

Test Minimax Speech 02 HD in the Runpod Hub playground.
Endpoint: https://api.runpod.ai/v2/minimax-speech-02-hd/runsync
Pricing: $0.05 per 1000 characters
Type: Text-to-speech

Request

All parameters are passed within the input object in the request body.
input.prompt (string, required): Text to convert to speech.
input.voice_id (string, default: "Wise_Woman"): Voice identifier for the desired voice.
input.speed (number, default: 1): Speech speed multiplier.
input.volume (number, default: 1): Volume level.
input.pitch (number, default: 0): Pitch adjustment.
input.emotion (string, default: "neutral"): Emotion to convey. Options include happy, sad, neutral, angry, fearful, surprised.
input.english_normalization (boolean, default: false): Enable English text normalization for better pronunciation of numbers, abbreviations, etc.
input.default_audio_url (string): Fallback audio URL if generation fails.

curl -X POST "https://api.runpod.ai/v2/minimax-speech-02-hd/runsync" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "Welcome to our advanced text-to-speech system. This is a demonstration of natural speech synthesis.",
      "voice_id": "Wise_Woman",
      "speed": 1,
      "volume": 1,
      "pitch": 0,
      "emotion": "happy",
      "english_normalization": false
    }
  }'
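
The same request can also be sent from Python. The sketch below uses only the standard library and mirrors the documented defaults. `build_payload` and `synthesize` are hypothetical helper names, not part of any Runpod SDK, and the emotion check assumes the six listed options are the complete set, which the parameter reference does not guarantee.

```python
import json
import os
import urllib.request

ENDPOINT = "https://api.runpod.ai/v2/minimax-speech-02-hd/runsync"

# Emotions named in the parameter reference; the real list may be longer.
DOCUMENTED_EMOTIONS = {"happy", "sad", "neutral", "angry", "fearful", "surprised"}


def build_payload(prompt, voice_id="Wise_Woman", speed=1, volume=1, pitch=0,
                  emotion="neutral", english_normalization=False,
                  default_audio_url=None):
    """Wrap the parameters in the `input` object the endpoint expects."""
    if not prompt:
        raise ValueError("prompt is required")
    if emotion not in DOCUMENTED_EMOTIONS:
        # Assumption: reject anything outside the documented options.
        raise ValueError(f"undocumented emotion: {emotion!r}")
    params = {
        "prompt": prompt,
        "voice_id": voice_id,
        "speed": speed,
        "volume": volume,
        "pitch": pitch,
        "emotion": emotion,
        "english_normalization": english_normalization,
    }
    if default_audio_url is not None:
        params["default_audio_url"] = default_audio_url
    return {"input": params}


def synthesize(payload, api_key=None, timeout=120):
    """POST the payload to the runsync endpoint and return the parsed JSON body."""
    api_key = api_key or os.environ["RUNPOD_API_KEY"]
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `synthesize(build_payload("Hello there.", emotion="happy"))` with `RUNPOD_API_KEY` set in the environment would send the equivalent of the curl request above. The 120-second timeout is an arbitrary choice for this sketch.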

Response

id (string): Unique identifier for the request.
status (string): Request status. Returns COMPLETED on success, FAILED on error.
delayTime (integer): Time in milliseconds the request spent in queue before processing began.
executionTime (integer): Time in milliseconds the model took to generate the audio.
workerId (string): Identifier of the worker that processed the request.
output (object): The generation result containing the audio URL and cost.
output.audio_url (string): URL of the generated audio file. This URL expires after 7 days.
output.cost (float): Cost of the generation in USD.

{
  "id": "sync-a1b2c3d4-e5f6-7890-abcd-ef1234567890-u1",
  "status": "COMPLETED",
  "delayTime": 14,
  "executionTime": 3456,
  "workerId": "oqk7ao1uomckye",
  "output": {
    "audio_url": "https://audio.runpod.ai/abc123/output.mp3",
    "cost": 0.0055
  }
}
Audio URLs expire after 7 days. Download and store generated audio files immediately if you need to keep them.
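
One way to act on this is to check the status field and save the file as soon as the response arrives. The following is a minimal sketch using the Python standard library; `extract_audio_url` and `save_audio` are hypothetical helper names, and it assumes the audio URL can be fetched with a plain GET and no extra headers.

```python
import urllib.request


def extract_audio_url(response):
    """Return output.audio_url, or raise if the request did not complete."""
    status = response.get("status")
    if status != "COMPLETED":
        raise RuntimeError(f"request {response.get('id')} ended with status {status}")
    return response["output"]["audio_url"]


def save_audio(response, path):
    """Download the generated audio to `path` before the URL expires."""
    url = extract_audio_url(response)
    # Assumption: the URL is publicly readable without authentication.
    with urllib.request.urlopen(url, timeout=60) as remote, open(path, "wb") as local:
        local.write(remote.read())
    return path
```

For the example response above, `extract_audio_url` returns the value of `output.audio_url`, and `save_audio(response, "output.mp3")` would write the file to the current directory.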

Cost calculation

Minimax Speech 02 HD charges $0.05 per 1000 characters. Example costs:
Characters    Cost
500           $0.025
1,000         $0.05
10,000        $0.50
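
The arithmetic behind these figures can be sketched in Python. This assumes billing is strictly linear in the character count, with no minimum charge or rounding, which the pricing line implies but does not state; `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

PRICE_PER_1000_CHARS = Decimal("0.05")  # USD, from the stated pricing


def estimate_cost(char_count):
    """Estimate the USD cost for `char_count` characters, assuming linear pricing."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    return PRICE_PER_1000_CHARS * char_count / 1000
```

To estimate a prompt before sending it, pass `len(prompt)`. Python counts Unicode code points, which may differ from how the service counts characters, so treat the result as an estimate and rely on `output.cost` for the actual charge.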