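Speech Synthesis (TTS)

Text-to-speech, OpenAI-compatible. Requests are routed by model to the matching TTS upstream (OpenAI-compatible TTS or Alibaba Qwen-TTS), and the response is binary audio.

Production: POST https://api.lolai.lol/v1/audio/speech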

Request parameters

Header parameters

Authorization (string, required)

API key, in the form Bearer <key>.

Example: Bearer sk-lolai-xxx

Content-Type (string, required)

Example: application/json

Body parameters (application/json)

model (string, required)

TTS model. See "Available models" below.

Example: qwen3-tts-flash

input (string, required)

The text to synthesize. Billed per character.

Example: 你好,这是语音合成测试。

voice (string, optional)

Voice name. Valid values depend on the model (for example, Cherry for Qwen-TTS; alloy and others for OpenAI).

Example: Cherry

response_format (string, optional)

One of mp3 / opus / aac / flac / wav / pcm. Some upstreams only produce a fixed format.

Example: mp3

speed (number, optional)

Speech rate. Supported by some upstreams only.

Example: 1.0
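As a rough illustration of how the header and body parameters above fit together, here is a minimal Python sketch using only the standard library. The helper names build_speech_request and synthesize are hypothetical and not part of any LOLAI SDK; the sketch assumes that optional fields can simply be omitted when unset.

```python
import json
import urllib.request

API_URL = "https://api.lolai.lol/v1/audio/speech"


def build_speech_request(api_key, model, text, voice=None,
                         response_format=None, speed=None):
    """Build (without sending) the POST request for /v1/audio/speech."""
    body = {"model": model, "input": text}
    # Assumption: optional fields may be left out when unset,
    # so that the upstream default applies.
    optional = {"voice": voice, "response_format": response_format, "speed": speed}
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesize(api_key, model, text, **options):
    """Send the request and return the raw audio bytes."""
    request = build_speech_request(api_key, model, text, **options)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

A call such as synthesize("sk-lolai-xxx", "qwen3-tts-flash", "你好", voice="Cherry") would then return bytes that can be written straight to a file.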

Billing

Billed by the number of input characters (not by audio duration or tokens). Failed requests are not charged.
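To get a feel for per-character billing, the sketch below counts characters in Python. It assumes that one Unicode code point counts as one billable character, which this page does not confirm; price_per_char is a hypothetical rate supplied by the caller, not a published price.

```python
def count_billable_chars(text: str) -> int:
    # Assumption: one Unicode code point = one billable character.
    return len(text)


def estimate_cost(text: str, price_per_char: float) -> float:
    # price_per_char is a hypothetical rate, not a documented price.
    return count_billable_chars(text) * price_per_char


if __name__ == "__main__":
    sample = "你好,这是语音合成测试。"
    print(count_billable_chars(sample))   # characters (code points)
    print(len(sample.encode("utf-8")))    # UTF-8 bytes, a larger number for CJK text
```

The character count and the UTF-8 byte count differ for Chinese text, so it is worth checking against actual usage records which one the bill reflects.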

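Speech-to-Text (STT)

POST /v1/audio/transcriptions

Upload the audio as multipart form data (field name: file) and receive the transcribed text. Billed by audio duration. response_format: json (default, returns { text }) / text / verbose_json. Currently integrated: Alibaba Qwen3-ASR (qwen3-asr-flash-*) and OpenAI Whisper (whisper-1, requires upstream support).

Speech-to-text · curl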

curl https://api.lolai.lol/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-lolai-xxx" \
  -F "model=qwen3-asr-flash-2026-02-10" \
  -F "file=@audio.wav" \
  -F "response_format=json"
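The same upload can be sketched in Python with only the standard library by assembling the multipart body by hand. The helper names below are hypothetical, the file part is labeled application/octet-stream as an assumption, and the shape of verbose_json is assumed to be a JSON object that is returned as-is.

```python
import json
import os
import urllib.request
import uuid

STT_URL = "https://api.lolai.lol/v1/audio/transcriptions"


def encode_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Return (body, content_type) for a multipart/form-data upload."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcription_request(api_key, model, filename, file_bytes,
                                response_format="json"):
    """Build (without sending) the POST request for /v1/audio/transcriptions."""
    body, content_type = encode_multipart(
        {"model": model, "response_format": response_format},
        "file", filename, file_bytes,
    )
    return urllib.request.Request(
        STT_URL, data=body, method="POST",
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": content_type},
    )


def parse_transcription(raw, response_format="json"):
    """Extract the result according to the requested response_format."""
    if response_format == "text":
        return raw                      # plain text body
    parsed = json.loads(raw)
    if response_format == "json":
        return parsed["text"]           # documented default shape: { text }
    return parsed                       # verbose_json: returned unchanged


def transcribe(api_key, model, path, response_format="json"):
    """Upload the file at path and return the transcription."""
    with open(path, "rb") as f:
        audio = f.read()
    request = build_transcription_request(
        api_key, model, os.path.basename(path), audio, response_format)
    with urllib.request.urlopen(request) as response:
        return parse_transcription(response.read().decode("utf-8"),
                                   response_format)
```

Building the body by hand avoids a third-party dependency; with an HTTP client library that supports file uploads, the multipart encoding step would normally be handled for you.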

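Available models

TTS models are listed in the Model Plaza (filter by the speech type). Currently integrated: Alibaba Qwen-TTS (qwen3-tts-flash and others); OpenAI-compatible TTS (tts-1 and others, requires upstream support).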

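Request and response body

Use the example below to confirm the request format and the response structure. To send a request online, click "Debug" at the top of the page to open the online runner panel.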

curl https://api.lolai.lol/v1/audio/speech \
  -H "Authorization: Bearer sk-lolai-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-tts-flash",
    "input": "你好,这是 LOLAI 的语音合成测试。",
    "voice": "Cherry"
  }' \
  --output speech.wav

The response is binary audio (for example audio/wav or audio/mpeg), not JSON. Write it directly to a file or play it.
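Because the body is raw audio, a client may want to pick the file extension from the response Content-Type instead of hard-coding it. The sketch below is one way to do that in Python; the helper names are hypothetical, and apart from audio/wav and audio/mpeg the MIME types in the table are assumptions about what an upstream might return.

```python
# Content-Type to file extension. Only audio/wav and audio/mpeg are
# mentioned on this page; the other entries are assumptions.
EXTENSIONS = {
    "audio/mpeg": ".mp3",
    "audio/wav": ".wav",
    "audio/x-wav": ".wav",
    "audio/flac": ".flac",
    "audio/aac": ".aac",
    "audio/ogg": ".ogg",
}


def extension_for(content_type: str, default: str = ".bin") -> str:
    """Map a Content-Type header value to a file extension."""
    # Drop parameters such as "; charset=..." before the lookup.
    mime = content_type.split(";", 1)[0].strip().lower()
    return EXTENSIONS.get(mime, default)


def save_audio(audio_bytes: bytes, content_type: str, stem: str = "speech") -> str:
    """Write the audio bytes to <stem><extension> and return the path."""
    path = stem + extension_for(content_type)
    with open(path, "wb") as f:   # binary mode: the body is not text
        f.write(audio_bytes)
    return path
```

An unknown type falls back to .bin, which makes it easy to notice when the response was not the audio format you expected.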