Text-to-Speech (TTS) API Tutorial
High-quality speech synthesis with multilingual, multi-voice support for natural output
- 6 voice options: natural human voices
- Adjustable speed: 0.25x–4.0x
- Streaming playback: real-time audio output
- HD quality: high-fidelity output
1. Basic speech synthesis
Getting Started
import openai
from pathlib import Path

client = openai.OpenAI(
    api_key="your-api-key",
    base_url="https://api.n1n.ai/v1"
)

# Basic text-to-speech
response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",  # Options: alloy, echo, fable, onyx, nova, shimmer
    input="Welcome to N1N's text-to-speech API, supporting multiple voices and languages."
)

# Save the audio file
response.stream_to_file("output.mp3")

# High-quality version
response_hd = client.audio.speech.create(
    model="tts-1-hd",  # Higher quality with slightly higher latency
    voice="nova",
    speed=1.0,  # Speech rate, 0.25-4.0
    input="This is a high-quality speech synthesis demo."
)
response_hd.stream_to_file("output_hd.mp3")
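The create call also accepts a response_format parameter if you need something other than MP3; on OpenAI-compatible endpoints the supported values typically include mp3, opus, aac, flac, wav, and pcm. A minimal sketch reusing the client above:

# Request a different container/codec (available response_format values depend on the backend)
response_wav = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    response_format="wav",  # e.g. "mp3", "opus", "aac", "flac", "wav", "pcm"
    input="Same text, different audio format."
)
response_wav.stream_to_file("output.wav")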
2. Voice selection guide

- Alloy: neutral, balanced voice. Best for general scenarios and news reporting.
- Echo: male, deep and resonant. Best for serious content and educational videos.
- Fable: British accent, elegant. Best for audiobooks and story narration.
- Onyx: male, deep and magnetic. Best for podcasts and documentaries.
- Nova: female, clear and friendly. Best for customer service and navigation systems.
- Shimmer: female, warm and friendly. Best for children's content and assistants.
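The quickest way to choose is by ear: generate the same sentence with each voice and compare. A short sketch reusing the client from section 1 (the sample sentence is arbitrary):

# Generate one sample clip per voice for side-by-side comparison
voices = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"]
sample_text = "The quick brown fox jumps over the lazy dog."

for voice in voices:
    response = client.audio.speech.create(
        model="tts-1",
        voice=voice,
        input=sample_text
    )
    response.stream_to_file(f"sample_{voice}.mp3")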
3. Streaming audio playback
Real-time playback
import openai
import pygame
import io

# Same client configuration as in section 1
client = openai.OpenAI(
    api_key="your-api-key",
    base_url="https://api.n1n.ai/v1"
)

# Generate speech and play it back with pygame
def stream_and_play_audio(text: str, voice: str = "alloy"):
    response = client.audio.speech.create(
        model="tts-1",
        voice=voice,
        input=text,
        response_format="mp3"
    )

    # Initialize the pygame mixer
    pygame.mixer.init()

    # Wrap the response bytes in an in-memory buffer
    audio_stream = io.BytesIO(response.content)

    # Load and play
    pygame.mixer.music.load(audio_stream)
    pygame.mixer.music.play()

    # Wait for playback to finish
    while pygame.mixer.music.get_busy():
        pygame.time.Clock().tick(10)
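Calling the helper is then a single line (the sentence below is placeholder text):

stream_and_play_audio("Streaming playback makes the assistant feel responsive.", voice="nova")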
# Real-time voice assistant (awaiting the API call requires the AsyncOpenAI client)
async_client = openai.AsyncOpenAI(
    api_key="your-api-key",
    base_url="https://api.n1n.ai/v1"
)

class VoiceAssistant:
    def __init__(self):
        self.voice = "nova"
        self.speed = 1.0

    async def speak(self, text: str):
        """Asynchronous speech output"""
        response = await async_client.audio.speech.create(
            model="tts-1",
            voice=self.voice,
            speed=self.speed,
            input=text
        )
        # Stream playback
        await self.play_audio_stream(response)

    async def play_audio_stream(self, audio_data):
        # Implement streaming audio playback here (e.g. feed bytes to an audio device)
        pass
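Recent versions of the openai Python SDK also expose a with_streaming_response variant that yields audio chunks as they arrive instead of buffering the whole file, which helps with long passages; a minimal sketch (the file name and sample text are arbitrary):

# Write audio chunks to disk as they arrive; the same loop could feed an audio device
with client.audio.speech.with_streaming_response.create(
    model="tts-1",
    voice="alloy",
    input="Streaming keeps latency low for long passages."
) as response:
    with open("streamed.mp3", "wb") as f:
        for chunk in response.iter_bytes():
            f.write(chunk)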
# Node.js streaming example
const stream = await openai.audio.speech.create({
model: "tts-1",
voice: "alloy",
input: text,
stream: true
});
// Pipe to an audio player
stream.pipe(audioPlayer);4. Multilingual support
Internationalization use cases
# Multilingual support
languages = {
    "english": "Hello, this is English text to speech.",
    "chinese": "你好, 这是中文语音合成。",
    "japanese": "こんにちは、これは日本語の音声合成です。",
    "korean": "안녕하세요, 이것은 한국어 음성 합성입니다.",
    "spanish": "Hola, esta es la síntesis de voz en español.",
    "french": "Bonjour, ceci est la synthèse vocale française.",
    "german": "Hallo, das ist deutsche Sprachsynthese.",
    "russian": "Привет, это русский синтез речи."
}

# Batch-generate multilingual audio
for lang, text in languages.items():
    response = client.audio.speech.create(
        model="tts-1-hd",
        voice="nova",  # Nova handles multiple languages well
        input=text
    )
    response.stream_to_file(f"output_{lang}.mp3")
    print(f"Generated {lang} audio")
# SSML support (advanced feature)
# Note: whether SSML tags are interpreted or read out literally depends on the backend model
ssml_text = """
<speak>
    <prosody rate="slow">Speak this part more slowly.</prosody>
    <break time="500ms"/>
    <prosody pitch="+2st">Make this part higher pitch.</prosody>
    <emphasis level="strong">This is important!</emphasis>
</speak>
"""

response = client.audio.speech.create(
    model="tts-1-hd",
    voice="alloy",
    input=ssml_text,
    response_format="mp3"
)

5. Use cases
📚 Content creation
- ✅ Audiobooks (see the chunking sketch at the end of this section)
- ✅ Podcast voiceovers
- ✅ Video narration
- ✅ Course explanations
🤖 Intelligent interactions
- ✅ Voice assistants
- ✅ Customer support
- ✅ Navigation announcements
- ✅ Accessibility applications
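Long-form content such as audiobooks or course explanations usually exceeds the per-request input limit (around 4,096 characters on OpenAI-compatible endpoints, though the exact cap may differ), so the text has to be split and synthesized in pieces. A minimal sketch, assuming the client from section 1 and simple paragraph-based splitting:

# Split long text on paragraph boundaries and synthesize each chunk separately
# (assumes no single paragraph exceeds max_chars)
def synthesize_long_text(text: str, voice: str = "fable", max_chars: int = 4000):
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        if current and len(current) + len(paragraph) + 2 > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = f"{current}\n\n{paragraph}" if current else paragraph
    if current:
        chunks.append(current)

    for i, chunk in enumerate(chunks):
        response = client.audio.speech.create(
            model="tts-1-hd",
            voice=voice,
            input=chunk
        )
        response.stream_to_file(f"part_{i:03d}.mp3")

The numbered part files can then be concatenated into a single track with an audio tool such as ffmpeg.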