Streaming Output: Make AI Responses Smoother
Streaming output lets users watch the AI's response being generated in real time, which greatly improves perceived responsiveness. It is a must-have technique for modern AI applications.
Streaming Technology Comparison
| Technology | Real-time | Complexity | Browser support | Use cases |
|---|---|---|---|---|
| SSE | ⭐⭐⭐⭐ | Low | Excellent | One-way push |
| WebSocket | ⭐⭐⭐⭐⭐ | Medium | Good | Bidirectional communication |
| Long polling | ⭐⭐⭐ | Low | Universal | High compatibility |
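Long polling is the only option of the three not demonstrated below. As a rough sketch (the `/api/poll` endpoint and its `{ done, content }` response shape are hypothetical, not a real API), it boils down to a request loop:

```javascript
// Minimal long-polling loop. Illustrative only: the /api/poll
// endpoint and its { done, content } response shape are hypothetical.
async function longPoll(sessionId, onChunk) {
  let done = false;
  while (!done) {
    const res = await fetch(`/api/poll?session=${sessionId}`);
    const body = await res.json();
    if (body.content) onChunk(body.content); // deliver the next chunk
    done = body.done;                        // server signals completion
  }
}
```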
SSE Implementation Example
Server-side Implementation
```javascript
// Node.js SSE server
import express from 'express';
import OpenAI from 'openai';

const app = express();
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

app.get('/api/stream', async (req, res) => {
// Set SSE headers
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('Connection', 'keep-alive');
res.setHeader('Access-Control-Allow-Origin', '*');
// Create streaming response
const stream = await openai.chat.completions.create({
model: 'gpt-4',
messages: JSON.parse(req.query.messages), // query values are strings, so the array is JSON-encoded
stream: true
});
// Send data
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
if (content) {
res.write(`data: ${JSON.stringify({ content })}\n\n`);
}
}
// Send done signal
res.write('data: [DONE]\n\n');
res.end();
});
```
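One operational detail worth noting: proxies and load balancers often drop HTTP connections they consider idle. A common safeguard is a periodic SSE comment line as a heartbeat. The fragment below is a sketch meant to live inside the route handler above, before the streaming loop (the 15-second interval is an arbitrary assumption):

```javascript
// SSE comment lines (starting with ':') are ignored by EventSource,
// so they can keep intermediaries from closing an idle connection.
// Place this inside the route handler, before the streaming loop.
const keepAlive = setInterval(() => res.write(':keep-alive\n\n'), 15000);
req.on('close', () => clearInterval(keepAlive)); // stop when the client disconnects
```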
Client-side Implementation
```javascript
// Browser SSE client
class StreamClient {
constructor(url) {
this.url = url;
this.eventSource = null;
}
start(onMessage, onError) {
this.eventSource = new EventSource(this.url);
this.eventSource.onmessage = (event) => {
if (event.data === '[DONE]') {
this.close();
return;
}
try {
const data = JSON.parse(event.data);
onMessage(data.content);
} catch (e) {
console.error('Parse error:', e);
}
};
this.eventSource.onerror = (error) => {
onError(error);
this.close();
};
}
close() {
if (this.eventSource) {
this.eventSource.close();
this.eventSource = null;
}
}
}
// Usage example
const messages = encodeURIComponent(JSON.stringify([{ role: 'user', content: 'Hello' }]));
const client = new StreamClient(`/api/stream?messages=${messages}`);
let fullText = '';
client.start(
(content) => {
fullText += content;
updateUI(fullText); // Update UI in real time
},
(error) => {
console.error('Stream error:', error);
}
);
```

WebSocket Implementation
```javascript
// WebSocket server
import { WebSocketServer } from 'ws';
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const wss = new WebSocketServer({ port: 8080 });
wss.on('connection', (ws) => {
ws.on('message', async (message) => {
const data = JSON.parse(message.toString()); // ws delivers a Buffer
// Create streaming response
const stream = await openai.chat.completions.create({
model: 'gpt-4',
messages: data.messages,
stream: true
});
// Stream send
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
if (content) {
ws.send(JSON.stringify({
type: 'content',
data: content
}));
}
}
// Send done signal
ws.send(JSON.stringify({ type: 'done' }));
});
});
// WebSocket client
class WSClient {
constructor(url, { onContent, onComplete } = {}) {
this.ws = new WebSocket(url);
this.onContent = onContent || (() => {}); // called for each streamed chunk
this.onComplete = onComplete || (() => {}); // called on the done signal
this.setupHandlers();
}
setupHandlers() {
this.ws.onopen = () => {
console.log('Connected');
};
this.ws.onmessage = (event) => {
const message = JSON.parse(event.data);
switch (message.type) {
case 'content':
this.onContent(message.data);
break;
case 'done':
this.onComplete();
break;
}
};
}
send(messages) {
this.ws.send(JSON.stringify({ messages }));
}
}
```
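For completeness, wiring the client up might look like the following sketch (it assumes the server above is running locally on port 8080, and it overrides the default onopen logger from setupHandlers):

```javascript
// Usage sketch for WSClient
let fullText = '';
const client = new WSClient('ws://localhost:8080', {
  onContent: (content) => { fullText += content; }, // accumulate chunks
  onComplete: () => console.log('Done:', fullText)  // stream finished
});
// Send the request once the socket is open
client.ws.onopen = () => client.send([{ role: 'user', content: 'Hello' }]);
```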
Streaming Best Practices
Performance Optimization
- Use buffers to reduce network overhead
- Batch small payloads (see the sketch after this list)
- Implement backpressure control
- Set reasonable timeouts
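Token deltas are often only a few characters long, and sending each one as its own event wastes network overhead. A minimal batching sketch (the `createBatcher` helper and its 50 ms flush window are assumptions, not a standard API) coalesces chunks before writing:

```javascript
// Coalesce tiny deltas into one SSE event per flush window.
// 'res' is an Express response with SSE headers already set;
// the 50 ms window is a tunable assumption.
function createBatcher(res, flushMs = 50) {
  let buffer = '';
  const flush = () => {
    if (buffer) {
      res.write(`data: ${JSON.stringify({ content: buffer })}\n\n`);
      buffer = '';
    }
  };
  const timer = setInterval(flush, flushMs);
  return {
    push(content) { buffer += content; },    // accumulate deltas
    end() { clearInterval(timer); flush(); } // flush the remainder and stop
  };
}
```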
Error Handling
- Auto-reconnect (see the sketch after this list)
- Resume support (the browser resends the SSE Last-Event-ID header on reconnect)
- Graceful degradation strategies
- Surface error states in the UI
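EventSource retries transient network errors on its own, but once close() is called (as StreamClient does in its error handler) you need an explicit retry loop. A rough auto-reconnect sketch with exponential backoff (the delay constants are arbitrary assumptions):

```javascript
// Reconnect wrapper around the StreamClient class defined earlier.
// The backoff doubles per failure and caps at 30 s; both are assumptions.
function startWithRetry(url, onMessage, attempt = 0) {
  const client = new StreamClient(url);
  client.start(onMessage, () => {
    const delay = Math.min(1000 * 2 ** attempt, 30000);
    console.warn(`Stream dropped, retrying in ${delay} ms`);
    setTimeout(() => startWithRetry(url, onMessage, attempt + 1), delay);
  });
}
```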
User Experience Optimization
Typewriter Effect
```javascript
// Typewriter effect implementation
function typeWriter(element, text, speed = 50) {
let index = 0;
function type() {
if (index < text.length) {
element.textContent += text.charAt(index);
index++;
setTimeout(type, speed);
}
}
type();
}
```
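typeWriter assumes the full text is available up front. With streaming, chunks arrive while typing is still in progress, so a small queue fits better. A sketch (`createTypewriter` and the 20 ms speed are assumptions):

```javascript
// Streaming-friendly typewriter: chunks can be enqueued at any time
// and are typed out one character per tick. The interval is never
// cleared in this sketch; stop it when the stream completes.
function createTypewriter(element, speed = 20) {
  let queue = '';
  setInterval(() => {
    if (queue) {
      element.textContent += queue[0]; // type one character
      queue = queue.slice(1);
    }
  }, speed);
  return { enqueue: (chunk) => { queue += chunk; } };
}

// Feed it from the SSE client above:
// const tw = createTypewriter(document.getElementById('output'));
// client.start((content) => tw.enqueue(content), console.error);
```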
Loading Animations
Show a blinking cursor, progress bar, or skeleton screen while waiting for the first token.
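As one example, a blinking cursor can be a separate element toggled on a timer (the `#cursor` id and 500 ms blink rate are assumptions):

```javascript
// Blinking-cursor placeholder while waiting for the first token.
// Assumes markup like <span id="cursor">▋</span> next to the output.
const cursorEl = document.querySelector('#cursor');
const blink = setInterval(() => {
  cursorEl.style.visibility =
    cursorEl.style.visibility === 'hidden' ? 'visible' : 'hidden';
}, 500);
// On the first streamed chunk: clearInterval(blink); cursorEl.remove();
```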
Deliver Smooth AI Interactions
Use streaming output to make your AI applications faster and more delightful.