Streaming Output: Make AI Responses Smoother

Streaming lets users watch the AI's response appear in real time instead of waiting for the full generation to finish, which greatly improves perceived responsiveness and interactivity. It is a must-have technique for modern AI applications.

Streaming Technology Comparison

Technology    | Real-time   | Complexity | Browser support | Use cases
SSE           | ⭐⭐⭐⭐    | Low        | Excellent       | One-way push
WebSocket     | ⭐⭐⭐⭐⭐  | Medium     | Good            | Bidirectional communication
Long polling  | ⭐⭐⭐      | Low        | Universal       | High compatibility
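
SSE and WebSocket implementations follow below. Long polling needs no dedicated browser API: the client simply re-issues a request after each response, and the server holds each request open until new data is ready. A minimal client-side sketch, assuming a hypothetical /api/poll endpoint that returns { chunks, nextCursor, done } as JSON:

// Long-polling loop (illustrative; /api/poll and its response shape are assumptions)
async function pollLoop(onChunk) {
  let cursor = 0; // position of the last chunk we have already received

  while (true) {
    const res = await fetch(`/api/poll?cursor=${cursor}`);
    if (!res.ok) break; // a real client would retry with backoff instead

    const { chunks, nextCursor, done } = await res.json();
    chunks.forEach(onChunk); // render each newly arrived piece of text
    cursor = nextCursor;

    if (done) break; // the server signals that generation is finished
  }
}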

SSE Implementation Example

Server-side Implementation

// Node.js SSE server (Express + the official OpenAI SDK)
import express from 'express';
import OpenAI from 'openai';

const app = express();
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

app.get('/api/stream', async (req, res) => {
  // Set SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('Access-Control-Allow-Origin', '*');
  res.flushHeaders(); // send headers right away so the client can open the stream
  
  // Create streaming response from the user's prompt
  // (EventSource can only issue GET requests, so the prompt arrives as a query parameter)
  const stream = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: req.query.prompt }],
    stream: true
  });
  
  // Send each token to the client as an SSE "data:" line
  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || '';
    if (content) {
      res.write(`data: ${JSON.stringify({ content })}\n\n`);
    }
  }
  
  // Send done signal and close the response
  res.write('data: [DONE]\n\n');
  res.end();
});

app.listen(3000);

Client-side Implementation

// Browser SSE client
class StreamClient {
  constructor(url) {
    this.url = url;
    this.eventSource = null;
  }
  
  start(onMessage, onError) {
    this.eventSource = new EventSource(this.url);
    
    this.eventSource.onmessage = (event) => {
      if (event.data === '[DONE]') {
        this.close();
        return;
      }
      
      try {
        const data = JSON.parse(event.data);
        onMessage(data.content);
      } catch (e) {
        console.error('Parse error:', e);
      }
    };
    
    this.eventSource.onerror = (error) => {
      onError(error);
      this.close();
    };
  }
  
  close() {
    if (this.eventSource) {
      this.eventSource.close();
      this.eventSource = null;
    }
  }
}

// Usage example
const client = new StreamClient('/api/stream?prompt=Hello');
let fullText = '';

client.start(
  (content) => {
    fullText += content;
    updateUI(fullText); // Update UI in real time
  },
  (error) => {
    console.error('Stream error:', error);
  }
);

WebSocket Implementation

// WebSocket server (ws + the official OpenAI SDK)
import { WebSocketServer } from 'ws';
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', (ws) => {
  ws.on('message', async (message) => {
    const data = JSON.parse(message); // expects { messages: [...] }
    
    // Create streaming response
    const stream = await openai.chat.completions.create({
      model: 'gpt-4',
      messages: data.messages,
      stream: true
    });
    
    // Forward each token as it arrives
    for await (const chunk of stream) {
      const content = chunk.choices[0]?.delta?.content || '';
      if (content) {
        ws.send(JSON.stringify({
          type: 'content',
          data: content
        }));
      }
    }
    
    // Send done signal
    ws.send(JSON.stringify({ type: 'done' }));
  });
});

// WebSocket client
class WSClient {
  constructor(url, { onContent, onComplete }) {
    this.onContent = onContent;   // called with every streamed text fragment
    this.onComplete = onComplete; // called once the server sends the done signal
    this.ws = new WebSocket(url);
    this.setupHandlers();
  }
  
  setupHandlers() {
    this.ws.onopen = () => {
      console.log('Connected');
    };
    
    this.ws.onmessage = (event) => {
      const message = JSON.parse(event.data);
      
      switch (message.type) {
        case 'content':
          this.onContent(message.data);
          break;
        case 'done':
          this.onComplete();
          break;
      }
    };
  }
  
  send(messages) {
    this.ws.send(JSON.stringify({ messages }));
  }
}
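
A usage example mirroring the SSE client above (the element id and local server address are illustrative assumptions):

// Usage example
const wsClient = new WSClient('ws://localhost:8080', {
  onContent: (content) => {
    document.querySelector('#output').append(content); // append tokens as they arrive
  },
  onComplete: () => {
    console.log('Generation finished');
  }
});

// Wait for the connection to open before sending the first request
wsClient.ws.addEventListener('open', () => {
  wsClient.send([{ role: 'user', content: 'Hello' }]);
});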

Streaming Best Practices

Performance Optimization

  • Use buffers to reduce network overhead
  • Batch small payloads (see the batching sketch after this list)
  • Implement backpressure control
  • Set reasonable timeouts
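
Batching can be as simple as accumulating tokens in a buffer and flushing it on a short interval instead of writing every delta immediately. A minimal sketch against the SSE handler above, where res and stream come from that example and the 50 ms flush interval is an illustrative choice:

// Buffer streamed tokens and flush roughly every 50 ms to avoid many tiny writes
let buffer = '';
const flushTimer = setInterval(() => {
  if (buffer) {
    res.write(`data: ${JSON.stringify({ content: buffer })}\n\n`);
    buffer = '';
  }
}, 50);

for await (const chunk of stream) {
  buffer += chunk.choices[0]?.delta?.content || '';
}

// Final flush, then stop the timer and close the stream
clearInterval(flushTimer);
if (buffer) {
  res.write(`data: ${JSON.stringify({ content: buffer })}\n\n`);
}
res.write('data: [DONE]\n\n');
res.end();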

Error Handling

  • Auto-reconnect (see the backoff sketch after this list)
  • Resume support
  • Graceful degradation strategies
  • Surface error states to the user
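
Auto-reconnect can wrap the StreamClient from the SSE example and re-open the connection after a delay that grows with each failed attempt. A minimal exponential-backoff sketch; the retry cap and delays are illustrative assumptions:

// Reconnect wrapper around the StreamClient from the SSE example
function startWithRetry(url, onMessage, maxRetries = 5) {
  let attempt = 0;

  const connect = () => {
    const client = new StreamClient(url);
    client.start(onMessage, () => {
      if (attempt >= maxRetries) {
        console.error('Giving up after', attempt, 'retries');
        return;
      }
      // Exponential backoff: 1 s, 2 s, 4 s, ... capped at 30 s
      const delay = Math.min(1000 * 2 ** attempt, 30000);
      attempt++;
      setTimeout(connect, delay);
    });
  };

  connect();
}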

User Experience Optimization

Typewriter Effect

// Typewriter effect implementation
function typeWriter(element, text, speed = 50) {
  let index = 0;
  
  function type() {
    if (index < text.length) {
      element.textContent += text.charAt(index);
      index++;
      setTimeout(type, speed);
    }
  }
  
  type();
}
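
With streaming, text arrives in chunks rather than as one finished string, so a common variant queues incoming characters and drains them at a steady pace. A minimal sketch that can be fed directly from the StreamClient above; the 20 ms drain interval and #output element are illustrative assumptions:

// Queue-based typewriter: feed it streamed chunks, it renders one character per tick
// (a production version would also clear the interval once the stream completes)
function createTypewriter(element, speed = 20) {
  let queue = '';
  setInterval(() => {
    if (queue) {
      element.textContent += queue[0];
      queue = queue.slice(1);
    }
  }, speed);
  return (chunk) => { queue += chunk; }; // call this with each streamed chunk
}

// Usage with the SSE client
const push = createTypewriter(document.querySelector('#output'));
new StreamClient('/api/stream?prompt=Hello').start(push, console.error);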

Loading Animations

Show a blinking cursor, progress bar, or skeleton screen while waiting for the first token, so users can tell a response is on its way.
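
A sketch of the blinking-cursor approach, removing the placeholder once the first token arrives (the #output element and blinking-cursor CSS class are assumptions):

// Show a placeholder cursor until the first streamed token arrives
const output = document.querySelector('#output');
const cursor = document.createElement('span');
cursor.textContent = '▋';
cursor.className = 'blinking-cursor'; // assumes a CSS animation that toggles opacity
output.append(cursor);

let waiting = true;
new StreamClient('/api/stream?prompt=Hello').start(
  (content) => {
    if (waiting) { cursor.remove(); waiting = false; }
    output.append(content); // streamed text replaces the placeholder
  },
  (err) => { cursor.remove(); console.error(err); }
);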

Deliver Smooth AI Interactions

Use streaming output to make your AI applications feel faster and more delightful to use.
