Technology

Streaming AI Responses with Next.js and the Vercel AI SDK

Learn how to build real-time streaming AI applications using HTTP streaming techniques, Server-Sent Events, and the Vercel AI SDK with OpenAI's GPT-4o. From basic SSE fundamentals to streaming structured JSON objects validated by Zod schemas.

15 min read

Complete Tutorial Code

Follow along with the complete source code for this Streaming AI tutorial. Includes all four streaming patterns built with Next.js, the Vercel AI SDK, and OpenAI GPT-4o.

View on GitHub

Introduction

Streaming AI responses has become a cornerstone of modern AI-powered applications. Rather than waiting for a complete response before displaying anything to the user, streaming delivers content incrementally — token by token — creating a responsive, engaging experience that feels alive and immediate.

This tutorial collection walks you through four progressive streaming techniques, starting from the fundamentals of HTTP streaming with Server-Sent Events (SSE) and building up to streaming structured JSON objects validated by Zod schemas using OpenAI's GPT-4o and the Vercel AI SDK.

Tech Stack

Next.js 16 — App Router and Route Handlers for server-side streaming endpoints
Vercel AI SDK (ai, @ai-sdk/openai, @ai-sdk/react) — streamlined streaming primitives and React hooks
OpenAI GPT-4o — the AI model powering the text and object streaming tutorials
Zod — schema validation for streamed JSON objects
partial-json — incremental JSON parsing for real-time structured data rendering
Tailwind CSS v4 and TypeScript — styling and type safety throughout

Tutorial 1: HTTP Streaming with SSE

The first tutorial introduces the fundamentals of HTTP streaming without any AI involvement. You will learn how to stream data from the server to the client using Server-Sent Events (SSE) and the Web Streams API, delivering text word-by-word over a persistent HTTP connection.

Understanding SSE at this level is essential before layering in AI models. You will see exactly how a ReadableStream is constructed on the server, how the browser consumes the event stream, and how to handle connection lifecycle events.

// app/api/sse/route.ts
export async function GET() {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      const words = "Hello from the server!".split(" ");
      for (const word of words) {
        // Each SSE message is a "data:" line terminated by a blank line
        controller.enqueue(encoder.encode(`data: ${word}\n\n`));
        await new Promise((r) => setTimeout(r, 200));
      }
      controller.close();
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}

Route: /tutorials/sse → API: /api/sse
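On the client, this endpoint can be consumed with fetch and a stream reader. The sketch below is illustrative rather than the tutorial's actual client code; the parseSSE helper and consumeSSE function are assumed names:

```typescript
// Extract the payload of each complete SSE message ("data: ...\n\n") in a chunk.
export function parseSSE(chunk: string): string[] {
  return chunk
    .split("\n\n")
    .filter((msg) => msg.startsWith("data: "))
    .map((msg) => msg.slice("data: ".length));
}

// Browser usage: read the response body and emit each word as it arrives.
export async function consumeSSE(onWord: (word: string) => void) {
  const response = await fetch("/api/sse");
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Only messages ending in a blank line are complete; buffer the remainder.
    const lastBreak = buffer.lastIndexOf("\n\n");
    if (lastBreak === -1) continue;
    for (const word of parseSSE(buffer.slice(0, lastBreak + 2))) onWord(word);
    buffer = buffer.slice(lastBreak + 2);
  }
}
```

Buffering before parsing matters because the network can split a single `data:` line across two chunks; the EventSource API handles this for you, but a raw fetch reader does not.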

Tutorial 2: Streaming Structured JSON with partial-json

The second tutorial demonstrates how to parse incomplete JSON in real time as it streams token by token, and render a live structured UI using the partial-json library.

This tutorial simulates how a large language model streams a complex character profile JSON object chunk by chunk. As each token arrives, the UI progressively renders the fields that have been fully received — a pattern directly applicable to real AI applications.

// Client-side: parse partial JSON as it streams
import { parse } from "partial-json";

const response = await fetch("/api/partial-json");
const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  // Parse whatever JSON we have so far
  const partial = parse(buffer);
  renderUI(partial);
}

Route: /tutorials/partial-json → API: /api/partial-json
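For context, the server half of this tutorial only needs to slice a pre-built JSON string into small chunks; no AI is involved yet. A minimal sketch of what an /api/partial-json route handler could look like (the profile fields here are placeholders, not the tutorial's actual data):

```typescript
// app/api/partial-json/route.ts
// Simulates an LLM by streaming a JSON document a few characters at a time.
export async function GET() {
  const profile = JSON.stringify({
    name: "Aria",
    class: "Ranger",
    abilities: ["tracking", "archery"],
  });
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      for (let i = 0; i < profile.length; i += 4) {
        controller.enqueue(encoder.encode(profile.slice(i, i + 4)));
        await new Promise((r) => setTimeout(r, 50));
      }
      controller.close();
    },
  });

  return new Response(stream, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```

Because the client re-parses the whole buffer on every chunk, the chunk boundaries can fall anywhere, even mid-string, and the UI still renders whatever fields have arrived.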

Tutorial 3: Stream Text from GPT

The third tutorial connects to OpenAI for the first time. Using the Vercel AI SDK's streamText function on the server and the useCompletion hook on the client, you can stream GPT-4o responses token-by-token with minimal boilerplate.

// app/api/completion/route.ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export async function POST(req: Request) {
  const { prompt } = await req.json();

  const result = streamText({
    model: openai("gpt-4o"),
    prompt,
  });

  return result.toDataStreamResponse();
}
// Client component using useCompletion
"use client";
import { useCompletion } from "@ai-sdk/react";

export default function StreamTextPage() {
  const { completion, input, handleInputChange, handleSubmit } =
    useCompletion({ api: "/api/completion" });

  return (
    <form onSubmit={handleSubmit}>
      <input value={input} onChange={handleInputChange} />
      <button type="submit">Generate</button>
      <p>{completion}</p>
    </form>
  );
}

Route: /tutorials/stream-text → API: /api/completion

Tutorial 4: Stream Object with Zod Schema

The fourth and most advanced tutorial combines everything learned so far. Using streamText with Output.object and the experimental_useObject hook, you can stream a complex, nested JSON object in real time, validated by a Zod schema as each field arrives.

The demo generates a full RPG hero character profile from a user-provided context. As GPT-4o streams the response, each field of the hero profile appears progressively in the UI, fully type-safe and schema-validated.

// app/api/stream-object/schema.ts
import { z } from "zod";

export const heroSchema = z.object({
  name: z.string(),
  class: z.string(),
  level: z.number(),
  backstory: z.string(),
  stats: z.object({
    strength: z.number(),
    dexterity: z.number(),
    intelligence: z.number(),
    charisma: z.number(),
  }),
  abilities: z.array(z.string()),
  equipment: z.array(z.string()),
});
// app/api/stream-object/route.ts
import { streamText, Output } from "ai";
import { openai } from "@ai-sdk/openai";
import { heroSchema } from "./schema";

export async function POST(req: Request) {
  const { context } = await req.json();

  const result = streamText({
    model: openai("gpt-4o"),
    prompt: `Generate an RPG hero based on: ${context}`,
    experimental_output: Output.object({ schema: heroSchema }),
  });

  return result.toDataStreamResponse();
}

Route: /tutorials/stream-object → API: /api/stream-object
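On the client, the experimental_useObject hook pairs with this endpoint: it posts the input, consumes the stream, and exposes a progressively filled partial object. A minimal sketch, assuming the page lives at app/tutorials/stream-object/page.tsx and can import the heroSchema shown above (the rendered markup is simplified):

```typescript
// app/tutorials/stream-object/page.tsx
"use client";
import { experimental_useObject as useObject } from "@ai-sdk/react";
import { heroSchema } from "@/app/api/stream-object/schema";

export default function StreamObjectPage() {
  const { object, submit, isLoading } = useObject({
    api: "/api/stream-object",
    schema: heroSchema,
  });

  return (
    <div>
      <button
        onClick={() => submit({ context: "a desert wanderer" })}
        disabled={isLoading}
      >
        Generate hero
      </button>
      {/* Every field is optional until its tokens arrive, so render defensively. */}
      <h2>{object?.name ?? "..."}</h2>
      <p>
        {object?.class}, level {object?.level}
      </p>
      <ul>
        {object?.abilities?.map((ability, i) => (
          <li key={i}>{ability}</li>
        ))}
      </ul>
    </div>
  );
}
```

Note that `object` is a deep partial of the schema's type while streaming, which is why every access uses optional chaining.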

Key Benefits of Streaming AI

Instant Perceived Performance

Users see content immediately rather than staring at a loading spinner, dramatically improving perceived responsiveness.

Progressive Rendering

Structured data can be rendered field by field as it arrives, enabling rich, dynamic UIs that update in real time.

Type-Safe Streaming

Zod schema validation ensures streamed objects conform to your expected shape, catching errors at the boundary.

Minimal Boilerplate

The Vercel AI SDK abstracts the complexity of streaming protocols, letting you focus on building features.

Getting Started

Ready to explore streaming AI? Follow these steps to get the tutorial project running on your local machine:

  1. Clone the repository:
     git clone https://github.com/audoir/streaming-ai-tutorial.git
  2. Configure your OpenAI API key:
     cp .env.example .env.local
     # Add your key: OPENAI_API_KEY=sk-...
     Required for Tutorials 3 and 4. Tutorials 1 and 2 work without an API key.
  3. Install dependencies:
     npm install
  4. Start the development server:
     npm run dev
  5. Explore the tutorials:
     http://localhost:3000 — Tutorial index
     http://localhost:3000/tutorials/sse — HTTP Streaming with SSE
     http://localhost:3000/tutorials/partial-json — Streaming Structured JSON
     http://localhost:3000/tutorials/stream-text — Stream Text from GPT
     http://localhost:3000/tutorials/stream-object — Stream Object with Zod Schema

Project Structure

app/
├── page.tsx                        # Tutorial index / home page
├── api/
│   ├── sse/route.ts                # SSE streaming endpoint
│   ├── partial-json/route.ts       # Partial JSON streaming endpoint
│   ├── completion/route.ts         # streamText (GPT) endpoint
│   └── stream-object/
│       ├── route.ts                # streamText + Output.object (GPT + Zod) endpoint
│       └── schema.ts               # Zod schema for hero profile
└── tutorials/
    ├── sse/page.tsx
    ├── partial-json/page.tsx
    ├── stream-text/page.tsx
    └── stream-object/page.tsx

Learning Outcomes

By completing these tutorials, you will have gained hands-on experience with:

  • Building SSE streaming endpoints with the Web Streams API
  • Consuming event streams on the client with the Fetch API
  • Parsing incomplete JSON in real time using the partial-json library
  • Streaming GPT-4o text responses with the Vercel AI SDK's streamText
  • Using the useCompletion React hook for minimal boilerplate streaming
  • Streaming structured JSON objects validated by Zod schemas
  • Progressive UI rendering as structured data arrives token by token
  • TypeScript and type safety throughout the streaming stack

Conclusion

Streaming is no longer a nice-to-have — it is the expected behavior for AI-powered applications. Users have come to expect immediate feedback, and streaming delivers exactly that by showing content as it is generated rather than after a long wait.

This tutorial collection provides a progressive path from the raw fundamentals of SSE to production-ready AI streaming with schema validation. Each technique builds on the last, giving you a deep understanding of how streaming works at every layer of the stack.

About the Author

Wayne Cheng is the founder and AI app developer at Audoir, LLC. Prior to founding Audoir, he worked as a hardware design engineer for Silicon Valley startups and an audio engineer for creative organizations. He holds an MSEE from UC Davis and a Music Technology degree from Foothill College.

Further Exploration

To continue your streaming AI journey, explore the complete tutorial repository and experiment with extending the streaming patterns. Consider adding chat history, tool calls, or multi-modal inputs to deepen your understanding of the Vercel AI SDK's capabilities.

For more AI-powered development tools and tutorials, visit Audoir.