Nuxt4 + AI Streaming SSR: Optimizing LLM App First Paint from 3s to 300ms in 2026
Is your AI chat app taking over 3 seconds to reach first paint? With traditional SSR, users stare at a blank page until the LLM has finished, because the server cannot send any HTML before the complete reply exists. In 2026, Nuxt4's streaming SSR removes that wait: the page shell can be painted in roughly 300ms, and the model's output appears as it is generated.
Background
The Dilemma of Traditional SSR in AI Applications
| Dimension | Traditional SSR | Streaming SSR |
|---|---|---|
| Rendering Mode | Render all at once after all data is ready | Render each part as data becomes available |
| First Paint Time | Wait for complete AI response (3-30s) | 300ms for first paint skeleton |
| User Experience | Long blank screen | Progressive content display |
| TTFB | Extremely high (waiting for AI response) | Extremely low (HTML header sent immediately) |
| Hydration | Full hydration | Island / Progressive hydration |
| Server Resources | Connection held open with nothing sent until the response is complete | Output flushed and released chunk by chunk |
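The difference between the two columns can be sketched without any framework. The following is a minimal illustration, not Nuxt's actual renderer: `streamPage`, `slowAnswer`, and `demo` are hypothetical names, and the page is modelled as a stream of strings so that the shell is flushed before the slow data resolves.

```typescript
// Hypothetical sketch: flush the shell immediately, then append content once it resolves.
function streamPage(shell: string, content: Promise<string>): ReadableStream<string> {
  return new ReadableStream<string>({
    async start(controller) {
      controller.enqueue(shell)         // available to the reader before the slow data is ready
      controller.enqueue(await content) // sent once the slow data resolves
      controller.close()
    },
  })
}

// Simulated slow AI call (a stand-in for a real model request).
function slowAnswer(ms: number): Promise<string> {
  return new Promise(resolve => setTimeout(() => resolve('<p>answer</p>'), ms))
}

// Collects the chunks in arrival order.
async function demo(): Promise<string[]> {
  const reader = streamPage('<div id="skeleton"></div>', slowAnswer(50)).getReader()
  const chunks: string[] = []
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    chunks.push(value)
  }
  return chunks
}
```

In the traditional model the server would await `content` before producing anything, so the time to first byte would include the whole model call.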
Nuxt4 Core New Features
- Streaming Rendering: renderToString supports AsyncIterable, can send while rendering
- Server Components: .server.vue components render on the server, no JS sent to client
- Edge SSR: Native support for Cloudflare Workers / Vercel Edge / Deno Deploy
- Hybrid Rendering: Configure SSR/SSG/SWR strategy per route
Problem Analysis
Root Causes of Slow SSR in AI Applications
- Serial Waiting: SSR must wait for the AI API to return the complete response before generating HTML
- Full Hydration: Client re-executes all component logic, including AI calls
- Blocking Render: One slow component blocks the entire page render
- No Cache Strategy: AI responses are not cacheable, every request re-invokes the API
Step-by-Step Guide
Step 1: Create a Nuxt4 Project
npx nuxi@latest init ai-chat-app --template v4-compat
cd ai-chat-app
npm install
// nuxt.config.ts
export default defineNuxtConfig({
future: {
compatibilityVersion: 4,
},
experimental: {
componentIslands: true,
viewTransition: true,
renderJsonPayloads: true,
},
routeRules: {
'/': { ssr: true },
'/chat/**': { ssr: true },
'/static/**': { ssr: false },
'/api/ai/**': { cors: true },
},
nitro: {
preset: 'cloudflare-pages',
compressPublicAssets: true,
},
})
Step 2: Implement Streaming AI Server Component
<!-- components/ChatStream.server.vue -->
<script lang="ts" setup>
const props = defineProps<{
messageId: string
prompt: string
}>()
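// v-for cannot iterate an async generator, so drain it into an array before rendering.
const stream: string[] = []
for await (const segment of aiStreamResponse(props.prompt)) stream.push(segment)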
async function* aiStreamResponse(prompt: string) {
const response = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${useRuntimeConfig().aiApiKey}`, // private runtimeConfig key; anything under .public is sent to the browser
},
body: JSON.stringify({
model: 'gpt-4o',
messages: [{ role: 'user', content: prompt }],
stream: true,
}),
})
const reader = response.body!.getReader()
const decoder = new TextDecoder()
let buffer = '' // holds a partial SSE line that was split across chunks
while (true) {
const { done, value } = await reader.read()
if (done) break
buffer += decoder.decode(value, { stream: true })
const parts = buffer.split('\n')
buffer = parts.pop() ?? ''
const lines = parts.filter(line => line.startsWith('data: '))
for (const line of lines) {
const data = line.slice(6)
if (data === '[DONE]') return
try {
const parsed = JSON.parse(data)
const content = parsed.choices[0]?.delta?.content
if (content) yield content
} catch {}
}
}
}
</script>
<template>
<div class="chat-stream">
<div class="message-content">
<template v-for="(segment, i) in stream" :key="i">
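<!-- renderMarkdown is not defined in this article; whatever helper fills that role must sanitize its output, because model text passed to v-html is untrusted -->
<span v-html="renderMarkdown(segment)" />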
</template>
</div>
</div>
</template>
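Network chunks do not respect line boundaries, so a single `data:` line from the model API can arrive split across two reads. The sketch below shows one way to buffer incomplete lines before parsing. `createSSEParser` and `extractDelta` are hypothetical helper names, and the payload shape is assumed to match the OpenAI-style deltas used in the component above.

```typescript
// Hypothetical helper: turns raw SSE text chunks into complete `data:` payloads.
function createSSEParser() {
  let buffer = ''
  return {
    // Feed one decoded chunk; returns the payload of every complete `data:` line.
    push(chunk: string): string[] {
      buffer += chunk
      const lines = buffer.split('\n')
      buffer = lines.pop() ?? '' // the last element is an incomplete line (or '')
      return lines
        .map(line => line.trimEnd()) // tolerate \r\n line endings
        .filter(line => line.startsWith('data: '))
        .map(line => line.slice(6))
    },
  }
}

// Extracts the text delta from one payload; returns '' for [DONE] or unparseable input.
function extractDelta(payload: string): string {
  if (payload === '[DONE]') return ''
  try {
    return JSON.parse(payload).choices?.[0]?.delta?.content ?? ''
  } catch {
    return '' // malformed JSON is skipped rather than aborting the stream
  }
}
```

Keeping the parser separate from the fetch loop also means the same logic can be reused on the server and in the client composable.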
Step 3: Chat Page Implementation
<!-- pages/chat/[id].vue -->
<script lang="ts" setup>
const route = useRoute()
const chatId = route.params.id as string
const { data: messages, refresh } = await useFetch(`/api/chat/${chatId}/messages`)
const newMessage = ref('')
const pendingPrompt = ref('') // the prompt currently being answered
const isStreaming = ref(false)
async function sendMessage() {
if (!newMessage.value.trim() || isStreaming.value) return
// Keep the prompt in its own ref: the input is cleared right away,
// but ChatStream still needs the text that was sent.
pendingPrompt.value = newMessage.value
newMessage.value = ''
isStreaming.value = true
try {
await $fetch('/api/chat/send', {
method: 'POST',
body: { chatId, content: pendingPrompt.value },
})
await refresh()
} finally {
// Reset even if the request fails so the UI is not stuck in the streaming state.
isStreaming.value = false
}
}
</script>
<template>
<div class="chat-container">
<div class="messages">
<div v-for="msg in messages" :key="msg.id" :class="['message', msg.role]">
<div class="message-text">{{ msg.content }}</div>
</div>
<LazyChatStream v-if="isStreaming" :message-id="chatId" :prompt="pendingPrompt" />
</div>
<div class="input-area">
<textarea v-model="newMessage" @keydown.enter.exact.prevent="sendMessage" placeholder="Type a message..." />
<button :disabled="isStreaming" @click="sendMessage">Send</button>
</div>
</div>
</template>
Step 4: API Route Streaming Response Implementation
// server/api/chat/stream.get.ts
export default defineEventHandler(async (event) => {
const query = getQuery(event)
const prompt = query.prompt as string
setResponseHeader(event, 'content-type', 'text/event-stream')
setResponseHeader(event, 'cache-control', 'no-cache')
setResponseHeader(event, 'connection', 'keep-alive')
const stream = await callAIStream(prompt)
return sendStream(event, stream)
})
async function callAIStream(prompt: string): Promise<ReadableStream> {
const response = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
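'Authorization': `Bearer ${useRuntimeConfig().aiApiKey}`, // read via runtimeConfig rather than process.env, which Pitfall 3 notes is missing on edge runtimes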
},
body: JSON.stringify({
model: 'gpt-4o',
messages: [{ role: 'user', content: prompt }],
stream: true,
}),
})
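// Fail fast instead of relaying an upstream error body as if it were SSE.
if (!response.ok || !response.body) {
throw createError({ statusCode: 502, statusMessage: 'Upstream AI request failed' })
}
const reader = response.body.getReader()
return new ReadableStream({
async start(controller) {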
while (true) {
const { done, value } = await reader.read()
if (done) {
controller.close()
break
}
controller.enqueue(value)
}
},
})
}
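A loop inside `start()` reads the upstream as fast as it can and keeps going even if the client has disconnected. A pull-based relay is one alternative. This is a sketch under the assumption that the upstream is a standard Web `ReadableStream`, and `relay` is a hypothetical name.

```typescript
// Hypothetical sketch: relay an upstream byte stream with backpressure and cancellation.
function relay(upstream: ReadableStream<Uint8Array>): ReadableStream<Uint8Array> {
  const reader = upstream.getReader()
  return new ReadableStream<Uint8Array>({
    // pull() runs only when the consumer wants more data, so chunks do not pile up in memory.
    async pull(controller) {
      const { done, value } = await reader.read()
      if (done) controller.close()
      else controller.enqueue(value)
    },
    // cancel() runs when the consumer goes away; forwarding it stops the upstream request.
    cancel(reason) {
      return reader.cancel(reason)
    },
  })
}
```

When no transformation is needed, returning the upstream `response.body` directly is usually enough, since it already carries both behaviours.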
Step 5: Edge SSR Deployment Configuration
// nuxt.config.ts - Edge Configuration
export default defineNuxtConfig({
nitro: {
preset: 'cloudflare-pages',
cloudflarePages: {
routes: {
exclude: ['/assets/*', '/_nuxt/*'],
},
},
},
experimental: {
asyncContext: true,
},
})
# Build and deploy to Cloudflare Pages
npm run build
npx wrangler pages deploy dist
Complete Code: Production-Grade AI Chat Application
// composables/useAIChat.ts
export function useAIChat(chatId: string) {
const config = useRuntimeConfig()
const messages = ref<ChatMessage[]>([])
const isStreaming = ref(false)
const currentStreamContent = ref('')
async function loadMessages() {
// $fetch rather than useFetch: this function may run after setup, where composables are unavailable.
messages.value = await $fetch<ChatMessage[]>(`/api/chat/${chatId}/messages`)
}
async function sendMessage(content: string) {
if (isStreaming.value) return
messages.value.push({
id: crypto.randomUUID(),
role: 'user',
content,
createdAt: new Date().toISOString(),
})
isStreaming.value = true
currentStreamContent.value = ''
try {
const response = await fetch(`/api/chat/stream?prompt=${encodeURIComponent(content)}`, {
headers: { 'Accept': 'text/event-stream' },
})
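if (!response.ok || !response.body) throw new Error(`Stream request failed: ${response.status}`)
const reader = response.body.getReader()
const decoder = new TextDecoder()
let buffer = '' // holds a partial SSE line that was split across chunks
while (true) {
const { done, value } = await reader.read()
if (done) break
buffer += decoder.decode(value, { stream: true })
const parts = buffer.split('\n')
buffer = parts.pop() ?? ''
const lines = parts.filter(l => l.startsWith('data: '))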
for (const line of lines) {
const data = line.slice(6)
if (data === '[DONE]') continue
try {
const parsed = JSON.parse(data)
const delta = parsed.choices[0]?.delta?.content || ''
currentStreamContent.value += delta
} catch {}
}
}
messages.value.push({
id: crypto.randomUUID(),
role: 'assistant',
content: currentStreamContent.value,
createdAt: new Date().toISOString(),
})
} catch (error) {
console.error('Stream error:', error)
} finally {
isStreaming.value = false
currentStreamContent.value = ''
}
}
return { messages, isStreaming, currentStreamContent, loadMessages, sendMessage }
}
interface ChatMessage {
id: string
role: 'user' | 'assistant' | 'system'
content: string
createdAt: string
}
// server/api/chat/[id]/messages.get.ts
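// kv and db are assumed project helpers (a key-value client and a database client); neither is defined in this article.
import { kv } from '~/server/utils/kv'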
export default defineEventHandler(async (event) => {
const chatId = getRouterParam(event, 'id')!
const cached = await kv.get(`chat:${chatId}:messages`)
if (cached) return cached
const messages = await db.chatMessage.findMany({
where: { chatId },
orderBy: { createdAt: 'asc' },
})
await kv.set(`chat:${chatId}:messages`, messages, { ttl: 60 })
return messages
})
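// server/middleware/cache.ts
import type { H3Event } from 'h3'
export default defineEventHandler(async (event) => {
// Only serve cached data for idempotent reads, and never for the streaming endpoint.
if (event.method !== 'GET' || event.path.startsWith('/api/chat/stream')) return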
const cached = await getResponseCache(event)
if (cached) {
return cached
}
})
async function getResponseCache(event: H3Event) {
const key = `cache:${event.path}`
return await kv.get(key)
}
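The handlers above rely on a `kv` helper with `get` and `set` plus a TTL option. For local development or tests, an in-memory stand-in can be sketched as follows. `createTtlCache` is a hypothetical name, the TTL is assumed to be in seconds, and the `invalidate` method is an addition that the original `kv` may not offer.

```typescript
// Hypothetical in-memory stand-in for the `kv` helper used above.
interface Entry<T> { value: T; expiresAt: number }

// `now` is injectable so expiry can be tested without waiting.
function createTtlCache<T>(now: () => number = Date.now) {
  const store = new Map<string, Entry<T>>()
  return {
    get(key: string): T | undefined {
      const entry = store.get(key)
      if (!entry) return undefined
      if (entry.expiresAt <= now()) {
        store.delete(key) // expired: evict and report a miss
        return undefined
      }
      return entry.value
    },
    // ttl is assumed to be in seconds, mirroring `{ ttl: 60 }` in the handler above.
    set(key: string, value: T, ttl: number): void {
      store.set(key, { value, expiresAt: now() + ttl * 1000 })
    },
    // Call after a write (for example a new chat message) so readers do not see stale data.
    invalidate(key: string): void {
      store.delete(key)
    },
  }
}
```

Without some form of invalidation, a message list cached for 60 seconds would hide a reply that was stored a moment ago.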
Common Pitfalls
Pitfall 1: Using Client APIs in Server Components
Symptom: Using click handlers, reactive state updates, or other client-side APIs in .server.vue components causes build errors.
Solution: Server Components run only on the server and ship no JavaScript, so they cannot respond to user interaction. Extract interactive parts into separate client components, wrapped with <ClientOnly> or island components.
Pitfall 2: Hydration Mismatch During Streaming Render
Symptom: Console shows Hydration mismatch warnings, streaming content is inconsistent with server-rendered content.
Solution: Wrap streaming content with <ClientOnly>, or use useAsyncData's lazy: true option to avoid blocking hydration. Ensure client and server use the same data source.
Pitfall 3: Edge Runtime Does Not Support Node.js API
Symptom: After deploying to Cloudflare Workers, errors like process is not defined or Buffer is not defined appear.
Solution: Edge Runtime does not include Node.js APIs. Use import { H3Event } from 'h3' instead of Node.js's IncomingMessage, use native fetch for AI API calls, and avoid Node.js dependencies like axios.
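As one concrete case, code that reaches for `Buffer` to base64-encode a token can use Web standard APIs instead. The sketch below assumes a runtime that provides `TextEncoder`, `TextDecoder`, `btoa`, and `atob`, which edge runtimes and recent Node.js versions do. The function names are hypothetical.

```typescript
// Encode a UTF-8 string as base64 using only Web standard APIs (no Buffer).
function toBase64(text: string): string {
  const bytes = new TextEncoder().encode(text)
  let binary = ''
  bytes.forEach(byte => { binary += String.fromCharCode(byte) })
  return btoa(binary)
}

// Decode base64 back into a UTF-8 string.
function fromBase64(encoded: string): string {
  const binary = atob(encoded)
  const bytes = Uint8Array.from(binary, ch => ch.charCodeAt(0))
  return new TextDecoder().decode(bytes)
}

console.log(toBase64('hello')) // prints "aGVsbG8="
```

Going through `TextEncoder` matters because `btoa` alone throws on characters outside the Latin-1 range.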
Pitfall 4: SSE Connections Buffered by CDN
Symptom: Streaming output appears all at once on the client side, not character by character.
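Solution: Send the X-Accel-Buffering: no response header so that Nginx-style reverse proxies do not buffer the stream, and make sure no compression or proxy layer in front of the app buffers text/event-stream responses. Cloudflare supports SSE passthrough by default. On Vercel, confirm that the chosen runtime supports streaming responses; the experimental: { streaming: true } setting in next.config.js applies to Next.js, not Nuxt.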
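The headers and the wire format can be written down in a few lines. This is a sketch rather than a complete server: `sseHeaders` and `toSSE` are hypothetical names, and whether a given proxy honours `X-Accel-Buffering` depends on that proxy.

```typescript
// Headers commonly set on SSE responses. `X-Accel-Buffering: no` is read by Nginx;
// other intermediaries may need their own configuration.
const sseHeaders: Record<string, string> = {
  'Content-Type': 'text/event-stream',
  'Cache-Control': 'no-cache, no-transform',
  'X-Accel-Buffering': 'no',
}

// Format one SSE event. Every line of a multi-line payload needs its own `data:` prefix,
// and a blank line terminates the event.
function toSSE(data: string): string {
  return data.split('\n').map(line => `data: ${line}`).join('\n') + '\n\n'
}
```

If output still arrives in one burst with these headers in place, the buffering is likely happening in a layer between the app and the browser rather than in the app itself.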
Pitfall 5: High Concurrent AI Requests Causing Server OOM
Symptom: Under high concurrency, Node.js process memory keeps growing until OOM.
Solution: Implement request queues and concurrency limits, use AbortController for timeouts, and promptly clean up completed streaming connections.
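A concurrency limit and a per-request timeout can be combined in one small helper. The following is a minimal sketch, assuming each AI call is wrapped in a function that accepts an `AbortSignal`. `createLimiter` and its parameters are hypothetical names.

```typescript
// Hypothetical limiter: at most `max` tasks run at once; the rest wait in a FIFO queue.
function createLimiter(max: number) {
  let active = 0
  const waiting: Array<() => void> = []

  async function run<T>(task: (signal: AbortSignal) => Promise<T>, timeoutMs: number): Promise<T> {
    if (active >= max) {
      // Wait until a finishing task hands its slot over directly.
      await new Promise<void>(resolve => waiting.push(resolve))
    } else {
      active++
    }
    const controller = new AbortController()
    const timer = setTimeout(() => controller.abort(), timeoutMs) // abort slow requests
    try {
      return await task(controller.signal)
    } finally {
      clearTimeout(timer)
      const next = waiting.shift()
      if (next) next() // pass the slot on without freeing it, so `max` is never exceeded
      else active--
    }
  }

  return { run }
}
```

A call might look like `limiter.run(signal => fetch(url, { signal }), 30000)`, where `url` stands for the AI endpoint. The fetch is then cancelled on timeout and its slot is released.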
Error Troubleshooting
| # | Error Message | Cause | Solution |
|---|---|---|---|
| 1 | Hydration mismatch | Server and client rendered content is inconsistent | Wrap dynamic content with ClientOnly, check data consistency |
| 2 | process is not defined | Edge Runtime does not support Node.js API | Use Web standard APIs instead, add nitro polyfill |
| 3 | Server Component cannot use client APIs | Using ref/onClick in .server.vue | Extract interactive parts as client components |
| 4 | fetch is not a function | Server fetch not configured | Ensure Node.js 18+ or configure nitro.nodeCompat |
| 5 | ReadableStream is not supported | Runtime does not support streams | Upgrade to Node.js 18+ or use polyfill |
| 6 | CORS error on SSE | Cross-origin SSE request blocked | Configure routeRules cors option |
| 7 | KV storage not available | Edge environment has no persistent storage | Use Cloudflare KV or Vercel KV |
| 8 | Maximum call stack exceeded | Recursive component render overflow | Check component nesting, limit recursion depth |
| 9 | 429 Too Many Requests | AI API rate limiting | Implement request queue and backoff retry |
| 10 | Worker exceeded CPU time limit | Edge function CPU time exceeded | Reduce server computation, stream AI responses |
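The backoff retry suggested for row 9 can be sketched as a small wrapper. `withBackoff` is a hypothetical helper, the delay values are illustrative defaults, and the `sleep` function is injectable so the logic can be exercised without real waiting.

```typescript
// Hypothetical retry helper with exponential backoff.
async function withBackoff<T>(
  attempt: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean,
  {
    retries = 3,
    baseMs = 500,
    sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms)),
  } = {},
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await attempt()
    } catch (error) {
      // Give up once the retry budget is spent or the error is not retryable.
      if (i >= retries || !shouldRetry(error)) throw error
      await sleep(baseMs * 2 ** i) // 500ms, 1s, 2s, ... with the default baseMs
    }
  }
}
```

In production, adding random jitter to each delay is a common refinement so that many clients do not retry at the same instant.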
Advanced Optimization
1. Island Architecture to Reduce JS Bundle Size
<!-- components/AIChat.island.vue -->
<script lang="ts" setup>
defineOptions({
island: true,
})
</script>
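Island components render only on the server, so the client does not download their JavaScript, which significantly reduces hydration overhead. In Nuxt 3 the same effect comes from placing the component in components/islands/ or giving it the .server.vue suffix.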
2. SWR Cache for AI Responses
const { data } = await useFetch(`/api/chat/${chatId}/messages`, {
key: `chat-${chatId}`,
// Stamp the payload so getCachedData can tell how old it is.
transform(messages) {
return { messages, fetchedAt: Date.now() }
},
getCachedData(key, nuxtApp) {
const cached = nuxtApp.payload.data[key]
if (!cached) return undefined
// Treat entries older than 5 minutes as stale and refetch.
const maxAgeMs = 5 * 60 * 1000
if (Date.now() - cached.fetchedAt > maxAgeMs) return undefined
return cached
},
})
3. Pre-render Skeleton Screen
<!-- components/ChatSkeleton.vue -->
<template>
<div class="chat-skeleton animate-pulse">
<div class="h-4 bg-gray-200 rounded w-3/4 mb-2" />
<div class="h-4 bg-gray-200 rounded w-1/2 mb-2" />
<div class="h-4 bg-gray-200 rounded w-5/6" />
</div>
</template>
4. Progressive Hydration Strategy
// nuxt.config.ts
export default defineNuxtConfig({
experimental: {
componentIslands: {
selectiveClient: true, // lets components marked with nuxt-client hydrate inside server components
},
},
})
Comparison Analysis
| Dimension | Nuxt3 SSR | Nuxt4 Streaming SSR | Next.js App Router | Remix |
|---|---|---|---|---|
| Streaming Render | Manual implementation | Native support | Native support | defer support |
| Server Components | None | Native support | RSC | None |
| Edge SSR | Experimental | Native support | Native support | Needs adaptation |
| Island Architecture | Experimental | Stable | None | None |
| AI Streaming Integration | Manual | Built-in composable | Vercel AI SDK | Manual |
| Hydration Strategy | Full | Progressive / Island | Selective | Full |
| Learning Curve | Medium | Medium | High | Medium |
| Ecosystem Maturity | Mature | Mature in 2026 | Mature | Medium |
Summary & Outlook
Summary: Nuxt4's streaming SSR moves AI applications from 3-second blank screens to a first paint at around 300ms. The core optimization strategies are Server Components to reduce JS bundle size, streaming rendering to output AI responses in real time, Edge SSR to reduce latency, and island architecture for on-demand hydration. For new AI applications, use Nuxt4 directly; for existing Nuxt3 projects, upgrade incrementally, starting with converting the AI interaction pages to streaming rendering.