Vue3 AI Integration: 5 LLM Interaction Patterns and Streaming Response Solutions in 2026

Still using fetch to wait for a complete LLM response before rendering anything, while users stare at a blank screen for 10 seconds? In 2026, the UX bar for AI applications has moved: streaming output, real-time feedback, and intelligent interaction are now the baseline. This article walks through 5 Vue3 LLM interaction patterns, each with code you can adapt to your project.


Background: Evolution of Frontend LLM Interaction

| Stage | Interaction Mode | User Experience | Technical Implementation |
| --- | --- | --- | --- |
| 1.0 | Request-Wait-Response | Long blank wait | HTTP fetch |
| 2.0 | Streaming Response | Character-by-character output, like typing | SSE / WebSocket |
| 3.0 | Function Calling | AI proactively invokes tools | Structured output + frontend routing |
| 4.0 | Multi-Model Routing | Select optimal model per scenario | Intelligent routing layer |
| 5.0 | Agent Autonomous Interaction | AI plans and executes autonomously | Multi-turn dialogue + tool chain |

Problem Analysis: Why Traditional Approaches Fall Short

Three major pain points with traditional frontend LLM integration:

  1. Waiting Anxiety: No feedback before the complete response arrives
  2. Timeout Crashes: LLM response times are unpredictable; long text generation may exceed 30 seconds
  3. Feature Disconnect: AI capabilities are decoupled from UI interaction, preventing intelligent user experiences

Pattern 1: SSE Streaming Response

Composable Implementation

// composables/useSSEChat.ts
import { ref, onUnmounted } from 'vue'

interface ChatMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
  timestamp: number
}

interface UseSSEChatOptions {
  apiUrl: string
  model?: string
  onToken?: (token: string) => void
  onError?: (error: Error) => void
  onComplete?: (fullText: string) => void
}

export function useSSEChat(options: UseSSEChatOptions) {
  const messages = ref<ChatMessage[]>([])
  const currentResponse = ref('')
  const isLoading = ref(false)
  const error = ref<string | null>(null)
  let abortController: AbortController | null = null

  async function sendMessage(content: string) {
    isLoading.value = true
    error.value = null
    currentResponse.value = ''
    abortController = new AbortController()

    messages.value.push({
      role: 'user',
      content,
      timestamp: Date.now(),
    })

    try {
      const response = await fetch(options.apiUrl, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
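          // NOTE: VITE_-prefixed variables are bundled into client code and visible to users.
          // In production, keep the key on a backend proxy and drop this header.
          Authorization: `Bearer ${import.meta.env.VITE_AI_API_KEY}`,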
        },
        body: JSON.stringify({
          model: options.model || 'gpt-4',
          messages: messages.value.map(({ role, content: c }) => ({ role, content: c })),
          stream: true,
        }),
        signal: abortController.signal,
      })

      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${response.statusText}`)
      }

      const reader = response.body?.getReader()
      if (!reader) throw new Error('ReadableStream not available')

      const decoder = new TextDecoder()
      let buffer = ''

      while (true) {
        const { done, value } = await reader.read()
        if (done) break

        buffer += decoder.decode(value, { stream: true })
        const lines = buffer.split('\n')
        buffer = lines.pop() || ''

        for (const line of lines) {
          const trimmed = line.trim()
          if (!trimmed || trimmed === 'data: [DONE]') continue
          if (!trimmed.startsWith('data: ')) continue

          try {
            const json = JSON.parse(trimmed.slice(6))
            const token = json.choices?.[0]?.delta?.content || ''
            if (token) {
              currentResponse.value += token
              options.onToken?.(token)
            }
          } catch {
            // Ignore parse errors
          }
        }
      }

      messages.value.push({
        role: 'assistant',
        content: currentResponse.value,
        timestamp: Date.now(),
      })

      options.onComplete?.(currentResponse.value)
    } catch (e: any) {
      if (e.name !== 'AbortError') {
        error.value = e.message
        options.onError?.(e)
      }
    } finally {
      isLoading.value = false
      abortController = null
    }
  }

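  function stopGeneration() {
    // No request in flight: skip, so a completed response is not pushed a second time
    if (!abortController) return
    abortController.abort()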
    if (currentResponse.value) {
      messages.value.push({
        role: 'assistant',
        content: currentResponse.value,
        timestamp: Date.now(),
      })
    }
    isLoading.value = false
  }

  function clearMessages() {
    messages.value = []
    currentResponse.value = ''
    error.value = null
  }

  onUnmounted(() => {
    abortController?.abort()
  })

  return {
    messages,
    currentResponse,
    isLoading,
    error,
    sendMessage,
    stopGeneration,
    clearMessages,
  }
}
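
The buffer handling inside the read loop is the part that is easiest to get wrong, so it can help to isolate it and test it on its own. The following is a minimal sketch, assuming the same OpenAI-style `data: {...}` line format the composable above expects; `parseSSEChunk` and `SSEParseResult` are hypothetical names introduced here, not part of any library.

```typescript
// Hypothetical helper: a pure version of the line-buffering logic in useSSEChat
interface SSEParseResult {
  tokens: string[] // content tokens found in complete lines
  rest: string     // trailing partial line to carry into the next call
  done: boolean    // true once a `data: [DONE]` line has been seen
}

function parseSSEChunk(buffer: string, chunk: string): SSEParseResult {
  const lines = (buffer + chunk).split('\n')
  // The last element is either '' or an incomplete line; keep it for the next read
  const rest = lines.pop() ?? ''
  const tokens: string[] = []
  let done = false

  for (const line of lines) {
    const trimmed = line.trim()
    if (!trimmed.startsWith('data: ')) continue
    const payload = trimmed.slice(6)
    if (payload === '[DONE]') {
      done = true
      continue
    }
    try {
      const json = JSON.parse(payload)
      const token = json.choices?.[0]?.delta?.content
      if (typeof token === 'string' && token) tokens.push(token)
    } catch {
      // Malformed line: skip it rather than abort the whole stream
    }
  }
  return { tokens, rest, done }
}
```

Each call takes the `rest` returned by the previous call as its `buffer`, so a JSON line that is split across two network reads is parsed once the second half arrives.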

Component Usage

<!-- components/AIChat.vue -->
<template>
  <div class="ai-chat">
    <div class="messages">
      <div v-for="(msg, index) in messages" :key="`${msg.timestamp}-${index}`" :class="['message', msg.role]">
        <div class="content">{{ msg.content }}</div>
      </div>
      <div v-if="isLoading" class="message assistant streaming">
        <div class="content">{{ currentResponse }}<span class="cursor">▊</span></div>
      </div>
    </div>
    <div class="input-area">
      <input
        v-model="inputText"
        placeholder="Type a message..."
        @keydown.enter="handleSend"
        :disabled="isLoading"
      />
      <button v-if="!isLoading" @click="handleSend" :disabled="!inputText.trim()">Send</button>
      <button v-else @click="stopGeneration" class="stop">Stop</button>
    </div>
  </div>
</template>

<script setup lang="ts">
import { ref } from 'vue'
import { useSSEChat } from '../composables/useSSEChat'

const inputText = ref('')
const { messages, currentResponse, isLoading, sendMessage, stopGeneration } = useSSEChat({
  apiUrl: '/api/v1/chat/completions',
  model: 'gpt-4',
})

function handleSend() {
  const text = inputText.value.trim()
  if (!text) return
  inputText.value = ''
  sendMessage(text)
}
</script>

Pattern 2: Function Calling Frontend Integration

// composables/useFunctionCalling.ts
import { ref } from 'vue'

interface FunctionDefinition {
  name: string
  description: string
  parameters: Record<string, any>
  execute: (args: any) => Promise<any>
}

export function useFunctionCalling() {
  const functions = ref<Map<string, FunctionDefinition>>(new Map())
  const executionLog = ref<Array<{ name: string; args: any; result: any; timestamp: number }>>([])

  function registerFunction(fn: FunctionDefinition) {
    functions.value.set(fn.name, fn)
  }

  function getToolsSchema() {
    return Array.from(functions.value.values()).map(fn => ({
      type: 'function',
      function: {
        name: fn.name,
        description: fn.description,
        parameters: fn.parameters,
      },
    }))
  }

  async function handleToolCalls(toolCalls: any[]) {
    const results = []
    for (const call of toolCalls) {
      const fn = functions.value.get(call.function.name)
      if (!fn) {
        results.push({
          tool_call_id: call.id,
          role: 'tool',
          content: JSON.stringify({ error: `Unknown function: ${call.function.name}` }),
        })
        continue
      }

      try {
        const args = JSON.parse(call.function.arguments)
        const result = await fn.execute(args)
        executionLog.value.push({ name: call.function.name, args, result, timestamp: Date.now() })
        results.push({
          tool_call_id: call.id,
          role: 'tool',
          content: JSON.stringify(result),
        })
      } catch (e: any) {
        results.push({
          tool_call_id: call.id,
          role: 'tool',
          content: JSON.stringify({ error: e.message }),
        })
      }
    }
    return results
  }

  return { functions, executionLog, registerFunction, getToolsSchema, handleToolCalls }
}
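
Model-generated arguments arrive as a JSON string and are not guaranteed to be valid or complete, so it is worth checking them before calling `execute`. Below is a minimal sketch of such a check, assuming the JSON-Schema-like `parameters` shape used in this article (an object with `properties` and an optional `required` array). `parseToolArguments`, `ParameterSchema`, and `ParsedArgs` are hypothetical names, and the check covers only JSON validity and required keys, not value types.

```typescript
interface ParameterSchema {
  type: 'object'
  properties: Record<string, { type: string; description?: string }>
  required?: string[]
}

type ParsedArgs =
  | { ok: true; args: Record<string, unknown> }
  | { ok: false; error: string }

function parseToolArguments(raw: string, schema: ParameterSchema): ParsedArgs {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return { ok: false, error: 'Arguments are not valid JSON' }
  }
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return { ok: false, error: 'Arguments must be a JSON object' }
  }
  const args = parsed as Record<string, unknown>
  // Report every missing required key so the model can correct them in one retry
  const missing = (schema.required ?? []).filter(key => !(key in args))
  if (missing.length > 0) {
    return { ok: false, error: `Missing required argument(s): ${missing.join(', ')}` }
  }
  return { ok: true, args }
}
```

Returning the error text as the tool result, rather than throwing, gives the model a chance to retry with corrected arguments.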

Pattern 3: Multi-Model Intelligent Routing

// composables/useModelRouter.ts
import { ref } from 'vue'

interface ModelConfig {
  id: string
  name: string
  maxTokens: number
  costPer1k: number
  latencyMs: number
  capabilities: string[]
}

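// Illustrative figures only; verify context limits, pricing, and latency against each provider's current documentation
const MODEL_REGISTRY: ModelConfig[] = [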
  { id: 'gpt-4', name: 'GPT-4', maxTokens: 128000, costPer1k: 0.03, latencyMs: 2000, capabilities: ['reasoning', 'code', 'writing'] },
  { id: 'gpt-4o-mini', name: 'GPT-4o Mini', maxTokens: 128000, costPer1k: 0.00015, latencyMs: 500, capabilities: ['chat', 'summary'] },
  { id: 'claude-3.5-sonnet', name: 'Claude 3.5 Sonnet', maxTokens: 200000, costPer1k: 0.003, latencyMs: 1500, capabilities: ['reasoning', 'code', 'analysis'] },
  { id: 'deepseek-v3', name: 'DeepSeek V3', maxTokens: 128000, costPer1k: 0.00027, latencyMs: 800, capabilities: ['code', 'math', 'reasoning'] },
]

export function useModelRouter() {
  const currentModel = ref<ModelConfig>(MODEL_REGISTRY[1])
  const routingLog = ref<Array<{ input: string; model: string; reason: string }>>([])

  function selectModel(input: string, options?: { preferSpeed?: boolean; preferQuality?: boolean }) {
    const lower = input.toLowerCase()

    if (options?.preferSpeed || lower.length < 50) {
      currentModel.value = MODEL_REGISTRY[1]
      routingLog.value.push({ input: input.slice(0, 50), model: currentModel.value.id, reason: 'Speed priority / short input' })
      return currentModel.value
    }

    if (lower.includes('code') || lower.includes('debug')) {
      currentModel.value = MODEL_REGISTRY[3]
      routingLog.value.push({ input: input.slice(0, 50), model: currentModel.value.id, reason: 'Code task' })
      return currentModel.value
    }

    if (options?.preferQuality || lower.length > 500) {
      currentModel.value = MODEL_REGISTRY[0]
      routingLog.value.push({ input: input.slice(0, 50), model: currentModel.value.id, reason: 'Quality priority / long input' })
      return currentModel.value
    }

    currentModel.value = MODEL_REGISTRY[2]
    routingLog.value.push({ input: input.slice(0, 50), model: currentModel.value.id, reason: 'Default reasoning' })
    return currentModel.value
  }

  return { currentModel, routingLog, selectModel, MODEL_REGISTRY }
}
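
The `capabilities` field in the registry is declared but not consulted by `selectModel`, which routes on input length and keywords instead. One way to put it to use is sketched below, under the assumption that the cheapest model advertising a capability is the desired choice. `pickCheapestCapable` and `ModelInfo` are hypothetical, and the entries are copied from the registry above.

```typescript
interface ModelInfo {
  id: string
  costPer1k: number
  capabilities: string[]
}

// Subset of the registry fields needed for capability routing
const MODELS: ModelInfo[] = [
  { id: 'gpt-4', costPer1k: 0.03, capabilities: ['reasoning', 'code', 'writing'] },
  { id: 'gpt-4o-mini', costPer1k: 0.00015, capabilities: ['chat', 'summary'] },
  { id: 'claude-3.5-sonnet', costPer1k: 0.003, capabilities: ['reasoning', 'code', 'analysis'] },
  { id: 'deepseek-v3', costPer1k: 0.00027, capabilities: ['code', 'math', 'reasoning'] },
]

function pickCheapestCapable(models: ModelInfo[], capability: string): ModelInfo | null {
  const capable = models.filter(m => m.capabilities.includes(capability))
  if (capable.length === 0) return null
  // Keep whichever candidate has the lower per-1k cost
  return capable.reduce((best, m) => (m.costPer1k < best.costPer1k ? m : best))
}
```

A `null` result signals that no registered model claims the capability, which the caller can treat as a cue to fall back to the default model.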

Pattern 4: AI Chat Component with Markdown Rendering

<!-- components/AIMarkdownRenderer.vue -->
<template>
  <div class="ai-markdown" v-html="renderedContent"></div>
</template>

<script setup lang="ts">
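import { computed } from 'vue'
import { marked } from 'marked'
import DOMPurify from 'dompurify'

const props = defineProps<{ content: string }>()

const renderedContent = computed(() => {
  if (!props.content) return ''
  // marked does not sanitize its output, so clean the HTML before it reaches v-html
  const html = marked.parse(props.content, { async: false }) as string
  return DOMPurify.sanitize(html)
})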
</script>

Pattern 5: Agent Autonomous Interaction

// composables/useAIAgent.ts
import { ref } from 'vue'
import { useSSEChat } from './useSSEChat'
import { useFunctionCalling } from './useFunctionCalling'

interface AgentStep {
  type: 'thinking' | 'tool_call' | 'tool_result' | 'response'
  content: string
  timestamp: number
}

export function useAIAgent(apiUrl: string) {
  const steps = ref<AgentStep[]>([])
  const isRunning = ref(false)
  const maxIterations = 5

  const chat = useSSEChat({ apiUrl })
  const fc = useFunctionCalling()

  fc.registerFunction({
    name: 'search_web',
    description: 'Search the internet for latest information',
    parameters: {
      type: 'object',
      properties: { query: { type: 'string', description: 'Search keywords' } },
      required: ['query'],
    },
    execute: async (args) => {
      const resp = await fetch(`/api/search?q=${encodeURIComponent(args.query)}`)
      return resp.json()
    },
  })

  fc.registerFunction({
    name: 'run_code',
    description: 'Execute code and return results',
    parameters: {
      type: 'object',
      properties: { code: { type: 'string', description: 'Code to execute' }, language: { type: 'string', description: 'Programming language' } },
      required: ['code'],
    },
    execute: async (args) => {
      const resp = await fetch('/api/execute', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(args),
      })
      return resp.json()
    },
  })

  async function run(task: string) {
    isRunning.value = true
    steps.value = []
    let iteration = 0
    let currentInput = task

    try {
      while (iteration < maxIterations) {
        iteration++
        steps.value.push({ type: 'thinking', content: `Iteration ${iteration}...`, timestamp: Date.now() })
        await chat.sendMessage(currentInput)

        const lastAssistantMsg = chat.messages.value[chat.messages.value.length - 1]
        if (!lastAssistantMsg || lastAssistantMsg.role !== 'assistant') break

        steps.value.push({ type: 'response', content: lastAssistantMsg.content, timestamp: Date.now() })

        // Heuristic only: useSSEChat exposes plain text, not structured tool_calls,
        // so this looks for a tool-call marker in the response text
        const hasToolCall = lastAssistantMsg.content.includes('function_call') || lastAssistantMsg.content.includes('tool_call')
        if (!hasToolCall) break

        // Placeholder: a complete agent would parse the call, run fc.handleToolCalls(),
        // and feed the result back as the next input. This version records the step and stops.
        steps.value.push({ type: 'tool_call', content: 'Executing tool call...', timestamp: Date.now() })
        break
      }
    } finally {
      // Reset the flag even if an iteration throws
      isRunning.value = false
    }
  }

  return { steps, isRunning, run, fc }
}
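
The loop above stops at the first tool call because `useSSEChat` only surfaces text. A fuller loop needs structured tool calls from the model and a way to feed results back into the conversation. The following sketch shows that control flow in isolation, assuming an OpenAI-style message shape. `runAgentLoop`, `ModelTurn`, and the injected `callModel` and `executeTool` functions are hypothetical stand-ins for the chat request and for `handleToolCalls`.

```typescript
interface ToolCall {
  id: string
  function: { name: string; arguments: string }
}

interface AgentMessage {
  role: 'user' | 'assistant' | 'tool'
  content: string
  tool_call_id?: string
  tool_calls?: ToolCall[]
}

// One model turn: either final text, or a request to run tools
interface ModelTurn {
  content: string
  toolCalls: ToolCall[]
}

async function runAgentLoop(
  task: string,
  callModel: (history: AgentMessage[]) => Promise<ModelTurn>,
  executeTool: (call: ToolCall) => Promise<string>,
  maxIterations = 5,
): Promise<{ answer: string | null; history: AgentMessage[] }> {
  const history: AgentMessage[] = [{ role: 'user', content: task }]

  for (let i = 0; i < maxIterations; i++) {
    const turn = await callModel(history)
    history.push({ role: 'assistant', content: turn.content, tool_calls: turn.toolCalls })

    // No tool requested: the model has produced its final answer
    if (turn.toolCalls.length === 0) {
      return { answer: turn.content, history }
    }

    // Run each requested tool and append its result so the next turn can see it
    for (const call of turn.toolCalls) {
      const result = await executeTool(call)
      history.push({ role: 'tool', content: result, tool_call_id: call.id })
    }
  }

  // Iteration budget exhausted without a final answer
  return { answer: null, history }
}
```

Injecting `callModel` and `executeTool` keeps the loop independent of the transport, so it can be exercised with stubs before it is wired to a real endpoint.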

Pitfall Guide

| # | Pitfall | Symptom | Solution |
| --- | --- | --- | --- |
| 1 | SSE connection not properly closed | Still receiving data after component unmount, memory leak | Call abortController.abort() in onUnmounted |
| 2 | Incomplete streaming parse buffer | Last line of data lost | Keep buffer remainder, concatenate on next read |
| 3 | Function Calling argument parse failure | JSON.parse(arguments) throws | Wrap in try-catch, fall back to a plain text response |
| 4 | Multi-model routing infinite loop | Model A recommends B, B recommends A | Set maxIterations limit, force default model when exceeded |
| 5 | Markdown XSS injection | v-html renders malicious scripts | Sanitize the HTML with DOMPurify before binding it to v-html; marked does not sanitize its output |

Error Troubleshooting

| Error Message | Cause | Solution |
| --- | --- | --- |
| ReadableStream is not supported | Browser doesn't support streaming API | Add polyfill web-streams-polyfill, or fall back to polling |
| net::ERR_INCOMPLETE_CHUNKED_ENCODING | Stream ended before the response was complete (server error, proxy buffering, or timeout) | Confirm Content-Type: text/event-stream and that each message ends with \n\n; check proxy buffering and timeout settings |
| AbortError: The user aborted a request | User manually cancelled | Normal behavior, check e.name === 'AbortError' |
| 429 Too Many Requests | API rate limit exceeded | Implement exponential backoff retry, add request queue |
| JSON.parse: unexpected character | SSE data line format error | Check data: prefix, filter empty lines and non-data lines |
| CORS policy: No 'Access-Control-Allow-Origin' | Cross-origin request rejected | Add CORS headers on server, or use Vite proxy |
| Cannot read property 'delta' of undefined | SSE response structure changed | Add optional chaining json.choices?.[0]?.delta?.content |
| Maximum call stack size exceeded | Agent recursive calls too deep | Limit maxIterations, add recursion depth check |
| TypeError: Failed to fetch | Network disconnected or request blocked | Add network status detection, implement offline prompt and auto-reconnect |
| TypeError: response.body is null | Response body is null | Check if API endpoint supports streaming, add response.body null check |
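
For the 429 case, exponential backoff means doubling the wait after each failed attempt, up to a cap. Below is a minimal sketch, assuming a 500 ms base delay and an 8 s cap (both arbitrary choices). `backoffDelay` and `retryWithBackoff` are hypothetical helpers, and random jitter is left out to keep the arithmetic visible.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs
function backoffDelay(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean,
  maxAttempts = 4,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (error) {
      const isLast = attempt >= maxAttempts - 1
      // Give up on the final attempt or on errors that are not worth retrying
      if (isLast || !shouldRetry(error)) throw error
      await new Promise(resolve => setTimeout(resolve, backoffDelay(attempt)))
    }
  }
}
```

With the defaults, the waits between the four attempts are 500 ms, 1 s, and 2 s. A `shouldRetry` that returns true only for rate-limit errors keeps 4xx validation failures from being retried pointlessly.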

Advanced Optimization

1. Request Queue and Rate Limiting

// utils/rateLimiter.ts
export class RequestQueue {
  private queue: Array<() => Promise<any>> = []
  private running = 0
  private maxConcurrent: number
  private minInterval: number
  private lastRun = 0

  constructor(maxConcurrent = 3, minInterval = 1000) {
    this.maxConcurrent = maxConcurrent
    this.minInterval = minInterval
  }

  async add<T>(fn: () => Promise<T>): Promise<T> {
    return new Promise((resolve, reject) => {
      this.queue.push(async () => {
        try {
          // Reserve the next start slot up front so concurrent tasks stay spaced by minInterval
          const now = Date.now()
          const startAt = Math.max(now, this.lastRun + this.minInterval)
          this.lastRun = startAt
          if (startAt > now) await new Promise(r => setTimeout(r, startAt - now))
          resolve(await fn())
        } catch (e) {
          reject(e)
        }
      })
      this.process()
    })
  }

  private async process() {
    while (this.queue.length > 0 && this.running < this.maxConcurrent) {
      this.running++
      const fn = this.queue.shift()!
      fn().finally(() => { this.running--; this.process() })
    }
  }
}

2. Response Caching

// composables/useCachedChat.ts
const responseCache = new Map<string, { content: string; timestamp: number }>()
const CACHE_TTL = 5 * 60 * 1000

export function getCachedResponse(input: string): string | null {
  const cached = responseCache.get(input)
  if (!cached) return null
  if (Date.now() - cached.timestamp > CACHE_TTL) {
    responseCache.delete(input)
    return null
  }
  return cached.content
}

export function setCachedResponse(input: string, content: string) {
  responseCache.set(input, { content, timestamp: Date.now() })
}
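
Keying the cache on the raw input alone returns the same answer regardless of which model was asked or what came earlier in the conversation. A sketch of a more specific key follows, assuming the `{ role, content }` message shape from Pattern 1; `buildCacheKey` and `CacheMessage` are hypothetical names.

```typescript
interface CacheMessage {
  role: string
  content: string
}

function buildCacheKey(model: string, history: CacheMessage[], input: string): string {
  // Serializing nested arrays gives an unambiguous encoding of model, history, and input
  return JSON.stringify([model, history.map(m => [m.role, m.content]), input.trim()])
}
```

The resulting string can be passed to `getCachedResponse` and `setCachedResponse` in place of the bare input. For long conversations the key grows with the history, so hashing it may be worth considering.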

3. Virtual Scrolling for Long Conversations

import { useVirtualList } from '@vueuse/core'

const { list, containerProps, wrapperProps } = useVirtualList(messages, {
  itemHeight: 80,
  overscan: 10,
})

Comparison Analysis

| Interaction Pattern | Real-time | Complexity | Use Case | Perceived Latency |
| --- | --- | --- | --- | --- |
| SSE Streaming | ★★★★★ | Low | General chat, text generation | <100ms |
| Function Calling | ★★★★ | Medium | Tool invocation, data analysis | 200-500ms |
| Multi-Model Routing | ★★★ | Medium | Cost-sensitive, multi-scenario | 100-2000ms |
| Markdown Rendering | ★★★★★ | Low | Code display, rich text | <50ms |
| Agent Autonomous | ★★★ | High | Complex tasks, automation | 1-10s |

| Frontend AI Solution | Bundle Size | Streaming | SSR Compatible | Vue3 Integration |
| --- | --- | --- | --- | --- |
| Custom Composable | 0KB | ★★★★★ | ★★★★★ | ★★★★★ |
| Vercel AI SDK | 12KB | ★★★★ | ★★★★ | ★★★ |
| LangChain.js | 200KB+ | ★★★ | ★★★ | ★★ |
| OpenAI SDK | 50KB | ★★★ | ★★ | ★★ |

Summary: Vue3 + Composable is a well-suited paradigm for frontend AI integration. SSE streaming resolves waiting anxiety, Function Calling enables deep AI-UI linkage, multi-model routing balances cost and quality, and Agent mode gives AI autonomous execution capability. These 5 patterns are not mutually exclusive but composable: a production-grade AI app often needs streaming output, tool invocation, and intelligent routing at the same time. In 2026, frontend engineers need to write not only UI but also AI interactions.

