Vue3 AI Integration in Practice: 5 LLM Interaction Patterns and a Complete Streaming Response Solution for 2026


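Still calling an LLM from the frontend with fetch, waiting for the complete response, and only then rendering? Leaving users staring at a blank screen for 10 seconds? It is 2026, and the user-experience bar for AI applications has moved: streaming output, real-time feedback, and intelligent interaction are now the baseline. This article walks through 5 LLM interaction patterns in Vue3, each written so that it can be copied into a project and adapted.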


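Background: The Evolution of LLM Frontend Interaction

| Stage | Interaction pattern | User experience | Implementation |
| --- | --- | --- | --- |
| 1.0 | Request, wait, respond | Long blank wait | HTTP fetch |
| 2.0 | Streaming response | Character-by-character output, like typing | SSE / WebSocket |
| 3.0 | Function Calling | AI actively calls tools | Structured output + frontend routing |
| 4.0 | Multi-model routing | Best model chosen per scenario | Intelligent routing layer |
| 5.0 | Autonomous agent interaction | AI plans and executes on its own | Multi-turn dialogue + toolchain |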

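Problem Analysis: Why Is the Traditional Approach Not Enough?

The three main pain points of calling an LLM from the frontend the traditional way:

  1. Waiting anxiety: the user gets no feedback at all until the complete response arrives
  2. Timeout failures: LLM response time is unpredictable, and long-form generation can exceed 30 seconds
  3. Disconnected features: AI capability is detached from UI interaction, so a genuinely intelligent user experience is out of reach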

Pattern 1: SSE Streaming Response

Composable Implementation

// composables/useSSEChat.ts
import { ref, onUnmounted } from 'vue'

interface ChatMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
  timestamp: number
}

interface UseSSEChatOptions {
  apiUrl: string
  model?: string
  onToken?: (token: string) => void
  onError?: (error: Error) => void
  onComplete?: (fullText: string) => void
}

export function useSSEChat(options: UseSSEChatOptions) {
  const messages = ref<ChatMessage[]>([])
  const currentResponse = ref('')
  const isLoading = ref(false)
  const error = ref<string | null>(null)
  let abortController: AbortController | null = null

  async function sendMessage(content: string) {
    isLoading.value = true
    error.value = null
    currentResponse.value = ''
    abortController = new AbortController()

    messages.value.push({
      role: 'user',
      content,
      timestamp: Date.now(),
    })

    try {
      const response = await fetch(options.apiUrl, {
        method: 'POST',
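        headers: {
          'Content-Type': 'application/json',
          // Caution: VITE_* variables are bundled into client code, so this key is
          // visible to users. In production, inject the key in a server-side proxy.
          Authorization: `Bearer ${import.meta.env.VITE_AI_API_KEY}`,
        },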
        body: JSON.stringify({
          model: options.model || 'gpt-4',
          messages: messages.value.map(({ role, content: c }) => ({ role, content: c })),
          stream: true,
        }),
        signal: abortController.signal,
      })

      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${response.statusText}`)
      }

      const reader = response.body?.getReader()
      if (!reader) throw new Error('ReadableStream is not available')

      const decoder = new TextDecoder()
      let buffer = ''

      while (true) {
        const { done, value } = await reader.read()
        if (done) break

        buffer += decoder.decode(value, { stream: true })
        const lines = buffer.split('\n')
        buffer = lines.pop() || ''

        for (const line of lines) {
          const trimmed = line.trim()
          if (!trimmed || trimmed === 'data: [DONE]') continue
          if (!trimmed.startsWith('data: ')) continue

          try {
            const json = JSON.parse(trimmed.slice(6))
            const token = json.choices?.[0]?.delta?.content || ''
            if (token) {
              currentResponse.value += token
              options.onToken?.(token)
            }
          } catch {
            // Ignore lines that fail to parse
          }
        }
      }

      messages.value.push({
        role: 'assistant',
        content: currentResponse.value,
        timestamp: Date.now(),
      })

      options.onComplete?.(currentResponse.value)
    } catch (e: any) {
      if (e.name !== 'AbortError') {
        error.value = e.message
        options.onError?.(e)
      }
    } finally {
      isLoading.value = false
      abortController = null
    }
  }

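  function stopGeneration() {
    // Nothing in flight: avoid pushing the last completed reply a second time
    if (!abortController) return
    abortController.abort()
    // Keep whatever was generated before the user stopped
    if (currentResponse.value) {
      messages.value.push({
        role: 'assistant',
        content: currentResponse.value,
        timestamp: Date.now(),
      })
    }
    isLoading.value = false
  }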

  function clearMessages() {
    messages.value = []
    currentResponse.value = ''
    error.value = null
  }

  onUnmounted(() => {
    abortController?.abort()
  })

  return {
    messages,
    currentResponse,
    isLoading,
    error,
    sendMessage,
    stopGeneration,
    clearMessages,
  }
}
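
The read loop above keeps the trailing partial line in `buffer` between reads, but a final line that arrives without a newline is never parsed. As a minimal sketch of the same line-buffering idea in isolation, the hypothetical helper below (`createSSEParser` is not part of the composable above) accepts decoded chunks, returns only the tokens from complete lines, and adds a `flush()` step for the leftover. It assumes the OpenAI-style `choices[0].delta.content` payload used above.

```typescript
// Hypothetical helper, not part of useSSEChat above.
// Assumes OpenAI-style stream payloads: {"choices":[{"delta":{"content":"..."}}]}
export function createSSEParser() {
  let buffer = ''

  // Extract the token from one complete SSE line, or '' if the line carries none.
  function parseLine(line: string): string {
    const trimmed = line.trim()
    if (!trimmed.startsWith('data:')) return ''
    const payload = trimmed.slice(5).trim()
    if (!payload || payload === '[DONE]') return ''
    try {
      return JSON.parse(payload).choices?.[0]?.delta?.content ?? ''
    } catch {
      return '' // malformed JSON: skip this line
    }
  }

  return {
    // Feed one decoded chunk; returns the tokens completed by this chunk.
    push(chunk: string): string[] {
      buffer += chunk
      const lines = buffer.split('\n')
      buffer = lines.pop() ?? '' // keep the trailing partial line for the next chunk
      return lines.map(parseLine).filter(token => token !== '')
    },
    // Call once after the stream ends to drain a final line that had no newline.
    flush(): string[] {
      const rest = buffer
      buffer = ''
      const token = parseLine(rest)
      return token ? [token] : []
    },
  }
}
```

Inside the read loop, `parser.push(decoder.decode(value, { stream: true }))` would replace the manual split, with `parser.flush()` called once after `done`. Keeping the parser free of Vue and fetch also makes it easy to unit test.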

Component Usage

<!-- components/AIChat.vue -->
<template>
  <div class="ai-chat">
    <div class="messages">
      <div v-for="msg in messages" :key="msg.timestamp" :class="['message', msg.role]">
        <div class="content">{{ msg.content }}</div>
      </div>
      <div v-if="isLoading" class="message assistant streaming">
        <div class="content">{{ currentResponse }}<span class="cursor">▊</span></div>
      </div>
    </div>
    <div class="input-area">
      <input
        v-model="inputText"
        placeholder="Type a message..."
        @keydown.enter="handleSend"
        :disabled="isLoading"
      />
      <button v-if="!isLoading" @click="handleSend" :disabled="!inputText.trim()">Send</button>
      <button v-else @click="stopGeneration" class="stop">Stop</button>
    </div>
  </div>
</template>

<script setup lang="ts">
import { ref } from 'vue'
import { useSSEChat } from '../composables/useSSEChat'

const inputText = ref('')
const { messages, currentResponse, isLoading, sendMessage, stopGeneration } = useSSEChat({
  apiUrl: '/api/v1/chat/completions',
  model: 'gpt-4',
})

function handleSend() {
  const text = inputText.value.trim()
  if (!text) return
  inputText.value = ''
  sendMessage(text)
}
</script>

Pattern 2: Function Calling Frontend Integration

// composables/useFunctionCalling.ts
import { ref } from 'vue'

interface FunctionDefinition {
  name: string
  description: string
  parameters: Record<string, any>
  execute: (args: any) => Promise<any>
}

export function useFunctionCalling() {
  const functions = ref<Map<string, FunctionDefinition>>(new Map())
  const executionLog = ref<Array<{ name: string; args: any; result: any; timestamp: number }>>([])

  function registerFunction(fn: FunctionDefinition) {
    functions.value.set(fn.name, fn)
  }

  function getToolsSchema() {
    return Array.from(functions.value.values()).map(fn => ({
      type: 'function',
      function: {
        name: fn.name,
        description: fn.description,
        parameters: fn.parameters,
      },
    }))
  }

  async function handleToolCalls(toolCalls: any[]) {
    const results = []
    for (const call of toolCalls) {
      const fn = functions.value.get(call.function.name)
      if (!fn) {
        results.push({
          tool_call_id: call.id,
          role: 'tool',
          content: JSON.stringify({ error: `Unknown function: ${call.function.name}` }),
        })
        continue
      }

      try {
        const args = JSON.parse(call.function.arguments)
        const result = await fn.execute(args)
        executionLog.value.push({
          name: call.function.name,
          args,
          result,
          timestamp: Date.now(),
        })
        results.push({
          tool_call_id: call.id,
          role: 'tool',
          content: JSON.stringify(result),
        })
      } catch (e: any) {
        results.push({
          tool_call_id: call.id,
          role: 'tool',
          content: JSON.stringify({ error: e.message }),
        })
      }
    }
    return results
  }

  return { functions, executionLog, registerFunction, getToolsSchema, handleToolCalls }
}
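
`handleToolCalls` above wraps `JSON.parse(call.function.arguments)` in a try-catch, which covers invalid JSON but not valid JSON that is missing a required field. The sketch below shows one way to check both before calling `execute`. The helper name `parseToolArguments` and its result shape are assumptions for illustration; `required` is meant to mirror the `required` array in a tool's parameter schema.

```typescript
type ParsedArgs =
  | { ok: true; args: Record<string, unknown> }
  | { ok: false; error: string }

// Hypothetical helper: validates the raw `arguments` string of a tool call.
export function parseToolArguments(raw: string, required: string[] = []): ParsedArgs {
  let value: unknown
  try {
    value = JSON.parse(raw)
  } catch {
    return { ok: false, error: 'arguments is not valid JSON' }
  }
  // Tool arguments are expected to be a plain JSON object
  if (typeof value !== 'object' || value === null || Array.isArray(value)) {
    return { ok: false, error: 'arguments must be a JSON object' }
  }
  const args = value as Record<string, unknown>
  const missing = required.filter(key => !(key in args))
  if (missing.length > 0) {
    return { ok: false, error: `missing required field(s): ${missing.join(', ')}` }
  }
  return { ok: true, args }
}
```

Returning a result object instead of throwing lets the caller send the error text back to the model as the `tool` message content, so the model has a chance to retry with corrected arguments.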

Pattern 3: Intelligent Multi-Model Routing

// composables/useModelRouter.ts
import { ref } from 'vue'

interface ModelConfig {
  id: string
  name: string
  maxTokens: number
  costPer1k: number
  latencyMs: number
  capabilities: string[]
}

// Limits, prices, and latencies here are illustrative; check each provider's current figures
const MODEL_REGISTRY: ModelConfig[] = [
  { id: 'gpt-4', name: 'GPT-4', maxTokens: 128000, costPer1k: 0.03, latencyMs: 2000, capabilities: ['reasoning', 'code', 'writing'] },
  { id: 'gpt-4o-mini', name: 'GPT-4o Mini', maxTokens: 128000, costPer1k: 0.00015, latencyMs: 500, capabilities: ['chat', 'summary'] },
  { id: 'claude-3.5-sonnet', name: 'Claude 3.5 Sonnet', maxTokens: 200000, costPer1k: 0.003, latencyMs: 1500, capabilities: ['reasoning', 'code', 'analysis'] },
  { id: 'deepseek-v3', name: 'DeepSeek V3', maxTokens: 128000, costPer1k: 0.00027, latencyMs: 800, capabilities: ['code', 'math', 'reasoning'] },
]

export function useModelRouter() {
  const currentModel = ref<ModelConfig>(MODEL_REGISTRY[1])
  const routingLog = ref<Array<{ input: string; model: string; reason: string }>>([])

  function selectModel(input: string, options?: { preferSpeed?: boolean; preferQuality?: boolean }) {
    const lower = input.toLowerCase()

    // Switch to the chosen model and record why it was chosen
    function route(index: number, reason: string) {
      currentModel.value = MODEL_REGISTRY[index]
      routingLog.value.push({ input: input.slice(0, 50), model: currentModel.value.id, reason })
      return currentModel.value
    }

    // An explicit preferQuality must not be overridden by the short-input shortcut
    if (options?.preferSpeed || (!options?.preferQuality && lower.length < 50)) {
      return route(1, 'speed first / short input') // mini
    }

    // '代码' is the Chinese word for "code", kept so Chinese prompts still match
    if (lower.includes('代码') || lower.includes('code') || lower.includes('debug')) {
      return route(3, 'coding task') // deepseek
    }

    if (options?.preferQuality || lower.length > 500) {
      return route(0, 'quality first / long input') // gpt-4
    }

    return route(2, 'default reasoning') // claude
  }

  return { currentModel, routingLog, selectModel, MODEL_REGISTRY }
}
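
The router above picks models by fixed registry index and keyword rules. A possible alternative, sketched here as an assumption rather than the article's method, is to use the registry's own `capabilities`, `latencyMs`, and `costPer1k` fields: keep the models that have the needed capability and fit a latency budget, then take the cheapest. `pickModel` and `RoutableModel` are hypothetical names.

```typescript
// Subset of the ModelConfig fields that routing needs
interface RoutableModel {
  id: string
  costPer1k: number
  latencyMs: number
  capabilities: string[]
}

// Hypothetical helper: cheapest model that has the capability and fits the
// latency budget, or undefined when nothing qualifies.
export function pickModel(
  models: RoutableModel[],
  capability: string,
  maxLatencyMs = Infinity,
): RoutableModel | undefined {
  return models
    .filter(m => m.capabilities.includes(capability) && m.latencyMs <= maxLatencyMs)
    .sort((a, b) => a.costPer1k - b.costPer1k)[0] // filter() copies, so the input is not mutated
}
```

With this shape, adding a model to the registry changes routing without touching any index constants; the caller still has to decide which capability a given prompt needs.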

Pattern 4: AI Chat Component (Markdown Rendering)

<!-- components/AIMarkdownRenderer.vue -->
<template>
  <div class="ai-markdown" v-html="renderedContent"></div>
</template>

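<script setup lang="ts">
import { computed } from 'vue'
import { marked } from 'marked'
import DOMPurify from 'dompurify'

const props = defineProps<{ content: string }>()

const renderedContent = computed(() => {
  if (!props.content) return ''
  const html = marked.parse(props.content, { async: false }) as string
  // Model output is untrusted: sanitize it before handing it to v-html
  return DOMPurify.sanitize(html)
})
</script>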
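
When this renderer is fed a reply that is still streaming, the text can end in the middle of a fenced code block, and the unfinished block may render oddly until the closing fence arrives. One small mitigation, offered as a sketch and not as part of the component above, is to append a temporary closing fence before parsing. `closeOpenFence` is a hypothetical helper; it only counts lines that start with three backticks and does not handle tilde fences or longer nested fences.

```typescript
// Three backticks, built with repeat() so the sample stays readable inside a fenced block
const FENCE = '`'.repeat(3)

// Hypothetical helper: if the partial text contains an odd number of fence lines,
// append a closing fence so the unfinished code block still renders as code.
export function closeOpenFence(partial: string): string {
  const fenceLines = partial
    .split('\n')
    .filter(line => line.trimStart().startsWith(FENCE)).length
  if (fenceLines % 2 === 0) return partial
  return partial.endsWith('\n') ? partial + FENCE : partial + '\n' + FENCE
}
```

In the component this would wrap the input, for example `marked.parse(closeOpenFence(props.content), { async: false })`, and it becomes a no-op once the real closing fence has streamed in.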

Pattern 5: Autonomous Agent Interaction

// composables/useAIAgent.ts
import { ref } from 'vue'
import { useSSEChat } from './useSSEChat'
import { useFunctionCalling } from './useFunctionCalling'

interface AgentStep {
  type: 'thinking' | 'tool_call' | 'tool_result' | 'response'
  content: string
  timestamp: number
}

export function useAIAgent(apiUrl: string) {
  const steps = ref<AgentStep[]>([])
  const isRunning = ref(false)
  const maxIterations = 5

  const chat = useSSEChat({ apiUrl })
  const fc = useFunctionCalling()

  fc.registerFunction({
    name: 'search_web',
    description: 'Search the web for up-to-date information',
    parameters: {
      type: 'object',
      properties: { query: { type: 'string', description: 'Search keywords' } },
      required: ['query'],
    },
    execute: async (args) => {
      const resp = await fetch(`/api/search?q=${encodeURIComponent(args.query)}`)
      return resp.json()
    },
  })

  fc.registerFunction({
    name: 'run_code',
    description: 'Execute code and return the result',
    parameters: {
      type: 'object',
      properties: { code: { type: 'string', description: 'Code to execute' }, language: { type: 'string', description: 'Programming language' } },
      required: ['code'],
    },
    execute: async (args) => {
      const resp = await fetch('/api/execute', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(args),
      })
      return resp.json()
    },
  })

  async function run(task: string) {
    isRunning.value = true
    steps.value = []
    let iteration = 0

    let currentInput = task

    while (iteration < maxIterations) {
      iteration++
      steps.value.push({ type: 'thinking', content: `Thinking, round ${iteration}...`, timestamp: Date.now() })

      await chat.sendMessage(currentInput)

      const lastAssistantMsg = chat.messages.value[chat.messages.value.length - 1]
      if (!lastAssistantMsg || lastAssistantMsg.role !== 'assistant') break

      steps.value.push({ type: 'response', content: lastAssistantMsg.content, timestamp: Date.now() })

      // Heuristic only: useSSEChat exposes plain text, not structured tool_calls
      const hasToolCall = lastAssistantMsg.content.includes('function_call') || lastAssistantMsg.content.includes('tool_call')
      if (!hasToolCall) break

      steps.value.push({ type: 'tool_call', content: 'Executing tool call...', timestamp: Date.now() })

      // Skeleton: the tool is not actually executed here. A complete loop would call
      // fc.handleToolCalls(...) and feed the results back to the model as the next input.
      break
    }

    isRunning.value = false
  }

  return { steps, isRunning, run, fc }
}
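
The `run` loop above detects tool calls by searching the reply text and stops before executing anything. As a sketch of what a complete loop could look like, the function below works on structured `tool_calls` and feeds tool results back to the model until it answers without requesting a tool or the iteration cap is reached. Everything here is an assumption for illustration: `runAgentLoop`, the `AgentMessage` shape (modelled on the OpenAI-style chat format), and the two injected callbacks. `callModel` stands for a non-streaming request that returns the assistant message, and `runTools` could be backed by `handleToolCalls` from Pattern 2.

```typescript
interface ToolCall {
  id: string
  function: { name: string; arguments: string }
}

interface AgentMessage {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string | null
  tool_calls?: ToolCall[]
  tool_call_id?: string
}

// Hypothetical agent loop. Both callbacks are supplied by the caller:
//   callModel - sends the history to the model and resolves to its assistant message
//   runTools  - executes the requested tools and resolves to role: 'tool' messages
export async function runAgentLoop(
  task: string,
  callModel: (history: AgentMessage[]) => Promise<AgentMessage>,
  runTools: (calls: ToolCall[]) => Promise<AgentMessage[]>,
  maxIterations = 5,
): Promise<AgentMessage[]> {
  const history: AgentMessage[] = [{ role: 'user', content: task }]
  for (let i = 0; i < maxIterations; i++) {
    const reply = await callModel(history)
    history.push(reply)
    // No tool requested: this reply is the final answer
    if (!reply.tool_calls?.length) break
    // Append tool results so the next model call can see them
    history.push(...(await runTools(reply.tool_calls)))
  }
  return history
}
```

Injecting the model call and the tool runner keeps the loop independent of any particular API or transport, and the `maxIterations` cap bounds the number of model calls even if the model keeps requesting tools.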

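Pitfall Guide

| No. | Pitfall | Symptom | Solution |
| --- | --- | --- | --- |
| 1 | SSE connection not closed properly | Data keeps arriving after the component unmounts; memory leak | Call abortController.abort() in onUnmounted |
| 2 | Incomplete buffer during stream parsing | The last line of data is lost | Keep the leftover part of buffer and prepend it on the next read |
| 3 | Function Calling argument parsing fails | JSON.parse(arguments) throws | Wrap it in try-catch and fall back to a plain-text response |
| 4 | Multi-model routing loops forever | Model A recommends model B, and model B recommends model A | Set a maxIterations cap and force the default model once it is exceeded |
| 5 | Markdown XSS injection | v-html renders a malicious script | Sanitize the HTML with DOMPurify; marked's old sanitize option was deprecated and has been removed from current versions |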

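Error Troubleshooting

| Error message | Cause | Fix |
| --- | --- | --- |
| ReadableStream is not supported | The browser does not support the streams API | Add the web-streams-polyfill polyfill, or fall back to polling |
| net::ERR_INCOMPLETE_CHUNKED_ENCODING | Malformed SSE output from the server | Confirm Content-Type: text/event-stream and that every message ends with \n\n |
| AbortError: The user aborted a request | The user cancelled the request | Expected behavior; no handling needed beyond checking e.name === 'AbortError' |
| 429 Too Many Requests | API rate limit exceeded | Retry with exponential backoff and add a request queue |
| JSON.parse: unexpected character | Malformed SSE data line | Check the data: prefix; filter out empty and non-data lines |
| CORS policy: No 'Access-Control-Allow-Origin' | Cross-origin request rejected | Add CORS headers on the server, or use the Vite proxy |
| Cannot read property 'delta' of undefined | The SSE response structure changed | Use optional chaining: json.choices?.[0]?.delta?.content |
| Maximum call stack size exceeded | Agent recursion too deep | Limit maxIterations and add a recursion depth check |
| TypeError: Failed to fetch | Network disconnected | Detect network status, show an offline notice, and reconnect automatically |
| TypeError: response.body is null | Empty response body | Check that the API endpoint supports streaming, and add a null check for response.body |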
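
The 429 row above recommends exponential backoff. A minimal sketch of the delay calculation follows; the function names and the 500 ms base and 30 s cap are assumed values for illustration, not figures from any provider. When the server sends a numeric `Retry-After` header (in seconds), that value takes precedence; the HTTP-date form of the header is not parsed here and falls back to the computed delay.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
export function backoffDelay(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

// Prefer a numeric Retry-After header (seconds) over the computed delay.
export function retryDelay(attempt: number, retryAfterHeader: string | null): number {
  if (retryAfterHeader !== null && retryAfterHeader.trim() !== '') {
    const seconds = Number(retryAfterHeader)
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000
  }
  return backoffDelay(attempt)
}
```

A caller would typically wait `retryDelay(attempt, response.headers.get('Retry-After'))` milliseconds before the next attempt and give up after a fixed number of retries. Adding random jitter to the delay is a common refinement when many clients retry at once.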

Advanced Optimization

1. Request Queue and Rate Limiting

// utils/rateLimiter.ts
export class RequestQueue {
  private queue: Array<() => Promise<any>> = []
  private running = 0
  private maxConcurrent: number
  private minInterval: number
  private lastRun = 0

  constructor(maxConcurrent = 3, minInterval = 1000) {
    this.maxConcurrent = maxConcurrent
    this.minInterval = minInterval
  }

  async add<T>(fn: () => Promise<T>): Promise<T> {
    return new Promise((resolve, reject) => {
      this.queue.push(async () => {
        try {
          // Reserve the next start slot synchronously, so tasks that are started
          // together are still spaced at least minInterval apart
          const now = Date.now()
          const startAt = Math.max(now, this.lastRun + this.minInterval)
          this.lastRun = startAt
          if (startAt > now) await new Promise(r => setTimeout(r, startAt - now))
          resolve(await fn())
        } catch (e) {
          reject(e)
        }
      })
      this.process()
    })
  }

  private async process() {
    while (this.queue.length > 0 && this.running < this.maxConcurrent) {
      this.running++
      const fn = this.queue.shift()!
      fn().finally(() => { this.running--; this.process() })
    }
  }
}

2. Response Caching

// composables/useCachedChat.ts
const responseCache = new Map<string, { content: string; timestamp: number }>()
const CACHE_TTL = 5 * 60 * 1000

export function getCachedResponse(input: string): string | null {
  const cached = responseCache.get(input)
  if (!cached) return null
  if (Date.now() - cached.timestamp > CACHE_TTL) {
    responseCache.delete(input)
    return null
  }
  return cached.content
}

export function setCachedResponse(input: string, content: string) {
  responseCache.set(input, { content, timestamp: Date.now() })
}
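
The cache above is keyed on the input text alone, so the same question asked in a different conversation, or sent to a different model, would return the same cached answer. One way to avoid that, sketched here under the assumption that the caller has the model id and the message history at hand, is to derive the key from both. `buildCacheKey` and `KeyMessage` are hypothetical names.

```typescript
interface KeyMessage {
  role: string
  content: string
}

// Hypothetical helper: cache key built from the model id plus the full message history,
// so identical prompts in different contexts do not share a cache entry.
export function buildCacheKey(model: string, messages: KeyMessage[]): string {
  return JSON.stringify([model, messages.map(m => [m.role, m.content.trim()])])
}
```

The resulting string can be passed as the `input` argument of `getCachedResponse` and `setCachedResponse` unchanged. For long histories, hashing the string would keep the keys short.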

3. Virtual Scrolling for Long Conversations

// Virtual scrolling for very long conversation lists
import { useVirtualList } from '@vueuse/core'

const { list, containerProps, wrapperProps } = useVirtualList(messages, {
  itemHeight: 80,
  overscan: 10,
})

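Comparison

| Interaction pattern | Real-time | Complexity | Use cases | Perceived latency |
| --- | --- | --- | --- | --- |
| SSE streaming | ★★★★★ | | General chat, text generation | <100ms |
| Function Calling | ★★★★ | | Tool calls, data analysis | 200-500ms |
| Multi-model routing | ★★★ | | Cost-sensitive, multi-scenario | 100-2000ms |
| Markdown rendering | ★★★★★ | | Code display, rich text | <50ms |
| Autonomous agent interaction | ★★★ | | Complex tasks, automation | 1-10s |

| Frontend AI approach | Bundle size | Streaming support | SSR compatibility | Vue3 integration |
| --- | --- | --- | --- | --- |
| Custom Composable | 0KB | ★★★★★ | ★★★★★ | ★★★★★ |
| Vercel AI SDK | 12KB | ★★★★ | ★★★★ | ★★★ |
| LangChain.js | 200KB+ | ★★★ | ★★★ | ★★ |
| OpenAI SDK | 50KB | ★★★ | ★★ | ★★ |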

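Summary: Vue3 with Composables is the best paradigm for frontend AI integration. SSE streaming relieves waiting anxiety, Function Calling ties AI closely to the UI, multi-model routing balances cost against quality, and the Agent pattern lets the AI execute tasks on its own. The 5 patterns are not mutually exclusive and can be combined: a production-grade AI application often needs streaming output, tool calls, and intelligent routing at the same time. In 2026, frontend engineers need to know how to build AI interactions, not only UI.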

