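Python LLM Structured Output: 6 Production Patterns from JSON Schema to Function Calling
When the LLM returns a blob of free text, every downstream system breaks
You ask GPT for JSON and get JSON with comments; you declare a field as an integer and get the string "42"; you ask for a list of 3 items and get 5. Structured output is one of the core foundations of AI engineering in 2026: without it, your RAG pipelines, agent tool calls, and data-extraction pipelines are all time bombs.
This article starts from JSON Schema constraints and walks through six production patterns: JSON Schema validation → OpenAI function calling → Instructor automatic retries → multi-model adaptation → streaming structured output → production reliability safeguards, taking each one from concept to working code.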
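Key takeaways
- Understand the three core mechanisms of LLM structured output: prompt constraints, JSON Schema, and the function-calling protocol
- Master six Python structured-output patterns, from simple to complex
- Learn the Instructor library's automatic retry and validation strategies
- Implement structured-output adapters across models (OpenAI/Anthropic/Gemini)
- Build a production-grade reliability layer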
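Contents
- Core concepts of LLM structured output
- Pattern 1: JSON Schema constrained output
- Pattern 2: OpenAI function-calling protocol
- Pattern 3: Automatic retries with Instructor
- Pattern 4: Multi-model structured output adapters
- Pattern 5: Streaming structured output
- Pattern 6: Production-grade reliability
- 5 common pitfalls and how to fix them
- Troubleshooting 10 common errors
- Advanced optimization tips
- Comparison: 3 structured-output approaches
- Recommended online tools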
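Core concepts of LLM structured output
| Concept | Description |
|---|---|
| Structured Output | LLM output that conforms to a predefined schema (JSON/XML) |
| JSON Schema | A specification for describing JSON data structures, used to constrain and validate LLM output |
| Function Calling | A protocol introduced by OpenAI that has the LLM emit JSON matching a function's parameter schema |
| Tool Use | The Anthropic/Gemini counterpart of function calling, with the same semantics |
| Constrained Decoding | Restricting token choices at inference time so the output is guaranteed to match the schema, within the schema features the provider supports |
| Instructor | A Python library that generates the schema from a Pydantic model, validates the output, and retries automatically |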
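Why LLM structured output matters
Traditional LLM output flow:
User prompt → free-form LLM generation → string → regex/JSON parsing → may fail → retry
Structured output flow:
User prompt + schema → constrained LLM generation → valid JSON → Pydantic validation → success
Key differences:
1. Traditional approach: unpredictable output, brittle parsing, and costly retries
2. Structured output: predictable output, reliable validation, and retries driven by concrete validation errors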
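To make the difference concrete, here is a minimal stdlib-only sketch of the checks a schema layer performs on a raw reply. The `check_payload` helper and its field names (`rating`, `tags`) are hypothetical, chosen to mirror the failure cases from the introduction; in the rest of the article Pydantic plays this role.

```python
import json

def check_payload(raw: str, expected_tags: int = 3) -> list[str]:
    """Return a list of schema violations found in a raw LLM reply."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value is not an object"]
    problems = []
    rating = payload.get("rating")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(rating, int) or isinstance(rating, bool):
        problems.append(f"rating must be an integer, got {type(rating).__name__}")
    tags = payload.get("tags")
    if not isinstance(tags, list) or len(tags) != expected_tags:
        problems.append(f"tags must be a list of {expected_tags} items")
    return problems

# A reply with a stringly-typed number and too many list items.
print(check_payload('{"rating": "42", "tags": ["a", "b", "c", "d", "e"]}'))
# A reply that satisfies both constraints.
print(check_payload('{"rating": 8, "tags": ["a", "b", "c"]}'))
```

Writing checks like these by hand for every field is what the schema-driven patterns below are meant to avoid.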
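Comparison of the 3 structured-output mechanisms
| Mechanism | How it works | Reliability | Latency | Compatibility |
|---|---|---|---|---|
| Prompt constraints | Describe the output format in the prompt | ⭐ Low | No overhead | All models |
| JSON Schema | Constrain the output structure with a schema | ⭐⭐ Medium | Slight | Some models |
| Function-calling protocol | Dedicated API channel, plus constrained decoding when strict mode is enabled | ⭐⭐⭐ High | Slight | Specific models |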
Pattern 1: JSON Schema constrained output
The most basic approach: describe the required format in the prompt, then validate the result against a JSON Schema.
import json
import re
from typing import Optional
from pydantic import BaseModel, Field, ValidationError

class MovieReview(BaseModel):
    title: str = Field(description="Movie title")
    rating: int = Field(ge=1, le=10, description="Rating from 1 to 10")
    sentiment: str = Field(pattern="^(positive|negative|neutral)$")
    summary: str = Field(max_length=200, description="Short review summary")
    recommended: bool = Field(description="Whether the movie is recommended")

MOVIE_REVIEW_SCHEMA = MovieReview.model_json_schema()

STRUCTURED_PROMPT = """You are a professional movie review analyzer.
Analyze the review below and return the result strictly according to the JSON Schema.
JSON Schema:
{schema}
Review:
{review}
Requirements:
1. Return valid JSON
2. rating must be an integer from 1 to 10
3. sentiment must be one of positive/negative/neutral
4. Do not add anything outside the JSON
5. Do not wrap the output in ```json```
"""
def extract_json_from_response(text: str) -> Optional[dict]:
    # Try the whole reply first, so a valid, deeply nested object is never
    # replaced by one of its inner objects matched by the regexes below.
    try:
        return json.loads(text.strip())
    except json.JSONDecodeError:
        pass
    patterns = [
        r'```json\s*(.*?)\s*```',              # fenced block tagged as json
        r'```\s*(.*?)\s*```',                  # untagged fenced block
        r'(\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\})',  # bare object, at most one nesting level
    ]
    for pattern in patterns:
        match = re.search(pattern, text, re.DOTALL)
        if match:
            try:
                return json.loads(match.group(1))
            except json.JSONDecodeError:
                continue
    return None

def parse_structured_output(raw_text: str) -> Optional[MovieReview]:
    parsed_json = extract_json_from_response(raw_text)
    if parsed_json is None:
        return None
    try:
        return MovieReview.model_validate(parsed_json)
    except ValidationError as e:
        print(f"Validation failed: {e}")
        return None

async def call_llm_with_schema(prompt: str) -> Optional[MovieReview]:
    from openai import AsyncOpenAI
    client = AsyncOpenAI()
    formatted_prompt = STRUCTURED_PROMPT.format(
        schema=json.dumps(MOVIE_REVIEW_SCHEMA, ensure_ascii=False, indent=2),
        review=prompt,
    )
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": formatted_prompt}],
        temperature=0.1,
    )
    raw_text = response.choices[0].message.content or ""
    return parse_structured_output(raw_text)
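Limitations of JSON Schema validation
Problem 1: the LLM may return invalid JSON
→ extract_json_from_response is needed for tolerant extraction
Problem 2: the LLM may ignore schema constraints
→ rating comes back as "9分" ("9 points") instead of 9
→ sentiment comes back as "很積極" ("very positive") instead of positive
Problem 3: nested structures are error-prone
→ list length cannot be controlled
→ optional fields may be missing
Problem 4: the prompt has to be hand-written every time
→ high maintenance cost, and easy to leave something out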
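One common way to soften problems 1 and 2 is to feed the concrete validation error back to the model and ask again. The sketch below shows that loop with the standard library only; `retry_with_feedback`, `validate_rating`, and the stubbed model replies are hypothetical names for illustration, and the correction wording is an assumption, not a prescribed prompt.

```python
import json
from typing import Callable, Optional

def retry_with_feedback(
    call_llm: Callable[[str], str],
    validate: Callable[[object], Optional[str]],
    prompt: str,
    max_attempts: int = 3,
) -> Optional[object]:
    """Call the model, validate the reply, and re-prompt with the error until it passes."""
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = call_llm(current_prompt)
        try:
            data = json.loads(raw)
            error = validate(data)
        except json.JSONDecodeError as exc:
            data, error = None, f"invalid JSON: {exc.msg}"
        if error is None:
            return data
        # Feed the concrete error back so the next attempt can correct it.
        current_prompt = (
            f"{prompt}\n\nYour previous reply was rejected: {error}. "
            "Reply with corrected JSON only."
        )
    return None

def validate_rating(data: object) -> Optional[str]:
    """Return an error message, or None when the payload is acceptable."""
    if not isinstance(data, dict):
        return "expected a JSON object"
    rating = data.get("rating")
    if isinstance(rating, int) and not isinstance(rating, bool) and 1 <= rating <= 10:
        return None
    return "rating must be an integer between 1 and 10"

# Stub model: the first reply violates the schema, the second one is correct.
replies = iter(['{"rating": "9 points"}', '{"rating": 9}'])
result = retry_with_feedback(lambda p: next(replies), validate_rating, "Rate this movie.")
print(result)
```

This is the loop that Pattern 3 delegates to Instructor, which builds the error message from Pydantic's validation output.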
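Pattern 2: OpenAI function-calling protocol
OpenAI's Function Calling protocol is the standard route to structured output: a dedicated API channel through which the LLM emits JSON that matches a schema.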
import json
from typing import Optional
from pydantic import BaseModel, Field
from openai import AsyncOpenAI

class SentimentAnalysis(BaseModel):
    text: str = Field(description="The text being analyzed")
    sentiment: str = Field(description="Sentiment: positive/negative/neutral")
    confidence: float = Field(ge=0.0, le=1.0, description="Confidence from 0 to 1")
    keywords: list[str] = Field(description="List of keywords")
    language: str = Field(description="Detected language")

class EntityExtraction(BaseModel):
    entities: list[dict] = Field(description="List of extracted entities")
    relationships: list[dict] = Field(default_factory=list, description="Relationships between entities")
    summary: str = Field(description="Summary of the text")

def pydantic_to_function_schema(model_class: type[BaseModel]) -> dict:
    # Pass the full schema through. Copying only "properties" and "required"
    # would drop "$defs", which nested models reference via "$ref".
    schema = model_class.model_json_schema()
    return {
        "type": "function",
        "function": {
            "name": model_class.__name__,
            "description": model_class.__doc__ or f"Extract {model_class.__name__}",
            "parameters": schema,
        },
    }
async def function_calling_extract(
    text: str,
    model_class: type[BaseModel],
    model: str = "gpt-4o",
) -> Optional[BaseModel]:
    client = AsyncOpenAI()
    function_schema = pydantic_to_function_schema(model_class)
    response = await client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a precise data-extraction assistant. Use the provided function to return structured results.",
            },
            {"role": "user", "content": text},
        ],
        tools=[function_schema],
        # Force the model to call this specific function.
        tool_choice={"type": "function", "function": {"name": model_class.__name__}},
    )
    message = response.choices[0].message
    if message.tool_calls:
        tool_call = message.tool_calls[0]
        try:
            args = json.loads(tool_call.function.arguments)
            return model_class.model_validate(args)
        # json.JSONDecodeError and Pydantic v2's ValidationError both subclass ValueError.
        except ValueError as e:
            print(f"Failed to parse function-call result: {e}")
            return None
    return None
async def multi_function_calling(
    text: str,
    model_classes: list[type[BaseModel]],
    model: str = "gpt-4o",
) -> dict[str, BaseModel]:
    client = AsyncOpenAI()
    tool_schemas = [pydantic_to_function_schema(cls) for cls in model_classes]
    response = await client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a multi-task data-extraction assistant."},
            {"role": "user", "content": text},
        ],
        tools=tool_schemas,
        tool_choice="auto",
    )
    # Look tools up by name instead of scanning every class for every call.
    classes_by_name = {cls.__name__: cls for cls in model_classes}
    results: dict[str, BaseModel] = {}
    message = response.choices[0].message
    for tool_call in message.tool_calls or []:
        cls = classes_by_name.get(tool_call.function.name)
        if cls is None:
            continue
        try:
            args = json.loads(tool_call.function.arguments)
            # Note: a second call to the same tool overwrites the first result.
            results[cls.__name__] = cls.model_validate(args)
        except ValueError as e:
            print(f"Failed to parse {cls.__name__}: {e}")
    return results
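Strict Mode for the function-calling protocol
import json
from typing import Optional
from pydantic import BaseModel
from openai import AsyncOpenAI

async def strict_structured_output(
    text: str,
    model_class: type[BaseModel],
    model: str = "gpt-4o-2024-08-06",
) -> Optional[BaseModel]:
    client = AsyncOpenAI()
    # Strict mode rejects a schema unless every object sets
    # "additionalProperties": false and lists all of its properties as required.
    # Pydantic emits the former when the model sets
    # model_config = ConfigDict(extra="forbid"); fields with defaults are not
    # marked required, so models passed here should declare every field without one.
    schema = model_class.model_json_schema()
    response = await client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Extract structured data"},
            {"role": "user", "content": text},
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": model_class.__name__,
                "strict": True,
                "schema": schema,
            },
        },
    )
    raw = response.choices[0].message.content
    if raw:
        try:
            return model_class.model_validate(json.loads(raw))
        except ValueError as e:
            print(f"Strict mode parsing failed: {e}")
    return None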
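Strict mode is commonly documented as requiring every object in the schema to set `additionalProperties` to false and to list all of its properties under `required`. The helper below is a minimal sketch of a post-processing step that enforces those two rules on a plain schema dict; `to_strict_schema` is a hypothetical name, and the sketch does not cover other strict-mode restrictions such as unsupported keywords. Because it marks every property as required, optional values need to be modeled as nullable types instead.

```python
import copy

def to_strict_schema(schema: dict) -> dict:
    """Return a copy in which every object node lists all of its properties
    as required and sets additionalProperties to false."""
    def visit(node) -> None:
        if isinstance(node, dict):
            if node.get("type") == "object" and isinstance(node.get("properties"), dict):
                node["additionalProperties"] = False
                node["required"] = list(node["properties"])
            # Recurse into nested schemas: properties, items, $defs, anyOf, ...
            for value in list(node.values()):
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    result = copy.deepcopy(schema)  # leave the caller's schema untouched
    visit(result)
    return result

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
        },
    },
    "required": ["name"],
}
strict = to_strict_schema(schema)
print(strict["required"], strict["properties"]["address"]["required"])
```

The same function can be applied to the dict returned by `model_json_schema()`, since nested models appear under `$defs` and are visited by the recursion.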
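Pattern 3: Automatic retries with Instructor
The Instructor library is a widely used way to get structured output from LLMs in Python: it generates the schema from a Pydantic model, validates the output, and retries automatically.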
import re
import instructor
from pydantic import BaseModel, Field, field_validator
from openai import AsyncOpenAI

class ProductInfo(BaseModel):
    name: str = Field(description="Product name")
    price: float = Field(gt=0, description="Price, must be greater than 0")
    category: str = Field(description="Product category")
    features: list[str] = Field(description="List of product features", min_length=1, max_length=5)
    in_stock: bool = Field(description="Whether the product is in stock")

    @field_validator("price")
    @classmethod
    def round_price(cls, v: float) -> float:
        return round(v, 2)

    @field_validator("category")
    @classmethod
    def normalize_category(cls, v: str) -> str:
        return v.strip().lower()

class ArticleMetadata(BaseModel):
    title: str = Field(description="Article title")
    author: str = Field(description="Author")
    publish_date: str = Field(description="Publication date in YYYY-MM-DD format")
    tags: list[str] = Field(description="List of tags")
    word_count: int = Field(ge=0, description="Word count")
    reading_time_minutes: int = Field(ge=1, description="Estimated reading time in minutes")

    @field_validator("publish_date")
    @classmethod
    def validate_date_format(cls, v: str) -> str:
        # The error text is sent back to the model on retry, so keep it specific.
        if not re.match(r'^\d{4}-\d{2}-\d{2}$', v):
            raise ValueError(f"Invalid date format: {v}, expected YYYY-MM-DD")
        return v
async def instructor_extract_product(text: str) -> ProductInfo:
    client = instructor.from_openai(AsyncOpenAI())
    result = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": f"Extract the product information from the following text:\n\n{text}"}
        ],
        response_model=ProductInfo,
        max_retries=3,  # re-ask with the validation error up to 3 times
        temperature=0.1,
    )
    return result

async def instructor_extract_with_mode(
    text: str,
    mode: instructor.Mode = instructor.Mode.TOOLS,
) -> ArticleMetadata:
    client = instructor.from_openai(AsyncOpenAI(), mode=mode)
    result = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": f"Extract the article metadata:\n\n{text}"}
        ],
        response_model=ArticleMetadata,
        max_retries=3,
    )
    return result
async def instructor_partial_streaming(text: str):
    client = instructor.from_openai(AsyncOpenAI())
    # With an async client, create_partial returns an async generator,
    # so it is iterated directly rather than awaited.
    article_stream = client.chat.completions.create_partial(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": f"Extract the article metadata:\n\n{text}"}
        ],
        response_model=ArticleMetadata,
        max_retries=3,
    )
    async for partial in article_stream:
        print(f"Partial result: {partial.model_dump_json(exclude_none=True)}")
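Choosing an Instructor Mode
Mode.JSON_SCHEMA     → OpenAI's response_format=json_schema (recommended, most reliable; exact behavior depends on the Instructor version)
Mode.TOOLS           → OpenAI function calling (good compatibility)
Mode.JSON            → ask for JSON output in the prompt (most portable, least reliable)
Mode.ANTHROPIC_TOOLS → Anthropic tool_use
Mode.GEMINI_JSON     → Gemini JSON mode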
Instructor retry strategies in detail
from typing import Optional
import instructor
from pydantic import BaseModel, Field
from openai import AsyncOpenAI

class StrictUser(BaseModel):
    name: str = Field(min_length=2, max_length=50)
    age: int = Field(ge=0, le=150)
    email: str = Field(pattern=r'^[\w\.-]+@[\w\.-]+\.\w+$')

async def instructor_with_custom_retry(text: str) -> StrictUser:
    client = instructor.from_openai(
        AsyncOpenAI(),
        mode=instructor.Mode.JSON_SCHEMA,
    )
    # create_with_completion also returns the raw completion, which carries token usage.
    result, completion = await client.chat.completions.create_with_completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": text}],
        response_model=StrictUser,
        max_retries=3,
        validation_context={"strict": True},
    )
    print(f"Token usage: prompt={completion.usage.prompt_tokens}, "
          f"completion={completion.usage.completion_tokens}")
    return result

async def instructor_batch_extract(
    texts: list[str],
) -> list[Optional[StrictUser]]:
    client = instructor.from_openai(AsyncOpenAI())
    results: list[Optional[StrictUser]] = []
    for text in texts:
        try:
            result = await client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": text}],
                response_model=StrictUser,
                max_retries=2,
            )
            results.append(result)
        except instructor.exceptions.InstructorRetryException as e:
            # Keep positions aligned with the input by recording None for failures.
            print(f"Batch extraction failed, skipping: {e}")
            results.append(None)
    return results
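Pattern 4: Multi-model structured output adapters
Each LLM vendor exposes a different structured-output API, so an adapter layer is needed to handle them uniformly.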
import json
from abc import ABC, abstractmethod
from typing import Optional, TypeVar
from pydantic import BaseModel
from openai import AsyncOpenAI

T = TypeVar("T", bound=BaseModel)

class StructuredOutputAdapter(ABC):
    @abstractmethod
    async def extract(self, text: str, model_class: type[T]) -> Optional[T]:
        pass

class OpenAIStructuredAdapter(StructuredOutputAdapter):
    def __init__(self, model: str = "gpt-4o"):
        self.client = AsyncOpenAI()
        self.model = model

    async def extract(self, text: str, model_class: type[T]) -> Optional[T]:
        import instructor
        client = instructor.from_openai(self.client, mode=instructor.Mode.JSON_SCHEMA)
        return await client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": text}],
            response_model=model_class,
            max_retries=3,
        )
class AnthropicStructuredAdapter(StructuredOutputAdapter):
    def __init__(self, model: str = "claude-sonnet-4-20250514"):
        try:
            import anthropic
        except ImportError as exc:
            raise ImportError("Please install anthropic: pip install anthropic") from exc
        self.client = anthropic.AsyncAnthropic()
        self.model = model

    async def extract(self, text: str, model_class: type[T]) -> Optional[T]:
        import instructor
        client = instructor.from_anthropic(
            self.client,
            mode=instructor.Mode.ANTHROPIC_TOOLS,
        )
        return await client.chat.completions.create(
            model=self.model,
            max_tokens=4096,  # required by the Anthropic API
            messages=[{"role": "user", "content": text}],
            response_model=model_class,
            max_retries=3,
        )
class GeminiStructuredAdapter(StructuredOutputAdapter):
    def __init__(self, model: str = "gemini-2.0-flash"):
        self.model_name = model

    async def extract(self, text: str, model_class: type[T]) -> Optional[T]:
        import os
        import google.generativeai as genai
        genai.configure(api_key=os.environ.get("GEMINI_API_KEY"))
        model = genai.GenerativeModel(self.model_name)
        schema = model_class.model_json_schema()
        prompt = f"""Extract structured data from the text below.
Return the result strictly according to the JSON Schema, with no extra content.
JSON Schema:
{json.dumps(schema, ensure_ascii=False, indent=2)}
Text:
{text}"""
        response = await model.generate_content_async(prompt)
        raw = response.text.strip()
        # Strip a Markdown code fence if the model added one anyway.
        if raw.startswith("```json"):
            raw = raw[7:]
        if raw.endswith("```"):
            raw = raw[:-3]
        raw = raw.strip()
        try:
            return model_class.model_validate(json.loads(raw))
        except ValueError as e:
            print(f"Gemini parsing failed: {e}")
            return None
class MultiModelStructuredExtractor:
    def __init__(self):
        self.adapters: dict[str, StructuredOutputAdapter] = {}

    def register(self, name: str, adapter: StructuredOutputAdapter):
        self.adapters[name] = adapter

    async def extract(
        self,
        text: str,
        model_class: type[T],
        preferred: str = "openai",
        fallback: bool = True,
    ) -> Optional[T]:
        # Try the preferred adapter first, then the rest in registration order.
        order = [preferred]
        if fallback:
            order.extend([k for k in self.adapters if k != preferred])
        for model_name in order:
            adapter = self.adapters.get(model_name)
            if adapter is None:
                continue
            try:
                result = await adapter.extract(text, model_class)
                if result is not None:
                    return result
            except Exception as e:
                print(f"[{model_name}] extraction failed: {e}")
                continue
        return None

    async def extract_consensus(
        self,
        text: str,
        model_class: type[T],
        min_agreement: int = 2,
    ) -> Optional[T]:
        import asyncio
        from collections import Counter
        names = list(self.adapters)
        results = await asyncio.gather(
            *(self.adapters[name].extract(text, model_class) for name in names),
            return_exceptions=True,
        )
        valid_results = []
        for name, result in zip(names, results):
            if isinstance(result, Exception):
                print(f"[{name}] error: {result}")
                continue
            if result is not None:
                valid_results.append(result)
        if not valid_results:
            return None
        # Group identical outputs by their JSON form and pick the most common one.
        votes = Counter(r.model_dump_json() for r in valid_results)
        winner_json, count = votes.most_common(1)[0]
        if count >= min_agreement:
            return next(r for r in valid_results if r.model_dump_json() == winner_json)
        # No agreement reached: fall back to the first valid result, best effort.
        print(f"No consensus: best agreement was {count} of {len(valid_results)}")
        return valid_results[0]
Multi-model adapter architecture
               ┌───────────────────────────────┐
               │ MultiModelStructuredExtractor │
               │      (unified interface)      │
               └───────────────┬───────────────┘
                               │
         ┌─────────────────────┼─────────────────────┐
         │                     │                     │
┌────────▼────────┐   ┌────────▼────────┐   ┌────────▼────────┐
│ OpenAI Adapter  │   │Anthropic Adapter│   │ Gemini Adapter  │
│ JSON_SCHEMA     │   │ ANTHROPIC_TOOLS │   │ Prompt + parse  │
│ Instructor      │   │ Instructor      │   │ Manual parsing  │
└─────────────────┘   └─────────────────┘   └─────────────────┘
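The fallback order in this architecture can be exercised without API keys by swapping the adapters for stubs. The sketch below is an assumption-laden stand-in: `extract_with_fallback`, `failing_adapter`, and `working_adapter` are hypothetical, and plain dicts replace Pydantic models so that it runs with the standard library only.

```python
import asyncio
from typing import Awaitable, Callable, Optional

Adapter = Callable[[str], Awaitable[Optional[dict]]]

async def extract_with_fallback(
    text: str,
    adapters: dict[str, Adapter],
    preferred: str,
) -> tuple[Optional[str], Optional[dict]]:
    """Try the preferred adapter first, then the others in registration order."""
    order = [preferred] + [name for name in adapters if name != preferred]
    for name in order:
        adapter = adapters.get(name)
        if adapter is None:
            continue
        try:
            result = await adapter(text)
        except Exception as exc:  # one provider failing must not abort the chain
            print(f"[{name}] failed: {exc}")
            continue
        if result is not None:
            return name, result
    return None, None

async def failing_adapter(text: str) -> Optional[dict]:
    raise RuntimeError("rate limited")  # simulated provider outage

async def working_adapter(text: str) -> Optional[dict]:
    return {"summary": text[:10]}

winner, data = asyncio.run(
    extract_with_fallback(
        "hello structured world",
        {"openai": failing_adapter, "anthropic": working_adapter},
        preferred="openai",
    )
)
print(winner, data)
```

Returning the adapter name alongside the data makes it easy to log which provider actually served a request.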
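Pattern 5: Streaming structured output
Combining structured output with streaming lets you parse results in real time and render them progressively.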
import json
import asyncio
from typing import AsyncIterator, Optional
from pydantic import BaseModel, Field
from openai import AsyncOpenAI

class StreamingJsonParser:
    """Emit each complete top-level JSON object as soon as its closing brace arrives."""

    def __init__(self):
        self.buffer = ""
        self.depth = 0
        self.in_string = False
        self.escape_next = False
        self.started = False

    def feed(self, chunk: str) -> list[dict]:
        results = []
        # Buffer character by character so text after a closing brace in the
        # same chunk is never mixed into the object being parsed.
        for char in chunk:
            if not self.started:
                if char != '{':
                    continue  # skip anything before the next top-level object
                self.started = True
                self.depth = 1
                self.buffer = char
                continue
            self.buffer += char
            if self.escape_next:
                self.escape_next = False
                continue
            if self.in_string:
                if char == '\\':
                    self.escape_next = True
                elif char == '"':
                    self.in_string = False
                continue
            if char == '"':
                self.in_string = True
            elif char == '{':
                self.depth += 1
            elif char == '}':
                self.depth -= 1
                if self.depth == 0:
                    try:
                        results.append(json.loads(self.buffer))
                    except json.JSONDecodeError:
                        pass
                    self.buffer = ""
                    self.started = False
        return results
class PartialModelBuilder:
    def __init__(self, model_class: type[BaseModel]):
        self.model_class = model_class
        self.current_json = {}
        self.last_valid = None

    def update(self, json_data: dict) -> Optional[BaseModel]:
        self.current_json.update(json_data)
        try:
            self.last_valid = self.model_class.model_validate(self.current_json)
            return self.last_valid
        except Exception:
            # Not enough fields yet: keep returning the last snapshot that validated.
            return self.last_valid

class StreamingEvent(BaseModel):
    event_type: str = Field(description="Event type")
    data: dict = Field(description="Event data")
    confidence: float = Field(ge=0.0, le=1.0, description="Confidence")

async def stream_structured_output(
    prompt: str,
    model_class: type[BaseModel],
    model: str = "gpt-4o",
) -> AsyncIterator[BaseModel]:
    import instructor
    client = instructor.from_openai(AsyncOpenAI())
    # create_partial on an async client is an async generator: iterate it, do not await it.
    stream = client.chat.completions.create_partial(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        response_model=model_class,
        max_retries=2,
    )
    async for partial in stream:
        yield partial
async def stream_with_raw_parser(
    prompt: str,
    model: str = "gpt-4o",
) -> AsyncIterator[dict]:
    client = AsyncOpenAI()
    schema_prompt = f"""Return the result as JSON. Return JSON only, with no other content.
{prompt}"""
    stream = await client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": schema_prompt}],
        stream=True,
        temperature=0.1,
    )
    parser = StreamingJsonParser()
    full_content = ""
    yielded_any = False
    async for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            full_content += content
            for result in parser.feed(content):
                yielded_any = True
                yield result
    # Fall back to parsing the whole reply only if nothing was emitted while streaming.
    if not yielded_any and full_content:
        try:
            yield json.loads(full_content)
        except json.JSONDecodeError:
            pass
async def stream_sse_structured(
    prompt: str,
    model_class: type[BaseModel],
):
    # Meant to be returned from a FastAPI route handler. The FastAPI app itself
    # should be created once at module level, not inside this function.
    from fastapi.responses import StreamingResponse

    async def generate():
        async for partial in stream_structured_output(prompt, model_class):
            data = partial.model_dump_json(exclude_none=True)
            yield f"data: {data}\n\n"  # one SSE event per partial snapshot

    return StreamingResponse(
        generate(),
        media_type="text/event-stream",
        headers={"X-Accel-Buffering": "no"},  # ask nginx not to buffer the stream
    )
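Streaming structured output architecture
User request
    │
    ▼
FastAPI SSE endpoint
    │
    ▼
Instructor create_partial()
    │
    ├──→ chunk1: {"name": "Product A"...}
    ├──→ chunk2: {"name": "Product A", "price": 99...}
    ├──→ chunk3: {"name": "Product A", "price": 99.0, "category": "electronics"...}
    └──→ final: the complete Pydantic model instance
Each chunk is pushed to the client over SSE
The client renders the UI progressively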
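The wire format in this flow is plain Server-Sent Events: each snapshot becomes a `data:` line followed by a blank line. The following sketch simulates both ends with the standard library; `to_sse_events` and `parse_sse_stream` are hypothetical helpers, and the partial snapshots are hand-written dicts standing in for what `create_partial` would yield.

```python
import json
from typing import Iterable, Iterator

def to_sse_events(partials: Iterable[dict]) -> Iterator[str]:
    """Server side: frame each partial snapshot as one SSE event."""
    for partial in partials:
        # Drop fields that have not been filled in yet.
        payload = {k: v for k, v in partial.items() if v is not None}
        yield f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"

def parse_sse_stream(stream: str) -> list[dict]:
    """Client side: split on blank lines and decode each data payload."""
    events = []
    for block in stream.split("\n\n"):
        if block.startswith("data: "):
            events.append(json.loads(block[len("data: "):]))
    return events

partials = [
    {"name": "Product A", "price": None, "category": None},
    {"name": "Product A", "price": 99.0, "category": None},
    {"name": "Product A", "price": 99.0, "category": "electronics"},
]
wire = "".join(to_sse_events(partials))
snapshots = parse_sse_stream(wire)
print(snapshots[-1])
```

Each snapshot is a superset of the previous one, so the client can simply re-render from the latest event.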
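Pattern 6: Production-grade reliability
This pattern combines the previous ones into a structured-output service that is ready for production use.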
import json
import time
import logging
from typing import Optional
from dataclasses import dataclass, field
from enum import Enum
from pydantic import BaseModel, Field
from openai import AsyncOpenAI

logger = logging.getLogger(__name__)

class OutputStatus(str, Enum):
    SUCCESS = "success"
    VALIDATION_FAILED = "validation_failed"
    PARSE_FAILED = "parse_failed"
    LLM_ERROR = "llm_error"
    TIMEOUT = "timeout"
    RETRY_EXHAUSTED = "retry_exhausted"

@dataclass
class ExtractionResult:
    data: Optional[BaseModel] = None
    status: OutputStatus = OutputStatus.SUCCESS
    attempts: int = 0
    latency_ms: float = 0.0
    error_message: str = ""
    model_used: str = ""
    tokens_used: dict = field(default_factory=dict)

class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        recovery_timeout: float = 60.0,
    ):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time: Optional[float] = None
        self.is_open = False

    def record_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()
        # Open the circuit once consecutive failures reach the threshold.
        if self.failure_count >= self.failure_threshold:
            self.is_open = True

    def record_success(self):
        self.failure_count = 0
        self.is_open = False

    def can_execute(self) -> bool:
        if not self.is_open:
            return True
        # After the recovery timeout, close the circuit and allow traffic again.
        if self.last_failure_time and \
                time.time() - self.last_failure_time > self.recovery_timeout:
            self.is_open = False
            self.failure_count = 0
            return True
        return False
class StructuredOutputService:
    def __init__(
        self,
        max_retries: int = 3,
        timeout: float = 30.0,
        fallback_models: Optional[list[str]] = None,
    ):
        self.client = AsyncOpenAI()
        self.max_retries = max_retries
        self.timeout = timeout
        self.fallback_models = fallback_models or ["gpt-4o", "gpt-4o-mini"]
        self.circuit_breakers: dict[str, CircuitBreaker] = {}

    def _get_breaker(self, model: str) -> CircuitBreaker:
        if model not in self.circuit_breakers:
            self.circuit_breakers[model] = CircuitBreaker()
        return self.circuit_breakers[model]

    async def extract(
        self,
        text: str,
        model_class: type[BaseModel],
        preferred_model: Optional[str] = None,
    ) -> ExtractionResult:
        # A preferred model disables the fallback chain; otherwise try each in order.
        models = [preferred_model] if preferred_model else self.fallback_models
        models = [m for m in models if self._get_breaker(m).can_execute()]
        if not models:
            return ExtractionResult(
                status=OutputStatus.RETRY_EXHAUSTED,
                error_message="All model circuit breakers are open",
            )
        for model in models:
            result = await self._try_extract(text, model_class, model)
            if result.status == OutputStatus.SUCCESS:
                self._get_breaker(model).record_success()
                return result
            self._get_breaker(model).record_failure()
            logger.warning(f"Extraction with model {model} failed: {result.error_message}")
        # Every candidate failed: return the last failure for diagnostics.
        return result

    async def _try_extract(
        self,
        text: str,
        model_class: type[BaseModel],
        model: str,
    ) -> ExtractionResult:
        import instructor
        start_time = time.time()
        client = instructor.from_openai(self.client, mode=instructor.Mode.JSON_SCHEMA)
        # Retries are driven by this loop, so Instructor's own retrying is turned off.
        for attempt in range(1, self.max_retries + 1):
            try:
                result = await client.chat.completions.create(
                    model=model,
                    messages=[{"role": "user", "content": text}],
                    response_model=model_class,
                    max_retries=0,
                    timeout=self.timeout,
                )
                latency = (time.time() - start_time) * 1000
                return ExtractionResult(
                    data=result,
                    status=OutputStatus.SUCCESS,
                    attempts=attempt,
                    latency_ms=latency,
                    model_used=model,
                )
            except instructor.exceptions.InstructorRetryException as e:
                logger.warning(f"Attempt {attempt} failed validation: {e}")
                continue
            except Exception as e:
                error_msg = str(e)
                # Heuristic: treat any error mentioning "timeout" as a timeout and stop retrying.
                if "timeout" in error_msg.lower():
                    return ExtractionResult(
                        status=OutputStatus.TIMEOUT,
                        attempts=attempt,
                        latency_ms=(time.time() - start_time) * 1000,
                        error_message=error_msg,
                        model_used=model,
                    )
                logger.error(f"Attempt {attempt} LLM error: {e}")
                continue
        return ExtractionResult(
            status=OutputStatus.RETRY_EXHAUSTED,
            attempts=self.max_retries,
            latency_ms=(time.time() - start_time) * 1000,
            error_message=f"Still failing after {self.max_retries} attempts",
            model_used=model,
        )
class StructuredOutputCache:
    def __init__(self, ttl: float = 3600.0, max_size: int = 1000):
        self.ttl = ttl
        self.max_size = max_size
        self._cache: dict[str, tuple[float, BaseModel]] = {}

    def _make_key(self, text: str, model_class: type[BaseModel]) -> str:
        import hashlib
        content_hash = hashlib.sha256(text.encode()).hexdigest()[:16]
        return f"{model_class.__name__}:{content_hash}"

    def get(self, text: str, model_class: type[BaseModel]) -> Optional[BaseModel]:
        key = self._make_key(text, model_class)
        if key in self._cache:
            timestamp, data = self._cache[key]
            if time.time() - timestamp < self.ttl:
                return data
            del self._cache[key]  # expired entry
        return None

    def set(self, text: str, model_class: type[BaseModel], data: BaseModel):
        key = self._make_key(text, model_class)
        # Evict the oldest entry only when adding a new key to a full cache.
        if key not in self._cache and len(self._cache) >= self.max_size:
            oldest_key = min(self._cache, key=lambda k: self._cache[k][0])
            del self._cache[oldest_key]
        self._cache[key] = (time.time(), data)
Production architecture overview
               ┌───────────────────────────────┐
               │    StructuredOutputService    │
               │         (single entry)        │
               └───────────────┬───────────────┘
                               │
               ┌───────────────▼───────────────┐
               │        CircuitBreaker         │
               │ (per-model circuit breaking)  │
               └───────────────┬───────────────┘
                               │
         ┌─────────────────────┼─────────────────────┐
         │                     │                     │
┌────────▼────────┐   ┌────────▼────────┐   ┌────────▼────────┐
│ gpt-4o          │   │ gpt-4o-mini     │   │ fallback        │
│ JSON_SCHEMA     │   │ JSON_SCHEMA     │   │ Prompt + parse  │
│ + Instructor    │   │ + Instructor    │   │ + retries       │
└────────┬────────┘   └────────┬────────┘   └────────┬────────┘
         │                     │                     │
         └─────────────────────┼─────────────────────┘
                               │
               ┌───────────────▼───────────────┐
               │     StructuredOutputCache     │
               │        (result cache)         │
               └───────────────────────────────┘
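The diagram places a result cache next to the extraction path; one plausible way to wire it is to consult the cache before calling any model. The sketch below illustrates that order with the standard library only. `TinyCache`, `cached_extract`, and `fake_extract` are hypothetical stand-ins, not the classes defined above, and the injectable `clock` parameter is an assumption added to make expiry easy to test.

```python
import hashlib
import time
from typing import Callable, Optional

class TinyCache:
    """Minimal TTL cache keyed by schema name and a hash of the input text."""

    def __init__(self, ttl: float = 3600.0, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store: dict[str, tuple[float, dict]] = {}

    @staticmethod
    def key(schema_name: str, text: str) -> str:
        return f"{schema_name}:{hashlib.sha256(text.encode()).hexdigest()}"

    def get(self, key: str) -> Optional[dict]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at >= self.ttl:
            del self._store[key]  # expired
            return None
        return value

    def set(self, key: str, value: dict) -> None:
        self._store[key] = (self.clock(), value)

def cached_extract(text: str, schema_name: str,
                   extract: Callable[[str], dict], cache: TinyCache) -> dict:
    """Return a cached result when available, otherwise extract and store it."""
    key = cache.key(schema_name, text)
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = extract(text)
    cache.set(key, value)
    return value

calls = []
def fake_extract(text: str) -> dict:
    calls.append(text)  # count how often the "model" is actually called
    return {"length": len(text)}

cache = TinyCache(ttl=60.0)
first = cached_extract("same input", "Demo", fake_extract, cache)
second = cached_extract("same input", "Demo", fake_extract, cache)
print(first, second, len(calls))
```

Including the schema name in the key keeps results for different response models from colliding on the same input text.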
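5 common pitfalls and how to fix them
Pitfall 1: the LLM returns JSON with comments
import json
import re

# Match string literals first so that "//" or "/*" inside a string
# (for example in a URL) is preserved rather than treated as a comment.
_JSON_STRING_OR_COMMENT = re.compile(
    r'"(?:\\.|[^"\\])*"|//[^\n]*|/\*.*?\*/',
    re.DOTALL,
)

def strip_json_comments(text: str) -> str:
    def keep_strings(match: re.Match) -> str:
        token = match.group(0)
        return token if token.startswith('"') else ''
    return _JSON_STRING_OR_COMMENT.sub(keep_strings, text)

raw = '''{
    "name": "Product A", // this is a comment
    "price": 99.0
    /* multi-line
    comment */
}'''
clean = strip_json_comments(raw)
data = json.loads(clean)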
Pitfall 2: nested schemas cause truncated output
from pydantic import BaseModel, Field

class Address(BaseModel):
    street: str
    city: str
    zip_code: str

class PersonFlat(BaseModel):
    name: str
    street: str = Field(description="Street address")
    city: str = Field(description="City")
    zip_code: str = Field(description="Postal code")

class PersonNested(BaseModel):
    name: str
    address: Address

# Recommendation: keep nesting to at most 2 levels and flatten anything deeper
# Not recommended: Person → Address → GeoLocation → Coordinates
# Recommended: PersonFlat (all fields at the same level)
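If downstream code still expects the nested shape, a flat extraction result can be regrouped after validation. This is a minimal sketch on plain dicts; the `nest` helper and the grouping map are hypothetical, and with Pydantic you would apply it to the output of `model_dump()`.

```python
def nest(flat: dict, groups: dict[str, list[str]]) -> dict:
    """Move the listed flat keys under a parent key, leaving other keys untouched."""
    grouped = {name for fields in groups.values() for name in fields}
    nested = {key: value for key, value in flat.items() if key not in grouped}
    for parent, fields in groups.items():
        nested[parent] = {name: flat[name] for name in fields if name in flat}
    return nested

flat_person = {"name": "Ada", "street": "1 Main St", "city": "Taipei", "zip_code": "100"}
person = nest(flat_person, {"address": ["street", "city", "zip_code"]})
print(person)
```

An explicit grouping map is used instead of splitting on underscores because field names such as `zip_code` contain the separator themselves.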
Pitfall 3: the LLM does not respect enum values
from enum import Enum
from pydantic import BaseModel, Field, field_validator

class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"

class ReviewWithEnum(BaseModel):
    text: str
    sentiment: Sentiment

    # mode="before" runs ahead of enum validation, so synonyms can be mapped first.
    @field_validator("sentiment", mode="before")
    @classmethod
    def normalize_sentiment(cls, v):
        if isinstance(v, str):
            v = v.strip().lower()
            # Chinese synonyms the model tends to return instead of the enum values.
            mapping = {
                "積極": "positive", "正面": "positive", "好": "positive",
                "消極": "negative", "負面": "negative", "差": "negative",
                "中性": "neutral", "一般": "neutral",
            }
            return mapping.get(v, v)
        return v
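Pitfall 4: list length cannot be controlled
from pydantic import BaseModel, Field, field_validator

class TaggedContent(BaseModel):
    content: str
    tags: list[str] = Field(min_length=1, max_length=5)

    # mode="before" runs ahead of the min_length/max_length checks, so an
    # over-long list is deduplicated and truncated instead of being rejected.
    @field_validator("tags", mode="before")
    @classmethod
    def deduplicate_tags(cls, v):
        if not isinstance(v, list) or not all(isinstance(tag, str) for tag in v):
            return v  # let Pydantic report the type error
        seen = set()
        result = []
        for tag in v:
            normalized = tag.strip().lower()
            if normalized not in seen:
                seen.add(normalized)
                result.append(tag.strip())
        return result[:5]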
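Pitfall 5: Strict Mode does not support every Schema feature
from pydantic import BaseModel, ConfigDict, Field
# Strict Mode constraints to keep in mind:
# 1. "additionalProperties": false must be set explicitly on every object
# 2. Every field must be listed in "required"; model an optional value as a
#    nullable type (for example Optional[str] with no default) rather than a default
# 3. Union types are not supported by some models
# 4. Complex regex patterns may not be supported
# Workaround: simplify the schema and compensate with field_validator

class SimpleProduct(BaseModel):
    model_config = ConfigDict(extra="forbid")  # emits "additionalProperties": false
    name: str
    price: float = Field(gt=0)
    category: str
    tags: list[str]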
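Troubleshooting 10 common errors
| # | Error message | Cause | Fix |
|---|---|---|---|
| 1 | json.decoder.JSONDecodeError | The LLM did not return valid JSON | Use extract_json_from_response for tolerant extraction |
| 2 | ValidationError: field required | The LLM omitted a required field | Add a default value or rely on Instructor's automatic retries |
| 3 | InstructorRetryException: max retries | Validation still fails after 3 retries | Check whether the schema is too complex; simplify nesting |
| 4 | TypeError: 'NoneType' object | tool_calls is empty because the LLM did not call the function | Check the tool_choice setting and confirm the model supports function calling |
| 5 | RateLimitError: 429 | API rate limit exceeded | Add retries with exponential backoff and lower concurrency |
| 6 | Timeout: request timed out | LLM inference timed out | Reduce schema complexity and raise the timeout parameter |
| 7 | BadRequestError: Invalid schema | The schema does not meet the model's requirements | Check the strict mode restrictions and simplify the schema |
| 8 | ValidationError: string too long | The LLM returned a string longer than max_length | State the length limit in the field description and let retries feed the error back to the model |
| 9 | KeyError: 'tool_calls' | The model does not support function calling | Switch to JSON Schema mode or prompt mode |
| 10 | RecursionError: maximum depth | The schema is nested too deeply | Flatten the structure to at most 2 levels |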
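For row 5, the usual remedy is exponential backoff with jitter. The sketch below shows the idea with the standard library; `RateLimited` is a hypothetical stand-in for a provider's 429 exception, `with_backoff` is an illustrative helper rather than any SDK's API, and the `sleep` parameter is injectable only so the demo does not actually wait.

```python
import asyncio
import random

class RateLimited(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""

async def with_backoff(func, max_attempts: int = 4, base_delay: float = 0.5,
                       max_delay: float = 8.0, sleep=asyncio.sleep):
    """Retry an async call on RateLimited, doubling the delay cap each attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await func()
        except RateLimited:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: wait a random time up to the exponential cap.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            await sleep(random.uniform(0, cap))

state = {"attempts": 0}

async def flaky_call() -> dict:
    state["attempts"] += 1
    if state["attempts"] < 3:
        raise RateLimited("429")  # fail the first two calls
    return {"ok": True}

async def no_sleep(delay: float) -> None:
    pass  # skip real waiting in the demo

result = asyncio.run(with_backoff(flaky_call, sleep=no_sleep))
print(result, state["attempts"])
```

Jitter spreads retries from concurrent workers over time, which helps avoid hitting the limit again in lockstep.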
Advanced optimization tips
Tip 1: few-shot examples improve accuracy
from pydantic import BaseModel, Field
from openai import AsyncOpenAI
import instructor

class Classification(BaseModel):
    category: str = Field(description="Category")
    confidence: float = Field(ge=0.0, le=1.0)

async def few_shot_extract(text: str) -> Classification:
    client = instructor.from_openai(AsyncOpenAI())
    # Example input/output pairs placed before the real input.
    examples = [
        {"role": "user", "content": "This product is fantastic, highly recommended!"},
        {"role": "assistant", "content": '{"category": "positive", "confidence": 0.95}'},
        {"role": "user", "content": "Average quality, and the price is on the high side"},
        {"role": "assistant", "content": '{"category": "neutral", "confidence": 0.7}'},
    ]
    return await client.chat.completions.create(
        model="gpt-4o",
        messages=examples + [{"role": "user", "content": text}],
        response_model=Classification,
        max_retries=2,
    )
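Tip 2: optimize schema descriptions
from pydantic import BaseModel, Field

# Bare field names give the model nothing to go on.
class BadSchema(BaseModel):
    type: str
    value: str

# Descriptions spell out the allowed values and the expected format.
class GoodSchema(BaseModel):
    type: str = Field(
        description="Entity type, one of: person, organization, location, date"
    )
    value: str = Field(
        description="Normalized entity value. Use the full name for person, the official name "
                    "for organization, YYYY-MM-DD for date, and city + country for location"
    )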
Tip 3: extract complex structures in steps
from pydantic import BaseModel, Field
from openai import AsyncOpenAI
import instructor

class BasicInfo(BaseModel):
    title: str
    summary: str

class DetailedInfo(BasicInfo):
    key_points: list[str]
    entities: list[str]
    sentiment: str

async def progressive_extract(text: str) -> DetailedInfo:
    client = instructor.from_openai(AsyncOpenAI())
    # Step 1: extract the small, easy schema.
    basic = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Extract the basic information:\n{text}"}],
        response_model=BasicInfo,
        max_retries=2,
    )
    # Step 2: reuse the step 1 result as context for the larger schema.
    detailed = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": f"Based on the basic information below, extract a detailed analysis:\n"
                                        f"Title: {basic.title}\nSummary: {basic.summary}\n\nOriginal text:\n{text}"}
        ],
        response_model=DetailedInfo,
        max_retries=2,
    )
    return detailed
Tip 4: self-checking output quality
from pydantic import BaseModel, Field, model_validator

class SelfValidatingOutput(BaseModel):
    question: str
    answer: str
    sources: list[str] = Field(min_length=1)
    confidence: float = Field(ge=0.0, le=1.0)

    # Cross-field checks run after the individual fields have validated.
    @model_validator(mode="after")
    def check_answer_quality(self):
        if len(self.answer) < 10:
            raise ValueError("The answer is too short and may be incomplete")
        if self.confidence > 0.9 and len(self.sources) < 2:
            raise ValueError("High confidence but too few sources, please verify again")
        return self
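Comparison: 3 structured-output approaches
| Dimension | Prompt + JSON parsing | Function-calling protocol | Instructor library |
|---|---|---|---|
| Reliability (rough estimate) | ⭐⭐ 60-80% | ⭐⭐⭐⭐ 90-95% | ⭐⭐⭐⭐⭐ 95-99% |
| Implementation complexity | Low | Medium | Low (once wrapped) |
| Model compatibility | All models | OpenAI and some other models | OpenAI/Anthropic/Gemini |
| Automatic retries | ❌ Manual | ❌ Manual | ✅ Built in |
| Streaming support | ❌ Difficult | ⚠️ Limited | ✅ create_partial |
| Schema validation | ❌ Manual | ⚠️ Partial | ✅ Automatic via Pydantic |
| Debugging difficulty | High | Medium | Low |
| Production recommendation | ⭐ Not recommended | ⭐⭐⭐ Recommended | ⭐⭐⭐⭐⭐ Strongly recommended |
| Token overhead | Low | Medium (+ tool definitions) | Medium (+ tool definitions) |
| Nesting depth | Unlimited | Limited | Limited |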
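Selection decision tree
Do you need structured output?
├── No → use Chat Completion directly
└── Yes → which models are you using?
    ├── OpenAI only → Instructor + Mode.JSON_SCHEMA
    ├── OpenAI + Anthropic → Instructor + adapter pattern
    ├── Any model → Prompt + JSON parsing + strict validation
    └── Streaming required → Instructor + create_partial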
Recommended online tools
- JSON formatting and validation: /zh-TW/json/format
- JSONPath queries: /zh-TW/json/jsonpath
- cURL to code: /zh-TW/dev/curl-to-code
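Summary: structured output is core infrastructure for AI engineering with Python and LLMs. The six patterns run from simple to complex: JSON Schema constraints → function-calling protocol → Instructor automatic retries → multi-model adapters → streaming structured output → production reliability safeguards. For production, the Instructor library is the first choice; combined with Pydantic validation and automatic retries, it reaches a reliability above 95% by the author's estimate. Key points to remember: 1) keep nested schemas to at most 2 levels, 2) normalize enum values with field_validator, 3) protect downstream models with circuit breakers, 4) use caching to cut repeated calls. For multi-model setups, use the adapter pattern to unify the interface; for streaming, use create_partial for progressive output.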