Python Pydantic V2 Data Validation: 7 Production Patterns from Model Design to Custom Validators
Pydantic V2: Still Writing if-else for Data Validation?
Missing field checks on API parameters, dirty data written to the database, unexpected None values from config parsing: these production incidents all trace back to lax data validation. Hand-written if-else checks are verbose and error-prone; V1's @validator is deprecated in V2, and much V1-era code breaks on upgrade; model_config is easy to set while still getting serialization output you did not expect. As of 2026, Pydantic V2 has replaced V1 as the maintained line, and its Rust-based core is reported to validate 5-50x faster, but the API changes are extensive and migration is full of pitfalls.
This article covers 7 production patterns, guiding you through basic model → field validation → custom validators → serialization → JSON Schema → performance optimization → FastAPI integration with complete code and pitfall guides.
Pydantic V2 Core Concepts
| Concept | Description |
|---|---|
| BaseModel | Pydantic's core class for defining data models with automatic validation |
| Field | Field configuration supporting defaults, descriptions, and constraints |
| field_validator | V2's new field validator, replacing V1's @validator |
| model_validator | Model-level validator for cross-field validation |
| model_config | Model configuration controlling serialization, strict mode, etc. |
| TypeAdapter | Validation adapter for non-BaseModel types |
| JSON Schema | Auto-generated JSON Schema from models for API documentation |
| Serialization | Serialization control with exclude, alias, and custom serializers |
Problem Analysis: 5 Major Data Validation Pain Points
- Hand-written validation is verbose and error-prone: Writing if-else for every API endpoint, missing fields causes bugs, high maintenance cost
- V1 to V2 API incompatibility: @validator becomes @field_validator, the Config class becomes model_config, lots of code needs changes
- Nested model serialization out of control: Circular references when converting ORM objects to JSON, sensitive field leaks, field names not matching frontend conventions
- Cross-field validation is hard: Password confirmation, date ranges, conditional required fields need multi-field validation
- Performance bottlenecks: V1 is slow with large data volumes; V2 is faster but misconfiguration can make it slower
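To make the first pain point concrete, here is a minimal sketch that contrasts hand-written checks with an equivalent declarative model. The names `validate_signup_manually`, `validate_signup_with_model`, and `Signup` are hypothetical and exist only for this illustration.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


def validate_signup_manually(payload: dict) -> list[str]:
    """Hand-written checks: every rule is a separate branch to maintain."""
    errors = []
    username = payload.get("username")
    if not isinstance(username, str):
        errors.append("username: must be a string")
    elif not 3 <= len(username) <= 20:
        errors.append("username: length must be 3-20")
    age = payload.get("age")
    if age is not None and (not isinstance(age, int) or not 0 <= age <= 150):
        errors.append("age: must be an integer in 0-150")
    return errors


class Signup(BaseModel):
    """The same rules expressed declaratively (hypothetical model)."""

    username: str = Field(min_length=3, max_length=20)
    age: Optional[int] = Field(default=None, ge=0, le=150)


def validate_signup_with_model(payload: dict) -> list[str]:
    try:
        Signup.model_validate(payload)
    except ValidationError as exc:
        # Each error carries the field location and a machine-readable type
        return [f"{err['loc'][0]}: {err['type']}" for err in exc.errors()]
    return []


bad_payload = {"username": "ab", "age": 200}
manual_errors = validate_signup_manually(bad_payload)
model_errors = validate_signup_with_model(bad_payload)
print(manual_errors)
print(model_errors)
```

Both functions flag the same two problems, but the model version grows by one line per new rule, while the manual version grows by one branch.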
Step-by-Step: 7 Pydantic V2 Production Patterns
Pattern 1: Basic Model Design and Field Constraints
from pydantic import BaseModel, Field, EmailStr
from typing import Optional
from datetime import datetime
from enum import Enum

class UserStatus(str, Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    SUSPENDED = "suspended"

class UserCreate(BaseModel):
    model_config = {"str_strip_whitespace": True, "str_min_length": 1}

    username: str = Field(
        min_length=3,
        max_length=20,
        pattern=r"^[a-zA-Z0-9_]+$",
        description="Username, 3-20 alphanumeric characters and underscores"
    )
    email: EmailStr = Field(description="Email address")
    password: str = Field(
        min_length=8,
        max_length=128,
        description="Password, 8-128 characters"
    )
    age: Optional[int] = Field(
        default=None,
        ge=0,
        le=150,
        description="Age, 0-150"
    )
    status: UserStatus = Field(default=UserStatus.ACTIVE)
    created_at: datetime = Field(default_factory=datetime.now)

class UserResponse(BaseModel):
    id: int = Field(gt=0)
    username: str
    email: EmailStr
    status: UserStatus
    created_at: datetime

user = UserCreate(
    username="zhang_san",
    email="zhang@example.com",
    password="secureP@ss123",
    age=28
)
print(user.model_dump())
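As a quick check of how these constraints behave on bad input, the sketch below uses a trimmed-down, hypothetical `UserCreateLite` model (no EmailStr, so it runs without the optional email-validator dependency) and inspects the error types that come back.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class UserCreateLite(BaseModel):
    """Hypothetical cut-down version of UserCreate for illustration."""

    model_config = ConfigDict(str_strip_whitespace=True)

    username: str = Field(min_length=3, max_length=20, pattern=r"^[a-zA-Z0-9_]+$")
    age: Optional[int] = Field(default=None, ge=0, le=150)


# Whitespace is stripped before the length and pattern checks run,
# and the numeric string is coerced to int in the default (lax) mode
ok = UserCreateLite(username="  zhang_san  ", age="28")
print(ok.username, ok.age)

failed = {}
try:
    UserCreateLite(username="a b", age=-1)
except ValidationError as exc:
    # Map each failing field to its machine-readable error type
    failed = {err["loc"][0]: err["type"] for err in exc.errors()}
print(failed)
```

Note that all failing fields are reported in a single ValidationError rather than stopping at the first one.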
Pattern 2: Field-Level Validators with field_validator
from pydantic import BaseModel, Field, field_validator
import re

class RegisterRequest(BaseModel):
    username: str = Field(min_length=3, max_length=20)
    password: str = Field(min_length=8)
    confirm_password: str

    @field_validator("username")
    @classmethod
    def username_must_be_valid(cls, v: str) -> str:
        if not re.match(r"^[a-zA-Z0-9_]+$", v):
            raise ValueError("Username can only contain letters, numbers, and underscores")
        if v.startswith("_"):
            raise ValueError("Username cannot start with an underscore")
        return v.lower()

    @field_validator("password")
    @classmethod
    def password_strength_check(cls, v: str) -> str:
        if not re.search(r"[A-Z]", v):
            raise ValueError("Password must contain at least one uppercase letter")
        if not re.search(r"[a-z]", v):
            raise ValueError("Password must contain at least one lowercase letter")
        if not re.search(r"\d", v):
            raise ValueError("Password must contain at least one digit")
        if not re.search(r"[!@#$%^&*(),.?\":{}|<>]", v):
            raise ValueError("Password must contain at least one special character")
        return v

class ProductCreate(BaseModel):
    name: str = Field(min_length=1, max_length=200)
    price: float = Field(gt=0)
    tags: list[str] = Field(default_factory=list)

    @field_validator("tags")
    @classmethod
    def tags_deduplicate(cls, v: list[str]) -> list[str]:
        seen = set()
        result = []
        for tag in v:
            tag_lower = tag.lower().strip()
            if tag_lower and tag_lower not in seen:
                seen.add(tag_lower)
                result.append(tag_lower)
        return result

    @field_validator("price")
    @classmethod
    def price_round_to_cents(cls, v: float) -> float:
        return round(v, 2)
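The validators above all run in the default "after" mode, that is, once the value already has its declared type. A possible variation, sketched here with a hypothetical `TaggedItem` model, adds a `mode="before"` validator that reshapes raw input before type validation happens.

```python
from pydantic import BaseModel, Field, field_validator


class TaggedItem(BaseModel):
    """Hypothetical model showing before/after validators on one field."""

    tags: list[str] = Field(default_factory=list)

    @field_validator("tags", mode="before")
    @classmethod
    def split_comma_separated(cls, v):
        # Runs on the raw input, before list[str] validation
        if isinstance(v, str):
            return v.split(",")
        return v

    @field_validator("tags")
    @classmethod
    def normalize_tags(cls, v: list[str]) -> list[str]:
        # Runs after type validation (mode="after" is the default)
        seen: set[str] = set()
        result = []
        for tag in v:
            cleaned = tag.strip().lower()
            if cleaned and cleaned not in seen:
                seen.add(cleaned)
                result.append(cleaned)
        return result


item = TaggedItem(tags="Python, pydantic ,PYTHON,,")
print(item.tags)
```

The before-validator accepts a looser input shape, while the after-validator can rely on already receiving a list of strings.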
Pattern 3: Model-Level Validators with model_validator
from pydantic import BaseModel, Field, model_validator
from datetime import date, datetime  # datetime is needed by EventCreate below
from typing import Optional

class DateRangeQuery(BaseModel):
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def validate_date_range(self) -> "DateRangeQuery":
        if self.start_date > self.end_date:
            raise ValueError("Start date cannot be after end date")
        if (self.end_date - self.start_date).days > 365:
            raise ValueError("Query range cannot exceed 365 days")
        return self

class EventCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    event_type: str
    start_time: datetime
    end_time: Optional[datetime] = None
    location: Optional[str] = None
    online_url: Optional[str] = None

    @model_validator(mode="after")
    def validate_event(self) -> "EventCreate":
        # Conditional required fields depend on the event type
        if self.event_type == "offline" and not self.location:
            raise ValueError("Offline events must have a location")
        if self.event_type == "online" and not self.online_url:
            raise ValueError("Online events must have a URL")
        if self.event_type == "hybrid":
            if not self.location:
                raise ValueError("Hybrid events must have an offline location")
            if not self.online_url:
                raise ValueError("Hybrid events must have an online URL")
        if self.end_time and self.start_time >= self.end_time:
            raise ValueError("End time must be after start time")
        return self

class PasswordChange(BaseModel):
    old_password: str = Field(min_length=1)
    new_password: str = Field(min_length=8)
    confirm_password: str

    @model_validator(mode="after")
    def passwords_match(self) -> "PasswordChange":
        if self.new_password != self.confirm_password:
            raise ValueError("New passwords do not match")
        if self.old_password == self.new_password:
            raise ValueError("New password cannot be the same as old password")
        return self
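When a model-level validator fails, the resulting error is attached to the model as a whole rather than to a single field. The following sketch, using a hypothetical `DateRange` model, shows what a caller sees in that case.

```python
from datetime import date

from pydantic import BaseModel, ValidationError, model_validator


class DateRange(BaseModel):
    """Hypothetical two-field model with one cross-field rule."""

    start_date: date
    end_date: date

    @model_validator(mode="after")
    def check_order(self) -> "DateRange":
        if self.start_date > self.end_date:
            raise ValueError("Start date cannot be after end date")
        return self


first = None
try:
    DateRange(start_date="2026-03-01", end_date="2026-01-01")
except ValidationError as exc:
    first = exc.errors()[0]
    # loc is an empty tuple because the error belongs to the whole model
    print(first["loc"], first["type"], first["msg"])
```

An API layer that maps errors back to form fields therefore needs a fallback for errors with an empty location.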
Pattern 4: Serialization Control and Aliases
from pydantic import BaseModel, Field, ConfigDict
from datetime import datetime  # required by the created_at / updated_at fields
from typing import Optional

class UserORM(BaseModel):
    model_config = ConfigDict(
        from_attributes=True,
        populate_by_name=True,
    )

    id: int
    username: str = Field(alias="user_name")
    email: str = Field(alias="email_address")
    hashed_password: str = Field(exclude=True)
    phone: Optional[str] = Field(default=None, exclude=True)
    avatar_url: Optional[str] = Field(default=None, serialization_alias="avatar")
    created_at: datetime
    updated_at: Optional[datetime] = None

class ArticleResponse(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    id: int
    title: str
    content: str = Field(exclude=True)
    summary: Optional[str] = None
    author_id: int = Field(serialization_alias="authorId")
    tags: list[str] = Field(default_factory=list)
    view_count: int = Field(default=0, serialization_alias="viewCount")
    created_at: datetime = Field(serialization_alias="createdAt")
    updated_at: Optional[datetime] = Field(default=None, serialization_alias="updatedAt")

    def get_summary(self) -> str:
        if self.summary:
            return self.summary
        return self.content[:200] + "..." if len(self.content) > 200 else self.content

article = ArticleResponse(
    id=1,
    title="Pydantic V2 Practical Guide",
    content="This is a very long article content..." * 50,
    author_id=42,
    tags=["Python", "Pydantic"],
    view_count=1024,
    created_at=datetime.now()
)
# by_alias=True emits the camelCase serialization aliases; content is excluded
print(article.model_dump(by_alias=True))
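To see `from_attributes`, aliases, and `exclude` working together, here is a small sketch. `AccountRow` is a hypothetical stand-in for an ORM row and `AccountOut` a hypothetical response model; neither comes from the article's codebase.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class AccountRow:
    """Plain object standing in for an ORM row (hypothetical)."""

    def __init__(self) -> None:
        self.id = 7
        self.user_name = "alice"
        self.hashed_password = "not-a-real-hash"
        self.avatar_url = "https://example.com/a.png"


class AccountOut(BaseModel):
    model_config = ConfigDict(from_attributes=True, populate_by_name=True)

    id: int
    # alias maps the ORM attribute name onto the model field
    username: str = Field(alias="user_name")
    # exclude=True keeps the value on the instance but out of dumps
    hashed_password: str = Field(exclude=True)
    # serialization_alias only affects output when by_alias=True
    avatar_url: Optional[str] = Field(default=None, serialization_alias="avatar")


account = AccountOut.model_validate(AccountRow())
plain = account.model_dump()
aliased = account.model_dump(by_alias=True)
print(plain)
print(aliased)
```

The excluded field is missing from both dumps, while the key names change only in the `by_alias=True` output.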
Pattern 5: JSON Schema Generation and API Documentation
from pydantic import BaseModel, Field
from typing import Optional  # required by the Optional[...] fields below
import json

class APIRequest(BaseModel):
    """Create order request"""
    product_id: int = Field(gt=0, description="Product ID")
    quantity: int = Field(ge=1, le=999, description="Purchase quantity")
    coupon_code: Optional[str] = Field(default=None, pattern=r"^[A-Z0-9]{6,12}$", description="Coupon code")
    shipping_address: str = Field(min_length=5, max_length=500, description="Shipping address")
    remark: Optional[str] = Field(default=None, max_length=200, description="Order remark")

class APIResponse(BaseModel):
    """Create order response"""
    order_id: str = Field(description="Order ID")
    total_amount: float = Field(description="Total amount")
    discount_amount: float = Field(default=0.0, description="Discount amount")
    final_amount: float = Field(description="Final amount to pay")
    status: str = Field(description="Order status")

# Constraints and descriptions from Field() appear in the generated schema
schema = APIRequest.model_json_schema()
print(json.dumps(schema, indent=2, ensure_ascii=False))

class NestedModel(BaseModel):
    tag_name: str
    tag_value: str

class ComplexRequest(BaseModel):
    name: str
    items: list[NestedModel]
    metadata: dict[str, str]

# Nested models are emitted under "$defs" and referenced with "$ref"
complex_schema = ComplexRequest.model_json_schema()
print(json.dumps(complex_schema, indent=2, ensure_ascii=False))
Pattern 6: TypeAdapter and Generic Validation
from pydantic import BaseModel, TypeAdapter, Field
from typing import Generic, TypeVar, Optional

T = TypeVar("T")

class PageResponse(BaseModel, Generic[T]):
    items: list[T]
    total: int = Field(ge=0)
    page: int = Field(ge=1)
    page_size: int = Field(ge=1, le=100)
    has_next: bool

class UserItem(BaseModel):
    id: int
    username: str
    email: str

user_page_type = PageResponse[UserItem]
adapter = TypeAdapter(user_page_type)

json_data = {
    "items": [
        {"id": 1, "username": "alice", "email": "alice@example.com"},
        {"id": 2, "username": "bob", "email": "bob@example.com"},
    ],
    "total": 100,
    "page": 1,
    "page_size": 10,
    "has_next": True
}
page = adapter.validate_python(json_data)
print(page.model_dump())

raw_list_adapter = TypeAdapter(list[int])
result = raw_list_adapter.validate_python(["1", "2", "3"])
print(result)

config_adapter = TypeAdapter(dict[str, int])
config = config_adapter.validate_python({"timeout": "30", "retries": "3"})
print(config)
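Beyond `validate_python`, a TypeAdapter can also parse JSON text directly, serialize back to JSON, and emit a JSON Schema for a type that is not a BaseModel. A minimal sketch, with illustrative variable names:

```python
from pydantic import TypeAdapter, ValidationError

int_list_adapter = TypeAdapter(list[int])

# Parse and validate JSON text in one step
parsed = int_list_adapter.validate_json("[1, 2, 3]")

# Serialize back to compact JSON bytes
encoded = int_list_adapter.dump_json(parsed)

# JSON Schema for a plain type, no BaseModel required
list_schema = int_list_adapter.json_schema()

bad_index = None
try:
    int_list_adapter.validate_python(["1", "x"])
except ValidationError as exc:
    # loc points at the offending list index
    bad_index = exc.errors()[0]["loc"]

print(parsed, encoded, list_schema, bad_index)
```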
Pattern 7: FastAPI Integration Production Practice
from datetime import datetime  # used for created_at below
from fastapi import FastAPI, HTTPException, Depends, Query
from pydantic import BaseModel, Field, field_validator, model_validator
from typing import Optional

# PageResponse is the generic pagination model defined in Pattern 6

app = FastAPI(title="User Management API")

class UserCreateRequest(BaseModel):
    username: str = Field(min_length=3, max_length=20, pattern=r"^[a-zA-Z0-9_]+$")
    email: str = Field(pattern=r"^[\w.-]+@[\w.-]+\.\w+$")
    password: str = Field(min_length=8, max_length=128)
    role: str = Field(default="user", pattern=r"^(admin|user|guest)$")

    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        has_upper = any(c.isupper() for c in v)
        has_lower = any(c.islower() for c in v)
        has_digit = any(c.isdigit() for c in v)
        if not (has_upper and has_lower and has_digit):
            raise ValueError("Password must contain uppercase, lowercase, and digit")
        return v

class UserUpdateRequest(BaseModel):
    email: Optional[str] = None
    role: Optional[str] = None
    status: Optional[str] = None

    @model_validator(mode="after")
    def at_least_one_field(self) -> "UserUpdateRequest":
        if self.email is None and self.role is None and self.status is None:
            raise ValueError("At least one field must be updated")
        return self

class UserDetailResponse(BaseModel):
    id: int
    username: str
    email: str
    role: str
    status: str
    created_at: datetime

class ErrorResponse(BaseModel):
    error_code: int
    message: str
    detail: Optional[str] = None

@app.post("/users", response_model=UserDetailResponse, responses={400: {"model": ErrorResponse}})
async def create_user(req: UserCreateRequest):
    # response_model filters the output, so password is not returned
    user_data = req.model_dump()
    user_data["id"] = 1
    user_data["status"] = "active"
    user_data["created_at"] = datetime.now()
    return user_data

@app.patch("/users/{user_id}", response_model=UserDetailResponse)
async def update_user(user_id: int, req: UserUpdateRequest):
    update_data = req.model_dump(exclude_none=True)
    if not update_data:
        raise HTTPException(status_code=400, detail="No fields to update")
    return {"id": user_id, "username": "test", "email": "test@example.com", "role": "user", "status": "active", "created_at": datetime.now()}

@app.get("/users", response_model=PageResponse[UserDetailResponse])
async def list_users(
    page: int = Query(ge=1, default=1),
    page_size: int = Query(ge=1, le=100, default=20),
    role: Optional[str] = Query(default=None, pattern=r"^(admin|user|guest)$"),
):
    return {
        "items": [],
        "total": 0,
        "page": page,
        "page_size": page_size,
        "has_next": False
    }
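The PATCH handler above builds its update payload with `exclude_none=True`. One design choice worth weighing is `exclude_unset=True`, which distinguishes "field omitted" from "field explicitly set to null". The sketch below illustrates the difference with a hypothetical `UserPatch` model, using Pydantic only so it runs without FastAPI.

```python
from typing import Optional

from pydantic import BaseModel


class UserPatch(BaseModel):
    """Hypothetical partial-update payload."""

    email: Optional[str] = None
    status: Optional[str] = None


# The client explicitly sends status=null and omits email
patch = UserPatch.model_validate({"status": None})

without_none = patch.model_dump(exclude_none=True)
without_unset = patch.model_dump(exclude_unset=True)

print(without_none)            # the explicit null is dropped
print(without_unset)           # the explicit null is preserved
print(patch.model_fields_set)  # names of the fields the client actually sent
```

If clients need to clear a column by sending null, `exclude_unset=True` keeps that intent, whereas `exclude_none=True` discards it.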
Pitfall Guide
Pitfall 1: Renaming V1's @validator to @field_validator Without Proper Changes
# ❌ Wrong: V1 style with only the decorator renamed; no @classmethod and no cls parameter
from pydantic import BaseModel, field_validator

class Bad(BaseModel):
    name: str

    @field_validator("name")
    def validate_name(v):
        return v.upper()

# ✅ Correct: V2 validators are class methods taking (cls, value); mode is optional and defaults to "after"
class Good(BaseModel):
    name: str

    @field_validator("name")
    @classmethod
    def validate_name(cls, v: str) -> str:
        return v.upper()
Pitfall 2: Writing model_config as Inner Class
# ❌ Wrong: V1's Config inner class, deprecated in V2
class OldWay(BaseModel):
    name: str

    class Config:
        orm_mode = True

# ✅ Correct: V2 uses model_config dictionary
class NewWay(BaseModel):
    model_config = {"from_attributes": True}
    name: str

# ✅ Better: Use ConfigDict for type hints
from pydantic import ConfigDict

class BestWay(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    name: str
Pitfall 3: Misreading What Field(exclude=True) Does
class User(BaseModel):
    id: int
    name: str
    password: str = Field(exclude=True)

user = User(id=1, name="test", password="secret")

# ✅ model_dump() honors Field(exclude=True) by default; mode="python" is already the default mode
print(user.model_dump())
# {'id': 1, 'name': 'test'}  # password excluded

# ✅ JSON serialization honors it as well
print(user.model_dump_json())
# {"id":1,"name":"test"}  # password excluded

# ❌ Pitfall: exclude only affects serialization; the value is still stored on the instance
print(user.password)
# secret
Pitfall 4: from_attributes Mismatch with ORM Fields
# ❌ Wrong: ORM attribute names don't match model field names, so model_validate() fails with "Field required"
class ORMUser:
    def __init__(self):
        self.user_name = "test"  # ORM field name
        self.email_addr = "t@e.com"

class PydanticUser(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    username: str  # Doesn't match user_name
    email: str  # Doesn't match email_addr

# ✅ Correct: Use Field(alias=...) to map ORM field names
class PydanticUserFixed(BaseModel):
    model_config = ConfigDict(from_attributes=True, populate_by_name=True)
    username: str = Field(alias="user_name")
    email: str = Field(alias="email_addr")
Pitfall 5: Assuming Constraints Reject None on Optional Fields
# ⚠️ ge/le constrain only integer values; None bypasses them because the annotation allows it
class Nullable(BaseModel):
    age: Optional[int] = Field(None, ge=0, le=150)

Nullable(age=None)  # Passes: None is a declared valid value

# Optional[int] and Union[int, None] are the same type, so switching between them changes nothing
# ✅ Correct: If None must never be accepted, annotate the field as plain int
class Required(BaseModel):
    age: int = Field(ge=0, le=150)

# ✅ Alternative: If the field may be omitted but must not be an explicit None, reject None in a validator
# (validators do not run on default values unless validate_default=True)
class Omittable(BaseModel):
    age: Optional[int] = Field(None, ge=0, le=150)

    @field_validator("age")
    @classmethod
    def age_not_none_if_provided(cls, v: Optional[int]) -> Optional[int]:
        if v is None:
            raise ValueError("Age may be omitted but cannot be null")
        return v
Error Troubleshooting
| # | Error Message | Cause | Solution |
|---|---|---|---|
| 1 | ValidationError: Field required | Required field not provided | Supply the field, or give it a default or default_factory |
| 2 | ValidationError: String should have at least N characters | String shorter than min_length | Adjust min_length or provide longer input |
| 3 | PydanticDeprecatedSince20: Pydantic V1 style `@validator` validators are deprecated | Using V1's @validator | Replace with @field_validator and add @classmethod |
| 4 | PydanticDeprecatedSince20: Support for class-based `config` is deprecated | Using V1's inner Config class | Use model_config dict or ConfigDict |
| 5 | ValidationError: Input should be a valid integer | Type conversion failed | Check if input is a valid numeric string |
| 6 | PydanticUserError: `@field_validator` cannot be applied to instance methods | Validator written with self instead of cls | Add @classmethod below @field_validator and take cls as the first parameter |
| 7 | ValidationError: Extra inputs are not permitted | Extra fields rejected because extra="forbid" is set | Remove the extra fields, or set model_config's extra="ignore" or "allow" |
| 8 | PydanticSchemaGenerationError: Unable to generate pydantic-core schema | Unsupported type annotation | Check for complex generics or unsupported types |
| 9 | RecursionError: maximum recursion depth exceeded | Circular reference in nested models | Use Optional forward references or restructure |
| 10 | ValueError: Circular reference detected (id repeated) | Circular reference during serialization | Use exclude parameter or custom serializer |
Advanced Optimization
1. Strict Mode vs Lax Mode Switching
from pydantic import BaseModel, ConfigDict, StrictInt

class StrictModel(BaseModel):
    model_config = ConfigDict(strict=True)
    id: int
    name: str

class LaxModel(BaseModel):
    model_config = ConfigDict(strict=False)
    id: int
    name: str

# Strict mode accepts only exact types; StrictModel(id="1", name="test") would raise
strict_result = StrictModel(id=1, name="test")
# Lax mode (the default) coerces the numeric string "1" to the int 1
lax_result = LaxModel(id="1", name="test")

# Mix both: only id is strict, the remaining fields stay lax
class HybridModel(BaseModel):
    model_config = ConfigDict(strict=False)
    id: StrictInt
    name: str
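The difference between the modes is easiest to see by feeding the same input to each. The sketch below uses hypothetical `LaxOrder`, `StrictOrder`, and `HybridOrder` models plus a small helper, `error_types`, introduced only for this illustration; it also shows that strictness can be requested per call.

```python
from pydantic import BaseModel, ConfigDict, StrictInt, ValidationError


class LaxOrder(BaseModel):
    id: int


class StrictOrder(BaseModel):
    model_config = ConfigDict(strict=True)
    id: int


class HybridOrder(BaseModel):
    id: StrictInt  # strict for this field only
    quantity: int  # still coerced in lax mode


def error_types(model_cls, **data):
    """Return (field, error type) pairs, or an empty list on success."""
    try:
        model_cls(**data)
    except ValidationError as exc:
        return [(err["loc"][0], err["type"]) for err in exc.errors()]
    return []


lax = LaxOrder(id="1")  # "1" is coerced to 1
strict_errors = error_types(StrictOrder, id="1")
hybrid_errors = error_types(HybridOrder, id="1", quantity="2")

# Strictness can also be switched on for a single validation call
try:
    LaxOrder.model_validate({"id": "1"}, strict=True)
    per_call_rejected = False
except ValidationError:
    per_call_rejected = True

print(lax.id, strict_errors, hybrid_errors, per_call_rejected)
```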
2. Custom Types with Annotated
from pydantic import BaseModel, BeforeValidator, AfterValidator, Field  # Field is used by PaymentRequest
from typing import Annotated

def normalize_phone(v: str) -> str:
    return v.replace("-", "").replace(" ", "").replace("+86", "")

def check_phone_format(v: str) -> str:
    if not v.startswith("1") or len(v) != 11:
        raise ValueError("Invalid phone number format")
    return v

# BeforeValidator runs on the raw input, AfterValidator on the validated str
PhoneNumber = Annotated[str, BeforeValidator(normalize_phone), AfterValidator(check_phone_format)]

def cents_to_yuan(v: int) -> float:
    return v / 100

def yuan_to_cents(v: float) -> int:
    # round() avoids float truncation, e.g. int(19.99 * 100) == 1998
    return round(v * 100)

# Integer input is treated as cents and converted; other input is passed through
YuanFromCents = Annotated[float, BeforeValidator(lambda v: v / 100 if isinstance(v, int) else v)]

class PaymentRequest(BaseModel):
    phone: PhoneNumber
    amount: YuanFromCents = Field(gt=0, description="Amount in yuan")

payment = PaymentRequest(phone="+86-138-0013-8000", amount=9900)
print(payment.model_dump())
# {'phone': '13800138000', 'amount': 99.0}
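One benefit of Annotated types is that they are reusable outside of models. As a sketch, the hypothetical `Digits11` type below is validated on its own through a TypeAdapter; the helper names are illustrative only.

```python
from typing import Annotated

from pydantic import AfterValidator, BeforeValidator, TypeAdapter, ValidationError


def strip_separators(v):
    # Runs on raw input, so guard against non-string values
    if isinstance(v, str):
        return v.replace("-", "").replace(" ", "")
    return v


def must_be_11_digits(v: str) -> str:
    if not (v.isdigit() and len(v) == 11):
        raise ValueError("expected 11 digits")
    return v


Digits11 = Annotated[
    str,
    BeforeValidator(strip_separators),
    AfterValidator(must_be_11_digits),
]

digits_adapter = TypeAdapter(Digits11)
cleaned = digits_adapter.validate_python("138-0013-8000")
print(cleaned)

try:
    digits_adapter.validate_python("12345")
    short_rejected = False
except ValidationError:
    short_rejected = True
print(short_rejected)
```

The same `Digits11` alias can then be used as a field annotation in any number of models without repeating the validators.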
3. Performance Optimization: Caching and Pre-compilation
from pydantic import BaseModel, TypeAdapter
import time

class LargeModel(BaseModel):
    field1: str
    field2: int
    field3: float
    field4: bool
    field5: str
    field6: int
    field7: float
    field8: bool

# Build the adapter once, outside the timed loop
adapter = TypeAdapter(LargeModel)
data = {"field1": "a", "field2": 1, "field3": 1.0, "field4": True, "field5": "b", "field6": 2, "field7": 2.0, "field8": False}

# Time direct model construction
start = time.perf_counter()
for _ in range(100000):
    LargeModel(**data)
direct_time = time.perf_counter() - start

# Time validation through the prebuilt TypeAdapter
start = time.perf_counter()
for _ in range(100000):
    adapter.validate_python(data)
adapter_time = time.perf_counter() - start

print(f"Direct: {direct_time:.3f}s, TypeAdapter: {adapter_time:.3f}s")
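Two habits that commonly help here, sketched under the assumption of a hypothetical `Point` model: build a TypeAdapter once at module level instead of inside a hot path, and validate JSON text directly with `validate_json` rather than parsing it with `json.loads` first. Actual gains depend on the workload, so measure before relying on either.

```python
import json

from pydantic import BaseModel, TypeAdapter


class Point(BaseModel):
    x: int
    y: int


# Built once at import time and reused for every call
POINTS_ADAPTER = TypeAdapter(list[Point])

raw = json.dumps([{"x": i, "y": i * 2} for i in range(3)])

# Single step: parsing and validation both happen inside pydantic-core
points_from_json = POINTS_ADAPTER.validate_json(raw)

# Two steps: parse into Python objects first, then validate them
points_via_loads = POINTS_ADAPTER.validate_python(json.loads(raw))

print(points_from_json == points_via_loads)
```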
Comparison
| Dimension | Pydantic V1 | Pydantic V2 | Hand-written if-else | Marshmallow |
|---|---|---|---|---|
| Validation Performance | ⭐⭐ Slow | ⭐⭐⭐⭐⭐ 5-50x faster | ⭐⭐⭐⭐ Fast | ⭐⭐ Slow |
| Type Hint Integration | ⚠️ Partial | ✅ Complete | ❌ None | ❌ None |
| Error Messages | ⚠️ General | ✅ Detailed | ❌ Custom | ⚠️ General |
| JSON Schema | ✅ Supported | ✅ Comprehensive | ❌ None | ✅ Supported |
| Serialization Control | ⚠️ Limited | ✅ Flexible | ❌ Manual | ✅ Flexible |
| Learning Curve | ⭐⭐ Low | ⭐⭐⭐ Medium | ⭐ Lowest | ⭐⭐⭐ Medium |
| FastAPI Integration | ✅ Native | ✅ Native | ❌ None | ⚠️ Needs Adapter |
| Production Recommendation | Legacy Projects | First Choice | Simple Scripts | Complex Transforms |
Summary: Pydantic V2 is more than a version upgrade; it moves Pydantic from "validation library" toward "data engineering infrastructure." Three core principles: use Field constraints instead of hand-written validation, use model_validator for cross-field logic, and use model_config together with Field options such as exclude and serialization_alias to control validation and serialization behavior. The V1 to V2 migration is painful, but the reported 5-50x performance improvement and a more complete type system make it worthwhile. FastAPI + Pydantic V2 is the de facto standard for Python web development in 2026.