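Go Error Handling Best Practices: 6 Production Patterns from Error Wrapping to Recovery
When Lost Error Context Meets a Panic Avalanche: Go Error Handling at Its Darkest Hour
It is 2 a.m. and the error rate of the production payment service has spiked to 30%. The logs are full of "database error: sql: no rows in result set", with no hint of which endpoint, which SQL statement, or which business scenario triggered it. Worse, an unrecovered panic in a goroutine takes the process down and the whole HTTP service starts returning 502. After four hours of debugging the causes become clear: errors were swallowed layer by layer, all context was lost, panics had no recover, and the monitoring alerts were useless.
This is not an isolated case. Go's if err != nil is simple: check the error after every call. Production error handling takes far more than a nil check, though. You need to preserve the error chain, define typed errors, recover from panics gracefully, build an error middleware chain, and plug into an observability stack. This article walks through 6 production-grade error handling patterns to help you build a robust error handling system in Go.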
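Key Takeaways
- Error wrapping: use fmt.Errorf("%w", err) to preserve the full error chain instead of concatenating strings and losing context
- Custom error types: combine sentinel errors with custom structs to classify errors by business meaning
- errors.Is/As: replace string matching with type-safe checks that work through nested error chains
- Panic recovery: recover panics in HTTP middleware and goroutines to prevent cascading crashes
- Error middleware chain: weave logging, metrics, and tracing into one error handling flow
- Production observability: use OpenTelemetry to connect errors to distributed tracing and alerting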
Contents
- Error Handling Architecture Overview
- Pattern 1: Error Wrapping with fmt.Errorf and %w verb
- Pattern 2: Custom Error Types with Sentinel Errors
- Pattern 3: errors.Is / errors.As for Error Inspection
- Pattern 4: Panic Recovery Middleware
- Pattern 5: Error Middleware Chain
- Pattern 6: Production Error Observability with OpenTelemetry
- 5 Common Pitfalls and Solutions
- Troubleshooting Table: 10 Common Errors
- Advanced Optimization Techniques
- Comparison of Error Handling Approaches
- Recommended Online Tools
- Summary
Error Handling Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│ HTTP Request │
└──────────────────────────┬──────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Middleware Chain │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Recovery │→│ Logging │→│ Metrics │→│ Tracing │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
└──────────────────────────┬───────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Business Layer │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ fmt.Errorf │ │ Custom Error │ │ errors.Is/As │ │
│ │ (%w wrap) │ │ (sentinel) │ │ (inspection) │ │
│ └─────────────┘ └──────────────┘ └──────────────────┘ │
└──────────────────────────┬───────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Observability Layer │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ OpenTelemetry│ │ Alerting │ │ Dashboard │ │
│ │ Tracing │ │ (PagerDuty)│ │ (Grafana) │ │
│ └──────────────┘ └──────────────┘ └──────────────────┘ │
└──────────────────────────────────────────────────────────────┘
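Pattern 1: Error Wrapping with fmt.Errorf and %w verb
Error wrapping is the foundation of Go error handling. The core idea: wrap errors with the %w verb so the full error chain survives, instead of concatenating strings and losing the original error.
Anti-pattern: string formatting that loses context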
func getUser(id int) (*User, error) {
row := db.QueryRow("SELECT * FROM users WHERE id = ?", id)
var u User
if err := row.Scan(&u.ID, &u.Name, &u.Email); err != nil {
return nil, fmt.Errorf("scan user failed: %v", err)
}
return &u, nil
}
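%v converts the error to a string, so the original sql: no rows in result set becomes plain text that can no longer be inspected. Callers cannot use errors.Is or errors.As to determine what kind of error occurred.
Correct approach: preserve the error chain with %w
package repository

import (
	"database/sql"
	"errors"
	"fmt"
	"log"
)

// db is assumed to be opened elsewhere with sql.Open.
var db *sql.DB

type User struct {
	ID    int
	Name  string
	Email string
}

func GetUser(id int) (*User, error) {
	row := db.QueryRow("SELECT id, name, email FROM users WHERE id = ?", id)
	var u User
	if err := row.Scan(&u.ID, &u.Name, &u.Email); err != nil {
		// %w keeps err in the chain so callers can still match it.
		return nil, fmt.Errorf("get user id=%d: %w", id, err)
	}
	return &u, nil
}

func HandleGetUser(id int) {
	user, err := GetUser(id)
	if err != nil {
		if errors.Is(err, sql.ErrNoRows) {
			log.Printf("user not found: id=%d", id)
			return
		}
		log.Printf("unexpected error: %v", err)
		return // do not fall through and use a nil user
	}
	_ = user
}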
A multi-layer error chain
package service

import (
	"errors"
	"fmt"
	"log"

	"myapp/repository"
)

func ProcessOrder(orderID int) error {
	user, err := repository.GetUser(orderID)
	if err != nil {
		return fmt.Errorf("process order %d: %w", orderID, err)
	}
	_ = user
	return nil
}

func HandleOrder(orderID int) {
	if err := ProcessOrder(orderID); err != nil {
		// sqlite.Error stands for the error type exported by your database
		// driver; import the driver package that defines it.
		var sqlErr *sqlite.Error
		if errors.As(err, &sqlErr) {
			log.Printf("database error: %v", sqlErr)
		}
	}
}
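Key rules:
- Wrap with %w rather than %v so callers can still inspect the cause
- Add business context when wrapping (function name, parameters, a description of the operation)
- Keep error chains to 3-4 layers at most; a longer chain suggests a problem with your abstraction layers
- Wrap once per meaningful boundary, and avoid several layers re-wrapping the same error with redundant context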
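Pattern 2: Custom Error Types with Sentinel Errors
Custom error types let errors carry business meaning. The core idea: define predictable error states as sentinel errors, and carry structured context in a custom struct.
Sentinel errors: predictable error values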
package apperr
import "errors"
var (
ErrNotFound = errors.New("resource not found")
ErrUnauthorized = errors.New("unauthorized access")
ErrForbidden = errors.New("forbidden operation")
ErrConflict = errors.New("resource conflict")
ErrRateLimited = errors.New("rate limit exceeded")
ErrValidation = errors.New("validation failed")
ErrInternal = errors.New("internal server error")
)
Custom error type: carrying structured context
package apperr
import (
"fmt"
"time"
)
type DomainError struct {
Code string
Message string
Domain string
Timestamp time.Time
Err error
}
func (e *DomainError) Error() string {
if e.Err != nil {
return fmt.Sprintf("[%s] %s: %s: %v", e.Domain, e.Code, e.Message, e.Err)
}
return fmt.Sprintf("[%s] %s: %s", e.Domain, e.Code, e.Message)
}
func (e *DomainError) Unwrap() error {
return e.Err
}
func NewDomainError(domain, code, message string, err error) *DomainError {
return &DomainError{
Code: code,
Message: message,
Domain: domain,
Timestamp: time.Now(),
Err: err,
}
}
Using custom errors in the business layer
package order
import (
"apperr"
"fmt"
)
type OrderService struct {
repo OrderRepository
}
func (s *OrderService) CreateOrder(req CreateOrderRequest) (*Order, error) {
if err := validateOrder(req); err != nil {
return nil, apperr.NewDomainError(
"order",
"VALIDATION_ERROR",
fmt.Sprintf("invalid order request: user_id=%d", req.UserID),
err,
)
}
existing, err := s.repo.FindByUserAndProduct(req.UserID, req.ProductID)
if err != nil {
return nil, apperr.NewDomainError(
"order",
"DB_ERROR",
"failed to check existing order",
err,
)
}
if existing != nil {
return nil, apperr.NewDomainError(
"order",
"CONFLICT",
fmt.Sprintf("duplicate order: user=%d product=%d", req.UserID, req.ProductID),
apperr.ErrConflict,
)
}
order := &Order{
UserID: req.UserID,
ProductID: req.ProductID,
Quantity: req.Quantity,
}
if err := s.repo.Save(order); err != nil {
return nil, apperr.NewDomainError(
"order",
"DB_ERROR",
"failed to save order",
err,
)
}
return order, nil
}
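Key rules:
- Define sentinel errors with errors.New and start their names with Err
- A custom error type must implement Error(), and must implement Unwrap() if it wraps another error
- Write error codes in upper snake case (VALIDATION_ERROR) so they are easy to search for in logs
- Give each business domain its own error code namespace to avoid collisions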
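Pattern 3: errors.Is / errors.As for Error Inspection
Inspecting errors is the central operation of error handling. The core idea: use errors.Is to match error values and errors.As to extract error types, replacing fragile string matching.
errors.Is: match a specific error value anywhere in the chain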
package handler
import (
"apperr"
"database/sql"
"errors"
"net/http"
)
func GetUserHandler(w http.ResponseWriter, r *http.Request) {
id := r.URL.Query().Get("id")
user, err := userService.GetUser(id)
if err != nil {
switch {
case errors.Is(err, sql.ErrNoRows):
http.Error(w, "user not found", http.StatusNotFound)
case errors.Is(err, apperr.ErrUnauthorized):
http.Error(w, "unauthorized", http.StatusUnauthorized)
case errors.Is(err, apperr.ErrRateLimited):
http.Error(w, "rate limited", http.StatusTooManyRequests)
default:
http.Error(w, "internal error", http.StatusInternalServerError)
}
return
}
writeJSON(w, http.StatusOK, user)
}
errors.As: extract a specific type from the chain
package handler
import (
"apperr"
"errors"
"log"
"net/http"
)
func CreateOrderHandler(w http.ResponseWriter, r *http.Request) {
var req CreateOrderRequest
if err := decodeJSON(r, &req); err != nil {
http.Error(w, "bad request", http.StatusBadRequest)
return
}
order, err := orderService.CreateOrder(req)
if err != nil {
var domainErr *apperr.DomainError
if errors.As(err, &domainErr) {
switch domainErr.Code {
case "VALIDATION_ERROR":
http.Error(w, domainErr.Message, http.StatusBadRequest)
case "CONFLICT":
http.Error(w, domainErr.Message, http.StatusConflict)
case "DB_ERROR":
log.Printf("database error: %v", err)
http.Error(w, "internal error", http.StatusInternalServerError)
default:
http.Error(w, "internal error", http.StatusInternalServerError)
}
return
}
log.Printf("unhandled error: %v", err)
http.Error(w, "internal error", http.StatusInternalServerError)
return
}
writeJSON(w, http.StatusCreated, order)
}
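Common mistake: checking errors by string matching
// ❌ Wrong: string matching is fragile and unreliable
if strings.Contains(err.Error(), "not found") {
	// ...
}
// ✅ Right: match the error value with errors.Is
if errors.Is(err, apperr.ErrNotFound) {
	// ...
}
// ❌ Wrong: a type assertion does not walk the error chain
if de, ok := err.(*apperr.DomainError); ok {
	// If err has been wrapped, ok is false here
}
// ✅ Right: errors.As walks the error chain
var de *apperr.DomainError
if errors.As(err, &de) {
	// Extracts the value even when err has been wrapped several times
}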
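Key rules:
- Never match on the string returned by err.Error()
- Use errors.Is to check for sentinel errors and errors.As to extract custom error types
- The second argument to errors.As must be a non-nil pointer to the target type
- Map errors to responses in one place, the HTTP handler layer; the business layer only returns errors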
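Pattern 4: Panic Recovery Middleware
Panic is Go's "nuclear option": avoid using it unless there is no alternative, but always have a defense against it. The core idea: recover panics in HTTP middleware and at every goroutine entry point to prevent cascading crashes.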
HTTP Server Panic Recovery
package middleware
import (
"log"
"net/http"
"runtime/debug"
)
func Recovery(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
defer func() {
if rec := recover(); rec != nil {
stack := debug.Stack()
log.Printf(
"[PANIC] path=%s method=%s error=%v\n%s",
r.URL.Path, r.Method, rec, stack,
)
http.Error(w, "internal server error", http.StatusInternalServerError)
}
}()
next.ServeHTTP(w, r)
})
}
Goroutine Panic Recovery
package goroutine
import (
"log"
"runtime/debug"
)
func SafeGo(fn func(), panicHandler func(recover any, stack []byte)) {
go func() {
defer func() {
if rec := recover(); rec != nil {
stack := debug.Stack()
log.Printf("[GOROUTINE PANIC] recovered=%v\n%s", rec, stack)
if panicHandler != nil {
panicHandler(rec, stack)
}
}
}()
fn()
}()
}
Goroutine recovery with an error channel
package worker
import (
"runtime/debug"
)
type PanicError struct {
Recover any
Stack []byte
}
func (e *PanicError) Error() string {
return "goroutine panic recovered"
}
// SafeGoWithErr runs fn in a goroutine and always sends exactly one value
// (nil on success) so that receivers can count results without blocking.
func SafeGoWithErr(fn func() error, errCh chan<- error) {
	go func() {
		var err error
		defer func() {
			if rec := recover(); rec != nil {
				err = &PanicError{Recover: rec, Stack: debug.Stack()}
			}
			errCh <- err
		}()
		err = fn()
	}()
}

func ProcessBatch(items []Item) []error {
	var errs []error
	errCh := make(chan error, len(items))
	for _, item := range items {
		item := item // per-iteration copy; required before Go 1.22
		SafeGoWithErr(func() error {
			return processItem(item)
		}, errCh)
	}
	// One receive per item; successful items send nil.
	for range items {
		if err := <-errCh; err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}
Recovery middleware for the Gin framework
package middleware
import (
"log"
"net/http"
"runtime/debug"
"github.com/gin-gonic/gin"
)
func GinRecovery() gin.HandlerFunc {
return func(c *gin.Context) {
defer func() {
if rec := recover(); rec != nil {
stack := debug.Stack()
log.Printf(
"[PANIC] path=%s method=%s client_ip=%s error=%v\n%s",
c.Request.URL.Path,
c.Request.Method,
c.ClientIP(),
rec,
stack,
)
c.AbortWithStatusJSON(http.StatusInternalServerError, gin.H{
"error": "internal server error",
})
}
}()
c.Next()
}
}
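Key rules:
- An HTTP server must install recovery as its outermost middleware
- Every goroutine entry point needs its own recovery; an unrecovered panic in any goroutine terminates the whole process
- Always log the full stack trace after recover, otherwise the problem cannot be located
- Do not keep processing the request after recover; just return a 500
- Do not treat recover as a cure-all: fatal runtime errors such as concurrent map writes cannot be recovered and must be fixed at the source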
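Outside of HTTP handlers, a common variation is to turn a panic into an ordinary error through a named return value, so the caller handles it like any other failure. This is a minimal sketch; safeCall and this PanicError shape are illustrative names, not part of any library.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// PanicError carries the recovered value and the stack at recovery time.
type PanicError struct {
	Value any
	Stack []byte
}

func (e *PanicError) Error() string {
	return fmt.Sprintf("panic recovered: %v", e.Value)
}

// safeCall runs fn and converts a panic into an error. The named return
// value err is what lets the deferred function overwrite the result.
func safeCall(fn func() error) (err error) {
	defer func() {
		if rec := recover(); rec != nil {
			err = &PanicError{Value: rec, Stack: debug.Stack()}
		}
	}()
	return fn()
}

func main() {
	err := safeCall(func() error { panic("boom") })
	fmt.Println(err) // panic recovered: boom

	fmt.Println(safeCall(func() error { return nil })) // <nil>
}
```

The stack is captured inside the deferred function, while the panicking frames are still available to debug.Stack.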
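Pattern 5: Error Middleware Chain
An error middleware chain weaves cross-cutting concerns into error handling in one place. The core idea: logging, metrics, and tracing are not scattered through business code but handled uniformly by a chain of middleware.
Error middleware architecture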
Request → Recovery → Logging → Metrics → Tracing → Handler → Response
│ │ │ │
▼ ▼ ▼ ▼
log panic log error emit counter span error
A complete error middleware chain
package middleware
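import (
"log"
"net/http"
"runtime/debug"
"time"

"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/codes"
)
// errorCounter, requestDuration (Prometheus vectors) and tracer (an OpenTelemetry
// tracer) are assumed to be package-level variables initialized elsewhere.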
type ResponseRecorder struct {
http.ResponseWriter
StatusCode int
Body []byte
}
func (r *ResponseRecorder) WriteHeader(code int) {
r.StatusCode = code
r.ResponseWriter.WriteHeader(code)
}
func (r *ResponseRecorder) Write(b []byte) (int, error) {
r.Body = append(r.Body, b...)
return r.ResponseWriter.Write(b)
}
func ErrorChain(next http.Handler) http.Handler {
return Recovery(
Logging(
Metrics(
Tracing(next),
),
),
)
}
func Recovery(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
defer func() {
if rec := recover(); rec != nil {
log.Printf("[PANIC] path=%s error=%v\n%s", r.URL.Path, rec, debug.Stack())
http.Error(w, "internal server error", http.StatusInternalServerError)
}
}()
next.ServeHTTP(w, r)
})
}
func Logging(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
next.ServeHTTP(rec, r)
duration := time.Since(start)
if rec.StatusCode >= 400 {
log.Printf(
"[ERROR] method=%s path=%s status=%d duration=%s",
r.Method, r.URL.Path, rec.StatusCode, duration,
)
} else {
log.Printf(
"[INFO] method=%s path=%s status=%d duration=%s",
r.Method, r.URL.Path, rec.StatusCode, duration,
)
}
})
}
func Metrics(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
		next.ServeHTTP(rec, r)
		statusCode := rec.StatusCode
		if statusCode >= 500 {
			errorCounter.WithLabelValues(r.URL.Path, "5xx").Inc()
		} else if statusCode >= 400 {
			errorCounter.WithLabelValues(r.URL.Path, "4xx").Inc()
		}
		// Observe the elapsed time in seconds, not the status code.
		requestDuration.WithLabelValues(r.URL.Path).Observe(time.Since(start).Seconds())
	})
}
func Tracing(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
ctx, span := tracer.Start(r.Context(), r.URL.Path)
defer span.End()
rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
next.ServeHTTP(rec, r.WithContext(ctx))
if rec.StatusCode >= 400 {
span.SetAttributes(attribute.Int("http.status_code", rec.StatusCode))
span.SetStatus(codes.Error, http.StatusText(rec.StatusCode))
}
})
}
Option-driven error middleware
package middleware
import (
"log"
"net/http"
"runtime/debug"

"go.opentelemetry.io/otel/codes"
"go.opentelemetry.io/otel/trace"
)
type ErrorMiddlewareOption struct {
LogErrors bool
EmitMetrics bool
TraceErrors bool
OnPanic func(recover any, stack []byte)
}
func ErrorMiddleware(opts ErrorMiddlewareOption) func(http.Handler) http.Handler {
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
defer func() {
if rec := recover(); rec != nil {
stack := debug.Stack()
if opts.OnPanic != nil {
opts.OnPanic(rec, stack)
}
log.Printf("[PANIC] %v\n%s", rec, stack)
http.Error(w, "internal server error", http.StatusInternalServerError)
}
}()
rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
next.ServeHTTP(rec, r)
if rec.StatusCode >= 400 {
if opts.LogErrors {
log.Printf("[ERROR] path=%s status=%d", r.URL.Path, rec.StatusCode)
}
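if opts.EmitMetrics {
// Use the same two labels (path, status class) as the Metrics middleware.
class := "4xx"
if rec.StatusCode >= 500 {
class = "5xx"
}
errorCounter.WithLabelValues(r.URL.Path, class).Inc()
}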
if opts.TraceErrors {
span := trace.SpanFromContext(r.Context())
span.SetStatus(codes.Error, http.StatusText(rec.StatusCode))
}
}
})
}
}
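Key rules:
- Recovery must be the outermost layer so that every panic is caught
- Logging records the request context (path, method, status code, duration)
- Metrics counts errors by path and status class
- Tracing writes the error status to the span for distributed tracing
- Middleware order: Recovery → Logging → Metrics → Tracing → Handler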
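Nested calls such as Recovery(Logging(Metrics(Tracing(next)))) get hard to read as the chain grows. One possible alternative is a small Chain helper that applies middleware so the first one listed is the outermost. The sketch below uses hypothetical names (Middleware, Chain, tag, order) and placeholder middleware that only record their own name, to make the resulting order visible.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type Middleware func(http.Handler) http.Handler

// Chain wraps h so that the first middleware listed runs first (outermost).
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag returns a placeholder middleware that records its name, then calls next.
func tag(name string, visited *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*visited = append(*visited, name)
			next.ServeHTTP(w, r)
		})
	}
}

// order sends one request through the chain and reports the visit order.
func order() string {
	var visited []string
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		visited = append(visited, "handler")
	})
	h := Chain(final,
		tag("recovery", &visited),
		tag("logging", &visited),
		tag("metrics", &visited),
		tag("tracing", &visited),
	)
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return strings.Join(visited, " -> ")
}

func main() {
	fmt.Println(order()) // recovery -> logging -> metrics -> tracing -> handler
}
```

Iterating from the end of the slice is what makes the written order match the execution order.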
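Pattern 6: Production Error Observability with OpenTelemetry
Production observability is the last mile of error handling. The core idea: use OpenTelemetry to feed errors into the three pillars of distributed tracing, metrics, and logs, giving end-to-end visibility from detection to root cause.
OpenTelemetry error tracing integration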
package telemetry
import (
"context"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/codes"
"go.opentelemetry.io/otel/trace"
)
func RecordError(ctx context.Context, err error, attrs ...attribute.KeyValue) {
span := trace.SpanFromContext(ctx)
if !span.IsRecording() {
return
}
span.SetStatus(codes.Error, err.Error())
span.SetAttributes(attrs...)
span.RecordError(err, trace.WithAttributes(attrs...))
}
func WrapWithSpan(ctx context.Context, operation string, fn func(ctx context.Context) error) error {
ctx, span := otel.Tracer("app").Start(ctx, operation)
defer span.End()
if err := fn(ctx); err != nil {
RecordError(ctx, err,
attribute.String("operation", operation),
)
return err
}
return nil
}
A repository layer with error tracing
package repository
import (
	"context"
	"database/sql"
	"errors"
	"fmt"

	"apperr"
	"telemetry"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
)

type UserRepository struct {
	db *sql.DB
}

func (r *UserRepository) GetUser(ctx context.Context, id int) (*User, error) {
	ctx, span := otel.Tracer("repository").Start(ctx, "UserRepository.GetUser")
	defer span.End()
	span.SetAttributes(attribute.Int("user.id", id))
	row := r.db.QueryRowContext(ctx, "SELECT id, name, email FROM users WHERE id = ?", id)
	var u User
	if err := row.Scan(&u.ID, &u.Name, &u.Email); err != nil {
		// errors.Is also matches if the driver ever wraps sql.ErrNoRows.
		if errors.Is(err, sql.ErrNoRows) {
			// "Not found" is an expected outcome, so it is tagged but not recorded as a span error.
			span.SetAttributes(attribute.String("error.type", "not_found"))
			return nil, fmt.Errorf("get user id=%d: %w", id, apperr.ErrNotFound)
		}
		telemetry.RecordError(ctx, err,
			attribute.String("db.operation", "SELECT"),
			attribute.Int("user.id", id),
		)
		return nil, fmt.Errorf("get user id=%d: %w", id, err)
	}
	span.SetAttributes(attribute.String("user.name", u.Name))
	return &u, nil
}
Collecting error metrics
package metrics
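import (
	"context"
	"fmt"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

var (
	errorCounter metric.Int64Counter
	errorLatency metric.Float64Histogram
)

// InitMetrics creates the instruments and reports any registration failure
// instead of silently discarding it.
func InitMetrics(meter metric.Meter) error {
	var err error
	errorCounter, err = meter.Int64Counter(
		"app.errors.total",
		metric.WithDescription("Total number of errors"),
	)
	if err != nil {
		return fmt.Errorf("create error counter: %w", err)
	}
	errorLatency, err = meter.Float64Histogram(
		"app.errors.latency_seconds",
		metric.WithDescription("Error handling latency"),
	)
	if err != nil {
		return fmt.Errorf("create error latency histogram: %w", err)
	}
	return nil
}

// RecordErrorMetric increments the error counter for one domain/code pair.
func RecordErrorMetric(domain, code string) {
	errorCounter.Add(context.Background(), 1,
		metric.WithAttributes(
			attribute.String("domain", domain),
			attribute.String("code", code),
		),
	)
}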
Error alerting rules
groups:
  - name: error_alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(app_errors_total{code=~"5.."}[5m]))
          /
          sum(rate(http_requests_total[5m]))
          > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Error rate exceeds 5%"
          description: "5xx error rate is {{ $value | humanizePercentage }}"
      - alert: PanicDetected
        expr: increase(app_errors_total{code="PANIC"}[1m]) > 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Panic recovered in production"
          description: "{{ $value }} panic(s) detected in the last minute"
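Key rules:
- Every error must be recorded on at least one trace span
- Classify error metrics by domain and code so they are easy to aggregate
- Alert rules should distinguish 4xx (client errors) from 5xx (server errors)
- Panic alerts must fire immediately (for: 0m) without waiting for occurrences to accumulate
- Logs, metrics, and traces must be correlated by the same trace_id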
5 Common Pitfalls and Solutions
Pitfall 1: swallowed errors
// ❌ Wrong: the returned error is ignored
_, _ = io.Copy(dst, src)
// ✅ Right: check the error
if _, err := io.Copy(dst, src); err != nil {
	return fmt.Errorf("copy data: %w", err)
}
Pitfall 2: wrapping that loses the original error
// ❌ Wrong: %v breaks the error chain
return fmt.Errorf("query failed: %v", err)
// ✅ Right: %w preserves the error chain
return fmt.Errorf("query failed: %w", err)
Pitfall 3: unrecovered panic in a goroutine
// ❌ Wrong: a panic in this goroutine terminates the whole process
go func() {
	result := doSomething()
	ch <- result
}()
// ✅ Right: recover at the goroutine entry point
go func() {
	defer func() {
		if rec := recover(); rec != nil {
			log.Printf("goroutine panic: %v\n%s", rec, debug.Stack())
			ch <- nil // still send, so the receiver does not block forever
		}
	}()
	result := doSomething()
	ch <- result
}()
Pitfall 4: checking errors by string matching
// ❌ Wrong: string matching is fragile
if strings.Contains(err.Error(), "timeout") {
	// ...
}
// ✅ Right: check with errors.Is
if errors.Is(err, context.DeadlineExceeded) {
	// ...
}
Pitfall 5: wrapping the same error repeatedly
// ❌ Wrong: every layer wraps without adding information, so the message is just noise
func a() error {
	return fmt.Errorf("a: %w", err)
}
func b() error {
	return fmt.Errorf("b: %w", a()) // "b: a: original error"
}
func c() error {
	return fmt.Errorf("c: %w", b()) // "c: b: a: original error"
}
// ✅ Right: wrap at the boundary, pass the error through unchanged inside
func a() error {
	return err // passed through unchanged
}
func b() error {
	return a()
}
func c() error {
	return fmt.Errorf("process order: %w", b()) // wrapped once, at the boundary
}
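Troubleshooting Table: 10 Common Errors
| Symptom | Likely cause | How to investigate | Fix |
|---|---|---|---|
| sql: no rows in result set | QueryRow returned no rows | Check the SQL WHERE clause | Detect it with errors.Is(err, sql.ErrNoRows) |
| context deadline exceeded | The operation timed out | Check the context timeout settings | Raise the timeout or optimize the query |
| fatal error: concurrent map writes | Concurrent writes to a map (not recoverable) | Look for maps shared between goroutines | Use sync.Map or a mutex |
| panic: send on closed channel | Sending on a channel that is already closed | Check when the channel is closed | Control closing with sync.Once or directional channels |
| connection refused | Service not running or wrong port | Check service status and port | Confirm the service is up; check network configuration |
| i/o timeout | Network timeout | Check network connectivity | Raise the timeout; check firewall rules |
| record not found | The data does not exist | Check the query conditions | Distinguish "does not exist" from "query failed" |
| duplicate key value | Unique constraint violation | Check the inserted data | Extract the PostgreSQL error code with errors.As |
| panic: nil pointer dereference | Dereferencing a nil pointer | Check pointer initialization | Add nil checks; use the Recovery middleware |
| too many open files | File descriptors exhausted | Check connection pools and file handling | Raise ulimit; look for connection leaks |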
Advanced Optimization Techniques
Technique 1: grouping and aggregating errors
package apperr
import "strings"
type ErrorGroup struct {
Errors []error
}
func (g *ErrorGroup) Add(err error) {
if err != nil {
g.Errors = append(g.Errors, err)
}
}
func (g *ErrorGroup) Err() error {
if len(g.Errors) == 0 {
return nil
}
return g
}
func (g *ErrorGroup) Error() string {
msgs := make([]string, len(g.Errors))
for i, err := range g.Errors {
msgs[i] = err.Error()
}
return strings.Join(msgs, "; ")
}
func (g *ErrorGroup) Unwrap() []error {
return g.Errors
}
Technique 2: retrying with backoff
package retry
import (
"context"
"fmt"
"math"
"time"
)
type Config struct {
MaxAttempts int
BaseDelay time.Duration
MaxDelay time.Duration
Retryable func(error) bool
}
func Do(ctx context.Context, cfg Config, fn func() error) error {
var lastErr error
for attempt := 0; attempt < cfg.MaxAttempts; attempt++ {
if err := fn(); err != nil {
if !cfg.Retryable(err) {
return fmt.Errorf("non-retryable error: %w", err)
}
lastErr = err
delay := time.Duration(
float64(cfg.BaseDelay) * math.Pow(2, float64(attempt)),
)
if delay > cfg.MaxDelay {
delay = cfg.MaxDelay
}
select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(delay):
continue
}
}
return nil
}
return fmt.Errorf("max attempts (%d) exceeded: %w", cfg.MaxAttempts, lastErr)
}
Technique 3: mapping error codes to HTTP status
package handler
import (
"apperr"
"errors"
"net/http"
)
var codeToStatus = map[string]int{
"NOT_FOUND": http.StatusNotFound,
"VALIDATION_ERROR": http.StatusBadRequest,
"UNAUTHORIZED": http.StatusUnauthorized,
"FORBIDDEN": http.StatusForbidden,
"CONFLICT": http.StatusConflict,
"RATE_LIMITED": http.StatusTooManyRequests,
"DB_ERROR": http.StatusInternalServerError,
}
func MapErrorToHTTP(err error) (int, string) {
var domainErr *apperr.DomainError
if errors.As(err, &domainErr) {
if status, ok := codeToStatus[domainErr.Code]; ok {
return status, domainErr.Message
}
}
if errors.Is(err, apperr.ErrNotFound) {
return http.StatusNotFound, "resource not found"
}
if errors.Is(err, apperr.ErrUnauthorized) {
return http.StatusUnauthorized, "unauthorized"
}
return http.StatusInternalServerError, "internal server error"
}
Technique 4: structured error logging
package logger
import (
	"context"
	"errors"
	"log/slog"

	"apperr"
)

// LogError logs err with any DomainError fields plus caller-supplied attributes.
// The variadic parameter is named extra so it does not shadow the context package.
func LogError(ctx context.Context, err error, extra ...slog.Attr) {
	attrs := []slog.Attr{
		slog.String("error.message", err.Error()),
	}
	var domainErr *apperr.DomainError
	if errors.As(err, &domainErr) {
		attrs = append(attrs,
			slog.String("error.domain", domainErr.Domain),
			slog.String("error.code", domainErr.Code),
			slog.Time("error.timestamp", domainErr.Timestamp),
		)
	}
	attrs = append(attrs, extra...)
	slog.LogAttrs(ctx, slog.LevelError, "error occurred", attrs...)
}
Technique 5: error assertion helpers
package apperr
import "errors"
func IsNotFound(err error) bool {
return errors.Is(err, ErrNotFound)
}
func IsConflict(err error) bool {
return errors.Is(err, ErrConflict)
}
func IsRateLimited(err error) bool {
return errors.Is(err, ErrRateLimited)
}
func GetDomainError(err error) (*DomainError, bool) {
var de *DomainError
return de, errors.As(err, &de)
}
func GetCode(err error) string {
var de *DomainError
if errors.As(err, &de) {
return de.Code
}
return "UNKNOWN"
}
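Comparison of Error Handling Approaches
| Feature | fmt.Errorf %w | Custom error types | errors.Is/As | Panic Recovery | Error middleware chain | OpenTelemetry |
|---|---|---|---|---|---|---|
| Preserves error chain | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| Carries business meaning | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ |
| Type-safe checks | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Prevents process crash | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
| Unified cross-cutting handling | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ |
| Distributed tracing | ❌ | ❌ | ❌ | ❌ | Partial | ✅ |
| Metrics collection | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ |
| Implementation complexity | Low | Medium | Low | Low | Medium | High |
| Production necessity | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Where it applies | Everywhere | Business layer | Inspection layer | Runtime | HTTP layer | Observability layer |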
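Recommended Online Tools
- JSON formatter: format the JSON structure of error responses to inspect API error details quickly
- Hash calculator: compute request signatures and checksums to verify the integrity of error-log data
- cURL-to-code converter: turn cURL commands into Go code to reproduce and debug HTTP errors quickly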
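Summary
Go error handling is not "just add if err != nil". It has to answer five questions: where did the error come from, what kind of error is it, how do you inspect it, how do you recover, and how do you observe it? fmt.Errorf("%w", err) answers "where from" by carrying the chain as the error propagates, custom error types answer "what kind", errors.Is/As answers "how to inspect", the Recovery middleware answers "how to recover", and the middleware chain together with OpenTelemetry answers "how to observe". Master these 6 patterns and you have the core methodology of production-grade Go error handling.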