Go Error Handling Best Practices: 6 Production Patterns, from Error Wrapping to Recovery

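When Lost Error Context Meets a Panic Avalanche: Go Error Handling at Its Darkest Hour

At 2 a.m., the error rate of the production payment service spikes to 30%. The logs are full of "database error: sql: no rows in result set", with no hint of which endpoint, which SQL statement, or which business scenario triggered it. Worse, an unrecovered panic goes off inside a goroutine and takes the whole HTTP service down with it, so clients see 502s. After four hours of debugging the cause becomes clear: errors were swallowed layer by layer, all context was lost, nothing recovered the panic, and the monitoring alerts were useless.

This is not an isolated case. Go's if err != nil is simple (check the error after every function call), but production error handling takes far more than a nil check. You need to preserve the error chain, define typed errors, recover gracefully from panics, build an error middleware chain, and connect errors to your observability stack. This article walks through 6 production-grade error handling patterns to help you build a robust error handling system in Go.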


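Key Takeaways

  • Error wrapping: use fmt.Errorf with the %w verb to keep the full error chain, instead of losing context through string concatenation
  • Custom error types: combine sentinel errors with custom structs to classify errors by business meaning
  • errors.Is/As: replace string matching with type-safe checks that still work on nested error chains
  • Panic recovery: recover from panics in HTTP middleware and in goroutines to prevent cascading crashes
  • Error middleware chain: weave logging, metrics, and tracing into one error handling flow
  • Production observability: use OpenTelemetry to connect errors to distributed tracing and alerting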

Table of Contents

  1. Error Handling Architecture Overview
  2. Pattern 1: Error Wrapping with fmt.Errorf and the %w Verb
  3. Pattern 2: Custom Error Types with Sentinel Errors
  4. Pattern 3: errors.Is / errors.As for Error Inspection
  5. Pattern 4: Panic Recovery Middleware
  6. Pattern 5: Error Middleware Chain
  7. Pattern 6: Production Error Observability with OpenTelemetry
  8. 5 Common Pitfalls and How to Fix Them
  9. 10 Common Errors: Troubleshooting Table
  10. Advanced Optimization Techniques
  11. Comparison of Error Handling Approaches
  12. Summary

Error Handling Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                    HTTP Request                              │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────────────┐
│  Middleware Chain                                             │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
│  │ Recovery │→│  Logging │→│ Metrics  │→│ Tracing  │    │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘    │
└──────────────────────────┬───────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────────────┐
│  Business Layer                                              │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │ fmt.Errorf  │  │ Custom Error │  │ errors.Is/As    │   │
│  │  (%w wrap)  │  │  (sentinel)  │  │  (inspection)   │   │
│  └─────────────┘  └──────────────┘  └──────────────────┘   │
└──────────────────────────┬───────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────────────┐
│  Observability Layer                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │  OpenTelemetry│  │   Alerting   │  │   Dashboard     │  │
│  │    Tracing    │  │   (PagerDuty)│  │   (Grafana)     │  │
│  └──────────────┘  └──────────────┘  └──────────────────┘  │
└──────────────────────────────────────────────────────────────┘

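Pattern 1: Error Wrapping with fmt.Errorf and the %w Verb

Error wrapping is the foundation of Go error handling. Core idea: wrap errors with the %w verb so the full error chain is preserved, instead of concatenating strings and losing the original error.

Anti-pattern: string concatenation loses context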

func getUser(id int) (*User, error) {
    row := db.QueryRow("SELECT * FROM users WHERE id = ?", id)
    var u User
    if err := row.Scan(&u.ID, &u.Name, &u.Email); err != nil {
        return nil, fmt.Errorf("scan user failed: %v", err)
    }
    return &u, nil
}

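%v converts the error to a string, so the original sql: no rows in result set becomes plain text that can no longer be inspected. The caller cannot use errors.Is or errors.As to determine the error type.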
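To make the difference concrete, here is a minimal, self-contained sketch that wraps the same cause once with %v and once with %w. The helper names (wrapWithV, wrapWithW, matchesNoRows) are illustrative and not part of the article's code.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// cause is the original low-level error used throughout the demo.
var cause = sql.ErrNoRows

// wrapWithV flattens the cause into text, so the error chain is lost.
func wrapWithV(err error) error {
	return fmt.Errorf("scan user failed: %v", err)
}

// wrapWithW keeps the cause reachable through Unwrap.
func wrapWithW(err error) error {
	return fmt.Errorf("scan user failed: %w", err)
}

// matchesNoRows reports whether sql.ErrNoRows is anywhere in the chain.
func matchesNoRows(err error) bool {
	return errors.Is(err, sql.ErrNoRows)
}

func main() {
	v, w := wrapWithV(cause), wrapWithW(cause)

	fmt.Println(v.Error() == w.Error()) // true: the messages are identical
	fmt.Println(matchesNoRows(v))       // false: only the text survived
	fmt.Println(matchesNoRows(w))       // true: the chain survived
}
```

Both errors print the same message, which is why this mistake is easy to miss in logs; only the inspection result differs.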

The right way: keep the error chain with %w

package repository

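import (
    "database/sql"
    "errors"
    "fmt"
    "log"
)

// db is assumed to be opened and assigned during application startup.
var db *sql.DB

type User struct {
    ID    int
    Name  string
    Email string
}

// GetUser loads one user and wraps any failure with the lookup key.
func GetUser(id int) (*User, error) {
    // List the columns explicitly so Scan always matches the result set.
    row := db.QueryRow("SELECT id, name, email FROM users WHERE id = ?", id)
    var u User
    if err := row.Scan(&u.ID, &u.Name, &u.Email); err != nil {
        return nil, fmt.Errorf("get user id=%d: %w", id, err)
    }
    return &u, nil
}

func HandleGetUser(id int) {
    user, err := GetUser(id)
    if err != nil {
        // errors.Is still finds sql.ErrNoRows because %w kept the chain intact.
        if errors.Is(err, sql.ErrNoRows) {
            log.Printf("user not found: id=%d", id)
            return
        }
        log.Printf("unexpected error: %v", err)
        return
    }
    _ = user
}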

Error chains wrapped across multiple layers

package service

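import (
    "database/sql"
    "errors"
    "fmt"
    "log"

    "myapp/repository"
)

func ProcessOrder(orderID int) error {
    user, err := repository.GetUser(orderID)
    if err != nil {
        return fmt.Errorf("process order %d: %w", orderID, err)
    }
    _ = user
    return nil
}

func HandleOrder(orderID int) {
    if err := ProcessOrder(orderID); err != nil {
        // The chain is now "process order" -> "get user" -> driver error,
        // and errors.Is walks every layer of it.
        if errors.Is(err, sql.ErrNoRows) {
            log.Printf("order %d: user not found: %v", orderID, err)
            return
        }
        log.Printf("order %d: database error: %v", orderID, err)
    }
}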

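Key rules

  • Always wrap with %w rather than %v, so callers can still inspect the cause
  • Add business context when wrapping (function name, parameters, a description of the operation)
  • Keep error chains to 3-4 layers at most; deeper chains suggest a problem with the abstraction layers
  • Wrap an error only once per layer, so the same error does not pick up redundant wrappers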
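The rules above are easier to judge when the chain is visible. The following sketch walks a two-layer chain with errors.Unwrap and prints each layer. The errNoRows sentinel stands in for sql.ErrNoRows, and chain, getUser, and processOrder are hypothetical helpers written for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRows stands in for sql.ErrNoRows so the example needs no database.
var errNoRows = errors.New("no rows in result set")

// getUser is the innermost layer: it adds the lookup key as context.
func getUser(id int) error {
	return fmt.Errorf("get user id=%d: %w", id, errNoRows)
}

// processOrder is the next layer up: it adds the order being processed.
func processOrder(orderID, userID int) error {
	if err := getUser(userID); err != nil {
		return fmt.Errorf("process order %d: %w", orderID, err)
	}
	return nil
}

// chain returns the message of every layer, outermost first.
func chain(err error) []string {
	var layers []string
	for e := err; e != nil; e = errors.Unwrap(e) {
		layers = append(layers, e.Error())
	}
	return layers
}

func main() {
	for i, layer := range chain(processOrder(42, 7)) {
		fmt.Printf("%d: %s\n", i, layer)
	}
	// 0: process order 42: get user id=7: no rows in result set
	// 1: get user id=7: no rows in result set
	// 2: no rows in result set
}
```

Each layer contributes one prefix, so the outermost message reads as a path from the business operation down to the root cause.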

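Pattern 2: Custom Error Types with Sentinel Errors

Custom error types let errors carry business semantics. Core idea: define predictable error states as sentinel errors, and carry structured context in custom structs.

Sentinel errors: predictable error values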

package apperr

import "errors"

var (
    ErrNotFound       = errors.New("resource not found")
    ErrUnauthorized   = errors.New("unauthorized access")
    ErrForbidden      = errors.New("forbidden operation")
    ErrConflict       = errors.New("resource conflict")
    ErrRateLimited    = errors.New("rate limit exceeded")
    ErrValidation     = errors.New("validation failed")
    ErrInternal       = errors.New("internal server error")
)

Custom error type: carrying structured context

package apperr

import (
    "fmt"
    "time"
)

type DomainError struct {
    Code      string
    Message   string
    Domain    string
    Timestamp time.Time
    Err       error
}

func (e *DomainError) Error() string {
    if e.Err != nil {
        return fmt.Sprintf("[%s] %s: %s: %v", e.Domain, e.Code, e.Message, e.Err)
    }
    return fmt.Sprintf("[%s] %s: %s", e.Domain, e.Code, e.Message)
}

func (e *DomainError) Unwrap() error {
    return e.Err
}

func NewDomainError(domain, code, message string, err error) *DomainError {
    return &DomainError{
        Code:      code,
        Message:   message,
        Domain:    domain,
        Timestamp: time.Now(),
        Err:       err,
    }
}

Using custom errors in the business layer

package order

import (
    "apperr"
    "fmt"
)

type OrderService struct {
    repo OrderRepository
}

func (s *OrderService) CreateOrder(req CreateOrderRequest) (*Order, error) {
    if err := validateOrder(req); err != nil {
        return nil, apperr.NewDomainError(
            "order",
            "VALIDATION_ERROR",
            fmt.Sprintf("invalid order request: user_id=%d", req.UserID),
            err,
        )
    }

    existing, err := s.repo.FindByUserAndProduct(req.UserID, req.ProductID)
    if err != nil {
        return nil, apperr.NewDomainError(
            "order",
            "DB_ERROR",
            "failed to check existing order",
            err,
        )
    }
    if existing != nil {
        return nil, apperr.NewDomainError(
            "order",
            "CONFLICT",
            fmt.Sprintf("duplicate order: user=%d product=%d", req.UserID, req.ProductID),
            apperr.ErrConflict,
        )
    }

    order := &Order{
        UserID:    req.UserID,
        ProductID: req.ProductID,
        Quantity:  req.Quantity,
    }
    if err := s.repo.Save(order); err != nil {
        return nil, apperr.NewDomainError(
            "order",
            "DB_ERROR",
            "failed to save order",
            err,
        )
    }
    return order, nil
}

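Key rules

  • Define sentinel errors with errors.New and name them with an Err prefix
  • A custom error type must implement both Error() and Unwrap()
  • Use upper snake case for error codes (VALIDATION_ERROR) so they are easy to search for in logs
  • Give each business domain its own error code namespace to avoid collisions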
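As a rough illustration of why Unwrap() matters, the sketch below wraps a sentinel inside a trimmed-down DomainError and then inspects it both ways. The struct is a simplified copy of the one above, and createOrder and classify are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrConflict is a sentinel: a fixed value callers can compare against.
var ErrConflict = errors.New("resource conflict")

// DomainError is a simplified version of the article's type.
type DomainError struct {
	Domain, Code, Message string
	Err                   error
}

func (e *DomainError) Error() string {
	if e.Err != nil {
		return fmt.Sprintf("[%s] %s: %s: %v", e.Domain, e.Code, e.Message, e.Err)
	}
	return fmt.Sprintf("[%s] %s: %s", e.Domain, e.Code, e.Message)
}

// Unwrap exposes the wrapped error so errors.Is and errors.As can see through it.
func (e *DomainError) Unwrap() error { return e.Err }

// createOrder returns a DomainError that wraps the sentinel, itself wrapped once more.
func createOrder() error {
	de := &DomainError{Domain: "order", Code: "CONFLICT", Message: "duplicate order", Err: ErrConflict}
	return fmt.Errorf("create order: %w", de)
}

// classify extracts the structured code and checks for the sentinel in one pass.
func classify(err error) (code string, isConflict bool) {
	var de *DomainError
	if errors.As(err, &de) {
		code = de.Code
	}
	return code, errors.Is(err, ErrConflict)
}

func main() {
	code, isConflict := classify(createOrder())
	fmt.Println(code, isConflict) // CONFLICT true
}
```

Without the Unwrap method the errors.As call would still succeed, but errors.Is could no longer reach ErrConflict behind the struct.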

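Pattern 3: errors.Is / errors.As for Error Inspection

Inspecting errors is the central operation of error handling. Core idea: use errors.Is to check for an error value and errors.As to extract an error type, replacing fragile string matching.

errors.Is: checking for a specific error value in the chain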

package handler

import (
    "apperr"
    "database/sql"
    "errors"
    "net/http"
)

func GetUserHandler(w http.ResponseWriter, r *http.Request) {
    id := r.URL.Query().Get("id")

    user, err := userService.GetUser(id)
    if err != nil {
        switch {
        case errors.Is(err, sql.ErrNoRows):
            http.Error(w, "user not found", http.StatusNotFound)
        case errors.Is(err, apperr.ErrUnauthorized):
            http.Error(w, "unauthorized", http.StatusUnauthorized)
        case errors.Is(err, apperr.ErrRateLimited):
            http.Error(w, "rate limited", http.StatusTooManyRequests)
        default:
            http.Error(w, "internal error", http.StatusInternalServerError)
        }
        return
    }

    writeJSON(w, http.StatusOK, user)
}

errors.As: extracting a specific type from the error chain

package handler

import (
    "apperr"
    "errors"
    "log"
    "net/http"
)

func CreateOrderHandler(w http.ResponseWriter, r *http.Request) {
    var req CreateOrderRequest
    if err := decodeJSON(r, &req); err != nil {
        http.Error(w, "bad request", http.StatusBadRequest)
        return
    }

    order, err := orderService.CreateOrder(req)
    if err != nil {
        var domainErr *apperr.DomainError
        if errors.As(err, &domainErr) {
            switch domainErr.Code {
            case "VALIDATION_ERROR":
                http.Error(w, domainErr.Message, http.StatusBadRequest)
            case "CONFLICT":
                http.Error(w, domainErr.Message, http.StatusConflict)
            case "DB_ERROR":
                log.Printf("database error: %v", err)
                http.Error(w, "internal error", http.StatusInternalServerError)
            default:
                http.Error(w, "internal error", http.StatusInternalServerError)
            }
            return
        }

        log.Printf("unhandled error: %v", err)
        http.Error(w, "internal error", http.StatusInternalServerError)
        return
    }

    writeJSON(w, http.StatusCreated, order)
}

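Common mistake: inspecting errors by string matching

// ❌ Wrong: string matching is fragile and unreliable
if strings.Contains(err.Error(), "not found") {
    // ...
}

// ✅ Right: check the error value with errors.Is
if errors.Is(err, apperr.ErrNotFound) {
    // ...
}

// ❌ Wrong: a type assertion does not walk the error chain
if de, ok := err.(*apperr.DomainError); ok {
    // ok is false here if err has been wrapped
}

// ✅ Right: errors.As walks the error chain
var de *apperr.DomainError
if errors.As(err, &de) {
    // extracted correctly even when err is wrapped several layers deep
}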

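Key rules

  • Never match on the string returned by err.Error()
  • Use errors.Is to check for sentinel errors and errors.As to extract custom error types
  • The second argument to errors.As must be a non-nil pointer to the target type
  • Map errors to responses in one place, the HTTP handler layer; the business layer only returns errors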
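The gap between a plain type assertion and errors.As can be checked with a small runnable sketch. ValidationError and the helper functions here are made up for the demonstration.

```go
package main

import (
	"errors"
	"fmt"
)

// ValidationError is a hypothetical custom error type.
type ValidationError struct{ Field string }

func (e *ValidationError) Error() string { return "invalid field: " + e.Field }

// validate returns the custom error directly.
func validate() error { return &ValidationError{Field: "email"} }

// createUser wraps it once, as a service layer typically would.
func createUser() error { return fmt.Errorf("create user: %w", validate()) }

// fieldByAssertion only looks at the outermost error value.
func fieldByAssertion(err error) (string, bool) {
	ve, ok := err.(*ValidationError)
	if !ok {
		return "", false
	}
	return ve.Field, true
}

// fieldByAs walks the whole chain; note that &ve is a pointer to the target type.
func fieldByAs(err error) (string, bool) {
	var ve *ValidationError
	if !errors.As(err, &ve) {
		return "", false
	}
	return ve.Field, true
}

func main() {
	err := createUser()

	_, ok := fieldByAssertion(err)
	fmt.Println(ok) // false: the outer error is a fmt wrapper, not *ValidationError

	field, ok := fieldByAs(err)
	fmt.Println(field, ok) // email true
}
```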

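Pattern 4: Panic Recovery Middleware

Panic is Go's "nuclear option": avoid it unless there is truly no alternative, but always have defenses in place. Core idea: recover from panics in HTTP middleware and at goroutine entry points to prevent cascading crashes.

HTTP Server Panic Recovery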

package middleware

import (
    "log"
    "net/http"
    "runtime/debug"
)

func Recovery(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            if rec := recover(); rec != nil {
                stack := debug.Stack()
                log.Printf(
                    "[PANIC] path=%s method=%s error=%v\n%s",
                    r.URL.Path, r.Method, rec, stack,
                )
                http.Error(w, "internal server error", http.StatusInternalServerError)
            }
        }()
        next.ServeHTTP(w, r)
    })
}
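One way to gain confidence in a recovery middleware is to drive it with net/http/httptest. The sketch below repeats the middleware in compact form and adds one assumption of mine that the listing above does not make: it re-panics on http.ErrAbortHandler, the sentinel net/http uses to abort a response on purpose. The panicky handler and statusFor helper are hypothetical.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"runtime/debug"
)

// Recovery converts a handler panic into a 500 response.
func Recovery(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			rec := recover()
			if rec == nil {
				return
			}
			// Assumption: deliberate aborts are passed through to net/http.
			if rec == http.ErrAbortHandler {
				panic(rec)
			}
			log.Printf("[PANIC] path=%s error=%v\n%s", r.URL.Path, rec, debug.Stack())
			http.Error(w, "internal server error", http.StatusInternalServerError)
		}()
		next.ServeHTTP(w, r)
	})
}

// panicky simulates a handler with a bug.
var panicky = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	panic("boom")
})

// healthy is a handler that completes normally.
var healthy = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusNoContent)
})

// statusFor runs one in-memory request and returns the response status.
func statusFor(h http.Handler) int {
	rr := httptest.NewRecorder()
	h.ServeHTTP(rr, httptest.NewRequest(http.MethodGet, "/pay", nil))
	return rr.Code
}

func main() {
	fmt.Println(statusFor(Recovery(panicky))) // 500
	fmt.Println(statusFor(Recovery(healthy))) // 204
}
```

If the handler has already written part of the response before panicking, the 500 status cannot replace the one already sent, so the log line remains the reliable signal.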

Goroutine Panic Recovery

package goroutine

import (
    "log"
    "runtime/debug"
)

func SafeGo(fn func(), panicHandler func(recover any, stack []byte)) {
    go func() {
        defer func() {
            if rec := recover(); rec != nil {
                stack := debug.Stack()
                log.Printf("[GOROUTINE PANIC] recovered=%v\n%s", rec, stack)
                if panicHandler != nil {
                    panicHandler(rec, stack)
                }
            }
        }()
        fn()
    }()
}

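Goroutine recovery with an error channel

package worker

import (
    "fmt"
    "runtime/debug"
)

type PanicError struct {
    Recover any
    Stack   []byte
}

func (e *PanicError) Error() string {
    return fmt.Sprintf("goroutine panic recovered: %v", e.Recover)
}

// SafeGoWithErr runs fn in a goroutine and always sends exactly one value on
// errCh: nil on success, fn's error, or a *PanicError if fn panicked.
// ProcessBatch relies on this to receive one result per item without blocking.
func SafeGoWithErr(fn func() error, errCh chan<- error) {
    go func() {
        var err error
        defer func() {
            if rec := recover(); rec != nil {
                err = &PanicError{Recover: rec, Stack: debug.Stack()}
            }
            errCh <- err
        }()
        err = fn()
    }()
}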

func ProcessBatch(items []Item) []error {
    var errs []error
    errCh := make(chan error, len(items))

    for _, item := range items {
        SafeGoWithErr(func() error {
            return processItem(item)
        }, errCh)
    }

    for i := 0; i < len(items); i++ {
        if err := <-errCh; err != nil {
            errs = append(errs, err)
        }
    }
    return errs
}

Recovery middleware for the Gin framework

package middleware

import (
    "log"
    "net/http"
    "runtime/debug"

    "github.com/gin-gonic/gin"
)

func GinRecovery() gin.HandlerFunc {
    return func(c *gin.Context) {
        defer func() {
            if rec := recover(); rec != nil {
                stack := debug.Stack()
                log.Printf(
                    "[PANIC] path=%s method=%s client_ip=%s error=%v\n%s",
                    c.Request.URL.Path,
                    c.Request.Method,
                    c.ClientIP(),
                    rec,
                    stack,
                )
                c.AbortWithStatusJSON(http.StatusInternalServerError, gin.H{
                    "error": "internal server error",
                })
            }
        }()
        c.Next()
    }
}

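Key rules

  • An HTTP server must install recovery as its outermost middleware
  • Every goroutine you start needs recovery at its entry point, otherwise a panic terminates the whole process
  • Always log the full stack trace after recover, otherwise the problem cannot be located
  • After recover, do not continue processing the request; just return a 500
  • Recover only the panics raised by your own request-handling code; some runtime failures, such as concurrent map writes, are fatal errors that recover cannot intercept and must be prevented instead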
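Outside of HTTP handlers, a common way to apply these rules is to turn a panic into an ordinary error at a function boundary. This is a minimal sketch using a named return value; safeCall and mustPositive are hypothetical names.

```go
package main

import (
	"fmt"
	"log"
	"runtime/debug"
)

// safeCall runs fn and converts a panic into a returned error.
// The named result err lets the deferred function overwrite the return value.
func safeCall(fn func() error) (err error) {
	defer func() {
		if rec := recover(); rec != nil {
			// Log the stack here, because it is gone once the function returns.
			log.Printf("[PANIC] %v\n%s", rec, debug.Stack())
			err = fmt.Errorf("recovered panic: %v", rec)
		}
	}()
	return fn()
}

// mustPositive panics on invalid input to simulate a programming error.
func mustPositive(n int) error {
	if n < 0 {
		panic("negative value")
	}
	return nil
}

func main() {
	fmt.Println(safeCall(func() error { return mustPositive(1) }))  // <nil>
	fmt.Println(safeCall(func() error { return mustPositive(-1) })) // recovered panic: negative value
}
```

The caller then handles the failure like any other error, which keeps panic handling confined to one small function.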

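Pattern 5: Error Middleware Chain

An error middleware chain weaves cross-cutting concerns into error handling in one place. Core idea: logging, metrics, and tracing are not scattered through business code; the middleware chain handles them uniformly.

Error middleware architecture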

Request → Recovery → Logging → Metrics → Tracing → Handler → Response
              │          │         │          │
              ▼          ▼         ▼          ▼
          log panic  log error  emit counter  span error

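Full implementation of the error middleware chain

package middleware

import (
    "log"
    "net/http"
    "runtime/debug"
    "time"

    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/codes"
)

// errorCounter and requestDuration (Prometheus-style metric vectors) and tracer
// (an OpenTelemetry tracer) are assumed to be initialized elsewhere in this package.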

type ResponseRecorder struct {
    http.ResponseWriter
    StatusCode int
    Body       []byte
}

func (r *ResponseRecorder) WriteHeader(code int) {
    r.StatusCode = code
    r.ResponseWriter.WriteHeader(code)
}

func (r *ResponseRecorder) Write(b []byte) (int, error) {
    r.Body = append(r.Body, b...)
    return r.ResponseWriter.Write(b)
}

func ErrorChain(next http.Handler) http.Handler {
    return Recovery(
        Logging(
            Metrics(
                Tracing(next),
            ),
        ),
    )
}

func Recovery(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            if rec := recover(); rec != nil {
                log.Printf("[PANIC] path=%s error=%v\n%s", r.URL.Path, rec, debug.Stack())
                http.Error(w, "internal server error", http.StatusInternalServerError)
            }
        }()
        next.ServeHTTP(w, r)
    })
}

func Logging(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        start := time.Now()
        rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}

        next.ServeHTTP(rec, r)

        duration := time.Since(start)
        if rec.StatusCode >= 400 {
            log.Printf(
                "[ERROR] method=%s path=%s status=%d duration=%s",
                r.Method, r.URL.Path, rec.StatusCode, duration,
            )
        } else {
            log.Printf(
                "[INFO] method=%s path=%s status=%d duration=%s",
                r.Method, r.URL.Path, rec.StatusCode, duration,
            )
        }
    })
}

func Metrics(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        start := time.Now()
        rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
        next.ServeHTTP(rec, r)

        statusCode := rec.StatusCode
        if statusCode >= 500 {
            errorCounter.WithLabelValues(r.URL.Path, "5xx").Inc()
        } else if statusCode >= 400 {
            errorCounter.WithLabelValues(r.URL.Path, "4xx").Inc()
        }
        // Observe the elapsed time in seconds, not the status code.
        requestDuration.WithLabelValues(r.URL.Path).Observe(time.Since(start).Seconds())
    })
}

func Tracing(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ctx, span := tracer.Start(r.Context(), r.URL.Path)
        defer span.End()

        rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
        next.ServeHTTP(rec, r.WithContext(ctx))

        if rec.StatusCode >= 400 {
            span.SetAttributes(attribute.Int("http.status_code", rec.StatusCode))
            span.SetStatus(codes.Error, http.StatusText(rec.StatusCode))
        }
    })
}

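Error middleware configured through an options struct

package middleware

import (
    "log"
    "net/http"
    "runtime/debug"

    "go.opentelemetry.io/otel/codes"
    "go.opentelemetry.io/otel/trace"
)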

type ErrorMiddlewareOption struct {
    LogErrors   bool
    EmitMetrics bool
    TraceErrors bool
    OnPanic     func(recover any, stack []byte)
}

func ErrorMiddleware(opts ErrorMiddlewareOption) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            defer func() {
                if rec := recover(); rec != nil {
                    stack := debug.Stack()
                    if opts.OnPanic != nil {
                        opts.OnPanic(rec, stack)
                    }
                    log.Printf("[PANIC] %v\n%s", rec, stack)
                    http.Error(w, "internal server error", http.StatusInternalServerError)
                }
            }()

            rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
            next.ServeHTTP(rec, r)

            if rec.StatusCode >= 400 {
                if opts.LogErrors {
                    log.Printf("[ERROR] path=%s status=%d", r.URL.Path, rec.StatusCode)
                }
                if opts.EmitMetrics {
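                    // Pass the same two labels (path, status class) as the Metrics middleware.
                    class := "4xx"
                    if rec.StatusCode >= 500 {
                        class = "5xx"
                    }
                    errorCounter.WithLabelValues(r.URL.Path, class).Inc()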
                }
                if opts.TraceErrors {
                    span := trace.SpanFromContext(r.Context())
                    span.SetStatus(codes.Error, http.StatusText(rec.StatusCode))
                }
            }
        })
    }
}

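Key rules

  • Recovery must be the outermost layer so that every panic is caught
  • Logging records the request context (path, method, status code, duration)
  • Metrics counts errors by path and status code so error rates can be computed
  • Tracing writes the error status into the span for distributed tracing
  • Middleware order: Recovery → Logging → Metrics → Tracing → Handler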
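The ordering rule can be verified mechanically. The sketch below uses a small Chain helper (my own naming, not from the article) that applies middlewares so the first one listed ends up outermost, and records the order in which each layer runs. The tag middleware is a stand-in for the real Recovery, Logging, Metrics, and Tracing layers.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware is the usual net/http decorator shape.
type Middleware func(http.Handler) http.Handler

// Chain wraps h so that the first middleware listed is the outermost one.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag is a stand-in middleware that only records when it runs.
func tag(name string, order *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*order = append(*order, name)
			next.ServeHTTP(w, r)
		})
	}
}

// runOrder sends one in-memory request and returns the execution order.
func runOrder() []string {
	var order []string
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		order = append(order, "handler")
	})
	h := Chain(final,
		tag("recovery", &order),
		tag("logging", &order),
		tag("metrics", &order),
		tag("tracing", &order),
	)
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return order
}

func main() {
	fmt.Println(runOrder()) // [recovery logging metrics tracing handler]
}
```

Listing the middlewares in reading order is less error-prone than nesting calls by hand, where the outermost layer is easy to misplace.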

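Pattern 6: Production Error Observability with OpenTelemetry

Production observability is the last mile of error handling. Core idea: use OpenTelemetry to connect errors to the three pillars of distributed tracing, metrics, and logs, giving end-to-end observability from detection to diagnosis.

OpenTelemetry error tracing integration

package telemetry

import (
    "context"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/codes"
    "go.opentelemetry.io/otel/trace"
)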

func RecordError(ctx context.Context, err error, attrs ...attribute.KeyValue) {
    span := trace.SpanFromContext(ctx)
    if !span.IsRecording() {
        return
    }

    span.SetStatus(codes.Error, err.Error())
    span.SetAttributes(attrs...)
    span.RecordError(err, trace.WithAttributes(attrs...))
}

func WrapWithSpan(ctx context.Context, operation string, fn func(ctx context.Context) error) error {
    ctx, span := otel.Tracer("app").Start(ctx, operation)
    defer span.End()

    if err := fn(ctx); err != nil {
        RecordError(ctx, err,
            attribute.String("operation", operation),
        )
        return err
    }
    return nil
}

Repository layer with error tracing

package repository

import (
    "context"
    "database/sql"
    "errors"
    "fmt"

    "apperr"
    "telemetry"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
)

type UserRepository struct {
    db *sql.DB
}

func (r *UserRepository) GetUser(ctx context.Context, id int) (*User, error) {
    ctx, span := otel.Tracer("repository").Start(ctx, "UserRepository.GetUser")
    defer span.End()

    span.SetAttributes(attribute.Int("user.id", id))

    row := r.db.QueryRowContext(ctx, "SELECT id, name, email FROM users WHERE id = ?", id)
    var u User
    if err := row.Scan(&u.ID, &u.Name, &u.Email); err != nil {
        // Use errors.Is rather than == so a wrapped ErrNoRows is still detected.
        if errors.Is(err, sql.ErrNoRows) {
            span.SetAttributes(attribute.String("error.type", "not_found"))
            return nil, fmt.Errorf("get user id=%d: %w", id, apperr.ErrNotFound)
        }
        telemetry.RecordError(ctx, err,
            attribute.String("db.operation", "SELECT"),
            attribute.Int("user.id", id),
        )
        return nil, fmt.Errorf("get user id=%d: %w", id, err)
    }

    span.SetAttributes(attribute.String("user.name", u.Name))
    return &u, nil
}

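Collecting error metrics

package metrics

import (
    "context"
    "fmt"

    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/metric"
)

var (
    errorCounter metric.Int64Counter
    errorLatency metric.Float64Histogram
)

// InitMetrics creates the instruments and reports failures instead of discarding them.
func InitMetrics(meter metric.Meter) error {
    var err error
    errorCounter, err = meter.Int64Counter(
        "app.errors.total",
        metric.WithDescription("Total number of errors"),
    )
    if err != nil {
        return fmt.Errorf("create error counter: %w", err)
    }
    errorLatency, err = meter.Float64Histogram(
        "app.errors.latency_seconds",
        metric.WithDescription("Error handling latency"),
    )
    if err != nil {
        return fmt.Errorf("create error latency histogram: %w", err)
    }
    return nil
}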

func RecordErrorMetric(domain, code string) {
    errorCounter.Add(context.Background(), 1,
        metric.WithAttributes(
            attribute.String("domain", domain),
            attribute.String("code", code),
        ),
    )
}

Error alerting rules

groups:
  - name: error_alerts
    rules:
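      - alert: HighErrorRate
        # Assumes the code label carries an HTTP status such as "500"; adjust the
        # matcher if code holds domain codes like DB_ERROR.
        expr: |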
          sum(rate(app_errors_total{code=~"5.."}[5m]))
          /
          sum(rate(http_requests_total[5m]))
          > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Error rate exceeds 5%"
          description: "5xx error rate is {{ $value | humanizePercentage }}"

      - alert: PanicDetected
        expr: increase(app_errors_total{code="PANIC"}[1m]) > 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Panic recovered in production"
          description: "{{ $value }} panic(s) detected in the last minute"

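Key rules

  • Every error must be recorded on at least one trace span
  • Classify error metrics by domain and code so they are easy to aggregate
  • Alerting rules distinguish 4xx (client errors) from 5xx (server errors)
  • Panic alerts must fire immediately (for: 0m) without waiting for errors to accumulate
  • Logs, metrics, and traces must all be correlated through the same trace_id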
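The last rule, correlating logs with traces, can be sketched with the standard library alone. In the example below the trace ID travels in the context under a hypothetical key (traceIDKey); in a real OpenTelemetry setup it would normally be read from the active span's context instead. The helper names are illustrative.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"log/slog"
	"strings"
)

// traceIDKey is a hypothetical context key standing in for the tracing SDK.
type traceIDKey struct{}

const demoTraceID = "4bf92f3577b34da6a3ce929d0e0e4736"

// withTraceID stores a trace ID in the context.
func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceIDKey{}, id)
}

// logError writes one structured record and attaches trace_id when present.
func logError(ctx context.Context, logger *slog.Logger, err error) {
	attrs := []slog.Attr{slog.String("error.message", err.Error())}
	if id, ok := ctx.Value(traceIDKey{}).(string); ok {
		attrs = append(attrs, slog.String("trace_id", id))
	}
	logger.LogAttrs(ctx, slog.LevelError, "request failed", attrs...)
}

// demo logs one error as JSON into a buffer and returns the output.
func demo() string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	ctx := withTraceID(context.Background(), demoTraceID)
	logError(ctx, logger, errors.New("get user id=7: resource not found"))
	return buf.String()
}

// hasTraceID reports whether the log line carries the demo trace ID.
func hasTraceID(line string) bool {
	return strings.Contains(line, `"trace_id":"`+demoTraceID+`"`)
}

func main() {
	out := demo()
	fmt.Print(out)
	fmt.Println(hasTraceID(out)) // true
}
```

With the same ID on the log record and the span, a single search in the log backend leads from an alert to the exact trace.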

5 Common Pitfalls and How to Fix Them

Pitfall 1: swallowed errors

// ❌ Wrong: the return values are ignored
_, _ = io.Copy(dst, src)

// ✅ Right: check the error
if _, err := io.Copy(dst, src); err != nil {
    return fmt.Errorf("copy data: %w", err)
}

Pitfall 2: wrapping that loses the original error

// ❌ Wrong: %v drops the error chain
return fmt.Errorf("query failed: %v", err)

// ✅ Right: %w keeps the error chain
return fmt.Errorf("query failed: %w", err)

Pitfall 3: an unrecovered panic inside a goroutine

// ❌ Wrong: a panic in this goroutine terminates the whole process
go func() {
    result := doSomething()
    ch <- result
}()

// ✅ Right: recover at the goroutine entry point
go func() {
    defer func() {
        if rec := recover(); rec != nil {
            log.Printf("goroutine panic: %v\n%s", rec, debug.Stack())
            ch <- nil // still send a value so the receiver does not block
        }
    }()
    result := doSomething()
    ch <- result
}()

Pitfall 4: checking errors by string matching

// ❌ Wrong: string matching is fragile
if strings.Contains(err.Error(), "timeout") {
    // ...
}

// ✅ Right: check with errors.Is
if errors.Is(err, context.DeadlineExceeded) {
    // ...
}
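A runnable sketch of the timeout check, assuming a slow operation that honors its context. slowQuery, run, and isTimeout are hypothetical names, and the durations are arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowQuery simulates work that takes 200ms unless the context ends first.
func slowQuery(ctx context.Context) error {
	select {
	case <-time.After(200 * time.Millisecond):
		return nil
	case <-ctx.Done():
		// Wrap ctx.Err() so the caller gets context plus an inspectable cause.
		return fmt.Errorf("query orders: %w", ctx.Err())
	}
}

// run calls slowQuery with a 10ms deadline, which is guaranteed to expire first.
func run() error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	return slowQuery(ctx)
}

// isTimeout checks the chain for the deadline sentinel instead of matching text.
func isTimeout(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	err := run()
	fmt.Println(err)            // query orders: context deadline exceeded
	fmt.Println(isTimeout(err)) // true
}
```

The check keeps working if the message wording changes or more wrapping layers are added, which a substring match on "timeout" would not.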

Pitfall 5: wrapping the same error repeatedly

// ❌ Wrong: every layer wraps again, so the message becomes redundant
func a() error {
    return fmt.Errorf("a: %w", err)
}
func b() error {
    return fmt.Errorf("b: %w", a())  // "b: a: original error"
}
func c() error {
    return fmt.Errorf("c: %w", b())  // "c: b: a: original error"
}

// ✅ Right: wrap only at the boundary layer and pass the error through internally
func a() error {
    return err  // passed through unchanged
}
func b() error {
    return a()
}
func c() error {
    return fmt.Errorf("process order: %w", b())  // wrapped once, at the boundary
}

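10 Common Errors: Troubleshooting Table

Symptom | Likely cause | How to investigate | Fix
sql: no rows in result set | QueryRow returned no rows | Check the SQL WHERE clause | Detect it with errors.Is(err, sql.ErrNoRows)
context deadline exceeded | The operation timed out | Check the context timeout settings | Increase the timeout or optimize query performance
fatal error: concurrent map writes | Concurrent writes to a map | Check for maps shared between goroutines | Use sync.Map or guard the map with a lock
panic: send on closed channel | Sending on a channel that is already closed | Check when the channel is closed | Control closing with sync.Once or directional channels
connection refused | Service not running or wrong port | Check the service status and port | Confirm the service is up and check the network configuration
i/o timeout | Network timeout | Check network connectivity | Increase the timeout and check firewall rules
record not found | The data does not exist | Check the query conditions | Distinguish "does not exist" from "query failed"
duplicate key value | Unique constraint violation | Check the data being inserted | Extract the PG error code with errors.As
panic: runtime error: invalid memory address or nil pointer dereference | Dereferencing a nil pointer | Check pointer initialization | Add nil checks and use the Recovery middleware
too many open files | File descriptors exhausted | Check connection pools and file operations | Raise ulimit and look for connection leaks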

Advanced Optimization Techniques

Technique 1: error grouping and aggregation

package apperr

import "strings"

type ErrorGroup struct {
    Errors []error
}

func (g *ErrorGroup) Add(err error) {
    if err != nil {
        g.Errors = append(g.Errors, err)
    }
}

func (g *ErrorGroup) Err() error {
    if len(g.Errors) == 0 {
        return nil
    }
    return g
}

func (g *ErrorGroup) Error() string {
    msgs := make([]string, len(g.Errors))
    for i, err := range g.Errors {
        msgs[i] = err.Error()
    }
    return strings.Join(msgs, "; ")
}

func (g *ErrorGroup) Unwrap() []error {
    return g.Errors
}
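Since Go 1.20 the standard library offers errors.Join, which covers much of what a hand-written group does. A minimal sketch, with a hypothetical validate function and sentinels redeclared locally so the block stands alone:

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrValidation = errors.New("validation failed")
	ErrConflict   = errors.New("resource conflict")
)

// validate collects every problem instead of stopping at the first one.
// errors.Join returns nil when there is nothing to join.
func validate(name string, qty int) error {
	var errs []error
	if name == "" {
		errs = append(errs, fmt.Errorf("name is empty: %w", ErrValidation))
	}
	if qty <= 0 {
		errs = append(errs, fmt.Errorf("quantity=%d: %w", qty, ErrValidation))
	}
	return errors.Join(errs...)
}

// isValidation and isConflict inspect the joined error like any other chain.
func isValidation(err error) bool { return errors.Is(err, ErrValidation) }
func isConflict(err error) bool   { return errors.Is(err, ErrConflict) }

func main() {
	fmt.Println(validate("widget", 3) == nil) // true

	err := validate("", 0)
	fmt.Println(isValidation(err)) // true
	fmt.Println(isConflict(err))   // false
	fmt.Println(err)               // two messages, one per line
}
```

errors.Is and errors.As descend into every joined error, the same behavior the Unwrap() []error method gives the ErrorGroup type above.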

Technique 2: retry with backoff

package retry

import (
    "context"
    "fmt"
    "math"
    "time"
)

type Config struct {
    MaxAttempts int
    BaseDelay   time.Duration
    MaxDelay    time.Duration
    Retryable   func(error) bool
}

func Do(ctx context.Context, cfg Config, fn func() error) error {
    var lastErr error
    for attempt := 0; attempt < cfg.MaxAttempts; attempt++ {
        err := fn()
        if err == nil {
            return nil
        }
        // A nil Retryable treats every error as retryable.
        if cfg.Retryable != nil && !cfg.Retryable(err) {
            return fmt.Errorf("non-retryable error: %w", err)
        }
        lastErr = err

        // Do not sleep after the final attempt.
        if attempt == cfg.MaxAttempts-1 {
            break
        }

        // Exponential backoff, capped at MaxDelay (also guards against overflow).
        delay := time.Duration(
            float64(cfg.BaseDelay) * math.Pow(2, float64(attempt)),
        )
        if delay > cfg.MaxDelay || delay < 0 {
            delay = cfg.MaxDelay
        }

        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-time.After(delay):
        }
    }
    return fmt.Errorf("max attempts (%d) exceeded: %w", cfg.MaxAttempts, lastErr)
}
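To show how such a helper behaves from the caller's side, here is a compact, self-contained variant together with a flaky operation. It is a simplified sketch rather than the Do function above: retry, flaky, runDemo, and errTransient are hypothetical names, and the backoff simply doubles a 1ms base delay.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errTransient marks failures that are worth retrying.
var errTransient = errors.New("transient failure")

// retry calls fn up to attempts times, doubling the delay between tries.
func retry(ctx context.Context, attempts int, base time.Duration, retryable func(error) bool, fn func() error) error {
	var lastErr error
	for i := 0; i < attempts; i++ {
		lastErr = fn()
		if lastErr == nil {
			return nil
		}
		if !retryable(lastErr) {
			return fmt.Errorf("non-retryable error: %w", lastErr)
		}
		if i == attempts-1 {
			break // no point in sleeping after the last attempt
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(base << i):
		}
	}
	return fmt.Errorf("max attempts (%d) exceeded: %w", attempts, lastErr)
}

// flaky returns an operation that fails the first `failures` calls, plus a call counter.
func flaky(failures int) (func() error, *int) {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return fmt.Errorf("call %d: %w", calls, errTransient)
		}
		return nil
	}, &calls
}

// runDemo retries a flaky operation and reports how many calls were made.
func runDemo(failures, attempts int) (int, error) {
	fn, calls := flaky(failures)
	isTransient := func(err error) bool { return errors.Is(err, errTransient) }
	err := retry(context.Background(), attempts, time.Millisecond, isTransient, fn)
	return *calls, err
}

func main() {
	calls, err := runDemo(2, 5)
	fmt.Println(calls, err) // 3 <nil>

	calls, err = runDemo(10, 3)
	fmt.Println(calls, err) // 3 max attempts (3) exceeded: call 3: transient failure
}
```

Because the final error wraps the last failure with %w, callers can still use errors.Is on the result to see why the retries were exhausted.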

Technique 3: mapping error codes to HTTP status

package handler

import (
    "apperr"
    "errors"
    "net/http"
)

var codeToStatus = map[string]int{
    "NOT_FOUND":       http.StatusNotFound,
    "VALIDATION_ERROR": http.StatusBadRequest,
    "UNAUTHORIZED":    http.StatusUnauthorized,
    "FORBIDDEN":       http.StatusForbidden,
    "CONFLICT":        http.StatusConflict,
    "RATE_LIMITED":    http.StatusTooManyRequests,
    "DB_ERROR":        http.StatusInternalServerError,
}

func MapErrorToHTTP(err error) (int, string) {
    var domainErr *apperr.DomainError
    if errors.As(err, &domainErr) {
        if status, ok := codeToStatus[domainErr.Code]; ok {
            return status, domainErr.Message
        }
    }

    if errors.Is(err, apperr.ErrNotFound) {
        return http.StatusNotFound, "resource not found"
    }
    if errors.Is(err, apperr.ErrUnauthorized) {
        return http.StatusUnauthorized, "unauthorized"
    }

    return http.StatusInternalServerError, "internal server error"
}

Technique 4: structured error logging

package logger

import (
    "context"
    "errors"
    "log/slog"

    "apperr"
)

// LogError emits one structured record; extra carries caller-supplied attributes.
func LogError(ctx context.Context, err error, extra ...slog.Attr) {
    attrs := []slog.Attr{
        slog.String("error.message", err.Error()),
    }

    var domainErr *apperr.DomainError
    if errors.As(err, &domainErr) {
        attrs = append(attrs,
            slog.String("error.domain", domainErr.Domain),
            slog.String("error.code", domainErr.Code),
            slog.Time("error.timestamp", domainErr.Timestamp),
        )
    }

    attrs = append(attrs, extra...)
    // Pass the request context rather than nil so handlers can read values from it.
    slog.LogAttrs(ctx, slog.LevelError, "error occurred", attrs...)
}

Technique 5: error assertion helpers

package apperr

import "errors"

func IsNotFound(err error) bool {
    return errors.Is(err, ErrNotFound)
}

func IsConflict(err error) bool {
    return errors.Is(err, ErrConflict)
}

func IsRateLimited(err error) bool {
    return errors.Is(err, ErrRateLimited)
}

func GetDomainError(err error) (*DomainError, bool) {
    var de *DomainError
    return de, errors.As(err, &de)
}

func GetCode(err error) string {
    var de *DomainError
    if errors.As(err, &de) {
        return de.Code
    }
    return "UNKNOWN"
}

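Comparison of Error Handling Approaches

Aspect | fmt.Errorf %w | Custom error types | errors.Is/As | Panic recovery | Error middleware chain | OpenTelemetry
Preserves the error chain
Carries business semantics
Type-safe inspection
Prevents process crashes
Unified cross-cutting handling
Distributed tracing (partial)
Metrics collection
Implementation complexity
Production necessity | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐
Applicable layer | All | Business layer | Inspection layer | Runtime | HTTP layer | Observability layer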

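Summary

Error handling in Go is not "just add if err != nil". It has to answer five questions: where did the error come from, what kind of error is it, how is it inspected, how do we recover, and how do we observe it? fmt.Errorf with %w answers "where from", custom error types answer "what kind", errors.Is/As answers "how to inspect", recovery middleware answers "how to recover", and the error middleware chain together with OpenTelemetry answers "how to observe". Master these 6 patterns and you have the core methodology of production-grade Go error handling.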

