Go Error Handling Best Practices: 6 Production Patterns from Error Wrapping to Recovery


When Lost Error Context Meets a Panic Avalanche: Go Error Handling's Darkest Hour

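It is 2 a.m. and the error rate of the production payment service has jumped to 30%. The logs are full of "database error: sql: no rows in result set", with no hint of which endpoint, which SQL statement, or which business scenario produced it. Worse, an unrecovered panic goes off in a goroutine and takes the whole HTTP service down with it, so clients see 502s. Four hours of debugging later the cause is clear: errors were swallowed layer by layer, all context was lost, the panic had no recover, and the monitoring alerts were useless.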

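This is not an isolated case. Go's if err != nil is simple (check the error after every function call), but error handling in production takes far more than a nil check. You need to preserve the error chain, define typed errors, recover from panics gracefully, build an error middleware chain, and plug into your observability stack. This article walks through 6 production-grade error handling patterns to help you build a robust error handling system in Go.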


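Key Takeaways

  • Error wrapping: use fmt.Errorf("%w", err) to keep the full error chain instead of losing context through string concatenation
  • Custom error types: combine sentinel errors with custom structs to classify errors by business meaning
  • errors.Is/As: replace string matching with type-safe checks that work through nested error chains
  • Panic recovery: recover panics in HTTP middleware and in goroutines to prevent cascading crashes
  • Error middleware chain: weave logging, metrics, and tracing into one error handling flow
  • Production observability: use OpenTelemetry to connect errors to distributed tracing and alerting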

Contents

  1. Error Handling Architecture Overview
  2. Pattern 1:Error Wrapping with fmt.Errorf and %w verb
  3. Pattern 2:Custom Error Types with Sentinel Errors
  4. Pattern 3:errors.Is / errors.As for Error Inspection
  5. Pattern 4:Panic Recovery Middleware
  6. Pattern 5:Error Middleware Chain
  7. Pattern 6:Production Error Observability with OpenTelemetry
  8. 5 Common Pitfalls and How to Fix Them
  9. Troubleshooting Table: 10 Common Errors
  10. Advanced Techniques
  11. Comparison of Error Handling Approaches
  12. Summary

Error Handling Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                    HTTP Request                              │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────────────┐
│  Middleware Chain                                             │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
│  │ Recovery │→│  Logging │→│ Metrics  │→│ Tracing  │    │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘    │
└──────────────────────────┬───────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────────────┐
│  Business Layer                                              │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │ fmt.Errorf  │  │ Custom Error │  │ errors.Is/As    │   │
│  │  (%w wrap)  │  │  (sentinel)  │  │  (inspection)   │   │
│  └─────────────┘  └──────────────┘  └──────────────────┘   │
└──────────────────────────┬───────────────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────────────┐
│  Observability Layer                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │  OpenTelemetry│  │   Alerting   │  │   Dashboard     │  │
│  │    Tracing    │  │   (PagerDuty)│  │   (Grafana)     │  │
│  └──────────────┘  └──────────────┘  └──────────────────┘  │
└──────────────────────────────────────────────────────────────┘

Pattern 1:Error Wrapping with fmt.Errorf and %w verb

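Error wrapping is the foundation of Go error handling. The core idea: wrap errors with the %w verb so the full error chain is preserved, instead of formatting them into strings and losing the original error.

Anti-pattern: string formatting loses context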

func getUser(id int) (*User, error) {
    row := db.QueryRow("SELECT * FROM users WHERE id = ?", id)
    var u User
    if err := row.Scan(&u.ID, &u.Name, &u.Email); err != nil {
        return nil, fmt.Errorf("scan user failed: %v", err)
    }
    return &u, nil
}

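%v flattens the error into a string, so the original sql: no rows in result set becomes plain text that can no longer be inspected. Callers cannot use errors.Is or errors.As to determine what kind of error it was.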
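
To make the difference concrete, here is a minimal sketch that wraps the same cause both ways and then asks errors.Is about each result. The helper names wrapV, wrapW, and inspect are made up for this illustration; only fmt.Errorf, errors.Is, and sql.ErrNoRows come from the standard library.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// wrapV formats the cause with %v: the result only carries the cause's text.
func wrapV(err error) error { return fmt.Errorf("scan user failed: %v", err) }

// wrapW formats the cause with %w: the result keeps a reference to the cause.
func wrapW(err error) error { return fmt.Errorf("scan user failed: %w", err) }

// inspect reports whether sql.ErrNoRows is still reachable after wrapping
// with %v and with %w respectively.
func inspect() (viaV, viaW bool) {
	viaV = errors.Is(wrapV(sql.ErrNoRows), sql.ErrNoRows)
	viaW = errors.Is(wrapW(sql.ErrNoRows), sql.ErrNoRows)
	return viaV, viaW
}

func main() {
	// Both errors print the same message ...
	fmt.Println(wrapV(sql.ErrNoRows).Error() == wrapW(sql.ErrNoRows).Error()) // true
	// ... but only the %w version can still be inspected.
	viaV, viaW := inspect()
	fmt.Println(viaV, viaW) // false true
}
```

The two errors are indistinguishable in a log line, which is why this mistake tends to go unnoticed until a caller needs to branch on the cause.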

The right way: preserve the error chain with %w

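package repository

import (
    "database/sql"
    "errors"
    "fmt"
    "log"
)

// db is assumed to be opened and configured elsewhere in this package.
var db *sql.DB

type User struct {
    ID    int
    Name  string
    Email string
}

// GetUser loads one user and wraps any failure with the lookup key.
func GetUser(id int) (*User, error) {
    // List the columns explicitly so they always match the Scan targets.
    row := db.QueryRow("SELECT id, name, email FROM users WHERE id = ?", id)
    var u User
    if err := row.Scan(&u.ID, &u.Name, &u.Email); err != nil {
        // %w keeps err in the chain, so callers can still match sql.ErrNoRows.
        return nil, fmt.Errorf("get user id=%d: %w", id, err)
    }
    return &u, nil
}

func HandleGetUser(id int) {
    user, err := GetUser(id)
    if err != nil {
        if errors.Is(err, sql.ErrNoRows) {
            log.Printf("user not found: id=%d", id)
            return
        }
        log.Printf("unexpected error: %v", err)
        return // do not fall through with a nil user
    }
    _ = user
}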

Multi-layer wrapped error chains

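package service

import (
    "database/sql"
    "errors"
    "fmt"
    "log"

    "myapp/repository"
)

func ProcessOrder(orderID int) error {
    // Simplified for the example: the order ID doubles as the user ID.
    user, err := repository.GetUser(orderID)
    if err != nil {
        return fmt.Errorf("process order %d: %w", orderID, err)
    }
    _ = user
    return nil
}

func HandleOrder(orderID int) {
    if err := ProcessOrder(orderID); err != nil {
        // The sentinel is still reachable through both layers of wrapping.
        if errors.Is(err, sql.ErrNoRows) {
            log.Printf("order %d refers to a missing user: %v", orderID, err)
            return
        }
        log.Printf("database error: %v", err)
    }
}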

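Key rules

  • Wrap with %w rather than %v so callers can still inspect the underlying error
  • Add business context when wrapping (function name, arguments, a description of the operation)
  • Keep error chains to 3-4 layers at most; deeper chains suggest a problem with the abstraction layers
  • Wrap once at layer boundaries so the same error is not re-wrapped at every level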
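
One way to check how deep a chain has grown is to walk it with errors.Unwrap. The sketch below is illustrative only: getUser, processOrder, chain, and the errNoRows stand-in are hypothetical names, and the walker follows single-error Unwrap() methods only (it does not descend into errors that wrap several causes).

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRows is a stand-in for a driver-level error such as sql.ErrNoRows.
var errNoRows = errors.New("no rows in result set")

// getUser is the data layer: it wraps the cause with the lookup key.
func getUser(id int) error {
	return fmt.Errorf("get user id=%d: %w", id, errNoRows)
}

// processOrder is the service layer: it adds the order being processed.
func processOrder(orderID, userID int) error {
	if err := getUser(userID); err != nil {
		return fmt.Errorf("process order %d: %w", orderID, err)
	}
	return nil
}

// chain returns the message of every error in the chain, outermost first.
func chain(err error) []string {
	var msgs []string
	for ; err != nil; err = errors.Unwrap(err) {
		msgs = append(msgs, err.Error())
	}
	return msgs
}

func main() {
	for depth, msg := range chain(processOrder(42, 7)) {
		fmt.Printf("%d: %s\n", depth, msg)
	}
	// 0: process order 42: get user id=7: no rows in result set
	// 1: get user id=7: no rows in result set
	// 2: no rows in result set
}
```

Each layer contributes exactly one piece of context, which is what keeps the outermost message readable.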

Pattern 2:Custom Error Types with Sentinel Errors

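Custom error types let errors carry business meaning. The core idea: use sentinel errors for predictable error states and custom structs for structured context.

Sentinel errors: predictable error values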

package apperr

import "errors"

var (
    ErrNotFound       = errors.New("resource not found")
    ErrUnauthorized   = errors.New("unauthorized access")
    ErrForbidden      = errors.New("forbidden operation")
    ErrConflict       = errors.New("resource conflict")
    ErrRateLimited    = errors.New("rate limit exceeded")
    ErrValidation     = errors.New("validation failed")
    ErrInternal       = errors.New("internal server error")
)

Custom error types: carrying structured context

package apperr

import (
    "fmt"
    "time"
)

type DomainError struct {
    Code      string
    Message   string
    Domain    string
    Timestamp time.Time
    Err       error
}

func (e *DomainError) Error() string {
    if e.Err != nil {
        return fmt.Sprintf("[%s] %s: %s: %v", e.Domain, e.Code, e.Message, e.Err)
    }
    return fmt.Sprintf("[%s] %s: %s", e.Domain, e.Code, e.Message)
}

func (e *DomainError) Unwrap() error {
    return e.Err
}

func NewDomainError(domain, code, message string, err error) *DomainError {
    return &DomainError{
        Code:      code,
        Message:   message,
        Domain:    domain,
        Timestamp: time.Now(),
        Err:       err,
    }
}

Using custom errors in the business layer

package order

import (
    "apperr"
    "fmt"
)

type OrderService struct {
    repo OrderRepository
}

func (s *OrderService) CreateOrder(req CreateOrderRequest) (*Order, error) {
    if err := validateOrder(req); err != nil {
        return nil, apperr.NewDomainError(
            "order",
            "VALIDATION_ERROR",
            fmt.Sprintf("invalid order request: user_id=%d", req.UserID),
            err,
        )
    }

    existing, err := s.repo.FindByUserAndProduct(req.UserID, req.ProductID)
    if err != nil {
        return nil, apperr.NewDomainError(
            "order",
            "DB_ERROR",
            "failed to check existing order",
            err,
        )
    }
    if existing != nil {
        return nil, apperr.NewDomainError(
            "order",
            "CONFLICT",
            fmt.Sprintf("duplicate order: user=%d product=%d", req.UserID, req.ProductID),
            apperr.ErrConflict,
        )
    }

    order := &Order{
        UserID:    req.UserID,
        ProductID: req.ProductID,
        Quantity:  req.Quantity,
    }
    if err := s.repo.Save(order); err != nil {
        return nil, apperr.NewDomainError(
            "order",
            "DB_ERROR",
            "failed to save order",
            err,
        )
    }
    return order, nil
}

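Key rules

  • Define sentinel errors with errors.New and give them names starting with Err
  • A custom error type must implement Error(), plus Unwrap() when it wraps another error
  • Write error codes in upper snake case (VALIDATION_ERROR) so they are easy to search for in logs
  • Give each business domain its own error code space to avoid collisions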
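
The Unwrap() rule is easy to get wrong silently, so here is a small sketch of what happens without it. The two types below (opaqueError and domainError) and the reachable helper are hypothetical, trimmed-down illustrations rather than the DomainError defined above.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrConflict plays the role of a sentinel error.
var ErrConflict = errors.New("resource conflict")

// opaqueError stores a cause but has no Unwrap method, so the cause is
// invisible to errors.Is and errors.As.
type opaqueError struct{ err error }

func (e *opaqueError) Error() string { return "order failed: " + e.err.Error() }

// domainError stores a cause and exposes it through Unwrap.
type domainError struct {
	Code string
	Err  error
}

func (e *domainError) Error() string { return e.Code + ": " + e.Err.Error() }
func (e *domainError) Unwrap() error { return e.Err }

// reachable reports whether ErrConflict can be found in err's chain.
func reachable(err error) bool { return errors.Is(err, ErrConflict) }

func main() {
	fmt.Println(reachable(&opaqueError{err: ErrConflict}))                   // false
	fmt.Println(reachable(&domainError{Code: "CONFLICT", Err: ErrConflict})) // true
}
```

Both types print the cause in their message, so the missing method only shows up when a caller tries to branch on the sentinel.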

Pattern 3:errors.Is / errors.As for Error Inspection

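Inspecting errors is the core operation of error handling. The core idea: check error values with errors.Is and extract error types with errors.As, replacing fragile string matching.

errors.Is: checking for a specific error value in the chain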

package handler

import (
    "apperr"
    "database/sql"
    "errors"
    "net/http"
)

func GetUserHandler(w http.ResponseWriter, r *http.Request) {
    id := r.URL.Query().Get("id")

    user, err := userService.GetUser(id)
    if err != nil {
        switch {
        case errors.Is(err, sql.ErrNoRows):
            http.Error(w, "user not found", http.StatusNotFound)
        case errors.Is(err, apperr.ErrUnauthorized):
            http.Error(w, "unauthorized", http.StatusUnauthorized)
        case errors.Is(err, apperr.ErrRateLimited):
            http.Error(w, "rate limited", http.StatusTooManyRequests)
        default:
            http.Error(w, "internal error", http.StatusInternalServerError)
        }
        return
    }

    writeJSON(w, http.StatusOK, user)
}

errors.As: extracting a specific type from the chain

package handler

import (
    "apperr"
    "errors"
    "log"
    "net/http"
)

func CreateOrderHandler(w http.ResponseWriter, r *http.Request) {
    var req CreateOrderRequest
    if err := decodeJSON(r, &req); err != nil {
        http.Error(w, "bad request", http.StatusBadRequest)
        return
    }

    order, err := orderService.CreateOrder(req)
    if err != nil {
        var domainErr *apperr.DomainError
        if errors.As(err, &domainErr) {
            switch domainErr.Code {
            case "VALIDATION_ERROR":
                http.Error(w, domainErr.Message, http.StatusBadRequest)
            case "CONFLICT":
                http.Error(w, domainErr.Message, http.StatusConflict)
            case "DB_ERROR":
                log.Printf("database error: %v", err)
                http.Error(w, "internal error", http.StatusInternalServerError)
            default:
                http.Error(w, "internal error", http.StatusInternalServerError)
            }
            return
        }

        log.Printf("unhandled error: %v", err)
        http.Error(w, "internal error", http.StatusInternalServerError)
        return
    }

    writeJSON(w, http.StatusCreated, order)
}

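Common mistake: checking errors by string matching

// ❌ Wrong: string matching is fragile and unreliable
if strings.Contains(err.Error(), "not found") {
    // ...
}

// ✅ Right: check the error value with errors.Is
if errors.Is(err, apperr.ErrNotFound) {
    // ...
}

// ❌ Wrong: a type assertion does not walk the error chain
if de, ok := err.(*apperr.DomainError); ok {
    // if err has been wrapped, ok is false here
}

// ✅ Right: errors.As walks the error chain
var de *apperr.DomainError
if errors.As(err, &de) {
    // extracted correctly even when err is wrapped several layers deep
}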

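Key rules

  • Never match on the string returned by err.Error()
  • Use errors.Is to check for sentinel errors and errors.As to extract custom error types
  • The second argument to errors.As must be a non-nil pointer to the target type
  • Map errors to responses in one place in the HTTP handler layer; the business layer only returns errors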
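
The contrast between a type assertion and errors.As can be run end to end. In this sketch ValidationError, wrapped, fieldByAssertion, and fieldByAs are hypothetical names chosen for the demo; the behaviour shown is that of the standard errors package.

```go
package main

import (
	"errors"
	"fmt"
)

// ValidationError is a hypothetical custom error type.
type ValidationError struct{ Field string }

func (e *ValidationError) Error() string { return "invalid field: " + e.Field }

// wrapped returns a ValidationError hidden behind one layer of %w wrapping.
func wrapped() error {
	return fmt.Errorf("create order: %w", &ValidationError{Field: "quantity"})
}

// fieldByAssertion uses a plain type assertion, which only looks at the
// outermost error value.
func fieldByAssertion(err error) (string, bool) {
	ve, ok := err.(*ValidationError)
	if !ok {
		return "", false
	}
	return ve.Field, true
}

// fieldByAs uses errors.As, which walks the chain until it finds a match.
func fieldByAs(err error) (string, bool) {
	var ve *ValidationError
	if !errors.As(err, &ve) {
		return "", false
	}
	return ve.Field, true
}

func main() {
	err := wrapped()

	_, ok := fieldByAssertion(err)
	fmt.Println(ok) // false

	field, ok := fieldByAs(err)
	fmt.Println(field, ok) // quantity true
}
```

Note that errors.As receives &ve, a pointer to a variable of the target type; passing ve itself would panic at run time.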

Pattern 4:Panic Recovery Middleware

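Panic is Go's "nuclear option": avoid it unless there is no alternative, but always have defenses in place. The core idea: recover panics in HTTP middleware and at every goroutine entry point to prevent cascading crashes.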

HTTP Server Panic Recovery

package middleware

import (
    "log"
    "net/http"
    "runtime/debug"
)

func Recovery(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            if rec := recover(); rec != nil {
                stack := debug.Stack()
                log.Printf(
                    "[PANIC] path=%s method=%s error=%v\n%s",
                    r.URL.Path, r.Method, rec, stack,
                )
                http.Error(w, "internal server error", http.StatusInternalServerError)
            }
        }()
        next.ServeHTTP(w, r)
    })
}

Goroutine Panic Recovery

package goroutine

import (
    "log"
    "runtime/debug"
)

func SafeGo(fn func(), panicHandler func(recover any, stack []byte)) {
    go func() {
        defer func() {
            if rec := recover(); rec != nil {
                stack := debug.Stack()
                log.Printf("[GOROUTINE PANIC] recovered=%v\n%s", rec, stack)
                if panicHandler != nil {
                    panicHandler(rec, stack)
                }
            }
        }()
        fn()
    }()
}

Goroutine recovery with an error channel

package worker

import (
    "fmt"
    "runtime/debug"
)

type PanicError struct {
    Recover any
    Stack   []byte
}

func (e *PanicError) Error() string {
    return fmt.Sprintf("goroutine panic recovered: %v", e.Recover)
}

// SafeGoWithErr runs fn in a goroutine and sends exactly one value on errCh:
// fn's result (nil on success) or a *PanicError if fn panicked.
func SafeGoWithErr(fn func() error, errCh chan<- error) {
    go func() {
        defer func() {
            if rec := recover(); rec != nil {
                errCh <- &PanicError{
                    Recover: rec,
                    Stack:   debug.Stack(),
                }
            }
        }()
        errCh <- fn()
    }()
}

func ProcessBatch(items []Item) []error {
    var errs []error
    errCh := make(chan error, len(items))

    for _, item := range items {
        // Per-iteration loop variables need Go 1.22+; copy item first on older versions.
        SafeGoWithErr(func() error {
            return processItem(item)
        }, errCh)
    }

    // Every goroutine sends once, so len(items) receives cannot block forever.
    for i := 0; i < len(items); i++ {
        if err := <-errCh; err != nil {
            errs = append(errs, err)
        }
    }
    return errs
}

Recovery middleware for the Gin framework

package middleware

import (
    "log"
    "net/http"
    "runtime/debug"

    "github.com/gin-gonic/gin"
)

func GinRecovery() gin.HandlerFunc {
    return func(c *gin.Context) {
        defer func() {
            if rec := recover(); rec != nil {
                stack := debug.Stack()
                log.Printf(
                    "[PANIC] path=%s method=%s client_ip=%s error=%v\n%s",
                    c.Request.URL.Path,
                    c.Request.Method,
                    c.ClientIP(),
                    rec,
                    stack,
                )
                c.AbortWithStatusJSON(http.StatusInternalServerError, gin.H{
                    "error": "internal server error",
                })
            }
        }()
        c.Next()
    }
}

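Key rules

  • An HTTP server must install recovery as the outermost middleware
  • Every goroutine needs recovery at its entry point; an unrecovered panic in any goroutine terminates the whole process
  • Always log the full stack trace after recover, otherwise the problem cannot be located
  • Do not keep processing the request after recover; return a 500 and stop
  • Recover only panics from your own code paths. Some runtime failures, such as concurrent map writes, are fatal errors that recover cannot catch, so they have to be prevented with synchronization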
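
A recovery middleware is easy to verify without starting a server. The sketch below uses net/http/httptest to drive a handler that panics and checks that the client still receives a 500. The handler names and the /pay path are made up for the demo, and re-panicking on http.ErrAbortHandler is an addition on top of the middleware shown earlier: that sentinel is how net/http handlers abort a response on purpose, so it is passed through untouched.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"runtime/debug"
)

// Recovery converts a panic in next into a logged 500 response.
func Recovery(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			rec := recover()
			if rec == nil {
				return
			}
			// Deliberate aborts are not failures; let net/http handle them.
			if rec == http.ErrAbortHandler {
				panic(rec)
			}
			log.Printf("[PANIC] path=%s error=%v\n%s", r.URL.Path, rec, debug.Stack())
			http.Error(w, "internal server error", http.StatusInternalServerError)
		}()
		next.ServeHTTP(w, r)
	})
}

// panicking simulates a handler with a bug.
var panicking = http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
	panic("boom")
})

// healthy simulates a handler that works.
var healthy = http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
	fmt.Fprint(w, "ok")
})

// statusFor runs h behind Recovery and returns the recorded status code.
func statusFor(h http.Handler) int {
	rr := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/pay", nil)
	Recovery(h).ServeHTTP(rr, req)
	return rr.Code
}

func main() {
	fmt.Println(statusFor(panicking)) // 500
	fmt.Println(statusFor(healthy))   // 200
}
```

Keep in mind that this middleware only protects the goroutine serving the request; goroutines started by a handler still need their own recover.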

Pattern 5:Error Middleware Chain

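An error middleware chain weaves cross-cutting concerns into error handling in one place. The core idea: logging, metrics, and tracing are not scattered through business code but handled uniformly by the middleware chain.

Error middleware architecture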

Request → Recovery → Logging → Metrics → Tracing → Handler → Response
              │          │         │          │
              ▼          ▼         ▼          ▼
          log panic  log error  emit counter  span error

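A complete error middleware chain

package middleware

import (
    "log"
    "net/http"
    "runtime/debug"
    "time"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/codes"
)

// tracer is the package-level tracer used by the Tracing middleware.
var tracer = otel.Tracer("middleware")

// errorCounter and requestDuration are assumed to be Prometheus-style
// CounterVec and HistogramVec values registered elsewhere in this package.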

type ResponseRecorder struct {
    http.ResponseWriter
    StatusCode int
    Body       []byte
}

func (r *ResponseRecorder) WriteHeader(code int) {
    r.StatusCode = code
    r.ResponseWriter.WriteHeader(code)
}

func (r *ResponseRecorder) Write(b []byte) (int, error) {
    r.Body = append(r.Body, b...)
    return r.ResponseWriter.Write(b)
}

func ErrorChain(next http.Handler) http.Handler {
    return Recovery(
        Logging(
            Metrics(
                Tracing(next),
            ),
        ),
    )
}

func Recovery(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            if rec := recover(); rec != nil {
                log.Printf("[PANIC] path=%s error=%v\n%s", r.URL.Path, rec, debug.Stack())
                http.Error(w, "internal server error", http.StatusInternalServerError)
            }
        }()
        next.ServeHTTP(w, r)
    })
}

func Logging(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        start := time.Now()
        rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}

        next.ServeHTTP(rec, r)

        duration := time.Since(start)
        if rec.StatusCode >= 400 {
            log.Printf(
                "[ERROR] method=%s path=%s status=%d duration=%s",
                r.Method, r.URL.Path, rec.StatusCode, duration,
            )
        } else {
            log.Printf(
                "[INFO] method=%s path=%s status=%d duration=%s",
                r.Method, r.URL.Path, rec.StatusCode, duration,
            )
        }
    })
}

func Metrics(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        start := time.Now()
        rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
        next.ServeHTTP(rec, r)

        statusCode := rec.StatusCode
        if statusCode >= 500 {
            errorCounter.WithLabelValues(r.URL.Path, "5xx").Inc()
        } else if statusCode >= 400 {
            errorCounter.WithLabelValues(r.URL.Path, "4xx").Inc()
        }
        // Observe the elapsed time in seconds, not the status code.
        requestDuration.WithLabelValues(r.URL.Path).Observe(time.Since(start).Seconds())
    })
}

func Tracing(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ctx, span := tracer.Start(r.Context(), r.URL.Path)
        defer span.End()

        rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
        next.ServeHTTP(rec, r.WithContext(ctx))

        if rec.StatusCode >= 400 {
            span.SetAttributes(attribute.Int("http.status_code", rec.StatusCode))
            span.SetStatus(codes.Error, http.StatusText(rec.StatusCode))
        }
    })
}

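Error middleware configured through an options struct

package middleware

import (
    "log"
    "net/http"
    "runtime/debug"

    "go.opentelemetry.io/otel/codes"
    "go.opentelemetry.io/otel/trace"
)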

type ErrorMiddlewareOption struct {
    LogErrors   bool
    EmitMetrics bool
    TraceErrors bool
    OnPanic     func(recover any, stack []byte)
}

func ErrorMiddleware(opts ErrorMiddlewareOption) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            defer func() {
                if rec := recover(); rec != nil {
                    stack := debug.Stack()
                    if opts.OnPanic != nil {
                        opts.OnPanic(rec, stack)
                    }
                    log.Printf("[PANIC] %v\n%s", rec, stack)
                    http.Error(w, "internal server error", http.StatusInternalServerError)
                }
            }()

            rec := &ResponseRecorder{ResponseWriter: w, StatusCode: http.StatusOK}
            next.ServeHTTP(rec, r)

            if rec.StatusCode >= 400 {
                if opts.LogErrors {
                    log.Printf("[ERROR] path=%s status=%d", r.URL.Path, rec.StatusCode)
                }
                if opts.EmitMetrics {
                    errorCounter.WithLabelValues(r.URL.Path).Inc()
                }
                if opts.TraceErrors {
                    span := trace.SpanFromContext(r.Context())
                    span.SetStatus(codes.Error, http.StatusText(rec.StatusCode))
                }
            }
        })
    }
}

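Key rules

  • Recovery must be the outermost layer so that every panic is caught
  • Logging records the request context (path, method, status code, duration)
  • Metrics count errors by path and status class; label by route pattern rather than raw URL path to keep label cardinality bounded
  • Tracing writes the error status onto the span so it shows up in distributed traces
  • Middleware order: Recovery → Logging → Metrics → Tracing → Handler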
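
Nesting calls by hand, as in Recovery(Logging(Metrics(Tracing(next)))), makes the order hard to read and easy to reverse by accident. A small helper can make the order explicit. This is a sketch under assumptions: Middleware, Chain, and tag are hypothetical names, and tag only records when a request enters each layer so the resulting order can be checked.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware is the usual net/http wrapper signature.
type Middleware func(http.Handler) http.Handler

// Chain wraps h so that the first middleware listed is the outermost layer.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag returns a middleware that records its name when a request enters it.
func tag(name string, order *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*order = append(*order, name)
			next.ServeHTTP(w, r)
		})
	}
}

// entryOrder sends one request through the chain and returns the order in
// which the layers saw it.
func entryOrder() []string {
	var order []string
	final := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		order = append(order, "handler")
	})
	h := Chain(final,
		tag("recovery", &order),
		tag("logging", &order),
		tag("metrics", &order),
		tag("tracing", &order),
	)
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return order
}

func main() {
	fmt.Println(entryOrder()) // [recovery logging metrics tracing handler]
}
```

Because the loop applies the middlewares from last to first, the argument list reads in the same order as the request flows.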

Pattern 6:Production Error Observability with OpenTelemetry

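Production observability is the last mile of error handling. The core idea: use OpenTelemetry to connect errors to the three pillars of distributed tracing, metrics, and logs, so that an error can be followed end to end from detection to root cause.

OpenTelemetry error tracing integration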

package telemetry

import (
    "context"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/codes"
    "go.opentelemetry.io/otel/trace"
)

func RecordError(ctx context.Context, err error, attrs ...attribute.KeyValue) {
    span := trace.SpanFromContext(ctx)
    if !span.IsRecording() {
        return
    }

    span.SetStatus(codes.Error, err.Error())
    span.SetAttributes(attrs...)
    span.RecordError(err, trace.WithAttributes(attrs...))
}

func WrapWithSpan(ctx context.Context, operation string, fn func(ctx context.Context) error) error {
    ctx, span := otel.Tracer("app").Start(ctx, operation)
    defer span.End()

    if err := fn(ctx); err != nil {
        RecordError(ctx, err,
            attribute.String("operation", operation),
        )
        return err
    }
    return nil
}

A repository layer with error tracing

package repository

import (
    "context"
    "database/sql"
    "errors"
    "fmt"

    "apperr"
    "telemetry"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
)

type UserRepository struct {
    db *sql.DB
}

func (r *UserRepository) GetUser(ctx context.Context, id int) (*User, error) {
    ctx, span := otel.Tracer("repository").Start(ctx, "UserRepository.GetUser")
    defer span.End()

    span.SetAttributes(attribute.Int("user.id", id))

    row := r.db.QueryRowContext(ctx, "SELECT id, name, email FROM users WHERE id = ?", id)
    var u User
    if err := row.Scan(&u.ID, &u.Name, &u.Email); err != nil {
        // errors.Is also matches if the sentinel arrives wrapped.
        if errors.Is(err, sql.ErrNoRows) {
            span.SetAttributes(attribute.String("error.type", "not_found"))
            return nil, fmt.Errorf("get user id=%d: %w", id, apperr.ErrNotFound)
        }
        telemetry.RecordError(ctx, err,
            attribute.String("db.operation", "SELECT"),
            attribute.Int("user.id", id),
        )
        return nil, fmt.Errorf("get user id=%d: %w", id, err)
    }

    span.SetAttributes(attribute.String("user.name", u.Name))
    return &u, nil
}

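Collecting error metrics

package metrics

import (
    "context"
    "fmt"

    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/metric"
)

var (
    errorCounter metric.Int64Counter
    errorLatency metric.Float64Histogram
)

// InitMetrics creates the instruments and reports creation failures
// instead of discarding them.
func InitMetrics(meter metric.Meter) error {
    var err error
    errorCounter, err = meter.Int64Counter(
        "app.errors.total",
        metric.WithDescription("Total number of errors"),
    )
    if err != nil {
        return fmt.Errorf("create error counter: %w", err)
    }
    errorLatency, err = meter.Float64Histogram(
        "app.errors.latency_seconds",
        metric.WithDescription("Error handling latency"),
    )
    if err != nil {
        return fmt.Errorf("create error latency histogram: %w", err)
    }
    return nil
}

func RecordErrorMetric(domain, code string) {
    errorCounter.Add(context.Background(), 1,
        metric.WithAttributes(
            attribute.String("domain", domain),
            attribute.String("code", code),
        ),
    )
}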

Error alerting rules

groups:
  - name: error_alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(app_errors_total{code=~"5.."}[5m]))
          /
          sum(rate(http_requests_total[5m]))
          > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Error rate exceeds 5%"
          description: "5xx error rate is {{ $value | humanizePercentage }}"

      - alert: PanicDetected
        expr: increase(app_errors_total{code="PANIC"}[1m]) > 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Panic recovered in production"
          description: "{{ $value }} panic(s) detected in the last minute"

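Key rules

  • Every error must be recorded on at least one trace span
  • Label error metrics by domain and code so they are easy to aggregate
  • Alerting rules distinguish 4xx (client errors) from 5xx (server errors)
  • Panic alerts must fire immediately (for: 0m) rather than wait for occurrences to accumulate
  • Logs, metrics, and traces must be correlated through the same trace_id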
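
The last rule, correlating everything through one trace_id, can be sketched with the standard library alone. The code below is a simplified stand-in, not OpenTelemetry itself: withTraceID and the traceIDKey context key are hypothetical, and the ID is a hard-coded sample value. With the OpenTelemetry SDK in place, the ID would instead be read from the active span, for example via trace.SpanContextFromContext(ctx).TraceID().

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"log/slog"
	"strings"
)

// traceIDKey is a private context key for the demo trace ID.
type traceIDKey struct{}

// sampleTraceID is an arbitrary 32-hex-digit value used for illustration.
const sampleTraceID = "4bf92f3577b34da6a3ce929d0e0e4736"

// withTraceID stands in for what a tracing SDK does when a span starts.
func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceIDKey{}, id)
}

// logError writes err as a structured record tagged with the trace ID
// carried by ctx, so the log line can be joined with the trace later.
func logError(ctx context.Context, logger *slog.Logger, err error) {
	id, _ := ctx.Value(traceIDKey{}).(string)
	logger.LogAttrs(ctx, slog.LevelError, "request failed",
		slog.String("trace_id", id),
		slog.String("error", err.Error()),
	)
}

// demo logs one error into a buffer and returns the JSON line.
func demo() string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	ctx := withTraceID(context.Background(), sampleTraceID)
	logError(ctx, logger, errors.New("get user id=7: resource not found"))
	return buf.String()
}

// hasTraceID reports whether a log line carries the sample trace ID.
func hasTraceID(line string) bool {
	return strings.Contains(line, `"trace_id":"`+sampleTraceID+`"`)
}

func main() {
	line := demo()
	fmt.Print(line)
	fmt.Println(hasTraceID(line)) // true
}
```

Once every log record carries the same identifier as the span, an alert can link straight to the trace that produced it.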

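5 Common Pitfalls and How to Fix Them

Pitfall 1: swallowed errors

// ❌ Wrong: the return values are ignored
_, _ = io.Copy(dst, src)

// ✅ Right: check the error (the byte count is not needed here)
if _, err := io.Copy(dst, src); err != nil {
    return fmt.Errorf("copy data: %w", err)
}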
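
Errors are also commonly swallowed in deferred cleanup, where a bare defer f.Close() discards the result. One possible pattern is a named return value combined with errors.Join (Go 1.20+). In this sketch fakeFile, errFlush, and save are hypothetical names standing in for a real file and its failure mode.

```go
package main

import (
	"errors"
	"fmt"
)

// errFlush simulates a failure that only shows up when the file is closed.
var errFlush = errors.New("flush to disk failed")

// fakeFile is a stand-in for a file whose Close can fail.
type fakeFile struct{ closeErr error }

func (f *fakeFile) Write(p []byte) (int, error) { return len(p), nil }
func (f *fakeFile) Close() error                { return f.closeErr }

// save writes data and reports a Close failure instead of discarding it.
// The named result err lets the deferred function amend the return value.
func save(f *fakeFile, data []byte) (err error) {
	defer func() {
		if cerr := f.Close(); cerr != nil {
			// Join keeps both the write error (if any) and the close error.
			err = errors.Join(err, fmt.Errorf("close: %w", cerr))
		}
	}()
	if _, werr := f.Write(data); werr != nil {
		return fmt.Errorf("write: %w", werr)
	}
	return nil
}

// closeFailureSurfaced reports whether a failing Close reaches the caller.
func closeFailureSurfaced() bool {
	err := save(&fakeFile{closeErr: errFlush}, []byte("payload"))
	return errors.Is(err, errFlush)
}

func main() {
	fmt.Println(closeFailureSurfaced())                       // true
	fmt.Println(save(&fakeFile{}, []byte("payload")) == nil) // true
}
```

For read-only resources ignoring the Close error is often acceptable; for writes it can mean silently losing data.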

Pitfall 2: wrapping that loses the original error

// ❌ Wrong: %v breaks the error chain
return fmt.Errorf("query failed: %v", err)

// ✅ Right: %w preserves the error chain
return fmt.Errorf("query failed: %w", err)

Pitfall 3: a panic in a goroutine with no recover

// ❌ Wrong: a panic in this goroutine terminates the whole process
go func() {
    result := doSomething()
    ch <- result
}()

// ✅ Right: recover at the goroutine entry point
go func() {
    defer func() {
        if rec := recover(); rec != nil {
            log.Printf("goroutine panic: %v\n%s", rec, debug.Stack())
            ch <- nil // still send, so the receiver is not left waiting
        }
    }()
    result := doSomething()
    ch <- result
}()

Pitfall 4: checking errors by string matching

// ❌ Wrong: string matching is fragile
if strings.Contains(err.Error(), "timeout") {
    // ...
}

// ✅ Right: check with errors.Is
if errors.Is(err, context.DeadlineExceeded) {
    // ...
}

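Pitfall 5: wrapping the same error repeatedly

// ❌ Wrong: every layer wraps again, so the message becomes redundant
// (err stands for an error produced inside a)
func a() error {
    return fmt.Errorf("a: %w", err)
}
func b() error {
    return fmt.Errorf("b: %w", a())  // "b: a: original error"
}
func c() error {
    return fmt.Errorf("c: %w", b())  // "c: b: a: original error"
}

// ✅ Right: wrap only at the boundary, pass the error through internally
func a() error {
    return err  // passed through unchanged
}
func b() error {
    return a()
}
func c() error {
    return fmt.Errorf("process order: %w", b())  // wrapped once, at the boundary
}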

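Troubleshooting Table: 10 Common Errors

Symptom | Likely cause | How to investigate | Fix
sql: no rows in result set | QueryRow returned no rows | Check the SQL WHERE clause | Detect it with errors.Is(err, sql.ErrNoRows)
context deadline exceeded | The operation timed out | Check the context timeout settings | Raise the timeout or optimize the query
fatal error: concurrent map writes | Concurrent writes to a map | Look for maps shared between goroutines | Use sync.Map or guard the map with a mutex
panic: send on closed channel | Sending on a channel that is already closed | Check when the channel is closed | Control closing with sync.Once or directional channels
connection refused | Service not running or wrong port | Check service status and port | Make sure the service is up; check the network configuration
i/o timeout | Network timeout | Check network connectivity | Raise the timeout; check firewall rules
record not found | The data does not exist | Check the query conditions | Distinguish "does not exist" from "query failed"
duplicate key value | Unique constraint violation | Check the data being inserted | Extract the PostgreSQL error code with errors.As
panic: nil pointer dereference | Dereferencing a nil pointer | Check pointer initialization | Add nil checks; install the Recovery middleware
too many open files | File descriptors exhausted | Check connection pools and file handling | Raise ulimit; look for connection leaks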
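
Several rows of the table can be told apart in code without looking at message text. The sketch below is one possible classifier; classify, the category strings, and the three sample errors are hypothetical, while the sentinels and the net.Error interface come from the standard library. Because context.DeadlineExceeded also reports Timeout() == true, it is checked before the generic network case.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
	"net"
)

// classify maps an error to a coarse category using errors.Is and errors.As
// rather than inspecting the message.
func classify(err error) string {
	var netErr net.Error
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, sql.ErrNoRows):
		return "not_found"
	case errors.Is(err, context.DeadlineExceeded):
		return "deadline_exceeded"
	case errors.Is(err, context.Canceled):
		return "canceled"
	case errors.As(err, &netErr) && netErr.Timeout():
		return "network_timeout"
	default:
		return "internal"
	}
}

// Sample errors, each wrapped once the way a service layer would wrap them.
var (
	errLookup   = fmt.Errorf("get user id=7: %w", sql.ErrNoRows)
	errDeadline = fmt.Errorf("call payment gateway: %w", context.DeadlineExceeded)
	errDial     = fmt.Errorf("dial inventory service: %w",
		&net.DNSError{Err: "i/o timeout", Name: "inventory.internal", IsTimeout: true})
	errOther = errors.New("unexpected state")
)

func main() {
	fmt.Println(classify(errLookup))   // not_found
	fmt.Println(classify(errDeadline)) // deadline_exceeded
	fmt.Println(classify(errDial))     // network_timeout
	fmt.Println(classify(errOther))    // internal
}
```

A classifier like this also gives "does not exist" and "query failed" different categories, which is the distinction the record not found row asks for.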

Advanced Techniques

Technique 1: grouping and aggregating errors

package apperr

import "strings"

type ErrorGroup struct {
    Errors []error
}

func (g *ErrorGroup) Add(err error) {
    if err != nil {
        g.Errors = append(g.Errors, err)
    }
}

func (g *ErrorGroup) Err() error {
    if len(g.Errors) == 0 {
        return nil
    }
    return g
}

func (g *ErrorGroup) Error() string {
    msgs := make([]string, len(g.Errors))
    for i, err := range g.Errors {
        msgs[i] = err.Error()
    }
    return strings.Join(msgs, "; ")
}

func (g *ErrorGroup) Unwrap() []error {
    return g.Errors
}
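
Since Go 1.20 the standard library covers much of the same ground with errors.Join, which may be enough when no custom formatting is needed. A minimal sketch, with validate and matches as hypothetical helpers:

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrValidation = errors.New("validation failed")
	ErrConflict   = errors.New("resource conflict")
)

// validate collects every problem it finds instead of stopping at the first.
func validate(quantity int, duplicate bool) error {
	var errs []error
	if quantity <= 0 {
		errs = append(errs, fmt.Errorf("quantity=%d: %w", quantity, ErrValidation))
	}
	if duplicate {
		errs = append(errs, fmt.Errorf("order already exists: %w", ErrConflict))
	}
	// Join returns nil when there is nothing to join.
	return errors.Join(errs...)
}

// matches reports whether target is found anywhere in err's tree.
func matches(err, target error) bool { return errors.Is(err, target) }

func main() {
	err := validate(0, true)
	fmt.Println(matches(err, ErrValidation)) // true
	fmt.Println(matches(err, ErrConflict))   // true
	fmt.Println(validate(3, false) == nil)   // true
}
```

errors.Is and errors.As search every joined error, the same way they search a type that implements Unwrap() []error.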

Technique 2: retries with backoff

package retry

import (
    "context"
    "fmt"
    "math"
    "time"
)

type Config struct {
    MaxAttempts int
    BaseDelay   time.Duration
    MaxDelay    time.Duration
    Retryable   func(error) bool
}

func Do(ctx context.Context, cfg Config, fn func() error) error {
    var lastErr error
    for attempt := 0; attempt < cfg.MaxAttempts; attempt++ {
        err := fn()
        if err == nil {
            return nil
        }
        if !cfg.Retryable(err) {
            return fmt.Errorf("non-retryable error: %w", err)
        }
        lastErr = err

        // No point sleeping after the final attempt.
        if attempt == cfg.MaxAttempts-1 {
            break
        }

        delay := time.Duration(
            float64(cfg.BaseDelay) * math.Pow(2, float64(attempt)),
        )
        if delay > cfg.MaxDelay {
            delay = cfg.MaxDelay
        }

        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-time.After(delay):
        }
    }
    return fmt.Errorf("max attempts (%d) exceeded: %w", cfg.MaxAttempts, lastErr)
}
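
The delay schedule is the part of a retry loop that is easiest to get subtly wrong, so it can help to isolate it. The sketch below computes the same doubling-with-cap schedule using integer arithmetic, which avoids overflow for large attempt numbers; backoff and delaysMillis are hypothetical helpers, and base and max are assumed to be positive.

```go
package main

import (
	"fmt"
	"time"
)

// backoff returns the wait before retry number attempt (0-based): base
// doubled attempt times, capped at max.
func backoff(base, max time.Duration, attempt int) time.Duration {
	delay := base
	for i := 0; i < attempt; i++ {
		// Stop before doubling would pass the cap (this also avoids overflow).
		if delay > max/2 {
			return max
		}
		delay *= 2
	}
	if delay > max {
		return max
	}
	return delay
}

// delaysMillis lists the first n delays for base=100ms and max=1s.
func delaysMillis(n int) []int64 {
	out := make([]int64, n)
	for i := range out {
		out[i] = backoff(100*time.Millisecond, time.Second, i).Milliseconds()
	}
	return out
}

func main() {
	fmt.Println(delaysMillis(6)) // [100 200 400 800 1000 1000]
}
```

Production retry loops usually add random jitter on top of this schedule so that many clients do not retry in lockstep; it is left out here to keep the output deterministic.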

Technique 3: mapping error codes to HTTP status

package handler

import (
    "apperr"
    "errors"
    "net/http"
)

var codeToStatus = map[string]int{
    "NOT_FOUND":       http.StatusNotFound,
    "VALIDATION_ERROR": http.StatusBadRequest,
    "UNAUTHORIZED":    http.StatusUnauthorized,
    "FORBIDDEN":       http.StatusForbidden,
    "CONFLICT":        http.StatusConflict,
    "RATE_LIMITED":    http.StatusTooManyRequests,
    "DB_ERROR":        http.StatusInternalServerError,
}

func MapErrorToHTTP(err error) (int, string) {
    var domainErr *apperr.DomainError
    if errors.As(err, &domainErr) {
        if status, ok := codeToStatus[domainErr.Code]; ok {
            return status, domainErr.Message
        }
    }

    if errors.Is(err, apperr.ErrNotFound) {
        return http.StatusNotFound, "resource not found"
    }
    if errors.Is(err, apperr.ErrUnauthorized) {
        return http.StatusUnauthorized, "unauthorized"
    }

    return http.StatusInternalServerError, "internal server error"
}

Technique 4: structured error logging

package logger

import (
    "context"
    "errors"
    "log/slog"

    "apperr"
)

// LogError logs err at error level, adding DomainError fields when present.
func LogError(err error, extra ...slog.Attr) {
    attrs := []slog.Attr{
        slog.String("error.message", err.Error()),
    }

    var domainErr *apperr.DomainError
    if errors.As(err, &domainErr) {
        attrs = append(attrs,
            slog.String("error.domain", domainErr.Domain),
            slog.String("error.code", domainErr.Code),
            slog.Time("error.timestamp", domainErr.Timestamp),
        )
    }

    attrs = append(attrs, extra...)
    // Pass a real context rather than nil.
    slog.LogAttrs(context.Background(), slog.LevelError, "error occurred", attrs...)
}

Technique 5: error assertion helpers

package apperr

import "errors"

func IsNotFound(err error) bool {
    return errors.Is(err, ErrNotFound)
}

func IsConflict(err error) bool {
    return errors.Is(err, ErrConflict)
}

func IsRateLimited(err error) bool {
    return errors.Is(err, ErrRateLimited)
}

func GetDomainError(err error) (*DomainError, bool) {
    var de *DomainError
    return de, errors.As(err, &de)
}

func GetCode(err error) string {
    var de *DomainError
    if errors.As(err, &de) {
        return de.Code
    }
    return "UNKNOWN"
}

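Comparison of Error Handling Approaches

Feature | fmt.Errorf %w | Custom error types | errors.Is/As | Panic recovery | Error middleware chain | OpenTelemetry
Preserves the error chain
Carries business semantics
Type-safe inspection
Prevents process crashes
Unified cross-cutting handling
Distributed tracing | partial
Metrics collection
Implementation complexity
Production necessity | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐
Applies at | All layers | Business layer | Inspection layer | Runtime | HTTP layer | Observability layer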

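Summary

Error handling in Go is not "add an if err != nil and move on". It has to answer five questions: where did the error come from, what kind of error is it, how is it inspected, how do we recover, and how do we observe it? fmt.Errorf("%w", err) answers "where from", custom error types answer "what kind", errors.Is/As answers "how to inspect", recovery middleware answers "how to recover", and OpenTelemetry answers "how to observe". Master these 6 patterns and you have the core methodology of production-grade error handling in Go.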

