Building fault-tolerant decorator with best practices (#759)

Testing in Production

Yes, you read that right. Testing in production — not instead of staging, but in addition to it. Here's why and how.

Why Staging Lies

Staging environments differ from production in subtle but critical ways:

  • Different data volumes (10K rows vs 10M rows)
  • Different traffic patterns (no real users)
  • Different infrastructure (smaller instances)
  • Different integrations (sandbox APIs)

Canary Deployments

Route a small percentage of traffic to the new version:

# nginx.conf
upstream backend {
    server app-v1:8080 weight=95;  # stable version: about 95 of every 100 requests
    server app-v2:8080 weight=5;   # canary: about 5 of every 100 requests
}

Compare the canary's error rates, latency percentiles, and business metrics against the stable version. If the canary degrades on any of them, roll back automatically.
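
What "degrades" means has to be spelled out before a rollback can be automated. The following is a minimal sketch of one possible decision rule, not a description of any particular deployment tool. The Metrics class, the should_roll_back function, and all threshold values are hypothetical and would need tuning for a real service.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Aggregated metrics for one version over the same time window (hypothetical shape)."""
    requests: int
    errors: int
    p99_latency_ms: float

    @property
    def error_rate(self) -> float:
        # Avoid division by zero when a version has received no traffic yet.
        return self.errors / self.requests if self.requests else 0.0


def should_roll_back(
    stable: Metrics,
    canary: Metrics,
    max_error_rate_delta: float = 0.01,  # assumed threshold: +1 percentage point
    max_latency_ratio: float = 1.5,      # assumed threshold: p99 at most 1.5x stable
    min_requests: int = 100,             # assumed minimum sample size
) -> bool:
    """Return True when the canary looks worse than the stable version."""
    # Too little canary traffic: do not decide on noise.
    if canary.requests < min_requests:
        return False
    # Error rate regressed by more than the allowed margin.
    if canary.error_rate - stable.error_rate > max_error_rate_delta:
        return True
    # Tail latency regressed by more than the allowed ratio.
    if stable.p99_latency_ms > 0:
        if canary.p99_latency_ms / stable.p99_latency_ms > max_latency_ratio:
            return True
    return False
```

A check like this would run on a schedule against whatever metrics store is in use. Business metrics such as conversion rate could be added as further comparisons in the same style.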

Feature Flags

Decouple deployment from release:

  • Deploy code to 100% of servers
  • Enable feature for 1% of users
  • Gradually increase to 5%, 25%, 100%
  • Kill switch: disable instantly without redeployment
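
The steps above can be sketched with a deterministic hash, so that a given user stays in the same bucket as the percentage grows. This is an illustrative sketch only, not the API of any feature-flag product. The function names bucket and is_enabled, the flag name, and the idea of passing the kill switch as a parameter are all assumptions.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    # Hashing the flag name together with the user ID gives each flag
    # its own independent split of the user base.
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag: str, user_id: str, rollout_percent: int,
               kill_switch: bool = False) -> bool:
    """Decide whether a user sees the feature at the current rollout percentage."""
    # The kill switch wins over everything else and needs no redeployment,
    # assuming it is read from runtime configuration.
    if kill_switch:
        return False
    # Users in buckets below the threshold are in. Raising the threshold
    # from 1 to 5 to 25 only ever adds users, it never removes any.
    return bucket(flag, user_id) < rollout_percent
```

Because the bucket depends only on the flag name and the user ID, a user enabled at 5% is still enabled at 25%, which keeps the experience consistent during the ramp-up.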

Observability

You can't test what you can't see. Invest in:

  1. Structured logging (JSON, correlation IDs)
  2. Distributed tracing (OpenTelemetry)
  3. Custom metrics (business KPIs, not just CPU/memory)
  4. Alerting (on symptoms, not causes)
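
As an illustration of the first item, here is a minimal sketch of JSON logging with a correlation ID, built on Python's standard logging module. The JsonFormatter class, the build_logger and new_correlation_id helpers, the "checkout" logger name, and the choice of fields are assumptions made for this example. A production setup would more likely use an established logging library.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Set by the caller via extra={"correlation_id": ...}; None if absent.
            "correlation_id": getattr(record, "correlation_id", None),
        }
        return json.dumps(payload)


def build_logger(name: str = "checkout") -> logging.Logger:
    """Create a logger that writes JSON lines to stderr (hypothetical helper)."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]   # replace handlers so repeated calls do not duplicate output
    logger.setLevel(logging.INFO)
    logger.propagate = False      # keep the root logger from printing a second copy
    return logger


def new_correlation_id() -> str:
    """Generate an ID to attach to every log line of one request."""
    return uuid.uuid4().hex


if __name__ == "__main__":
    log = build_logger()
    cid = new_correlation_id()
    log.info("order placed", extra={"correlation_id": cid})
    log.info("payment captured", extra={"correlation_id": cid})
```

With the same correlation ID on every line of a request, a single query in the log store can reconstruct what one production request did across services.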
