Why Pino Logs Disappeared in Kubernetes — and the Three PRs It Took to Get Them Back

Jee-eun Kang
April 6, 2026

Context

In the last post, I described building a sprint planning system and deciding that structured logging should be Sprint 1’s first deliverable. The reasoning: you can’t debug regressions without logs, and the codebase had zero structured logging. Every console.log was a fire-and-forget string that vanished the moment a Kubernetes pod recycled.

This post is about what happened when I built the logging system, deployed it to our Azure AKS cluster, and discovered that pino produces absolutely nothing in a Next.js standalone Docker container — silently, without errors.

It took 5 PRs and 4 days to ship what should have been a straightforward library integration. Here’s what went wrong.

What I Built

The logging system itself is small — three files, ~95 lines of production code, 100% test coverage:

logger.ts — Server-side pino instance with environment-aware configuration:

  • JSON output in production, pretty-printed in development
  • ISO 8601 timestamps, service metadata (brs-client, environment name)
  • Automatic redaction of sensitive fields: authorization headers, cookies, tokens, passwords
  • Configurable log level via LOG_LEVEL env var with validation
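The bullets above can be sketched as a pino options object. This is a minimal illustration, not the actual logger.ts: the `service`/`env` field names match the log samples later in the post, but the validation helper and redaction list are assumptions. The options are shown as a plain object (without importing pino) so the shape is visible on its own.

```typescript
// Sketch of the environment-aware configuration logger.ts might use.
// resolveLogLevel and the exact redaction paths are assumptions.
const VALID_LEVELS = ["trace", "debug", "info", "warn", "error", "fatal"] as const;

// LOG_LEVEL is validated; unknown values fall back to "info".
export function resolveLogLevel(raw: string | undefined): string {
  return raw && (VALID_LEVELS as readonly string[]).includes(raw) ? raw : "info";
}

// The object that would be passed to pino(...).
export const pinoOptions = {
  level: resolveLogLevel(process.env.LOG_LEVEL),
  // Service metadata attached to every log line.
  base: { service: "brs-client", env: process.env.NODE_ENV ?? "development" },
  // Sensitive fields are replaced, never emitted.
  redact: {
    paths: [
      "req.headers.authorization",
      "req.headers.cookie",
      "password",
      "token",
    ],
    censor: "[REDACTED]",
  },
};
```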

client-logger.ts — Browser-side wrapper that suppresses debug/info in production but lets warn/error through to the console. Pino is Node.js-only, so the browser needs its own lightweight alternative.
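The gating rule described above (suppress debug/info in production, always pass warn/error) can be sketched like this. Method names and structure are assumptions based on the description, not the real client-logger.ts:

```typescript
// Sketch of the browser-side wrapper: warn/error always reach the console,
// debug/info only outside production builds.
type ClientLevel = "debug" | "info" | "warn" | "error";

// The gate, pulled out as a pure function so it is easy to test.
export function shouldLog(level: ClientLevel, isProd: boolean): boolean {
  return level === "warn" || level === "error" || !isProd;
}

const isProd = process.env.NODE_ENV === "production";

export const clientLogger = {
  debug: (...args: unknown[]) => { if (shouldLog("debug", isProd)) console.debug(...args); },
  info:  (...args: unknown[]) => { if (shouldLog("info", isProd)) console.info(...args); },
  warn:  (...args: unknown[]) => console.warn(...args),
  error: (...args: unknown[]) => console.error(...args),
};
```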

request-logger.ts — Factory function for request-scoped logging in API route handlers:

const { log, finish } = createRequestLogger(request);
log.info("processing tree transform");
// ... do work ...
finish(200);
// → {"level":"info","requestId":"abc-123","method":"POST","url":"/service-api/tree/transform","status":200,"duration":42,"msg":"request completed"}

Each request gets a child logger with requestId (from the gateway’s x-request-id header, or a generated UUID), HTTP method, and URL path. The finish() function selects log level by status code — info for 2xx, warn for 4xx, error for 5xx — and records duration in milliseconds.
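A sketch of what that factory could look like. The real implementation presumably closes over the shared pino instance; here the logger is passed in as a parameter for testability, and everything beyond the behavior described above is an assumption:

```typescript
import { randomUUID } from "node:crypto";

type Level = "info" | "warn" | "error";

// Status → level mapping described above: 2xx/3xx info, 4xx warn, 5xx error.
export function levelForStatus(status: number): Level {
  if (status >= 500) return "error";
  if (status >= 400) return "warn";
  return "info";
}

// baseLogger is any pino-like logger with .child().
export function createRequestLogger(request: Request, baseLogger: any) {
  const start = Date.now();
  const log = baseLogger.child({
    // Reuse the gateway's x-request-id, or mint a UUID.
    requestId: request.headers.get("x-request-id") ?? randomUUID(),
    method: request.method,
    url: new URL(request.url).pathname,
  });
  // finish() picks the level from the status and records duration in ms.
  const finish = (status: number) =>
    log[levelForStatus(status)]({ status, duration: Date.now() - start }, "request completed");
  return { log, finish };
}
```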

I also added an ESLint rule (no-console: ["warn", { allow: ["warn", "error"] }]) to prevent new console.log calls from creeping in, and replaced existing console calls across the codebase with the appropriate logger.
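In .eslintrc.json terms, that rule looks like the excerpt below (the project may use ESLint's flat config format instead; the rule itself is identical either way):

```json
{
  "rules": {
    "no-console": ["warn", { "allow": ["warn", "error"] }]
  }
}
```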

Tests were written first (TDD), covering edge cases like invalid log levels, missing request headers, and status-based level selection. All green. Build passes. Ship it.

PR #335: The Implementation (Works Locally, Dies in Docker)

The initial PR merged cleanly. Local development worked perfectly — colorized, pretty-printed logs with timestamps and request IDs. I was feeling good.

Then I deployed to the dev AKS cluster. Checked the pod logs.

Nothing.

Not “wrong format” nothing. Not “wrong level” nothing. Actual silence. Zero pino output. The application ran fine — pages loaded, API routes responded — but the logging system produced no output whatsoever.

The First Wrong Assumption

My first instinct was the log level. Maybe LOG_LEVEL wasn’t set and it was defaulting to silent? Checked the env — nope, it was using the default info. Maybe serverExternalPackages was wrong? Maybe pino was being bundled and the worker thread transport was broken?

I added pino and pino-pretty to serverExternalPackages in next.config.js (PR #339). This tells Next.js to exclude these packages from its webpack bundling and load them as native Node.js modules at runtime. This is a real requirement for pino — its worker-thread transport doesn’t survive webpack bundling.
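As a next.config.js excerpt, the change looks roughly like this (a sketch; the real config surely has more in it, and `output: "standalone"` is inferred from the Docker setup described later):

```javascript
// next.config.js (excerpt). serverExternalPackages keeps pino out of the
// webpack bundle so its worker-thread transport loads as a native Node
// module at runtime instead of being inlined and broken.
/** @type {import('next').NextConfig} */
const nextConfig = {
  output: "standalone",
  serverExternalPackages: ["pino", "pino-pretty"],
};

module.exports = nextConfig;
```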

Redeployed. Still nothing.

The Actual Root Cause (PR #340)

Here’s where it gets interesting. Our dev Kubernetes cluster runs with NODE_ENV=development. The logger configuration conditionally enables pino-pretty transport when NODE_ENV is not production:

export const logger = pino({
  level: LOG_LEVEL,
  ...(process.env.NODE_ENV !== "production" && {
    transport: {
      target: "pino-pretty",
      // ...
    },
  }),
});

Pino transports run in a separate worker thread. When you specify a transport target, pino spawns a worker that require()s the target module. If that module doesn’t exist, the worker thread fails silently. No error. No fallback. Just… nothing written to stdout.

And pino-pretty was in devDependencies. Docker builds run npm ci, which installs everything. But the node_modules in the final runner stage only contains what Next.js’s standalone trace includes. And pino-pretty wasn’t being traced.

Wait — why not? It’s referenced in the code, right? Yes, but conditionally. And the build runs with NODE_ENV=production.

Here’s the chain:

  1. Docker builder stage sets NODE_ENV=production
  2. Next.js builds the app and traces dependencies
  3. During tracing, it evaluates process.env.NODE_ENV !== "production" → false
  4. The pino-pretty transport config is never evaluated
  5. Next.js never sees pino-pretty as a dependency
  6. pino-pretty is not copied to .next/standalone/node_modules/
  7. At runtime on the dev cluster, NODE_ENV=development, so the transport config activates
  8. Pino tries to load pino-pretty in a worker thread
  9. Module not found → worker fails silently → zero output

The fix for PR #340 was simple: move pino-pretty from devDependencies to dependencies. This ensures it’s available for Next.js to trace.

Redeployed. Still nothing.

The Deeper Root Cause (PR #341)

Moving pino-pretty to dependencies wasn’t enough. I SSH’d into the running pod and checked:

ls .next/standalone/node_modules/pino-pretty
# No such file or directory

Even as a production dependency, Next.js’s standalone tracer still didn’t include it. The tracer follows actual code execution paths at build time. Since the transport config is behind a NODE_ENV conditional that evaluates to false during the build, the tracer never encounters pino-pretty — regardless of where it lives in package.json.

The serverExternalPackages config tells Next.js “don’t bundle this, use the native module” — but it doesn’t mean “always include this in the standalone output.” If the tracer never sees the import, the module isn’t copied.

The real fix was explicit. In the Dockerfile runner stage, I manually copy pino and all 27 of its transitive dependencies from the deps stage:

# Copy pino + pino-pretty and all transitive dependencies.
# Next.js standalone trace misses pino-pretty because the build runs with
# NODE_ENV=production, which skips the conditional transport config.
COPY --from=deps /app/node_modules/pino ./node_modules/pino
COPY --from=deps /app/node_modules/pino-pretty ./node_modules/pino-pretty
COPY --from=deps /app/node_modules/thread-stream ./node_modules/thread-stream
COPY --from=deps /app/node_modules/sonic-boom ./node_modules/sonic-boom
COPY --from=deps /app/node_modules/atomic-sleep ./node_modules/atomic-sleep
# ... 22 more transitive dependencies

Redeployed. And finally:

{"level":30,"time":"2026-04-06T05:23:41.123Z","service":"brs-client","env":"production","msg":"pino logger initialized","level":"info","nodeEnv":"production"}

Logs. Real, structured, JSON logs. In the actual Kubernetes pod.

PR #338: The Security Hardening

Between the “it works locally” PR and the Docker debugging, I also merged PR #338 — eight patches caught during an internal code review with AI agents (the BMAD workflow from the previous post). The highlights:

Premature finish() calls. The request logger’s finish(200) was being called before async operations completed. The fix: move finish() into the response-building path, after the actual work is done.

Error message leaks. Production error responses were including raw error messages and stack traces. Now they return a generic "Internal server error" message while logging the full details server-side.

Over-broad transport guard. The original !== "production" check meant any non-production environment (staging, test, preview) would try to load pino-pretty. Changed to === "development" — only local dev gets pretty printing.

Missing redaction paths. Added accessToken and refreshToken to the redaction list. These were being logged in full by the original implementation.

What I’d Do Differently

Test in Docker from day one. The entire debugging saga happened because local Node.js and Docker standalone have fundamentally different module resolution. A docker build && docker run in my local workflow would have caught this immediately.

Don’t use conditional transports. The root cause is a build-time/runtime environment mismatch. A simpler approach: always output JSON (no transport), pipe through pino-pretty in development via npm run dev | pino-pretty. No conditional, no worker thread, no tracing issue.
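Concretely, the logger call shrinks to an unconditional `pino({ level })` with no `transport` key, and pretty-printing moves to the shell. One way to wire it up in package.json (script names are assumptions):

```json
{
  "scripts": {
    "dev": "next dev | pino-pretty"
  }
}
```

Piping at the shell means pino-pretty is only ever a dev-machine CLI tool, so it can stay in devDependencies and never needs to exist inside the container at all.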

Read the Next.js standalone docs more carefully. The standalone tracer’s behavior is documented, but the interaction with conditional code paths and worker-thread transports is not obvious. This is a known rough edge in the Next.js + pino ecosystem.

The Debugging Timeline

Day     What happened
Day 1   PR #335 merged — logging system implemented, tests pass, works locally
Day 2   PR #338 merged — security hardening (8 patches from code review)
Day 2   PR #339 merged — added serverExternalPackages (didn’t fix it)
Day 3   PR #340 merged — moved pino-pretty to dependencies (didn’t fix it)
Day 4   SSH into pod, discover pino-pretty not in standalone output
Day 4   PR #341 merged — explicit Dockerfile COPY of all pino deps (fixed it)

Four days. Five PRs. Three wrong hypotheses. For a logging library.

What Pino Logs Look Like Now

In the dev cluster, every service-api POST request produces structured JSON:

{
  "level": 30,
  "time": "2026-04-06T05:23:41.123Z",
  "service": "brs-client",
  "env": "production",
  "requestId": "7f3a2b1c-...",
  "method": "POST",
  "url": "/service-api/tree/transform",
  "status": 200,
  "duration": 42,
  "msg": "request completed"
}

Sensitive fields are automatically redacted:

{
  "req": {
    "headers": {
      "authorization": "[REDACTED]",
      "cookie": "[REDACTED]"
    }
  }
}

This feeds directly into the Kubernetes log pipeline — no parsing, no regex, just structured data ready for whatever log aggregation we add next (Grafana Loki is the plan).

What I Learned

Silent failures are the worst failures. Pino’s worker-thread transport doesn’t throw when the target module is missing. It just produces no output. In a containerized environment where you’re checking logs remotely, “no output” looks identical to “not logging” — there’s no signal to differentiate “broken” from “not configured.”

Build-time and runtime environments are different worlds. Next.js standalone traces dependencies at build time with one NODE_ENV, but the code runs with a different one. Any conditional import or dynamic require that depends on NODE_ENV can create a gap between what’s traced and what’s needed.

27 transitive dependencies is a lot of COPY lines. But it’s also explicit, auditable, and doesn’t break when upstream changes their dependency tree. I’ll take verbose-but-correct over clever-but-fragile.

Solo debugging in production is lonely. There’s no one to rubber-duck with when you’re staring at empty pod logs at 11pm. The AI agents helped with hypotheses, but the actual fix required hands-on investigation — SSH into the pod, check the filesystem, trace the dependency chain manually.

What’s Next

The logging infrastructure is in place. Next up:

  • OpenTelemetry integration — @vercel/otel with pino instrumentation for automatic traceId/spanId injection
  • Client-side log forwarding — browser errors sent to a server endpoint, re-logged through pino
  • Grafana Loki — log aggregation with tiered retention (3 days for debug, 90 days for errors)

But first, I’m going to enjoy having actual logs. After 6 months of console.log and prayer, structured JSON in Kubernetes feels like a luxury.


Debugged with kubectl exec, 27 COPY lines, one too many wrong hypotheses, and the hard-won knowledge that pino worker threads fail silently.