Logging

Logging in Kubernetes works differently than in traditional server environments. In containerized applications, logs written to files are ephemeral: when a container restarts, its writable filesystem is discarded and those files are lost. This guide explains logging concepts and best practices for Kubernetes environments.

What You'll Learn

  • Why stdout/stderr logging is essential for Kubernetes
  • How to structure logs for searchability and filtering
  • What to log (and what not to log)
  • How to verify logs are being collected

The Golden Rule: Log to stdout/stderr

Always write logs to stdout (standard output) or stderr (standard error).

Kubernetes automatically captures everything your application writes to stdout and stderr. Observability platforms (like Datadog, Grafana Loki, or Splunk) then collect these logs from the container runtime—no additional file configuration needed.
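
In Python, for example, the standard logging module can be pointed at stdout with a single call. This is a minimal sketch; most frameworks and logging libraries offer an equivalent setting.

import logging
import sys

# Send all log records to stdout so the container runtime captures them.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)

logging.getLogger("user-service").info("Application started")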

Why stdout/stderr?

  • Automatic collection: Kubernetes captures these streams by default
  • No disk management: Avoid running out of disk space from log files
  • Container-friendly: Works seamlessly with container orchestration
  • Platform agnostic: Works with any log aggregation system
  • Simplicity: No need to configure file paths, rotation, or cleanup

Developer Responsibility

Configure your application to log to stdout/stderr. The platform team handles all Kubernetes-level configuration to ship logs to the observability platform.

Structured Logging

Structured logging means outputting logs in a machine-readable format (typically JSON) rather than plain text. This makes logs searchable, filterable, and analyzable.

JSON Format

Good - Structured JSON:

{
  "timestamp": "2025-01-15T14:32:01.234Z",
  "level": "INFO",
  "service": "user-service",
  "message": "User login successful",
  "userId": "usr_12345",
  "requestId": "req_abc123",
  "ip": "192.168.1.100"
}

Bad - Unstructured text:

2025-01-15 14:32:01 INFO User usr_12345 logged in from 192.168.1.100

Structured logs enable:

  • Filtering by any field (find all logs for a specific user)
  • Aggregation (count errors per service)
  • Alerting (trigger when error rate exceeds threshold)
  • Correlation (trace a request across services)
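
Output like the JSON example above can be produced with structlog (one of the Python libraries listed under Languages below). This is an illustrative configuration, not the only way to do it; note that structlog names the message field "event" by default.

import logging
import sys

import structlog

# Render every log event as a single JSON line on stdout.
structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso", utc=True),
        structlog.processors.add_log_level,
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    logger_factory=structlog.PrintLoggerFactory(sys.stdout),
)

log = structlog.get_logger(service="user-service")
log.info("User login successful", userId="usr_12345", requestId="req_abc123")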

Log Levels

Use appropriate log levels to control verbosity and signal severity:

Level | Purpose                          | Production
------|----------------------------------|-----------
DEBUG | Detailed diagnostic information  | Disabled
INFO  | General informational messages   | Enabled
WARN  | Potentially harmful situations   | Enabled
ERROR | Error events that need attention | Enabled
FATAL | Severe errors causing shutdown   | Enabled

# Example usage
logger.debug("Cache miss for key", key=cache_key)
logger.info("User registered", user_id=user.id)
logger.warning("API rate limit approaching", current=45, limit=50)
logger.error("Database query failed", error=str(e))
logger.critical("Unable to connect to database", error=str(e))
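
To keep DEBUG disabled in production, one option is to read the level from an environment variable instead of hard-coding it. LOG_LEVEL below is an assumed variable name, not a platform convention.

import logging
import os
import sys

# Default to INFO; set LOG_LEVEL=DEBUG only while diagnosing an issue.
logging.basicConfig(
    stream=sys.stdout,
    level=os.getenv("LOG_LEVEL", "INFO").upper(),
)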

Correlation IDs

Include a request ID or correlation ID to trace requests across services:

{
  "timestamp": "2025-01-15T14:32:01.234Z",
  "level": "INFO",
  "service": "order-service",
  "message": "Order created",
  "requestId": "req_abc123",
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
  "orderId": "ord_67890"
}

The requestId lets you find all logs for a single request. When using distributed tracing, the traceId connects logs to traces.
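
One common way to attach the request ID to every log line is to store it in a context variable when the request arrives and copy it onto each record with a logging filter. The sketch below uses only the Python standard library; handle_request is a hypothetical stand-in for your web framework's middleware.

import logging
import sys
import uuid
from contextvars import ContextVar

# Holds the ID of the request currently being handled.
request_id: ContextVar[str] = ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    # Copy the current request ID onto every log record.
    def filter(self, record: logging.LogRecord) -> bool:
        record.requestId = request_id.get()
        return True

handler = logging.StreamHandler(sys.stdout)
handler.addFilter(RequestIdFilter())
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s requestId=%(requestId)s %(message)s"
))
logging.basicConfig(level=logging.INFO, handlers=[handler])

def handle_request():
    # Set once per request, e.g. in your web framework's middleware.
    request_id.set(f"req_{uuid.uuid4().hex[:8]}")
    logging.getLogger("order-service").info("Order created")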

What to Log

DO Log

  • User actions: login, logout, purchases, settings changes
  • System events: startup, shutdown, configuration changes
  • External calls: API requests, response times, failures
  • Database operations: slow queries, connection issues
  • Security events: authentication failures, authorization denials
  • Business events: orders placed, payments processed

DO NOT Log

  • Passwords or secrets: Never log credentials or API keys
  • Personally Identifiable Information (PII): Be careful with emails and phone numbers
  • Credit card numbers: Violates PCI compliance
  • Session tokens: Security risk if logs are exposed
  • Full request/response bodies: May contain sensitive data
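
A simple safeguard is to mask sensitive fields before they reach the logger. The helper below is illustrative; the function name and field list are examples, not an established API.

# Fields that must never appear in log output (illustrative list).
SENSITIVE_FIELDS = {"password", "apiKey", "cardNumber", "sessionToken"}

def redact(fields: dict) -> dict:
    # Return a copy of the fields with sensitive values masked.
    return {
        key: "[REDACTED]" if key in SENSITIVE_FIELDS else value
        for key, value in fields.items()
    }

# Usage: logger.info("Payment attempt", **redact({"userId": "usr_12345", "cardNumber": "4242..."}))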

Example: Well-Structured Log

{
  "timestamp": "2025-01-15T14:32:01.234Z",
  "level": "INFO",
  "service": "payment-service",
  "message": "Payment processed successfully",
  "requestId": "req_abc123",
  "userId": "usr_12345",
  "orderId": "ord_67890",
  "amount": 99.99,
  "currency": "USD",
  "paymentMethod": "card_ending_4242",
  "processingTimeMs": 245
}

Third-Party Applications

Some third-party applications write logs to files instead of stdout/stderr. When installing applications via Helm charts that have this limitation, contact your platform team. They can help with:

  • Sidecar containers to tail log files to stdout
  • Customizing Helm chart values
  • Centralized log processors (Fluentd, Fluent Bit)

Verifying Your Logs

After deployment, verify logs in your observability platform:

  1. Logs appearing: Confirm your application's logs are collected
  2. JSON format: Verify logs are structured (easier to search)
  3. Log levels: Ensure appropriate levels are captured
  4. Correlation IDs: Check that request IDs are included

Troubleshooting

Logs not appearing:

  • Verify writing to stdout/stderr (not files)
  • Check that the application is running in ArgoCD or Ybor Studio
  • Ensure log level is not too restrictive

Too many logs:

  • Raise the minimum log level from DEBUG to INFO
  • Reduce logging of repetitive events
  • Sample high-frequency logs
  • Use rate limiting for error logs
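
Sampling can also be done inside the application with a logging filter that passes only a fraction of low-severity records from a noisy logger. The 10% rate and the "cache" logger name below are arbitrary examples.

import logging
import random

class SamplingFilter(logging.Filter):
    # Pass through only a fraction of records below WARNING severity.
    def __init__(self, rate: float = 0.1):
        super().__init__()
        self.rate = rate

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True  # never drop warnings or errors
        return random.random() < self.rate

# Attach to a chatty logger so only ~10% of its INFO/DEBUG records are emitted.
logging.getLogger("cache").addFilter(SamplingFilter(rate=0.1))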

Languages

For language-specific logging implementations:

  • Python - structlog, standard logging
  • .NET - Serilog, built-in logging
  • Java - Logback, Logstash encoder
  • Rust - tracing crate
  • JavaScript - Winston, Pino