Unified Structured Logging Design

Date: 2026-04-14
Status: Approved
Scope: Generation server, Hono API, React frontend

Problem

PhonoLex is going live soon and has no production-level observability. The generation server logs startup and some debug-level generation info but nothing about requests, responses, or errors. The Hono API relies on wrangler's built-in request lines. The frontend swallows errors silently. When something breaks for a user, there's no way to know what happened.

Shared Log Schema

Every log entry across all three services uses the same shape:

{
  "ts": "2026-04-14T18:30:00.000Z",
  "level": "info",
  "service": "gen",
  "request_id": "abc-123",
  "message": "generation complete",
  "duration_ms": 1420,
  "context": {}
}

Fields:

  • ts: ISO-8601 UTC timestamp
  • level: debug, info, warn, error
  • service: gen (generation server), api (Hono Workers), web (frontend)
  • request_id: UUID propagated via the X-Request-ID header; the frontend generates it, and the API and generation server forward it
  • message: human-readable summary
  • duration_ms: optional, for request/response pairs
  • context: service-specific structured payload (constraints, error details, etc.)
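As a rough illustration of the shared shape, the fields could be captured in a TypeScript type with a small builder. This is a minimal sketch, not the project's actual code: the names LogEntry, makeLogEntry, and serialize are hypothetical, and it assumes one JSON object per line.

```typescript
// Shared entry shape, mirroring the field list in this section.
type LogLevel = "debug" | "info" | "warn" | "error";
type Service = "gen" | "api" | "web";

interface LogEntry {
  ts: string;
  level: LogLevel;
  service: Service;
  request_id: string;
  message: string;
  duration_ms?: number;
  context: Record<string, unknown>;
}

// Hypothetical helper: stamps ts and defaults context to {}.
// Keys are emitted in the same order as the schema example.
function makeLogEntry(
  fields: Omit<LogEntry, "ts" | "context"> & { context?: Record<string, unknown> },
  now: Date = new Date(),
): LogEntry {
  return {
    ts: now.toISOString(),
    level: fields.level,
    service: fields.service,
    request_id: fields.request_id,
    message: fields.message,
    ...(fields.duration_ms !== undefined ? { duration_ms: fields.duration_ms } : {}),
    context: fields.context ?? {},
  };
}

// Hypothetical helper: one JSON object per log line.
function serialize(entry: LogEntry): string {
  return JSON.stringify(entry);
}
```

Keeping the builder in one place makes it harder for the API and frontend to drift apart on field names; the generation server would need an equivalent on the Python side.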

Generation Server (FastAPI/Python)

Logging Infrastructure

  • Use python-json-logger with structlog-style processors for JSON output to stdout
  • Replace all print() statements in model.py with logger calls
  • Logger name convention: phonolex.{module} (already partially followed)

Request/Response Middleware

FastAPI middleware that logs every request:

{
  "ts": "...",
  "level": "info",
  "service": "gen",
  "request_id": "abc-123",
  "message": "POST /api/generate-single 200",
  "duration_ms": 1420,
  "context": {
    "method": "POST",
    "path": "/api/generate-single",
    "status": 200
  }
}

For generation endpoints specifically, log the full generation context at info level:

{
  "ts": "...",
  "level": "info",
  "service": "gen",
  "request_id": "abc-123",
  "message": "generation complete",
  "duration_ms": 1420,
  "context": {
    "prompt": "Tell me a story about a cat",
    "constraints": [{"type": "exclude", "phonemes": ["k"]}],
    "text": "Whiterun, a ginger tabby...",
    "compliant": true,
    "violation_count": 0,
    "token_count": 128,
    "drafts_attempted": 3
  }
}

Exception Handling

Replace bare except Exception patterns with specific handlers:

  1. Pydantic ValidationError — 422 with field-level details, log at warn
  2. Generation failures (model errors, OOM, timeout) — 500, log at error with full context
  3. Data loading errors (word_norms API failures) — 503, log at error
  4. Constraint compilation errors — 422, log at warn

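Add a FastAPI exception handler for validation errors that returns structured error responses. Note that FastAPI raises RequestValidationError, not Pydantic's ValidationError, when an incoming request fails validation, so the handler should be registered for that class: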

{
  "error": "validation_error",
  "detail": [
    {"field": "constraints[0].target_rate", "message": "must be between 0.0 and 1.0"}
  ]
}

Bug Fixes

  • Fix dead call to model.generate_response() on line 69 of routes/generate.py (function does not exist)
  • Add try/except to word_norms.py API calls with retry and structured error logging

Hono API (Cloudflare Workers)

Logging Middleware

Hono middleware that wraps every request:

{
  "ts": "...",
  "level": "info",
  "service": "api",
  "request_id": "abc-123",
  "message": "GET /api/words/search 200",
  "duration_ms": 45,
  "context": {
    "method": "GET",
    "path": "/api/words/search",
    "status": 200,
    "cf_ray": "..."
  }
}

The middleware writes each entry with console.log(JSON.stringify(...)). Workers captures console output natively, so entries are visible via wrangler tail and the Cloudflare dashboard.
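The wrapping logic can be sketched independently of the framework. The example below is an assumption-laden sketch: Ctx is a hypothetical minimal context type that only imitates the (c, next) middleware shape, requestLogEntry and requestLogger are hypothetical names, and the ID generator is injected by the caller (for example a function that returns crypto.randomUUID()). It has not been checked against Hono's real Context type.

```typescript
// Hypothetical minimal context; imitates the parts of a (c, next)
// middleware signature that the logger needs.
interface Ctx {
  req: { method: string; path: string; header(name: string): string | undefined };
  res: { status: number };
}
type Next = () => Promise<void>;

// Builds the request entry in the shape described in this section.
function requestLogEntry(
  method: string,
  path: string,
  status: number,
  durationMs: number,
  requestId: string,
  cfRay: string | null, // null when no cf-ray header is present
  now: Date = new Date(),
) {
  return {
    ts: now.toISOString(),
    level: "info",
    service: "api",
    request_id: requestId,
    message: `${method} ${path} ${status}`,
    duration_ms: durationMs,
    context: { method, path, status, cf_ray: cfRay },
  };
}

// Returns a middleware that times the downstream handler and logs one line.
function requestLogger(
  newId: () => string, // assumed ID generator supplied by the caller
  sink: (line: string) => void = console.log,
  clock: () => number = Date.now,
) {
  return async (c: Ctx, next: Next): Promise<void> => {
    const start = clock();
    const requestId = c.req.header("x-request-id") ?? newId();
    await next(); // run the route handler and any inner middleware
    const entry = requestLogEntry(
      c.req.method,
      c.req.path,
      c.res.status,
      clock() - start,
      requestId,
      c.req.header("cf-ray") ?? null,
    );
    sink(JSON.stringify(entry));
  };
}
```

Injecting the sink and clock keeps the middleware testable without a Workers runtime; in the real app the defaults (console.log and Date.now) would normally be left in place.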

Error Handler

Structured error logging for D1 failures and route errors:

{
  "ts": "...",
  "level": "error",
  "service": "api",
  "request_id": "abc-123",
  "message": "D1 query failed",
  "context": {
    "path": "/api/phonemes/rates",
    "error": "no such column: frequency",
    "query_preview": "SELECT phonemes_str, frequency FROM words..."
  }
}
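A minimal sketch of how such an entry could be assembled in a catch block or a global error hook. The helper names (queryPreview, errorText, d1ErrorEntry) and the 60-character preview cap are assumptions for illustration; the design does not specify a truncation length.

```typescript
// Collapses whitespace and truncates so logs never carry a full query.
// The 60-character default is an assumed value.
function queryPreview(sql: string, maxChars: number = 60): string {
  const flat = sql.replace(/\s+/g, " ").trim();
  return flat.length > maxChars ? `${flat.slice(0, maxChars)}...` : flat;
}

// Thrown values are not guaranteed to be Error instances.
function errorText(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Hypothetical builder for the D1 failure entry.
function d1ErrorEntry(
  path: string,
  requestId: string,
  err: unknown,
  sql: string,
  now: Date = new Date(),
) {
  return {
    ts: now.toISOString(),
    level: "error",
    service: "api",
    request_id: requestId,
    message: "D1 query failed",
    context: { path, error: errorText(err), query_preview: queryPreview(sql) },
  };
}
```

Logging only a preview keeps entries small and avoids writing whole statements into the log stream; bound parameter values are deliberately left out of this sketch.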

Request ID Propagation

  • Read X-Request-ID from incoming request (set by frontend)
  • Generate one if missing
  • Forward to generation server on proxy requests
  • Include in all log entries and error responses
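The four steps above can be sketched as three small pure functions. All names here (resolveRequestId, proxyHeaders, errorBody) are hypothetical, and putting request_id into the error response body is an assumption about how "include in error responses" would be realised; a response header would serve equally well.

```typescript
const REQUEST_ID_HEADER = "X-Request-ID";

// Steps 1 and 2: reuse the incoming ID, or mint one if it is missing or blank.
// newId is supplied by the caller (for example a crypto.randomUUID wrapper).
function resolveRequestId(incoming: string | null | undefined, newId: () => string): string {
  const trimmed = incoming?.trim();
  return trimmed ? trimmed : newId();
}

// Step 3: copy the headers for the proxied call and attach the ID.
function proxyHeaders(base: Record<string, string>, requestId: string): Record<string, string> {
  return { ...base, [REQUEST_ID_HEADER]: requestId };
}

// Step 4: carry the ID in error responses so users can quote it in reports.
function errorBody(error: string, requestId: string): { error: string; request_id: string } {
  return { error, request_id: requestId };
}
```

Resolving the ID once at the edge and threading the same value through logs, proxied calls, and error bodies is what makes a single select on request_id sufficient when tracing a failure.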

Frontend (React)

API Client Interceptor

Wrap the fetch/axios client to catch and log all API errors:

{
  "ts": "...",
  "level": "error",
  "service": "web",
  "request_id": "abc-123",
  "message": "API request failed",
  "context": {
    "method": "POST",
    "url": "/api/generate-single",
    "status": 500,
    "response_body": "Generation failed: ...",
    "component": "GovernedGenerationTool"
  }
}

In dev, write the structured entry to console.error. POST critical errors (5xx responses and network failures) to a /api/log endpoint on the Hono worker.
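One way the interceptor could look, written against an injected client so it does not depend on a particular HTTP library. This is a sketch under stated assumptions: HttpResponse, HttpInit, and FetchLike are hypothetical minimal types standing in for fetch or an axios adapter, loggedFetch and makeReporter are hypothetical names, and a null status is used here to represent a network failure.

```typescript
// Hypothetical minimal types standing in for the real HTTP client.
interface HttpResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
  clone(): HttpResponse;
}
interface HttpInit { method: string; headers: Record<string, string>; body?: string }
type FetchLike = (url: string, init: HttpInit) => Promise<HttpResponse>;

interface WebErrorEntry {
  ts: string;
  level: "error";
  service: "web";
  request_id: string;
  message: string;
  context: {
    method: string;
    url: string;
    status: number | null; // null: the request never got a response
    response_body: string;
    component: string;
  };
}

// 5xx responses and network failures are the critical cases.
function isCritical(status: number | null): boolean {
  return status === null || status >= 500;
}

// Wraps one call: attaches X-Request-ID, reports failures, returns the response.
async function loggedFetch(
  fetchImpl: FetchLike,
  url: string,
  init: HttpInit,
  meta: { requestId: string; component: string },
  report: (entry: WebErrorEntry) => void,
): Promise<HttpResponse> {
  const headers = { ...init.headers, "X-Request-ID": meta.requestId };
  const entry = (status: number | null, body: string): WebErrorEntry => ({
    ts: new Date().toISOString(),
    level: "error",
    service: "web",
    request_id: meta.requestId,
    message: "API request failed",
    context: { method: init.method, url, status, response_body: body, component: meta.component },
  });
  let res: HttpResponse;
  try {
    res = await fetchImpl(url, { ...init, headers });
  } catch (err) {
    report(entry(null, err instanceof Error ? err.message : String(err)));
    throw err; // callers still see the original failure
  }
  if (!res.ok) {
    // Read a clone so the caller can still consume the original body.
    report(entry(res.status, await res.clone().text()));
  }
  return res;
}

// console.error in dev; POST critical entries to /api/log.
function makeReporter(fetchImpl: FetchLike, isDev: boolean): (entry: WebErrorEntry) => void {
  return (entry) => {
    if (isDev) console.error(JSON.stringify(entry));
    if (isCritical(entry.context.status)) {
      // Uses the raw client, not loggedFetch, so a failing /api/log cannot recurse.
      fetchImpl("/api/log", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(entry),
      }).catch(() => undefined);
    }
  };
}
```

The reporter swallows its own failures on purpose: if the log endpoint is down, the user-facing request should fail for its own reason rather than because error reporting failed.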

Error Boundary

React error boundary at the app level and around key tool components:

  • Catches render crashes with component stack
  • Logs structured error
  • POSTs to /api/log
  • Shows user-friendly fallback UI

/api/log Endpoint (Hono)

Minimal endpoint that accepts frontend error reports:

POST /api/log
{ "level": "error", "service": "web", "message": "...", "context": {...} }

Validates with a simple schema (reject oversized payloads, require level and message). Logs the entry via console.log so it appears in Workers logs alongside API logs. No D1 storage — just pass-through to the Workers log stream.

Rate-limited to prevent abuse (e.g., 10 per minute per IP).
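The validation and rate-limit rules can be sketched without any framework. Everything named here is hypothetical (validateLogReport, FixedWindowLimiter, the 8192-character cap), and the limiter keeps its counters in memory. Workers isolates do not share memory, so this only approximates a per-IP limit; a strict limit would need shared state.

```typescript
const MAX_CHARS = 8192; // assumed cap; the design only says "oversized"
const LEVELS = new Set<string>(["debug", "info", "warn", "error"]);

interface LogReport {
  level: string;
  service: "web";
  message: string;
  context: Record<string, unknown>;
}
type Validation = { ok: true; report: LogReport } | { ok: false; reason: string };

// Checks size first, then requires level and message.
function validateLogReport(rawBody: string): Validation {
  if (rawBody.length > MAX_CHARS) return { ok: false, reason: "payload too large" };
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { ok: false, reason: "invalid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, reason: "body must be a JSON object" };
  }
  const body = parsed as Record<string, unknown>;
  const level = body["level"];
  const message = body["message"];
  const context = body["context"];
  if (typeof level !== "string" || !LEVELS.has(level)) {
    return { ok: false, reason: "level is required" };
  }
  if (typeof message !== "string" || message.length === 0) {
    return { ok: false, reason: "message is required" };
  }
  const ctx =
    typeof context === "object" && context !== null && !Array.isArray(context)
      ? (context as Record<string, unknown>)
      : {};
  // Force service to "web" rather than trusting the client-supplied value.
  return { ok: true, report: { level, service: "web", message, context: ctx } };
}

// Fixed-window counter keyed by client IP: `limit` hits per `windowMs`.
class FixedWindowLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 10, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, now: number = Date.now()): boolean {
    const slot = this.hits.get(key);
    if (!slot || now - slot.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 }); // start a new window
      return true;
    }
    if (slot.count >= this.limit) return false;
    slot.count += 1;
    return true;
  }
}
```

A handler would call limiter.allow(ip) before validateLogReport(body) and, on success, pass the report to console.log(JSON.stringify(...)) so it lands in the same stream as the API's own entries.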

What This Does NOT Include

  • Log aggregation service (Datadog, Grafana Cloud, etc.) — future concern
  • Log persistence beyond Workers' built-in retention — future concern
  • Performance metrics / APM — separate initiative
  • User session recording — out of scope

Querying Logs

Local dev:

# All generation server logs (-R with fromjson? skips any non-JSON lines in the output)
uvicorn ... 2>&1 | jq -R 'fromjson? | select(.service=="gen")'

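# All errors across Hono API
# (wrangler tail emits one event object per request, with console output nested under .logs[].message)
wrangler tail --format json | jq '.logs[]?.message[0]? | fromjson? | select(.level? == "error")'

# Trace a request through the Hono API (apply the same select to the generation server output to follow it across services)
wrangler tail --format json | jq '.logs[]?.message[0]? | fromjson? | select(.request_id? == "abc-123")'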

Production (Cloudflare):

  • Workers logs are available in the Cloudflare dashboard
  • Generation server logs depend on the deployment (stdout capture)