Governed Generation — Server-Status UI Fixes

Ticket: PHON-42 (Bug, High; parent: PHON-4)
Branch: feature/phon-42-server-status-ui-fixes (from develop)
Date: 2026-04-18
Status: Spec for review

Problem

Two related UI bugs in the Governed Generation tool's server-status handling prevent users from triggering cold starts and show a false-negative "Server unreachable" chip during the initial poll window.

Bug 1 — Cold start lockout. The Generate button is disabled unless serverStatus.status === 'ready'. When RunPod is fully scaled to zero, the Workers proxy correctly returns status: 'serverless'; during warm-up it returns status: 'loading'. In both cases the button is disabled, so the user can never fire the first POST /api/generate-single — which is the only event that actually wakes a RunPod worker. The 60s cold-start warning emitted by the Workers proxy never has a chance to display.

Bug 2 — False "Server unreachable" on mount. useServerStatus initializes status = null, and the UI renders the error chip "Server unreachable" whenever !serverStatus. That condition is true both before the first poll completes (~100ms after mount) and after a fetch failure. A single transient poll miss also flips the chip red until the next 5s cycle.

Design

Hook: useServerStatus returns { status, hasFetched }

Change the return type from ServerStatus | null to:

interface ServerStatusState {
  status: ServerStatus | null;
  hasFetched: boolean;
}
  • hasFetched starts false, flips to true on the first successful response, and stays true from then on.
  • On a fetch failure before hasFetched is true, the hook keeps status: null and hasFetched: false, so the UI knows it has no information yet. On a fetch failure after hasFetched is true, the hook keeps the last known status (stale data is preferable to flickering back to null). If fetches keep failing, the UI can still distinguish "we had a response once" from "we never did."
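
The two rules above can be read as a single state transition applied after every poll. A minimal sketch, assuming a poll yields either a parsed response or null on failure; applyPollResult, PollResult, and the trimmed-down ServerStatus shape are hypothetical names for illustration, not the actual code in generationApi.ts:

```typescript
// Assumption: the real ServerStatus type may carry more fields than `status`.
interface ServerStatus {
  status: 'ready' | 'loading' | 'serverless' | 'error';
}

interface ServerStatusState {
  status: ServerStatus | null;
  hasFetched: boolean;
}

// Outcome of one poll: a parsed response, or null when the fetch failed.
type PollResult = ServerStatus | null;

const initialState: ServerStatusState = { status: null, hasFetched: false };

// Pure transition applied after every poll (hypothetical helper).
function applyPollResult(
  prev: ServerStatusState,
  result: PollResult,
): ServerStatusState {
  if (result !== null) {
    // Success: store the fresh status; hasFetched latches to true.
    return { status: result, hasFetched: true };
  }
  // Failure: leave the state untouched. Before the first success this keeps
  // { status: null, hasFetched: false }; afterwards it keeps the last known status.
  return prev;
}
```

Inside the hook, the poll callback could then use React's functional update form, for example setState(prev => applyPollResult(prev, result)), so that a failed poll never overwrites a known status.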

Simpler alternative considered and rejected: keeping a separate lastError field. Not needed — the consumer doesn't distinguish types of failure, only pre-fetch vs post-fetch.

UI: six-state chip mapping in GovernedGenerationTool/index.tsx

Replace the current two-branch chip render (serverStatus && ... / !serverStatus && ...) with a derived chip state based on { status, hasFetched }:

| Condition | Chip label | Chip color | Button |
|---|---|---|---|
| !hasFetched | Checking server… | default | disabled |
| status.status === 'ready' | Server ready | success | enabled |
| status.status === 'loading' | Worker starting… | warning | enabled |
| status.status === 'serverless' | Server idle | warning | enabled |
| status.status === 'error' | Server error | error | disabled |
| hasFetched && status == null | Server unreachable | error | disabled |
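
The six-state mapping can be expressed as one pure function, which keeps the render code free of nested conditionals. This is a sketch under assumptions: deriveChipState, ChipState, and ChipColor are hypothetical names, and the ServerStatus shape is reduced to the one field the mapping reads.

```typescript
// Assumption: reduced shapes for illustration; the real types live in generationApi.ts.
type StatusKind = 'ready' | 'loading' | 'serverless' | 'error';

interface ServerStatus {
  status: StatusKind;
}

interface ServerStatusState {
  status: ServerStatus | null;
  hasFetched: boolean;
}

type ChipColor = 'default' | 'success' | 'warning' | 'error';

interface ChipState {
  label: string;
  color: ChipColor;
  canGenerate: boolean;
}

// Maps hook state to chip label, chip color, and button enablement.
function deriveChipState({ status, hasFetched }: ServerStatusState): ChipState {
  if (!hasFetched) {
    // No response yet: neutral chip, button disabled.
    return { label: 'Checking server…', color: 'default', canGenerate: false };
  }
  if (status === null) {
    // hasFetched is true but no status is available.
    return { label: 'Server unreachable', color: 'error', canGenerate: false };
  }
  switch (status.status) {
    case 'ready':
      return { label: 'Server ready', color: 'success', canGenerate: true };
    case 'loading':
      return { label: 'Worker starting…', color: 'warning', canGenerate: true };
    case 'serverless':
      return { label: 'Server idle', color: 'warning', canGenerate: true };
    case 'error':
      return { label: 'Server error', color: 'error', canGenerate: false };
  }
}
```

Because the switch is exhaustive over StatusKind, the type checker should flag this function if a new status value is added to the union without a matching chip state.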

Button-enable rule becomes:

const canGenerate = ['ready', 'loading', 'serverless'].includes(status?.status ?? '');
// ...
disabled={loading || !prompt.trim() || !canGenerate}

Helper caption under Generate button

Render only when status.status === 'loading' or status.status === 'serverless':

First request after idle may take ~60s while a GPU worker spins up.

Small caption typography, muted color (text.secondary), same block as the existing statusMessage rendering.
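
The caption rule can likewise be isolated in a small helper so the component only decides whether to render a string. A sketch, assuming the status value is passed in as an optional string; coldStartCaption and COLD_START_CAPTION are hypothetical names:

```typescript
const COLD_START_CAPTION =
  'First request after idle may take ~60s while a GPU worker spins up.';

// Returns the caption text for 'loading' and 'serverless', otherwise null.
function coldStartCaption(kind: string | undefined): string | null {
  return kind === 'loading' || kind === 'serverless' ? COLD_START_CAPTION : null;
}
```

The component could call coldStartCaption(status?.status) and render the caption only when the result is non-null.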

Types

Add 'serverless' to ServerStatus['status'] if the type union doesn't already include it. The "Checking server…" chip is derived from hasFetched, so it needs no new status value. (The Workers proxy already emits 'serverless'; the frontend type may just need the string added. Verify during implementation.)
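
One way to keep the union and any runtime check in sync is to derive the type from a constant list. This is a sketch only: it assumes the union consists of the four values used in the chip mapping, and SERVER_STATUS_KINDS, ServerStatusKind, and isServerStatusKind are hypothetical names rather than existing exports.

```typescript
// Assumption: these are the four values the proxy can emit in `status`.
const SERVER_STATUS_KINDS = ['ready', 'loading', 'serverless', 'error'] as const;

// Union type derived from the list: 'ready' | 'loading' | 'serverless' | 'error'.
type ServerStatusKind = (typeof SERVER_STATUS_KINDS)[number];

// Runtime guard for values parsed from the proxy response.
function isServerStatusKind(value: unknown): value is ServerStatusKind {
  return (
    typeof value === 'string' &&
    (SERVER_STATUS_KINDS as readonly string[]).includes(value)
  );
}
```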

Files

Modified:

  • packages/web/frontend/src/lib/generationApi.ts
      • useServerStatus: return { status, hasFetched } instead of ServerStatus | null
      • Keep transient errors from zeroing out status once hasFetched is true
      • Update ServerStatus['status'] union to include 'serverless' if missing
  • packages/web/frontend/src/components/tools/GovernedGenerationTool/index.tsx
      • Consume new hook shape
      • Six-state chip mapping (table above)
      • canGenerate derived from allowed status set
      • Helper caption shown during loading / serverless

Unchanged:

  • Workers proxy (packages/web/workers/src/routes/generation.ts) — already emits correct states
  • Store, compiler, API contract, all other tools

Verification

  • Manual browser check on http://localhost:3000 (dev) and on https://develop.phonolex.pages.dev (staging) after deploy:
      • Initial render shows "Checking server…" (grey/default), button disabled
      • If staging RunPod is cold → chip reads "Server idle", button enabled, caption visible
      • Click Generate → SSE status "Connecting to GPU… first request after idle may take ~60s." appears (already emitted by Workers proxy)
      • RunPod worker spins up → chip transitions through "Worker starting…" → "Server ready"
      • Kill staging Workers briefly (or point dev frontend at an unreachable URL) → chip goes to "Server unreachable" only after the first successful poll has landed; transient blips don't flicker
  • npm run type-check passes after the ServerStatus['status'] union change
  • npm run lint passes
  • 19 existing tests still green (store + compiler untouched)
  • npm run build clean

Out of scope

  • No change to the Workers proxy or RunPod configuration
  • No cold-start optimization (min-workers, keep-alive) — that's an ops concern, separate ticket if desired
  • No changes to error taxonomy beyond the six states above
  • No toast/snackbar notifications — inline chip + caption only

Follow-ups

  • If cold-start + "loading" latency is still a bad UX, consider a RunPod min-workers setting or a pre-warm request on tool mount. Track separately under PHON-7 (Operations maturity).