Governed Generation: Server-Status UI Fixes

Ticket: PHON-42 (Bug, High; parent: PHON-4)
Branch: feature/phon-42-server-status-ui-fixes (from develop)
Date: 2026-04-18
Status: Spec for review

Problem

Two related UI bugs in the Governed Generation tool's server-status handling prevent users from triggering cold starts and show a false-negative "Server unreachable" chip during the initial poll window.

Bug 1 — Cold start lockout. The Generate button is disabled unless serverStatus.status === 'ready'. When RunPod is fully scaled to zero, the Workers proxy correctly returns status: 'serverless'; during warm-up it returns status: 'loading'. In both cases the button is disabled, so the user can never fire the first POST /api/generate-single — which is the only event that actually wakes a RunPod worker. The 60s cold-start warning emitted by the Workers proxy never has a chance to display.

Bug 2 — False "Server unreachable" on mount. useServerStatus initializes status = null, and the UI renders the error chip "Server unreachable" whenever !serverStatus. That condition is true both before the first poll completes (~100ms after mount) and after a fetch failure. A single transient poll miss also flips the chip red until the next 5s cycle.

Design

Hook: `useServerStatus` returns `{ status, hasFetched }`

Change the return type from ServerStatus | null to:

interface ServerStatusState {
  status: ServerStatus | null;
  hasFetched: boolean;
}

hasFetched starts false, flips to true on the first successful response, stays true forever after.
On a fetch failure before hasFetched is true, the hook keeps status: null and hasFetched: false — so the UI knows it has no information yet. On a fetch failure after hasFetched is true, the hook keeps the last known status (stale data is preferable to the flicker to null). If fetches keep failing, the UI can still distinguish "we had a response once" from "we never did."
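
The transition rules above can be sketched as a pure reducer, separate from the polling code. This is a minimal sketch, not the hook's actual implementation: `PollResult`, `initialState`, and `applyPollResult` are hypothetical names, and `ServerStatus` is reduced to its `status` field on the assumption that the real type carries more.

```typescript
// Assumed shape: the real ServerStatus likely has more fields than `status`.
type ServerStatusValue = 'ready' | 'loading' | 'serverless' | 'error';

interface ServerStatus {
  status: ServerStatusValue;
}

interface ServerStatusState {
  status: ServerStatus | null;
  hasFetched: boolean;
}

// Hypothetical: the outcome of a single poll, success or failure.
type PollResult = { ok: true; status: ServerStatus } | { ok: false };

const initialState: ServerStatusState = { status: null, hasFetched: false };

function applyPollResult(
  state: ServerStatusState,
  result: PollResult,
): ServerStatusState {
  if (result.ok) {
    // Any successful response records the new status and latches hasFetched.
    return { status: result.status, hasFetched: true };
  }
  // Failure before the first success: state is still { null, false }.
  // Failure after the first success: the last known status is kept.
  // Both cases amount to leaving the state untouched.
  return state;
}
```

Keeping the rule in one function makes the "stale beats null" decision explicit and lets it be tested without timers or network mocks.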

Simpler alternative considered and rejected: keeping a separate lastError field. Not needed — the consumer doesn't distinguish types of failure, only pre-fetch vs post-fetch.

UI: six-state chip mapping in `GovernedGenerationTool/index.tsx`

Replace the current two-branch chip render (serverStatus && ... / !serverStatus && ...) with a derived chip state based on { status, hasFetched }:

| Condition | Chip label | Chip color | Button |
| --- | --- | --- | --- |
| `!hasFetched` | `Checking server…` | `default` | disabled |
| `status.status === 'ready'` | `Server ready` | `success` | enabled |
| `status.status === 'loading'` | `Worker starting…` | `warning` | enabled |
| `status.status === 'serverless'` | `Server idle` | `warning` | enabled |
| `status.status === 'error'` | `Server error` | `error` | disabled |
| `hasFetched && status == null` | `Server unreachable` | `error` | disabled |
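
One way to derive the chip from `{ status, hasFetched }` is a single pure function that follows the mapping row by row. This is a sketch under assumptions: `ChipState` and `deriveChip` are hypothetical names, the color strings are taken from the mapping as written, and `ServerStatus` is reduced to its `status` field.

```typescript
type ServerStatusValue = 'ready' | 'loading' | 'serverless' | 'error';

interface ServerStatus {
  status: ServerStatusValue;
}

interface ServerStatusState {
  status: ServerStatus | null;
  hasFetched: boolean;
}

type ChipColor = 'default' | 'success' | 'warning' | 'error';

// Hypothetical: what the chip component needs in order to render.
interface ChipState {
  label: string;
  color: ChipColor;
}

function deriveChip({ status, hasFetched }: ServerStatusState): ChipState {
  // No response yet: the UI has no information, so it says so.
  if (!hasFetched) {
    return { label: 'Checking server…', color: 'default' };
  }
  // hasFetched is true but there is no status. With a hook that keeps the
  // last known status this acts mostly as a defensive fallback.
  if (status === null) {
    return { label: 'Server unreachable', color: 'error' };
  }
  switch (status.status) {
    case 'ready':
      return { label: 'Server ready', color: 'success' };
    case 'loading':
      return { label: 'Worker starting…', color: 'warning' };
    case 'serverless':
      return { label: 'Server idle', color: 'warning' };
    case 'error':
      return { label: 'Server error', color: 'error' };
  }
}
```

Because the switch is exhaustive over the union, adding a new status value later should produce a compile error here rather than a silently missing chip.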

Button-enable rule becomes:

const canGenerate = ['ready', 'loading', 'serverless'].includes(status?.status ?? '');
// ...
disabled={loading || !prompt.trim() || !canGenerate}
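
The button rule above could also be pulled into a pure function so it can be unit-tested without rendering the component. A minimal sketch, assuming hypothetical names (`GENERATE_ALLOWED`, `GenerateButtonInputs`, `isGenerateDisabled`) and a `ServerStatus` reduced to its `status` field:

```typescript
type ServerStatusValue = 'ready' | 'loading' | 'serverless' | 'error';

interface ServerStatus {
  status: ServerStatusValue;
}

// Statuses from which a POST /api/generate-single may be fired.
// 'loading' and 'serverless' are included so the first request can wake a worker.
const GENERATE_ALLOWED: ReadonlySet<string> = new Set([
  'ready',
  'loading',
  'serverless',
]);

// Hypothetical: the inputs the component already has in scope.
interface GenerateButtonInputs {
  loading: boolean; // a generation request is in flight
  prompt: string;
  status: ServerStatus | null;
}

function isGenerateDisabled({
  loading,
  prompt,
  status,
}: GenerateButtonInputs): boolean {
  const canGenerate = status !== null && GENERATE_ALLOWED.has(status.status);
  return loading || !prompt.trim() || !canGenerate;
}
```

In the component this would read `disabled={isGenerateDisabled({ loading, prompt, status })}`, which behaves the same as the inline expression.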

Helper caption under Generate button

Render only when status.status === 'loading' or status.status === 'serverless':

First request after idle may take ~60s while a GPU worker spins up.

Small caption typography, muted color (text.secondary), same block as the existing statusMessage rendering.
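
The caption condition can be expressed as a small helper that returns either the text or `null`. This is a sketch only: `COLD_START_CAPTION` and `coldStartCaption` are hypothetical names, and the typography and color are left to the rendering code.

```typescript
type ServerStatusValue = 'ready' | 'loading' | 'serverless' | 'error';

interface ServerStatus {
  status: ServerStatusValue;
}

const COLD_START_CAPTION =
  'First request after idle may take ~60s while a GPU worker spins up.';

// Returns the caption text when a cold start is likely, otherwise null
// so the caller can skip rendering the caption entirely.
function coldStartCaption(status: ServerStatus | null): string | null {
  const value = status?.status;
  return value === 'loading' || value === 'serverless'
    ? COLD_START_CAPTION
    : null;
}
```

Returning `null` rather than an empty string keeps the render site to a single truthiness check.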

Types

Add 'serverless' to ServerStatus['status'] if the union does not already include it. The Workers proxy already emits 'serverless', so the frontend type may simply be missing the string; verify during implementation. The "Checking server…" state is derived from hasFetched and needs no status value of its own.
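
If the union is rebuilt, one option is to derive it from a single constant list so the type and a runtime check cannot drift apart. The sketch below assumes the union consists of the four values used in the chip mapping; `SERVER_STATUS_VALUES` and `isServerStatusValue` are hypothetical names, not existing exports of generationApi.ts.

```typescript
// Assumed value set: the four statuses that appear in the chip mapping.
const SERVER_STATUS_VALUES = ['ready', 'loading', 'serverless', 'error'] as const;

// Derived union: 'ready' | 'loading' | 'serverless' | 'error'
type ServerStatusValue = (typeof SERVER_STATUS_VALUES)[number];

// Runtime guard for values arriving from the proxy response.
function isServerStatusValue(value: unknown): value is ServerStatusValue {
  return (
    typeof value === 'string' &&
    (SERVER_STATUS_VALUES as readonly string[]).includes(value)
  );
}
```

A guard like this would also catch a status string the proxy emits but the frontend type does not yet know about.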

Files

Modified:

packages/web/frontend/src/lib/generationApi.ts
  useServerStatus: return { status, hasFetched } instead of ServerStatus | null
  Keep transient errors from zeroing out status once hasFetched is true
  Update ServerStatus['status'] union to include 'serverless' if missing
packages/web/frontend/src/components/tools/GovernedGenerationTool/index.tsx
  Consume new hook shape
  Six-state chip mapping (table above)
  canGenerate derived from allowed status set
  Helper caption shown during loading / serverless

Unchanged:

Workers proxy (packages/web/workers/src/routes/generation.ts) — already emits correct states
Store, compiler, API contract, all other tools

Verification

Manual browser check on http://localhost:3000 (dev) and on https://develop.phonolex.pages.dev (staging) after deploy:
  Initial render shows "Checking server…" (grey/default), button disabled
  If staging RunPod is cold → chip reads "Server idle", button enabled, caption visible
  Click Generate → SSE status "Connecting to GPU… first request after idle may take ~60s." appears (already emitted by Workers proxy)
  RunPod worker spins up → chip transitions through "Worker starting…" → "Server ready"
  Kill staging Workers briefly (or point dev frontend at an unreachable URL) → chip goes to "Server unreachable" only after the first successful poll has landed; transient blips don't flicker
npm run type-check passes after the ServerStatus['status'] union change
npm run lint passes
19 existing tests still green (store + compiler untouched)
npm run build clean

Out of scope

No change to the Workers proxy or RunPod configuration
No cold-start optimization (min-workers, keep-alive) — that's an ops concern, separate ticket if desired
No changes to error taxonomy beyond the six states above
No toast/snackbar notifications — inline chip + caption only

Follow-ups

If cold-start + "loading" latency is still a bad UX, consider a RunPod min-workers setting or a pre-warm request on tool mount. Track separately under PHON-7 (Operations maturity).