Server & SPA

Build pipeline

web/         ── vite build ──▶  server/embedded/   ── go build ──▶  pluma binary
  src/                            assets/
  index.html                      index.html
  1. npm run build (from web/) runs Vite. Output lands in server/embedded/.
  2. go build (from server/) compiles the Go code. embed.go has //go:embed embedded so the SPA is pulled into the binary at compile time.
  3. The server serves the SPA from the embedded FS for everything that isn't an /api/* route.
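
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not pluma's actual code: an in-memory fstest.MapFS stands in for the embedded directory so the block runs on its own, and the names spaHandler, newMux, demoAssets and get, the /api/ping route, and the fallback to index.html for unknown client-side routes are all assumptions.

```go
package main

import (
	"fmt"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"path"
	"strings"
	"testing/fstest"
)

// spaHandler serves static files from assets. Any path that is not a file
// gets index.html so that client-side routes still load the SPA
// (this fallback is an assumption, not confirmed by the docs).
func spaHandler(assets fs.FS) http.Handler {
	files := http.FileServer(http.FS(assets))
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		name := strings.TrimPrefix(path.Clean("/"+r.URL.Path), "/")
		// index.html is handled below because FileServer redirects it to "/".
		if name != "index.html" {
			if info, err := fs.Stat(assets, name); err == nil && !info.IsDir() {
				files.ServeHTTP(w, r)
				return
			}
		}
		index, err := fs.ReadFile(assets, "index.html")
		if err != nil {
			http.Error(w, "index.html missing", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		w.Write(index)
	})
}

// newMux routes /api/* to API handlers and everything else to the SPA.
func newMux(assets fs.FS) *http.ServeMux {
	api := http.NewServeMux()
	api.HandleFunc("/api/ping", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "pong") // hypothetical endpoint, for the demo only
	})
	mux := http.NewServeMux()
	mux.Handle("/api/", api) // unknown /api/* paths get a 404, not the SPA
	mux.Handle("/", spaHandler(assets))
	return mux
}

// demoAssets stands in for the embedded build output. In the real binary
// the fs.FS would come from an embed.FS filled by a go:embed directive.
func demoAssets() fs.FS {
	return fstest.MapFS{
		"index.html":    {Data: []byte("<!doctype html><div id=app></div>")},
		"assets/app.js": {Data: []byte("console.log('pluma')")},
	}
}

// get runs one in-process GET and returns the status code and body.
func get(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newMux(demoAssets())
	for _, p := range []string{"/", "/chats/42", "/assets/app.js", "/api/ping", "/api/nope"} {
		code, body := get(mux, p)
		fmt.Printf("%-16s %d %q\n", p, code, body)
	}
}
```

Registering the API under its own sub-mux keeps unknown /api/* paths from silently returning the SPA shell, which makes client bugs easier to spot.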

make build runs both steps in order and produces a single binary at ./pluma.

For local development without rebuilding the binary on every edit, see Dev loop.

The middleware chain

Outer to inner:

withTrustedProxies → withLogging → withNoStoreOnUserContent
  → withAuth → withHostAllowlist → withCORS → mux
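
A minimal sketch of how an outer-to-inner chain like this is typically composed in Go. The chain and named helpers are hypothetical, and the middlewares here are stand-ins that only record their names, so the block shows ordering rather than pluma's real layers.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed is the outermost layer,
// i.e. the first to see each request.
func chain(h http.Handler, mws ...middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// named returns a stand-in middleware that records its name and passes the
// request on. Real layers would inspect or modify the request here.
func named(name string, trace *[]string) middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// traceOrder sends one request through the chain and reports the order in
// which the layers ran.
func traceOrder() string {
	var trace []string
	mux := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "mux")
	})
	h := chain(mux,
		named("withTrustedProxies", &trace),
		named("withLogging", &trace),
		named("withNoStoreOnUserContent", &trace),
		named("withAuth", &trace),
		named("withHostAllowlist", &trace),
		named("withCORS", &trace),
	)
	req := httptest.NewRequest(http.MethodGet, "/api/conversations", nil)
	h.ServeHTTP(httptest.NewRecorder(), req)
	return strings.Join(trace, " > ")
}

func main() {
	fmt.Println(traceOrder())
}
```

Wrapping from the last middleware backwards is what lets the call site list the layers in the same outer-to-inner order as the diagram.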

| Middleware | What it does |
| --- | --- |
| withTrustedProxies | Rewrites RemoteAddr based on X-Forwarded-For, walking right-to-left and skipping any hop in trusted_proxies. Runs first so every downstream layer sees the real client IP. |
| withLogging | Access log to stderr, redacting path-with-id metadata. Skips chat content entirely. Gated on log_requests. |
| withNoStoreOnUserContent | Sets Cache-Control: no-store on /api/conversations*, /api/characters*, /api/personas*, /api/attachments* so browser caches don't hold sensitive data. |
| withAuth | WebAuthn passkey session check. Per-RPID grace lets first enrolment through; loopback_auth_bypass exempts 127.0.0.1 / ::1. |
| withHostAllowlist | Matches the request host against allowed_hosts (IPs, CIDRs, hostname patterns). Loopback is always allowed. The tsnet listener skips this layer entirely. |
| withCORS | Standard CORS for OPTIONS preflights. |
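
The right-to-left walk in withTrustedProxies can be sketched as below. This is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, the peer is passed as a bare IP (a real RemoteAddr carries a port that would need net.SplitHostPort), and the rule that the header is only honoured when the direct peer is itself trusted is an assumption.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// isTrusted reports whether addr falls inside any trusted prefix.
func isTrusted(addr netip.Addr, trusted []netip.Prefix) bool {
	for _, p := range trusted {
		if p.Contains(addr.Unmap()) {
			return true
		}
	}
	return false
}

// clientIP walks X-Forwarded-For from right to left and returns the first
// hop that is not a trusted proxy. Entries further left are ignored because
// the client can forge them.
func clientIP(peer, xff string, trusted []netip.Prefix) string {
	peerAddr, err := netip.ParseAddr(peer)
	if err != nil || !isTrusted(peerAddr, trusted) {
		return peer // assumption: ignore the header from untrusted peers
	}
	hops := strings.Split(xff, ",")
	for i := len(hops) - 1; i >= 0; i-- {
		hop, err := netip.ParseAddr(strings.TrimSpace(hops[i]))
		if err != nil {
			return peer // empty or malformed entry: fall back to the peer
		}
		if !isTrusted(hop, trusted) {
			return hop.String()
		}
	}
	return peer // every hop was a trusted proxy
}

// demoTrusted is an example trusted_proxies value (hypothetical).
func demoTrusted() []netip.Prefix {
	return []netip.Prefix{
		netip.MustParsePrefix("10.0.0.0/8"),
		netip.MustParsePrefix("127.0.0.1/32"),
	}
}

func main() {
	// The leftmost entry is client-supplied and must not win.
	fmt.Println(clientIP("10.0.0.2", "198.51.100.9, 203.0.113.7, 10.0.0.1", demoTrusted()))
	// An untrusted peer cannot influence the result through the header.
	fmt.Println(clientIP("203.0.113.50", "198.51.100.9", demoTrusted()))
}
```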

The tsnet (Tailscale) listener uses a different middleware chain — same auth + caching layers, but the host allowlist is omitted (tailnet membership is itself an auth gate).
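
The caching layer shared by both chains is small enough to sketch in full. The version below assumes the trailing * in the table means a plain path-prefix match. The noStorePrefixes variable and the cacheControlFor helper are hypothetical names added for the demonstration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// noStorePrefixes lists the user-content routes named in the table above.
var noStorePrefixes = []string{
	"/api/conversations",
	"/api/characters",
	"/api/personas",
	"/api/attachments",
}

// withNoStoreOnUserContent sets Cache-Control: no-store on matching paths
// before handing the request to the next layer.
func withNoStoreOnUserContent(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, p := range noStorePrefixes {
			if strings.HasPrefix(r.URL.Path, p) {
				w.Header().Set("Cache-Control", "no-store")
				break
			}
		}
		next.ServeHTTP(w, r)
	})
}

// cacheControlFor runs one in-process GET through the middleware and
// returns the Cache-Control header of the response.
func cacheControlFor(target string) string {
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, target, nil)
	withNoStoreOnUserContent(inner).ServeHTTP(rec, req)
	return rec.Result().Header.Get("Cache-Control")
}

func main() {
	fmt.Printf("%q\n", cacheControlFor("/api/conversations/42"))
	fmt.Printf("%q\n", cacheControlFor("/api/models"))
}
```

Setting the header before calling the next handler matters: once the inner handler has written the response, header changes no longer reach the client.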

How the SPA reaches the API

Same-origin. The Vite build output is served by the same Go binary on the same port, so fetch('/api/...') just works.

In dev (make web-dev), Vite serves the SPA on :5173 and proxies /api/* to pluma on :8787. The frontend fetch code is the same in both modes.

State in the SPA

Svelte 5 runes. Stores live in web/src/lib/*.svelte.ts:

  • conversationStore — the list of chats; init() fetches on first read; update() mutates in place.
  • characterStore, personaStore, connectionStore, samplerStore, imageConnectionStore, tailscaleStore, authStore, setupStore, userStore, voiceLibrary (in api.ts).

A store is an object with $state/$derived properties exposed via getters. Reactivity is automatic; UI components subscribe by reading.

The theme system (web/src/themes/index.ts) registers themes in a Map<id, Theme>, applies the active theme by calling style.setProperty for each token, and persists the active id to localStorage.

Background work

| Job | Where |
| --- | --- |
| Auto-portrait on character save | goroutine in characters.go after PUT; status polled at /api/characters/{id}/avatar/status |
| Auto-titler on first user-driven exchange | goroutine in handlers.go; runs a non-streaming chat completion against the same model |
| Model download | per-job goroutine in model_download.go; cancel via POST /api/models/downloads/{id}/cancel |
| TTS speak | synchronous; each call blocks the requesting connection until the upstream returns audio bytes |
| Tailscale auth-URL watch | goroutine in tsnet.go; ticks every 750 ms while Up() is blocked; exits when the node enters running |

All job state is in-memory only; there is no persistent job queue. After a restart, jobs recover by simply re-running. The auto-portrait job, for example, re-checks the avatar status and does no work if the portrait is already done.
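
The pattern behind these jobs (a goroutine per job, status held in memory for polling, cancellation on request) can be sketched as follows. The jobs type, its methods, and the status strings are hypothetical and chosen for illustration; the actual code in characters.go and model_download.go may be organised differently.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// jobs is an in-memory registry. Nothing is persisted, so a restart
// starts from an empty map.
type jobs struct {
	mu     sync.Mutex
	status map[string]string
	cancel map[string]context.CancelFunc
	wg     sync.WaitGroup
}

func newJobs() *jobs {
	return &jobs{
		status: map[string]string{},
		cancel: map[string]context.CancelFunc{},
	}
}

// start runs work in its own goroutine and records the outcome under id.
func (j *jobs) start(id string, work func(ctx context.Context) error) {
	ctx, cancel := context.WithCancel(context.Background())
	j.mu.Lock()
	j.status[id] = "running"
	j.cancel[id] = cancel
	j.mu.Unlock()

	j.wg.Add(1)
	go func() {
		defer j.wg.Done()
		defer cancel() // release the context once the job is finished
		err := work(ctx)
		j.mu.Lock()
		defer j.mu.Unlock()
		switch {
		case ctx.Err() != nil:
			j.status[id] = "cancelled"
		case err != nil:
			j.status[id] = "failed"
		default:
			j.status[id] = "done"
		}
	}()
}

// stop cancels a job, as a cancel endpoint handler might. It reports
// whether the id was known.
func (j *jobs) stop(id string) bool {
	j.mu.Lock()
	cancel, ok := j.cancel[id]
	j.mu.Unlock()
	if ok {
		cancel()
	}
	return ok
}

// get returns the current status, as a status-polling handler might.
func (j *jobs) get(id string) string {
	j.mu.Lock()
	defer j.mu.Unlock()
	return j.status[id]
}

// demo runs one job to completion and cancels another.
func demo() (string, string) {
	j := newJobs()
	j.start("portrait", func(ctx context.Context) error { return nil })
	j.start("download", func(ctx context.Context) error {
		<-ctx.Done() // simulate long work that only ends when cancelled
		return ctx.Err()
	})
	j.stop("download")
	j.wg.Wait()
	return j.get("portrait"), j.get("download")
}

func main() {
	portrait, download := demo()
	fmt.Println("portrait:", portrait)
	fmt.Println("download:", download)
}
```

Because the registry is just a pair of maps, recovery after a restart has to come from the job itself being safe to re-run, which matches the re-check behaviour described above.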