First run

When you launch Pluma for the first time, three things happen automatically:

  1. Data directory created at ~/.config/pluma (macOS / Linux) or %AppData%\pluma (Windows). Holds your config, conversations, characters, and downloaded models.
  2. Browser opens at the listen URL (default http://localhost:8787). Disable with -open=false or set open_browser = false in config.
  3. Setup wizard runs in the browser.
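The browser launch in step 2 can also be turned off persistently. A minimal config.toml sketch using the `open_browser` key named above (file path per the data-directory note in step 1):

```toml
# ~/.config/pluma/config.toml  (on Windows: %AppData%\pluma\config.toml)
open_browser = false
```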

The wizard

Three screens, optimised for time-to-first-chat.

1. Connect an LLM

Pluma scans localhost for running OpenAI-compatible servers:

  • Ollama (default port 11434)
  • LM Studio (1234)
  • mlx_lm.server (1234)
  • llama.cpp / llama-server (8080)
  • vLLM (8000)
  • text-generation-webui (5000)
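The scan above boils down to checking whether anything is listening on those ports. A rough sketch of the idea as a plain TCP reachability probe (Pluma's actual detection logic may differ; the port list is taken from the bullets above):

```python
import socket

# Default ports of common OpenAI-compatible servers (from the list above).
CANDIDATES = {
    "Ollama": 11434,
    "LM Studio / mlx_lm.server": 1234,
    "llama.cpp / llama-server": 8080,
    "vLLM": 8000,
    "text-generation-webui": 5000,
}

def detect_local_servers(candidates=CANDIDATES, host="127.0.0.1", timeout=0.25):
    """Return the names of candidates whose port accepts a TCP connection."""
    found = []
    for name, port in candidates.items():
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on a successful handshake, an errno otherwise.
            if sock.connect_ex((host, port)) == 0:
                found.append(name)
    return found
```

A short timeout keeps the whole scan under a couple of seconds even when nothing is running, since refused connections fail fast.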

If one's running, click Auto-detect and Pluma adds it as a connection profile and activates it.

If nothing's running, click Enter a connection manually to type a base URL + optional API key, or pick a template from the dropdown (Ollama / LM Studio / OpenAI / Anthropic / etc.). No LLM at all yet? The "Don't have one? Download a model →" link opens the model browser.
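Whichever route you take, the wizard stores the result as a connection profile in your data directory. A sketch of what such a profile might look like in config.toml — the key names here are illustrative, not Pluma's documented schema:

```toml
# Hypothetical connection profile — field names are illustrative only;
# the wizard writes the real equivalent for you.
[[connections]]
name = "ollama-local"
base_url = "http://localhost:11434/v1"
api_key = ""  # optional; most local servers ignore it
```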

2. Pick a model

The picker shows whatever your connection advertises through /v1/models. Click one. If the list is empty you'll see a CTA to the model browser.
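Populating that picker amounts to one GET request. A minimal sketch of fetching and parsing the standard OpenAI-style response (the base URL is whatever your connection profile points at):

```python
import json
import urllib.request

def parse_model_ids(payload):
    # OpenAI-compatible servers return {"object": "list", "data": [{"id": ...}, ...]}
    return [m["id"] for m in payload.get("data", [])]

def list_models(base_url):
    """Fetch /v1/models from an OpenAI-compatible server and return model ids."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=5) as resp:
        return parse_model_ids(json.load(resp))
```

An empty `data` array yields an empty list, which is the case where the picker shows the model-browser CTA instead.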

3. Your name

Fills {{user}} in character cards. Default "You". Skip-friendly.
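The {{user}} fill can be pictured as simple placeholder replacement — a sketch of the idea only, since Pluma's actual card templating may handle more placeholders:

```python
def fill_user(card_text, user_name="You"):
    """Replace the {{user}} placeholder in a character card with the chosen name."""
    return card_text.replace("{{user}}", user_name)
```

With the name skipped, the default "You" is substituted.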

Hit Start chatting and you land in a fresh chat with the built-in Pluma character.

What got skipped

  • Tailscale — only useful for remote-device access. Configure later under Settings → Privacy. See Multi-device access.
  • Passkeys — /api/* is open over loopback by default. Enrol passkeys when you expose Pluma to other devices. See Passkeys.

Both deferred deliberately: they're remote-access concerns, not first-chat blockers.

Re-running the wizard

The completion flag lives in config.toml:

setup_completed = true

Flip it back to false and reload, and the wizard runs again. Useful for testing or after a config reset.