## Model downloads fail
You picked a model in the HuggingFace browser and started a download, but it errors out or hangs.
Symptom: "download failed: 401 Unauthorized"¶
The model is gated: HuggingFace requires you to accept its terms first AND requires Pluma to send an auth token with the download request.
Fix:

- Sign in at https://huggingface.co/.
- Visit the model's page and click Agree and access.
- Generate a token at https://huggingface.co/settings/tokens with read scope.
- Stick the token in your env before launching Pluma (example below), or use a ~/.zshenv / ~/.bashrc entry so it persists.
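A minimal example, assuming a bash/zsh shell (paste your real token in place of the placeholder):

```sh
# set before launching Pluma so it can attach the token to HF requests
export HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx   # placeholder, not a real token
```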
Pluma forwards HF_TOKEN (or HUGGING_FACE_HUB_TOKEN) on every HF download request.
Symptom: "download failed: exceeds 100 GiB cap"¶
You hit max_model_download_bytes. Default is 100 GiB; most publicly-served quants are well under that, but the biggest MoE checkpoints exceed it.
Bump the limit in config.toml (the figure below is just an example; the value is in bytes):
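```toml
max_model_download_bytes = 214748364800  # 200 GiB (example value)
```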
Or disable the cap entirely (setting it to 0 is assumed to lift the limit; check Pluma's config reference if in doubt):
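```toml
max_model_download_bytes = 0  # assumed: 0 disables the size cap
```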
Same story for max_model_download_duration_seconds (default 6 hours), e.g. to allow 12 hours:
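```toml
max_model_download_duration_seconds = 43200  # 12 hours (example value)
```

Restart Pluma after editing.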
### Symptom: download hangs at <100% forever
A few possible causes:
- The HF mirror is rate-limiting you. Cancel + retry; HF cycles requests across mirrors.
- Your disk filled up. Check with `df -h ~/.config/pluma/models/`.
- Network dropped mid-download. Pluma's downloader doesn't resume on connection drops. Cancel + retry; the subsequent attempt starts fresh.
- The cap timeout fired silently. If the job is older than `max_model_download_duration_seconds`, Pluma marks it errored. Bump the cap if you're on a slow line.
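A quick terminal check for the disk-space case, assuming the default data dir under ~/.config/pluma/:

```sh
df -h ~/.config/pluma/models/    # free space on the volume holding the models
du -sh ~/.config/pluma/models/*  # size of each (possibly partial) download
```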
Symptom: "no MLX files in this repo" / "no GGUF files in this repo"¶
The format toggle (GGUF / MLX) limits results to repos that actually ship that format. Some popular repos ship only one — e.g. mlx-community/* is MLX-only; bartowski/* is GGUF-only.
Flip the toggle in the model browser to switch format. If you really want a GGUF of a model that only ships as MLX (or vice versa), look for community re-quants on HF.
### Symptom: model downloaded but Pluma "can't find" it
Pluma drops files into `<datadir>/models/<repo>/<file>`. Your LLM runtime needs to be pointed at that path:
| Runtime | Command |
|---|---|
| llama-server (llama.cpp) | `llama-server -m /path/to/<datadir>/models/<repo>/<file>.gguf` |
| mlx_lm.server | `mlx_lm.server --model /path/to/<datadir>/models/<repo>` |
mlx_lm wants the repo DIRECTORY (it loads multiple files); llama-server wants a specific .gguf file. After starting the runtime, hit Auto-detect in Pluma's connection settings to pick up the new model.
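Concrete invocations, assuming the default data dir (~/.config/pluma/) and hypothetical repo and file names; substitute your own paths:

```sh
# llama.cpp: point at the specific .gguf file
llama-server -m ~/.config/pluma/models/bartowski/SomeModel-GGUF/SomeModel-Q4_K_M.gguf

# MLX: point at the repo directory
mlx_lm.server --model ~/.config/pluma/models/mlx-community/SomeModel-4bit
```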
### Symptom: download starts then immediately errors
Often DNS / TLS / proxy issues. Quick checks:
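For example, assuming curl is installed:

```sh
curl -I https://huggingface.co   # can this machine reach HuggingFace over HTTPS?
```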
Should return 200 OK. If not, your environment can't reach HF — VPN, captive portal, corporate proxy. Pluma uses the system HTTP client, so any system-level HTTPS_PROXY env applies.
### When all else fails
Manual download:
- Go to the model's HF page in a browser.
- Click Files and versions.
- Download the file(s) you want.
- Drop them into `<datadir>/models/<repo>/` (example below).
- Point your runtime at it.
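For example, with a file fetched to ~/Downloads (hypothetical repo and file names; use your own):

```sh
mkdir -p ~/.config/pluma/models/bartowski/SomeModel-GGUF
mv ~/Downloads/SomeModel-Q4_K_M.gguf ~/.config/pluma/models/bartowski/SomeModel-GGUF/
```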
Pluma's HF browser is a convenience; you're never locked into it.