Ollama troubleshooting · macOS
You ran ollama run llama3.3:70b on a Mac with 64 GB of RAM, and Ollama killed the load with an OOM. Activity Monitor says ~30 GB free. What gives?
Apple Silicon uses unified memory — CPU and GPU share the same pool. macOS caps how much the GPU process can claim (around 75% of total RAM by default). The rest of your stack — Chrome, Docker, IDE, stale dev servers — eats into that ceiling before Ollama gets a slice.
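To make the ceiling concrete, here is a back-of-the-envelope sketch of the arithmetic. The 0.75 fraction and the ~42 GB model size are this page's approximate figures, and the function names are purely illustrative; nothing here queries macOS or Ollama.

```python
# Rough headroom estimate under the default GPU memory cap.
# Assumptions: cap is ~75% of total RAM, and the model needs ~42 GB
# (both are approximate figures quoted on this page).

def gpu_budget_gb(total_ram_gb: float, cap_fraction: float = 0.75) -> float:
    """Approximate amount of unified memory the GPU may claim."""
    return total_ram_gb * cap_fraction


def headroom_gb(total_ram_gb: float, model_gb: float) -> float:
    """Budget left once the model is loaded; negative means it cannot fit."""
    return gpu_budget_gb(total_ram_gb) - model_gb


if __name__ == "__main__":
    print(gpu_budget_gb(64))    # 48.0
    print(headroom_gb(64, 42))  # 6.0
```

On a 64 GB Mac that leaves only about 6 GB of slack under the cap, which is why memory held elsewhere matters so much.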
The model is fine. It's your stack. Two commands fix it.
The fix · 30 seconds
DevPulse's CLI knows what's safe to reclaim — idle Ollama models, orphaned LSPs, stale dev servers — and unloads them before retrying the load.
Exit code semantics for use in scripts: 0 fits · 1 won't fit · 2 fits after unload · 3 tight. Branch in shell, no parsing required.
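If the calling script is written in Python rather than shell, the same branching might look like the sketch below. Only the four exit codes come from this page; the `FitStatus` enum, the step names, and `next_step` are hypothetical helpers, and the DevPulse command that would produce the return code is deliberately left out.

```python
from enum import IntEnum


class FitStatus(IntEnum):
    """Exit codes as listed above."""
    FITS = 0
    WONT_FIT = 1
    FITS_AFTER_UNLOAD = 2
    TIGHT = 3


# Hypothetical step names a wrapper script might act on.
NEXT_STEP = {
    FitStatus.FITS: "load",
    FitStatus.WONT_FIT: "abort",
    FitStatus.FITS_AFTER_UNLOAD: "unload-then-load",
    FitStatus.TIGHT: "load-with-caution",
}


def next_step(returncode: int) -> str:
    """Map a process return code to a step; raises ValueError on unknown codes."""
    return NEXT_STEP[FitStatus(returncode)]
```

In practice `returncode` would come from something like `subprocess.run([...]).returncode`. Failing loudly on an unlisted code keeps a script from silently treating an unexpected error as "fits".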
The actual culprits · in order
Across hundreds of developer Macs, the offenders are stubbornly consistent. Run devpulse processes -n 8 and you'll almost certainly see this list.
Chrome: each tab is a separate process, so 50 tabs means 50 renderer processes. Chrome Shame Score →
Docker Desktop's VM hoards memory whether containers are running or not. Quit it before loading a 70B if you don't need containers right now.
Ollama keeps recently used models in memory for 5 minutes by default. If you tested qwen:7b earlier, it's still resident. --auto-clean unloads them.
TypeScript language servers, ESLint daemons, and file watchers linger from projects you closed days ago. devpulse zombies --kill clears them.
Slack, Discord, Notion, Cursor — each ships its own Chromium runtime. Quit the ones you're not actively using.
Next.js, Vite, Webpack, nodemon — all happy to sit at 500 MB each forever. DevPulse flags these by project so you know which to kill.
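For the idle Ollama models in the list above, Ollama's own HTTP API can also evict a model by hand: a request to /api/generate with keep_alive set to 0 asks the server to unload it. The sketch below assumes Ollama is listening on its default local port; the helper names are illustrative, and this is not a description of how --auto-clean is implemented.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address (assumed)


def build_unload_request(model: str, base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST to /api/generate asking Ollama to drop `model` from memory."""
    # keep_alive=0 with no prompt tells the server to unload the model immediately.
    payload = json.dumps({"model": model, "keep_alive": 0}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unload(model: str) -> int:
    """Send the unload request and return the HTTP status code."""
    with urllib.request.urlopen(build_unload_request(model), timeout=10) as resp:
        return resp.status
```

Calling `unload("qwen:7b")` on a machine where Ollama is running should release that model's memory before a larger load; `ollama ps` shows what is currently resident.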
For long-running workloads
If you're running an agent loop or processing a queue, devpulse babysit watches free memory + battery + swap and auto-cleans when pressure builds. Built for the 11-hour-flight-with-a-70B workflow.
Common questions
Unified memory is shared CPU/GPU and capped (~75% of total) for GPU allocation. On 64 GB you get ~48 GB usable. Chrome alone routinely claims 20+ GB. The OOM is your stack, not the model.
To raise the cap, run sudo sysctl iogpu.wired_limit_mb=<MB> on Apple Silicon. The setting resets on reboot. For 64 GB Macs, ~57000 is reasonably safe; more risks instability. Freeing existing usage is usually the safer fix.
A 70B model needs ~42 GB at Q4_K_M. Full 70B compatibility table →
Closing Chrome often helps dramatically: 50+ tabs take 15–25 GB on most dev Macs, so closing it (or just most tabs) is usually the biggest single win.
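For the iogpu.wired_limit_mb answer above, the value is given in megabytes, so it helps to work it out rather than guess. This sketch assumes a simple "leave a fixed reserve for everything else" rule; the 8 GB reserve is an illustrative assumption chosen because it lands close to the ~57000 figure quoted above, not a documented recommendation.

```python
# Arithmetic for choosing an iogpu.wired_limit_mb value.
# Assumption: reserve a fixed number of GB for macOS and other apps.

def wired_limit_mb(total_ram_gb: int, reserve_gb: int) -> int:
    """MB value that leaves `reserve_gb` outside the GPU allocation."""
    if not 0 < reserve_gb < total_ram_gb:
        raise ValueError("reserve must be positive and smaller than total RAM")
    return (total_ram_gb - reserve_gb) * 1024


def limit_fraction(limit_mb: int, total_ram_gb: int) -> float:
    """Share of total RAM that a given limit represents."""
    return limit_mb / (total_ram_gb * 1024)


if __name__ == "__main__":
    print(wired_limit_mb(64, 8))                # 57344
    print(round(limit_fraction(57000, 64), 2))  # 0.87
```

A limit of 57000 MB on a 64 GB machine is roughly 87% of RAM, compared with the default of around 75%.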
DevPulse is free, native, and uses less RAM than this webpage.