Coding agents · running locally
Local AI on a Mac splits into two tiers. Model runtimes (Ollama, llama.cpp, LM Studio, MLX) load weights and expose inference. Agent runtimes (the ones below) sit on top and orchestrate multi-step work: Claude Code, Codex, OpenCode, Copilot CLI, Aider, OpenClaw, and more. Each one solves the same problem: load a 32B–70B model, point it at your repo, and watch it work.
And each one hits the same wall: OOM mid-task when Chrome, Docker, or a stale dev server eats the unified memory the model needed.
Below: the 9 agents we cover, each with its memory profile and DevPulse pre-flight commands. Click any agent for the full setup.
The pattern
Whichever agent you pick, the operational problem is identical: the rest of your stack ate the unified memory before the model got a slice. DevPulse's pre-flight check with auto-clean applies to all of them.
devpulse ai --before-load <MB> --auto-clean reclaims idle Ollama models, kills zombie LSPs, then re-evaluates. Exit 0 = safe to launch.
devpulse babysit --target-free-mb 8192 --json watches free memory, battery, and swap. It auto-cleans when pressure builds and emits NDJSON checkpoint signals.
Fix Ollama OOM on Mac → covers the most common failure modes and the diagnostic commands to find them.
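As a rough illustration of how the pre-flight and babysit commands above could be combined, here is a minimal shell sketch. It uses only the flags quoted on this page and relies on the stated "exit 0 = safe to launch" behavior. The names launch_if_safe, DEVPULSE_BIN, and BABYSIT_LOG are hypothetical helpers invented for this example, the 20480 MB figure is an arbitrary placeholder, and the sketch assumes that babysit keeps running until it is stopped.

```shell
#!/bin/sh
# Sketch only: gate an agent launch on the DevPulse pre-flight exit code,
# then log babysit's NDJSON output while the agent runs.
# DEVPULSE_BIN, BABYSIT_LOG and launch_if_safe are hypothetical names.
: "${DEVPULSE_BIN:=devpulse}"
: "${BABYSIT_LOG:=/tmp/devpulse-babysit.ndjson}"

launch_if_safe() {
    required_mb="$1"
    shift

    # Per the text above, exit 0 from the pre-flight means "safe to launch".
    if ! "$DEVPULSE_BIN" ai --before-load "$required_mb" --auto-clean; then
        echo "pre-flight failed: could not free ${required_mb} MB" >&2
        return 1
    fi

    # Assumption: babysit runs until stopped. Append its NDJSON lines to a log.
    "$DEVPULSE_BIN" babysit --target-free-mb 8192 --json >>"$BABYSIT_LOG" &
    babysit_pid=$!

    # Run the agent (or any command) and remember its exit status.
    status=0
    "$@" || status=$?

    # Stop the watcher; ignore errors if it has already exited.
    kill "$babysit_pid" 2>/dev/null || true
    wait "$babysit_pid" 2>/dev/null || true
    return "$status"
}

# Example (placeholder memory figure and command):
#   launch_if_safe 20480 aider
```

The function returns the agent's own exit status, so it can sit in front of whichever agent you launch without changing how scripts around it detect failure.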
DevPulse is free, native, and uses less RAM than this webpage.
Download for macOS