Local AI Coding Agents on Mac — running them without OOM

Local coding agents you can run on Mac.

Local AI on a Mac splits into two tiers. Model runtimes (Ollama, llama.cpp, LM Studio, MLX) load weights and expose inference. Agent runtimes, the subject of this page, sit on top and orchestrate multi-step work: Claude Code, Codex, OpenCode, Copilot CLI, Aider, OpenClaw, and more. Each one follows the same workflow: load a 32B–70B model, point it at your repo, watch it work.

And each one hits the same wall: OOM mid-task when Chrome, Docker, or a stale dev server eats the unified memory the model needed.

Below: the 9 agents we cover, with their memory profiles and DevPulse pre-flight commands. Click any agent for the full setup.

Agent	Description	Recommended model	Min RAM	OOM risk
Claude Code	Anthropic's coding agent with subagents, tool use, and multi-file edits.	llama3.3-70b	64 GB	High
Codex	OpenAI's open-source coding agent for the terminal.	qwen2.5-coder-32b	48 GB	High
OpenCode	Open-source coding agent with strong local-model support.	qwen2.5-coder-32b	48 GB	High
GitHub Copilot CLI	GitHub's AI coding agent for the terminal.	qwen2.5-coder-32b	32 GB	Medium
OpenClaw	Personal AI assistant with 100+ skills; runs locally via Ollama.	qwen3-32b	32 GB	Medium
Hermes Agent	Self-improving open-source agent with persistent cross-session memory.	llama3.3-70b	64 GB	High
Droid	Factory's terminal-first coding agent; top Terminal-Bench score, IDE integrations.	qwen2.5-coder-32b	48 GB	High
Pi	Minimal, aggressively extensible coding agent CLI with native Ollama support.	qwen2.5-coder-32b	24 GB	Low
Aider	AI pair programming in your terminal; works with local Ollama models.	qwen2.5-coder-32b	48 GB	High
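
One rough way to read the Min RAM column is to estimate how much unified memory the weights alone take. The sketch below assumes 4-bit quantized weights, which this page does not state, and it ignores the KV cache and runtime overhead, so treat the numbers as a floor. The function names and the other_gb figure are illustrative, not part of DevPulse.

```python
def weights_gb(params_billion: float, bits_per_weight: float = 4.0) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    ASSUMPTION: 4-bit quantization by default; the page does not say which
    quantization the recommended models use.
    """
    return params_billion * bits_per_weight / 8


def headroom_gb(
    total_ram_gb: float,
    params_billion: float,
    bits_per_weight: float = 4.0,
    other_gb: float = 0.0,
) -> float:
    """Unified memory left after the weights and everything else already running.

    other_gb is a hypothetical figure for Chrome, Docker, dev servers, and so on.
    KV cache and runtime overhead are NOT included, so real headroom is lower.
    """
    return total_ram_gb - weights_gb(params_billion, bits_per_weight) - other_gb
```

Under these assumptions a 70B model needs about 35 GB for weights, so headroom_gb(64, 70) is 29.0 on a 64 GB Mac, and headroom_gb(64, 70, other_gb=30.0) is already negative. That is the mid-task OOM described above.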

The pattern

Same fix across every agent.

Whichever agent you pick, the operational problem is identical: your stack ate the unified memory before the model got a slice. DevPulse's pre-flight check + auto-clean is the universal fix.

Before launch

devpulse ai --before-load <MB> --auto-clean reclaims idle Ollama models, kills zombie LSPs, then re-evaluates. Exit 0 = safe to launch.
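
A minimal sketch of gating an agent launch on that exit code, assuming devpulse is on your PATH and behaves as described above. The function name preflight_ok and the injectable run parameter are illustrative; only the command and flags come from this page.

```python
import subprocess


def preflight_ok(required_mb: int, run=subprocess.run) -> bool:
    """Run the DevPulse pre-flight check and report whether it exited 0.

    required_mb fills the <MB> placeholder from the command above.
    run is injectable so the function can be exercised without devpulse installed.
    """
    cmd = ["devpulse", "ai", "--before-load", str(required_mb), "--auto-clean"]
    try:
        result = run(cmd)
    except FileNotFoundError:
        # devpulse is not installed or not on PATH: nothing was checked,
        # so do not report the launch as safe.
        return False
    # Per the page, exit code 0 means it is safe to launch.
    return result.returncode == 0
```

A wrapper script could call preflight_ok(...) with the model's expected footprint in MB and start the agent only when it returns True. The value you pass is your own estimate, not something DevPulse supplies.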

During the run

devpulse babysit --target-free-mb 8192 --json watches free memory + battery + swap. Auto-cleans when pressure builds, emits NDJSON checkpoint signals.
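
NDJSON means one JSON object per line, so the output can be consumed as a stream. The sketch below parses such a stream without assuming any field names, since the event schema is not documented on this page. The function name iter_ndjson is illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per well-formed NDJSON line.

    Blank lines, lines that are not valid JSON, and JSON values that are not
    objects are skipped, so a stray log line does not stop the consumer.
    """
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict):
            yield event
```

One way to use it is to pipe the babysit command into a script that loops over iter_ndjson(sys.stdin) and reacts to each event. Which keys to inspect depends on DevPulse's actual output.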

If something goes wrong

Fix Ollama OOM on Mac → covers the most common failure modes and the diagnostic commands to find them.

Stop letting your stack OOM your local agents.

DevPulse is free, native, and uses less RAM than this webpage.

macOS 14+ · Apple Silicon & Intel · Free during launch