Local AI on Apple Silicon

Run local AI.
Skip the OOM.

DevPulse is the pre-flight check for local inference on Apple Silicon. Reclaim VRAM before loading a 70B, branch on exit codes from a one-liner, and stream pressure to long-running agents — so Ollama, llama.cpp, LM Studio, and MLX get the headroom they need.

Same binary also lives in your menu bar — see the local-AI playbook or jump straight to the CLI below.

Not local-maximalist. Open models still trail frontier by ~8 months once you adjust for tokens, evals, and distillation — the case for hybrid local + cloud →

Download Free · View on GitHub

macOS 14+ · Apple Silicon · <30 MB memory · No telemetry

37 / 64 GB (57%)

Free: 26.5 GB

Compressed: 4.8 GB

Swap: 6.9 GB

Do I need a new Mac?

Nope, you're good.

Peak 53 GB on 64 GB. Chrome is eating 22 GB but that's a Chrome problem, not a hardware problem.

Chrome (59): 22.5 GB

Cursor (23): 5.1 GB

Claude CLI (8): 4.5 GB

Notion (8): 3.0 GB

Slack (6): 981 MB

Everything in your menu bar

Built for developers who ship, not babysit.

Process Intelligence

Groups by project, not PID. Chrome's 59 helpers become one line. Node processes get attributed to the project that spawned them.

Chrome (59): 22.5 GB

unify (57): 1.7 GB

Docker (idle): 4.0 GB

Zombies (7): 209 MB

Per-project attribution
App family grouping
One-click breakdown
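
To make the grouping idea concrete, here is a minimal sketch that sums resident memory per app family from `ps` output. The rule of folding every "X Helper ..." process into "X" is an illustrative assumption, not DevPulse's actual attribution logic, and the function name `sum_rss_by_family` is hypothetical.

```shell
# Sum resident memory per app family.
# Input lines:  "<rss_kb> <command, possibly with a path and spaces>"
# Output lines: "<family> (<process count>)<TAB><total> MB", sorted by name.
# ASSUMPTION: folding "X Helper ..." into "X" is an illustrative rule only.
sum_rss_by_family() {
  awk '{
    rss = $1
    $1 = ""
    name = substr($0, 2)          # drop the separator left by clearing $1
    sub(/^.*\//, "", name)        # keep only the executable name
    sub(/ Helper.*$/, "", name)   # "Google Chrome Helper (GPU)" -> "Google Chrome"
    total[name] += rss
    count[name]++
  }
  END {
    for (n in total) printf "%s (%d)\t%d MB\n", n, count[n], total[n] / 1024
  }' | sort
}

# Usage on macOS (not run here):
#   ps -ax -o rss= -o comm= | sum_rss_by_family
```

Real per-project attribution would also need each process's working directory or parent chain, which this sketch leaves out.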

Auto-Optimizer

Runs every 5 minutes. Kills zombies, flags idle servers, warns about Chrome leaks. You don't think about it.

2m · Killed 4 zombies · +340 MB

7m · Chrome: 18.2 GB (47 tabs)

32m · Killed 2 zombies · +128 MB

1h · Flagged idle: api-server

12 killed

1.3 GB freed

3 alerts

Chrome & Docker Intel

Chrome gets a full breakdown: tabs, extensions, MB/tab. Docker shows VM reservation vs actual container usage.

41 tabs · 3 windows · ~380 MB/tab

Tab renderers (41): 15.6 GB

Extensions (12): 2.1 GB

GPU process: 1.8 GB

Chrome Shame Score →

Swap Tracking

Monitors swap pressure over time. Warns before your Mac starts thrashing to disk.
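
For a rough do-it-yourself version of the same signal, macOS reports swap usage through `sysctl vm.swapusage`. The sketch below parses the "used" figure from that output and warns above a threshold. The 4096 MB default and both function names are arbitrary examples, not DevPulse's own values.

```shell
# Extract the "used" figure (in MB) from one line of `sysctl vm.swapusage` output,
# e.g. "vm.swapusage: total = 8192.00M  used = 6912.50M  free = 1279.50M  (encrypted)".
parse_swap_used_mb() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i == "used") { v = $(i + 2); sub(/M$/, "", v); print v; exit }
  }'
}

# Print a warning when swap use exceeds a limit.
# ASSUMPTION: the 4096 MB default is an arbitrary example threshold.
swap_warning() {
  used_mb=$1
  limit_mb=${2:-4096}
  # Compare in awk because the figure has a fractional part.
  if awk -v u="$used_mb" -v l="$limit_mb" 'BEGIN { exit !(u > l) }'; then
    echo "swap high: ${used_mb} MB"
  fi
}

# Usage on macOS (not run here):
#   swap_warning "$(sysctl vm.swapusage | parse_swap_used_mb)"
```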

Menu Bar Native

Always visible. Click to expand. No separate window. Uses <30 MB itself.

Weekly Reports

Memory trends, top offenders, and optimization history. Delivered to your inbox or notification center.

Zero Telemetry

No analytics. No network calls. No account required. Your data stays on your Mac.

NEW · v1.2.0 · Now scriptable

Your local AI deserves a co-pilot.

DevPulse ships a devpulse CLI from the same binary. Same intelligence the menu bar shows, exposed as JSON and exit codes — so Ollama, llama.cpp, Claude Code, Cursor, and your own scripts can ask the obvious question before loading a 70B model: will this fit?

$ devpulse ai --before-load 42000 --auto-clean --json
{
  "modelSizeMB": 42000,
  "before":  { "verdict": "fits-after-unload", "exitCode": 2 },
  "actions": [
    "unloaded idle ollama model: qwen2.5:7b (4.2 GB)",
    "killed 6 zombie procs (812 MB reclaimed)"
  ],
  "after":   { "verdict": "fits", "exitCode": 0 }
}

$ echo $?
0   # safe to load

Pre-flight checks

--before-load returns exit code 0, 1, 2, or 3: fits, won't fit, unload-first, or tight. Branch in shell, no parsing.
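
As a sketch of what branching on those exit codes could look like in a loader script: the helper names below are hypothetical, and treating "tight" (3) as still loadable is a policy assumption you may want to change.

```shell
# Map a --before-load exit code to the verdict names used above.
verdict_for_exit_code() {
  case "$1" in
    0) echo "fits" ;;
    1) echo "won't fit" ;;
    2) echo "unload-first" ;;
    3) echo "tight" ;;
    *) echo "unknown" ;;
  esac
}

# Decide whether a loader script should go ahead.
# ASSUMPTION: loading on "tight" (3) is our policy choice, not DevPulse's advice.
should_load() {
  case "$1" in
    0|3) return 0 ;;
    *)   return 1 ;;
  esac
}

# Usage (not run here; needs devpulse and ollama on PATH):
#   devpulse ai --before-load 42000 >/dev/null
#   code=$?
#   if should_load "$code"; then
#     ollama run llama3.3:70b
#   else
#     echo "skipping load: $(verdict_for_exit_code "$code")" >&2
#   fi
```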

Streaming NDJSON

devpulse watch --json emits one snapshot per tick. Pipe it into a long-running agent loop and react to VRAM pressure in real time.
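
A consumer for that stream can be a plain read loop. The sketch below prints an alert whenever a numeric field in a snapshot crosses a limit. The field name `swapUsedMB` is hypothetical, since the snapshot schema is not shown here; check the CLI docs for the real keys.

```shell
# Read NDJSON snapshots on stdin, one JSON object per line, and print an alert
# line whenever the named integer field exceeds the limit.
# ASSUMPTION: "swapUsedMB" in the usage line is a made-up field name.
alert_on_pressure() {
  field=$1
  limit=$2
  while IFS= read -r line; do
    # Pull the integer value of "field": N out of the line; skip lines without it.
    value=$(printf '%s\n' "$line" |
      sed -n "s/.*\"$field\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p")
    [ -n "$value" ] || continue
    if [ "$value" -gt "$limit" ]; then
      echo "pressure: $field=$value"
    fi
  done
}

# Usage (not run here):
#   devpulse watch --json | alert_on_pressure swapUsedMB 4096
```

The `sed` extraction keeps the sketch dependency-free; a real agent loop would more likely hand each line to a proper JSON parser.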

No daemon. No cloud.

Same process as the menu bar app. No extra permissions, no network calls, no telemetry. Your model load decisions never leave your Mac.

CLI docs →

NEW · v1.7.0 · Hybrid routing, made visible

Know which brain Claude Code is using.

Wrappers can swap Claude Code's backend to DeepSeek, OpenRouter, or Fireworks by setting ANTHROPIC_BASE_URL. Cheaper — but easy to forget which one is live. DevPulse reads your shell config and surfaces the active backend right next to your local-capacity verdict.

Hybrid routing · Local ready

lean local · 70%+

↳ Claude Code → DeepSeek (~/.zshrc)

Backend at a glance

Anthropic, DeepSeek, OpenRouter, Fireworks, or a custom URL — detected from ~/.claude/settings.json and your shell rc files. No daemon, no shell hooks.

Source-of-truth shown

We tell you where the override lives (~/.zshrc, ~/.zshenv, or Claude's own settings), so you can jump to it in your editor in one click instead of hunting.

Keys never read

DevPulse reads only the base URL — never the API key. Routing posture stays a local signal: nothing leaves your Mac, nothing gets logged, nothing gets uploaded.
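
The detection described above can be approximated in a few lines of shell. This is a sketch under assumptions, not DevPulse's implementation: the host patterns are guesses at how each provider's base URL looks, the function names are hypothetical, and only the `ANTHROPIC_BASE_URL` line is ever read.

```shell
# Classify a base URL into a backend label.
# ASSUMPTION: these host patterns are illustrative guesses.
classify_backend() {
  case "$1" in
    "")                  echo "Anthropic (default)" ;;
    *api.anthropic.com*) echo "Anthropic" ;;
    *deepseek.com*)      echo "DeepSeek" ;;
    *openrouter.ai*)     echo "OpenRouter" ;;
    *fireworks.ai*)      echo "Fireworks" ;;
    *)                   echo "custom" ;;
  esac
}

# Print the value of the last `export ANTHROPIC_BASE_URL=...` line in an rc file.
# Only that variable is matched, so API keys in the same file are never printed.
base_url_from_rc() {
  sed -n 's/^[[:space:]]*export[[:space:]]\{1,\}ANTHROPIC_BASE_URL=//p' "$1" |
    tail -n 1 | tr -d "\"'"
}

# Usage (not run here):
#   classify_backend "$(base_url_from_rc "$HOME/.zshrc")"
```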

The Honest Answer

Do I need a new Mac?

DevPulse tracks your peak memory over 7 days, calculates how much is waste, and gives you a straight answer. No upselling. No BS.

😎

Absolutely not.

You peaked at 42 GB on a 64 GB Mac. Even with Chrome hogging 18 GB, you have headroom for days.

Peak: 42G · Waste: 6G · Optimized: 36G

😬

Not yet. Clean up first.

Peak 58 GB with 12 GB reclaimable. Biggest culprit: 4.2 GB in idle dev servers. Fix that before shopping.

Peak: 58G · Waste: 12G · Optimized: 46G

💸

Yeah, probably.

Even optimized you'd use 61 GB on a 64 GB Mac. A 96 GB machine would give you breathing room.

Peak: 63G · Waste: 2G · Optimized: 61G
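
The arithmetic behind the three cards is Optimized = Peak - Waste. The sketch below reproduces them with cut-offs chosen to match these examples: upgrade when optimized use reaches 90% of RAM, clean up when peak reaches 85%. Both thresholds and the function name are assumptions, not DevPulse's published rule.

```shell
# Usage: new_mac_verdict <peak_gb> <waste_gb> <ram_gb>
# ASSUMPTION: the 90% and 85% cut-offs are illustrative, picked to fit the cards above.
new_mac_verdict() {
  peak=$1 waste=$2 ram=$3
  optimized=$((peak - waste))
  if [ $((optimized * 100)) -ge $((ram * 90)) ]; then
    verdict="Yeah, probably."            # still near the ceiling after cleanup
  elif [ $((peak * 100)) -ge $((ram * 85)) ]; then
    verdict="Not yet. Clean up first."   # high peak, but cleanup brings it down
  else
    verdict="Absolutely not."
  fi
  echo "$verdict (optimized: ${optimized}G)"
}

new_mac_verdict 42 6 64    # Absolutely not. (optimized: 36G)
new_mac_verdict 58 12 64   # Not yet. Clean up first. (optimized: 46G)
new_mac_verdict 63 2 64    # Yeah, probably. (optimized: 61G)
```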

Take the quiz →

Local AI Models

Can I run it?

Based on your actual RAM usage and recoverable waste, DevPulse tells you which AI models your Mac can handle.

Model	Quant	RAM	Status
Llama 3.1 8B →	Q8_0	9.5 GB	Runs great
Qwen 3.5 9B →	Q4_K_M	9.5 GB	Runs great
DeepSeek R1 32B →	Q4_K_M	20 GB	Runs OK
Llama 3.3 70B →	Q4_K_M	42 GB	After cleanup
DeepSeek R1 671B →	Q4_K_M	350 GB	Too heavy

See all 20+ models →

Results update as your memory usage changes.

VRAM estimates sourced from CanIRun.ai — model data from llama.cpp, Ollama, and LM Studio.
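
One way to read the Status column is as a comparison of a model's RAM requirement against free memory and recoverable waste. The rule below is an illustrative guess that happens to reproduce the table for 26.5 GB free and a made-up 18 GB of reclaimable memory; DevPulse's real thresholds are not published here, and `model_status` is a hypothetical name.

```shell
# Usage: model_status <need_mb> <free_mb> <reclaimable_mb>
# ASSUMPTION: "half of free memory" as the bar for "Runs great" is a guess.
model_status() {
  need=$1 free=$2 reclaimable=$3
  if [ $((need * 2)) -le "$free" ]; then
    echo "Runs great"
  elif [ "$need" -le "$free" ]; then
    echo "Runs OK"
  elif [ "$need" -le $((free + reclaimable)) ]; then
    echo "After cleanup"
  else
    echo "Too heavy"
  fi
}

# Example with 26500 MB free and an assumed 18000 MB reclaimable:
model_status 9500 26500 18000     # Runs great
model_status 20000 26500 18000    # Runs OK
model_status 42000 26500 18000    # After cleanup
model_status 350000 26500 18000   # Too heavy
```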

Ready to ship local AI without the OOM?

Install DevPulse, wire devpulse ai --before-load into your model loader, and stop guessing whether the next 70B fits.

Download for macOS

Requires macOS 14 Sonoma or later · Apple Silicon & Intel · No App Store required
