SleepyQuant — a 12-agent crypto quant running on one Mac

Published Apr 11, 2026 · Notes from one solo enthusiast, one Mac

Hey everyone,

SleepyQuant is a solo experiment I've been running for the last couple of weeks: 12 local AI agents coordinating a paper crypto trading book on a single Apple M1 Max. No cloud inference, no API bills, no vendor black box. Every agent prompt, every losing trade, every round-trip gets written up weekly.

Stack (all local):

What's deliberately boring:

Agents (one role each):

A COO / dispatcher, a trading lead, separate futures + spot executors, a CFO, a CTO with filesystem + shell tools, an R&D / failure analyst, a legal / compliance officer, a resource monitor, a QA engineer, a news intelligence watcher, and a content / SEO writer.

Each agent has a focused system prompt + a small set of skill handlers. The COO routes CEO requests to the right specialist instead of one monolithic agent trying to do everything.
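For anyone who wants something concrete to react to, here is a minimal sketch of the dispatcher idea, not the actual SleepyQuant code. It assumes plain keyword matching, which is only one possible routing rule; the `ROUTES` table, the agent names, and the `dispatch` function are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Agent:
    """One specialist: a name plus a single stand-in skill handler."""
    name: str
    handle: Callable[[str], str]


def make_agent(name: str) -> Agent:
    # Stand-in handler; a real agent would call a local model with its own system prompt.
    return Agent(name=name, handle=lambda request: f"[{name}] handling: {request}")


# Hypothetical keyword -> specialist table (assumption: routing by substring match).
ROUTES: Dict[str, str] = {
    "pnl": "cfo",
    "compliance": "legal",
    "futures": "futures_executor",
    "spot": "spot_executor",
    "memory": "resource_monitor",
}

DEFAULT_AGENT = "trading_lead"

AGENTS: Dict[str, Agent] = {
    name: make_agent(name) for name in set(ROUTES.values()) | {DEFAULT_AGENT}
}


def dispatch(request: str) -> str:
    """Send the request to the first specialist whose keyword appears in it."""
    text = request.lower()
    for keyword, agent_name in ROUTES.items():
        if keyword in text:
            return AGENTS[agent_name].handle(request)
    # No keyword matched: fall back to a default specialist.
    return AGENTS[DEFAULT_AGENT].handle(request)
```

Calling `dispatch("What is today's PnL?")` would land on the hypothetical `cfo` agent, while an unmatched request falls through to `trading_lead`. The routing overhead in this shape is one table lookup per request; the real cost in a multi-agent setup tends to sit in the extra model calls, not in the dispatch itself.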

Live paper P&L widget + weekly newsletter: https://sleepyquant.rest

Two things I'd genuinely want feedback on — please weigh in below:

  1. Is 12 agents worth the routing overhead? Or would a single bigger agent with tool use be cleaner at this scale? I keep flip-flopping and would love to hear from anyone who's been through the same decomposition choice.

  2. MLX unload strategies on Apple Silicon? Right now my reasoning model auto-unloads after 2 minutes idle, which works but feels crude. If you're running MLX in production on a Mac, how do you free RAM when you need it back?
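To make the second question concrete, here is a rough sketch of an idle-timeout unloader along the lines described above. It is not the code SleepyQuant runs: the `IdleUnloader` class, its parameters, and the `on_unload` hook are hypothetical, and the sketch deliberately avoids calling MLX directly. The assumption is that dropping the last reference to the model and running the garbage collector is enough to release most of the memory, with any framework-specific cache clearing passed in through `on_unload`.

```python
import gc
import time
from typing import Any, Callable, Optional


class IdleUnloader:
    """Lazily loads a model and drops it after a period of inactivity."""

    def __init__(
        self,
        loader: Callable[[], Any],
        idle_seconds: float = 120.0,  # 2 minutes, matching the setup described above
        on_unload: Optional[Callable[[], None]] = None,  # hypothetical cache-clearing hook
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._idle_seconds = idle_seconds
        self._on_unload = on_unload
        self._clock = clock
        self._model: Any = None
        self._last_used = clock()

    @property
    def loaded(self) -> bool:
        return self._model is not None

    def get(self) -> Any:
        """Return the model, loading it on first use, and refresh the idle timer."""
        if self._model is None:
            self._model = self._loader()
        self._last_used = self._clock()
        return self._model

    def maybe_unload(self) -> bool:
        """Unload if idle for long enough; return True only when an unload happened."""
        if self._model is None:
            return False
        if self._clock() - self._last_used < self._idle_seconds:
            return False
        self._model = None  # drop our reference so the weights can be freed
        gc.collect()
        if self._on_unload is not None:
            self._on_unload()  # e.g. a framework call that clears its buffer cache
        return True
```

A background timer or the resource monitor agent could call `maybe_unload()` every few seconds. The same method could also be triggered on memory pressure instead of a fixed timeout, which is one alternative to the "crude" 2-minute rule. Whether MLX actually hands the memory back after the reference is dropped depends on its buffer cache behaviour, so check that against the MLX version you have installed.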

Try it or follow along:

Happy to answer questions in the comments about the architecture, the failure vault, the priority queue design, or why local-first LLM agents are worth the effort on a 64 GB machine. Fire away.