A public notebook from a solo finance + tech enthusiast — multi-agent quant on Apple Silicon, all local, all transparent.
Join the ride — see me fall or thrive, whichever comes first.
Solo builder running a 12-agent AI quant on one M1 Max. Every Friday: the architecture that broke, the trades that worked, the freezes. Real numbers, both directions. No hype, no signals, no paid tiers.
One email a week. Unsubscribe anytime with one click.
A handful of specialized agents — trading, research, risk — sharing a single MoE model through a priority queue. Hand-rolled, occasionally fragile.
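For the curious, here is a minimal sketch of what "sharing a single model through a priority queue" can look like. It is not the code running on my machine: the names ModelQueue, Request, PRIORITY and fake_model are hypothetical, and the ordering (risk before trading before research) is an assumption made only for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Assumed priority levels for illustration; a lower number is served first.
PRIORITY = {"risk": 0, "trading": 1, "research": 2}


@dataclass(order=True)
class Request:
    priority: int
    seq: int  # tie-breaker so equal priorities stay first-in, first-out
    agent: str = field(compare=False)
    prompt: str = field(compare=False)


class ModelQueue:
    """Serializes requests from several agents onto one model callable."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._heap: List[Request] = []
        self._counter = itertools.count()

    def submit(self, agent: str, prompt: str) -> None:
        # Push onto a binary heap keyed by (priority, arrival order).
        req = Request(PRIORITY[agent], next(self._counter), agent, prompt)
        heapq.heappush(self._heap, req)

    def drain(self) -> List[Tuple[str, str]]:
        # Serve one request at a time, highest priority first.
        results = []
        while self._heap:
            req = heapq.heappop(self._heap)
            results.append((req.agent, self._generate(req.prompt)))
        return results


def fake_model(prompt: str) -> str:
    # Stand-in for the real local model call.
    return f"echo: {prompt}"


queue = ModelQueue(fake_model)
queue.submit("research", "summarize filings")
queue.submit("trading", "size the order")
queue.submit("risk", "check exposure")
queue.submit("trading", "confirm fill")
served = queue.drain()
```

The sequence counter is the one design choice worth noting: without it, two requests at the same priority would be compared on their remaining fields, and arrival order would not be preserved.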
Every round-trip logged. Every loss written up. Paper first, real capital only when the math earns it.
One Apple Silicon machine. 64 GB of unified memory. Zero cloud inference, zero API bills, zero dependencies I don't own.
Every decision, every failure, every architecture note shared openly. Nothing hidden, nothing polished.
While the public AI narrative is dominated by capex wars and cloud GPU shortages, a quieter shift has happened on the desktop. A single Apple Silicon laptop with 64 GB of unified…
After roughly 500 paper round-trips showed a persistent sub-35% win rate with average losses larger than average wins, we stopped scaling the live side and ran a cheap experiment:…
Running local LLMs on M1 Max hardware is one of those setups that looks great on paper — unified memory, no PCIe bottleneck, offline and private. For about a year I ran…
The first lie I had to unlearn after buying a 64 GB Mac for local LLM work was that I had 64 GB to use for the model.
I tested both. Same machine (M1 Max 64 GB), same model (Qwen 3.6 35B-A3B Q8), same prompts, same generation lengths. llama.cpp came out about 30% faster on raw decode throughput.…