Ollama
Ollama lets you run large language models locally on your Mac. This document covers the basic commands to run a model and manage which ones are currently loaded in memory.
Run a Model
ollama run qwen3:14b
This downloads the model (if it's not already available locally) and starts an interactive session with it.
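For scripts, it can be handy to skip the interactive session and ask a single question. The sketch below relies on `ollama run` accepting the prompt as an extra argument, in which case it prints the reply and exits. `ask_model` is a hypothetical helper name chosen for illustration, not part of Ollama.

```shell
# Hypothetical helper: send one prompt to a model and print the reply,
# without entering the interactive session.
ask_model() {
  model="$1"
  shift
  # All remaining arguments are joined into a single prompt string.
  ollama run "$model" "$*"
}

# Usage (assumes Ollama is installed and its server is running):
# ask_model qwen3:14b "Explain what a mutex is in two sentences."
```

As with the interactive form, the first call may take a while if the model still has to be downloaded.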
Check Which Models Are Currently Running
To see which models are currently loaded into RAM and "running" on your Mac, open a new Terminal window and run:
ollama ps
This prints a table listing each loaded model's name, ID, size, the processor it runs on (CPU or GPU), and how long until Ollama unloads it.
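To check for one specific model from a script, the table can be filtered. This is a minimal sketch that assumes the output has exactly one header row and the model name in the first column. `is_loaded` is a hypothetical helper name, not an Ollama command.

```shell
# Hypothetical helper: succeed (exit status 0) if the named model
# appears in the `ollama ps` table, fail otherwise.
is_loaded() {
  # Skip the header row, then look for an exact match in the first column.
  ollama ps | awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# Usage (assumes Ollama is installed and its server is running):
# if is_loaded qwen3:14b; then echo "qwen3:14b is in memory"; fi
```

An exact match on the first column is used so that a name such as qwen3:14b does not also match a longer tag that merely starts with the same text.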
Other Useful Commands
- List all downloaded models (running or not):
ollama list
- Unload a model from memory immediately (force Ollama to free the RAM instead of waiting for the idle timeout, 5 minutes by default):
ollama stop qwen3:14b
Replace qwen3:14b with the name of the model you want to stop.
- Launch OpenCode (a terminal-based coding agent) using a locally running model:
ollama launch opencode
- Launch Claude Code (Anthropic's terminal-based coding agent) using a locally running model:
ollama launch claude
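Going back to the unload command in this list: to free all the RAM at once, `ollama ps` and `ollama stop` can be combined. This is a sketch under the same assumption as before, that `ollama ps` prints one header row with model names in the first column. `stop_all_models` is a hypothetical helper name.

```shell
# Hypothetical helper: stop every model currently listed by `ollama ps`.
stop_all_models() {
  # Take the first column of every row after the header, one name per line.
  ollama ps | awk 'NR > 1 { print $1 }' | while read -r name; do
    # Redirect stdin so the stop command cannot consume the remaining names.
    ollama stop "$name" </dev/null
  done
}

# Usage (assumes Ollama is installed and its server is running):
# stop_all_models
```

Running `ollama ps` afterwards should show an empty table if everything was unloaded.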
Models
qwen3:14b - general-purpose reasoning model, solid at coding. #coding #general