May 21, 2026

ollama/ollama: Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

Ollama 0.24 ships native support for the Codex App, bringing parallel thread workflows, built-in browser annotation, and in-app code review to local and cloud-connected setups. The release also reworks the MLX sampler for better generation quality on Apple Silicon.

Ollama 0.24 Brings the Codex App to Your Local Machine

Ollama 0.24 lands with a headline feature that matters for anyone running agentic coding workflows: native support for the Codex App, OpenAI's desktop experience for managing Codex threads in parallel. It ships with built-in worktree support and git functionality, and you can spin it up with a single command.

ollama launch codex-app

That's the whole setup. No separate install dance.

What the Codex App Actually Does

Three capabilities stand out for product engineers.

Built-in browser. Codex can load local servers and sites directly inside the app. From there, you can annotate the page itself to request changes. This closes a loop: you see the rendered output, mark what's wrong, and the agent acts on it without you copying context across windows.

Review mode. You can read diffs, leave comments, and iterate without leaving the workspace. Keeping review inside the same environment where the agent is working reduces the context-switching that fragments agentic sessions.

Parallel threads with worktree support. Multiple Codex threads run concurrently, each backed by git worktrees. A worktree gives a thread its own working directory and checked-out branch on top of the same repository, so exploratory or speculative tasks can run side by side without overwriting each other's files.
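
To make the worktree primitive concrete, here is a minimal sketch using plain git, not anything specific to the Codex App. The scratch repository and the branch names task-a and task-b are hypothetical, chosen only for illustration; how the app itself names or places its worktrees is not described here.

```shell
#!/bin/sh
# Sketch only: plain git, with hypothetical paths and branch names.
set -eu

base=$(mktemp -d)            # scratch area so nothing real is touched
repo="$base/demo"

git init -q "$repo"
# An initial commit is needed before new worktrees can branch from HEAD.
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# One worktree per task: each gets its own directory and its own branch,
# while sharing the repository's object store.
git -C "$repo" worktree add -b task-a "$base/demo-task-a"
git -C "$repo" worktree add -b task-b "$base/demo-task-b"

# A file written in one worktree does not appear in the others.
echo "experiment" > "$base/demo-task-a/notes.txt"

# Discard a speculative task: remove its worktree, then delete its branch.
git -C "$repo" worktree remove "$base/demo-task-b"
git -C "$repo" branch -q -D task-b

# Lists the main checkout and the remaining task-a worktree.
git -C "$repo" worktree list
```

Throwing away a failed experiment is just a worktree removal plus a branch delete, which is what makes this a cheap fit for speculative tasks.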

Model Selection

For difficult coding and agentic tasks, the release points to kimi-k2.6 (which includes vision support) and glm-5.1. If you're running locally without an Ollama Cloud subscription, the recommended options are nemotron-3-super, gemma4:31b, and qwen3.6.

Vision support in kimi-k2.6 is worth noting specifically: the browser annotation flow depends on it, since the model needs to interpret what you're marking on screen.

Restore Anytime

If you want to roll back to a previous Codex App configuration, there's a restore flag:

ollama launch codex-app --restore

Straightforward escape hatch.

Apple Silicon Improvement

Beyond Codex, the release reworks the MLX sampler for improved generation quality on Apple Silicon. No benchmark numbers are attached, but if you run Ollama on an M-series Mac, expect the sampler change to apply to any generation that goes through the MLX backend.

What to Do Today

If you're already using Ollama and building anything with an agentic coding loop, especially if you're iterating on a local web app, run ollama launch codex-app and test the browser annotation flow. Visual feedback, in-app review, and parallel worktrees together form a loop tight enough to change how you'd structure an agentic session. Start there, pick the right model tier for your setup, and see where the friction is.