Ollama 0.24 Brings the Codex Desktop App to Local Inference

Ollama 0.24 brings a local-first desktop coding agent experience to product engineers. The headline feature is support for the Codex App, OpenAI's desktop environment for working on Codex threads in parallel, with built-in worktree support and git functionality.

Launching it takes one command:

ollama launch codex-app

That single line gives you a full desktop workspace for agentic coding work.
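
Because the launch subcommand arrives with the 0.24 release, a wrapper script can check the installed version before calling it. The following is a minimal sketch rather than an official recipe. The helper names (version_ge, extract_version, launch_if_supported) are hypothetical, and it assumes that ollama --version prints a dotted version number somewhere in its output and that your sort supports the -V flag.

```shell
#!/usr/bin/env bash

# version_ge A B: succeeds when version A is greater than or equal to version B.
# sort -V orders version strings; if B sorts first (or ties), then A >= B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# extract_version TEXT: prints the first x.y or x.y.z token found in TEXT.
extract_version() {
  printf '%s\n' "$1" | grep -Eo '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# launch_if_supported: launches the Codex App only on Ollama 0.24 or newer.
# Assumption: the version output contains a dotted version number.
launch_if_supported() {
  installed="$(extract_version "$(ollama --version 2>/dev/null)")"
  if [ -n "$installed" ] && version_ge "$installed" "0.24"; then
    ollama launch codex-app
  else
    echo "Ollama 0.24 or newer is required (found: ${installed:-none})" >&2
    return 1
  fi
}
```

Nothing runs until you call launch_if_supported, so the file can be sourced from a shell profile or a project script without side effects.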

What you can actually do inside it

The built-in browser lets Codex load your local servers and sites directly. You can annotate the page itself to request changes, which closes a loop that usually requires switching between browser, editor, and chat window. The review mode lets you read code, leave comments, and iterate without leaving the workspace. Less context switching means faster iteration.

Model choices

For difficult coding and agentic tasks, the recommended models are kimi-k2.6 (which includes vision support) and glm-5.1. For local use without an Ollama Cloud subscription, you can run nemotron-3-super, gemma4:31b, or qwen3.6.

That split matters. If you want the full agentic loop with vision-based annotation, kimi-k2.6 is the obvious pick. If you need everything to stay on your machine with no cloud dependency, any of the three local options (nemotron-3-super, gemma4:31b, or qwen3.6) keeps you off the cloud.
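
The decision above can be written down as a small helper. This is only a sketch of the reasoning in this section: pick_model and the LOCAL_MODEL variable are hypothetical names, the choice of glm-5.1 when vision is not needed is an assumption (either recommended model may suit), and qwen3.6 is an arbitrary default among the three local options.

```shell
#!/usr/bin/env bash

# pick_model HAS_CLOUD NEEDS_VISION
# Prints one model name from the lists above. Both arguments are "yes" or "no".
pick_model() {
  has_cloud="$1"
  needs_vision="$2"
  # Hypothetical override for the preferred local model (arbitrary default).
  local_model="${LOCAL_MODEL:-qwen3.6}"

  if [ "$has_cloud" = "yes" ]; then
    if [ "$needs_vision" = "yes" ]; then
      echo "kimi-k2.6"   # listed above as including vision support
    else
      echo "glm-5.1"     # assumption: either recommended model fits here
    fi
  else
    if [ "$needs_vision" = "yes" ]; then
      # The article does not say whether the local models support vision.
      echo "note: vision support for local models is not stated" >&2
    fi
    echo "$local_model"
  fi
}
```

For example, pick_model no no prints qwen3.6 unless LOCAL_MODEL is set. The printed name could then be passed to a command such as ollama pull, assuming the model is published under that exact tag.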

Restoring your previous setup

If you try Codex App and want to roll back, there is a restore flag:

ollama launch codex-app --restore

The flag puts your previous setup back, so trying the Codex App leaves no permanent changes to your existing Ollama configuration.

Also in this release

The MLX sampler has been reworked for improved generation quality on Apple Silicon. If you are running models locally on a Mac, output quality should be better without any changes on your end.

What to do today

If you are actively building a product and iterating on code repeatedly, upgrade to Ollama 0.24, run ollama launch codex-app, and test the browser annotation flow against your local dev server. Parallel threads, in-app review, and direct page annotation together cut down on the switching between browser, editor, and chat window. Pick a model based on whether you have an Ollama Cloud subscription, and run ollama launch codex-app --restore if the app disrupts your current setup.
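
Since the annotation flow needs a running dev server, it can help to wait until the server answers before launching the app. The sketch below is one hedged way to do that with curl. The helper names wait_for and dev_server_up are hypothetical, and the URL in the usage note is a placeholder for whatever your project serves.

```shell
#!/usr/bin/env bash

# wait_for ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, pausing one second between tries.
# Succeeds as soon as COMMAND succeeds; fails if every attempt fails.
wait_for() {
  attempts="$1"
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
  done
  return 1
}

# dev_server_up URL: succeeds if URL responds without an HTTP error status.
# -f fails on HTTP errors, -s silences progress output, --max-time caps the wait.
dev_server_up() {
  curl -fs -o /dev/null --max-time 2 "$1"
}
```

A possible invocation, assuming a dev server on the placeholder address http://localhost:3000, is: wait_for 10 dev_server_up http://localhost:3000 && ollama launch codex-app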