AnythingLLM Lets You Route Queries Between Local and Cloud Models Automatically

AnythingLLM v1.13.0 ships a feature called Model Router, and it changes the core assumption behind how you configure an AI assistant. Until now, you picked one model, local or cloud, and every message went through it. You no longer have to choose.

Model Router lets you define routing rules that run against every incoming message. Each message is analyzed and sent to whichever model fits the task. A lightweight local model handles quick questions. A cloud model with stronger reasoning takes the complex ones. The user sees one chat interface. The routing is invisible.

Two rule types are available. Calculated rules trigger on concrete signals: keywords, token counts, time of day, or image attachments. LLM-classified rules accept plain-English descriptions of intent, so you can express routing logic without writing brittle pattern matchers. You can combine both.
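To make the two rule types concrete, here is a minimal Python sketch of how calculated and LLM-classified rules could combine into one routing decision. It is an illustration only, not AnythingLLM's implementation or configuration format: every name in it (Message, Rule, route, the model labels, the 200-token threshold) is hypothetical, and fake_classifier is a placeholder for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    text: str
    has_image: bool = False


@dataclass
class Rule:
    name: str
    matches: Callable[[Message], bool]
    target: str  # hypothetical model label, not a real AnythingLLM identifier


def approx_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def keyword_rule(name: str, keywords: List[str], target: str) -> Rule:
    # Calculated rule: fires when any keyword appears in the message text.
    lowered = [k.lower() for k in keywords]
    return Rule(name, lambda m: any(k in m.text.lower() for k in lowered), target)


def classified_rule(name: str, description: str,
                    classify: Callable[[str, str], bool], target: str) -> Rule:
    # LLM-classified rule: `classify` stands in for a model call that decides
    # whether the message fits the plain-English description.
    return Rule(name, lambda m: classify(description, m.text), target)


def fake_classifier(description: str, text: str) -> bool:
    # Placeholder so the sketch runs offline; a real classifier would ask an LLM.
    return "step by step" in text.lower()


def route(message: Message, rules: List[Rule], default: str) -> str:
    # First matching rule wins; anything unmatched goes to the default model.
    for rule in rules:
        if rule.matches(message):
            return rule.target
    return default


RULES = [
    Rule("images", lambda m: m.has_image, "cloud-vision"),
    Rule("long-input", lambda m: approx_tokens(m.text) > 200, "cloud-large"),
    keyword_rule("code", ["stack trace", "refactor"], "cloud-large"),
    classified_rule("reasoning", "The user wants multi-step reasoning",
                    fake_classifier, "cloud-large"),
]
```

Calling route(Message("What time is it?"), RULES, "local-small") falls through every rule and returns the local default, while a message containing "refactor" is sent to the cloud label. Putting the cheap calculated checks first means the classifier only runs when nothing simpler has matched, which is one reasonable ordering rather than a documented behavior.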

The cost angle is real. Simple queries go to cheap or local models, and expensive API calls are reserved for messages that actually need them. The pitch is saving money without sacrificing quality, and the mechanism is straightforward: you control the routing table, so you control the spend.

One detail worth noting for production use: the sticky routing system keeps a conversation on the same model across a thread. You are not bouncing between a local model and a cloud model on consecutive turns of the same conversation. That matters for coherence.
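The idea behind sticky routing can be sketched in a few lines of Python. This is an assumption-heavy illustration, not how AnythingLLM implements it: the article only says a conversation stays on the same model across a thread, so pinning on the first message, the StickyRouter class, and the pick_by_length helper are all hypothetical.

```python
from typing import Callable, Dict


class StickyRouter:
    """Pins each thread to the model chosen for its first message (assumed policy)."""

    def __init__(self, pick_model: Callable[[str], str]) -> None:
        self._pick_model = pick_model
        self._pinned: Dict[str, str] = {}

    def route(self, thread_id: str, text: str) -> str:
        # Only the first message of a thread is evaluated; later turns reuse the pin.
        if thread_id not in self._pinned:
            self._pinned[thread_id] = self._pick_model(text)
        return self._pinned[thread_id]


def pick_by_length(text: str) -> str:
    # Hypothetical rule: long messages go to a cloud label, short ones stay local.
    return "cloud-large" if len(text.split()) > 50 else "local-small"


router = StickyRouter(pick_by_length)
first = router.route("thread-1", "Quick question about our holiday policy")
later = router.route("thread-1", "word " * 100)  # long follow-up, same thread
```

Here both first and later resolve to the local label, because the thread was pinned before the long follow-up arrived. A real system may well allow a thread to be re-routed under some conditions; the sketch deliberately leaves that out.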

Supported local runtimes include Ollama and LM Studio. Supported cloud providers include OpenAI, Anthropic, and Google. The full setup guide is in the AnythingLLM docs.

The release is 100% open source.

If you are building an internal tool or assistant where most queries are simple but some are genuinely hard, this is worth setting up today. Define a calculated rule that catches your heavy workloads and routes them to your most capable cloud model. Route everything else locally. Measure the API cost difference over a week. The routing logic lives in your config, not in a vendor's pricing tier.
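For the one-week measurement, the arithmetic is simple enough to sketch in Python. The log format, function names, and price parameter below are assumptions for illustration; plug in your own token counts and your provider's actual rate. The sketch counts local calls as zero API cost, which ignores hardware and electricity.

```python
from typing import Iterable, Tuple


def api_cost(tokens: int, usd_per_million_tokens: float) -> float:
    # Cost of sending `tokens` to a paid API at a flat per-million-token rate.
    return tokens / 1_000_000 * usd_per_million_tokens


def weekly_savings(log: Iterable[Tuple[str, int]],
                   usd_per_million_tokens: float) -> float:
    """Compare an all-cloud baseline against the routed setup.

    `log` holds (destination, tokens) pairs, where destination is "local" or
    "cloud". Returns baseline cost minus routed cost, in USD.
    """
    entries = list(log)
    baseline = sum(api_cost(t, usd_per_million_tokens) for _, t in entries)
    routed = sum(api_cost(t, usd_per_million_tokens)
                 for dest, t in entries if dest == "cloud")
    return baseline - routed
```

The result is exactly the API cost of the tokens that stayed local, so the saving scales with how much of your traffic the routing rules keep off the cloud model.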