Supported Models

Ava offers three orchestrated routing modes, plus direct access to every individual model behind them. Pick a mode, or pick a single model and skip the routing.

Routing Modes

Three ways Ava can pick models for you — single conductor, polyglot ensemble, or a sovereign EU stack. Switch any time from the model picker.

Maestro

Live · all plans

One conductor handles every persona, every step. Production-tuned, default for everyone.

Coordinator: Qwen 3.6 Plus
Specialists: The same model serves the full pipeline (Scout → Architect → Builder → Verifier).
Data residency: Mixed (wherever Qwen is hosted)

Best for

Daily work, predictable cost, anyone who wants Ava to "just work" without thinking about routing.
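
As a rough illustration of the single-conductor idea, the sketch below passes a task through the four personas in order, using one model for every step. The `call_model` callable, the prompt format, and the `qwen-3.6-plus` identifier are hypothetical placeholders chosen for this example; they are not Ava's actual API.

```python
from typing import Callable

# Persona order taken from the Maestro description above.
PERSONAS = ["Scout", "Architect", "Builder", "Verifier"]


def run_maestro(task: str,
                call_model: Callable[[str, str], str],
                model: str = "qwen-3.6-plus") -> list[tuple[str, str]]:
    """Run the task through every persona using the same model.

    `call_model(model, prompt)` is a hypothetical stand-in for a real
    model client; the model id string is also an assumption.
    """
    transcript = []
    context = task
    for persona in PERSONAS:
        prompt = f"[{persona}] {context}"
        output = call_model(model, prompt)
        transcript.append((persona, output))
        context = output  # each step builds on the previous step's output
    return transcript


def echo_model(model: str, prompt: str) -> str:
    """Fake client used only to make the sketch runnable offline."""
    persona = prompt.split("]")[0][1:]
    return f"{model} handled: {persona}"


steps = run_maestro("Fix the login bug", echo_model)
```

Because the model is fixed, the only thing that changes between steps is the persona prompt, which is what keeps cost predictable in this mode.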

Supernova

Coming soon

Polyglot ensemble — the coordinator picks the best specialist for each task, model by model.

Coordinator: DeepSeek V4 Pro (1.6T / 49B active, 1M ctx)
Specialists: V4 Flash for builds and review · Qwen 3.6 Plus fallback · Qwen Omni when vision is in play
Data residency: Mixed (DeepSeek + Qwen infrastructure)

Best for

Heavy multi-step work where each subtask wants its own specialist. Frontier coordinator on every plan.
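
The per-subtask selection described above could look something like the following sketch. The model names come from the Supernova description; the `Subtask` type, the task labels, and the routing rules themselves are illustrative assumptions, since the real coordinator's logic is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    kind: str                 # hypothetical labels such as "build" or "review"
    has_image: bool = False   # True when the subtask includes visual input


def pick_specialist(subtask: Subtask) -> str:
    """Assumed routing rules that mirror the specialist list above."""
    if subtask.has_image:
        return "Qwen Omni"        # vision is in play
    if subtask.kind in {"build", "review"}:
        return "V4 Flash"         # builds and review
    return "Qwen 3.6 Plus"        # fallback for everything else


plan = [
    Subtask("build"),
    Subtask("review", has_image=True),
    Subtask("summarize"),
]
assignments = [pick_specialist(s) for s in plan]
```

In this sketch the vision check runs first, so a review that includes a screenshot goes to the vision-capable model instead of the default reviewer.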

Aurora

Coming soon

European AI stack, sovereign by design. Mistral-only routing in three tiers that never leaves EU infrastructure.

Coordinator: Mistral Large 3 (675B / 41B active, 262K ctx), used as coordinator and for heavy specialists
Specialists: Mistral Medium 3.5 (128B dense, 256K ctx, vision encoder from scratch, 77.6% SWE-Bench Verified) for Builder, mid-tier specialists, vision, and long-form · Mistral Small 4 as the intent gate
Data residency: EU only, open weights end-to-end

Best for

GDPR-strict deployments, public-sector and healthcare buyers, anyone with a sovereignty mandate.

Or skip routing entirely — pick a single model below and Ava drives just that one.

Understanding These Numbers

SWE-Bench

Tests whether the model can resolve real issues from GitHub repositories: reading the code, understanding the issue, and writing a fix that passes the project's tests.

Higher score = Better at fixing real-world code problems autonomously

HumanEval

Given a function signature and docstring, can the model write code that passes all the unit tests? Measures function-level coding ability.

Higher score = More reliable at writing correct code from descriptions
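
To make "passes all test cases" concrete, here is a simplified illustration of how such a check can work. It is not the official HumanEval harness; the problem, the `sum_even` function name, and the test cases are made up for this example.

```python
def passes_all_tests(candidate_source: str, func_name: str,
                     test_cases: list[tuple[tuple, object]]) -> bool:
    """Return True only if the candidate passes every test case."""
    namespace: dict = {}
    try:
        # Real harnesses run generated code in a sandbox; plain exec is
        # acceptable here only because the sources are written by hand.
        exec(candidate_source, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in test_cases)
    except Exception:
        return False  # any crash counts as a failure


# Hypothetical problem: "return the sum of the even numbers in a list".
tests = [(([1, 2, 3, 4],), 6), (([],), 0), (([7, 9],), 0)]

good = "def sum_even(xs):\n    return sum(x for x in xs if x % 2 == 0)\n"
bad = "def sum_even(xs):\n    return sum(xs)\n"

results = [passes_all_tests(src, "sum_even", tests) for src in (good, bad)]
score = sum(results) / len(results)  # fraction of candidates that pass
```

A candidate earns credit only when every test passes, so a solution that is right on most inputs still counts as a failure.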

MMLU

A multiple-choice exam covering 57 subjects, including science, history, law, medicine, and maths. Tests breadth of general knowledge and reasoning.

Higher score = Broader general knowledge across many domains

MATH

Competition-level maths problems in algebra, geometry, number theory, probability, and precalculus. Tests deep mathematical reasoning.

Higher score = Stronger at solving complex mathematical problems

GPQA

Graduate-level science questions in biology, physics, and chemistry, written by PhD-level domain experts. Even experts struggle with these, so it tests frontier reasoning.

Higher score = Better at expert-level scientific reasoning

Tool Use

Can the model correctly call functions, pass the right arguments, and chain multiple tools together? Critical for an AI agent.

Higher score = More reliable at using tools like file editing, search, and git
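
The sketch below shows, under stated assumptions, what "correct" tool use involves: the tool must exist, the arguments must match its signature, and the output of one call must feed the next. The `search` and `read_file` tools and the call format are hypothetical stand-ins, not Ava's real toolset.

```python
import inspect


# Hypothetical tools standing in for an agent's real toolset.
def search(query: str) -> list[str]:
    return [f"src/{query}.py"]


def read_file(path: str) -> str:
    return f"contents of {path}"


TOOLS = {"search": search, "read_file": read_file}


def call_is_valid(name: str, arguments: dict) -> bool:
    """Check that the tool exists and the arguments bind to its signature."""
    tool = TOOLS.get(name)
    if tool is None:
        return False
    try:
        inspect.signature(tool).bind(**arguments)
    except TypeError:
        return False
    return True


# A model-proposed chain: search first, then read the first hit.
first = {"name": "search", "arguments": {"query": "login"}}
hits = TOOLS[first["name"]](**first["arguments"])
second = {"name": "read_file", "arguments": {"path": hits[0]}}
text = TOOLS[second["name"]](**second["arguments"])

wrong_args = call_is_valid("read_file", {"file": "a.py"})  # wrong parameter name
unknown_tool = call_is_valid("delete_repo", {})            # tool does not exist
```

Both failure cases at the end are the kind of mistake a tool-use score penalizes: a misnamed argument and a call to a tool that was never offered.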

Vision

Can the model understand images — screenshots, diagrams, charts, photos? Tests visual comprehension and reasoning.

Higher score = Better at understanding what it sees on screen

How to Access These Models

Platform

Available through your Ava account. The Free tier includes 300 credits/month to evaluate the platform, with access to every model and every tool. Upgrade to Pro for 5,000 credits/month when ready, or stay on Free and use your own API key (BYOK usage is unlimited).

BYOK

Bring Your Own Key. Get an API key directly from the provider, paste it into Ava's settings, and pay the provider directly. No Ava account needed. Ava itself runs 100% locally.

Platform+BYOK

Available both ways. Use your Ava account for convenience, or bring your own key for direct access and maximum savings.