Supported Models

Ava offers three orchestrated routing modes, plus every individual model behind them. Pick a mode, or pick a single model and skip the routing.

Routing Modes

Three ways Ava can pick models for you — single conductor, polyglot ensemble, or a sovereign EU stack. Switch any time from the model picker.

Maestro

Live · all plans

One conductor handles every persona, every step. Production-tuned, default for everyone.

Coordinator: Qwen 3.7 Plus
Specialists: Same model serves the full pipeline — Scout → Architect → Builder → Verifier.
Data residency: Mixed — wherever Qwen is hosted

Best for

Daily work, predictable cost, anyone who wants Ava to "just work" without thinking about routing.
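As a rough illustration of the Maestro pipeline described above, the sketch below passes a task through the four personas in order, using the same model at every step. The `run_maestro` function and the `call_model` callback are hypothetical names used only for this example. Feeding each persona's output into the next is also an assumption based on the arrows in the pipeline description. None of this is Ava's actual API.

```python
from typing import Callable

# Persona order and model name are taken from the Maestro description.
PIPELINE = ("Scout", "Architect", "Builder", "Verifier")
MAESTRO_MODEL = "Qwen 3.7 Plus"


def run_maestro(task: str, call_model: Callable[[str, str, str], str]) -> str:
    """Run a task through every persona with a single model.

    call_model(model, persona, text) is a hypothetical callback that
    sends text to the model acting as the given persona.
    """
    result = task
    for persona in PIPELINE:
        # Assumption: each persona receives the previous persona's output.
        result = call_model(MAESTRO_MODEL, persona, result)
    return result


def echo_model(model: str, persona: str, text: str) -> str:
    """Stand-in for a real model call: records which persona ran."""
    return f"{text} -> {persona}"


print(run_maestro("fix bug", echo_model))
# prints: fix bug -> Scout -> Architect -> Builder -> Verifier
```

Because the model is fixed, the only thing that changes between steps is the persona, which is what keeps cost predictable in this mode.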

Supernova

Live · all plans

Polyglot ensemble — the coordinator picks the best specialist for each task, model by model.

Coordinator: DeepSeek V4 Pro (1.6T / 49B active, 1M ctx)
Specialists: DeepSeek V4 Flash for builds and review · Qwen 3.7 Plus for Builder and vision
Data residency: Mixed — DeepSeek + Qwen infrastructure

Best for

Heavy multi-step work where each subtask wants its own specialist. Frontier coordinator on every plan.

Aurora

Live · all plans

European AI stack, sovereign by design. Routing is Mistral-only, in three tiers, and never leaves EU infrastructure.

Coordinator: Mistral Medium 3.5 (128B dense, 256K, vision, 77.6% SWE-Bench, AA Index 39) — coordinator + heavy specialists + Builder + vision
Specialists: Mistral Small 4 (119B, 256K, AA Index 28) — high-volume workhorse: chat, intent gate, image-gen orchestration, light specialists · Mistral Large 3 (675B/41B MoE, Apache-2.0) — heavy reserve / fallback
Data residency: EU only — open weights end-to-end

Best for

GDPR-strict deployments, public-sector and healthcare buyers, anyone with a sovereignty mandate.

Or skip routing entirely — pick a single model below and Ava drives just that one.
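The three modes above can be summarized as a small lookup table. This is a minimal sketch built only from the coordinator, specialist, and data-residency details listed for each mode. The `RoutingMode` class and both helper functions are hypothetical names, not part of Ava.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingMode:
    name: str
    coordinator: str
    specialists: tuple[str, ...]
    eu_only: bool  # True only when data residency is listed as "EU only"


MODES = (
    RoutingMode("Maestro", "Qwen 3.7 Plus", ("Qwen 3.7 Plus",), eu_only=False),
    # Assumption: "V4 Flash" in the Supernova description means DeepSeek V4 Flash.
    RoutingMode("Supernova", "DeepSeek V4 Pro",
                ("DeepSeek V4 Flash", "Qwen 3.7 Plus"), eu_only=False),
    RoutingMode("Aurora", "Mistral Medium 3.5",
                ("Mistral Small 4", "Mistral Large 3"), eu_only=True),
)


def candidate_modes(require_eu: bool) -> list[str]:
    """List modes that satisfy an optional EU-only residency requirement."""
    return [m.name for m in MODES if m.eu_only or not require_eu]


def models_in(mode_name: str) -> list[str]:
    """Return the distinct models a mode can route to, coordinator first."""
    for mode in MODES:
        if mode.name == mode_name:
            # dict.fromkeys removes duplicates while preserving order.
            return list(dict.fromkeys((mode.coordinator, *mode.specialists)))
    raise KeyError(f"unknown routing mode: {mode_name}")


print(candidate_modes(require_eu=True))   # prints: ['Aurora']
print(models_in("Maestro"))               # prints: ['Qwen 3.7 Plus']
```

With a sovereignty mandate, the residency filter leaves Aurora as the only candidate. Without one, all three modes remain available.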

Understanding These Numbers

SWE-Bench

Tests whether the model can solve real bugs from GitHub repositories — reading code, understanding the issue, and writing a working fix.

Higher score = Better at fixing real-world code problems autonomously

HumanEval

Given a function description, can the model write correct code that passes all test cases? Measures raw coding ability.

Higher score = More reliable at writing correct code from descriptions
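To make the idea of "passes all test cases" concrete, here is a simplified sketch of how a HumanEval-style score could be computed. A problem counts as solved only if the candidate function passes every test case, and the score is the percentage of problems solved. The function names are hypothetical and this is not the actual benchmark harness, which runs model-generated code in a sandbox.

```python
def passes_all(candidate, cases) -> bool:
    """Return True only if candidate(*args) == expected for every case."""
    for args, expected in cases:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            # A crash counts as a failed test case.
            return False
    return True


def percent_solved(results) -> float:
    """Percentage of problems whose candidate passed all test cases."""
    return 100.0 * sum(results) / len(results)


# Two illustrative attempts at "add two numbers"; one is buggy.
cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
good = lambda a, b: a + b
buggy = lambda a, b: a - b

results = [passes_all(good, cases), passes_all(buggy, cases)]
print(percent_solved(results))  # prints: 50.0
```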

MMLU

A massive exam covering 57 subjects — science, history, law, medicine, maths. Tests general knowledge and reasoning breadth.

Higher score = Broader general knowledge across many domains

MATH

Competition-level maths problems — algebra, calculus, geometry, number theory. Tests deep mathematical reasoning.

Higher score = Stronger at solving complex mathematical problems

GPQA

Graduate-level science questions written by PhD researchers. Even experts struggle with these — tests frontier reasoning.

Higher score = Better at expert-level scientific reasoning

Tool Use

Can the model correctly call functions, pass the right arguments, and chain multiple tools together? Critical for an AI agent.

Higher score = More reliable at using tools like file editing, search, and git

Vision

Can the model understand images — screenshots, diagrams, charts, photos? Tests visual comprehension and reasoning.

Higher score = Better at understanding what it sees on screen

How to Access These Models

Platform

Available through your Ava account. The Free tier includes 300 credits/month to evaluate the platform, with access to every model and every tool. Upgrade to Pro for 5,000 credits/month when ready, or stay on Free with your own API key (BYOK usage is unlimited).

BYOK

Bring Your Own Key. Get an API key directly from the provider, paste it into Ava's settings, and pay the provider directly. No Ava account is needed, and Ava itself runs 100% locally.

Platform+BYOK

Available both ways. Use your Ava account for convenience, or bring your own key for direct access and maximum savings.
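The three access labels above can be read as a simple rule: Platform needs an Ava account, BYOK needs a provider key, and Platform+BYOK accepts either. The sketch below encodes that rule. The `usable_routes` function and its parameters are hypothetical and exist only to illustrate the labels.

```python
def usable_routes(label: str, has_account: bool, has_provider_key: bool) -> list[str]:
    """Return the access routes a user can actually use for a model.

    label is one of "Platform", "BYOK", or "Platform+BYOK".
    """
    offered = label.split("+")
    routes = []
    if "Platform" in offered and has_account:
        routes.append("Platform")  # billed in Ava credits
    if "BYOK" in offered and has_provider_key:
        routes.append("BYOK")      # billed by the provider directly
    return routes


print(usable_routes("Platform+BYOK", has_account=True, has_provider_key=False))
# prints: ['Platform']
print(usable_routes("BYOK", has_account=False, has_provider_key=True))
# prints: ['BYOK']
```

An empty list means the model is not reachable with what the user currently has, for example a Platform-only model without an Ava account.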

Supported Models

Qwen 3.7 Plus

Claude Opus 4.8

Claude Sonnet 5

Claude Haiku 4.5

GLM-5.2

GLM-4.5 Air

GLM-4.5 Flash

Kimi K2.7 Code

Kimi K2.5

DeepSeek V4 Pro

DeepSeek V4 Flash

Qwen 3.5 Plus

Qwen 3.5 Flash

MiniMax M3

MiniMax M2.7

MiniMax M2.7 HighSpeed

Mistral Small 4

Mistral Large 3

Mistral Medium 3.5

Mistral Large (legacy)

Codestral

Devstral 2

MiMo V2.5-Pro

MiMo V2.5

Hunyuan Hy3

Nemotron 3 Ultra
