OpenMesh
How It Works
SIMPLE TO GET STARTED
01
Define your workload
Submit model type, memory requirements, and performance constraints.
02
Intelligent workload analysis
OpenDeploy evaluates cost-performance tradeoffs across available compute pools.
03
Dynamic routing
Workloads are deployed to the most cost-efficient configuration.
04
Continuous optimization
Telemetry-driven refinement improves cost-per-inference over time.
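The routing logic in steps 02 and 03 can be sketched as a constraint filter followed by a cost-per-token minimization. This is an illustrative sketch only: the `ComputePool` fields, the `route` function, and the pool data are all assumptions, not the OpenDeploy API.

```python
from dataclasses import dataclass

@dataclass
class ComputePool:
    name: str              # hypothetical pool identifier
    cost_per_hour: float   # USD per GPU-hour
    tokens_per_sec: int    # sustained throughput for this workload
    memory_gb: int         # available GPU memory

def route(workload_memory_gb: int,
          min_tokens_per_sec: int,
          pools: list[ComputePool]) -> ComputePool:
    """Step 02: keep only pools that meet the memory and performance
    constraints; step 03: route to the cheapest eligible pool,
    measured in cost per token."""
    eligible = [p for p in pools
                if p.memory_gb >= workload_memory_gb
                and p.tokens_per_sec >= min_tokens_per_sec]
    if not eligible:
        raise ValueError("no pool satisfies the workload constraints")
    return min(eligible,
               key=lambda p: p.cost_per_hour / (p.tokens_per_sec * 3600))

# Example pools (illustrative numbers, not real pricing).
pools = [
    ComputePool("a100-spot", cost_per_hour=1.10, tokens_per_sec=900,  memory_gb=80),
    ComputePool("l40s",      cost_per_hour=0.80, tokens_per_sec=500,  memory_gb=48),
    ComputePool("h100",      cost_per_hour=2.50, tokens_per_sec=2100, memory_gb=80),
]

best = route(workload_memory_gb=40, min_tokens_per_sec=400, pools=pools)
print(best.name)
```

Note that the cheapest pool per hour is not necessarily the cheapest per token: a faster, pricier pool can win once throughput is factored in, which is the tradeoff step 02 describes. Step 04 would then adjust the throughput estimates from live telemetry.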
Models available on OpenDeploy

Gemma 4 31B
Text | 31B | 260k ctx

GLM 5
Text | 128k ctx

Qwen3.5 Plus 2026-02-15
Text | 131k ctx

MiniMax M2.5
Text | 1024k ctx

DeepSeek V3.2
Text | 685B | 131k ctx

Kimi K2.5
Text | 131k ctx

Ministral 3 3B 2512
Text | 3B | 128k ctx

Llama 3.3 70B Instruct
Text | 70B | 131k ctx

Qwen3 VL 32B Instruct
Multimodal | 32B | 8k ctx

Qwen3 Max Thinking
Reasoning | 131k ctx

LFM2-8B-A1B
Text | 8B | 128k ctx

Mistral Small
Text | 6B | 262k ctx

Grok 4.1 Fast
Text | 131k ctx

Mixtral 8x7B Instruct
Text | 46.7B | 33k ctx

Ministral 3 8B 2512
Text | 8B | 128k ctx

Llama 3.2 11B Vision Instruct
Multimodal | 11B | 131k ctx

Pixtral Large 2411
Multimodal | 124B | 128k ctx

GPT OSS 20B
Text | 20B | 128k ctx

Gemma 3 27B
Text | 27B | 128k ctx

Qwen2.5 VL 32B Instruct
Multimodal | 32B | 8k ctx

Nemotron 3 Super 120B
Text | 120B | 262k ctx