Supported Models
This page lists the language models that can be used with Oumi. Because Oumi integrates with the 🤗 Transformers library, you can use any of these models for training, evaluation, or inference.
Models prefixed with a checkmark (✅) have been thoroughly tested and validated by the Oumi community, with ready-to-use recipes available in the configs directory.
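As a rough illustration of what the Transformers integration builds on, the sketch below loads one of the listed models directly with the 🤗 Transformers `AutoTokenizer` and `AutoModelForCausalLM` classes. It does not use Oumi's own API or recipes. The Hub IDs in `ASSUMED_HUB_IDS` and the helper names `resolve_hub_id`, `load_model`, and `generate` are assumptions made for this example; check the HF Hub column for the IDs you actually want.

```python
# Assumed Hugging Face Hub IDs for two models from the tables on this page.
# These are illustrative; verify them on the Hub before relying on them.
ASSUMED_HUB_IDS = {
    "SmolLM2": "HuggingFaceTB/SmolLM2-135M",
    "GPT-2": "openai-community/gpt2",
}


def resolve_hub_id(model_name: str) -> str:
    """Map a table name to its assumed Hub ID, failing loudly if unknown."""
    try:
        return ASSUMED_HUB_IDS[model_name]
    except KeyError:
        raise ValueError(f"No Hub ID registered for {model_name!r}") from None


def load_model(model_name: str):
    """Download (or load from the local cache) the tokenizer and model."""
    # Imported lazily so resolve_hub_id works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    hub_id = resolve_hub_id(model_name)
    tokenizer = AutoTokenizer.from_pretrained(hub_id)
    model = AutoModelForCausalLM.from_pretrained(hub_id)
    return tokenizer, model


def generate(model_name: str, prompt: str, max_new_tokens: int = 20) -> str:
    """Generate a short continuation of `prompt` with the named model."""
    tokenizer, model = load_model(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("GPT-2", "Hello, my name is")` downloads the weights on first use, so it needs network access and the `transformers` and `torch` packages. Models marked ❌ in the Open column may additionally require accepting a license on the Hub before they can be downloaded.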
📚 Model Categories
Instruct Models
| Model | Size | Paper | HF Hub | License | Open [1] | Recommended Parameters |
|---|---|---|---|---|---|---|
| ✅ SmolLM-Instruct | 135M/360M/1.7B | | | Apache 2.0 | ✅ | |
| ✅ Llama 3.1 Instruct | 8B/70B/405B | | | | ❌ | |
| ✅ Llama 3.2 Instruct | 1B/3B | | | | ❌ | |
| ✅ Llama 3.3 Instruct | 70B | | | | ❌ | |
| ✅ Phi-3.5-Instruct | 4B/14B | | | | ❌ | |
| Qwen2.5-Instruct | 0.5B-70B | | | | ❌ | |
| OLMo 2 Instruct | 7B | | | Apache 2.0 | ✅ | |
| MPT-Instruct | 7B | | | Apache 2.0 | ✅ | |
| Command R | 35B/104B | | | | ❌ | |
| Granite-3.1-Instruct | 2B/8B | | | Apache 2.0 | ❌ | |
| Gemma 2 Instruct | 2B/9B | | | | ❌ | |
| DBRX-Instruct | 130B MoE | | | Apache 2.0 | ❌ | |
| Falcon-Instruct | 7B/40B | | | Apache 2.0 | ❌ | |
Vision-Language Models
Base Models
| Model | Size | Paper | HF Hub | License | Open [1] | Recommended Parameters |
|---|---|---|---|---|---|---|
| ✅ SmolLM2 | 135M/360M/1.7B | | | Apache 2.0 | ✅ | |
| ✅ Llama 3.2 | 1B/3B | | | | ❌ | |
| ✅ Llama 3.1 | 8B/70B/405B | | | | ❌ | |
| ✅ GPT-2 | 124M-1.5B | | | MIT | ✅ | |
| DeepSeek V2 | 7B/13B | | | | ❌ | |
| Gemma2 | 2B/9B | | | | ❌ | |
| GPT-J | 6B | | | Apache 2.0 | ✅ | |
| GPT-NeoX | 20B | | | Apache 2.0 | ✅ | |
| Mistral | 7B | | | Apache 2.0 | ❌ | |
| Mixtral | 8x7B/8x22B | | | Apache 2.0 | ❌ | |
| MPT | 7B | | | Apache 2.0 | ✅ | |
| OLMo | 1B/7B | | | Apache 2.0 | ✅ | |