Recipes
To help you get started with Oumi, we've prepared a set of recipes for common use cases. These recipes are designed to be easy to understand and modify, and should be a good starting point for your own projects. Each recipe is a YAML file that can be used to train, evaluate, or deploy a model. Most recipes also have a corresponding job config that lets you run the job remotely; these are usually files ending in _job.yaml in the same directory as the recipe config.
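As a rough illustration of the naming convention described above, the following Python sketch lists files ending in `_job.yaml` that sit in the same directory as a given recipe config. The helper name `find_job_configs` is hypothetical and not part of Oumi; it only relies on the file-naming pattern mentioned in the text, which the docs say holds "usually", not always.

```python
from pathlib import Path


def find_job_configs(recipe_config: str) -> list[Path]:
    """Return files ending in `_job.yaml` located next to a recipe config.

    Hypothetical helper (not an Oumi API): it assumes job configs follow
    the `*_job.yaml` naming convention in the recipe's own directory.
    """
    recipe_dir = Path(recipe_config).resolve().parent
    # Sort so the result is deterministic across platforms.
    return sorted(recipe_dir.glob("*_job.yaml"))
```

An empty list simply means no file in that directory matches the pattern; it does not prove that no job config exists for the recipe.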
Overview
The recipes are organized by model family and task type. Each recipe includes:
Configuration files for different tasks (training, evaluation, inference)
Platform-specific job configurations (Cloud (e.g. GCP), Polaris, or local)
Multiple training methods (FFT, LoRA, QLoRA, FSDP/DDP)
To use a recipe, simply download the desired configuration file, modify any parameters as needed, and run the configuration using the Oumi CLI. For example:
oumi train --config path/to/config.yaml
oumi evaluate --config path/to/config.yaml
oumi infer --config path/to/config.yaml
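If you prefer to drive these commands from a script, a minimal Python sketch could assemble the same three invocations as argument lists. The helper `build_oumi_command` is hypothetical; it only reproduces the `oumi <task> --config <path>` form shown above and does not assume any other flags.

```python
import shlex

# The three subcommands shown in the examples above.
OUMI_TASKS = ("train", "evaluate", "infer")


def build_oumi_command(task: str, config_path: str) -> list[str]:
    """Build the argument list for `oumi <task> --config <config_path>`.

    Hypothetical helper (not an Oumi API); it only mirrors the CLI form
    shown in this section.
    """
    if task not in OUMI_TASKS:
        raise ValueError(f"unsupported task: {task!r}")
    return ["oumi", task, "--config", config_path]


if __name__ == "__main__":
    for task in OUMI_TASKS:
        # Print each command as a shell-quoted string for inspection.
        print(shlex.join(build_oumi_command(task, "path/to/config.yaml")))
```

Each returned list can be passed to `subprocess.run(..., check=True)` on a machine where the Oumi CLI is installed; the sketch itself only builds and prints the commands.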
You can also check out the README.md in each recipe's directory for more details and examples. You can easily adapt these recipes for use with other supported models, datasets, and cloud providers.
Common Models
DeepSeek R1 Family
Model | Configuration | Links |
---|---|---|
DeepSeek R1 671B | | |
Distilled Llama 8B | | |
Distilled Llama 70B | | |
Distilled Qwen 1.5B | | |
Distilled Qwen 32B | | |
Llama Family
Model | Configuration | Links |
---|---|---|
Llama 3.1 8B | | |
Llama 3.3 70B | | |
Llama 3.1 405B | | |
Llama 3.2 1B | | |
Llama 3.2 3B | | |
Vision Models
Model | Configuration | Links |
---|---|---|
Llama 3.2 Vision 11B | | |
LLaVA 7B | | |
Phi3 Vision | | |
Qwen2-VL 2B | | |
SmolVLM | | |
Training Techniques
This section lists one example config for each of several training techniques supported by Oumi.
Technique | Configuration | Links |
---|---|---|
FSDP | | |
Long-context training | | |
DPO | | |
DDP Pretraining | | |
FSDP Pretraining | | |
Inference
Model | Configuration | Links |
---|---|---|
DeepSeek R1 671B | | |
DeepSeek R1 Distill Llama 8B | | |
DeepSeek R1 Distill Llama 70B | | |
DeepSeek R1 Distill Qwen 1.5B | | |
DeepSeek R1 Distill Qwen 32B | | |
Llama 3.1 8B | | |
Llama 3.1 70B | | |
Llama 3.2 1B | | |
Llama 3.2 3B | | |
Llama 3.2 Vision 11B | | |
GPT-2 | | |