Core Concepts#
This guide walks you through the fundamental concepts in Oumi. By the end, you’ll understand how Oumi’s components fit together and be ready to train your first model.
Prerequisites#
Before diving in, you should have:
Oumi installed — See the installation guide if you haven’t set it up yet
Basic Python knowledge — Familiarity with running scripts and using pip
ML fundamentals — Understanding of what model training and fine-tuning mean at a high level
Key Terminology#
Oumi uses standard machine learning terminology. Here are the key terms you’ll encounter:
Term |
What it means |
|---|---|
Pretraining |
Training a model from scratch on large amounts of text data |
Fine-tuning |
Adapting a pretrained model to a specific task or domain |
SFT |
Supervised Fine-Tuning — Teaching a model to follow instructions using input-output examples |
DPO |
Direct Preference Optimization — Aligning a model using pairs of preferred/rejected responses |
GRPO |
Group Relative Policy Optimization — A reinforcement learning method for model alignment |
LoRA |
Low-Rank Adaptation — A memory-efficient fine-tuning technique that trains small adapter layers |
Inference |
Using a trained model to generate predictions or text |
The Oumi Workflow#
The diagram below shows a typical workflow in Oumi. You can start from scratch with pretraining, or begin with an existing model and fine-tune it using SFT, DPO, or GRPO:
%%{init: {'theme': 'base', 'themeVariables': { 'background': '#f5f5f5'}}}%%
graph LR
%% Data stage connections
DS[Datasets] --> |Existing Datasets| TR[Training]
DS --> |Data Synthesis| TR
%% Training methods
TR --> |Pretraining| EV[Evaluation]
TR --> |SFT| EV
TR --> |DPO| EV
TR --> |GRPO| EV
%% Evaluation methods spread horizontally
EV --> |Generative| INF[Inference]
EV --> |Multi-choice| INF
EV --> |LLM Judge| INF
%% Style for core workflow
style DS fill:#1565c0,color:#ffffff
style TR fill:#1565c0,color:#ffffff
style EV fill:#1565c0,color:#ffffff
style INF fill:#1565c0,color:#ffffff
Using Oumi#
Oumi provides two ways to run workflows: the command-line interface (CLI) and the Python API. Most users start with the CLI for its simplicity, then move to the Python API when they need more control.
Command-Line Interface (CLI)#
The CLI is the quickest way to get started. All commands follow this pattern:
oumi <command> [options]
For detailed help on any command, you can use the --help option:
oumi --help # for general help
oumi <command> --help # for command-specific help
The available commands are:
Command |
Purpose |
|---|---|
|
Train or fine-tune a model |
|
Evaluate a model on benchmarks |
|
Run inference on a model |
|
Launch jobs on cloud platforms |
|
Use LLM-as-a-Judge for evaluation |
|
Generate synthetic training data |
|
Analyze and profile datasets |
|
Hyperparameter tuning with Optuna |
|
Quantize models for efficient deployment |
|
Distributed training wrapper for torchrun/accelerate |
|
Display environment information |
Any Oumi command which takes a config path as an argument (train, evaluate, infer, etc.) can override parameters from the command line. For example:
oumi train -c configs/recipes/smollm/sft/135m/quickstart_train.yaml \
--training.max_steps 20 \
--training.learning_rate 1e-4 \
--data.train.datasets[0].shuffle true \
--training.output_dir output/smollm-135m-sft
See CLI Reference for full CLI details, including more details about CLI overrides.
Python API#
The Python API gives you programmatic control over Oumi. Use it when you need to:
Integrate training into a larger pipeline
Modify configurations dynamically
Run experiments in Jupyter notebooks
Here’s a complete example that loads a recipe, modifies it, and runs training:
from oumi.train import train
from oumi.core.configs import TrainingConfig
# Load a predefined recipe
config = TrainingConfig.from_yaml(
"configs/recipes/smollm/sft/135m/quickstart_train.yaml"
)
# Optionally modify settings programmatically
config.training.max_steps = 100
config.training.output_dir = "output/my_experiment"
# Run training
train(config)
When you run this, you’ll see output like:
Loading model: HuggingFaceTB/SmolLM2-135M-Instruct
Starting training for 100 steps...
Step 10/100 | Loss: 2.45 | LR: 5.0e-05
Step 20/100 | Loss: 2.12 | LR: 5.0e-05
...
Training complete. Model saved to output/my_experiment
See oumi for the full API reference.
Configuration Files#
Every Oumi workflow is defined by a YAML configuration file. This makes experiments reproducible—you can share a config file and someone else can run the exact same workflow.
Oumi has four types of configs:
Config Type |
What it controls |
Learn more |
|---|---|---|
Training |
Model, data, hyperparameters, and training settings |
|
Evaluation |
Benchmarks and metrics to run |
|
Inference |
How to generate text from a model |
|
Launcher |
Where and how to run jobs (local, cloud, etc.) |
Here’s what a training config looks like:
# The model to train
model:
name: meta-llama/Llama-3.1-70B-Instruct
trust_remote_code: true
# Training data
data:
train:
datasets:
- dataset_name: text_sft
dataset_path: path/to/data
stream: true
# Training hyperparameters
training:
trainer_type: TRL_SFT
learning_rate: 1e-4
max_steps: 1000
Oumi comes with many ready-to-use configs called recipes. Browse them at Recipes.
Key Components#
Now that you understand how to run Oumi, let’s look at the main components you’ll work with.
Recipes#
A recipe is a complete, ready-to-run configuration file. Oumi includes recipes for common workflows like fine-tuning Llama or training SmolLM. Think of recipes as starting points—you can use them as-is or customize them for your needs.
Browse available recipes: Recipes
Models#
Oumi works with most models from HuggingFace’s transformers library. You specify a model by its HuggingFace name (like meta-llama/Llama-3.1-8B) in your config file. You can also define custom model architectures.
Learn more: Custom Models
Datasets#
Oumi provides a unified interface for loading and preprocessing training data. You can use datasets from HuggingFace, load local files, or create custom dataset classes.
Data mixtures let you combine multiple datasets with different weights—useful when you want to train on diverse data sources simultaneously.
Learn more: Datasets
Training Methods#
Oumi supports multiple training approaches through different trainers:
SFT trainers for supervised fine-tuning
DPO trainers for preference-based alignment
GRPO trainers for reinforcement learning
Each trainer handles the optimization loop, gradient updates, and checkpointing for its respective method.
Learn more: Training Methods
Oumi Judge#
Oumi Judge uses an LLM to evaluate model outputs on attributes like helpfulness, honesty, and safety. It’s useful for automated quality assessment when you don’t have ground-truth labels.
Learn more: LLM Judge
Launcher#
The launcher lets you run Oumi jobs on different platforms—your local machine, a GPU cluster, or cloud providers like AWS and GCP. You define where to run in a launcher config, keeping your training config portable.
Learn more: Running Jobs on Clusters
Next Steps#
Get started with Oumi: First install Oumi, then follow the Quickstart guide to run your first training job.
Explore example recipes: Check out the Recipes page and try running a few examples.
Dive deeper with tutorials: The Tutorials provide step-by-step guidance on specific tasks and workflows.
Learn more about key functionalities: Explore detailed guides on training, inference, evaluation, and model judging.