oumi.core.models
Core models module for the Oumi (Open Universal Machine Intelligence) library.
This module provides base classes for different types of models used in the Oumi framework.
See also
oumi.models: Module containing specific model implementations.
oumi.models.mlp.MLPEncoder: An example of a concrete model implementation.
Example
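To create a custom model, inherit from BaseModel and implement its abstract members, criterion and forward:
>>> from oumi.core.models import BaseModel
>>> class CustomModel(BaseModel):
...     def __init__(self, *args, **kwargs):
...         super().__init__(*args, **kwargs)
...
...     @property
...     def criterion(self):
...         # Return the loss function used for training
...         pass
...
...     def forward(self, **kwargs):
...         # Implement the forward pass and return a dict of output tensors
...         pass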
- class oumi.core.models.BaseModel(**kwargs)
Bases: Module, ABC
- abstract property criterion: Callable
Returns the criterion function used for model training.
- Returns:
A callable object representing the criterion function.
- abstractmethod forward(**kwargs) -> dict[str, Tensor]
Performs the forward pass of the model.
Optionally computes the loss if the necessary keyword arguments are provided.
- Parameters:
**kwargs – should contain all the parameters needed to perform the forward pass, and compute the loss if needed.
- Returns:
A dictionary containing the output tensors.
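The "optional loss" contract described above can be illustrated with a minimal sketch. It is not Oumi code: it uses a plain-Python stand-in class with lists of floats instead of a BaseModel subclass with tensors, and the keyword names `inputs` and `labels`, the output keys `predictions` and `loss`, and the helper `mean_squared_error` are all assumptions chosen for illustration.

```python
from typing import Any, Callable


def mean_squared_error(predictions: list[float], targets: list[float]) -> float:
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


class ToyLinearModel:
    """Hypothetical plain-Python stand-in for a BaseModel subclass (no tensors)."""

    def __init__(self, weight: float, bias: float) -> None:
        self.weight = weight
        self.bias = bias

    @property
    def criterion(self) -> Callable[[list[float], list[float]], float]:
        # The callable used to compute the training loss.
        return mean_squared_error

    def forward(self, **kwargs: Any) -> dict[str, Any]:
        # "inputs" and "labels" are assumed key names, not part of the Oumi API.
        inputs = kwargs["inputs"]
        outputs: dict[str, Any] = {
            "predictions": [self.weight * x + self.bias for x in inputs]
        }
        # Compute the loss only when the caller supplies labels.
        labels = kwargs.get("labels")
        if labels is not None:
            outputs["loss"] = self.criterion(outputs["predictions"], labels)
        return outputs


model = ToyLinearModel(weight=2.0, bias=1.0)
inference_outputs = model.forward(inputs=[0.0, 1.0])
training_outputs = model.forward(inputs=[0.0, 1.0], labels=[1.0, 2.0])
```

Without labels the returned dictionary holds only the predictions; with labels it also carries a `loss` entry, so the same method can serve both inference and training.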
- classmethod from_pretrained(load_directory: str | Path, *, map_location: str | device | None = None, strict: bool = True, weights_filename: str = 'model.safetensors', config_filename: str = 'config.json', override_kwargs: dict[str, Any] | None = None) -> BaseModel
Loads a model from a directory saved with save_pretrained().
This classmethod instantiates a model and loads pretrained weights from disk. It reads both the model configuration and weights, ensuring compatibility.
- Parameters:
load_directory – Directory containing the saved model files.
map_location – Device to load tensors to (e.g., “cpu”, “cuda:0”). If None, loads to CPU by default.
strict – If True, requires the state_dict keys in the weights file to exactly match the keys expected by the model. Defaults to True for safety.
weights_filename – Expected name of weights file. Defaults to “model.safetensors”.
config_filename – Expected name of config file. Defaults to “config.json”.
override_kwargs – Dict of initialization kwargs to override those in config. Useful for modifying model architecture during loading.
- Returns:
An instance of the model class with loaded weights.
- Raises:
FileNotFoundError – If the weights file doesn’t exist.
RuntimeError – If there are missing or unexpected keys when loading the state_dict.
ValueError – If the model type in the config doesn’t match the class.
Example
>>> model = MyCustomModel.from_pretrained("./my_model_checkpoint")
>>> # Or with overrides
>>> model = MyCustomModel.from_pretrained(
...     "./my_model_checkpoint",
...     override_kwargs={"dropout": 0.0}
... )
- save_pretrained(save_directory: str | Path, *, save_config: bool = True, weights_filename: str = 'model.safetensors', config_filename: str = 'config.json') -> None
Saves model weights and initialization config to a directory.
This method saves the model in a format compatible with from_pretrained(), allowing the model to be reloaded later for inference or further training.
- Parameters:
save_directory – Directory where the model will be saved. Will be created if it doesn’t exist.
save_config – If True, saves initialization kwargs as JSON config. Defaults to True.
weights_filename – Name of the weights file. Defaults to “model.safetensors”.
config_filename – Name of the config file. Defaults to “config.json”.
- Raises:
OSError – If the directory cannot be created or files cannot be written.
Example
>>> model = MyCustomModel(hidden_dim=128, num_layers=4)
>>> model.save_pretrained("./my_model_checkpoint")
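The config half of the save/load round trip can be sketched with the standard library alone. This is an illustration under stated assumptions, not Oumi's implementation: the helpers `save_init_config` and `load_init_config` are hypothetical, the flat JSON layout is assumed, weights handling is left out entirely, and the sketch assumes that `override_kwargs` entries replace saved values key by key.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Optional


def save_init_config(
    save_directory: Path,
    init_kwargs: dict[str, Any],
    config_filename: str = "config.json",
) -> Path:
    """Hypothetical helper: write init kwargs as JSON, creating the directory."""
    save_directory.mkdir(parents=True, exist_ok=True)
    config_path = save_directory / config_filename
    config_path.write_text(json.dumps(init_kwargs, indent=2))
    return config_path


def load_init_config(
    load_directory: Path,
    config_filename: str = "config.json",
    override_kwargs: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: read saved init kwargs and apply overrides."""
    config_path = load_directory / config_filename
    if not config_path.is_file():
        # Choice made for this sketch; the documented FileNotFoundError
        # concerns the weights file.
        raise FileNotFoundError(f"Config file not found: {config_path}")
    init_kwargs = json.loads(config_path.read_text())
    # Assumption: overrides win over saved values, key by key.
    init_kwargs.update(override_kwargs or {})
    return init_kwargs


with tempfile.TemporaryDirectory() as tmp:
    checkpoint_dir = Path(tmp) / "my_model_checkpoint"
    save_init_config(
        checkpoint_dir, {"hidden_dim": 128, "num_layers": 4, "dropout": 0.1}
    )
    restored = load_init_config(checkpoint_dir)
    overridden = load_init_config(checkpoint_dir, override_kwargs={"dropout": 0.0})
```

Here `restored` equals the saved kwargs, while `overridden` differs only in `dropout`, which mirrors the `override_kwargs={"dropout": 0.0}` call in the from_pretrained() example.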