oumi.core.inference
Inference module for the Oumi (Open Universal Machine Intelligence) library.
This module provides base classes for model inference in the Oumi framework.
- class oumi.core.inference.BaseInferenceEngine(model_params: ModelParams, *, generation_params: GenerationParams | None = None)[source]
Bases: ABC
Base class for running model inference.
- apply_chat_template(conversation: Conversation, **tokenizer_kwargs) → str [source]
Applies the chat template to the conversation.
- Parameters:
conversation – The conversation to apply the chat template to.
tokenizer_kwargs – Additional keyword arguments to pass to the tokenizer.
- Returns:
The prompt string resulting from applying the chat template to the conversation.
- Return type:
str
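Because BaseInferenceEngine is abstract, this method is called on a concrete engine. Below is a minimal sketch; it assumes the Conversation, Message, and Role types from oumi.core.types.conversation, ModelParams from oumi.core.configs, and the NativeTextInferenceEngine subclass from oumi.inference, and the model name is an arbitrary example:

```python
from oumi.core.configs import ModelParams
from oumi.core.types.conversation import Conversation, Message, Role
from oumi.inference import NativeTextInferenceEngine  # assumed concrete subclass of BaseInferenceEngine

# Build a small conversation to format.
conversation = Conversation(
    messages=[
        Message(role=Role.SYSTEM, content="You are a helpful assistant."),
        Message(role=Role.USER, content="What is the capital of France?"),
    ]
)

engine = NativeTextInferenceEngine(ModelParams(model_name="HuggingFaceTB/SmolLM2-135M-Instruct"))

# Render the conversation into a single prompt string using the model's chat template.
# Additional tokenizer keyword arguments would be forwarded via **tokenizer_kwargs.
prompt = engine.apply_chat_template(conversation)
print(prompt)
```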
- abstract get_supported_params() → set[str] [source]
Returns a set of supported generation parameters for this engine.
Override this method in derived classes to specify which parameters are supported.
- Returns:
A set of supported parameter names.
- Return type:
Set[str]
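A sketch of the override in a derived engine; the class and the parameter names below are illustrative and not taken from any concrete Oumi engine:

```python
from oumi.core.inference import BaseInferenceEngine


class MyInferenceEngine(BaseInferenceEngine):
    """Hypothetical engine, shown only to illustrate the override."""

    def get_supported_params(self) -> set[str]:
        # Generation parameters this engine honors; anything else set on
        # GenerationParams would not be used by this engine.
        return {"max_new_tokens", "temperature", "top_p"}

    # A concrete engine must also implement the remaining abstract methods,
    # infer_online() and infer_from_file(); see the sketch at the end of this page.
```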
- infer(input: list[Conversation] | None = None, inference_config: InferenceConfig | None = None) → list[Conversation] [source]
Runs model inference.
- Parameters:
input – A list of conversations to run inference on. Optional.
inference_config – Parameters for inference. If not specified, a default config is inferred.
- Returns:
Inference output.
- Return type:
List[Conversation]
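A usage sketch, reusing the engine and conversation from the apply_chat_template example above and assuming the generation field on InferenceConfig and the max_new_tokens / temperature fields on GenerationParams from oumi.core.configs:

```python
from oumi.core.configs import GenerationParams, InferenceConfig

# Reuse `engine` and `conversation` from the earlier example.
config = InferenceConfig(generation=GenerationParams(max_new_tokens=128, temperature=0.0))

# Inference output is a list of conversations, typically the inputs with the
# model's responses appended.
results = engine.infer(input=[conversation], inference_config=config)
print(results[0].messages[-1].content)
```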
- abstract infer_from_file(input_filepath: str, inference_config: InferenceConfig | None = None) → list[Conversation] [source]
Runs model inference on inputs in the provided file.
This is a convenience method that avoids the boilerplate of asserting the existence of input_filepath in the generation_params.
- Parameters:
input_filepath – Path to the input file containing prompts for generation.
inference_config – Parameters for inference.
- Returns:
Inference output.
- Return type:
List[Conversation]
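A usage sketch; the path is a placeholder, and the expected on-disk format (for example, JSON Lines of serialized conversations) depends on the implementing engine:

```python
# `engine` and `config` as in the previous examples; the path is a placeholder.
results = engine.infer_from_file("data/prompts.jsonl", inference_config=config)
print(f"Ran inference on {len(results)} conversations")
```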
- abstract infer_online(input: list[Conversation], inference_config: InferenceConfig | None = None) → list[Conversation] [source]
Runs model inference online, i.e., directly on the provided conversations rather than on inputs read from a file.
- Parameters:
input – A list of conversations to run inference on.
inference_config – Parameters for inference.
- Returns:
Inference output.
- Return type:
List[Conversation]
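Putting the abstract contract together, a minimal hypothetical concrete engine might look like the sketch below; the echo behavior stands in for real model calls and is not how any Oumi engine works:

```python
from oumi.core.configs import InferenceConfig
from oumi.core.inference import BaseInferenceEngine
from oumi.core.types.conversation import Conversation, Message, Role


class EchoInferenceEngine(BaseInferenceEngine):
    """Hypothetical engine that echoes the last message back, for illustration only."""

    def get_supported_params(self) -> set[str]:
        # No generation parameters affect an echo "model".
        return set()

    def infer_online(
        self,
        input: list[Conversation],
        inference_config: InferenceConfig | None = None,
    ) -> list[Conversation]:
        results = []
        for conversation in input:
            reply = Message(role=Role.ASSISTANT, content=conversation.messages[-1].content)
            results.append(Conversation(messages=[*conversation.messages, reply]))
        return results

    def infer_from_file(
        self,
        input_filepath: str,
        inference_config: InferenceConfig | None = None,
    ) -> list[Conversation]:
        # A real engine would parse conversations from the file; omitted in this sketch.
        raise NotImplementedError("File-based input is not supported by this sketch.")
```

Instantiation still goes through the base constructor shown at the top of this page, so even this toy engine requires a ModelParams instance.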