oumi.core.inference#

Inference module for the Oumi (Open Universal Machine Intelligence) library.

This module provides base classes for model inference in the Oumi framework.

class oumi.core.inference.BaseInferenceEngine(model_params: ModelParams, *, generation_params: GenerationParams | None = None)[source]#

Bases: ABC

Base class for running model inference.
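
Because this is an abstract base class, a concrete engine must implement get_supported_params(), infer_from_file(), and infer_online(). The sketch below is illustrative only: the import paths for ModelParams, InferenceConfig, and the Conversation/Message/Role types are assumptions based on the names referenced in these signatures, and the toy engine simply echoes the last user message.

    from oumi.core.configs import InferenceConfig, ModelParams  # assumed import path
    from oumi.core.inference import BaseInferenceEngine
    from oumi.core.types.conversation import Conversation, Message, Role  # assumed import path


    class EchoInferenceEngine(BaseInferenceEngine):
        """Toy engine that replies by echoing the last user message."""

        def get_supported_params(self) -> set[str]:
            # Names of GenerationParams fields this engine honors (illustrative).
            return {"max_new_tokens", "temperature"}

        def infer_online(
            self,
            input: list[Conversation],
            inference_config: InferenceConfig | None = None,
        ) -> list[Conversation]:
            outputs = []
            for conversation in input:
                reply = Message(
                    role=Role.ASSISTANT,
                    content=str(conversation.messages[-1].content),
                )
                outputs.append(Conversation(messages=[*conversation.messages, reply]))
            return outputs

        def infer_from_file(
            self,
            input_filepath: str,
            inference_config: InferenceConfig | None = None,
        ) -> list[Conversation]:
            # A real engine would parse conversations from the file; the on-disk
            # format is engine-specific and not defined by the base class.
            raise NotImplementedError("File input is not supported by this toy engine.")


    engine = EchoInferenceEngine(ModelParams(model_name="gpt2"))  # model_name field assumed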

apply_chat_template(conversation: Conversation, **tokenizer_kwargs) str[source]#

Applies the chat template to the conversation.

Parameters:
  • conversation – The conversation to apply the chat template to.

  • tokenizer_kwargs – Additional keyword arguments to pass to the tokenizer.

Returns:

The conversation rendered as a single prompt string with the chat template applied.

Return type:

str
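
A minimal usage sketch, assuming the concrete NativeTextInferenceEngine from oumi.inference and the import paths below exist as named; add_generation_prompt is shown only as a typical Hugging Face chat-template option that would be forwarded via tokenizer_kwargs, not a guaranteed keyword.

    from oumi.core.configs import ModelParams  # assumed import path
    from oumi.core.types.conversation import Conversation, Message, Role  # assumed import path
    from oumi.inference import NativeTextInferenceEngine  # assumed concrete subclass

    engine = NativeTextInferenceEngine(
        ModelParams(model_name="microsoft/Phi-3-mini-4k-instruct")  # any chat-tuned model
    )

    conversation = Conversation(
        messages=[Message(role=Role.USER, content="What is the capital of France?")]
    )

    # Extra keyword arguments are forwarded to the tokenizer's chat-template call.
    prompt = engine.apply_chat_template(conversation, add_generation_prompt=True)
    print(prompt)  # A single formatted prompt string, ready for generation.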

abstract get_supported_params() set[str][source]#

Returns a set of supported generation parameters for this engine.

Override this method in derived classes to specify which parameters are supported.

Returns:

A set of supported parameter names.

Return type:

set[str]
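
One hedged way to use this, assuming an engine instance constructed as in the earlier sketches: query the supported parameter names up front and warn about generation settings the engine would ignore.

    # `engine` is any concrete BaseInferenceEngine subclass constructed elsewhere.
    supported = engine.get_supported_params()

    requested = {"temperature", "top_p", "max_new_tokens"}  # illustrative field names
    ignored = requested - supported
    if ignored:
        print(f"Parameters not honored by this engine: {sorted(ignored)}")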

infer(input: list[Conversation] | None = None, inference_config: InferenceConfig | None = None) list[Conversation][source]#

Runs model inference.

Parameters:
  • input – A list of conversations to run inference on. Optional.

  • inference_config – Parameters for inference. If not specified, a default config is inferred.

Returns:

Inference output.

Return type:

list[Conversation]
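
An end-to-end sketch of calling infer() with in-memory conversations. The NativeTextInferenceEngine class, the import paths, and the InferenceConfig/GenerationParams field names (generation, max_new_tokens, temperature) are assumptions, not guaranteed by this base class.

    from oumi.core.configs import GenerationParams, InferenceConfig, ModelParams  # assumed
    from oumi.core.types.conversation import Conversation, Message, Role  # assumed import path
    from oumi.inference import NativeTextInferenceEngine  # assumed concrete subclass

    engine = NativeTextInferenceEngine(
        ModelParams(model_name="microsoft/Phi-3-mini-4k-instruct")
    )
    config = InferenceConfig(
        generation=GenerationParams(max_new_tokens=64, temperature=0.0)  # field names assumed
    )

    conversations = [
        Conversation(messages=[Message(role=Role.USER, content="Name three prime numbers.")])
    ]

    # Each returned Conversation contains the model's reply appended to the input turns.
    outputs = engine.infer(input=conversations, inference_config=config)
    print(outputs[0].messages[-1].content)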

abstract infer_from_file(input_filepath: str, inference_config: InferenceConfig | None = None) list[Conversation][source]#

Runs model inference on inputs in the provided file.

This is a convenience method that avoids the boilerplate of asserting that input_filepath is present in the generation_params.

Parameters:
  • input_filepath – Path to the input file containing prompts for generation.

  • inference_config – Parameters for inference.

Returns:

Inference output.

Return type:

list[Conversation]
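
A hedged call-site sketch; the concrete engine and the on-disk prompt format are assumptions. The base class only fixes the signature, so the expected file layout (for example, one JSON-serialized conversation per line) should be checked against the engine being used.

    from oumi.core.configs import InferenceConfig, ModelParams  # assumed import paths
    from oumi.inference import NativeTextInferenceEngine  # assumed concrete subclass

    engine = NativeTextInferenceEngine(
        ModelParams(model_name="microsoft/Phi-3-mini-4k-instruct")
    )

    # "prompts.jsonl" is an illustrative path; the expected file format is
    # engine-specific and not defined by BaseInferenceEngine.
    outputs = engine.infer_from_file("prompts.jsonl", InferenceConfig())
    print(f"Ran inference on {len(outputs)} conversations.")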

abstract infer_online(input: list[Conversation], inference_config: InferenceConfig | None = None) list[Conversation][source]#

Runs model inference online.

Parameters:
  • input – A list of conversations to run inference on.

  • inference_config – Parameters for inference.

Returns:

Inference output.

Return type:

list[Conversation]
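
A short sketch of calling infer_online() directly with a batch of in-memory conversations, under the same assumptions about import paths and the concrete engine as the earlier examples.

    from oumi.core.configs import InferenceConfig, ModelParams  # assumed import paths
    from oumi.core.types.conversation import Conversation, Message, Role  # assumed import path
    from oumi.inference import NativeTextInferenceEngine  # assumed concrete subclass

    engine = NativeTextInferenceEngine(
        ModelParams(model_name="microsoft/Phi-3-mini-4k-instruct")
    )

    batch = [
        Conversation(messages=[Message(role=Role.USER, content=question)])
        for question in ("What is 2 + 2?", "Who wrote 'Dune'?")
    ]

    # Operates directly on the in-memory conversations (no input file involved).
    outputs = engine.infer_online(batch, InferenceConfig())
    for conversation in outputs:
        print(conversation.messages[-1].content)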