oumi.core.feature_generators

Feature generators module for the Oumi (Open Universal Machine Intelligence) library.

This module provides classes to generate model input features, which can be used in multiple contexts (datasets, collators, etc).

class oumi.core.feature_generators.BaseConversationFeatureGenerator

Bases: ABC

Applies a processor to generate model inputs from an input Conversation.

abstractmethod transform_conversation(conversation: Conversation, options: FeatureGeneratorOptions | None) → dict

Transforms a single Oumi conversation into a dictionary of model inputs.

Parameters:
  • conversation – An input conversation.

  • options – Options for the feature generator.

Returns:

A dictionary of inputs for a model.

Return type:

dict

abstractmethod transform_conversations(conversations: list[Conversation], options: FeatureGeneratorOptions | None) → dict

Transforms a list of Oumi conversations into a dictionary of model inputs.

Parameters:
  • conversations – A list of input conversations.

  • options – Options for the feature generator.

Returns:

A dictionary of inputs for a model.

Return type:

dict
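The two abstract methods above can be sketched with a small concrete subclass. This is only an illustration: `Conversation`, `FeatureGeneratorOptions`, and the base class below are simplified stand-ins defined locally so the example runs without Oumi installed, and `WordCountFeatureGenerator` is a hypothetical name, not part of the library.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import NamedTuple, Optional


# Local stand-ins for the Oumi types (assumptions, not the real classes).
@dataclass
class Conversation:
    messages: list[str] = field(default_factory=list)


class FeatureGeneratorOptions(NamedTuple):
    allow_feature_reshape: bool = True


class BaseConversationFeatureGenerator(ABC):
    @abstractmethod
    def transform_conversation(
        self, conversation: Conversation, options: Optional[FeatureGeneratorOptions]
    ) -> dict: ...

    @abstractmethod
    def transform_conversations(
        self,
        conversations: list[Conversation],
        options: Optional[FeatureGeneratorOptions],
    ) -> dict: ...


class WordCountFeatureGenerator(BaseConversationFeatureGenerator):
    """Hypothetical generator with one feature: words per message."""

    def transform_conversation(self, conversation, options):
        # `options` is accepted to match the interface but unused in this sketch.
        return {"word_counts": [len(m.split()) for m in conversation.messages]}

    def transform_conversations(self, conversations, options):
        per_conversation = [
            self.transform_conversation(c, options) for c in conversations
        ]
        # Collate: one entry per conversation under each feature name.
        return {"word_counts": [f["word_counts"] for f in per_conversation]}


generator = WordCountFeatureGenerator()
conv = Conversation(messages=["Hello there", "Hi, how can I help?"])
single = generator.transform_conversation(conv, None)
batch = generator.transform_conversations([conv, conv], None)
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated.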

class oumi.core.feature_generators.FeatureGeneratorOptions(allow_feature_reshape: bool = True)

Bases: NamedTuple

Options for the feature generator.

allow_feature_reshape: bool

Whether to allow reshaping of the model input features.

For example, whether the generator may drop a leading dummy dimension (a first axis of size 1).
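As a rough illustration of what this flag controls, the sketch below defines a local stand-in with the same field and default as the documented signature, plus a hypothetical helper, `maybe_squeeze_first_dim`, that drops a leading dimension of size 1 only when the option allows it. Nested lists stand in for tensors; how Oumi applies the flag internally is not shown on this page, so treat the helper as an assumption.

```python
from typing import NamedTuple


# Stand-in mirroring the documented signature; not imported from Oumi.
class FeatureGeneratorOptions(NamedTuple):
    allow_feature_reshape: bool = True


def maybe_squeeze_first_dim(feature: list, options: FeatureGeneratorOptions) -> list:
    """Hypothetical helper: drops a leading dimension of size 1 if allowed."""
    if options.allow_feature_reshape and len(feature) == 1:
        return feature[0]
    return feature


input_ids = [[101, 2023, 102]]  # shape (1, 3): one conversation, three tokens

default_options = FeatureGeneratorOptions()
# NamedTuple instances are immutable, so _replace returns a modified copy.
strict_options = default_options._replace(allow_feature_reshape=False)

squeezed = maybe_squeeze_first_dim(input_ids, default_options)
unchanged = maybe_squeeze_first_dim(input_ids, strict_options)
```

With the default options the result has shape (3,); with `allow_feature_reshape=False` the (1, 3) shape is preserved.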

class oumi.core.feature_generators.VisionLanguageConversationFeatureGenerator(*, tokenizer: PreTrainedTokenizerBase | None = None, processor: BaseProcessor | None = None, processor_name: str | None = None, trust_remote_code: bool = False, return_tensors: str | None = None, max_length: int | None = None, truncation: bool = False, truncation_side: str = 'right', label_ignore_index: int | None = None)

Bases: BaseConversationFeatureGenerator

Applies a processor to generate model inputs from an input Conversation.
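This page does not describe the constructor parameters. Assuming `max_length`, `truncation`, and `truncation_side` follow the usual Hugging Face tokenizer convention (an assumption, not something confirmed here), their combined effect on a token sequence can be sketched as follows. `truncate_token_ids` is a hypothetical helper written for this illustration and is not part of Oumi.

```python
from typing import Optional


def truncate_token_ids(
    token_ids: list[int],
    max_length: Optional[int] = None,
    truncation: bool = False,
    truncation_side: str = "right",
) -> list[int]:
    """Hypothetical helper illustrating the assumed truncation semantics."""
    if truncation_side not in ("left", "right"):
        raise ValueError(f"Unsupported truncation_side: {truncation_side!r}")
    # Nothing to do when truncation is off, unbounded, or already short enough.
    if not truncation or max_length is None or len(token_ids) <= max_length:
        return token_ids
    if truncation_side == "right":
        # Keep the start of the sequence, drop the tail.
        return token_ids[:max_length]
    # Keep the end of the sequence, drop the head.
    return token_ids[len(token_ids) - max_length:]


tokens = [1, 2, 3, 4, 5]
right = truncate_token_ids(tokens, max_length=3, truncation=True)
left = truncate_token_ids(
    tokens, max_length=3, truncation=True, truncation_side="left"
)
disabled = truncate_token_ids(tokens, max_length=3)
```

Under this assumption, setting `max_length` alone changes nothing; `truncation=True` must also be passed.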

transform_conversation(conversation: Conversation, options: FeatureGeneratorOptions | None) → dict

Transforms a single Oumi conversation into a dictionary of model inputs.

Parameters:
  • conversation – An input conversation.

  • options – Options for the feature generator.

Returns:

A dictionary of inputs for a model.

Return type:

dict

transform_conversations(conversations: list[Conversation], options: FeatureGeneratorOptions | None) → dict

Transforms a list of Oumi conversations into a dictionary of model inputs.

Parameters:
  • conversations – A list of input conversations.

  • options – Options for the feature generator.

Returns:

A dictionary of inputs for a model.

Return type:

dict