oumi.core.feature_generators
Feature generators module for the Oumi (Open Universal Machine Intelligence) library.
This module provides classes that generate model input features and can be reused in multiple contexts (datasets, collators, etc.).
- class oumi.core.feature_generators.BaseConversationFeatureGenerator
Bases: ABC
Applies a processor to generate model inputs from an input Conversation.
- abstractmethod transform_conversation(conversation: Conversation, options: FeatureGeneratorOptions | None) → dict
Transforms a single Oumi conversation into a dictionary of model inputs.
- Parameters:
conversation – An input conversation.
options – Options for the feature generator.
- Returns:
A dictionary of inputs for a model.
- Return type:
dict
- abstractmethod transform_conversations(conversations: list[Conversation], options: FeatureGeneratorOptions | None) → dict
Transforms a list of Oumi conversations into a dictionary of model inputs.
- Parameters:
conversations – A list of input conversations.
options – Options for the feature generator.
- Returns:
A dictionary of inputs for a model.
- Return type:
dict
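The two abstract methods above define the whole contract a concrete generator must satisfy. The following is a minimal, self-contained sketch of that contract, not Oumi's implementation: `Conversation`, `FeatureGeneratorOptions`, and `BaseConversationFeatureGenerator` are simplified stand-ins defined locally so the example runs without Oumi installed, and `WordCountFeatureGenerator` and its `word_count` feature are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import NamedTuple, Optional


class FeatureGeneratorOptions(NamedTuple):
    """Local stand-in mirroring the documented options tuple."""

    allow_feature_reshape: bool = True


class Conversation(NamedTuple):
    """Hypothetical, heavily simplified stand-in for an Oumi Conversation."""

    messages: list[str]


class BaseConversationFeatureGenerator(ABC):
    """Local stand-in exposing the two documented abstract methods."""

    @abstractmethod
    def transform_conversation(
        self, conversation: Conversation, options: Optional[FeatureGeneratorOptions]
    ) -> dict: ...

    @abstractmethod
    def transform_conversations(
        self,
        conversations: list[Conversation],
        options: Optional[FeatureGeneratorOptions],
    ) -> dict: ...


class WordCountFeatureGenerator(BaseConversationFeatureGenerator):
    """Toy generator (hypothetical): emits one word count per conversation."""

    def transform_conversation(self, conversation, options):
        # Delegate to the batch method so both paths share one code path.
        return self.transform_conversations([conversation], options)

    def transform_conversations(self, conversations, options):
        counts = [
            sum(len(message.split()) for message in conv.messages)
            for conv in conversations
        ]
        return {"word_count": counts}


generator = WordCountFeatureGenerator()
features = generator.transform_conversation(
    Conversation(messages=["hello there", "hi"]), None
)
```

Routing the single-conversation method through the batch method is one possible design choice; whether Oumi's own generators do this is not stated in this reference.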
- class oumi.core.feature_generators.FeatureGeneratorOptions(allow_feature_reshape: bool = True)
Bases: NamedTuple
Options for the feature generator.
- allow_feature_reshape: bool
Whether to allow reshaping of the model input features.
For example, whether the generator can drop the first dummy dimension.
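To make the "drop the first dummy dimension" case concrete, here is a small sketch using plain nested lists. It is an illustration under assumptions, not Oumi's actual reshaping logic: `FeatureGeneratorOptions` is re-declared locally to match the documented signature, `maybe_drop_first_dim` is a hypothetical helper, and treating `options=None` as "use the defaults" is also an assumption.

```python
from typing import Any, NamedTuple, Optional


class FeatureGeneratorOptions(NamedTuple):
    """Local stand-in matching the documented signature."""

    allow_feature_reshape: bool = True


def maybe_drop_first_dim(
    features: dict[str, Any], options: Optional[FeatureGeneratorOptions]
) -> dict[str, Any]:
    """Hypothetical helper: removes a leading size-1 dimension when allowed."""
    # Assumption: a missing options object means "use the defaults".
    effective = options if options is not None else FeatureGeneratorOptions()
    if not effective.allow_feature_reshape:
        return dict(features)
    return {
        name: value[0] if isinstance(value, list) and len(value) == 1 else value
        for name, value in features.items()
    }


batched = {"input_ids": [[101, 2023, 102]]}  # shape (1, 3): dummy batch dimension
squeezed = maybe_drop_first_dim(batched, FeatureGeneratorOptions())
untouched = maybe_drop_first_dim(
    batched, FeatureGeneratorOptions(allow_feature_reshape=False)
)
```

With reshaping allowed, `squeezed["input_ids"]` is the flat list of three token ids; with it disabled, the nested single-row list is returned unchanged.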
- class oumi.core.feature_generators.VisionLanguageConversationFeatureGenerator(*, tokenizer: PreTrainedTokenizerBase | None = None, processor: BaseProcessor | None = None, processor_name: str | None = None, trust_remote_code: bool = False, return_tensors: str | None = None, max_length: int | None = None, truncation: bool = False, truncation_side: str = 'right', label_ignore_index: int | None = None)
Bases: BaseConversationFeatureGenerator
Applies a processor to generate model inputs from an input Conversation.
- transform_conversation(conversation: Conversation, options: FeatureGeneratorOptions | None) → dict
Transforms a single Oumi conversation into a dictionary of model inputs.
- Parameters:
conversation – An input conversation.
options – Options for the feature generator.
- Returns:
A dictionary of inputs for a model.
- Return type:
dict
- transform_conversations(conversations: list[Conversation], options: FeatureGeneratorOptions | None) → dict
Transforms a list of Oumi conversations into a dictionary of model inputs.
- Parameters:
conversations – A list of input conversations.
options – Options for the feature generator.
- Returns:
A dictionary of inputs for a model.
- Return type:
dict