oumi.datasets.grpo

GRPO datasets module.

class oumi.datasets.grpo.LetterCountGrpoDataset(*, dataset_name: str | None = None, dataset_path: str | None = None, split: str | None = None, tokenizer: PreTrainedTokenizerBase | None = None, return_tensors: bool = False, **kwargs)

Bases: BaseExperimentalGrpoDataset

Dataset class for the oumi-ai/oumi-letter-count dataset.

A sample from the dataset:

    {
        "prompt": "Can you let me know how many 'r's are in 'pandered'?",
        "metadata": {
            "letter": "r",
            "letter_count_integer": 1,
            "letter_count_string": "one",
            "unformatted_prompt": "Can you let me know how many {letter}s are in {word}?",
            "word": "pandered",
        },
    }
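The metadata fields in the sample appear to be redundant with one another: the integer count should match the number of times the letter occurs in the word. The following is a minimal sketch of that relationship, written against a plain Python dict. The helper name `metadata_is_consistent` is hypothetical and is not part of oumi, and the case-sensitive match is an assumption.

```python
# Sample copied from the class description, written as a plain Python dict.
sample = {
    "prompt": "Can you let me know how many 'r's are in 'pandered'?",
    "metadata": {
        "letter": "r",
        "letter_count_integer": 1,
        "letter_count_string": "one",
        "unformatted_prompt": "Can you let me know how many {letter}s are in {word}?",
        "word": "pandered",
    },
}


def metadata_is_consistent(sample: dict) -> bool:
    """Hypothetical helper (not an oumi API).

    Returns True when the stored integer count equals the number of
    case-sensitive occurrences of the letter in the word.
    """
    meta = sample["metadata"]
    return meta["word"].count(meta["letter"]) == meta["letter_count_integer"]


print(metadata_is_consistent(sample))
```

A check like this can be useful when writing a reward function for GRPO training, since the reward typically compares a model's answer against `letter_count_integer`.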

dataset_name: str
default_dataset: str | None = 'oumi-ai/oumi-letter-count'
transform(sample: Series) → dict

Validate the sample and transform it into a Python dict.
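The page does not spell out what validation involves. As a rough illustration only, the sketch below shows one way a validate-then-convert step could look, assuming the required fields are the `prompt` and `metadata` keys from the sample shown earlier. The names `transform_sketch` and `REQUIRED_FIELDS` are hypothetical; this is not the library's implementation.

```python
from typing import Any

# Assumption: the required fields are the two top-level keys in the sample.
REQUIRED_FIELDS = ("prompt", "metadata")


def transform_sketch(sample: Any) -> dict:
    """Hypothetical stand-in for transform(): validate, then return a dict.

    Works on a plain dict. For a pandas Series, `in` checks index labels,
    so the same membership test applies.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in sample]
    if missing:
        raise ValueError(f"Sample is missing required fields: {missing}")
    if not isinstance(sample["prompt"], str):
        raise ValueError("'prompt' must be a string")
    # Copy into fresh Python containers so the caller's data is not shared.
    return {"prompt": sample["prompt"], "metadata": dict(sample["metadata"])}


row = {
    "prompt": "Can you let me know how many 'r's are in 'pandered'?",
    "metadata": {"letter": "r", "letter_count_integer": 1, "word": "pandered"},
}
result = transform_sketch(row)
print(sorted(result))
```

Raising early on a malformed row keeps bad samples from reaching the trainer, where the failure would be harder to trace back to the data.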

trust_remote_code: bool

class oumi.datasets.grpo.TldrGrpoDataset(*, dataset_name: str | None = None, dataset_path: str | None = None, split: str | None = None, tokenizer: PreTrainedTokenizerBase | None = None, return_tensors: bool = False, **kwargs)

Bases: BaseExperimentalGrpoDataset

Dataset class for the trl-lib/tldr dataset.

dataset_name: str
default_dataset: str | None = 'trl-lib/tldr'
trust_remote_code: bool

Subpackages