oumi.judges

This module provides access to various judge configurations for the Oumi project.

The judges are used to evaluate the quality of AI-generated responses based on different criteria such as helpfulness, honesty, and safety.

class oumi.judges.BaseJudge(config: JudgeConfig, inference_engine: BaseInferenceEngine | None = None)

Bases: ABC

build_judgement_prompt(judge_input: Message, attribute_name: str | None) → Conversation

Builds the judgement prompt, returned as a Conversation, for one judge input and, when attribute_name is given, one attribute.

judge(raw_inputs: list[Conversation] | list[dict] | list[Message]) → list[dict[str, BaseJudgeOutput]]

Judge the given conversations.
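The return type of judge() is a list with one dictionary per input, keyed by a string and holding a BaseJudgeOutput. As a minimal sketch of how such a result could be summarised, the example below assumes the keys are attribute names and uses a hypothetical StubOutput class in place of real judge outputs; the attribute names "helpful" and "safe" are also made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubOutput:
    """Hypothetical stand-in for a BaseJudgeOutput; only `label` is modelled."""

    label: Optional[bool]


def pass_rate_by_attribute(results):
    """Return, per attribute, the fraction of non-None labels that are True."""
    passed = defaultdict(int)
    counted = defaultdict(int)
    for per_input in results:  # one dict per judged input
        for attribute, output in per_input.items():
            if output.label is None:  # no usable label: leave it out
                continue
            counted[attribute] += 1
            passed[attribute] += int(bool(output.label))
    return {name: passed[name] / counted[name] for name in counted}


# Shape mirrors list[dict[str, BaseJudgeOutput]] from the signature above.
results = [
    {"helpful": StubOutput(True), "safe": StubOutput(True)},
    {"helpful": StubOutput(False), "safe": StubOutput(None)},
]
rates = pass_rate_by_attribute(results)
```

Outputs whose label is None are skipped rather than counted as failures; whether that is the right policy depends on the evaluation.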

class oumi.judges.BaseJudgeOutput(*, template: str, role: Role, raw_judgement: str | None = None)

Bases: ABC, TemplatedMessage

property fields

Return the fields of the judgement.

classmethod from_json_output(raw_judgement: str | None) → Self | None

Parses the judgement from JSON.

classmethod from_xml_output(raw_judgement: str | None) → Self | None

Parses the judgement from XML-like tags in the raw output.

Parameters:

raw_judgement – The raw judgement string to parse.

Returns:

An instance of the class with parsed attributes, or None if parsing fails.

Return type:

Optional[Self]
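The idea of pulling XML-like tags out of raw model output can be sketched with a regular expression. This is an illustration only, not the library's implementation: extract_tags is a hypothetical helper, and returning None when no tag is found is an assumption modelled on the "None if parsing fails" contract described above.

```python
import re
from typing import Optional


def extract_tags(raw_judgement: Optional[str], names) -> Optional[dict]:
    """Collect the text inside <name>...</name> for each requested name."""
    if not raw_judgement:
        return None
    found = {}
    for name in names:
        pattern = rf"<{re.escape(name)}>(.*?)</{re.escape(name)}>"
        # DOTALL lets a tag body span several lines.
        match = re.search(pattern, raw_judgement, re.DOTALL)
        if match:
            found[name] = match.group(1).strip()
    return found or None


raw = "Sure.<explanation>Answers the question.</explanation><judgement>true</judgement>"
parsed = extract_tags(raw, ["explanation", "judgement"])
```

Text outside the tags, such as the leading "Sure.", is ignored, which is why the tags are described as XML-like rather than strict XML.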

property label

Convert the judgement to a boolean or Likert scale label.

This method should be overridden by subclasses to provide the actual conversion logic.

model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'raw_judgement': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'role': FieldInfo(annotation=Role, required=True), 'template': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

raw_judgement: str | None

class oumi.judges.OumiJudgeInput(*, template: str = '<request>{{ request }}</request>\n{% if context %}<context>{{ context }}</context>{% endif %}\n{% if response %}<response>{{ response }}</response>{% endif %}\n', role: Role = Role.USER, request: str, response: str | None = None, context: str | None = None)

Bases: TemplatedMessage

context: str | None

model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'context': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'request': FieldInfo(annotation=str, required=True), 'response': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'role': FieldInfo(annotation=Role, required=False, default=<Role.USER: 'user'>), 'template': FieldInfo(annotation=str, required=False, default='<request>{{ request }}</request>\n{% if context %}<context>{{ context }}</context>{% endif %}\n{% if response %}<response>{{ response }}</response>{% endif %}\n')}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

request: str

response: str | None

role: Role

The role of the message sender (e.g., USER, ASSISTANT, SYSTEM).

template: str

The template string used to generate the message content.
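The default template of OumiJudgeInput always emits the request and adds the context and response blocks only when those fields are set. The plain-Python sketch below mirrors that logic to show the resulting text. render_judge_input is a hypothetical helper, not part of oumi, and the exact whitespace produced by the library depends on how the template engine is configured, which this page does not state.

```python
from typing import Optional


def render_judge_input(
    request: str,
    response: Optional[str] = None,
    context: Optional[str] = None,
) -> str:
    """Mirror the default template: request, then optional context and response."""
    lines = [f"<request>{request}</request>"]
    # An unset field leaves an empty line, as the literal template text would.
    lines.append(f"<context>{context}</context>" if context else "")
    lines.append(f"<response>{response}</response>" if response else "")
    return "\n".join(lines) + "\n"


with_context = render_judge_input("What is 2 + 2?", response="4", context="Arithmetic quiz")
without_context = render_judge_input("What is 2 + 2?", response="4")
```

Only request is required, matching the required=True entry in model_fields.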

class oumi.judges.OumiJudgeOutput(*, template: str = '<explanation>{{explanation}}</explanation><judgement>{{judgement}}</judgement>', role: Role = Role.ASSISTANT, raw_judgement: str | None = None, judgement: str | None = None, explanation: str | None = None)

Bases: BaseJudgeOutput

explanation: str | None

judgement: str | None

property label

Convert the judgement to a boolean or Likert scale label.
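One way a judgement string could be turned into a boolean or Likert label is sketched below. The accepted spellings ("true", "yes", "false", "no") and the treatment of digit strings as Likert scores are assumptions made for illustration; this page does not state which values the library recognises, and to_label is a hypothetical function.

```python
from typing import Optional, Union


def to_label(judgement: Optional[str]) -> Union[bool, int, None]:
    """Map a judgement string to True/False, an integer score, or None."""
    if judgement is None:
        return None
    text = judgement.strip().lower()
    if text in {"true", "yes"}:  # assumed positive spellings
        return True
    if text in {"false", "no"}:  # assumed negative spellings
        return False
    if text.isdecimal():  # assumed Likert-style integer score
        return int(text)
    return None  # anything else is treated as unparseable
```

Returning None for unrecognised text keeps unparseable judgements distinguishable from a negative label.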

model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'explanation': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'judgement': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'raw_judgement': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'role': FieldInfo(annotation=Role, required=False, default=<Role.ASSISTANT: 'assistant'>), 'template': FieldInfo(annotation=str, required=False, default='<explanation>{{explanation}}</explanation><judgement>{{judgement}}</judgement>')}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

role: Role

The role of the message sender (e.g., USER, ASSISTANT, SYSTEM).

template: str

The template string used to generate the message content.

class oumi.judges.OumiXmlJudge(config: JudgeConfig, inference_engine: BaseInferenceEngine | None = None)

Bases: BaseJudge

oumi.judges.oumi_v1_xml_claude_sonnet_judge() → JudgeConfig

Returns a JudgeConfig for the Oumi v1 XML Anthropic judge.

Creates and returns a JudgeConfig object for the Oumi v1 judge, which uses Claude Sonnet as the judge model, with inputs and outputs in XML format.

Returns:

A configuration object for the Oumi v1 XML Anthropic judge.

Return type:

JudgeConfig

Note

This judge uses the Anthropic API, so the ANTHROPIC_API_KEY environment variable must be set with a valid API key.

oumi.judges.oumi_v1_xml_gpt4o_judge() → JudgeConfig

Returns a JudgeConfig for the Oumi v1 XML GPT-4o judge.

Creates and returns a JudgeConfig object for the Oumi v1 judge, which uses GPT-4o as the judge model, with inputs and outputs in XML format.

Returns:

A configuration object for the Oumi v1 XML GPT-4o judge.

Return type:

JudgeConfig

Note

This judge uses the OpenAI API, so the OPENAI_API_KEY environment variable must be set with a valid API key.
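Because the two hosted judges each depend on an environment variable, it can help to check for the key before building the judge. The helper below is a minimal sketch under that assumption: missing_api_key and REQUIRED_KEYS are hypothetical names, not part of oumi, and only the variable names come from the notes above.

```python
import os

# Judge factory name -> environment variable named in its note.
REQUIRED_KEYS = {
    "oumi_v1_xml_claude_sonnet_judge": "ANTHROPIC_API_KEY",
    "oumi_v1_xml_gpt4o_judge": "OPENAI_API_KEY",
}


def missing_api_key(judge_name, environ=None):
    """Return the unset variable name for judge_name, or None if nothing is missing."""
    env = os.environ if environ is None else environ
    variable = REQUIRED_KEYS.get(judge_name)
    # Judges without an entry need no key; an empty value counts as unset.
    if variable is None or env.get(variable):
        return None
    return variable
```

Passing a dictionary as environ makes the check easy to test without touching the real environment; by default it reads os.environ.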

oumi.judges.oumi_v1_xml_local_judge() → JudgeConfig

Returns a JudgeConfig for the Oumi v1 XML local judge.

Returns:

A configuration object for the Oumi v1 XML local judge.

Return type:

JudgeConfig

Note

This judge uses a local GGUF model file for inference.