oumi.core.evaluation.utils#

Submodules#

oumi.core.evaluation.utils.platform_prerequisites module#

oumi.core.evaluation.utils.platform_prerequisites.check_prerequisites(evaluation_backend: EvaluationBackend, task_name: str | None = None) → None[source]#

Check whether the evaluation backend prerequisites are satisfied.

Parameters:
  • evaluation_backend – The evaluation backend that the task will run.

  • task_name – The name of the task to run (applicable to the LM Harness backend only).

Raises:

RuntimeError – If the evaluation backend prerequisites are not satisfied.
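
A minimal usage sketch follows. The import paths and the `EvaluationBackend.LM_HARNESS` member name are assumptions made for illustration; consult your installed Oumi version for the exact locations.

```python
# Sketch: verify prerequisites before launching an evaluation run.
# Import paths and enum member names are assumptions, not confirmed API.
from oumi.core.configs import EvaluationBackend  # assumed re-export location
from oumi.core.evaluation.utils.platform_prerequisites import check_prerequisites

try:
    # For the LM Harness backend, passing the task name lets task-specific
    # prerequisites (e.g., extra packages) be checked as well.
    check_prerequisites(
        evaluation_backend=EvaluationBackend.LM_HARNESS,  # assumed member name
        task_name="mmlu",  # hypothetical task name
    )
except RuntimeError as err:
    # Prerequisites are missing; fail fast with a clear message.
    raise SystemExit(f"Cannot run evaluation: {err}")
```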

oumi.core.evaluation.utils.save_utils module#

oumi.core.evaluation.utils.save_utils.save_evaluation_output(backend_name: str, task_params: EvaluationTaskParams, evaluation_result: EvaluationResult, base_output_dir: str | None, config: EvaluationConfig | None) → None[source]#

Writes configuration settings and evaluation outputs to files.

Parameters:
  • backend_name – The name of the evaluation backend used (e.g., “lm_harness”).

  • task_params – Oumi task parameters used for this evaluation.

  • evaluation_result – The evaluation results to save.

  • base_output_dir – The directory where the evaluation results will be saved. A subdirectory, <base_output_dir> / <backend_name>_<time>, is created to hold all files related to this evaluation. If a directory with that name already exists, a unique index is appended: <base_output_dir> / <backend_name>_<time>_<index>.

  • config – Oumi evaluation configuration settings used for the evaluation.
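
A hedged sketch of how this function might be called after an evaluation run. The objects passed in (`task_params`, `evaluation_result`, `config`) are assumed to come from an earlier evaluation step, and the output directory is hypothetical.

```python
# Sketch: persist a finished evaluation run to disk.
# The wrapper below only illustrates the call; the evaluation objects are
# assumed to be produced elsewhere (e.g., by the evaluation backend itself).
from oumi.core.evaluation.utils.save_utils import save_evaluation_output


def persist_results(task_params, evaluation_result, config) -> None:
    """Save one evaluation run; callers supply the run's Oumi objects."""
    save_evaluation_output(
        backend_name="lm_harness",             # backend used for the run
        task_params=task_params,               # EvaluationTaskParams
        evaluation_result=evaluation_result,   # EvaluationResult
        base_output_dir="output/evaluations",  # hypothetical base directory
        config=config,                         # EvaluationConfig
    )
    # Files are written under output/evaluations/lm_harness_<time>,
    # or lm_harness_<time>_<index> if that directory already exists.
```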