oumi.core.evaluation.utils
Submodules
oumi.core.evaluation.utils.platform_prerequisites module
- oumi.core.evaluation.utils.platform_prerequisites.check_prerequisites(evaluation_backend: EvaluationBackend, task_name: str | None = None) → None [source]
Check whether the evaluation backend prerequisites are satisfied.
- Parameters:
evaluation_backend – The evaluation backend on which the task will run.
task_name – The name of the task to run (applicable to the LM Harness backend only).
- Raises:
RuntimeError – If the evaluation backend prerequisites are not satisfied.
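The sketch below shows one way to call this function. Only the signature above is documented here, so the import path and member name for EvaluationBackend are assumptions and may differ in your Oumi version.

```python
# Hypothetical usage sketch for check_prerequisites.
# Assumption: EvaluationBackend is an enum exposed via oumi.core.configs;
# its exact import path and member names may differ.
from oumi.core.configs import EvaluationBackend
from oumi.core.evaluation.utils.platform_prerequisites import check_prerequisites

try:
    # Verify that everything the LM Harness backend needs is available
    # before launching a potentially long evaluation run.
    check_prerequisites(
        evaluation_backend=EvaluationBackend.LM_HARNESS,  # assumed member name
        task_name="mmlu",  # task_name is only consulted for LM Harness
    )
except RuntimeError as e:
    # Per the docs above, a RuntimeError signals missing prerequisites.
    print(f"Evaluation prerequisites not satisfied: {e}")
```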
oumi.core.evaluation.utils.save_utils module
- oumi.core.evaluation.utils.save_utils.save_evaluation_output(backend_name: str, task_params: EvaluationTaskParams, evaluation_result: EvaluationResult, base_output_dir: str | None, config: EvaluationConfig | None) → None [source]
Writes configuration settings and evaluation outputs to files.
- Parameters:
backend_name – The name of the evaluation backend used (e.g., “lm_harness”).
task_params – Oumi task parameters used for this evaluation.
evaluation_result – The evaluation results to save.
base_output_dir – The base directory where the evaluation results will be saved. A subdirectory named <backend_name>_<time> is created under <base_output_dir> to hold all files related to this evaluation run. If a directory with that name already exists, a new directory with a unique index is created instead: <base_output_dir> / <backend_name>_<time>_<index>.
config – Oumi evaluation configuration settings used for the evaluation.
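For illustration, a minimal sketch of calling save_evaluation_output follows. The import paths for EvaluationTaskParams, EvaluationResult, and EvaluationConfig, as well as the constructor fields shown, are assumptions based on Oumi's module layout and may differ in your version.

```python
# Hypothetical usage sketch for save_evaluation_output.
# Assumptions: EvaluationTaskParams and EvaluationConfig are exposed via
# oumi.core.configs, and EvaluationResult via oumi.core.evaluation; the
# exact import paths and constructor arguments may differ.
from oumi.core.configs import EvaluationConfig, EvaluationTaskParams
from oumi.core.evaluation import EvaluationResult
from oumi.core.evaluation.utils.save_utils import save_evaluation_output

task_params = EvaluationTaskParams(  # illustrative fields
    evaluation_backend="lm_harness",
    task_name="mmlu",
)
evaluation_result = EvaluationResult()  # normally populated by the evaluation run
config = EvaluationConfig(tasks=[task_params])  # illustrative fields

# Writes all outputs under <base_output_dir>/<backend_name>_<time>
# (a unique _<index> suffix is appended if that directory already exists).
save_evaluation_output(
    backend_name="lm_harness",
    task_params=task_params,
    evaluation_result=evaluation_result,
    base_output_dir="./eval_results",
    config=config,
)
```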