oumi.core.launcher

Launcher module for the Oumi (Open Universal Machine Intelligence) library.

This module provides base classes for cloud and cluster management in the Oumi framework.

These classes serve as foundations for implementing cloud-specific and cluster-specific launchers for running machine learning jobs.

class oumi.core.launcher.BaseCloud

Bases: ABC

Base class for a resource pool capable of creating clusters.

abstractmethod get_cluster(name: str) -> BaseCluster | None

Gets the cluster with the specified name, or None if not found.

abstractmethod list_clusters() -> list[BaseCluster]

Lists the active clusters on this cloud.

abstractmethod up_cluster(job: JobConfig, name: str | None, **kwargs) -> JobStatus

Creates a cluster and starts the provided job on it.
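
To make the BaseCloud contract concrete, here is a minimal in-memory sketch. It does not import oumi: `JobConfig`, `JobStatus`, and `BaseCloud` below are local stand-ins that only mirror the names and signatures documented on this page, and `InMemoryCluster`, `InMemoryCloud`, the `"RUNNING"` label, and the fallback naming scheme are hypothetical choices made for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class JobConfig:
    """Stand-in for oumi's JobConfig; only a name is modeled here."""

    name: str


@dataclass
class JobStatus:
    """Stand-in with the six documented JobStatus fields."""

    name: str
    id: str
    status: str
    cluster: str
    metadata: str
    done: bool


class BaseCloud(ABC):
    """Stand-in mirroring the three documented abstract methods."""

    @abstractmethod
    def get_cluster(self, name: str) -> Any | None: ...

    @abstractmethod
    def list_clusters(self) -> list[Any]: ...

    @abstractmethod
    def up_cluster(self, job: JobConfig, name: str | None, **kwargs) -> JobStatus: ...


class InMemoryCluster:
    """Hypothetical cluster record; a real one would subclass BaseCluster."""

    def __init__(self, cluster_name: str) -> None:
        self.cluster_name = cluster_name
        self.jobs: list[JobStatus] = []


class InMemoryCloud(BaseCloud):
    """Hypothetical cloud that keeps its clusters in a dict."""

    def __init__(self) -> None:
        self._clusters: dict[str, InMemoryCluster] = {}

    def get_cluster(self, name: str) -> InMemoryCluster | None:
        # Mirrors the documented contract: None when the name is unknown.
        return self._clusters.get(name)

    def list_clusters(self) -> list[InMemoryCluster]:
        return list(self._clusters.values())

    def up_cluster(self, job: JobConfig, name: str | None, **kwargs) -> JobStatus:
        # Assumption: when no name is given, derive one from the cluster count.
        cluster_name = name or f"cluster-{len(self._clusters)}"
        cluster = self._clusters.setdefault(cluster_name, InMemoryCluster(cluster_name))
        status = JobStatus(
            name=job.name,
            id=str(len(cluster.jobs)),
            status="RUNNING",  # Arbitrary label; this page does not enumerate status values.
            cluster=cluster_name,
            metadata="",
            done=False,
        )
        cluster.jobs.append(status)
        return status


cloud = InMemoryCloud()
first = cloud.up_cluster(JobConfig(name="train"), "demo")
print(first.cluster, first.id, first.done)
print(cloud.get_cluster("missing"))
```

Because all three methods are abstract, a subclass that omits any of them cannot be instantiated, which is what makes the base class usable as a contract for cloud-specific launchers.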

class oumi.core.launcher.BaseCluster

Bases: ABC

Base class for a compute cluster (job queue).

abstractmethod cancel_job(job_id: str) -> JobStatus

Cancels the specified job on this cluster.

abstractmethod down() -> None

Tears down the current cluster.

abstractmethod get_job(job_id: str) -> JobStatus

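Gets the job with the specified ID on this cluster, or None if no such job exists. The declared return type is JobStatus rather than JobStatus | None, so callers should still check the result for None.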

abstractmethod get_jobs() -> list[JobStatus]

Lists the jobs on this cluster.

abstractmethod name() -> str

Gets the name of the cluster.

abstractmethod run_job(job: JobConfig) -> JobStatus

Runs the specified job on this cluster.

abstractmethod stop() -> None

Stops the current cluster.
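
As an illustration of how the job-queue methods compose, the sketch below cancels every unfinished job on a cluster using only `get_jobs()`, `cancel_job()`, and the `JobStatus.id` and `JobStatus.done` fields documented on this page. It does not import oumi: `JobStatus` is a local stand-in, and `FakeCluster`, `cancel_unfinished`, and the status labels are hypothetical names introduced for the example. A real cluster would subclass BaseCluster and implement all seven abstract methods.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass
class JobStatus:
    """Stand-in with the six documented JobStatus fields."""

    name: str
    id: str
    status: str
    cluster: str
    metadata: str
    done: bool


class FakeCluster:
    """Hypothetical in-memory cluster exposing two of the documented methods."""

    def __init__(self, jobs: List[JobStatus]) -> None:
        self._jobs: Dict[str, JobStatus] = {job.id: job for job in jobs}

    def get_jobs(self) -> List[JobStatus]:
        # Return a copy so callers can iterate while jobs are being updated.
        return list(self._jobs.values())

    def cancel_job(self, job_id: str) -> JobStatus:
        # Assumption: a canceled job is terminal, so `done` becomes True.
        canceled = replace(self._jobs[job_id], status="CANCELED", done=True)
        self._jobs[job_id] = canceled
        return canceled


def cancel_unfinished(cluster) -> List[JobStatus]:
    """Cancels every job that is not yet done and returns the new statuses."""
    return [cluster.cancel_job(job.id) for job in cluster.get_jobs() if not job.done]


cluster = FakeCluster(
    [
        JobStatus("train", "1", "RUNNING", "demo", "", False),
        JobStatus("eval", "2", "SUCCEEDED", "demo", "", True),
    ]
)
canceled = cancel_unfinished(cluster)
print([job.id for job in canceled])
```

The helper is written against the method names rather than a concrete class, so under these assumptions it would work with any BaseCluster implementation.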

class oumi.core.launcher.JobStatus(name: str, id: str, status: str, cluster: str, metadata: str, done: bool)

Bases: object

Dataclass to hold the status of a job.

cluster: str

The cluster to which the job belongs.

done: bool

A flag indicating whether the job is done. True only if the job is in a terminal state (e.g. completed, failed, or canceled).

id: str

The unique identifier of the job on the cluster.

metadata: str

Miscellaneous metadata about the job.

name: str

The display name of the job.

status: str

The status of the job.
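
Since `done` is True only in a terminal state, it is a natural stopping condition for a polling loop. The following is a minimal sketch under stated assumptions: `JobStatus` is a local stand-in for the documented dataclass, and `wait_until_done`, its parameters, and the status labels are hypothetical and not part of oumi. In real use, `fetch` might wrap a call such as `cluster.get_job(job_id)`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JobStatus:
    """Stand-in with the six documented JobStatus fields."""

    name: str
    id: str
    status: str
    cluster: str
    metadata: str
    done: bool


def wait_until_done(
    fetch: Callable[[], Optional[JobStatus]],
    poll_seconds: float = 1.0,
    max_polls: int = 60,
) -> Optional[JobStatus]:
    """Polls fetch() until the job is done, disappears, or polls run out."""
    latest: Optional[JobStatus] = None
    for _ in range(max_polls):
        latest = fetch()
        # Stop on a terminal state, or when the job can no longer be found.
        if latest is None or latest.done:
            return latest
        time.sleep(poll_seconds)
    # Polls exhausted: return the last status seen, which is not done.
    return latest


# Simulated sequence of statuses, standing in for repeated cluster queries.
states = iter(
    [
        JobStatus("train", "1", "PENDING", "demo", "", False),
        JobStatus("train", "1", "RUNNING", "demo", "", False),
        JobStatus("train", "1", "SUCCEEDED", "demo", "", True),
    ]
)
final = wait_until_done(lambda: next(states), poll_seconds=0)
print(final.status if final else "job not found")
```

Checking `done` rather than comparing `status` against specific strings keeps the loop independent of whatever status labels a given backend reports.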