nemo_automodel.components.launcher.skypilot.utils
Module Contents
Functions
_get_cloud | Return a sky cloud object for the given cloud name string. |
submit_skypilot_job | Launch a training job on a cloud VM via SkyPilot. |
Data
REMOTE_CONFIG_PATH |
_DEFAULT_SETUP |
_CLOUD_CLASSES |
API
- nemo_automodel.components.launcher.skypilot.utils.REMOTE_CONFIG_PATH
'/tmp/automodel_job_config.yaml'
- nemo_automodel.components.launcher.skypilot.utils._DEFAULT_SETUP
'cd ~/sky_workdir && pip install -e . --quiet'
- nemo_automodel.components.launcher.skypilot.utils._CLOUD_CLASSES
None
- nemo_automodel.components.launcher.skypilot.utils._get_cloud(cloud_name: str)
Return a sky cloud object for the given cloud name string.
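The lookup described here can be pictured as a name-to-class registry. The following is a minimal sketch, not the module's actual code: `FakeAWS`, `FakeGCP`, `get_cloud_sketch`, the lazy initialization of the registry, the case-insensitive match, and the `ValueError` on unknown names are all assumptions made for illustration, and the stand-in classes replace real SkyPilot cloud classes so the snippet runs without SkyPilot installed.

```python
from typing import Dict, Optional, Type


class FakeAWS:
    """Hypothetical stand-in for a SkyPilot cloud class."""


class FakeGCP:
    """Hypothetical stand-in for a SkyPilot cloud class."""


# Starts as None and is filled on first use (assumed lazy initialization).
_cloud_classes: Optional[Dict[str, Type]] = None


def get_cloud_sketch(cloud_name: str) -> object:
    """Return an instance of the cloud class registered under cloud_name."""
    global _cloud_classes
    if _cloud_classes is None:
        _cloud_classes = {"aws": FakeAWS, "gcp": FakeGCP}
    key = cloud_name.lower()  # Assumption: the lookup is case-insensitive.
    if key not in _cloud_classes:
        supported = ", ".join(sorted(_cloud_classes))
        raise ValueError(f"Unknown cloud {cloud_name!r}; supported: {supported}")
    return _cloud_classes[key]()
```

Deferring the registry construction keeps the import of this module cheap, which may be why the documented value of `_CLOUD_CLASSES` is `None`; that reading is a guess, not something the page states.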
- nemo_automodel.components.launcher.skypilot.utils.submit_skypilot_job(config: nemo_automodel.components.launcher.skypilot.config.SkyPilotConfig, job_dir: str)
Launch a training job on a cloud VM via SkyPilot.
The local job config written to job_dir/job_config.yaml is uploaded to REMOTE_CONFIG_PATH on the remote VM. The code in the current working directory is synced to ~/sky_workdir via SkyPilot’s workdir mechanism.
- Parameters:
config – Populated SkyPilotConfig (including the training command).
job_dir – Local directory holding the job artifacts.
- Returns:
0 on successful submission.
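The two transfers described above (job config upload and working-directory sync) can be summarized as a remote-to-local path mapping. This is a rough sketch under stated assumptions: `plan_uploads` and `REMOTE_WORKDIR` are hypothetical names introduced here, and the function only computes paths; it does not call SkyPilot or reproduce how `submit_skypilot_job` is implemented.

```python
import os
from typing import Dict

REMOTE_CONFIG_PATH = "/tmp/automodel_job_config.yaml"  # Value documented above.
REMOTE_WORKDIR = "~/sky_workdir"  # Sync target named in the description.


def plan_uploads(job_dir: str) -> Dict[str, str]:
    """Map each remote path to the local path assumed to feed it."""
    return {
        # The job config written into job_dir goes to the fixed remote path.
        REMOTE_CONFIG_PATH: os.path.join(job_dir, "job_config.yaml"),
        # The current working directory is what gets synced as the workdir.
        REMOTE_WORKDIR: os.getcwd(),
    }
```

For example, `plan_uploads("jobs/run1")` pairs `REMOTE_CONFIG_PATH` with `jobs/run1/job_config.yaml`. In SkyPilot itself the workdir sync and file mounts are configured separately, so the single dictionary here is only a convenient way to show both mappings side by side.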