syne_tune.backend.sagemaker_backend.sagemaker_utils module
- syne_tune.backend.sagemaker_backend.sagemaker_utils.default_config()[source]
https://aws.amazon.com/premiumsupport/knowledge-center/sagemaker-python-throttlingexception/
- Return type:
Config
- Returns:
Default config which avoids throttling
- syne_tune.backend.sagemaker_backend.sagemaker_utils.get_log(jobname, log_client=None)[source]
- Parameters:
jobname (str) – Name of a SageMaker training job
log_client – A log client, for instance boto3.client('logs'). If None, the client is instantiated with the default AWS configuration
- Return type:
List[str]
- Returns:
Lines appearing in the log of the SageMaker training job
- syne_tune.backend.sagemaker_backend.sagemaker_utils.sagemaker_search(trial_ids_and_names, sm_client=None, log_client=None)[source]
- Parameters:
trial_ids_and_names (List[Tuple[int, str]]) – Trial IDs and SageMaker job names to retrieve information from
sm_client – SageMaker client used to search for jobs
log_client – Log client used to query job logs
- Return type:
List[TrialResult]
- Returns:
List of dictionaries containing job information (status, creation time, metrics, hyperparameters, etc.). In terms of speed, around 100 jobs can be retrieved per second.
- syne_tune.backend.sagemaker_backend.sagemaker_utils.metric_definitions_from_names(metrics_names)[source]
- Parameters:
metrics_names (List[str]) – Names of the metrics present in the log. Metrics must be written in the log as [metric-name]: value, for instance [accuracy]: 0.23
- Returns:
A list of metric dictionaries that can be passed to SageMaker as metric_definitions, so that metrics are parsed from the logs
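To make the [metric-name]: value convention concrete, here is a minimal sketch of how such metric definitions could be built. The function name `sketch_metric_definitions` and the exact regular expression are assumptions for illustration and are not taken from the library. Only the "Name"/"Regex" dictionary layout follows the format SageMaker expects for metric_definitions.

```python
import re
from typing import Dict, List


def sketch_metric_definitions(metrics_names: List[str]) -> List[Dict[str, str]]:
    """Hypothetical stand-in, not the library's implementation.

    Builds one {"Name", "Regex"} dictionary per metric, where the regex
    captures the number following "[metric-name]: " in a log line.
    """
    # Assumed number pattern: optional sign, decimal digits, optional exponent
    number = r"([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
    return [
        {"Name": name, "Regex": rf"\[{re.escape(name)}\]: {number}"}
        for name in metrics_names
    ]


definitions = sketch_metric_definitions(["accuracy", "loss"])
# Check the first regex against a log line written in the documented format
match = re.search(definitions[0]["Regex"], "epoch 3 [accuracy]: 0.23")
parsed_value = float(match.group(1))
```

A list of this shape is what an estimator's metric_definitions argument takes. The regex the library actually generates may differ from the one assumed here.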
- syne_tune.backend.sagemaker_backend.sagemaker_utils.add_metric_definitions_to_sagemaker_estimator(estimator, metrics_names)[source]
Adds metric definitions according to metric_definitions_from_names() to estimator for each name in metrics_names. The regexp for each name is compatible with how Reporter outputs metric values.
- Parameters:
estimator (EstimatorBase) – SageMaker estimator
metrics_names (List[str]) – Names of metrics to be appended
- syne_tune.backend.sagemaker_backend.sagemaker_utils.sagemaker_fit(sm_estimator, hyperparameters, checkpoint_s3_uri=None, wait=False, job_name=None, *sagemaker_fit_args, **sagemaker_fit_kwargs)[source]
- Parameters:
sm_estimator (Framework) – SageMaker estimator to be fitted
hyperparameters (Dict[str, object]) – Dictionary of hyperparameters that are passed to entry_point_script
checkpoint_s3_uri (Optional[str]) – checkpoint_s3_uri of the SageMaker estimator
wait (bool) – Whether to wait for job completion
metrics_names – Names of metrics to track, as reported with report.py. If these metrics are passed, their learning curves are shown in the SageMaker console
- Returns:
Name of the SageMaker job
- syne_tune.backend.sagemaker_backend.sagemaker_utils.get_execution_role()[source]
- Returns:
SageMaker execution role specified by the environment variable AWS_ROLE. If it is not set, the role is inferred by searching for the role associated with SageMaker. Note that import sagemaker; sagemaker.get_execution_role() does not return the right role outside of a SageMaker notebook.
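The lookup order described above (environment variable first, inference second) can be sketched as follows. The function `resolve_execution_role` and its `infer_role` callback are hypothetical names used only for illustration. How the library actually searches for the SageMaker role is not shown here.

```python
import os
from typing import Callable


def resolve_execution_role(infer_role: Callable[[], str]) -> str:
    """Hypothetical sketch of the documented lookup order.

    Returns the value of the AWS_ROLE environment variable if it is set,
    and otherwise falls back to the caller-supplied inference function.
    """
    role = os.environ.get("AWS_ROLE")
    if role:
        return role
    # Assumption: the fallback is supplied by the caller in this sketch
    return infer_role()
```

Setting AWS_ROLE explicitly is the simplest way to control which role is used when running outside of a SageMaker notebook.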
- syne_tune.backend.sagemaker_backend.sagemaker_utils.download_sagemaker_results(s3_path=None)[source]
Download results obtained after running tuning remotely on SageMaker, e.g. when using RemoteLauncher.
- syne_tune.backend.sagemaker_backend.sagemaker_utils.map_identifier_limited_length(name, max_length=63, rnd_digits=4)[source]
If name is longer than max_length characters, it is mapped to a new identifier of length max_length, being the concatenation of the first max_length - rnd_digits characters of name, followed by a random string of length rnd_digits.
- Parameters:
name (str) – Identifier to be limited in length
max_length (int) – Maximum length for output
rnd_digits (int) – See above
- Return type:
str
- Returns:
See above
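The truncation rule described for this function can be illustrated with a short sketch. The name `limit_identifier_length` and the alphabet used for the random suffix (lowercase letters and digits) are assumptions. The library may draw its random string differently.

```python
import random
import string


def limit_identifier_length(name: str, max_length: int = 63, rnd_digits: int = 4) -> str:
    """Hypothetical sketch of the documented behaviour, not the library code."""
    if len(name) <= max_length:
        # Short enough already: returned unchanged
        return name
    # Assumed alphabet for the random suffix
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(random.choices(alphabet, k=rnd_digits))
    # Keep the first max_length - rnd_digits characters, then append the suffix
    return name[: max_length - rnd_digits] + suffix


long_name = "trial-" + "x" * 80
shortened = limit_identifier_length(long_name)
```

With the defaults, `shortened` has exactly 63 characters: the first 59 characters of `long_name` followed by 4 random ones.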
- syne_tune.backend.sagemaker_backend.sagemaker_utils.s3_copy_objects_recursively(s3_source_path, s3_target_path)[source]
Recursively copies objects from s3_source_path to s3_target_path.
We return a dict with ‘num_action_calls’, ‘num_successful_action_calls’, ‘first_error_message’ (the error message for the first failed action call encountered).
Note: This function should not be used to copy a large number of objects, as it is rather slow (one API call per object).
- Parameters:
s3_source_path (str) –
s3_target_path (str) –
- Return type:
Dict[str, Any]
- Returns:
See above
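The shape of the returned dictionary can be sketched with a generic per-object loop. The helper `run_action_per_object` and the fake copy action are hypothetical and make no S3 calls. Using None for ‘first_error_message’ when nothing fails is also an assumption.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def run_action_per_object(
    keys: Iterable[str], action: Callable[[str], None]
) -> Dict[str, Any]:
    """Hypothetical sketch: apply `action` to each key and summarize outcomes."""
    num_calls = 0
    num_successful = 0
    first_error: Optional[str] = None
    for key in keys:
        num_calls += 1
        try:
            action(key)
            num_successful += 1
        except Exception as ex:
            # Only the message of the first failure is recorded
            if first_error is None:
                first_error = str(ex)
    return {
        "num_action_calls": num_calls,
        "num_successful_action_calls": num_successful,
        "first_error_message": first_error,
    }


def fake_copy(key: str) -> None:
    # Stand-in for a per-object copy call; fails for one key
    if key == "b.txt":
        raise RuntimeError(f"cannot copy {key}")


summary = run_action_per_object(["a.txt", "b.txt", "c.txt"], fake_copy)
```

Comparing ‘num_action_calls’ with ‘num_successful_action_calls’ in the result is a quick way to detect partial failures.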
- syne_tune.backend.sagemaker_backend.sagemaker_utils.s3_delete_objects_recursively(s3_path)[source]
Recursively deletes objects from s3_path.
We return a dict with ‘num_action_calls’, ‘num_successful_action_calls’, ‘first_error_message’ (the error message for the first failed action call encountered).
Note: This function should not be used to delete a large number of objects, as it is rather slow (one API call per object).
- Parameters:
s3_path (str) –
- Return type:
Dict[str, Any]
- Returns:
See above
- syne_tune.backend.sagemaker_backend.sagemaker_utils.s3_download_files_recursively(s3_source_path, target_path, valid_postfixes=None)[source]
Recursively downloads objects from s3_source_path and stores them locally as files below target_path.
We return a dict with ‘num_action_calls’, ‘num_successful_action_calls’, ‘first_error_message’ (the error message for the first failed action call encountered).
If valid_postfixes is given, only such objects are downloaded for which object_key.endswith(postfix) for some postfix in valid_postfixes.
Note: This function should not be used to download a large number of objects, as it is rather slow (one API call per object). In this case, running aws s3 sync can be much faster.
- Parameters:
s3_source_path (str) – See above
target_path (str) – See above
valid_postfixes (Optional[List[str]]) – See above, optional
- Return type:
Dict[str, Any]
- Returns:
See above
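The valid_postfixes rule amounts to a suffix test on each object key. A minimal sketch, with the hypothetical helper `keep_object_key` and made-up example keys:

```python
from typing import List, Optional


def keep_object_key(object_key: str, valid_postfixes: Optional[List[str]] = None) -> bool:
    """Hypothetical sketch of the documented filter, not the library code."""
    if valid_postfixes is None:
        # No filter given: every object is kept
        return True
    # str.endswith accepts a tuple and is True if any postfix matches
    return object_key.endswith(tuple(valid_postfixes))


# Example keys are invented for illustration
keys = ["exp/results.csv.zip", "exp/metadata.json", "exp/tuner.dill"]
selected = [k for k in keys if keep_object_key(k, ["metadata.json", ".zip"])]
```

Here `selected` contains the first two keys only, since "exp/tuner.dill" ends with neither postfix.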
- syne_tune.backend.sagemaker_backend.sagemaker_utils.backend_path_not_synced_to_s3()[source]
When an experiment with the local backend is run remotely (as a SageMaker training job), we do not want checkpoints to be synced to S3, since this is expensive and error-prone (several trials may write checkpoints at the same time). Pass the returned path to trial_backend_path when constructing the Tuner.
Here, we direct checkpoint writing to /opt/ml/input/data/, which is mounted on a partition with sufficient space. Unlike /opt/ml/checkpoints, this directory is not synced to S3.
- Return type:
Path
- Returns:
Path to set in local backend