How to use the ramp-engine.ramp_engine.aws.api._wait_until_train_finished function in ramp-engine

To help you get started, we’ve selected a few ramp-engine examples, based on popular ways it is used in public projects.

Secure your code as it's written. Use Snyk Code to scan source code in minutes - no build needed - and fix issues immediately.

github paris-saclay-cds / ramp-board / ramp-engine / ramp_engine / aws / worker.py View on Github external
def collect_results(self):
        super().collect_results()
        if self.status == 'running':
            aws._wait_until_train_finished(
                self.config, self.instance.id, self.submission)
            self.status = 'finished'
        if self.status != 'finished':
            raise ValueError("Cannot collect results if worker is not"
                             "'running' or 'finished'")

        logger.info("Collecting submission '{}'".format(self.submission))
        aws.download_log(self.config, self.instance.id, self.submission)

        if aws._training_successful(
                self.config, self.instance.id, self.submission):
            _ = aws.download_predictions(  # noqa
                self.config, self.instance.id, self.submission)
            self.status = 'collected'
            exit_status, error_msg = 0, ''
        else:
github paris-saclay-cds / ramp-board / ramp-engine / ramp_engine / aws / aws_train.py View on Github external
Train a submission on a ready ec2 instance
    the steps followed by this function are the following:
        1) upload the submission code to the instance
        2) launch training in a screen
        3) wait until training is finished
        4) download the predictions
        5) download th log
        6) set the predictions in the database
        7) score the submission
    """
    conf_aws = config[AWS_CONFIG_SECTION]
    upload_submission(conf_aws, instance_id, submission_id)
    launch_train(conf_aws, instance_id, submission_id)
    set_submission_state(config, submission_id, 'training')
    _run_hook(config, HOOK_START_TRAINING, submission_id)
    _wait_until_train_finished(conf_aws, instance_id, submission_id)
    download_log(conf_aws, instance_id, submission_id)

    label = _get_submission_label_by_id(config, submission_id)
    submission = get_submission_by_id(config, submission_id)
    actual_nb_folds = get_event_nb_folds(config, submission.event.name)
    if _training_successful(conf_aws, instance_id, submission_id,
                            actual_nb_folds):
        logger.info('Training of "{}" was successful'.format(
            label, instance_id))
        if conf_aws[MEMORY_PROFILING_FIELD]:
            logger.info('Download max ram usage info of "{}"'.format(label))
            download_mprof_data(conf_aws, instance_id, submission_id)
            max_ram = _get_submission_max_ram(conf_aws, submission_id)
            logger.info('Max ram usage of "{}": {}MB'.format(label, max_ram))
            set_submission_max_ram(config, submission_id, max_ram)