Issue
I am trying to use an XGBoost model in SageMaker to score a large dataset stored in S3 using Batch Transform.
I build the model using the built-in SageMaker XGBoost container as follows:
estimator = sagemaker.estimator.Estimator(image_name=container,
                                          hyperparameters=hyperparameters,
                                          role=sagemaker.get_execution_role(),
                                          train_instance_count=1,
                                          train_instance_type='ml.m5.2xlarge',
                                          train_volume_size=5,  # 5 GB
                                          output_path=output_path,
                                          train_use_spot_instances=True,
                                          train_max_run=300,
                                          train_max_wait=600)
estimator.fit({'train': s3_input_train, 'validation': s3_input_test})
The following code runs the Batch Transform job:
# The location of the test dataset
batch_input = 's3://{}/{}/test/examples'.format(bucket, prefix)
# The location to store the results of the batch transform job
batch_output = 's3://{}/{}/batch-inference'.format(bucket, prefix)
# Create a transformer from the trained estimator
transformer = estimator.transformer(instance_count=1, instance_type='ml.m4.xlarge', output_path=batch_output)
transformer.transform(data=batch_input, data_type='S3Prefix', content_type='text/csv', split_type='Line')
transformer.wait()
The above code works fine in the development environment (a Jupyter notebook) when the model is built in the same notebook. However, I would like to deploy the model and invoke it for Batch Transform.
Most examples of SageMaker endpoint creation cover scoring a single record, not batch transform.
Can someone point me to how to deploy a model and use it for Batch Transform in SageMaker? Thank you.
Solution
The following link has an example of how to call a stored model in SageMaker to run a Batch Transform job. The key point is that Batch Transform does not require a deployed real-time endpoint: you recreate a model object from the stored training artifacts and launch a transform job from it.
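Below is a minimal sketch of that approach, assuming the SageMaker Python SDK v1 (the same API generation as the estimator code in the question). The bucket, prefix, batch_input, and batch_output variables are assumed to be the same as above; the model_data path and the XGBoost repo_version tag are placeholders you would replace with your training job's actual output location and container version.

# Minimal sketch, assuming SageMaker Python SDK v1.
# bucket, prefix, batch_input, and batch_output are assumed to be
# defined as in the question; paths and version tags are placeholders.
import sagemaker
from sagemaker.model import Model
from sagemaker.amazon.amazon_estimator import get_image_uri

session = sagemaker.Session()
role = sagemaker.get_execution_role()

# Built-in XGBoost container for the current region (placeholder version tag)
container = get_image_uri(session.boto_region_name, 'xgboost', repo_version='1.0-1')

# Recreate the model from the model.tar.gz that training wrote under output_path
xgb_model = Model(model_data='s3://{}/{}/output/model.tar.gz'.format(bucket, prefix),
                  image=container,
                  role=role,
                  sagemaker_session=session)

# Launch the Batch Transform job straight from the stored model;
# no real-time endpoint is created or needed
transformer = xgb_model.transformer(instance_count=1,
                                    instance_type='ml.m4.xlarge',
                                    output_path=batch_output)
transformer.transform(data=batch_input,
                      data_type='S3Prefix',
                      content_type='text/csv',
                      split_type='Line')
transformer.wait()

Alternatively, if the original training job still exists, sagemaker.estimator.Estimator.attach('<training-job-name>') rebuilds the estimator, and you can then call estimator.transformer(...) directly, exactly as in the question's code.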
Answered By - Ravi