Extra quotation marks from training job description #4409

Open
@b5y

Describe the bug
Extra quotation marks are returned when you read sagemaker_submit_directory from the HyperParameters of the training job description.

To reproduce
Suppose we have a PyTorch estimator and run the following:

import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorch


boto_session = boto3.session.Session()
sagemaker_session = sagemaker.Session(boto_session=boto_session)
role = get_execution_role(sagemaker_session=sagemaker_session)
code_location = "s3://path/to/code/location/sourcedir.tar.gz"
output_path = "s3://path/to/output"
train_instance_type="ml.g5.4xlarge"

def get_hyperparameters():
    # Placeholder: fill in your own hyperparameters as key/value pairs.
    hyperparameters = {
        # "some hyperparameters"
    }

    return hyperparameters

estimator = PyTorch(
            entry_point="train.py",
            source_dir="./source_dir",  # directory of your training script
            code_location=code_location,
            role=role,
            framework_version="2.1",
            py_version="py310",
            instance_type=train_instance_type,
            instance_count=1,
            volume_size=10,  # size of the storage volume in GB
            output_path=output_path,
            hyperparameters=get_hyperparameters()
        )
train_data_loc = "s3://path/to/train/data"
val_data_loc = ''
test_data_loc = ''
channels = {
    'training': train_data_loc,
    # 'validation': val_data_loc,
    # 'test': test_data_loc
}

training_job_name = "some-training-job-name"
estimator.fit(
    inputs=channels,
    wait=False,
    job_name=training_job_name
)

describe_training_job = estimator.latest_training_job.describe()
model_data_url = describe_training_job["ModelArtifacts"]["S3ModelArtifacts"]
# PROBLEM IS HERE
source_dir = describe_training_job['HyperParameters']['sagemaker_submit_directory']

source_dir is returned as a string wrapped in an extra pair of double quotes, in the form '"s3://path/to/the/sourcedir.tar.gz"', so every time I have to call source_dir.strip('"') before using it.
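As a possible workaround until this is addressed, here is a minimal sketch that decodes the value instead of stripping characters by hand. It assumes the extra quotes appear because the hyperparameter value is stored as a JSON-encoded string, which is what the returned format suggests but is not confirmed in this report. The helper name decode_hyperparameter is hypothetical and not part of the SageMaker SDK.

```python
import json


def decode_hyperparameter(value):
    """Return the decoded form of a hyperparameter string.

    Hypothetical helper (not part of the SageMaker SDK). Assumes the
    value may be JSON-encoded, e.g. '"s3://bucket/key"'. If it is not
    valid JSON, the original string is returned unchanged.
    """
    try:
        return json.loads(value)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError; a plain,
        # unquoted string such as 's3://bucket/key' ends up here.
        return value


# Example with the value format reported above.
raw_source_dir = '"s3://path/to/the/sourcedir.tar.gz"'
source_dir = decode_hyperparameter(raw_source_dir)
print(source_dir)
```

Unlike strip('"'), this only removes the quotes when they are part of a valid JSON string, and it leaves an already unquoted value untouched.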

Expected behavior
The value of sagemaker_submit_directory should be returned without the extra quotation marks.

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.199.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): from sagemaker.pytorch import PyTorch
  • Framework version: not able to import torch in ml.t3.medium instance
  • Python version: 3.10.6
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Additional context
This happens on an ml.t3.medium instance in SageMaker Studio (SageMaker Domains).
