Make sourcedir.tar.gz and repacked model.tar.gz structure consistent #3491

When deploying a model together with customer code (described by one or more of the `Model.__init__` arguments `entry_point`, `source_dir`, and `dependencies`), the SDK (more precisely, the relevant `Model.prepare_container_def` method) has two options for handling that code:
* **Either bundling all the code artifacts in a `sourcedir.tar.gz` file.** The file is staged to S3, then downloaded and extracted in the container at `/opt/ml/model/code`. If supplied, the `entry_point` file, the content of the `source_dir` directory, and each dependency in `dependencies` are copied to the root of the tar file. This behavior is implemented by the `sagemaker.fw_utils.tar_and_upload_dir` function.
* **Or repacking the model and code artifacts together in a single `model.tar.gz` file.** The file is staged to S3, then downloaded by the container's host and extracted, so that its content is available in the container at `/opt/ml/model`. Inside the `model.tar.gz` file, code artifacts (the `entry_point` file and the content of the `source_dir` directory, if supplied) are placed in a `code` folder relative to the root of the tar file. If supplied, each dependency in `dependencies` is placed in a `code/lib` folder. This behavior is implemented by the `sagemaker.utils._create_or_update_code_dir` function.

In both cases, code artifacts end up being available in the inference container at `/opt/ml/model/code`. However, an inconsistency appears when `dependencies` is used. In that case, the dependencies end up located:
* In `/opt/ml/model/code` if the code was bundled in a `sourcedir.tar.gz` file.
* In `/opt/ml/model/code/lib` if the code was repacked with the model artifacts in a `model.tar.gz` file.
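
To make the difference concrete, here is a minimal sketch of the mapping described above. It is not SDK code: `container_path` and the `repacked` flag are hypothetical names used only for illustration.

```python
import posixpath

# Directory where code artifacts end up in the inference container (from the description above).
CODE_DIR = "/opt/ml/model/code"


def container_path(dependency, repacked):
    """Return where a dependency would land in the container (illustrative only).

    `repacked=False` models the `sourcedir.tar.gz` case,
    `repacked=True` models the repacked `model.tar.gz` case.
    """
    name = posixpath.basename(dependency.rstrip("/"))
    base = posixpath.join(CODE_DIR, "lib") if repacked else CODE_DIR
    return posixpath.join(base, name)


print(container_path("my_utils", repacked=False))  # /opt/ml/model/code/my_utils
print(container_path("my_utils", repacked=True))   # /opt/ml/model/code/lib/my_utils
```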

The SageMaker inference toolkits automatically add `/opt/ml/model` and `/opt/ml/model/code` to `sys.path`, but not `/opt/ml/model/code/lib`. Therefore, `dependencies` located in the latter directory cannot be imported unless the user manually adds this location to `sys.path`. This ultimately comes down to the inconsistency in the file structure, which is annoying because the choice between a `sourcedir.tar.gz` and a repacked `model.tar.gz` is opaque to the user (and highly framework-dependent).
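
For reference, the manual workaround can be sketched as follows. This is only one possible way to do it, assuming the snippet runs at the top of the entry point script before any dependency is imported; `add_lib_to_sys_path` is a hypothetical helper name, not part of the SDK or the toolkits.

```python
import os
import sys


def add_lib_to_sys_path(code_dir="/opt/ml/model/code"):
    """Prepend `<code_dir>/lib` to sys.path if that directory exists.

    Returns True if the path was added, False otherwise.
    """
    lib_dir = os.path.join(code_dir, "lib")
    if os.path.isdir(lib_dir) and lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
        return True
    return False


# Call this before importing anything that was passed through `dependencies`.
add_lib_to_sys_path()
```

When the code was bundled in a `sourcedir.tar.gz` file, the `lib` directory does not exist and the call is a no-op, so the same entry point works in both cases.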

*Notice*: We do not consider the Multi-Model Enabled (MME) mode here.

IMHO, the solution with minimal impact would be not to create a `code/lib` directory in the case of the repacked `model.tar.gz`: `dependencies` would simply be copied to the `code` directory. Dependencies from a repacked `model.tar.gz` would then be directly available under `/opt/ml/model/code`, which is already automatically added to `sys.path` by the inference toolkits. This solution would simply align the structure of the repacked `model.tar.gz` file with the structure of the `sourcedir.tar.gz` file. Since the latter layout is already in use, this fix should not raise backward-compatibility issues.
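
A rough sketch of the proposed behavior is shown below. It is not the actual implementation of `sagemaker.utils._create_or_update_code_dir`; `stage_dependencies` and its `use_lib_subdir` flag are hypothetical and only contrast the current layout with the proposed one.

```python
import os
import shutil


def stage_dependencies(code_dir, dependencies, use_lib_subdir=False):
    """Copy each dependency (file or directory) into the code directory.

    use_lib_subdir=True mimics the current repacking layout (`code/lib`);
    use_lib_subdir=False is the proposed layout (directly under `code`).
    Returns the directory the dependencies were copied into.
    """
    target = os.path.join(code_dir, "lib") if use_lib_subdir else code_dir
    os.makedirs(target, exist_ok=True)
    for dep in dependencies:
        name = os.path.basename(os.path.normpath(dep))
        dest = os.path.join(target, name)
        if os.path.isdir(dep):
            # dirs_exist_ok requires Python 3.8+
            shutil.copytree(dep, dest, dirs_exist_ok=True)
        else:
            shutil.copy2(dep, dest)
    return target
```

With `use_lib_subdir=False`, a dependency directory named `my_utils` would end up at `code/my_utils`, which matches the `sourcedir.tar.gz` layout.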

This topic directly relates to the following issues:
* [Issue 1065](https://github.com/aws/sagemaker-python-sdk/issues/1065) - Failed to import code copied into the /opt/ml/model/code/lib directory
* [Issue 1832](https://github.com/aws/sagemaker-python-sdk/issues/1832) - Extra lib directory when adding dependencies for PyTorchModel
