Description
When deploying a model together with customer code (described by one or more `Model.__init__` arguments among `entry_point`, `source_dir`, `dependencies`), the SDK (actually the relevant `Model.prepare_container_def` method) has 2 options for the customer code:
- Either bundling all the code artifacts in a `sourcedir.tar.gz` file. The file is then staged to S3 and later downloaded and extracted in the container at `/opt/ml/model/code`. If supplied, the `entry_point` file is copied to the root of the tar file. If supplied, the content of the `source_dir` directory is copied to the root of the tar file. If supplied, each dependency in `dependencies` is copied to the root of the tar file. This behavior is implemented by the `sagemaker.fw_utils.tar_and_upload_dir` function.
- Or repacking the model and code artifacts together in a single `model.tar.gz` file. The file is then staged to S3 and later downloaded by the container's host and made available in the container at `/opt/ml/model`, where it is extracted. From the `model.tar.gz` file perspective, code artifacts (the `entry_point` file if supplied and the content of the `source_dir` directory if supplied) are placed in a `code` folder (location is relative to the root of the tar file). If supplied, each dependency in `dependencies` is placed in a `code/lib` folder. This behavior is implemented by the `sagemaker.utils._create_or_update_code_dir` function.
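To make the two layouts concrete, here is a minimal sketch that rebuilds both archives locally with the standard `tarfile` module and lists their contents. It only mimics the structure described above and is not the SDK's implementation; the helper names (`build_sourcedir_tar`, `build_repacked_model_tar`, `list_files`, `demo`) and the file names (`inference.py`, `my_lib`, `model.pth`) are made up for illustration.

```python
import os
import tarfile
import tempfile


def _write(path, text=""):
    """Create a small file, including any missing parent directories."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


def build_sourcedir_tar(tar_path, entry_point, dependencies):
    """Layout 1: entry point and dependencies sit at the root of the tar."""
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(entry_point, arcname=os.path.basename(entry_point))
        for dep in dependencies:
            tar.add(dep, arcname=os.path.basename(dep))


def build_repacked_model_tar(tar_path, model_file, entry_point, dependencies):
    """Layout 2: code goes under code/, dependencies under code/lib/."""
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(model_file, arcname=os.path.basename(model_file))
        tar.add(entry_point, arcname="code/" + os.path.basename(entry_point))
        for dep in dependencies:
            tar.add(dep, arcname="code/lib/" + os.path.basename(dep))


def list_files(tar_path):
    """Return the sorted names of the regular files stored in the archive."""
    with tarfile.open(tar_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


def demo():
    """Build both archives from the same inputs and return their file lists."""
    with tempfile.TemporaryDirectory() as tmp:
        entry_point = os.path.join(tmp, "src", "inference.py")
        dep_init = os.path.join(tmp, "src", "my_lib", "__init__.py")
        model_file = os.path.join(tmp, "model", "model.pth")
        for path in (entry_point, dep_init, model_file):
            _write(path)
        dep_dir = os.path.dirname(dep_init)

        sourcedir_tar = os.path.join(tmp, "sourcedir.tar.gz")
        model_tar = os.path.join(tmp, "model.tar.gz")
        build_sourcedir_tar(sourcedir_tar, entry_point, [dep_dir])
        build_repacked_model_tar(model_tar, model_file, entry_point, [dep_dir])
        return list_files(sourcedir_tar), list_files(model_tar)


if __name__ == "__main__":
    sourcedir_files, model_files = demo()
    print("sourcedir.tar.gz:", sourcedir_files)
    print("model.tar.gz:   ", model_files)
```

Since `sourcedir.tar.gz` is extracted at `/opt/ml/model/code` and `model.tar.gz` at `/opt/ml/model`, the same `my_lib` package would land at `/opt/ml/model/code/my_lib` in the first case and at `/opt/ml/model/code/lib/my_lib` in the second.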
In both cases, code artifacts end up being available in the inference container at `/opt/ml/model/code`. However, an inconsistency appears if we use `dependencies`. In that case, our `dependencies` end up being located:
- In `/opt/ml/model/code` if the code was bundled in a `sourcedir.tar.gz` file.
- In `/opt/ml/model/code/lib` if the code was repacked with the model artifacts in a `model.tar.gz` file.
The SageMaker inference toolkits automatically add `/opt/ml/model` and `/opt/ml/model/code` to `sys.path`, but not `/opt/ml/model/code/lib`. Therefore, `dependencies` located in the latter directory cannot be imported using the Python import system. The user/customer has to manually add this location to `sys.path` for their `dependencies` to be importable. This ultimately boils down to the inconsistency in the file structure, which is annoying since the process of opting for a `sourcedir.tar.gz` or a repacked `model.tar.gz` is opaque to the user (and highly framework-dependent).
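For reference, the manual workaround mentioned above could look like the sketch below, placed at the top of the entry-point script before any import of the dependencies. The helper name `add_lib_to_sys_path` is hypothetical; the only assumption is that the container exposes the code directory at `/opt/ml/model/code`, as described in this issue.

```python
import os
import sys


def add_lib_to_sys_path(code_dir="/opt/ml/model/code"):
    """Make <code_dir>/lib importable if it exists.

    Returns True if sys.path was modified, False otherwise.
    """
    lib_dir = os.path.join(code_dir, "lib")
    if os.path.isdir(lib_dir) and lib_dir not in sys.path:
        # Insert at the front so the bundled dependencies take precedence.
        sys.path.insert(0, lib_dir)
        return True
    return False


# Harmless when the lib directory is absent (e.g. sourcedir.tar.gz layout).
add_lib_to_sys_path()
```

Because the call is a no-op when `code/lib` does not exist, the same entry point works with both packaging layouts.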
Notice: We do not consider the Multi-Model Enabled (MME) mode here.
IMHO, the solution with minimal impact would be not to create a `code/lib` directory in the case of the repacked `model.tar.gz`: `dependencies` would simply be copied to the `code` directory. Dependencies from a repacked `model.tar.gz` would then be directly available under `/opt/ml/model/code`, which is already automatically added to `sys.path` by the inference toolkits. This solution would in fact simply align the structure of the repacked `model.tar.gz` file with the structure of the `sourcedir.tar.gz`. The latter being already in use, this fix should not raise backward-compatibility issues.
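A rough sketch of the proposed behavior is shown below. It is not the actual body of `sagemaker.utils._create_or_update_code_dir`; `copy_dependencies` and its `use_lib_subdir` flag are hypothetical names used only to contrast the current layout (`code/lib`) with the proposed one (`code`).

```python
import os
import shutil


def copy_dependencies(dependencies, code_dir, use_lib_subdir=False):
    """Copy each dependency (file or directory) into the code directory.

    use_lib_subdir=True mimics the current repacking layout (code/lib);
    the default mimics the proposed layout (directly under code).
    Returns the directory the dependencies were copied into.
    """
    target = os.path.join(code_dir, "lib") if use_lib_subdir else code_dir
    os.makedirs(target, exist_ok=True)
    for dep in dependencies:
        dest = os.path.join(target, os.path.basename(os.path.normpath(dep)))
        if os.path.isdir(dep):
            # dirs_exist_ok requires Python 3.8 or later.
            shutil.copytree(dep, dest, dirs_exist_ok=True)
        else:
            shutil.copy2(dep, dest)
    return target
```

With the default, a dependency directory `my_lib` would end up at `code/my_lib` inside the repacked archive, matching what the `sourcedir.tar.gz` layout already produces once extracted.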
This topic directly relates to the following issues:
- Issue 1065 - Failed to import code copied into the /opt/ml/model/code/lib directory
- Issue 1832 - Extra lib directory when adding dependencies for PyTorchModel