Describe the bug
The preinstalled Triton version is outdated and is affected by this issue:
pytorch/pytorch#96937
To reproduce
See attached issue.
Expected behavior
No leaks.
System information
- SageMaker Python SDK version: 2.165.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
- Framework version: 2.0.0
- Python version: 3.10
- CPU or GPU: GPU
- Custom Docker image (Y/N): N
Additional context
Are you seriously shipping a development version of a package? From the pip output:
Found existing installation: triton 2.0.0.dev20221202
Adding
triton==2.0.0.post1
to the requirements fixes the issue.
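For anyone who wants to check their own container before pinning, here is a minimal sketch that reports whether the installed triton is a development build. It only uses the standard library; the helper names `is_dev_build` and `installed_version` are made up for illustration, and the `.dev` substring check is a simple heuristic rather than a full PEP 440 parser.

```python
from importlib import metadata
from typing import Optional


def is_dev_build(version: str) -> bool:
    """Heuristic: treat any version string containing '.dev' as a development build."""
    return ".dev" in version.lower()


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("triton")
    if found is None:
        print("triton is not installed")
    elif is_dev_build(found):
        # The pin below is the one that resolved the problem in this report.
        print(f"triton {found} is a development build; consider pinning triton==2.0.0.post1")
    else:
        print(f"triton {found} is a release build")
```

Running this at the start of the training script should flag the 2.0.0.dev20221202 build mentioned above and stay quiet about 2.0.0.post1.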
Honestly, when we are paying much more for SageMaker training than for EC2, I would expect some level of support and comfort.