Hi,
The program shut down and threw the error "Write failed: Broken pipe" while I was training on ImageNet with the command: python main.py -a resnet18 [imagenet-folder with train and val folders]
I tried several suggestions found through a Google search, but none of them helped.
I am using Ubuntu 16.04 with eight 1080 GPU cards. What I have tried so far:
1) I set num_workers to 0 and to 1 to disable multi-process data loading, as suggested here: pytorch/pytorch#2341
2) I also tried wrapping all operations in functions and calling them inside an if __name__ == '__main__': block, as suggested here: https://discuss.pytorch.org/t/brokenpipeerror-errno-32-broken-pipe-when-i-run-cifar10-tutorial-py/6224/4?u=karmus89.
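For reference, the two attempts above amount to roughly the following minimal sketch. It is not the actual main.py from the ImageNet example: the random toy dataset and the names make_loader and main are placeholders made up for illustration, and the sketch assumes PyTorch is installed (it just prints a message and exits otherwise).

```python
def make_loader(num_workers=0, batch_size=4):
    """Build a small DataLoader over random tensors (a stand-in for ImageNet)."""
    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Hypothetical toy data: 16 fake "images" with integer class labels.
    images = torch.randn(16, 3, 8, 8)
    labels = torch.randint(0, 10, (16,))
    dataset = TensorDataset(images, labels)

    # num_workers=0 loads every batch in the main process,
    # so no worker subprocesses (and no pipes to them) are created.
    return DataLoader(dataset, batch_size=batch_size,
                      shuffle=True, num_workers=num_workers)


def main():
    """Iterate once over the loader and return the number of batches seen."""
    try:
        loader = make_loader(num_workers=0)
    except ImportError:
        print("PyTorch is not installed; nothing to run.")
        return 0

    n_batches = 0
    for images, labels in loader:
        n_batches += 1
    print("iterated over", n_batches, "batches")
    return n_batches


# The guard keeps the code below from running again when a worker
# process re-imports this module (relevant with the "spawn" start method).
if __name__ == "__main__":
    main()
```

With num_workers=0 the guard should not matter, since no worker processes are started; it becomes relevant once num_workers is greater than 0 on platforms that spawn workers rather than fork them.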
Has anyone encountered the same problem? Thank you so much.