Description
Downloading the fridge datasets for image classification, object detection, and instance segmentation from the URLs below intermittently fails with the HTTP 404 error shown in the traceback that follows.
Dataset URLs:
1/ https://cvbp-secondary.z19.web.core.windows.net/datasets/image_classification/fridgeObjects.zip
2/ https://cvbp-secondary.z19.web.core.windows.net/datasets/image_classification/multilabelFridgeObjects.zip
3/ https://cvbp-secondary.z19.web.core.windows.net/datasets/object_detection/odFridgeObjects.zip
4/ https://cvbp-secondary.z19.web.core.windows.net/datasets/object_detection/odFridgeObjectsMask.zip
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
Cell In[3], line 23
20 data_file = os.path.join(dataset_parent_dir, f"{dataset_name}.zip")
22 # Download the dataset
---> 23 urllib.request.urlretrieve(download_url, filename=data_file)
25 # extract files
26 with ZipFile(data_file, "r") as zip:
File c:\Users\rupaljain\.conda\envs\ft_acft_local_comp\lib\urllib\request.py:247, in urlretrieve(url, filename, reporthook, data)
230 """
231 Retrieve a URL into a temporary location on disk.
232
(...)
243 data file as well as the resulting HTTPMessage object.
244 """
245 url_type, path = _splittype(url)
--> 247 with contextlib.closing(urlopen(url, data)) as fp:
248 headers = fp.info()
250 # Just return the local path and the "headers" for file://
251 # URLs. No sense in performing a copy unless requested.
File c:\Users\rupaljain\.conda\envs\ft_acft_local_comp\lib\urllib\request.py:222, in urlopen(url, data, timeout, cafile, capath, cadefault, context)
220 else:
221 opener = _opener
--> 222 return opener.open(url, data, timeout)
File c:\Users\rupaljain\.conda\envs\ft_acft_local_comp\lib\urllib\request.py:531, in OpenerDirector.open(self, fullurl, data, timeout)
529 for processor in self.process_response.get(protocol, []):
530 meth = getattr(processor, meth_name)
--> 531 response = meth(req, response)
533 return response
File c:\Users\rupaljain\.conda\envs\ft_acft_local_comp\lib\urllib\request.py:640, in HTTPErrorProcessor.http_response(self, request, response)
637 # According to RFC 2616, "2xx" code indicates that the client's
638 # request was successfully received, understood, and accepted.
639 if not (200 <= code < 300):
--> 640 response = self.parent.error(
641 'http', request, response, code, msg, hdrs)
643 return response
File c:\Users\rupaljain\.conda\envs\ft_acft_local_comp\lib\urllib\request.py:569, in OpenerDirector.error(self, proto, *args)
567 if http_err:
568 args = (dict, 'default', 'http_error_default') + orig_args
--> 569 return self._call_chain(*args)
File c:\Users\rupaljain\.conda\envs\ft_acft_local_comp\lib\urllib\request.py:502, in OpenerDirector._call_chain(self, chain, kind, meth_name, *args)
500 for handler in handlers:
501 func = getattr(handler, meth_name)
--> 502 result = func(*args)
503 if result is not None:
504 return result
File c:\Users\rupaljain\.conda\envs\ft_acft_local_comp\lib\urllib\request.py:649, in HTTPDefaultErrorHandler.http_error_default(self, req, fp, code, msg, hdrs)
648 def http_error_default(self, req, fp, code, msg, hdrs):
--> 649 raise HTTPError(req.full_url, code, msg, hdrs, fp)
HTTPError: HTTP Error 404: The requested content does not exist.
In which platform does it happen?
Azure Cluster Run. Sample workflow: https://github.com/Azure/azureml-examples/actions/runs/9501325986
How do we replicate the issue?
Run the "2.1. Download the Data" section of any of the notebooks below:
1/ https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs/automl-image-classification-multiclass-task-fridge-items/automl-image-classification-multiclass-task-fridge-items.ipynb
2/ https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs/automl-image-classification-multilabel-task-fridge-items/automl-image-classification-multilabel-task-fridge-items.ipynb
3/ https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items-batch-scoring/image-object-detection-batch-scoring-non-mlflow-model.ipynb
4/ https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs/automl-image-instance-segmentation-task-fridge-items/automl-image-instance-segmentation-task-fridge-items.ipynb
Expected behavior (i.e. solution)
The dataset downloads should succeed consistently, without raising HTTPError: HTTP Error 404: The requested content does not exist.
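
Until the hosting side is fixed, a possible stopgap in the notebooks is to retry the download, since the failure is intermittent. The sketch below is only an illustration and not a confirmed fix: download_with_retry, its parameters, and the backoff values are hypothetical names and defaults chosen for this example. It wraps the same urllib.request.urlretrieve call that appears in the traceback.

```python
import time
import urllib.error
import urllib.request


def download_with_retry(url, filename, attempts=5, base_delay=2.0,
                        retrieve=urllib.request.urlretrieve, sleep=time.sleep):
    """Download url to filename, retrying on HTTPError with exponential backoff.

    Hypothetical helper: the name, defaults, and the injectable `retrieve`
    and `sleep` arguments are assumptions made for this sketch.
    """
    for attempt in range(1, attempts + 1):
        try:
            return retrieve(url, filename=filename)
        except urllib.error.HTTPError as err:
            # Out of attempts: surface the last error unchanged.
            if attempt == attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            wait = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed with HTTP {err.code}; "
                  f"retrying in {wait:.0f}s")
            sleep(wait)
```

In the notebook cell, the call urllib.request.urlretrieve(download_url, filename=data_file) would become download_with_retry(download_url, data_file). This only hides the symptom; if the file is missing from the storage endpoint for an extended period, the final attempt still raises the same HTTPError.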