Additionally, make sure you have enough space on your disk.
If this does not work, please reply. If you are in China you might face difficulties, but the patch above should increase the number of retries boto3 / S3 performs when downloading.
Thank you for your advice, but it didn’t work, and I am in China.
Is there a solution to the download problems that users in China run into?
Thank you for your help again.
@Liym @miguelmartin
Sorry, I haven’t tried a VPN to download the data so far, because the dataset is so big. I am running a small trial on some complete samples. If I use a VPN to download the data, I will let you know whether it succeeds.
I have made several download attempts, and here are the results. Note that before these attempts I had already downloaded more than 90% of the takes for the “proficiency” benchmark, but I get an error every time I try to download the remaining 10%.
First, I printed all the paths to be fetched, and the results (posted below) show that every take that failed to download is from the university “iiith”.
Second, when I download without a VPN, I get a WARNING about “Bucket Failures”. The same issue has been reported in another question (posted below). This WARNING also concerns “iiith”.
Third, when I download with a VPN, I do not get the “Bucket Failures” warning. However, I still get the error:
WARN: failed to fetch 289 files [integrity=0, exceptions=289]
Please retry the download (… returning with error code 2)
It seems that a VPN may not solve this problem. Moreover, it seems that something is wrong with the data from “iiith”.
Hopefully these attempts help identify the real problem. If I have missed something, or if there is a better solution to this series of problems, please let me know.
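In case it helps others repeat the per-university check, here is a rough sketch of how failed paths could be grouped by university. It assumes that each path contains a take directory whose name starts with the university identifier followed by an underscore; the example paths and every identifier other than “iiith” are made up for illustration.

```python
from collections import Counter

# Only "iiith" comes from this thread; the other identifiers are placeholders.
UNIVERSITIES = ("iiith", "unia", "unib")


def university_of(path, universities=UNIVERSITIES):
    """Return the university whose prefix starts any component of the path."""
    for part in path.split("/"):
        for uni in universities:
            # Assumption: take directories are named "<university>_<rest>".
            if part.startswith(uni + "_"):
                return uni
    return "unknown"


def count_failures(failed_paths):
    """Count failed paths per university."""
    return Counter(university_of(p) for p in failed_paths)


if __name__ == "__main__":
    # Hypothetical failed paths, not taken from a real download log.
    failed = [
        "takes/iiith_cooking_01_1/cam01.mp4",
        "takes/iiith_soccer_02_3/cam02.mp4",
        "takes/unia_dance_01_2/cam01.mp4",
    ]
    for uni, n in count_failures(failed).most_common():
        print(uni, n)
```

If nearly all failures land in one bucket, as they did for me, that points at the data source rather than at the local network.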