EgoTracks dataset download failure

Hello.
I received my AWS license from Ego4D yesterday, and I'm trying to download the "EgoTracks" dataset.

I can successfully download the viz and annotations, but the EgoTracks videos are failing.

ego4d --output_directory="~/scratch/data/tracking/ego4d" --datasets egotracks
output:
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Also, I wanted to ask how large (in GB) the EgoTracks dataset is. I can only find that it consists of 5.9K videos, but I couldn't find the actual size stated anywhere.

I have followed the EgoTracks download instructions and the Ego4D CLI instructions.

Hi

I am running into the same problem. If I do not set the region in the config file, it throws ValueError: Invalid endpoint, and if I use us-east-1 or us-east-2, it throws the error stated above. Did you find any solution?
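
For reference, the region setting here is the standard AWS CLI config (~/.aws/config); assuming the default profile, the entry looks like:

[default]
region = us-east-1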

Thanks,

Hello, I have not been able to resolve this. Also, I think --datasets annotations_540ss doesn't work either… Please let me know where I can find the downscaled 540ss annotations!

Please try with the --version v2 argument as well. Can you confirm that works for EgoTracks?

(Looking at 540ss, will come back shortly.)

The --version v2 argument works, but it says the dataset is only 2.4 GB, which I highly doubt is the size of the entire EgoTracks dataset.

ego4d --output_directory ./ --datasets egotracks --version v2
Expected size of downloaded files is 2.4 GB. Do you want to start the download?

Could you confirm for me that 2.4 GB is the correct size of the entire EgoTracks dataset?

I think --datasets egotracks only gives the annotation JSON files.
So do I have to download the video data with the following command?
ego4d --output_directory ./ --datasets full_scale --benchmark EM --version v2

Hello. I have downloaded version 2 of the full-scale videos for the EM benchmark, but while running the preprocessing step, I get this error.

python tools/preprocess/extract_ego4d_clip_frames.py

line 77, in extract_clip_ids
clip_uids.append(c["exported_clip_uid"])
KeyError: 'exported_clip_uid'

Could you please provide a full guide for the EgoTracks dataset download & preprocessing so I can follow it?
Thank you.

Hi, which annotation_path are you using (train, val or test)?

  • This should not happen with the challenge test set, but if it does, please let us know!
  • For train and val, we are working on pushing an updated preprocessing script that should solve the problem. The workaround is to simply ignore these clips (it should be less than 1% of them); we don't have the exported_clip_uid for these frames because of a conversion error (see the sketch below).
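
For example, a minimal way to skip them (a sketch only, assuming the annotations are the list of clip dicts that extract_clip_ids already iterates over) would be:

def extract_clip_ids(annotations):
    # Collect exported clip UIDs, skipping the few entries that lack the field.
    clip_uids = []
    skipped = 0
    for c in annotations:
        uid = c.get("exported_clip_uid")
        if uid is None:
            skipped += 1  # < 1% of clips, dropped due to the conversion error
            continue
        clip_uids.append(uid)
    print(f"Skipping {skipped} videos... Total {len(clip_uids)} to be processed")
    return clip_uids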

Thank you for the reply.
Can you please confirm for me that the download command I used is correct?

ego4d --output_directory ./ --datasets egotracks full_scale --benchmark EM --version v2

The total size was about 2.7 TB.

Hi, I am working on confirming the download script, but you don't need full_scale; only the clips for EM are needed. So it should be the following:

ego4d --output_directory ./ --datasets egotracks clips --benchmark EM --version v2

Thank you so much for the fast reply! I think I will try to re-download only the clips dataset in the meantime. Please let me know when the preprocessing script is cleaned up :). Thanks @haotang!

@haotang
Just as an FYI: in "Skipping 113 videos... Total 3433 to be processed", the 113 is the number of videos that don't have the "exported_clip_uid" field in the annotations.

Hi @aram, thanks for sharing the numbers! This looks correct to me. I ignored a few more videos because of issues with the frame conversion for certain bounding boxes. I created a fix in Egotracks fix by tanghaotommy · Pull Request #42 · EGO4D/episodic-memory · GitHub and am waiting for review, but you may take a look and use that.

Thank you for making this adjustment!!

I have tried running the preprocessing code, but it hangs. Also, I can no longer cd or ls into the drive storing the data. Currently I am using a 4 TB SSD to store all the data. Could you please tell me how much total disk space is required to run the preprocessing script?

I believe the clips themselves are less than 1 TB. The preprocessed data takes about 800 GB (only annotated frames).

Thanks. I have a 4 TB SSD, and I have a problem where the extraction code hangs in the middle, and then I cannot ls into my SSD (probably the extraction process/thread is not exiting?).

So I have cancelled and restarted multiple times, but now the extracted-frames folder is ~3.8 TB (the disk is basically full).
Do you suspect anything is going wrong?

Major problems:

  1. The frame extraction process hangs (probably due to 2, but not sure).
  2. The disk space requirement is > 8 TB.

I did a rough calculation:
Each video is ~8 min at 30 fps = 8 * 60 * 30 = 14400 frames (I checked, and an extracted folder actually has 14400 frames).
Each frame is ~200 KB.
Each video's image folder (extracted frames) = 14400 * 200 KB = 2.88 GB.
The train set includes 3000 videos, which leads to ~8.6 TB of disk space for extracted frames.
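
The same estimate in code form:

frames_per_video = 8 * 60 * 30                 # ~8 min at 30 fps = 14400 frames
gb_per_video = frames_per_video * 200e3 / 1e9  # ~200 KB per frame -> ~2.88 GB per video
tb_total = 3000 * gb_per_video / 1000          # ~3000 train videos -> ~8.6 TB
print(f"{gb_per_video:.2f} GB per video, {tb_total:.1f} TB total")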

Would really appreciate your reply!

I noticed that you mentioned we should be extracting "only annotated frames", but I guess the current preprocessing code extracts all frames?

Please correct me if I am wrong.

Yes, true. We annotate at 5 FPS, so it should be fine if you only extract those frames. If you extract at 30 FPS, the disk space is not enough. Please take a look at the pull request here: Egotracks fix by tanghaotommy · Pull Request #42 · EGO4D/episodic-memory · GitHub. EgoTracks/tools/preprocess/extract_ego4d_clip_annotated_frames.py only extracts annotated frames.
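
The idea is roughly the following (a minimal sketch only, not the actual script; the frame-index convention and argument names here are assumptions, so please defer to extract_ego4d_clip_annotated_frames.py in the PR):

import os
import av  # PyAV, the same decoder the preprocessing tools use

def save_annotated_frames(clip_path, annotated_frame_idxs, out_dir):
    # Decode the clip sequentially and keep only the frames whose index
    # appears in annotated_frame_idxs (0-based indices; an assumption here).
    wanted = set(annotated_frame_idxs)
    if not wanted:
        return
    os.makedirs(out_dir, exist_ok=True)
    last = max(wanted)
    with av.open(clip_path) as container:
        stream = container.streams.video[0]
        for idx, frame in enumerate(container.decode(stream)):
            if idx in wanted:
                frame.to_image().save(os.path.join(out_dir, f"{idx:07d}.jpg"))
            if idx >= last:  # stop once the last annotated frame is saved
                break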


Start processing db211359-c259-4515-9d6c-be521711b6d0!
Start processing 87b52dc5-3ac3-47e7-9648-1b719049732f!
Start processing b7fc5f98-e5d5-405d-8561-68cbefa75106!
Start processing 59daca91-5433-48a4-92fc-422b406b551f!

I have problems preprocessing the videos above (the process never ends and then gives this error):

File "av/enum.pyx", line 60, in av.enum.EnumType.__getitem__
KeyError: 'ERRORTYPE_2'

Could you please help me check what is going wrong with these?
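
In case it helps narrow this down, a minimal repro outside the full preprocessing script (just PyAV decoding every frame of a clip, a rough sketch) would be:

import av

def clip_decodes_ok(clip_path):
    # Try to decode the whole clip; report any exception (e.g. the
    # KeyError: 'ERRORTYPE_2' above) instead of hanging the batch job.
    try:
        with av.open(clip_path) as container:
            for _ in container.decode(container.streams.video[0]):
                pass
        return True
    except Exception as exc:
        print(f"{clip_path}: {type(exc).__name__}: {exc}")
        return False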