Bad Annotations in the Dataset (VQ2D)

asjad.s · February 5, 2023, 11:50pm

Hi,

There seem to be a lot of bad annotations in the dataset. Should we assume that the VQ2D test set is free of such annotation mistakes?

For example, bad annotations persist in v1_0_5 of the dataset. For “video_uid”: “14a05360-fcc4-4bf6-97d3-2d77bc282c84” and “clip_uid”: “f9c9c2ec-c5fd-46b8-bbb6-49c7dda702af”, the “object_title”: “vacuum cleaner” has “x”: 966.67, “y”: 172.71, “width”: 700.13, “height”: 660.46. However, the crop produced by the baseline from these annotations shows only half of the vacuum cleaner, whereas the online visualizer shows it completely visible.

Similarly, for “video_uid”: “3f0bd238-228d-4796-a3e4-820308fb04b0” and “clip_uid”: “6b93fc6d-ed92-42da-886a-0e532e5f66cb”, the visual crop with “object_title”: “microwave” is said to have “frame_number”: 1093 and “video_frame_number”: 14657. The video_frame_number looks correct in the online visualizer, but the frame_number does not make sense: since “video_start_frame”: 8099 and “video_end_frame”: 15959, should the frame_number for this visual crop not be 14657 - 8099 = 6558? I am also unsure about the bounding box, since it does not match the online visualizer either. The visualizer says

x: 298.3, y: 40.96, width: 768.69, height: 584.29

but the annotation file says “x”: 593.64, “y”: 679.35, “width”: 404.54, “height”: 180.34
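
The frame arithmetic above can be written as a small check. This is only a sketch: it assumes that frame_number is meant to be clip-relative, i.e. video_frame_number minus video_start_frame, which is the poster's reading rather than documented behaviour, and both function names are hypothetical.

```python
def expected_clip_frame(video_frame_number: int, video_start_frame: int) -> int:
    """Clip-relative index, ASSUMING frame_number = video_frame_number - video_start_frame."""
    return video_frame_number - video_start_frame


def frame_number_is_consistent(
    frame_number: int,
    video_frame_number: int,
    video_start_frame: int,
    video_end_frame: int,
) -> bool:
    """Return True only if the annotated frame lies inside the clip and
    frame_number matches the assumed clip-relative offset."""
    inside_clip = video_start_frame <= video_frame_number <= video_end_frame
    if not inside_clip:
        return False
    return frame_number == expected_clip_frame(video_frame_number, video_start_frame)


# Values quoted in the post for the "microwave" visual crop.
expected = expected_clip_frame(14657, 8099)
consistent = frame_number_is_consistent(1093, 14657, 8099, 15959)
print(expected, consistent)
```

With the quoted values the expected offset is 6558, so an annotated frame_number of 1093 is flagged as inconsistent under this assumption.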

I have been seeing many such instances: either the response track is incorrectly labelled (the object is not even visible in those frames), or the object referenced by the visual crop is not in the frame at all, or the crop contains something other than the object.
Can someone please confirm this? I have been using the baseline code from the Ego4D GitHub.

dkukreja · February 6, 2023, 4:11am

Hey @asjad.s,

Thanks for your post. Can you help us verify these?

I don’t see any vacuum cleaners in vq for 14a05360-fcc4-4bf6-97d3-2d77bc282c84; is this the right video_uid?
Video 3f0bd238-228d-4796-a3e4-820308fb04b0 has two objects labeled ‘microwave’ with visual crops on the same frame. One has the first bbox you mentioned and the other has the second, which should explain the difference.
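
A case like this, where several objects share a title and a frame, can be spotted before comparing boxes against the visualizer. The sketch below assumes the visual crops have already been flattened into plain dicts with the keys shown. That flat layout and the function name are hypothetical, not the structure of the annotation file, and the two records simply reuse the values quoted in this thread.

```python
from collections import defaultdict


def find_ambiguous_crops(crops):
    """Group crops by (object_title, video_frame_number) and keep only the
    groups holding more than one bounding box."""
    groups = defaultdict(list)
    for crop in crops:
        key = (crop["object_title"], crop["video_frame_number"])
        box = (crop["x"], crop["y"], crop["width"], crop["height"])
        groups[key].append(box)
    return {key: boxes for key, boxes in groups.items() if len(boxes) > 1}


# Illustrative records built from the values quoted in the thread.
crops = [
    {"object_title": "microwave", "video_frame_number": 14657,
     "x": 298.3, "y": 40.96, "width": 768.69, "height": 584.29},
    {"object_title": "microwave", "video_frame_number": 14657,
     "x": 593.64, "y": 679.35, "width": 404.54, "height": 180.34},
]

ambiguous = find_ambiguous_crops(crops)
for (title, frame), boxes in ambiguous.items():
    print(title, frame, boxes)
```

Any key reported here has more than one candidate box, so a mismatch with the visualizer may just mean the other instance was being shown.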

Can you share any other inconsistencies you’ve found? We’ll take a look.

asjad.s · February 6, 2023, 4:46am

Apologies, the video_uid was 7f09822a-87b9-4eac-bb34-3f1059c704d1.
For 3f0bd238-228d-4796-a3e4-820308fb04b0 you are correct regarding the bounding boxes (I had missed the other instance of the microwave). However, the frame_number 1093 in the test_val.json gives us a crop of the user’s hand and not the other microwave.

Thank you so much for your help. I guess we can modify our code if there are issues with frame_number and simply use video_frame_number instead.
