Hi Ego4D Community!
First, thank you to the Ego4D team for sharing this great and diverse dataset. Also, thank you to everyone in the community for sharing your thoughts here.
I have a question/observation about the timestamps in the Narration annotations.
Each Narration annotation always has two parts: the narration text and a timestamp (given as a frame number and as a video time). Although the
Narration guideline implies that the timestamp should mark the ending point of the clip that the narration text describes, the annotations for some videos instead mark the starting point of the corresponding clip.
So I wonder whether this is a known issue or whether there is some underlying principle I missed. If it’s a known issue, is there any suggestion for how to “register” the timestamps so that we can automatically get accurate pairs of video clip and Narration text (without manually checking whether to choose the previous or the following clip for each Narration annotation)?
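To make the ambiguity concrete, here is a rough sketch of the two candidate clips I mean for a single narration. It assumes (my assumption, not something from the guideline) that a clip is bounded by the neighbouring narration timestamps in the same video. The function name, argument names, and dictionary keys are hypothetical and are not part of the Ego4D schema.

```python
from typing import Dict, List, Tuple


def candidate_clips(
    timestamps_sec: List[float],
    i: int,
    video_duration_sec: float,
) -> Dict[str, Tuple[float, float]]:
    """Return both possible clips for the i-th narration of one video.

    timestamps_sec must already be sorted in ascending order.
    Assumption: clip boundaries are the neighbouring narration timestamps,
    falling back to the start or end of the video at the edges.
    """
    if not 0 <= i < len(timestamps_sec):
        raise IndexError("narration index out of range")

    t = timestamps_sec[i]
    # Boundary of the preceding clip: previous narration, or video start.
    prev_t = timestamps_sec[i - 1] if i > 0 else 0.0
    # Boundary of the following clip: next narration, or video end.
    next_t = timestamps_sec[i + 1] if i + 1 < len(timestamps_sec) else video_duration_sec

    return {
        # Timestamp marks the END of the narrated clip (previous clip).
        "timestamp_is_end": (prev_t, t),
        # Timestamp marks the START of the narrated clip (following clip).
        "timestamp_is_start": (t, next_t),
    }


if __name__ == "__main__":
    # Made-up timestamps, only to show the two candidates.
    print(candidate_clips([2.0, 5.0, 9.0], 1, 12.0))
```

What I am missing is a reliable way to pick between the two candidates per video without looking at the footage.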
Example videos, each with a Narration annotation whose timestamp indicates the
- starting point of the clip: "C moves the metal to on top the metal" in video 3fc288bb-1082-4ec5-96a7-75bb61144b1
- ending point of the clip: "C C closes the cabinet" in video 64f466e5-99c9-4ed5-89be-7ef74d0dfdd3