Information about the video content

I have some questions regarding the dataset:

Are all the videos egocentric views of human beings performing activities?

How many seconds of video does the entire dataset contain?

Is the frame rate the same for all videos?

Yes, they are egocentric, apart from a small number of third-person videos; see Unprocessed Data | Ego4D (3rd_person_video).

The videos are normalized to 30 FPS. In their raw form they vary from 15 FPS to 60 FPS, depending on the recording device; the most common device is a GoPro. For more information, see Videos | Ego4D. The “normalized” videos are referred to as the “Canonical Videos”, or as the full_scale videos in the CLI downloader.

For stats, refer to the paper, or read the metadata file (see “ego4d.json” here: Annotation Schemas | Ego4D) and sum the number of seconds or frames yourself. The total is more than 14 million seconds.
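
If it helps, here is a rough sketch of summing the durations from the metadata file. It assumes the loaded JSON has a top-level "videos" list and that each entry has a "duration_sec" field; those key names are assumptions on my part, so check them against the Annotation Schemas page before relying on the result. The function names are just illustrative helpers, not part of any Ego4D tooling.

```python
import json
from pathlib import Path

# Canonical videos are normalized to 30 FPS (per the answer above).
CANONICAL_FPS = 30


def load_metadata(path):
    """Load the metadata JSON (e.g. ego4d.json) from disk."""
    with Path(path).open() as f:
        return json.load(f)


def total_duration_seconds(metadata, duration_key="duration_sec"):
    """Sum per-video durations in seconds.

    ASSUMPTION: metadata["videos"] is a list of dicts and each dict stores
    its duration under `duration_key`. Missing or null values count as 0.
    """
    videos = metadata.get("videos", [])
    return sum(float(v.get(duration_key) or 0.0) for v in videos)


def summarize(metadata):
    """Return total seconds, hours, and an approximate frame count at 30 FPS."""
    seconds = total_duration_seconds(metadata)
    return {
        "seconds": seconds,
        "hours": seconds / 3600,
        "approx_frames_at_30fps": round(seconds * CANONICAL_FPS),
    }
```

Usage would be something like `summarize(load_metadata("ego4d.json"))`, with the path pointing at wherever you downloaded the metadata. The frame count is only an estimate for the canonical videos, since the raw videos vary in frame rate.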