Ego-Exo4D Dataset Changes: Quality Issues & Future Update

Hello all,

We have made some changes to the dataset and are working on a further update focused on dataset quality. That update will add more annotations and data and will include quality assurance work. We expect it to be available by early March.

Please upgrade your ego4d package to v1.6.0 (pip install --upgrade ego4d).
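
If you want to confirm the upgrade took effect, here is a minimal sketch that reads the installed version of the ego4d package through Python's standard importlib.metadata module. The helper names (parse_version, is_recent_enough, installed_ego4d_version) are illustrative and are not part of the ego4d package.

```python
from importlib import metadata

REQUIRED = (1, 6, 0)  # version named in the announcement


def parse_version(text):
    """Turn '1.6.0' into (1, 6, 0); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_recent_enough(installed, required=REQUIRED):
    """True if the installed version string is at least the required version."""
    return parse_version(installed) >= required


def installed_ego4d_version():
    """Return the installed ego4d version string, or None if it is not installed."""
    try:
        return metadata.version("ego4d")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_ego4d_version()
    if version is None:
        print("ego4d is not installed")
    elif is_recent_enough(version):
        print(f"ego4d {version} is new enough")
    else:
        print(f"ego4d {version} is older than 1.6.0, please upgrade")
```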

To get into the specifics:

  • 787 takes have been removed from the dataset due to potential quality issues, leaving 3,694 takes currently available.
  • After the future update, at least the same number of takes will be added back to the dataset, without quality issues.
  • Raw capture-level data is no longer available to download. Instead, you can use the take-trimmed data.
    • If you are after camera intrinsics, IMU, or audio of the egocentric device, you can use the newly released trimmed VRS files (available under --parts takes).
    • For RGB video data, please use the trimmed MP4 files. You can additionally download take-trimmed VRS files (for the egocentric camera) that contain the RGB and other video streams (--parts take_vrs).
    • Additional take-trimmed data is available: please download this with --parts take_trajectory or --parts takes (eye gaze).
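
As a rough illustration of the download options above, the sketch below assembles one command line per part. The part names (takes, take_vrs, take_trajectory) come from this post; the CLI name egoexo and the -o output-directory flag are assumptions here, so please check them against the documentation before running anything. The function build_download_commands is a hypothetical helper, not part of the ego4d package.

```python
import shlex

# Part names as given in the announcement.
KNOWN_PARTS = ("takes", "take_vrs", "take_trajectory")


def build_download_commands(out_dir, parts, cli="egoexo"):
    """Return one argument list per requested part.

    `cli` and the `-o` flag are assumptions; only `--parts` and the part
    names are taken from the announcement.
    """
    unknown = [p for p in parts if p not in KNOWN_PARTS]
    if unknown:
        raise ValueError(f"parts not mentioned in the announcement: {unknown}")
    return [[cli, "-o", out_dir, "--parts", part] for part in parts]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in build_download_commands("/data/egoexo", ["takes", "take_trajectory"]):
        print(shlex.join(cmd))
```

The commands are printed rather than executed so you can check the flags against your installed version first.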

You can refer to the documentation here.

Apologies for the inconvenience. The team is working hard to improve the dataset.

Please let us know if you have any thoughts or feedback, or run into any issues with the release.

Thanks for understanding,
The Ego-Exo4D team.

Thank you for the update. I noticed that some of the “takes” in the “annotations/splits.json” file are missing from the downloaded dataset, and I am curious if this is the reason. Additionally, when can we expect the next update to occur (regarding adding more takes)?

Yes, this is expected; please refer to the first bullet point. The data should be fixed and available to download by early to mid-March. Please note that more annotations will also be available.
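
To see which expected takes are absent locally, something like the sketch below may help. It assumes you already have a list of expected take folder names and that each downloaded take sits in its own subdirectory of a takes root folder; both are assumptions. In particular, the identifiers in annotations/splits.json may not be folder names, in which case you would need to map them first. find_missing_takes is a hypothetical helper.

```python
from pathlib import Path


def find_missing_takes(expected, takes_root):
    """Return the sorted expected take names with no matching subdirectory.

    Assumes one subdirectory per take under `takes_root` (an assumption
    about the local layout, not something stated in the announcement).
    """
    root = Path(takes_root)
    if root.is_dir():
        present = {p.name for p in root.iterdir() if p.is_dir()}
    else:
        present = set()
    return sorted(set(expected) - present)


if __name__ == "__main__":
    # Placeholder names and path, for illustration only.
    missing = find_missing_takes(["take_a", "take_b"], "/data/egoexo/takes")
    print(f"{len(missing)} expected takes not found locally")
```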

Could you be more specific about Raw capture-level data?