Hi everyone,
I am exploring the Ego4D dataset to improve an action recognition model for a project I am working on, and I am curious about the best strategies for using this dataset effectively. Given the scale and variety of egocentric video data in Ego4D, are there any recommended preprocessing techniques or tools that can help streamline the training process for deep learning models?
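To make the question concrete, here is roughly the kind of clip preprocessing I have in mind (just a minimal sketch; the frame count, resolution, and normalization statistics are placeholders I picked as common defaults, not Ego4D-specific recommendations):

```python
import torch
import torch.nn.functional as F
from torchvision.io import read_video

def load_clip(path, num_frames=16, size=224):
    # Decode the video; output_format="TCHW" gives a (T, C, H, W) uint8 tensor.
    frames, _, _ = read_video(path, pts_unit="sec", output_format="TCHW")
    # Uniformly sample num_frames indices across the whole clip.
    idx = torch.linspace(0, frames.shape[0] - 1, num_frames).long()
    clip = frames[idx].float() / 255.0
    # Resize spatially; F.interpolate treats the T dimension as a batch here.
    clip = F.interpolate(clip, size=(size, size), mode="bilinear", align_corners=False)
    # Normalize with ImageNet statistics (a common default; match your backbone's expectations).
    mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
    clip = (clip - mean) / std
    return clip.permute(1, 0, 2, 3)  # (C, T, H, W), the layout most 3D CNNs expect
```

If there are better tools for decoding and sampling Ego4D clips at scale, I would love to hear about them.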
I would appreciate any insights on how to handle challenges like long-term temporal dependencies and occlusions in egocentric videos. Has anyone had success with specific architectures, such as 3D CNNs or Transformers, for these tasks?
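For reference, this is the sort of baseline I was planning to start from (a sketch using a Kinetics-pretrained 3D CNN from torchvision; the class count and clip shape are placeholders, and a video transformer could presumably be swapped in the same way):

```python
import torch
from torchvision.models.video import r3d_18, R3D_18_Weights

num_classes = 10  # placeholder: set this to the size of your Ego4D label set

# Kinetics-pretrained 3D ResNet-18 as a starting point; replace the head for the new label set.
model = r3d_18(weights=R3D_18_Weights.KINETICS400_V1)
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

# r3d_18 expects clips shaped (batch, C, T, H, W); it was pretrained on 112x112 frames.
dummy_clip = torch.randn(2, 3, 16, 112, 112)
logits = model(dummy_clip)
print(logits.shape)  # torch.Size([2, 10])
```

I am unsure whether a baseline like this captures enough temporal context for Ego4D, which is why I am asking about Transformer-based alternatives.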
While researching this, I also came across information on the camera parameters (intrinsics) of the cameras used for the dataset. If anyone has resources, tutorials, or personal experiences to share, it would be greatly appreciated!
Thank you!