Details on the multi-view GoPro setup

I am looking for more information regarding the multi-view capture setup that Ego-Exo4D used. Specifically, I have the following questions:

  • I didn’t fully understand the sync procedure described in appendix section 8.B. Could the authors share more details about it (e.g. how often do the QR codes change in the 29 fps video, and how is the wall time encoded)? Would it also be possible to share the QR video and the scripts that perform the time synchronization? I would be very curious to try this out myself; a rough sketch of what I have in mind is included below the list.
  • Did the authors try synchronizing the videos via audio? (I read that audio fingerprints are provided for audio synchronization, but apparently not used for video synchronization?) A sketch of the cross-correlation approach I have in mind is also included below.
  • Section 8.D describes that the Aria is held in front of the GoPro cameras. Am I correct in assuming that this procedure is used to calibrate the GoPro extrinsics (i.e. the Aria’s self-localized pose is taken as the GoPro extrinsics)? Was it ensured in any way that the offset between the Aria’s origin and the GoPro’s origin is minimized while holding the Aria next to the GoPro? Did the authors experiment with other calibration procedures, e.g. showing calibration patterns to the cameras to obtain intrinsics and/or extrinsics (see the last sketch below)?
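
To make the first question more concrete, here is a rough Python sketch of how I imagine the QR-based sync could be decoded offline. The payload format (a plain ASCII wall time in milliseconds), the use of pyzbar, and the file name are all assumptions on my part, not something taken from the paper:

```python
import cv2
from pyzbar.pyzbar import decode

def decoded_timestamps(video_path, max_frames=300):
    """Map frame index -> wall time decoded from the QR code, if one is visible."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    stamps = {}
    for idx in range(max_frames):
        ok, frame = cap.read()
        if not ok:
            break
        results = decode(frame)
        if results:
            # Assumed payload: wall time in milliseconds as ASCII digits.
            stamps[idx] = int(results[0].data.decode("ascii"))
    cap.release()
    return stamps, fps

def wall_time_of_frame0_ms(stamps, fps):
    """Back-project every decoded timestamp to frame 0 and average."""
    estimates = [t - idx * 1000.0 / fps for idx, t in stamps.items()]
    return sum(estimates) / len(estimates)

stamps, fps = decoded_timestamps("gopro_cam01.mp4")  # placeholder file name
print("estimated wall time of frame 0 (ms):", wall_time_of_frame0_ms(stamps, fps))
```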
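
For the second question, this is the kind of audio-based estimate I was thinking of: cross-correlating two audio tracks to find their relative lag. It assumes the audio has already been extracted to mono WAV files at a common sample rate (e.g. with ffmpeg); the file names are placeholders:

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import correlate

def audio_offset_seconds(wav_a, wav_b):
    """Estimate the lag (in seconds) between the two recordings."""
    rate_a, a = wavfile.read(wav_a)
    rate_b, b = wavfile.read(wav_b)
    assert rate_a == rate_b, "resample to a common rate first"
    a = a.astype(np.float32) - a.mean()
    b = b.astype(np.float32) - b.mean()
    xcorr = correlate(a, b, mode="full")
    # Correlation peak converted to a signed lag in samples
    # (the sign convention would still need to be double-checked).
    lag_samples = int(np.argmax(xcorr)) - (len(b) - 1)
    return lag_samples / rate_a

print(audio_offset_seconds("gopro_cam01.wav", "gopro_cam02.wav"))
```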
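
And for the calibration question, this is the standard OpenCV checkerboard procedure I had in mind as an alternative, run per camera to recover intrinsics; board dimensions, square size, and image paths are placeholders:

```python
import glob
import cv2
import numpy as np

BOARD = (9, 6)        # inner corners of the checkerboard (placeholder)
SQUARE_SIZE = 0.025   # square edge length in metres (placeholder)

# 3D coordinates of the board corners in the board's own frame.
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE_SIZE

obj_points, img_points, image_size = [], [], None
for path in sorted(glob.glob("calib_frames/cam01_*.png")):  # placeholder path
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    image_size = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, image_size, None, None)
print("reprojection RMS:", rms)
print("intrinsics:\n", K)
```

Extrinsics between the GoPros would of course still have to be recovered separately (e.g. by having multiple cameras observe the same board at the same time); I only include this to illustrate the kind of alternative I was asking about.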

Many thanks!