How to get ASR timestamps of expert commentary text tokens?

joyachen · April 3, 2024, 4:35pm

Thank you for the amazing data! How can I get the timestamp of each ASR-recognized text token in the expert commentary, e.g. 0.webm → whisper → text tokens → the timestamp of each text token?

miguelmartin · April 6, 2024, 10:33pm

You can see how I transcribed the expert commentary audio files here: Ego4d/ego4d/internal/expert_commentary/transcribe.py at main · facebookresearch/Ego4d · GitHub

You can re-process the audio files with whisper, which supports word-level timestamps (refer to Ego4d/ego4d/egoexo/scripts/extract_audio_transcribe.py at main · facebookresearch/Ego4d · GitHub for details).

I can do this on my end, but the results won't be available for a while.
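
For anyone who wants to try the re-processing route in the meantime, here is a minimal sketch. It assumes the open-source `openai-whisper` package and ffmpeg are installed; it is not the code from the scripts linked above. The helper names (`flatten_word_timestamps`, `transcribe_with_word_timestamps`) and the `"base"` model choice are my own placeholders. Note that whisper reports timestamps per word, not per sub-word token.

```python
from typing import Any, Dict, List, Tuple


def flatten_word_timestamps(result: Dict[str, Any]) -> List[Tuple[str, float, float]]:
    """Flatten a whisper transcription result into (word, start_sec, end_sec) tuples.

    Assumes the result came from transcribe(..., word_timestamps=True), where each
    segment carries a "words" list. Segments without "words" are skipped.
    """
    words = []
    for segment in result.get("segments", []):
        for w in segment.get("words", []):
            # whisper usually prefixes words with a space, so strip it
            words.append((w["word"].strip(), float(w["start"]), float(w["end"])))
    return words


def transcribe_with_word_timestamps(
    audio_path: str, model_name: str = "base"
) -> List[Tuple[str, float, float]]:
    """Run whisper on one audio file and return per-word timestamps.

    model_name is a placeholder; pick whichever whisper model size you need.
    """
    import whisper  # imported lazily so the helper above works without whisper installed

    model = whisper.load_model(model_name)
    # word_timestamps=True asks whisper to attach start/end times to each word
    result = model.transcribe(audio_path, word_timestamps=True)
    return flatten_word_timestamps(result)
```

Usage would look something like `transcribe_with_word_timestamps("0.webm")`, which returns a list of `(word, start, end)` tuples with times in seconds relative to the start of that audio file.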
