Extracting audio features only for the clips

I am trying to extract the audio features from the clips.
I’ve downloaded the clips and then run the script ‘batch_audio_embedding.py’ (inside the folder audio-visual/active-speaker-detection/audio_embedding/make_audio_embeddings).
I get an error when I run it. The problem is that the code tries to load a model that is not present in my local copy of the repository. Could you provide the model ‘audio_embedding.model’ so that I can download it and run the code correctly?
The error arises on line 73: neta.load_state_dict(torch.load('…/…/…/models/audio_embedding.model'
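
To confirm that the failure is just the missing file and not something else, a check like the following could be run before the load call. This is only a minimal sketch: find_checkpoint is a hypothetical helper I wrote for illustration and is not part of the repository, and the relative path passed to it should be whatever line 73 actually uses.

```python
from pathlib import Path


def find_checkpoint(script_dir, relative_path):
    """Resolve relative_path against script_dir and confirm the file exists.

    Hypothetical helper for illustration; not part of the repository.
    """
    candidate = (Path(script_dir) / relative_path).resolve()
    if not candidate.is_file():
        # Report the fully resolved path so it is obvious where the
        # script expects the checkpoint to be.
        raise FileNotFoundError(f"Model checkpoint not found at {candidate}")
    return candidate
```

Calling it with the directory that contains batch_audio_embedding.py and the same relative path as in the script prints the absolute location where the model is expected, which would also tell me where to place ‘audio_embedding.model’ once I have it.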