Questions Regarding Submission Format and Evaluation Errors for EgoExo4D Demonstrator Proficiency Estimation

Dear Organizers,

I hope this message finds you well.
I have a few questions regarding the EgoExo4D Demonstrator Proficiency Estimation competition.

I previously posted these questions as GitHub issues, but since I have not received a response for over two weeks, I am reposting them here on the forum for clarification.


First Issue:
Questions Regarding Submission Format (Input and Output Data)

Thank you very much for organizing the EgoExo4D Demonstrator Proficiency Estimation competition.

I have two questions regarding the submission format, specifically about the input and output data:

  1. About Input Data
    Should we prepare three or four exocentric views for each video?
    On GitHub, it states:

exo_model_predictions: A nested list where each video has 4 sets of 4-dimensional prediction logits, corresponding to the four exo views.

Meanwhile, on Eval.ai, it says:

Synchronized video clips of a demonstrator performing a task from egocentric and three exocentric views.

This discrepancy has caused confusion about how many exocentric views should actually be used.
Could you kindly clarify this point?

  2. About Output Format
    In demonstrator_proficiency/README.md under “1. File Format,” the expected output format is described as [p1, p2, p3, p4].
    However, it is not entirely clear what each value represents.

On Eval.ai, it seems we are expected to submit one of the following four class labels for each video:

  • Novice (0)
  • Late Expert (1)
  • Intermediate Expert (2)
  • Early Expert (3)

Thus, my initial understanding was that we should submit a single predicted label per video.
However, based on the README, it seems we may need to output four logits [p1, p2, p3, p4] instead.

Could you please confirm whether the final output should be a single class label or a list of four logits? (I have included a sketch of my current interpretation at the end of this issue.)

I would greatly appreciate your clarification.
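
To make both points concrete, here is a minimal sketch of how I am currently assembling the submission file. The video IDs, logit values, and file name are placeholders, and the shapes simply follow my reading of the GitHub README rather than anything confirmed:

import json

# Rough sketch of my current interpretation of the submission format.
# The shapes below are exactly what I would like confirmed:
#   - four exo prediction sets per video (README) vs. three exo views (Eval.ai)
#   - four-dimensional logits per prediction vs. a single class label
videos = ["video_1", "video_2"]  # placeholder video IDs

submission = {
    "videos": videos,
    # One 4-dimensional logit vector per video from the egocentric model.
    "ego_model_predictions": [[0.1, 0.2, 0.3, 0.4] for _ in videos],
    # Four 4-dimensional logit vectors per video, one per exo view,
    # as described in the README.
    "exo_model_predictions": [
        [[0.1, 0.2, 0.3, 0.4] for _ in range(4)] for _ in videos
    ],
}

with open("submission.json", "w") as f:
    json.dump(submission, f)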


Second Issue:
Possible Argument Order Issue in validate_model_predictions on Evaluation Server

Thank you again for hosting the competition.

When I submitted my results following the format described in the GitHub README, I encountered an error on the evaluation server.
The format I submitted was as follows:

{
  "videos": ["video_1", "video_2", ...],
  "ego_model_predictions": [[p1, p2, p3, p4], ...],
  "exo_model_predictions": [
    [[p1, p2, p3, p4], [p1, p2, p3, p4], [p1, p2, p3, p4], [p1, p2, p3, p4]],
    ...
  ]
}

However, the evaluation server returned the following error:

Traceback (most recent call last):
  File "/code/scripts/workers/submission_worker.py", line 538, in run_submission
    submission_metadata=submission_serializer.data,
  File "/tmp/tmp2uzllcza/compute/challenge_data/challenge_2291/main.py", line 108, in evaluate
    validate_model_predictions(gt_annotations, model_predictions)
  File "/tmp/tmp2uzllcza/compute/challenge_data/challenge_2291/main.py", line 9, in validate_model_predictions
    assert key in model_predictions.keys()
AssertionError

Based on my investigation, I suspect that the arguments passed to the validate_model_predictions function might be reversed.
To test this, I ran eval_script.py locally with dummy gt_annotations and observed the following:

  • ✅ validate_model_predictions(model_predictions, gt_annotations) → works fine and returns a score
  • ❌ validate_model_predictions(gt_annotations, model_predictions) → raises the same AssertionError as above

Given this, it seems that the argument order might be incorrect on the evaluation server.
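
For reference, this is roughly what my local check looked like. The import path and file names are placeholders for however the script and data are loaded locally, and the dummy ground-truth file is one I created myself, so its exact structure is my own assumption:

import json

# Rough sketch of the local comparison described above.
# Assumes validate_model_predictions can be imported from eval_script.py.
from eval_script import validate_model_predictions

with open("submission.json") as f:
    model_predictions = json.load(f)

with open("dummy_gt_annotations.json") as f:
    gt_annotations = json.load(f)

# Predictions first, ground truth second: completes without error.
validate_model_predictions(model_predictions, gt_annotations)

# Ground truth first, predictions second (the order used in main.py on the
# server): reproduces the AssertionError from the traceback above.
try:
    validate_model_predictions(gt_annotations, model_predictions)
except AssertionError:
    print("AssertionError, matching the evaluation server failure")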

Could you kindly check if the argument order needs to be corrected?


I would greatly appreciate it if you could respond either here on the forum or in the GitHub issues.
Thank you very much for your time and support.