Automated Evaluation for VMCBench

Upload a JSON file containing the question index and the model prediction for each question to evaluate the model's performance.

Example JSON format:

[
  { "index": 1, "prediction": "A" },
  { "index": 2, "prediction": "The answer is C. cat" }
]

Each record should contain the fields 'index' and 'prediction'.
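
A file in this format could be produced with a short script along the following lines. This is a minimal sketch, not part of the official VMCBench tooling: the helper names (`build_records`, `validate_records`, `to_json`), the `predictions` mapping, and the output filename `predictions.json` are all assumptions made for illustration.

```python
import json

REQUIRED_FIELDS = ("index", "prediction")


def build_records(predictions):
    """Turn a {question index: model output} mapping into a list of records."""
    return [
        {"index": index, "prediction": prediction}
        for index, prediction in sorted(predictions.items())
    ]


def validate_records(records):
    """Raise ValueError if any record lacks one of the required fields."""
    for position, record in enumerate(records):
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"record {position} is missing fields: {missing}")


def to_json(records):
    """Validate the records and serialize them as a JSON array."""
    validate_records(records)
    return json.dumps(records, indent=2)


# Hypothetical model outputs keyed by question index (illustrative data only).
predictions = {1: "A", 2: "The answer is C. cat"}
records = build_records(predictions)

if __name__ == "__main__":
    # Assumed output filename; upload the resulting file for evaluation.
    with open("predictions.json", "w", encoding="utf-8") as handle:
        handle.write(to_json(records))
```

Checking for the required fields before writing the file catches malformed records locally rather than after upload.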