Evaluation

Mean average precision (mAP) will be the main metric used to evaluate the submitted solutions. It is a widely used metric for measuring the quality of systems in the field of information retrieval, computed as the mean of the average precision (AP) achieved over all queries submitted to the system.
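For illustration, here is a minimal sketch of how AP and mAP can be computed from ranked retrieval results, assuming binary relevance labels; the official evaluation tool listed below remains the authoritative implementation.

    def average_precision(relevance, total_relevant=None):
        # `relevance` is the list of binary relevance labels (1 = relevant,
        # 0 = not relevant) of the retrieved items, in ranked order.
        # `total_relevant` is the number of relevant items for the query; if
        # omitted, every relevant item is assumed to appear in the ranking.
        if total_relevant is None:
            total_relevant = sum(relevance)
        hits = 0
        precision_sum = 0.0
        for rank, rel in enumerate(relevance, start=1):
            if rel:
                hits += 1
                precision_sum += hits / rank  # precision at this rank
        return precision_sum / total_relevant if total_relevant else 0.0

    def mean_average_precision(relevance_per_query):
        # mAP: mean of the per-query AP values.
        aps = [average_precision(r) for r in relevance_per_query]
        return sum(aps) / len(aps) if aps else 0.0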

Dealing with segmentation-free scenarios (assignments I.B, II.A, II.B) introduces additional issues, since the detected bounding boxes may not match the reference ones exactly. A detected bounding box will be considered a correct match if its overlapping area with the reference bounding box surpasses a certain threshold.
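For instance, assuming the overlap is measured as intersection over union (IoU) of the two boxes and that the threshold is 0.5 (both are assumptions here; the exact criterion is fixed by the evaluation tool), a detection could be validated as follows:

    def iou(box_a, box_b):
        # Intersection over union of two boxes given as (x, y, width, height).
        ax, ay, aw, ah = box_a
        bx, by, bw, bh = box_b
        iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
        ih = max(0, min(ay + ah, by + bh) - max(ay, by))
        inter = iw * ih
        union = aw * ah + bw * bh - inter
        return inter / union if union else 0.0

    def is_correct_match(detected, reference, threshold=0.5):
        # The detection counts as correct if its overlap with the reference
        # bounding box surpasses the threshold (0.5 is an assumed value).
        return iou(detected, reference) >= threshold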

Participants are NOT required to submit solutions to all assignments. For each track, the main score of each participant will be based on their best performance among the two assignments in the track. Additionally, they will receive extra points for their performance on the other assignment. Finally, a single winning team will be chosen for each track.

The following equations explain how the score of each participant (p) is determined in each track (t) and assignment (a), from the mAP metric.

  • Score of an assignment: If the mAP is lower than the given baseline, 0 points will be given. Otherwise: SA(p,t,a) = mAP(p,t,a) / (max_{p'} mAP(p',t,a))
  • Score of a track: ST(p,t) = max(SA(p,t,A), SA(p,t,B)) + 0.2 * min(SA(p,t,A), SA(p,t,B))
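A minimal sketch of this scoring scheme, assuming the per-participant mAP values have already been computed (the team ranker tool listed below is the authoritative implementation):

    def assignment_score(map_values, participant, baseline):
        # SA(p,t,a): 0 if the participant's mAP does not reach the baseline,
        # otherwise the mAP normalized by the best mAP in the assignment.
        # `map_values` maps each participant to its mAP in one (track, assignment).
        m = map_values[participant]
        if m < baseline:
            return 0.0
        return m / max(map_values.values())

    def track_score(score_a, score_b):
        # ST(p,t): best assignment score plus 0.2 times the other one.
        return max(score_a, score_b) + 0.2 * min(score_a, score_b)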

All the implementation details can be found in the following tools, which the organizers will use to determine the winner of each track, and which participants can use to evaluate the baseline systems.

  • Evaluation tool. This will be used by the organizers to compute the mAP of each team in each assignment. Contestants may use it to evaluate their systems on the validation data.
  • Team ranker tool. This will be used by the organizers to produce the final ranking for each track. It is made public for transparency.

Submission format

The solution files that you need to submit must be in the following format:

  • A plain text file containing multiple lines, one for each match you detected, sorted by decreasing confidence (i.e. the most confident match should be placed first).
  • Each line must have 6 fields: first the document ID (Segm-Free) or the word image ID (Segm-Based), second the query image ID (QbE) or the query string (QbS), and finally four fields encoding the bounding box of the match: X, Y, Width, Height.

This format is identical to that of the ground-truth files given in the Data section. Remember to sort your matches by decreasing confidence; otherwise your mAP could be penalized.
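For example, a solution file could be written as follows. The match tuples are hypothetical and whitespace-separated fields are assumed here; mirror the ground-truth files from the Data section for the exact field separator.

    # Each match: (confidence, target_id, query_id, x, y, width, height), where
    # target_id is a document ID (Segm-Free) or a word image ID (Segm-Based) and
    # query_id is a query image ID (QbE) or a query string (QbS).
    matches = [
        (0.92, "doc_0001", "query_0005", 120, 340, 210, 55),  # hypothetical values
        (0.31, "doc_0002", "query_0012", 40, 77, 150, 48),
        (0.87, "doc_0003", "query_0005", 98, 512, 180, 60),
    ]

    # Sort by decreasing confidence, then drop the confidence column so that each
    # output line carries exactly the 6 required fields.
    matches.sort(key=lambda m: m[0], reverse=True)
    with open("solution.txt", "w") as f:
        for _, target_id, query_id, x, y, w, h in matches:
            f.write(f"{target_id} {query_id} {x} {y} {w} {h}\n")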

If you still have questions about the submission format, take a look at the solution files generated by our baseline systems for the validation data:

  • Assignment I.A: Training-free, Segmentation-based, QbE. mAP = 0.407635942384.
  • Assignment I.B: Training-free, Segmentation-free, QbE. mAP = 0.263840648975.
  • Assignment II.A: Training-based, Segmentation-free, QbS. mAP = 0.530860039179.
  • Assignment II.B: Training-based, Segmentation-free, QbE. mAP = 0.345444589203.