CROHME 2019 results will be submitted directly online by participants. Submitted results may be updated for each task, and multiple times if necessary. NOTE: Each competition task has a 'test' version used for computing competition results, along with a 'validation' version for error and sanity-checking.
You will need to create an account on the submission system. Select the 'Register' link at top-right, and then provide a participant id and password. This will create an account for your team that can be used to view results for your submissions to the various tasks.
Have your recognition system produce results in label graph (.lg) format (see the LgEval library README for details). If your system generates LaTeX or MathML output, we provide converters that generate label graphs from these two formats (tex2symlg, mml2symlg). Participants need to produce one .lg file for each test file. Label graph files are .csv files providing the segmentation of strokes into symbols, the classification of symbols, and the labeling of spatial relationships between symbols. We recommend using the Object-Relationship (OR) .lg file format. This format allows stroke groups (e.g., symbols) to be named, and relationships to be defined directly between stroke groups. Earlier .lg formats (e.g., Node-Edge (NE) files) will also work.
(NEW) We have decided to use symLGs for evaluating Task 1, since some systems participating in Task 1 cannot generate LG files with primitives. Segmentation errors may appear for systems that start from strokes but not for encoder-decoder systems, so the fair way to rank both groups is to evaluate all results at the symbol level using symLGs. If your system produces LGs, please first make sure that they are in OR format, then use the lg2symlg converter to convert them to symLGs before submission. We can help you with the conversion as well.
As with online recognition, participants need to produce one .lg file for each test file. Please note that because primitive-level information (connected components) is not provided, we evaluate systems based on correct symbols and correct relations between symbols (symbolic evaluation). For systems generating LaTeX or MathML output, we provide converters that generate the label graph format (tex2symlg, mml2symlg). To generate symLGs from label graph files with primitives, we provide the lg2symlg converter. An example of a symbol-level label graph is shown below. The absolute path of each node is written in place of the primitives. "O" denotes the origin.
# IUD, filename
# Objects(3):
O, z_1, z, 1.0, OSub
O, F_1, F, 1.0, O
O, x_1, x, 1.0, ORight
# Relations from SRT:
R, F_1, z_1, Sub, 1.0
R, F_1, x_1, Right, 1.0
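As a sanity check before submission, a symLG file like the example above can be read back and inspected. The following is a minimal sketch, not part of LgEval: the function name `parse_symlg` is hypothetical, and the field meanings (object id, label, weight, node path for `O` lines; parent, child, relation label, weight for `R` lines) are assumptions inferred from the example.

```python
def parse_symlg(text):
    """Split symLG text into object and relation records.

    Assumed layout (inferred from the example, not an official spec):
      O, <object id>, <label>, <weight>, <node path(s)>
      R, <parent id>, <child id>, <relation label>, <weight>
    """
    objects, relations = [], []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines such as "# Objects(3):"
        if not line or line.startswith("#"):
            continue
        fields = [f.strip() for f in line.split(",")]
        if fields[0] == "O" and len(fields) >= 5:
            objects.append({
                "id": fields[1],
                "label": fields[2],
                "weight": float(fields[3]),
                "paths": fields[4:],
            })
        elif fields[0] == "R" and len(fields) >= 5:
            relations.append({
                "parent": fields[1],
                "child": fields[2],
                "label": fields[3],
                "weight": float(fields[4]),
            })
    return objects, relations


SAMPLE = """\
# IUD, filename
# Objects(3):
O, z_1, z, 1.0, OSub
O, F_1, F, 1.0, O
O, x_1, x, 1.0, ORight
# Relations from SRT:
R, F_1, z_1, Sub, 1.0
R, F_1, x_1, Right, 1.0
"""

objects, relations = parse_symlg(SAMPLE)
for rel in relations:
    print(rel["parent"], rel["label"], rel["child"])
```

Checking that every relation refers to object ids that actually appear on `O` lines is one simple way to catch malformed files before uploading.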
The symbol classification tasks use a different .csv file format, with a single .csv file providing the classification results for all input files. An example .csv classification result file is shown below.
Each line starts with a symbol identifier, which is defined for each .inkml file using the tag: <annotation type="UI">
Remaining entries on a line are symbol classes, provided in decreasing order of confidence. For the first symbol in the example below, class 'a' has the highest confidence, followed by class 'b', etc.
MfrDB3907_85801, a, b, c, d, e, f, g, h, i, j
MfrDB3907_85802, 1, |, l, COMMA, junk, x, X, \times
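A line in the classification result format can be produced from per-class confidence scores by sorting classes from most to least confident. The sketch below is illustrative only: `format_row`, `format_results`, and the optional `top_k` limit are hypothetical names, and writing a literal comma class as `COMMA` is an assumption based on the example file.

```python
def format_row(symbol_id, scores, top_k=None):
    """Return one result line: symbol id, then classes by decreasing confidence.

    scores maps class label -> confidence. top_k (hypothetical) optionally
    limits how many classes are written.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    # Assumption: a literal comma would break the .csv, so write it as COMMA.
    labels = ["COMMA" if c == "," else c for c in ranked]
    return ", ".join([symbol_id] + labels)


def format_results(predictions, top_k=None):
    """Join rows for all symbols into the text of a single .csv file."""
    return "\n".join(format_row(sid, s, top_k) for sid, s in predictions) + "\n"


predictions = [
    ("MfrDB3907_85801", {"b": 0.2, "a": 0.7, "c": 0.1}),
    ("MfrDB3907_85802", {"l": 0.2, "1": 0.5, "|": 0.3}),
]
print(format_results(predictions), end="")
```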
The IoU (intersection over union) is calculated at two thresholds (0.5 and 0.75), and Precision, Recall, and F-score are reported for both coarse (IoU > 0.5) and fine (IoU > 0.75) detections. For each math region in ground_truths.csv, the IoU metric is computed against all predicted math regions in the submitted .csv file, and a list of (IoU, ground_truth, detection) tuples is returned, sorted in descending order of IoU.
page number, x, y, x2, y2
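The IoU-based scoring can be sketched for the regions on a single page as follows. This is not the organizers' evaluation code: the function names are hypothetical, boxes are assumed to be (x, y, x2, y2) with (x, y) the top-left and (x2, y2) the bottom-right corner, and the greedy one-to-one matching over the sorted list is an assumption about how the sorted (IoU, ground_truth, detection) list is used.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def score_page(ground_truth, detections, threshold):
    """Return (precision, recall, f_score) for one page at an IoU threshold."""
    # Sorted list of (IoU, ground-truth index, detection index), descending.
    pairs = sorted(
        ((iou(g, d), gi, di)
         for gi, g in enumerate(ground_truth)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    used_gt, used_det = set(), set()
    for value, gi, di in pairs:
        if value <= threshold:
            break  # remaining pairs cannot count as detections
        if gi in used_gt or di in used_det:
            continue  # assumption: each region is matched at most once
        used_gt.add(gi)
        used_det.add(di)
    tp = len(used_gt)
    precision = tp / len(detections) if detections else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
det = [(0, 0, 10, 6), (100, 100, 110, 110)]
print(score_page(gt, det, 0.5))   # coarse detections
print(score_page(gt, det, 0.75))  # fine detections
```

In this toy case the first detection overlaps its ground-truth region with IoU 0.6, so it counts as a coarse detection but not as a fine one.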
For each task that you want to submit results for, you will need to submit all .lg files. We recommend uploading a .zip file containing all .lg files for a task; multiple .zip and/or .lg files may be selected for upload. Log into your user account, and click on the Upload link at the top of the page. Then:
Please Note: computing results may take several minutes, depending on the server load, complexity of the task, etc. While results are being computed, a message will be displayed. Refresh the web page to check for when results have been updated.
Submission is the same as for previous tasks, except that for each subtask you upload a single .csv file, in the classification results format shown above, containing results for all symbols.
Submission is the same as for previous tasks. You need to submit results for all .csv files. We recommend uploading a .zip file containing all .csv files.
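Result files for a task can be bundled into a single .zip archive with Python's standard `zipfile` module. This is a minimal sketch: `zip_results`, the directory layout, and the choice to store files at the top level of the archive (rather than inside a folder) are assumptions, not requirements stated by the submission system.

```python
import zipfile
from pathlib import Path


def zip_results(result_dir, zip_path, suffix=".lg"):
    """Bundle every file ending in `suffix` under result_dir into zip_path.

    Files are stored by name only (no folder prefix); whether the server
    requires this flat layout is an assumption. Returns the stored names.
    """
    files = sorted(Path(result_dir).glob(f"*{suffix}"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, arcname=path.name)
    return [path.name for path in files]
```

For example, `zip_results("results/task1", "task1.zip")` would collect .lg files, and passing `suffix=".csv"` would collect .csv files instead (both paths here are placeholders). Comparing the returned list against the list of test files is a quick way to confirm that no result is missing.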
Once results have been computed, the results that are visible depend on the task that you submitted to; detailed results are not visible for the new test sets for the 'main' tasks. For the other official and validation tasks, we provide detailed error information computed using the LgEval and CROHMELib libraries, including:
Included are recall, precision and f-measure metrics, along with error metrics (e.g. false positives and false negatives for symbol segmentation). Recognition rates (i.e. percentage of GT targets recognized correctly) are represented by Recall. Percentage of correct classifications for correct detections is shown using Class/Det.
If you wish to update your results for a given task (e.g., after correcting bugs, or improving performance in some way), you may submit new results by repeating the process described in Step 3.
For each task, participants will be ranked by their submission with the highest recognition rate, after any corrections to ground truth have been made by the competition organizers. It is quite likely that corrections to ground truth will be found during or after the results submission period for the competition.
Participants may be asked to re-submit results to a 'new' task if ground-truth is updated.
From your participant account, you may download previously uploaded submissions, in case you accidentally lose track of them.
Confusion histograms simultaneously count and visualize errors from symbol segmentation, classification, and parsing at the stroke level. A description of confusion histograms and their contents may be found in the paper below. Errors are presented in decreasing order of frequency. For example, specific errors for segmenting and classifying the symbol 'x' written using one vs. two strokes may be easily seen using confusion histograms. We also provide confusion histograms for pairs of symbols with a spatial relationship in ground truth, such as 'x squared' and '2x'.
All files with errors may be identified by clicking on check boxes shown in the generated .html files, and then exporting the list of files as text using the button at the top of the page.
H. Mouchère, R. Zanibbi, U. Garain and C. Viard-Gaudin. (2016) Advancing the State-of-the-Art for Handwritten Math Recognition: The CROHME Competitions, 2011-2014. Int'l Journal on Document Analysis and Recognition, 19(2): 173-189. The penultimate version of this paper is available online: www.cs.rit.edu/~rlaz/publications.html.
If you have questions, please email competition organizers firstname.lastname@example.org :
We thank Yi Huang, who created the main design for this system, and implemented the system core for his MSc project. We also thank the systems administrators at RIT CS (James Craig and Sam Waters) for helping set up the web server.