[mrp-users] validation of submissions
Stephan Oepen
oe at ifi.uio.no
Tue Aug 4 17:45:14 CEST 2020
dear all:
there has been a steady stream of submissions to the MRP 2020 CodaLab
site these past few days, and we very much look forward to starting the
official scoring early next week!
we realize that the validation configured on the submission site is,
in a formal sense, overly strict. thus, we would like to assure all
participants that we will also consider submissions that fail
validation on CodaLab, though a lot will depend on why they fail.
internally we distinguish between basic, formal validation (e.g. valid
"source" and "target" identifiers on all edges) vs. more
in-depth, linguistic validation. it turns out that our current
validator at CodaLab runs both types; for example, you may be getting
negative feedback for multi-rooted UCCA graphs or multiple primary
edges to the same node. while these properties are undesirable
according to the UCCA guidelines, it may not always be so easy to
strictly enforce these constraints in a data-driven parser (save for
heuristic graph post-processing maybe).
graphs exhibiting these kinds of validation errors, however, are
formally well-formed and can be evaluated. as a rule of thumb,
validator feedback that is prefixed with a framework identifier
typically is of the second, linguistic type. for example:
validate(): [E] {UCCA} graph #552010; node #15: multiple roots in graph.
validate(): [E] {UCCA} graph #636013; node #20; edge 57 -20-> C: multiple primary parents for node.
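as an illustration of the rule of thumb above, here is a minimal sketch (not part of the official tooling) that sorts validator feedback into the two types by looking for a framework prefix in curly braces. the regular expression is inferred from the two sample messages only, and the function name classify() is hypothetical; actual validator output may take other shapes.

```python
import re

# Pattern inferred from the two sample messages above (an assumption):
# 'validate(): [LEVEL] {FRAMEWORK} message', where the '{FRAMEWORK}' part
# is optional and only present for framework-specific feedback.
FEEDBACK = re.compile(
    r"^validate\(\): \[(?P<level>\w)\] (?:\{(?P<framework>\w+)\} )?(?P<rest>.*)$"
)


def classify(line):
    """Return 'linguistic', 'formal', or None for a non-validator line."""
    match = FEEDBACK.match(line.strip())
    if match is None:
        return None
    # A framework identifier in curly braces suggests linguistic validation.
    return "linguistic" if match.group("framework") else "formal"
```

feedback classified as 'linguistic' by this heuristic would then, typically, not prevent a submission from being evaluated.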
to not leave room for uncertainty about which submission from your
team you want us to consider for the official evaluation, we will
share with all registered CodaLab users who have made at least one
submission a brief questionnaire this coming monday (i.e. at the end
of the evaluation period), asking you to point us to your final
submission (i.e. a CodaLab submission identifier and time stamp).
finally, please allow us to also send a reminder about the expected
format of submissions: your single-file collection of parser outputs
must be in valid MRP format, and it is critical that you provide (i.e.
preserve) complete meta-information for each graph from our parser
inputs, notably (at least) the "id", "framework", "language", and
"input" fields.
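for a quick local sanity check before submitting, something along the following lines may help. this is a rough sketch under stated assumptions, not the official validator: it assumes one JSON object per line, with "nodes" and "edges" lists where nodes carry an "id" and edges carry "source" and "target". the function names check_graph() and check_lines() are hypothetical.

```python
import json

# Meta-information fields named in this message.
REQUIRED = ("id", "framework", "language", "input")


def check_graph(graph):
    """Return a list of problems found in one graph (a dict)."""
    problems = [f"missing field: {key}" for key in REQUIRED if key not in graph]
    # Assumed layout: 'nodes' and 'edges' are lists of JSON objects.
    node_ids = {node["id"] for node in graph.get("nodes", []) if "id" in node}
    for i, edge in enumerate(graph.get("edges", [])):
        for end in ("source", "target"):
            if end not in edge or edge[end] not in node_ids:
                problems.append(f"edge {i}: invalid {end}: {edge.get(end)!r}")
    return problems


def check_lines(lines):
    """Yield (line number, problem) pairs for an iterable of MRP lines."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            graph = json.loads(line)
        except json.JSONDecodeError as error:
            yield number, f"invalid JSON: {error.msg}"
            continue
        if not isinstance(graph, dict):
            yield number, "not a JSON object"
            continue
        for problem in check_graph(graph):
            yield number, problem
```

since an open file is an iterable of lines, it can be passed to check_lines() directly; an empty result only means that none of these basic checks failed, not that the submission will pass validation on CodaLab.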
best wishes, and good luck wrapping it all up :-)! oe