[mrp-users] results from the MRP 2019 shared task

Thu Aug 8 23:59:18 CEST 2019

dear colleagues,

it is our pleasure to provide to you what we consider the release
candidate for final quantitative results from the 2019 shared task on
Cross-Framework Meaning Representation Parsing.  please see:

http://bit.ly/cfmrp19

in total, there were 16 participating teams, plus another two teams
outside the primary ranking because of involvement by task
co-organizers.

well-formed submissions by 14 teams were received within the evaluation
period and are considered for the primary ranking.  one additional
team submitted their parser outputs after the deadline, and one other
team had to re-submit because their original file was unreadable;
these submissions are included in the result overview but not
considered for the primary ranking.  likewise, two teams discovered
systematic errors in their original submissions and re-submitted
corrected graphs after the deadline; these too are reported as
post-evaluation submissions.

finally, three ‘unofficial’ submissions by task co-organizers provide
(a) baseline results across all frameworks from the TUPA multi-task
parser (Hershcovich et al., 2018) and (b) a point of reference for the
DM and EDS frameworks, from parsing using the (not white-listed)
English Resource Grammar (ERG; Flickinger et al., 2016).  these scores
are shown in the top rows of the results table.

the on-line spreadsheet is organized into sub-sheets by metric and
framework.  overall results (macro-averaged MRP F1 across all
frameworks) are summarized in the sheet labeled ‘MRP’: cells AE9
through AE35 show the primary ranking of on-time submissions, where in
each case the row using ‘all’ evaluation data shows the official
results.  the additional LPPS evaluation scores (on a 100-sentence
sample from ‘the little prince’) serve two auxiliary purposes: the
same sentences are annotated in all frameworks, and we will likely
share the gold-standard graphs with participants next week for further
diagnostics.
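
as a rough illustration of what ‘macro-averaged’ means for the overall
score, the sketch below takes the unweighted mean of per-framework F1
values.  it is not the mtool implementation; the function name and the
example scores are hypothetical, and we assume one F1 value is
available for each of the five frameworks.

```python
from statistics import mean

# The five MRP 2019 target frameworks.
FRAMEWORKS = ("DM", "PSD", "EDS", "UCCA", "AMR")


def macro_average_f1(f1_by_framework):
    """Return the unweighted mean of per-framework F1 scores.

    Hypothetical helper for illustration only: every framework
    contributes equally, regardless of how many graphs it has.
    Raises KeyError if a framework is missing from the input.
    """
    return mean(f1_by_framework[name] for name in FRAMEWORKS)


# Made-up scores, not taken from the results spreadsheet.
scores = {"DM": 0.90, "PSD": 0.80, "EDS": 0.85, "UCCA": 0.70, "AMR": 0.75}
print(round(macro_average_f1(scores), 4))
```

because the average is unweighted, a weak score on one framework
lowers the overall figure just as much as it would on any other,
independent of the size of that framework's evaluation data.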

we would like to use another few days for final quality control before
announcing these results publicly.  please hold off on linking to our
spreadsheet or posting about it until monday next week, august 12.  in
the meantime, in case you discover anything surprising in these
results (or simply would like to change the team identifiers we have
suggested in the first two sub-sheets), please do get in touch with us
as soon as possible!

in a separate email, we will provide the developers of participating
submissions with download links to (a) per-sentence scoring results
(the JSON files generated by mtool) and (b) a starter package for
preparing a system description paper.  we are now moving towards
communication with just those participants who ended up making a
submission and, thus, are invited to publish a system description in
the shared task proceedings and present their work at CoNLL in early
november.

earlier today, we established a new mailing list
‘mrp-participants at nlpl.eu’ for follow-up analysis and working towards
the proceedings.  everyone associated with a submission should already
have received a notification of subscription to that list an hour or
so ago.  if you are expecting to contribute to a system description
paper but are unsure whether you are subscribed to the new
‘mrp-participants’ list, please send us a quick note immediately.
likewise, any team members and co-authors who did not register for
CodaLab or respond to our post-evaluation survey still need to be
added to the mailing list: please make sure to let us know about
anyone who should receive communication about system description
papers, and please indicate your submission identifier and team name
(in columns A and B from our spreadsheet above) in all such email.

once again, we are grateful for your interest in MRP 2019!

stephan oepen, omri abend, jan hajič, daniel hershcovich,
marco kuhlmann, tim o'gorman, and nianwen xue