[mrp-users] end of MRP 2019 evaluation phase; next steps

Thu Jul 25 12:31:45 CEST 2019

dear colleagues,

the submission deadline for parser outputs in the MRP 2019 shared task
has now expired.  we are grateful to everyone who participated and
acknowledge the hard work that has gone into system development and
submission of semantic graphs for evaluation over the past many weeks
and months!

it appears we have received 51 submissions from 16 distinct teams.  we
will now start the (somewhat tedious, i am afraid) process of
extracting the parser outputs and metadata from CodaLab and computing
official scores for the task: in the MRP metric, in several variants
focussing on sub-sets of information, and of course in the traditional
framework-specific metrics.  unless you tell us otherwise (in email,
to the ‘mrp-organizers’ address), we will in all cases consider the
most recent successful submission from each team as your official
entry to the shared task.

some of you have already indicated that you might want additional runs
(e.g. reflecting different system parameterizations) evaluated, and we
will try to accommodate that.  the CodaLab site remains open for
submissions in a new ‘post-evaluation’ phase, and we welcome
additional submissions of parser outputs there.  for the next week or
so, we will however prioritize evaluation of the official submissions
received within the evaluation period.  sometime this fall, we will make
available all data, including submissions and the evaluation
gold-standards, to all participants.

we have budgeted one week to prepare and validate official scores,
which means you should expect to hear from us about official results
by thursday next week, august 1, 2019.  in parallel, we will initiate
the process towards a proceedings volume for the shared task, which
means that each participating team will be invited to prepare a system
description paper, with a submission deadline of september 2, 2019.

system descriptions will be reviewed by a pool comprising task
participants and external experts; in other words, we will ask authors
of submitted papers to review two or three system description papers
from other teams.  reviewing will be single-blind: the system
descriptions need not be anonymized, but reviewer identities will be
anonymous.  please monitor the task web site for the schedule going
forward and for future updates.

thanks again, everyone, for your involvement!

stephan oepen, omri abend, jan hajič, daniel hershcovich,
marco kuhlmann, tim o'gorman, and nianwen xue