[epe-users] preliminary EPE 2017 results

Thu Jul 20 00:34:35 CEST 2017

dear colleagues,

> please see the following link for a preliminary summary:
>
>   https://goo.gl/ZTZxXW

with most of us EPE co-organizers on vacation these past few weeks, it
has taken a little longer to (a) fill in the missing end-to-end
negation scores and (b) double-check and in a few cases re-run
downstream scores.

we are inclined to consider the on-line spreadsheet accessible to
everyone at the above address the final quantitative result summary
from the EPE 2017 task; we plan to make these results public in the
next few days.  unless, of course, we discover remaining errors :-).
in case you notice anything surprising, please let us know!

for some of your submissions, numerical scores may differ slightly from
the version that we had previously shared.  this should primarily be the
case for runs providing multiple part-of-speech values on dependency
nodes, e.g. UPOS and XPOS: as mentioned earlier, our downstream
systems need to pick one of the available fields, and we have now
optimized that choice as follows: for each team, we pick the property
that yields the highest average performance across all runs (from that
team) on the development data.  in most cases, this optimization
leaves the systems using XPOS, with the ECNU submissions for event
extraction as an exception.  overall, the differences between the PoS
variants are rarely large.
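
to make the selection rule concrete, here is a minimal sketch in python
(not the actual EPE tooling).  the function name, the tuple layout, and
the team names and scores in the example are all hypothetical and only
serve as illustration; the sketch assumes one development score per
team, run, and PoS field.

```python
from collections import defaultdict
from statistics import mean


def pick_pos_field(dev_scores):
    """Pick one PoS field per team (hypothetical helper, not EPE code).

    dev_scores: iterable of (team, run, field, score) tuples holding
    development-set scores.  Returns a dict mapping each team to the
    field with the highest score averaged over all runs of that team.
    """
    by_team = defaultdict(lambda: defaultdict(list))
    for team, _run, field, score in dev_scores:
        by_team[team][field].append(score)
    return {
        team: max(fields, key=lambda f: mean(fields[f]))
        for team, fields in by_team.items()
    }


# made-up scores, for illustration only
example = [
    ("team-a", 0, "upos", 60.0), ("team-a", 0, "xpos", 62.0),
    ("team-a", 1, "upos", 61.0), ("team-a", 1, "xpos", 61.5),
    ("team-b", 0, "upos", 58.0), ("team-b", 0, "xpos", 57.0),
]
choice = pick_pos_field(example)
```

note that the choice is made once per team, not per run, so an
individual run may end up scored with the field that is not its own
best.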

with numerical results finalized, the scientifically interesting
question of course remains: what do we actually see in this wealth of
end-to-end empirical results?  we hope to use the process of writing,
reviewing, publishing, and presenting system descriptions to start a
collective interpretation process.  we are still discussing the exact
mechanics of this process, but we hope to maximize involvement of EPE
2017 participants.  for example, we will try to make time for a
pre-publication round of feedback by everyone on the complete set of
task, application, and system summaries.  we have posted some
high-level guidance on the EPE web site, but will get back to you
before the end of the month with more specific process (and schedule)
information:

  http://epe.nlpl.eu/index.php?page=5

to aid the interpretation of empirical results, we are currently
preparing an open-source release of the downstream systems, submitted
parser outputs, gold-standard evaluation data (where available),
downstream system results, official score files, and complete logs, to
allow everyone to inspect evaluation results in full detail and to
re-run the complete end-to-end pipelines.  for our negation analysis
downstream application, please find the system results, scores, and
log files on-line:

  http://svn.nlpl.eu/epe/2017/public/negation.tgz

we hope to publish similar packages for the other two downstream
applications by the end of this week and will then link everything
from the EPE 2017 web site.

best wishes, oe