[mrp-users] fine-tuning the scorer

Stephan Oepen oe at ifi.uio.no
Fri Jul 5 01:29:50 CEST 2019


dear colleagues,

as we are approaching the evaluation period for the task, we had
thought earlier this week that our reference implementation of the
official MRP metric should be considered stable now.  however, we have
decided to release a few revisions to the scorer today and suggest
that you update (when it is convenient) from the mtool repository on
Microsoft GitHub:

+ a bug fix for normalization on inverted edges in AMR graphs: if one
graph contained, say, an ‘ARG0’ edge where the corresponding edge in
the other graph was its inverse, ‘ARG0-of’, these were not always
considered equivalent until yesterday.  in general, we do not advise
that parsers output inverted edges when serializing in the MRP format.

+ an increased default --limit for the SMATCH metric (previously 5;
now 20): this will increase (SMATCH) running time but also make it
more likely to arrive at the overall optimal solution and, thus,
reduce non-determinism in SMATCH scoring.

+ a reduced default limit for the hill-climbing–based initialization
of the MRP metric (previously 50; now 20): this will decrease (MRP)
running time but hopefully not reduce overall scores.  note, however,
that initialization by hill-climbing potentially introduces
non-determinism, in cases where the MCES limit inhibits complete
exploration of the search space.

+ a general prohibition of initialization by hill-climbing when
scoring UCCA graphs (it has always been off for DM and PSD graphs):
this is because (our current implementation of) the hill-climbing
search is not informed by the structural constraints (dominance
relations in UCCA, sequential order for DM and PSD) that apply in the
MRP metric.

+ for more fine-grained control of the efficiency–reliability
trade-off in the MRP scorer with initialization by hill-climbing, the
‘--limit’ command line option can now be specified as a pair of limits
(e.g. ‘5:100000’ for much faster scoring), where a value of 0 for
either of the two limits will disable the corresponding component in
the MRP search.  please see:

  https://github.com/cfmrp/mtool#scoring
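
to illustrate the pair form of ‘--limit’ from the last item above, here is a minimal sketch of how such a value could be interpreted.  this is not the mtool code: the names ‘Limits’, ‘parse_limit_pair’, and ‘enabled_components’ are hypothetical, and the sketch assumes that the first value bounds the hill-climbing initialization and the second bounds the MCES search (please check the README linked above for the actual order).

```python
from typing import NamedTuple


class Limits(NamedTuple):
    # assumption: first value = hill-climbing limit, second = MCES limit
    hill_climbing: int
    mces: int


def parse_limit_pair(text):
    """Parse a 'first:second' limit pair such as '5:100000'."""
    first, sep, second = text.partition(":")
    if not sep:
        raise ValueError("expected a pair of limits like '5:100000'")
    limits = Limits(int(first), int(second))
    if limits.hill_climbing < 0 or limits.mces < 0:
        raise ValueError("limits must be non-negative")
    return limits


def enabled_components(limits):
    """A limit of 0 disables the corresponding search component."""
    return {
        "hill_climbing": limits.hill_climbing > 0,
        "mces": limits.mces > 0,
    }


if __name__ == "__main__":
    limits = parse_limit_pair("5:100000")
    print(limits, enabled_components(limits))
```

under these assumptions, ‘0:100000’ would switch off the hill-climbing initialization and leave only the MCES search active.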
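
regarding the first item above (inverted edges in AMR graphs), the following is a minimal sketch of the kind of normalization involved, not the actual mtool implementation.  the function name ‘normalize_edge’ and the exception set are assumptions made for illustration: some AMR roles, such as ‘consist-of’, end in ‘-of’ without being inverses, so a real implementation needs some list of exceptions.

```python
# assumption: roles that end in '-of' but are not inverted edges;
# this set is illustrative and may be incomplete
NOT_INVERTED = {"consist-of", "prep-out-of", "prep-on-behalf-of"}

SUFFIX = "-of"


def normalize_edge(source, target, label):
    """Return (source, target, label) with an '-of' inversion undone.

    An edge x --ARG0-of--> y is rewritten as y --ARG0--> x, so that both
    serializations of the same relation compare as equal.
    """
    if (label.endswith(SUFFIX)
            and len(label) > len(SUFFIX)
            and label not in NOT_INVERTED):
        return target, source, label[:-len(SUFFIX)]
    return source, target, label


if __name__ == "__main__":
    gold = normalize_edge(0, 1, "ARG0")
    system = normalize_edge(1, 0, "ARG0-of")
    print(gold == system)
```

after normalization, the two edges in the usage example are identical triples, which is what lets a scorer treat them as a match.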

happy scoring!

stephan oepen (for the MRP co-organizers)

ps: for further technical background, please also consider:

  https://github.com/cfmrp/mtool/issues/52

pps: we are actively working to increase the efficiency of the MRP
scorer (without affecting results) and have gratefully incorporated
some code optimizations already.  this work, for now, proceeds on a
separate ‘dev’ branch.  we will email again if and when we feel
participants would be well-served by switching to a newer version of
mtool.


