<div class="">Hi,</div>
<div class=""><br class="">
</div>
<div class="">Postponing to Dec 4 is fine with me even though 15 CET is not ideal but I could manage.</div>
<div class="">The alternative (do you mean Friday Dec 11) could also work but only at 10 CET.</div>
<br class="">
<div class="">
<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
<div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
<div>All the best,</div>
<div>Jörg</div>
<div><br class="">
</div>
<div>*****************************************************************<br class="">
Jörg Tiedemann<br class="">
Language Technology <a href="https://blogs.helsinki.fi/language-technology/" class="">https://blogs.helsinki.fi/language-technology/</a><br class="">
University of Helsinki</div>
</div>
</div>
</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">On 26. Nov 2020, at 17.12, Stephan Oepen <<a href="mailto:oe@ifi.uio.no" class="">oe@ifi.uio.no</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">colleagues,<br class="">
<br class="">
we have an internal EOSC-Nordic meeting scheduled for tomorrow, but<br class="">
there is a conflicting seminar presentation that i would like to<br class="">
attend (see below; feel free to zoom in if you are interested :-).<br class="">
<br class="">
any chance we could postpone to e.g. 15:00 CET next friday (december<br class="">
4), or sometime before 14:00 CET the following week (december 10)?<br class="">
<br class="">
with apologies for the short notice! oe<br class="">
<br class="">
<br class="">
---------- Forwarded message ---------<br class="">
From: Milena Pavlovic <<a href="mailto:milenpa@student.matnat.uio.no" class="">milenpa@student.matnat.uio.no</a>><br class="">
Date: Fri, Nov 13, 2020 at 10:44 AM<br class="">
Subject: [MLS research seminar] Hopfield Networks is All You Need -<br class="">
November 27 at 14.00<br class="">
To: <a href="mailto:mls-research-seminar@ifi.uio.no" class="">mls-research-seminar@ifi.uio.no</a> <<a href="mailto:mls-research-seminar@ifi.uio.no" class="">mls-research-seminar@ifi.uio.no</a>><br class="">
Cc: <a href="mailto:ramsauer@ml.jku.at" class="">ramsauer@ml.jku.at</a> <<a href="mailto:ramsauer@ml.jku.at" class="">ramsauer@ml.jku.at</a>><br class="">
<br class="">
<br class="">
Dear all,<br class="">
<br class="">
The next MLS research seminar will be on Friday, November 27 at 14.00<br class="">
on Zoom (meeting details below). Hubert Ramsauer from the Institute<br class="">
for Machine Learning at the Johannes Kepler University Linz will give<br class="">
a talk titled “Hopfield Networks is All You Need”.<br class="">
<br class="">
Abstract: The transformer and BERT models pushed the performance on<br class="">
NLP tasks to new levels via their attention mechanism. We show that<br class="">
this attention mechanism is the update rule of a modern Hopfield<br class="">
network with continuous states. This new Hopfield network can store<br class="">
exponentially (with the dimension) many patterns, converges with one<br class="">
update, and has exponentially small retrieval errors. The number of<br class="">
stored patterns must be traded off against convergence speed and<br class="">
retrieval error. The new Hopfield network has three types of energy<br class="">
minima (fixed points of the update): (1) global fixed point averaging<br class="">
over all patterns, (2) metastable states averaging over a subset of<br class="">
patterns, and (3) fixed points which store a single pattern.<br class="">
Transformers learn an attention mechanism by constructing an embedding<br class="">
of patterns and queries into an associative space. Transformer and<br class="">
BERT models operate in their first layers preferably in the global<br class="">
averaging regime, while they operate in higher layers in metastable<br class="">
states. The gradient in transformers is maximal in the regime of<br class="">
metastable states, is uniformly distributed when averaging globally,<br class="">
and vanishes when a fixed point is near a stored pattern. Based on the<br class="">
Hopfield network interpretation, we analyzed learning of transformer<br class="">
and BERT architectures. Learning starts with attention heads that<br class="">
average and then most of them switch to metastable states. However,<br class="">
the majority of heads in the first layers still averages and can be<br class="">
replaced by averaging operations like the Gaussian weighting that we<br class="">
propose. In contrast, heads in the last layers steadily learn and seem<br class="">
to use metastable states to collect information created in lower<br class="">
layers. These heads seem to be a promising target for improving<br class="">
transformers. Neural networks that integrate Hopfield networks, that<br class="">
are equivalent to attention heads, outperform other methods on immune<br class="">
repertoire classification, where the Hopfield net stores several<br class="">
hundreds of thousands of patterns. We provide a new PyTorch layer<br class="">
called “Hopfield” which allows to equip deep learning architectures<br class="">
with modern Hopfield networks as new powerful concept comprising<br class="">
pooling, memory, and attention. The implementation is available at:<br class="">
<a href="https://github.com/ml-jku/hopfield-layers" class="">https://github.com/ml-jku/hopfield-layers</a>.<br class="">
<br class="">
The full paper is available at this link.<br class="">
<br class="">
Looking forward to seeing you all at the seminar!<br class="">
<br class="">
Kind regards,<br class="">
Milena<br class="">
<br class="">
<br class="">
<br class="">
Zoom details:<br class="">
<br class="">
<a href="https://uio.zoom.us/j/67683473454?pwd=YUNyZWhRTDZMdjRvWVkxTWRWdHdmQT09" class="">https://uio.zoom.us/j/67683473454?pwd=YUNyZWhRTDZMdjRvWVkxTWRWdHdmQT09</a><br class="">
<br class="">
Meeting ID: 676 8347 3454<br class="">
Passcode: 096069<br class="">
<br class="">
Documentation on how to use Zoom can be found here:<br class="">
https://www.uio.no/english/services/it/phone-chat-videoconf/zoom/<br class="">
<br class="">