[NLPL Task Force (A)] Fwd: [MLS research seminar] Hopfield Networks is All You Need - November 27 at 14.00

Andrey Kutuzov andreku at ifi.uio.no
Thu Nov 26 15:21:04 UTC 2020


Hi,

Both options are OK for me.

On 26.11.2020 16:12, Stephan Oepen wrote:
> colleagues,
> 
> we have an internal EOSC-Nordic meeting scheduled for tomorrow, but
> there is a conflicting seminar presentation that i would like to
> attend (see below; feel free to zoom in if you are interested :-).
> 
> any chance we could postpone to e.g. 15:00 CET next friday (december
> 4), or sometime before 14:00 CET the following week (december 10)?
> 
> with apologies for the short notice!  oe
> 
> 
> ---------- Forwarded message ---------
> From: Milena Pavlovic <milenpa at student.matnat.uio.no>
> Date: Fri, Nov 13, 2020 at 10:44 AM
> Subject: [MLS research seminar] Hopfield Networks is All You Need -
> November 27 at 14.00
> To: mls-research-seminar at ifi.uio.no <mls-research-seminar at ifi.uio.no>
> Cc: ramsauer at ml.jku.at <ramsauer at ml.jku.at>
> 
> 
> Dear all,
> 
> The next MLS research seminar will be on Friday, November 27 at 14.00
> on Zoom (meeting details below). Hubert Ramsauer from the Institute
> for Machine Learning at the Johannes Kepler University Linz will give
> a talk titled “Hopfield Networks is All You Need”.
> 
> Abstract: The transformer and BERT models pushed the performance on
> NLP tasks to new levels via their attention mechanism. We show that
> this attention mechanism is the update rule of a modern Hopfield
> network with continuous states. This new Hopfield network can store
> exponentially many patterns (in the dimension), converges in one
> update, and has exponentially small retrieval errors. The number of
> stored patterns must be traded off against convergence speed and
> retrieval error. The new Hopfield network has three types of energy
> minima (fixed points of the update): (1) global fixed point averaging
> over all patterns, (2) metastable states averaging over a subset of
> patterns, and (3) fixed points which store a single pattern.
> Transformers learn an attention mechanism by constructing an embedding
> of patterns and queries into an associative space. Transformer and
> BERT models operate predominantly in the global averaging regime in
> their first layers, while in higher layers they operate in metastable
> states. The gradient in transformers is maximal in the regime of
> metastable states, is uniformly distributed when averaging globally,
> and vanishes when a fixed point is near a stored pattern. Based on the
> Hopfield network interpretation, we analyzed how transformer and BERT
> architectures learn. Learning starts with attention heads that
> average, and then most of them switch to metastable states. However,
> the majority of heads in the first layers still average and can be
> replaced by averaging operations such as the Gaussian weighting that
> we propose. In contrast, heads in the last layers steadily learn and seem
> to use metastable states to collect information created in lower
> layers. These heads seem to be a promising target for improving
> transformers. Neural networks that integrate Hopfield networks (which
> are equivalent to attention heads) outperform other methods on immune
> repertoire classification, where the Hopfield net stores several
> hundred thousand patterns. We provide a new PyTorch layer called
> “Hopfield” that allows deep learning architectures to be equipped
> with modern Hopfield networks as a new powerful concept comprising
> pooling, memory, and attention. The implementation is available at:
> https://github.com/ml-jku/hopfield-layers.
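> 
> As a quick illustration of the claimed equivalence: the update rule of
> the modern Hopfield network is xi_new = X softmax(beta * X^T xi),
> which for beta = 1/sqrt(d) is exactly the scaled softmax attention of
> transformers. Below is a minimal NumPy sketch (not part of the
> announcement; the pattern matrix X and the noisy query xi are
> illustrative) showing one-step retrieval of a stored pattern:
> 
>     import numpy as np
> 
>     def softmax(z):
>         z = z - z.max()               # shift for numerical stability
>         e = np.exp(z)
>         return e / e.sum()
> 
>     rng = np.random.default_rng(0)
>     d, N = 64, 1000                   # pattern dimension, number of stored patterns
>     X = rng.standard_normal((d, N))   # columns are the stored patterns
>     xi = X[:, 0] + 0.1 * rng.standard_normal(d)   # noisy query near pattern 0
> 
>     beta = 1.0 / np.sqrt(d)           # inverse temperature; matches attention scaling
>     xi_new = X @ softmax(beta * (X.T @ xi))       # one Hopfield update = one attention step
> 
>     # A single update already moves the query towards the stored pattern;
>     # larger beta gives sharper (lower-error) retrieval.
>     print(np.argmax(X.T @ xi_new))    # expected: 0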
> 
> The full paper is available at https://arxiv.org/abs/2008.02217.
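> 
> For readers who want to try the layer itself, a minimal usage sketch
> follows. It is based on my reading of the repository's README: the
> package name hflayers, the Hopfield class, and the input_size argument
> are assumptions that should be verified against the repo.
> 
>     import torch
>     from hflayers import Hopfield     # assumed import path, per the repo's README
> 
>     # Hopfield association over sets of 64-dimensional patterns
>     hopfield = Hopfield(input_size=64)   # constructor argument is an assumption
>     patterns = torch.randn(1, 10, 64)    # (batch, number of patterns, dimension)
>     out = hopfield(patterns)             # attention-style associative update
>     print(out.shape)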
> 
> Looking forward to seeing you all at the seminar!
> 
> Kind regards,
> Milena
> 
> 
> 
> Zoom details:
> 
> https://uio.zoom.us/j/67683473454?pwd=YUNyZWhRTDZMdjRvWVkxTWRWdHdmQT09
> 
> Meeting ID: 676 8347 3454
> Passcode: 096069
> 
> Documentation on how to use Zoom can be found here:
> https://www.uio.no/english/services/it/phone-chat-videoconf/zoom/
> 


-- 
Andrey
Language Technology Group (LTG)
University of Oslo


