[NLPL Task Force (A)] CSC - Call for Pilots

Tiedemann, Jörg jorg.tiedemann at helsinki.fi
Thu Mar 7 15:59:17 UTC 2019


I wrote our proposal on the ferry from Stockholm to Helsinki, directly in that online form as well. It's about training highly multilingual NMT models on all of Opus.

Jörg

On 7 Mar 2019, at 11.58, Filip Ginter <figint at utu.fi> wrote:

:D we typed ours straight into the submission form, clicking the link now gives me the attached screenshot. :D Basically, there was next to nothing about the kind of directory structure / infrastructure / access stuff you mention. What we asked for was about 50K GPU hours to train Finnish BERT as the primary target, and the rest of the proposal went into some detail on why doing something like this would be important and impactful.

F




<image.png>

On Thu, Mar 7, 2019 at 11:48 AM Stephan Oepen <oe at ifi.uio.no> wrote:
could you imagine sharing your proposals?  maybe there is a way to consolidate into one ‘umbrella’ activity, which could represent NLPL at large?  part of my motivation would be to start early with getting our project directory, software and data, and access mechanisms in place.  at the same time, i believe there is current work on ELMo training both at uppsala and oslo ... so i imagine exchanging notes could be beneficial to everyone :-).

oe


On Thu, 7 Mar 2019 at 10:43 Filip Ginter <figint at utu.fi> wrote:
Hi

Not sure about how strict the deadline is. On the upside, both Turku and Helsinki submitted one proposal. We did mention NLPL in our proposal, and I'd venture a guess Jörg mentioned NLPL in his as well. Training fancy modern language models does sound like one of these two proposals. ;)

F

On Thu, Mar 7, 2019 at 11:27 AM Stephan Oepen <stephan.oepen at gmail.com> wrote:
colleagues,

even though the deadline for the call below is closed, i wonder whether we should try to get NLPL enrolled as a pilot user for the new CSC system, in particular its AI partition.  i imagine the infrastructure task force could help with software installations (e.g. PyTorch, AllenNLP, OpenNMT).  could we make use of up to 300 V100 gpus for a month or two?  train ELMo embeddings on some of our multilingual corpora?

https://research.csc.fi/call-for-pilots

cheers, oe


-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 11995 bytes
Desc: image.png
URL: <http://lists.nlpl.eu/archives/infrastructure/attachments/20190307/3fb3f40b/attachment.png>

