[NLPL Task Force (A)] rolling your own BERT (and maybe ELMo) on Saga

Filip Ginter figint at utu.fi
Tue Jan 28 21:17:18 UTC 2020


Hi

Just to confirm that we're counting on Andrei doing the ELMo stuff. Great!

Our overall plan, btw, is to try to give people enough information to start
working on their own BERT based on the OSCAR dataset. Some data cleanup
scripts will be provided too. I also plan to spend a while presenting the
results which led to us training our own BERT, and some of the current
results we have using it.
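(The cleanup scripts themselves aren't published yet; purely as an illustration of the kind of filtering typically applied to raw OSCAR text before pretraining, a minimal sketch, not the actual scripts, might look like this:)

```python
# Illustrative sketch only -- NOT the cleanup scripts mentioned above.
# Typical pre-BERT filtering on raw web text: normalize whitespace,
# drop very short fragments, and deduplicate lines.

def clean_corpus(lines, min_chars=20):
    """Return deduplicated, whitespace-normalized lines of at least min_chars."""
    seen = set()
    cleaned = []
    for line in lines:
        line = " ".join(line.split())  # collapse runs of whitespace
        if len(line) < min_chars or line in seen:
            continue
        seen.add(line)
        cleaned.append(line)
    return cleaned

raw = [
    "Short.",
    "This  sentence   has   irregular   spacing in it.",
    "This sentence has irregular spacing in it.",  # duplicate after normalization
    "A reasonably long sentence that survives the length filter.",
]
print(clean_corpus(raw))
```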

Things will unfortunately come together in a bit of a panic, because we
have a big shared task deadline in 12 days now and much of our time sinks
into that. :-|

Andrei, what are your plans for the ELMo part?

F


On Mon, Jan 27, 2020 at 5:11 PM Antti Virtanen <sajvir at utu.fi> wrote:

> Here's a (quick and dirty) repo for the code we used to train FinBERT:
> https://github.com/haamis/DeepLearningExamples_FinBERT/tree/master/TensorFlow/LanguageModeling/BERT_nonscaling.
> This one has the sbatch files used:
> https://github.com/haamis/BERT-pretraining
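(For orientation, a Horovod launch under Slurm generally looks something like the sketch below. The GPU counts, paths, and script arguments here are assumptions for illustration; see the sbatch files in the BERT-pretraining repo above for the real invocation. The module name is the Puhti one from later in this thread; Saga's module names differ.)

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=3      # one task (Horovod rank) per GPU
#SBATCH --gres=gpu:3             # hypothetical: 3 GPUs per node, 6 total
#SBATCH --time=72:00:00

module load tensorflow/1.13.1-hvd   # Puhti module with TF 1.13 + Horovod 0.16.4

# srun starts one process per task; Horovod picks up the MPI environment.
# Script name and flags below are illustrative, not the exact FinBERT ones.
srun python run_pretraining.py \
    --input_file=/path/to/tfrecords \
    --output_dir=/path/to/checkpoints \
    --do_train=True \
    --horovod
```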
>
> -Antti
>
>
> ------------------------------
> *From:* Stephan Oepen <oe at ifi.uio.no>
> *Sent:* Monday, January 27, 2020 4:48 PM
> *To:* Antti Virtanen
> *Cc:* Andrei Kutuzov; Filip Ginter; infrastructure
> *Subject:* Re: rolling your own BERT (and maybe ELMo) on Saga
>
> thanks, antti!  we had previously put NLPL versions of TF and Horovod on
> Saga, so possibly the trickier parts actually are already in place :-).
>  once you have something resembling a sample invocation (preferably on a
> smallish test case, say targeting 6 gpus on two nodes), i will be eager to
> test for you!  we have Puhti access too, so if need be i can try there or
> look up specific version numbers ...
>
> cheers, oe
>
>
> On Mon, 27 Jan 2020 at 15:39 Antti Virtanen <sajvir at utu.fi> wrote:
>
>> Hi,
>>
>> We used the tensorflow/1.13.1-hvd module on Puhti. As you might figure
>> out from the name, it includes TensorFlow 1.13 and Horovod 0.16.4, plus any
>> dependencies those have (https://docs.csc.fi/apps/tensorflow/). I can
>> give you a list of packages in that module from Puhti if you wish. Also
>> worth noting: we had to create symlinks in the code directory to the
>> CUDA files `libdevice.10.bc` and `ptxas` to get XLA working correctly,
>> although I believe this is due to Puhti's environment being
>> misconfigured.
>>
>> -Antti
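(The symlink workaround Antti describes would look roughly like the sketch below. The CUDA_HOME default is an assumption; locate the real paths on your cluster, e.g. via `module show cuda`.)

```shell
# Sketch of the XLA workaround: expose libdevice.10.bc and ptxas in the
# directory the training code runs from. Paths are assumptions -- adjust
# CUDA_HOME to your cluster's CUDA installation.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
ln -sfn "$CUDA_HOME/nvvm/libdevice/libdevice.10.bc" libdevice.10.bc
ln -sfn "$CUDA_HOME/bin/ptxas" ptxas
```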
>> ________________________________________
>> From: Andrei Kutuzov <andreku at ifi.uio.no>
>> Sent: Monday, January 27, 2020 4:15 PM
>> To: Stephan Oepen
>> Cc: Antti Virtanen; Filip Ginter; infrastructure
>> Subject: Re: rolling your own BERT (and maybe ELMo) on Saga
>>
>> No, I tried only multiple GPUs (up to 4) within the same node.
>>
>> 27.01.2020 15:14, Stephan Oepen wrote:
>> > across multiple nodes?  oe
>> >
>> >
>> > On Mon, 27 Jan 2020 at 15:07 Andrei Kutuzov <andreku at ifi.uio.no
>> > <mailto:andreku at ifi.uio.no>> wrote:
>> >
>> >     27.01.2020 14:29, Stephan Oepen wrote:
>> >     >> Antti can tell you about the exact GPU setup needed. We will run the
>> >     tutorial on Puhti since this is a tried and tested environment for
>> >     us, and we have little time to prepare, so we play it safe. But
>> >     Antti can tell you what it takes to run the BERT code.
>> >     > yes, if possible, i could see myself trying to replicate your
>> >     > software environment on Saga ... the multi-gpu part sounds like an
>> >     > interesting new challenge :-)!
>> >     Hi all,
>> >
>> >     Well, at least TensorFlow has no problems with multi-GPU training on
>> >     Saga; it works more or less out of the box.
>> >
>> >
>> >     --
>> >     Andrei
>> >     PhD Candidate at Language Technology Group (LTG)
>> >     University of Oslo
>> >
>>
>>
>> --
>> Andrei
>> PhD Candidate at Language Technology Group (LTG)
>> University of Oslo
>>
>
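(Andrei's note above, that multi-GPU TensorFlow training works more or less out of the box on a Saga node, corresponds to TensorFlow's built-in data-parallel strategy. A minimal single-node sketch in modern TF 2 style follows; note the FinBERT run itself used TF 1.13 with Horovod, so this is an illustration, not their setup:)

```python
import numpy as np
import tensorflow as tf

# MirroredStrategy replicates the model across all visible GPUs on one node
# and averages gradients; with no GPUs it simply falls back to the CPU.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Tiny synthetic batch just to show the training loop runs under the strategy.
x = np.random.rand(32, 4).astype("float32")
y = np.random.rand(32, 1).astype("float32")
history = model.fit(x, y, epochs=1, batch_size=8, verbose=0)
```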

