[NLPL Task Force (A)] outline for LUMI challenge

Andrey Kutuzov andreku at ifi.uio.no
Sat Nov 7 14:16:13 UTC 2020


Hi,

Yes, I can definitely put all the instructions on this wiki page after
my PhD defense (November 13).
The code and data are ready; we just have to decide where to actually
keep them. Should we use MS GitHub, GitLab, CodeRefinery, UiO GitHub, or...?

My toy dataset is a Wikipedia sample, so yes, we can simply make
it available from the NLPL server.

07.11.2020 11:17, Stephan Oepen wrote:
> hi andrey:
> 
> i constructed a skeletal page for how we could present the NLPL
> pre-training setup to users, including the LUMI preparatory work:
> 
> http://wiki.nlpl.eu/index.php/Eosc/pretraining/nvidia
> 
> my preference would be for all necessary software to be installed
> using EasyBuild, from our NLPL repository of easyconfigs.  once we
> have the (automated installation of the) full stack working on Saga, i
> expect we will move on to validating the installation on either Puhti
> or eX3.  but i would be prepared to already present the challenge to
> the LUMI folks at that point, hopefully sometime in the course of
> november?
> 
> regarding the sample data, we should discuss where to host that.
> maybe just make it available for public download from an NLPL server?
> what about data preparation?  ideally we would also document the steps
> and tools involved there, maybe as a separate wiki page and associated
> repository for scripts and such?
> 
> cheers, oe
> 


-- 
Andrey
PhD Candidate at Language Technology Group (LTG)
University of Oslo


