[NLPL Infrastructure] NLP analysis using the NIRD Toolkit
Stephan Oepen
oe at ifi.uio.no
Thu Jan 14 23:37:45 UTC 2021
hi lorand, and belatedly all the best for the new year!
i am copying my colleagues at helsinki university (joerg tiedemann and
alessandro raganato), who also participate in the NLPL use case in
EOSC-Nordic.
> Since the NS9052K project has been running on the NIRD Service Platform back in 2020, using JupyterHub and DL tools, was wondering if you could share your experience with the NIRD Service Platform [1] and the NIRD Toolkit [2]. Did you manage to test the capabilities of the NIRD Toolkit?
i am a bit surprised to hear that NS9052K has been running on the NIRD
service platform in 2020. do you have information about the
specific users and an indication of the extent of their use?
i believe it was in mid-2019 that bjoern lindi (then the manager for
the NeIC-funded NLPL project) and i made a semi-systematic attempt at
running gpu-enabled NLP experiments on the service platform. we
succeeded in getting the general mechanisms to work for toy examples,
but there were four stumbling blocks, of which three related to the
service platform at the time. (i) for UiO users, authentication was
a real hurdle: i needed to connect via feide to configure and launch
an instance, but then had to authenticate with my OpenIDP identity
to actually connect to my container; i realize that the root cause
of this inconvenience is the UiO opt-in policy for individual feide
services, but from my point of view as a user it was hard to
understand initially and then quite inconvenient to manage (let
alone explain to others). (ii) the allocation and scheduling
mechanisms on
the service platform were opaque back then; for example, how would one
go about requesting up to four gpus for occasional usage over a period
of, say, four months? (iii) persistent storage and interactions with
the NIRD storage areas were not fully transparent to me either. my
understanding is that anyone i allow to connect to my service (the
running container) gains write access, as me (the user who launched
the container), to my complete NIRD project space. we were
evaluating the service platform for use in teaching back then, and
(i) through (iii) combined led us to run our classes on Saga
instead, i.e. to actually teach our students how to operate on the
command line and submit to SLURM (something like the sketch below).
finally, (iv) the NLPL user community generally
prefers working on the command line; most of us, including MSc and
doctoral students, are software developers more than users of
ready-to-run tools.
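for concreteness, the kind of submission we teach is roughly the
following; this is only a sketch, and the project account (nn9999k),
the module version, and the train.py script are placeholders rather
than our actual course setup:

  #!/bin/bash
  #SBATCH --job-name=nlp-demo
  #SBATCH --account=nn9999k         # placeholder allocation
  #SBATCH --partition=accel         # gpu partition on Saga
  #SBATCH --gres=gpu:1              # request a single gpu
  #SBATCH --time=02:00:00           # two-hour wall-time limit
  #SBATCH --mem=16G
  #SBATCH --cpus-per-task=4

  # load a pytorch environment; the exact module name will vary
  module purge
  module load PyTorch/1.4.0-fosscuda-2019b-Python-3.7.4

  # run the actual experiment (train.py stands in for real code)
  python train.py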
i would be curious to hear your take on challenges (i) through (iii).
surely the NIRD service platform has evolved since i last looked at it
in earnest, but consideration (iv) will of course remain true, either
way. i have been quietly thinking that the NLPL use case may in the
end place little demand on sub-task 5.2.3. we are actively working
on the virtual laboratory perspective in sub-task 5.2.2 (on Saga and
Puhti for now, but with an eye toward LUMI) and are getting ready to
give more thought to data provisioning across distinct HPC systems.
best wishes, oe