[NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
Martin Matthiesen
martin.matthiesen at csc.fi
Mon Oct 1 12:45:08 UTC 2018
Hi Yves,
Yes, we need to find a balance between quickly installing something and really supporting it. This is already a challenge with software in general; with Python it is even more complex. I have forwarded your feedback and will also look into this myself.
Martin
--
Martin Matthiesen
CSC - Tieteen tietotekniikan keskus
CSC - IT Center for Science
PL 405, 02101 Espoo, Finland
+358 9 457 2376, martin.matthiesen at csc.fi
Public key : https://pgp.mit.edu/pks/lookup?op=get&search=0x74B12876FD890704
Fingerprint: AA25 6F56 5C9A 8B42 009F BA70 74B1 2876 FD89 0704
----- Original Message -----
> From: "Yves Scherrer" <yves.scherrer at helsinki.fi>
> To: "Martin Matthiesen" <martin.matthiesen at csc.fi>
> Cc: "Stephan Oepen" <oe at ifi.uio.no>, "infrastructure" <infrastructure at nlpl.eu>
> Sent: Saturday, 29 September, 2018 17:07:51
> Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
> Hi Martin,
>
> Thanks for looking into this. As far as I can see, Stephan has solved the issue
> regarding OpenNMT, so there is no immediate need to change anything.
> Personally, I don’t mind whether it is virtualenv or conda env, but it would be
> nice if there were at least some documentation on that, and if every future
> version of Python could ship with either of these.
>
> Best,
> Yves
>
>> On 28 Sep 2018, at 16:58, Martin Matthiesen <martin.matthiesen at csc.fi> wrote:
>>
>> Hi Yves,
>>
>> I talked to Markus about virtualenv, and he told me that intelpython uses
>> conda env for virtual environments. virtualenv should also work; you should be
>> able to install it yourself via pip install --user virtualenv. I am not sure
>> what the right level of support from our side should be here. Should we
>> consistently install virtualenv?
>>
>> Regards,
>> Martin
>>
>>
>> ----- Original Message -----
>>> From: "Yves Scherrer" <yves.scherrer at helsinki.fi>
>>> To: "Stephan Oepen" <oe at ifi.uio.no>
>>> Cc: "Martin Matthiesen" <martin.matthiesen at csc.fi>, "infrastructure"
>>> <infrastructure at nlpl.eu>
>>> Sent: Wednesday, 26 September, 2018 21:40:50
>>> Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
>>
>>> Further validating your installation, I am currently training a model, and once
>>> I found out that I need to use $CUDA_VISIBLE_DEVICES, it also seems to be
>>> training on GPU :)
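
A minimal sketch of how the variable mentioned above is commonly interpreted: CUDA_VISIBLE_DEVICES holds a comma-separated list of device identifiers, and leaving it unset does not restrict the devices. The helper name `visible_devices` is hypothetical and is not part of PyTorch or OpenNMT.

```python
import os


def visible_devices(environ=None):
    """Return the device ids listed in CUDA_VISIBLE_DEVICES.

    None -> the variable is unset, so CUDA does not restrict the devices.
    []   -> the variable is set but empty, so no device is visible.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    # Split on commas and drop empty entries and surrounding whitespace.
    return [item.strip() for item in value.split(",") if item.strip()]


if __name__ == "__main__":
    print(visible_devices())
```

Running this inside a batch job before starting the training shows which devices the job has actually been given.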
>>>
>>> I’ll see if I can easily modify my test to use data from the NLPL repository
>>> (the data is certainly not the problem, but there might be some preprocessing
>>> steps for which scripts are not (yet) available).
>>>
>>> Regarding virtualenv on CSC, it’s hit or miss:
>>> - python-env/intelpython3.6-2018.3, which Martin mentioned lately and which
>>> contains PyTorch, doesn’t have virtualenv
>>> - python-env/3.5.3 has virtualenv, as you correctly observed
>>> - python-env/3.4.0, which is the default version on taito-shell, doesn’t have
>>> virtualenv
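
The per-module check listed above could be repeated after each `module load` with a short script along these lines. This is only a sketch using the Python standard library; the function name `env_tools` is an assumption made for illustration.

```python
import importlib.util
import shutil


def env_tools():
    """Report which environment tools this interpreter and PATH provide."""
    return {
        # The command-line tool, as found by `type virtualenv` in the shell.
        "virtualenv command": shutil.which("virtualenv") is not None,
        # The importable package, usable as `python -m virtualenv`.
        "virtualenv module": importlib.util.find_spec("virtualenv") is not None,
        # The standard-library alternative shipped with Python 3.
        "venv module": importlib.util.find_spec("venv") is not None,
    }


if __name__ == "__main__":
    for tool, available in env_tools().items():
        print("%-20s %s" % (tool, "yes" if available else "no"))
```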
>>>
>>> I’ll have to test if it’s easier to build on the intelpython or the “normal” gnu
>>> one…
>>>
>>> Yves
>>>
>>>> On 26 Sep 2018, at 15:57, Stephan Oepen <oe at ifi.uio.no> wrote:
>>>>
>>>> many thanks for validating (to some degree at least :-) my OpenNMT-py
>>>> installation on Abel. i have now added it to the software catalogue
>>>> and created minimal documentation on the NLPL wiki:
>>>>
>>>> http://wiki.nlpl.eu/index.php/Infrastructure/software/catalogue
>>>> http://wiki.nlpl.eu/index.php/Translation/opennmt-py
>>>>
>>>> —could you suggest a minimal example workflow, demonstrating how to
>>>> train and decode with OpenNMT, ideally using files from our own
>>>> ‘/proj/nlpl/data/translation/’? speaking of which, should i start
>>>> replicating that directory from Taito to Abel, i.e. remove what you
>>>> had installed manually on Abel and instead turn on automated
>>>> replication once a day?
>>>>
>>>> in principle, we should now produce a parallel installation of
>>>> OpenNMT-py on Taito, of course—which presupposes that we get something
>>>> parallel worked out for PyTorch.
>>>>
>>>> yves, why do you say that CSC does not include ‘virtualenv’ in their
>>>> python installation? is there something principled that i am missing?
>>>>
>>>> [oe at taito-login3 ~]$ module add python-env/3.5.3
>>>> Loading application python-3.5.3 environment with needed modules
>>>> Switching compiler gcc to gcc/5.4.0
>>>> Switching MPI version intelmpi to intelmpi/5.1.3
>>>>
>>>> The following have been reloaded with a version change:
>>>> 1) gcc/4.8.2 => gcc/5.4.0
>>>> 2) intelmpi/4.1.3 => intelmpi/5.1.3
>>>> 3) mkl/11.3.0 => mkl/11.3.2
>>>> 4) python-env/3.4.0 => python-env/3.5.3
>>>> 5) python/3.4.0 => python/3.5.3
>>>>
>>>> [oe at taito-login3 ~]$ type -all python
>>>> python is /appl/opt/python/3.5.3-gnu540/bin/python
>>>> [oe at taito-login3 ~]$ type -all virtualenv
>>>> virtualenv is /appl/opt/python/3.5.3-gnu540/bin/virtualenv
>>>>
>>>> so, i am guessing we could presumably attempt an NLPL-maintained
>>>> installation of PyTorch into a 3.5 virtual environment, which would
>>>> likely require a custom glibc installation too (and the same kind of
>>>> dynamic linking ‘gymnastics’).
>>>>
>>>> i feel i still need to learn more about the CSC environment. are the
>>>> modules available on taito-gpu the same as on the cpu nodes? in other
>>>> words, do both types of nodes see the same file system?
>>>>
>>>> cheers, oe
>>>>
>>>>
>>>> On Wed, Sep 26, 2018 at 9:59 AM, Scherrer, Yves
>>>> <yves.scherrer at helsinki.fi> wrote:
>>>>> Hi,
>>>>>
>>>>>
>>>>>
>>>>> I’ve had a quick look at Stephan’s OpenNMT-py on Abel. The onmt module seems
>>>>> to work, but one generally uses the scripts “preprocess.py”, “train.py” and
>>>>> “translate.py” (at the root directory of the Github repo), and these scripts
>>>>> seem to be missing from the module. Would it be possible to copy these three
>>>>> scripts (there is a fourth one, “server.py”, but this one might not be
>>>>> relevant for common usage) somewhere inside the virtual environment, so that
>>>>> they can be found and called easily?
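
One way the request above could be carried out is sketched below: copy the three scripts from an unpacked OpenNMT-py source tree into the `bin/` directory of the environment and mark them executable. The function name `install_scripts` and both directory arguments are assumptions for illustration, not a description of how the Abel installation was actually done.

```python
import os
import shutil
import stat

# Script names as given in the message above.
SCRIPTS = ("preprocess.py", "train.py", "translate.py")


def install_scripts(repo_root, bin_dir, names=SCRIPTS):
    """Copy the named scripts into bin_dir and make them executable."""
    installed = []
    for name in names:
        source = os.path.join(repo_root, name)
        if not os.path.isfile(source):
            continue  # skip scripts this source tree does not contain
        target = os.path.join(bin_dir, name)
        shutil.copy2(source, target)
        mode = os.stat(target).st_mode
        # Add execute permission for user, group, and others.
        os.chmod(target, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        installed.append(target)
    return installed
```

Whether the copied scripts can then be called directly depends on their first line naming a suitable interpreter; otherwise they would still be run as `python train.py`.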
>>>>>
>>>>>
>>>>>
>>>>> I have to say that I find these stacked virtual environments quite elegant.
>>>>> Too bad that CSC doesn’t even include the virtualenv command in their
>>>>> python-env modules…
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Yves
>>>>>
>>>>>
>>>>>
>>>>> ________________________________
>>>>> From: Stephan Oepen <oe at ifi.uio.no>
>>>>> Sent: Thursday, September 20, 2018 12:31:58 AM
>>>>> To: Scherrer, Yves
>>>>> Cc: Martin Matthiesen; infrastructure
>>>>>
>>>>> Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
>>>>>
>>>>> dear all,
>>>>>
>>>>> yes, chaining virtual environments appears to work as one would
>>>>> expect. i might in fact have managed to install OpenNMT-py on Abel,
>>>>> using my new PyTorch 0.4.1 virtual environment, essentially:
>>>>>
>>>>> module load nlpl-pytorch
>>>>> mkdir /projects/nlpl/software/opennmt-py/
>>>>> virtualenv /projects/nlpl/software/opennmt-py/0.2.1
>>>>>
>>>>> at this point, i had to manually change the ‘python’, ‘python3’, and
>>>>> ‘python3.5’ files in the new ‘bin/’ directory, to avail themselves of
>>>>> the custom glibc; see
>>>>> ‘http://wiki.nlpl.eu/index.php/Infrastructure/software/glibc’.
>>>>>
>>>>> cd /projects/nlpl/software/modulefiles
>>>>> mkdir nlpl-opennmt-py
>>>>> cp nlpl-pytorch/0.4.1 nlpl-opennmt-py/0.2.1
>>>>> vi nlpl-opennmt-py/0.2.1
>>>>>
>>>>> cd ~/src/nlpl
>>>>> module purge
>>>>> module load nlpl-opennmt-py
>>>>> wget https://github.com/OpenNMT/OpenNMT-py/archive/0.2.1.tar.gz
>>>>> tar zpSxvf 0.2.1.tar.gz
>>>>> cd OpenNMT-py-0.2.1
>>>>> python setup.py install
>>>>>
>>>>> so far, my testing is limited to
>>>>>
>>>>> python -c "import torch; import onmt; print(onmt.__version__);"
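
A slightly more forgiving variant of the one-line test above is sketched here: it reports a missing module instead of stopping at the first failed import. The function name `module_versions` is hypothetical; the module names are the ones used in the command above.

```python
import importlib


def module_versions(names):
    """Map each module name to its __version__, or None if it cannot be imported."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None  # not importable in this environment
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in module_versions(["torch", "onmt"]).items():
        print(name, version if version is not None else "MISSING")
```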
>>>>>
>>>>> yves, would you maybe have a chance next week to see whether this
>>>>> installation appears healthy to you?
>>>>>
>>>>> cheers, oe
>>>>>
>>>>>
>>>>> On Wed, Sep 19, 2018 at 1:12 PM, Scherrer, Yves
>>>>> <yves.scherrer at helsinki.fi> wrote:
>>>>>> Hi Stephan, Martin,
>>>>>>
>>>>>>
>>>>>>
>>>>>> I’m catching up on this thread… A few questions from my side:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regarding Martin’s latest suggestion: that seems indeed to work fine,
>>>>>> although with the exact same commands I get a different version of
>>>>>> PyTorch:
>>>>>>
>>>>>> >>> import torch
>>>>>> >>> torch.__file__
>>>>>> '/appl/opt/python/intelpython36-2018.3/intelpython3/lib/python3.6/site-packages/torch/__init__.py'
>>>>>> >>> torch.__version__
>>>>>> '0.4.0a0+3749c58'
>>>>>>
>>>>>>
>>>>>>
>>>>>> In any case, if PyTorch is already installed in some Python distribution,
>>>>>> that would make setting up a specific OpenNMT module rather easy. If not,
>>>>>> virtual environments should work as well (the tricky thing is mainly to
>>>>>> figure out which python versions play well with CUDA…)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regarding Stephan’s suggestion of virtual environments: do you know if
>>>>>> virtual environments can be “stacked”, i.e. whether I could create an
>>>>>> OpenNMT virtual environment that lies on top of your PyTorch environment?
>>>>>> Or would I have to re-install another instance of PyTorch in the OpenNMT
>>>>>> virtualenv?
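
As a rough illustration of the mechanism behind this question, the sketch below uses the standard-library `venv` module (not the `virtualenv` command discussed in this thread) to create an environment that can also see the packages of the interpreter it was created from. Whether an environment created from inside another environment inherits that environment's packages depends on the tool and its version, so this shows only the relevant setting. The function name `create_layered_env` is an assumption.

```python
import os
import tempfile
import venv


def create_layered_env(path):
    """Create an environment that also sees its parent's site-packages.

    Returns the text of the generated pyvenv.cfg so the setting can be checked.
    """
    builder = venv.EnvBuilder(
        system_site_packages=True,   # expose the parent interpreter's packages
        with_pip=False,              # keep the sketch independent of ensurepip
        symlinks=(os.name != "nt"),  # link to the interpreter where supported
    )
    builder.create(path)
    with open(os.path.join(path, "pyvenv.cfg")) as handle:
        return handle.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(create_layered_env(os.path.join(tmp, "layered")))
```

Packages installed into the new environment take precedence over those of the parent, so only the additional software needs to be installed on top.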
>>>>>>
>>>>>>
>>>>>>
>>>>>> I’ll be travelling for the rest of the week, but will try to have a closer
>>>>>> look at these options next week.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Yves
>>>>>>
>>>>>>
>>>>>>
>>>>>> ________________________________
>>>>>> From: Martin Matthiesen <martin.matthiesen at csc.fi>
>>>>>> Sent: Wednesday, September 19, 2018 1:29:35 PM
>>>>>> To: Stephan Oepen
>>>>>> Cc: infrastructure; Scherrer, Yves
>>>>>>
>>>>>> Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
>>>>>>
>>>>>> Hello Stephan,
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "Stephan Oepen" <oe at ifi.uio.no>
>>>>>>> To: "Martin Matthiesen" <martin.matthiesen at csc.fi>
>>>>>>> Cc: "infrastructure" <infrastructure at nlpl.eu>, "Yves Scherrer"
>>>>>>> <yves.scherrer at helsinki.fi>
>>>>>>> Sent: Tuesday, 18 September, 2018 14:13:53
>>>>>>> Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
>>>>>>
>>>>>>> sorry, i was the one who had introduced the confusion about mailing
>>>>>>> lists. there is no ‘translation at nlpl.eu’ currently, and upon
>>>>>>> consultation with joerg there appears not to be a great need for it
>>>>>>> either (once i get around to documenting the task force structure on
>>>>>>> the project wiki, i might want to create that list nevertheless).
>>>>>>>
>>>>>>> i am adding yves to thread now, so he at least has a chance of knowing
>>>>>>> what we are talking about :-).
>>>>>>
>>>>>> Ok!
>>>>>>>
>>>>>>> martin, i doubt that an installation of OpenNMT that requires everyone
>>>>>>> to ‘pip install --user’ into their home directory will be a good
>>>>>>> solution. that way, the getting started instructions will be more
>>>>>>> complex, and we lack control over which version of PyTorch gets
>>>>>>> installed at the time the user actually runs the command. my
>>>>>>> immediate reaction at least is that NLPL-supported software should be
>>>>>>> ‘self-contained’, in the sense of not depending on software components
>>>>>>> maintained by the user.
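
The version-control concern raised above can be made concrete with a small check that compares the installed version of a distribution against the one an installation was validated with. This is a sketch that assumes Python 3.8 or later for `importlib.metadata`; the function name `check_pin` is hypothetical.

```python
from importlib import metadata


def check_pin(distribution, expected):
    """Return (matches, installed_version) for a pinned distribution.

    installed_version is None when the distribution is not installed.
    """
    try:
        installed = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        installed = None
    return installed == expected, installed


if __name__ == "__main__":
    # 0.4.1 is the PyTorch version discussed in this thread.
    print(check_pin("torch", "0.4.1"))
```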
>>>>>>
>>>>>> Ok, I understand.
>>>>>>>
>>>>>>> what i am doing increasingly on abel is deriving virtual environments;
>>>>>>> e.g. my PyTorch installation (for NLPL) straightforwardly builds on
>>>>>>> the USIT-maintained python 3.5. i suppose we should be able to do the
>>>>>>> same thing on taito, i.e. create ‘nlpl-pytorch’ as a virtual
>>>>>>> environment that includes the precompiled PyTorch wheel from your CSC
>>>>>>> colleagues?
>>>>>>
>>>>>> Yes, I guess that is the only sensible solution to not lose track
>>>>>> completely. In the meantime, how would this work for you all:
>>>>>>
>>>>>> [GPU-Env ~]$ module load python-env/intelpython3.6-2018.3
>>>>>> Loading application Intel Distribution for Python 2018 update 3
>>>>>> [GPU-Env ~]$ module list
>>>>>>
>>>>>> Currently Loaded Modules:
>>>>>> 1) gcc/4.9.3 2) cuda/7.5 3) StdEnv 4) git/2.17.1 5)
>>>>>> python-env/intelpython3.6-2018.3
>>>>>>
>>>>>> [GPU-Env ~]$ python3
>>>>>> Python 3.6.3 |Intel Corporation| (default, May 4 2018, 04:22:28)
>>>>>> [GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux
>>>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>> Intel(R) Distribution for Python is brought to you by Intel Corporation.
>>>>>> Please check out: https://software.intel.com/en-us/python-distribution
>>>>>> >>> import torch
>>>>>> >>> torch.__version__
>>>>>> '0.4.1'
>>>>>>
>>>>>> Kudos to my colleagues Markus and Jarmo here.
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>>>
>>>>>>> oe
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 17, 2018 at 5:06 PM, Martin Matthiesen
>>>>>>> <martin.matthiesen at csc.fi> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> We already have a way to use pytorch 0.4.1 on Taito-GPU:
>>>>>>>>
>>>>>>>> [GPU-Env ~]$ module load python-env/intelpython3.6-2018.3
>>>>>>>> [GPU-Env ~]$ pip install -v --user /appl/opt/pytorch/0.4.1/cu90/torch-0.4.1-cp36-cp36m-linux_x86_64.whl
>>>>>>>>
>>>>>>>> One of my colleagues has compiled the module. Note that the module needs
>>>>>>>> python 3.6 to work, the highest available on Taito-GPU.
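
The Python 3.6 requirement can be read off the wheel file name quoted above, which follows the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). The sketch below splits such a name and compares a simple CPython tag with an interpreter version. Both function names are hypothetical, and compound tags such as `py2.py3` are deliberately not handled.

```python
import sys


def wheel_tags(filename):
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):  # six parts when a build tag is present
        raise ValueError("unexpected wheel file name: %r" % filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(python_tag, version_info=sys.version_info):
    """True if a tag such as 'cp36' names the given CPython major.minor version."""
    return python_tag == "cp%d%d" % (version_info[0], version_info[1])


if __name__ == "__main__":
    tags = wheel_tags("torch-0.4.1-cp36-cp36m-linux_x86_64.whl")
    print(tags, matches_interpreter(tags["python"]))
```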
>>>>>>>>
>>>>>>>> Before I investigate CPU-support or support for other compilers, would
>>>>>>>> this pip-approach work for you?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Stephan Oepen" <oe at ifi.uio.no>
>>>>>>>>> To: translation at nlpl.eu
>>>>>>>>> Cc: "infrastructure" <infrastructure at nlpl.eu>
>>>>>>>>> Sent: Saturday, 15 September, 2018 18:59:29
>>>>>>>>> Subject: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
>>>>>>>>
>>>>>>>>> colleagues,
>>>>>>>>>
>>>>>>>>> joerg, martin, and i talked about getting the new release version of
>>>>>>>>> OpenNMT installed for NLPL. it appears it requires the most recent
>>>>>>>>> version of PyTorch, which currently is not available on Taito. martin
>>>>>>>>> will ask for it to be installed by CSC.
>>>>>>>>>
>>>>>>>>> in parallel, i believe i managed to put an NLPL-owned installation of
>>>>>>>>> the right PyTorch version onto Abel, please see:
>>>>>>>>>
>>>>>>>>> http://wiki.nlpl.eu/index.php/Infrastructure/software/pytorch
>>>>>>>>>
>>>>>>>>> before announcing this more widely, i would be grateful for some
>>>>>>>>> testing, in particular for both cpu and gpu usage. would anyone be
>>>>>>>>> readily set up to give this a shot on Abel?
>>>>>>>>>
>>>>>>>>> assuming our PyTorch is healthy, would someone from the helsinki team
>>>>>>>>> have the time to try and install OpenNMT onto Abel, e.g. as
>>>>>>>>>
>>>>>>>>> /projects/nlpl/software/opennmt-py/0.2.1
>>>>>>>>>
>>>>>>>>> there have been two relatively recent requests for OpenNMT in oslo
>>>>>>>>> (one of them for seq2seq dependency parsing :-), so i believe it would
>>>>>>>>> now be warranted to provide it on both systems.
>>>>>>>>>
>>>>>>>>> best wishes, oe