[NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
Martin Matthiesen
martin.matthiesen at csc.fi
Fri Sep 28 13:58:09 UTC 2018
Hi Yves,
I talked to Markus about virtualenv, and he told me that intelpython uses conda env for virtual environments. virtualenv should also work; you should be able to install it yourself via pip install --user virtualenv. I am not sure what the right level of support from our side should be here. Should we consistently install virtualenv?
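To see which of these tools a given python-env module actually provides, something along these lines could be run after loading the module. It is only a minimal sketch: the helper name env_tools is made up for illustration, and it only reports what the current interpreter and PATH can reach, nothing CSC-specific.

```python
import importlib.util
import shutil
import sys


def env_tools():
    """Report which environment tools the current interpreter can reach."""
    return {
        # Version of the interpreter running this script
        "python": sys.version.split()[0],
        # virtualenv as an importable package (e.g. after pip install --user)
        "virtualenv_module": importlib.util.find_spec("virtualenv") is not None,
        # virtualenv as an executable on PATH (None if not found)
        "virtualenv_command": shutil.which("virtualenv"),
        # venv is part of the standard library since Python 3.3
        "venv_module": importlib.util.find_spec("venv") is not None,
        # conda executable on PATH (None if not found)
        "conda_command": shutil.which("conda"),
    }


if __name__ == "__main__":
    for name, value in env_tools().items():
        print(f"{name}: {value}")
```

If virtualenv is missing but venv_module is True, python3 -m venv might be an alternative that needs no extra installation; I have not tested that on Taito.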
Regards,
Martin
--
Martin Matthiesen
CSC - Tieteen tietotekniikan keskus
CSC - IT Center for Science
PL 405, 02101 Espoo, Finland
+358 9 457 2376, martin.matthiesen at csc.fi
Public key : https://pgp.mit.edu/pks/lookup?op=get&search=0x74B12876FD890704
Fingerprint: AA25 6F56 5C9A 8B42 009F BA70 74B1 2876 FD89 0704
----- Original Message -----
> From: "Yves Scherrer" <yves.scherrer at helsinki.fi>
> To: "Stephan Oepen" <oe at ifi.uio.no>
> Cc: "Martin Matthiesen" <martin.matthiesen at csc.fi>, "infrastructure" <infrastructure at nlpl.eu>
> Sent: Wednesday, 26 September, 2018 21:40:50
> Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
> Further validating your installation, I am currently training a model, and once
> I found that I need to use $CUDA_VISIBLE_DEVICES it also seems to be training
> on GPU :)
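
A quick way to double-check the GPU side before starting a long training run might look like the following. This is a hedged sketch, not part of the NLPL installation: the helper names visible_gpu_ids and torch_sees_gpu are invented here, and the only PyTorch call used is torch.cuda.is_available().

```python
import os


def visible_gpu_ids(environ=None):
    """Return the device ids listed in CUDA_VISIBLE_DEVICES.

    None means the variable is unset (CUDA then exposes every GPU);
    an empty list means it is set but empty (no GPU is exposed).
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # Entries are comma-separated; drop surrounding whitespace and empty items
    return [part.strip() for part in raw.split(",") if part.strip()]


def torch_sees_gpu():
    """Ask PyTorch whether a CUDA device is usable; None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()
```

Printing visible_gpu_ids() and torch_sees_gpu() inside the batch job should show whether the scheduler exported the variable and whether PyTorch can reach a device.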
>
> I’ll see if I can easily modify my test to use data from the NLPL repository
> (the data is certainly not the problem, but there might be some preprocessing
> steps for which scripts are not (yet) available).
>
> Regarding virtualenv on CSC, it’s hit or miss:
> - python-env/intelpython3.6-2018.3, which Martin mentioned lately and which
> contains PyTorch, doesn’t have virtualenv
> - python-env/3.5.3 has virtualenv, as you correctly observed
> - python-env/3.4.0, which is the default version on taito-shell, doesn’t have
> virtualenv
>
> I’ll have to test if it’s easier to build on the intelpython or the “normal” gnu
> one…
>
> Yves
>
>> On 26 Sep 2018, at 15:57, Stephan Oepen <oe at ifi.uio.no> wrote:
>>
>> many thanks for validating (to some degree at least :-) my OpenNMT-py
>> installation on Abel. i have now added it to the software catalogue
>> and created minimal documentation on the NLPL wiki:
>>
>> http://wiki.nlpl.eu/index.php/Infrastructure/software/catalogue
>> http://wiki.nlpl.eu/index.php/Translation/opennmt-py
>>
>> —could you suggest a minimal example workflow, demonstrating how to
>> train and decode with OpenNMT, ideally using files from our own
>> ‘/proj/nlpl/data/translation/’? speaking of which, should i start
>> replicating that directory from Taito to Abel, i.e. remove what you
>> had installed manually on Abel and instead turn on automated
>> replication once a day?
>>
>> in principle, we should now produce a parallel installation of
>> OpenNMT-py on Taito, of course—which presupposes that we get something
>> parallel worked out for PyTorch.
>>
>> yves, why do you say that CSC does not include ‘virtualenv’ in their
>> python installation? is there something principled that i am missing?
>>
>> [oe at taito-login3 ~]$ module add python-env/3.5.3
>> Loading application python-3.5.3 environment with needed modules
>> Switching compiler gcc to gcc/5.4.0
>> Switching MPI version intelmpi to intelmpi/5.1.3
>>
>> The following have been reloaded with a version change:
>> 1) gcc/4.8.2 => gcc/5.4.0
>> 2) intelmpi/4.1.3 => intelmpi/5.1.3
>> 3) mkl/11.3.0 => mkl/11.3.2
>> 4) python-env/3.4.0 => python-env/3.5.3
>> 5) python/3.4.0 => python/3.5.3
>>
>> [oe at taito-login3 ~]$ type -all python
>> python is /appl/opt/python/3.5.3-gnu540/bin/python
>> [oe at taito-login3 ~]$ type -all virtualenv
>> virtualenv is /appl/opt/python/3.5.3-gnu540/bin/virtualenv
>>
>> so, i am guessing we could presumably attempt an NLPL-maintained
>> installation of PyTorch into a 3.5 virtual environment, which would
>> likely require a custom glibc installation too (and the same kind of
>> dynamic linking ‘gymnastics’).
>>
>> i feel i still need to learn more about the CSC environment. are the
>> modules available on taito-gpu the same as on the cpu nodes? in other
>> words, do both types of nodes see the same file system?
>>
>> cheers, oe
>>
>>
>> On Wed, Sep 26, 2018 at 9:59 AM, Scherrer, Yves
>> <yves.scherrer at helsinki.fi> wrote:
>>> Hi,
>>>
>>>
>>>
>>> I’ve had a quick look at Stephan’s OpenNMT-py on Abel. The onmt module seems
>>> to work, but one generally uses the scripts “preprocess.py”, “train.py” and
>>> “translate.py” (at the root directory of the Github repo), and these scripts
>>> seem to be missing from the module. Would it be possible to copy these three
>>> scripts (there is a fourth one, “server.py”, but this one might not be
>>> relevant for common usage) somewhere inside the virtual environment, so that
>>> they can be found and called easily?
>>>
>>>
>>>
>>> I have to say that I find these stacked virtual environments quite elegant.
>>> Too bad that CSC doesn’t even include the virtualenv command in their
>>> python-env modules…
>>>
>>>
>>>
>>> Best,
>>>
>>> Yves
>>>
>>>
>>>
>>> ________________________________
>>> From: Stephan Oepen <oe at ifi.uio.no>
>>> Sent: Thursday, September 20, 2018 12:31:58 AM
>>> To: Scherrer, Yves
>>> Cc: Martin Matthiesen; infrastructure
>>>
>>> Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
>>>
>>> dear all,
>>>
>>> yes, chaining virtual environments appears to work as one would
>>> expect. i might in fact have managed to install OpenNMT-py on Abel,
>>> using my new PyTorch 0.4.1 virtual environment, essentially:
>>>
>>> module load nlpl-pytorch
>>> mkdir /projects/nlpl/software/opennmt-py/
>>> virtualenv /projects/nlpl/software/opennmt-py/0.2.1
>>>
>>> at this point, i had to manually change the ‘python’, ‘python3’, and
>>> ‘python3.5’ files in the new ‘bin/’ directory, to avail themselves of
>>> the custom glibc; see
>>> ‘http://wiki.nlpl.eu/index.php/Infrastructure/software/glibc’.
>>>
>>> cd /projects/nlpl/software/modulefiles
>>> mkdir nlpl-opennmt-py
>>> cp nlpl-pytorch/0.4.1 nlpl-opennmt-py/0.2.1
>>> vi nlpl-opennmt-py/0.2.1
>>>
>>> cd ~/src/nlpl
>>> module purge
>>> module load nlpl-opennmt-py
>>> wget https://github.com/OpenNMT/OpenNMT-py/archive/0.2.1.tar.gz
>>> tar zpSxvf 0.2.1.tar.gz
>>> cd OpenNMT-py-0.2.1
>>> python setup.py install
>>>
>>> so far, my testing is limited to
>>>
>>> python -c "import torch; import onmt; print(onmt.__version__);"
>>>
>>> yves, would you maybe have a chance next week to see whether this
>>> installation appears healthy to you?
>>>
>>> cheers, oe
>>>
>>>
>>> On Wed, Sep 19, 2018 at 1:12 PM, Scherrer, Yves
>>> <yves.scherrer at helsinki.fi> wrote:
>>>> Hi Stephan, Martin,
>>>>
>>>>
>>>>
>>>> I’m catching up on this thread… A few questions from my side:
>>>>
>>>>
>>>>
>>>> Regarding Martin’s latest suggestion: that seems indeed to work fine,
>>>> although with the exact same commands I get a different version of
>>>> PyTorch:
>>>>
>>>> >>> import torch
>>>> >>> torch.__file__
>>>> '/appl/opt/python/intelpython36-2018.3/intelpython3/lib/python3.6/site-packages/torch/__init__.py'
>>>> >>> torch.__version__
>>>> '0.4.0a0+3749c58'
>>>>
>>>>
>>>>
>>>> In any case, if PyTorch is already installed in some Python distribution,
>>>> that would make setting up a specific OpenNMT module rather easy. If not,
>>>> virtual environments should work as well (the tricky thing is mainly to
>>>> figure out which python versions play well with CUDA…)
>>>>
>>>>
>>>>
>>>> Regarding Stephan’s suggestion of virtual environments: do you know if
>>>> virtual environments can be “stacked”, i.e. whether I could create an
>>>> OpenNMT virtual environment that lies on top of your PyTorch environment?
>>>> Or would I have to re-install another instance of PyTorch in the OpenNMT
>>>> virtualenv?
>>>>
>>>>
>>>>
>>>> I’ll be travelling for the rest of the week, but will try to have a closer
>>>> look at these options next week.
>>>>
>>>>
>>>>
>>>> Best,
>>>>
>>>> Yves
>>>>
>>>>
>>>>
>>>> ________________________________
>>>> From: Martin Matthiesen <martin.matthiesen at csc.fi>
>>>> Sent: Wednesday, September 19, 2018 1:29:35 PM
>>>> To: Stephan Oepen
>>>> Cc: infrastructure; Scherrer, Yves
>>>>
>>>> Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
>>>>
>>>> Hello Stephan,
>>>>
>>>> ----- Original Message -----
>>>>> From: "Stephan Oepen" <oe at ifi.uio.no>
>>>>> To: "Martin Matthiesen" <martin.matthiesen at csc.fi>
>>>>> Cc: "infrastructure" <infrastructure at nlpl.eu>, "Yves Scherrer"
>>>>> <yves.scherrer at helsinki.fi>
>>>>> Sent: Tuesday, 18 September, 2018 14:13:53
>>>>> Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on
>>>>> Abel)
>>>>
>>>>> sorry, i was the one who had introduced the confusion about mailing
>>>>> lists. there is no ‘translation at nlpl.eu’ currently, and upon
>>>>> consultation with joerg there appears not to be a great need for it
>>>>> either (once i get around to documenting the task force structure on
>>>>> the project wiki, i might want to create that list nevertheless).
>>>>>
>>>>> i am adding yves to thread now, so he at least has a chance of knowing
>>>>> what we are talking about :-).
>>>>
>>>> Ok!
>>>>>
>>>>> martin, i doubt that an installation of OpenNMT that requires everyone
>>>>> to ‘pip install --user’ into their home directory will be a good
>>>>> solution. that way, the getting started instructions will be more
>>>>> complex, and we lack control over which version of PyTorch gets
>>>>> installed at the time the user actually runs the command. my
>>>>> immediate reaction at least is that NLPL-supported software should be
>>>>> ‘self-contained’, in the sense of not depending on software components
>>>>> maintained by the user.
>>>>
>>>> Ok, I understand.
>>>>>
>>>>> what i am doing increasingly on abel is deriving virtual environments;
>>>>> e.g. my PyTorch installation (for NLPL) straightforwardly builds on
>>>>> the USIT-maintained python 3.5. i suppose we should be able to do the
>>>>> same thing on taito, i.e. create ‘nlpl-pytorch’ as a virtual
>>>>> environment that includes the precompiled PyTorch wheel from your CSC
>>>>> colleagues?
>>>>
>>>> Yes, I guess that is the only sensible solution to not lose track
>>>> completely. In the meantime, how would this work for you all:
>>>>
>>>> [GPU-Env ~]$ module load python-env/intelpython3.6-2018.3
>>>> Loading application Intel Distribution for Python 2018 update 3
>>>> [GPU-Env ~]$ module list
>>>>
>>>> Currently Loaded Modules:
>>>> 1) gcc/4.9.3 2) cuda/7.5 3) StdEnv 4) git/2.17.1
>>>> 5) python-env/intelpython3.6-2018.3
>>>>
>>>> [GPU-Env ~]$ python3
>>>> Python 3.6.3 |Intel Corporation| (default, May 4 2018, 04:22:28)
>>>> [GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux
>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>> Intel(R) Distribution for Python is brought to you by Intel Corporation.
>>>> Please check out: https://software.intel.com/en-us/python-distribution
>>>> >>> import torch
>>>> >>> torch.__version__
>>>> '0.4.1'
>>>>
>>>> Kudos to my colleagues Markus and Jarmo here.
>>>>
>>>> Martin
>>>>
>>>>>
>>>>> oe
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Sep 17, 2018 at 5:06 PM, Martin Matthiesen
>>>>> <martin.matthiesen at csc.fi> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> We already have a way to use pytorch 0.4.1 on Taito-GPU:
>>>>>>
>>>>>> [GPU-Env ~]$ module load python-env/intelpython3.6-2018.3
>>>>>> [GPU-Env ~]$ pip install -v --user /appl/opt/pytorch/0.4.1/cu90/torch-0.4.1-cp36-cp36m-linux_x86_64.whl
>>>>>>
>>>>>> One of my colleagues has compiled the module. Note that the module needs
>>>>>> python 3.6 to work, the highest available on Taito-GPU.
>>>>>>
>>>>>> Before I investigate CPU-support or support for other compilers, would
>>>>>> this pip-approach work for you?
>>>>>>
>>>>>> Regards,
>>>>>> Martin
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "Stephan Oepen" <oe at ifi.uio.no>
>>>>>>> To: translation at nlpl.eu
>>>>>>> Cc: "infrastructure" <infrastructure at nlpl.eu>
>>>>>>> Sent: Saturday, 15 September, 2018 18:59:29
>>>>>>> Subject: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)
>>>>>>
>>>>>>> colleagues,
>>>>>>>
>>>>>>> joerg, martin, and i talked about getting the new release version of
>>>>>>> OpenNMT installed for NLPL. it appears it requires the most recent
>>>>>>> version of PyTorch, which currently is not available on Taito. martin
>>>>>>> will ask for it to be installed by CSC.
>>>>>>>
>>>>>>> in parallel, i believe i managed to put an NLPL-owned installation of
>>>>>>> the right PyTorch version onto Abel, please see:
>>>>>>>
>>>>>>> http://wiki.nlpl.eu/index.php/Infrastructure/software/pytorch
>>>>>>>
>>>>>>> before announcing this more widely, i would be grateful for some
>>>>>>> testing, in particular for both cpu and gpu usage. would anyone be
>>>>>>> readily set up to give this a shot on Abel?
>>>>>>>
>>>>>>> assuming our PyTorch is healthy, would someone from the helsinki team
>>>>>>> have the time to try and install OpenNMT onto Abel, e.g. as
>>>>>>>
>>>>>>> /projects/nlpl/software/opennmt-py/0.2.1
>>>>>>>
>>>>>>> there have been two relatively recent requests for OpenNMT in oslo
>>>>>>> (one of them for seq2seq dependency parsing :-), so i believe it would
>>>>>>> now be warranted to provide it on both systems.
>>>>>>>
>>>>>>> best wishes, oe