<html> <head> <meta http-equiv="Content-Type" content="text/html; charset=Windows-1252"> <meta name="Generator" content="Microsoft Exchange Server"> <style></style> </head> <body> <style>  </style> <div lang="EN-US" link="blue" vlink="#954F72"> <div class="x_WordSection1">
<p>Hi,</p>
<p>I’m following up on this one with a related issue. I am testing PyTorch independently of OpenNMT-py, but cannot get it to run on (Taito-)GPU. Specifically, although I was logged in to Taito-GPU, I cannot get the test script described on the Wiki page to return True:</p>
<pre>
[GPU-Env lstmtagger]$ srun -n 1 -p gputest --gres=gpu:k80:1 --mem 1G -t 15 --pty python3 /proj/nlpl/software/pytorch/0.4.1/test.py
srun: job 32089470 queued and waiting for resources
srun: job 32089470 has been allocated resources
False
</pre>
<p>I also get ‘False’ when running the following script through sbatch:</p>
<pre>
#SBATCH -J cudatest
#SBATCH -o cudatest.%j.out
#SBATCH -e cudatest.%j.err
#SBATCH -t 0:05:00
#SBATCH -p gputest
#SBATCH -N 1
#SBATCH --gres=gpu:k80:1
#SBATCH --mem=1g

module use -a /proj/nlpl/software/modulefiles/
module load nlpl-pytorch
srun python3 /proj/nlpl/software/pytorch/0.4.1/test.py
</pre>
<p>Has there been any change lately? Or am I missing something obvious?</p>
<p>Best,<br>
Yves</p>
</div> <hr tabindex="-1" style="display:inline-block; width:98%"> <div id="x_divRplyFwdMsg" dir="ltr">From: Stephan Oepen &lt;oe@ifi.uio.no&gt;<br>
Sent: Wednesday, September 26, 2018 11:10:12 PM<br>
To: Scherrer, Yves<br>
Cc: Martin Matthiesen; infrastructure<br>
Subject: Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel) <div> </div> </div> </div> <div class="PlainText">hi again,

> i actually had a go at my own glibc and PyTorch installations on Taito, but
> so far gpu support is evasive.
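When torch.cuda.is_available() comes back False on a node, it can help to print a few more facts next to it. The following is only a minimal sketch, not part of the NLPL installation or its test.py: the function name cuda_report is made up for illustration, and it assumes that the scheduler exports CUDA_VISIBLE_DEVICES inside a GPU allocation, which may not hold on every cluster.

```python
import os


def cuda_report():
    """Collect facts that may explain a False from torch.cuda.is_available()."""
    report = {
        # Assumption: the scheduler exports this inside a GPU allocation.
        # None here hints that the job step was not given a GPU at all.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_imported": False,
    }
    try:
        import torch
    except ImportError:
        # Wrong interpreter or module not loaded: nothing more to report.
        return report
    report["torch_imported"] = True
    report["torch_version"] = torch.__version__
    # Shows which installation was actually picked up from the module path.
    report["torch_file"] = torch.__file__
    # None for a CPU-only build of PyTorch.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    # str.format instead of f-strings, so this also runs on Python 3.5.
    for key, value in sorted(cuda_report().items()):
        print("{}: {}".format(key, value))
```

Running this once on the login node and once through srun with the same --gres option makes the two environments directly comparable.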
actually, with a little more tinkering, i now believe i might have a working installation of PyTorch 0.4.1 and OpenNMT-py 0.2.1 on Taito too, seemingly functional on both cpu and gpu nodes:

<pre>
[oe@taito-login4 ~]$ module purge
[oe@taito-login4 ~]$ module load nlpl-opennmt-py
Loading application python-3.5.3 environment with needed modules
[oe@taito-login4 ~]$ module list
Currently Loaded Modules:
  1) gcc/5.4.0 2) intelmpi/5.1.3 3) mkl/11.3.2 4) python/3.5.3 5) python-env/3.5.3 6) nlpl-pytorch/0.4.1 7) nlpl-opennmt-py/0.2.1
[oe@taito-login4 ~]$ type -all python
python is /proj/nlpl/software/opennmt-py/0.2.1/bin/python
python is /proj/nlpl/software/pytorch/0.4.1/bin/python
python is /appl/opt/python/3.5.3-gnu540/bin/python
python is /usr/bin/python
[oe@taito-login4 ~]$ python -c "import torch; import onmt; print(torch.cuda.is_available());"
False
[oe@taito-login4 ~]$ srun -n 1 -p gputest --gres=gpu:k80:1 --mem 1G -t 15 --pty \
  python -c "import torch; import onmt; print(torch.cuda.is_available());"
True
</pre>

yves (or joerg), i would have a hard time testing things in much more depth. any chance you would have some time to try and replicate the validation steps you are currently running on Abel on Taito too?

with a sense of accomplishment :-), oe
</div> </body> </html>