<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<meta name="Generator" content="Microsoft Exchange Server">
<!-- converted from text --><style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>
</head>
<body>
<style>
<!--
@font-face
{font-family:"Cambria Math"}
@font-face
{font-family:Calibri}
p.x_MsoNormal, li.x_MsoNormal, div.x_MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif}
a:link, span.x_MsoHyperlink
{color:blue;
text-decoration:underline}
a:visited, span.x_MsoHyperlinkFollowed
{color:#954F72;
text-decoration:underline}
.x_MsoChpDefault
{}
@page WordSection1
{margin:70.85pt 56.7pt 70.85pt 56.7pt}
div.x_WordSection1
{}
-->
</style>
<div lang="EN-US" link="blue" vlink="#954F72">
<div class="x_WordSection1">
<p class="x_MsoNormal">Hi,</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">I’m following up on this one with a related issue. I am testing PyTorch independently of OpenNMT-py, but cannot get it to run on (Taito-)GPU.</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">Specifically, although I am logged in to Taito-GPU, the test script described on the Wiki page does not return True:</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">[GPU-Env lstmtagger]$ srun -n 1 -p gputest --gres=gpu:k80:1 --mem 1G -t 15 --pty python3 /proj/nlpl/software/pytorch/0.4.1/test.py</p>
<p class="x_MsoNormal">srun: job 32089470 queued and waiting for resources</p>
<p class="x_MsoNormal">srun: job 32089470 has been allocated resources</p>
<p class="x_MsoNormal">False</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">I also get ‘False’ when running the following script through sbatch:</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">#!/bin/bash</p>
<p class="x_MsoNormal">#SBATCH -J cudatest</p>
<p class="x_MsoNormal">#SBATCH -o cudatest.%j.out</p>
<p class="x_MsoNormal">#SBATCH -e cudatest.%j.err</p>
<p class="x_MsoNormal">#SBATCH -t 0:05:00</p>
<p class="x_MsoNormal">#SBATCH -p gputest</p>
<p class="x_MsoNormal">#SBATCH -N 1</p>
<p class="x_MsoNormal">#SBATCH --gres=gpu:k80:1</p>
<p class="x_MsoNormal">#SBATCH --mem=1g</p>
<p class="x_MsoNormal">module use -a /proj/nlpl/software/modulefiles/</p>
<p class="x_MsoNormal">module load nlpl-pytorch</p>
<p class="x_MsoNormal">srun python3 /proj/nlpl/software/pytorch/0.4.1/test.py</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">Has there been any change lately? Or am I missing something obvious?</p>
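<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">One way to narrow this down might be a small diagnostic run inside the job in place of test.py. The following is only a sketch: cuda_report is a made-up helper name, not part of the NLPL installation, and it assumes nothing beyond torch.version.cuda, torch.cuda.is_available() and torch.cuda.device_count(). It reports which interpreter and which torch are actually picked up, and the value of CUDA_VISIBLE_DEVICES, which Slurm normally sets when a GPU is allocated through --gres.</p>
<pre>
```python
import importlib
import os
import sys


def cuda_report():
    """Collect facts that may explain why torch.cuda.is_available() is False."""
    report = {
        "python": sys.executable,  # interpreter actually running this script
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_file": None,
        "torch_version": None,
        "torch_cuda_build": None,
        "cuda_available": False,
        "device_count": 0,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not importable from this interpreter at all
        return report
    report["torch_file"] = getattr(torch, "__file__", None)
    report["torch_version"] = getattr(torch, "__version__", None)
    # None here means torch was built without CUDA support
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    # str.format keeps this usable on Python 3.5, which has no f-strings
    for key, value in sorted(cuda_report().items()):
        print("{}: {}".format(key, value))
```
</pre>
<p class="x_MsoNormal">Running it with the same srun line as above would show whether python3 and torch inside the job come from the nlpl-pytorch module or from somewhere else.</p>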
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">Best,</p>
<p class="x_MsoNormal">Yves</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal"> </p>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Stephan Oepen &lt;oe@ifi.uio.no&gt;<br>
<b>Sent:</b> Wednesday, September 26, 2018 11:10:12 PM<br>
<b>To:</b> Scherrer, Yves<br>
<b>Cc:</b> Martin Matthiesen; infrastructure<br>
<b>Subject:</b> Re: [NLPL Task Force (A)] OpenNMT installation for NLPL (on Abel)</font>
<div> </div>
</div>
</div>
<font size="2"><span style="font-size:11pt;">
<div class="PlainText">hi again,<br>
<br>
> i actually had a go at my own glibc and PyTorch installations on Taito, but<br>
> so far gpu support is elusive.<br>
<br>
actually, with a little more tinkering, i now believe i might have a<br>
working installation of PyTorch 0.4.1 and OpenNMT-py 0.2.1 on Taito<br>
too, seemingly functional on both cpu and gpu nodes:<br>
<br>
[oe@taito-login4 ~]$ module purge<br>
[oe@taito-login4 ~]$ module load nlpl-opennmt-py<br>
Loading application python-3.5.3 environment with needed modules<br>
[oe@taito-login4 ~]$ module list<br>
<br>
Currently Loaded Modules:<br>
1) gcc/5.4.0 2) intelmpi/5.1.3 3) mkl/11.3.2 4) python/3.5.3<br>
5) python-env/3.5.3 6) nlpl-pytorch/0.4.1 7) nlpl-opennmt-py/0.2.1<br>
<br>
[oe@taito-login4 ~]$ type -all python<br>
python is /proj/nlpl/software/opennmt-py/0.2.1/bin/python<br>
python is /proj/nlpl/software/pytorch/0.4.1/bin/python<br>
python is /appl/opt/python/3.5.3-gnu540/bin/python<br>
python is /usr/bin/python<br>
[oe@taito-login4 ~]$ python -c "import torch; import onmt; print(torch.cuda.is_available());"<br>
False<br>
<br>
[oe@taito-login4 ~]$ srun -n 1 -p gputest --gres=gpu:k80:1 --mem 1G -t 15 --pty \<br>
python -c "import torch; import onmt; print(torch.cuda.is_available());"<br>
True<br>
<br>
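to go one step beyond is_available(), something like the following could serve as a first smoke test on a gpu node. it is only a sketch: gpu_smoke_test is a made-up name, not part of the NLPL installation, and it assumes the torch.device and device= keyword interfaces available as of PyTorch 0.4. it multiplies two small matrices of ones on whichever device is usable and returns the device type together with the sum of the result:<br>
<pre>
```python
def gpu_smoke_test():
    """Run a tiny matrix product on the GPU if usable, otherwise on the CPU.

    Returns (device_type, total), or None when torch cannot be imported.
    """
    try:
        import torch
    except ImportError:
        return None
    # fall back to the CPU when no CUDA device is visible
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    a = torch.ones(2, 3, device=device)
    b = torch.ones(3, 2, device=device)
    product = a.mm(b)  # 2x2 result, every entry equal to 3
    total = float(product.sum())
    return device.type, total


if __name__ == "__main__":
    print(gpu_smoke_test())
```
</pre>
on a gpu node with a working installation the first element should be "cuda"; on the login node one would expect "cpu".<br>
<br>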
yves (or joerg), i would have a hard time testing things in much more<br>
depth. any chance you would have some time to try and replicate on<br>
Taito the validation steps you are currently running on Abel?<br>
<br>
with a sense of accomplishment :-), oe<br>
</div>
</span></font>
</body>
</html>