[NLPL Task Force (A)] PyTorch on CPU

Scherrer, Yves yves.scherrer at helsinki.fi
Tue Sep 25 07:07:41 UTC 2018


Hi,



I see basically three use cases for PyTorch (and similar libraries) on CPU:

  *   Rapid prototyping and testing (without having to write SLURM scripts). A combination of running directly on the login nodes (I know, you CSC guys don’t like that…) and the gputest queue should do the trick here.
  *   Teaching: it would be nice if students could run PyTorch directly from taito-shell, as they are used to doing for other tools. This is low priority, as we don’t use PyTorch for teaching at the moment, but we may do so in the near future.
  *   The most important point, though, is that running PyTorch on CPU lets us benefit from the shorter queues on the CPU nodes and from the “cheaper” billing. In machine translation, for example, training new models requires a GPU, but using those models to translate text usually doesn’t benefit much from one, so in these cases we have often switched to CPU mode. These are typically very long runs for which the current GPU billing scheme is a bit prohibitive. (One of the CSC guys once even told me that I wasn’t using the GPU efficiently and should therefore switch to CPU instead. That was not with PyTorch, but in a similar setting with Theano.)
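For the last point, the switch between GPU training and CPU translation can be kept to a one-line change in PyTorch. A minimal sketch (the tiny linear layer is a made-up stand-in for a real translation model):

```python
import torch

# Pick the device once; the rest of the code stays device-agnostic,
# so the same script runs on taito-gpu (CUDA) and on the CPU nodes.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical model standing in for a trained translation model.
model = torch.nn.Linear(8, 2).to(device)
model.eval()

# Inference on whatever device was selected.
with torch.no_grad():
    batch = torch.randn(4, 8, device=device)
    scores = model(batch)

print(scores.shape, scores.device)
```

A model trained on GPU can also be loaded directly onto CPU with `torch.load(path, map_location="cpu")`, which is the usual way to move a finished model to the cheaper nodes.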



Best,

Yves



________________________________
From: Martin Matthiesen <martin.matthiesen at csc.fi>
Sent: Friday, September 21, 2018 6:01:06 PM
To: infrastructure
Cc: Scherrer, Yves; Markus Koskela
Subject: PyTorch on CPU

Hello,

I discussed CPU support for PyTorch with Markus today, and he told me that PyTorch works in a CPU-only environment. A case in point is the taito-gpu.csc.fi login node, which has no GPU hardware. It does not work on taito.csc.fi, and it is a bit unclear to us why. There is a missing symbol, but ldd shows that all libraries (and the same ones as on taito-gpu) are found.
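For what it's worth, ldd only checks that the shared libraries are found, not that they export the symbol in question, so a missing-symbol failure can occur even when ldd looks clean. A hypothetical way to narrow it down (the library path is made up for illustration):

```shell
# Trigger the import so the dynamic loader prints the name of the
# missing symbol in its error message.
python3 -c "import torch" 2>&1 | tail -n 2

# Then check whether the library that should provide the symbol actually
# defines it: "T"/"W" means defined, "U" means the library itself expects
# it from somewhere else. (Path and symbol are placeholders.)
# nm -D /path/to/torch/lib/libtorch.so | grep MISSING_SYMBOL
```

If the symbol shows up as "U" everywhere, the usual suspects are a compiler/glibc version mismatch between the two systems rather than a missing library.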

I would still like to ask why CPU-only support is important. I understood it is for rapid prototyping, but would Taito-gpu's gputest queue [1] effectively achieve the same? The queue is meant for very short runs only (max. 15 minutes).

Have a nice weekend!
Martin

[1] https://research.csc.fi/taito-gpu-running?inheritRedirect=true

--
Martin Matthiesen
CSC - Tieteen tietotekniikan keskus
CSC - IT Center for Science
PL 405, 02101 Espoo, Finland
+358 9 457 2376, martin.matthiesen at csc.fi
Public key : https://pgp.mit.edu/pks/lookup?op=get&search=0x74B12876FD890704
Fingerprint: AA25 6F56 5C9A 8B42 009F  BA70 74B1 2876 FD89 0704