[NLPL Task Force (A)] Module PyTorch 1.1.0 / Python 3.7: Pip doesn’t work (Abel)
Stephan Oepen
oe at ifi.uio.no
Thu May 30 10:35:36 UTC 2019
hi johannes,
thank you for your report. it points out a missing customization
step for these scripts in our automated installation process. i
suspect this problem will be present in other NLPL modules based on
python 3.7; it went unnoticed until now, probably in part because i
have made it a habit to say 'python3 -m pip' instead of just 'pip',
and that habit may also have started to rub off on local NLPL
colleagues already :-).
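
for anyone who wants to check for this kind of breakage themselves, here
is a minimal sketch. it assumes a POSIX shell, and 'check_shebang' is a
hypothetical helper name used for illustration, not part of the NLPL
installation. it reads the first line of a script such as 'pip' and
reports whether the interpreter named there actually exists:

```shell
# Report whether the interpreter named in a script's shebang line exists.
# Hypothetical helper for illustration; not part of the NLPL modules.
check_shebang() {
    script="$1"
    # Only the first line of the file can hold the shebang.
    first_line=$(head -n 1 "$script")
    case "$first_line" in
        '#!'*) ;;
        *) echo "no shebang: $script"; return 2 ;;
    esac
    interpreter=${first_line#??}    # drop the leading "#!"
    interpreter=${interpreter# }    # drop one optional leading space
    interpreter=${interpreter%% *}  # drop any interpreter arguments
    if [ -x "$interpreter" ]; then
        echo "ok: $interpreter"
    else
        echo "missing: $interpreter"
        return 1
    fi
}
```

calling check_shebang "$(command -v pip)" should then name the interpreter
the wrapper script points at. for a shebang of the '/usr/bin/env' form,
the sketch only checks 'env' itself. 'python3 -m pip' sidesteps the
wrapper script altogether, because it runs pip as a module of whichever
python3 is first on the PATH.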
anyway, if you try again now, i hope the problem you were having is
fixed? also, no need for you to manually pre-load CUDA; the NLPL
modules should automatically load the dependencies they need.
a note of warning, however: we installed TensorFlow 1.13 last week,
which requires CUDA 10.0, which in turn requires an update of the
NVIDIA drivers on the Abel gpu nodes. USIT is currently performing
that update, but it takes time to roll out as they are waiting for
running jobs to complete. we have yet to confirm that our NLPL
gpu-enabled modules will just work on the new gpu driver versions,
once all nodes have it installed. hence, please be in touch in case
you notice anything surprising!
best wishes, oe
On Thu, May 30, 2019 at 12:18 PM Johannes Gontrum <j at gontrum.me> wrote:
>
> Hej!
>
> I just discovered that you already provide a module for the latest version of PyTorch! First of all: Thank you very much for keeping the packages up-to-date so quickly! However, I discovered a problem:
>
> I’m trying to use the latest version of PyTorch (1.1.0) with Python 3.7 on Abel, but pip seems to be misconfigured.
>
> On a node I run:
> module purge
> module use -a /proj*/nlpl/software/modulefiles
> module load cuda/9.0 nlpl-pytorch/1.1.0/3.7
> which pip
> => /projects/nlpl/software/pytorch/1.1.0/bin/3.7/pip
> pip --version
> => bash: /projects/nlpl/software/pytorch/1.1.0/bin/3.7/pip: /projects/nlpl/software/pytorch/1.1.0/bin/python3: bad interpreter: No such file or directory
>
> I believe pip is linked to a wrong Python version, as the correct binary is not in `/projects/nlpl/software/pytorch/1.1.0/bin/python3`, but in `/projects/nlpl/software/pytorch/1.1.0/bin/3.7/python` (subfolder ‘3.7’).
>
> If I try the same for PyTorch 1.0.0, everything works fine.
>
> Could you please look into this? The latest update to PyTorch brings some features that I’d really love to use in my project.
>
>
> Thank you and best wishes,
> Johannes
> (Language Technology Master's student at Uppsala University)