[NLPL Task Force (A)] [uninett.no #211080] NCCL on Saga (for use with CUDA 10.2)
Vegard Eide via RT
metacenter-software at metacenter.no
Wed May 13 06:32:17 UTC 2020
Tue 12 May 2020 14:31:07, oe at ifi.uio.no wrote:
> nccl_2.5.6-2+cuda10.2_x86_64.txz
> nccl_2.5.6-1+cuda10.1_x86_64.txz
i see. i guess no matter what they say in the release notes, that
should be sufficient reason to install separate modules for each
download that is actually needed. in fact, it appears these are
indeed distinct builds:
$ diff -iwr nccl_2.6.4-1+cuda10.*
Binary files nccl_2.6.4-1+cuda10.0_x86_64/lib/libnccl.so and nccl_2.6.4-1+cuda10.2_x86_64/lib/libnccl.so differ
Binary files nccl_2.6.4-1+cuda10.0_x86_64/lib/libnccl.so.2 and nccl_2.6.4-1+cuda10.2_x86_64/lib/libnccl.so.2 differ
Binary files nccl_2.6.4-1+cuda10.0_x86_64/lib/libnccl.so.2.6.4 and nccl_2.6.4-1+cuda10.2_x86_64/lib/libnccl.so.2.6.4 differ
Binary files nccl_2.6.4-1+cuda10.0_x86_64/lib/libnccl_static.a and nccl_2.6.4-1+cuda10.2_x86_64/lib/libnccl_static.a differ
for me, just now, i believe i would like to try NCCL 2.6.4 (the
current version) on CUDA 10.2 (default requirement for current
PyTorch) and 10.1 (for current TensorFlow), so hopefully i can make do
(for the time being :-) with just two modules!
Hi,
We have installed
NCCL/2.6.4-CUDA-10.1
NCCL/2.6.4-CUDA-10.2
Note that loading either of these modules does not automatically load a CUDA
module, since each can be used with several CUDA modules (10.1.x and 10.2.x,
respectively). You need to load a matching CUDA module yourself.
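As a rough sketch of what this means in practice: load a CUDA module
explicitly, then the matching NCCL module. The NCCL module name is the one
installed above; the CUDA module name (CUDA/10.2.89) is only an assumption
for illustration, so check `module avail CUDA` for the names that actually
exist on Saga.

```shell
# Assumed CUDA module name; replace with one listed by `module avail CUDA`.
CUDA_MODULE="${CUDA_MODULE:-CUDA/10.2.89}"
# NCCL module from this thread; use NCCL/2.6.4-CUDA-10.1 with a CUDA 10.1.x module.
NCCL_MODULE="NCCL/2.6.4-CUDA-10.2"

if type module >/dev/null 2>&1; then
    # The NCCL module does not pull in CUDA, so load CUDA first.
    module load "$CUDA_MODULE"
    module load "$NCCL_MODULE"
    module list
else
    # Not on a system with environment modules; just show what would be loaded.
    echo "module command not found; would load: $CUDA_MODULE $NCCL_MODULE"
fi
```

For the CUDA 10.1 case (e.g. for TensorFlow), the same pattern applies with
a CUDA 10.1.x module and NCCL/2.6.4-CUDA-10.1.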
Regards
Vegard