[NLPL Task Force (A)] [usit-vd-rt] [rt.uio.no #3396801] newer CUDA versions on Abel
Ole Saastad via RT
hpc-drift at usit.uio.no
Thu May 16 12:19:41 UTC 2019
The new driver is now in the distro; all it takes is to reinstall the
nodes in rack 19. This can take some time, as there are running jobs.
Ole
On Thu, 2019-05-16 at 14:08 +0200, Stephan Oepen via RT wrote:
> <URL: https://rt.uio.no/Ticket/Display.html?id=3396801 >
>
> > Try logging in to c19-10 and test, the latest driver is now
> > installed.
>
> very cool!
>
> 2019-05-16 13:50:17.663646: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
> 2019-05-16 13:50:17.665317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
> 2019-05-16 13:50:17.665348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
> 2019-05-16 13:50:17.665363: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
> 2019-05-16 13:50:17.665533: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5329 MB memory) -> physical GPU (device: 0, name: Tesla K20Xm, pci bus id: 0000:84:00.0, compute capability: 3.5)
> Device mapping:
> /job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
> /job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
> /job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K20Xm, pci bus id: 0000:84:00.0, compute capability: 3.5
> 2019-05-16 13:50:17.667836: I tensorflow/core/common_runtime/direct_session.cc:317] Device mapping:
> /job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
> /job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
> /job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K20Xm, pci bus id: 0000:84:00.0, compute capability: 3.5
>
> MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
> 2019-05-16 13:50:17.669667: I tensorflow/core/common_runtime/placer.cc:1059] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
> a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
> 2019-05-16 13:50:17.669699: I tensorflow/core/common_runtime/placer.cc:1059] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
> b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
> 2019-05-16 13:50:17.669718: I tensorflow/core/common_runtime/placer.cc:1059] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
>
> ---my older TensorFlow installation (release 1.11; built against CUDA
> 9.0) did not work out of the box on this node, but i found what it
> takes to make it work again (and still work on the old nodes). so, i
> think we would welcome new drivers on all gpu nodes now :-). (how)
> can that be accomplished?
>
> oe
>
>
--
Ole W. Saastad, Dr.Scient.
UiO/USIT/UVA/ITF/FI
Visiting address: Kristen Nygaards hus - Room 2315
Post: Gaustadalléen 23A, 0349 Oslo
USIT, Postboks 1059 Blindern, 0316 Oslo
Tel: +47-22840752