[NLPL Task Force (A)] [rt.uio.no #3396801] newer CUDA versions on Abel

Stephan Oepen via RT hpc-drift at usit.uio.no
Thu May 16 12:08:21 UTC 2019


> Try logging in to c19-10 and test, the latest driver is now installed.

very cool!

2019-05-16 13:50:17.663646: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-05-16 13:50:17.665317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-16 13:50:17.665348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0
2019-05-16 13:50:17.665363: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N
2019-05-16 13:50:17.665533: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5329 MB memory) -> physical GPU (device: 0, name: Tesla K20Xm, pci bus id: 0000:84:00.0, compute capability: 3.5)
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K20Xm, pci bus id: 0000:84:00.0, compute capability: 3.5
2019-05-16 13:50:17.667836: I tensorflow/core/common_runtime/direct_session.cc:317] Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K20Xm, pci bus id: 0000:84:00.0, compute capability: 3.5

MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2019-05-16 13:50:17.669667: I tensorflow/core/common_runtime/placer.cc:1059] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-05-16 13:50:17.669699: I tensorflow/core/common_runtime/placer.cc:1059] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-05-16 13:50:17.669718: I tensorflow/core/common_runtime/placer.cc:1059] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
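
for checking the same thing on other nodes without reading the log by eye, something like the following might do. it is only a rough sketch: the names parse_placements, ops_not_on_gpu, PLACEMENT, and SAMPLE are made up for illustration, and it assumes the one-line "name: (Op): device" placement entries seen above (not the timestamped placer.cc lines), which may look different in other TensorFlow releases.

```python
import re

# Hypothetical helper, not part of TensorFlow: matches the one-line
# placement entries shown in the log above, for example
#   MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
PLACEMENT = re.compile(r"^(?P<name>\S+): \((?P<op>\w+)\): (?P<device>/job:\S+)$")


def parse_placements(log_text):
    """Return {op name: device string} for every placement entry in the log."""
    placements = {}
    for line in log_text.splitlines():
        match = PLACEMENT.match(line.strip())
        if match:
            placements[match.group("name")] = match.group("device")
    return placements


def ops_not_on_gpu(placements):
    """Return the sorted names of ops that were not placed on a GPU device."""
    return sorted(
        name
        for name, device in placements.items()
        if "/device:GPU:" not in device
    )


# The three placement entries copied from the log above.
SAMPLE = """\
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
"""

if __name__ == "__main__":
    placed = parse_placements(SAMPLE)
    print(placed)
    print("not on gpu:", ops_not_on_gpu(placed))
```

an empty "not on gpu" list for a saved log would suggest that all ops ended up on the GPU, as they did on c19-10.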

---my older TensorFlow installation (release 1.11; built against CUDA
9.0) did not work out of the box on this node, but i found what it
takes to make it work again (and still work on the old nodes).  so, i
think we would welcome new drivers on all gpu nodes now :-).  (how)
can that be accomplished?

oe




