[NLPL Task Force (A)] [rt.uio.no #3406027] gpu usage on Abel

Thomas Röblitz via RT hpc-drift at usit.uio.no
Thu May 16 05:59:04 UTC 2019


Did you observe this at one point in time or have you seen this ongoing over a longer period?
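
If only a single snapshot is available, one way to tell the two cases apart would be to repeat the check a few times and log the result. The following is a minimal sketch, not a tested tool. It assumes that the nvidia-smi installed on the compute nodes supports the --query-gpu option and that ssh to the nodes works as in the loop quoted below. The function names count_idle and sample_nodes are hypothetical.

```shell
# Count GPUs reporting 0% utilization. Reads "index, utilization" CSV rows
# (the format requested from nvidia-smi in sample_nodes) on stdin.
count_idle() {
  awk -F', *' '$2 == 0 { idle++ } END { print idle + 0 }'
}

# Hypothetical helper: poll each node ROUNDS times, INTERVAL seconds apart,
# and print one timestamped line per node and round.
# Usage: sample_nodes ROUNDS INTERVAL NODE...
sample_nodes() {
  rounds=$1; interval=$2; shift 2
  r=1
  while [ "$r" -le "$rounds" ]; do
    for node in "$@"; do
      # Assumption: this nvidia-smi version accepts --query-gpu.
      idle=$(ssh -n "$node" nvidia-smi --query-gpu=index,utilization.gpu \
        --format=csv,noheader,nounits | count_idle)
      printf '%s %s idle_gpus=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$node" "$idle"
    done
    # Skip the pause after the final round.
    [ "$r" -lt "$rounds" ] && sleep "$interval"
    r=$((r + 1))
  done
}
```

For example, `sample_nodes 6 600 c19-3 c19-5 c19-8` would take six samples at ten-minute intervals. A node that reports an idle GPU in every round points to a persistent pattern rather than a momentary one.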

Thomas

On 2019-05-16 00:46:07, oe wrote:
> dear colleagues,
> 
> > User notified.
> 
> i am reopening this ticket, as it looks like over-allocation of gpus
> by this user continues:
> 
> [oe@login-0-0 ~]$ squeue --partition=accel | grep michaelm
>           26981933     accel   pe_con michaelm PD       0:00      1 (Priority)
>           26981931     accel   pe_con michaelm  R 1-10:45:37      1 c19-15
>           26981930     accel   pe_con michaelm  R 1-11:25:20      1 c19-16
>           26981928     accel   st_con michaelm  R 1-11:53:02      1 c19-5
>           26981929     accel   pe_con michaelm  R 1-11:53:02      1 c19-11
>           26981926     accel   st_con michaelm  R 1-11:53:49      1 c19-3
>           26981927     accel   st_con michaelm  R 1-11:53:49      1 c19-8
>           26981924     accel   st_con michaelm  R 1-11:54:36      1 c19-14
> [oe@login-0-0 ~]$ for i in 3 5 8 11 14 15 16; do ssh c19-${i} nvidia-smi | grep Default; done
> | N/A   31C    P0    86W / 235W |    673MiB /  5699MiB |     80%      Default |
> | N/A   19C    P8    18W / 235W |     11MiB /  5699MiB |      0%      Default |
> | N/A   30C    P0    85W / 235W |    673MiB /  5699MiB |     82%      Default |
> | N/A   18C    P8    18W / 235W |     11MiB /  5699MiB |      0%      Default |
> | N/A   31C    P0    88W / 235W |    673MiB /  5699MiB |     83%      Default |
> | N/A   18C    P8    18W / 235W |     11MiB /  5699MiB |      0%      Default |
> | N/A   46C    P0    96W / 235W |   1128MiB /  5699MiB |     82%      Default |
> | N/A   26C    P8    17W / 235W |     11MiB /  5699MiB |      0%      Default |
> | N/A   34C    P0    90W / 235W |    673MiB /  5699MiB |     82%      Default |
> | N/A   20C    P8    18W / 235W |     11MiB /  5699MiB |      0%      Default |
> | N/A   35C    P0    96W / 235W |   1128MiB /  5699MiB |     86%      Default |
> | N/A   20C    P8    17W / 235W |     11MiB /  5699MiB |      0%      Default |
> | N/A   33C    P0    90W / 235W |   1128MiB /  5699MiB |     84%      Default |
> | N/A   20C    P8    18W / 235W |     11MiB /  5699MiB |      0%      Default |
> 
> i do not want to police other abel users.  but unless i mis-read the
> above, this usage pattern unnecessarily blocks close to half of one of
> the currently most precious resources available.
> 
> best wishes (from copenhagen), oe



