[NLPL Task Force (A)] [uninett.no #196965] Tensorflow issues, pt. 2

Vinit Ravishankar via RT support at metacenter.no
Fri Oct 25 09:13:31 UTC 2019


No, I wasn’t aware actually, thanks! I’ve decided to start all over, in a fresh Conda environment — what I now need is TF 1.13.1, which requires cuDNN 7.4 and CUDA 10.0. I’d also need OpenMPI 3.1.2/4.0.0, NCCL 2, and GCC 4.9 (I think?) for the compilation. 

I gather that CUDA 10.0 isn’t actually available (it’s just 9 or 10.1)? Is it possible to (somehow) build TF with CUDA 10? As far as I’m aware, there’s no TF version that really supports 10.1 yet.

– Vinit

> On 25 Oct 2019, at 10:31, Henrik R. Nagel via RT <support at metacenter.no> wrote:
> 
> Hi,
> 
>> Sure, I’m using my own Miniconda environment (Python 3.7), I’ve
>> attached the output of `conda list`.
> 
> It is very difficult for us to solve a problem concerning that much software, which is out of our control. However, in the list it says that you have installed your own version of the CUDA software using Miniconda.
> 
>> module load NCCL/2.4.8-gcccuda-2018b
> 
> This command loads an older version of CUDA than the one that you installed. Are you aware that you have loaded two different versions of the CUDA software at the same time? Since you have already installed your own version of the CUDA software, maybe you should avoid loading our NCCL module and instead also install NCCL using Miniconda?
> 
> Best regards,
> 
> Henrik




