Issue
I'm trying to use tensorflow with my PC's GPU (Nvidia RTX 3070Ti) in python-conda environment. I'm solving a small image-classification problem from kaggle. I've solved it in google-collab, but now I'm intrested in solving it on my local machine. However TF doesn't work properly locally and I have no idea why. I've read tons of solutions but it didn't help yet.
I'm following this guide and always install proper versions of TF and CUDA: https://www.tensorflow.org/install/source_windows
cuda-toolkit 10.1, cudnn 7.6, tf-gpu 2.3, python 3.8
Also I've installed latest NVidia drivers for videocard.
What I've tried:
- I've installed proper version CUDA-toolkit and CUDnn from nvidia site. I've installed it properly and included everything that was needed into PATH. I've checked it - MS Visiual Studio finds both CUDA and CUDnn and can work with it. I've installed proper version of Tensorflow-GPU using conda into my environment.
Result: TF can't find my GPU and uses only CPU.
- I've removed all CUDA and CUDAnn drivers. I've installed CUDA-toolkit, CUDnn and Tensorflow-GPU python packages into my conda environment.
Result: TF recognizes my GPU and uses it! But during DNN training happens error: Failed to launch ptxas Relying on driver to perform ptx compilation. Modify $PATH to customize ptxas location.
And training goes very bad - accuracy is very low and doesn't improving.
When I use absolutely same code and data on google-collab, everything is going smoothly - I get ~90% accuracy on 5th epoch.
I've tried tf 2.1 and relevant cuda and cudnn, but it's still same result!
I've tried to install cudatoolkit-dev, but it didn't help to solve ptxas problem.
I'm about to give up and use PyTorch instead of Tensorflow.
Solution
So here is what worked for me:
- Create 3.9 python environment
- Install cuda and tensorflow packages from "Esri":
conda install -c esri cudatoolkit conda install -c esri cudnn conda install -c esri tensorflow-gpu
- Then install tensorflow-hub:
conda install -c conda-forge tensorflow-hub
It will downgrade installations from previous steps, but it works. Maybe installing tensorflow-hub first could help to avoid it, but I didn't test it.
Answered By - Roman Velichkin
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.