tf_setup
Detailed instructions can be found in the TensorFlow installation guide.
Conda env
Install conda if not present
conda_install_script="Miniconda3-latest-Linux-x86_64.sh"
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o $conda_install_script
bash $conda_install_script
rm $conda_install_script
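Since the installer only needs to run when conda is missing, the check can be made explicit. This is a minimal sketch, not part of the original setup; the helper names `have_cmd` and `conda_hint` are hypothetical.

```shell
# Succeed if the given command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the Miniconda installer steps still need to be run.
conda_hint() {
    if have_cmd conda; then
        echo "conda found, skipping the installer"
    else
        echo "conda not found, run the installer steps"
    fi
}

conda_hint
```

If the hint says conda is missing, run the curl and bash commands above; otherwise continue with the environment setup.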
Setup conda env
conda create -n cuda11.2.2_py3.10 python=3.10
- Choose a meaningful name (e.g. cuda11.2.2_py3.10 for CUDA 11.2.2 in a Python 3.10 environment).
- python=x.y, where x.y is the Python version number.
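The naming convention can be kept consistent by deriving the name from the two version numbers. This is a small sketch under that assumption; `make_env_name` is a hypothetical helper, not a conda command, and the block only prints the resulting command instead of running it.

```shell
# Build an environment name such as cuda11.2.2_py3.10.
# make_env_name is a hypothetical helper, not part of conda.
make_env_name() {
    cuda_version="$1"
    python_version="$2"
    printf 'cuda%s_py%s\n' "$cuda_version" "$python_version"
}

env_name="$(make_env_name 11.2.2 3.10)"
# Print the command for review rather than executing it.
echo "conda create -n $env_name python=3.10"
```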
Then you can activate the conda environment
conda activate cuda11.2.2_py3.10
Install conda packages
conda install -c conda-forge cudatoolkit=x.y.z
conda install -c nvidia cuda-nvcc
The correct package versions can be found in the tensorflow installation guide.
Then you can exit the conda env
conda deactivate
Virtual environment
I highly recommend using virtualenv instead of conda for the virtual environment, because in my eyes conda is very dangerous. Conda symbolically links the executables from the base environment and the filesystem into a virtual environment. This means that if you upgrade, e.g., pip, the pip on the filesystem is also upgraded, which in many cases is unwanted behavior.
Install virtualenv
Therefore, I recommend using a real virtual environment. One option, among others, is virtualenv:
sudo apt update && sudo apt install -y virtualenv
Setup virtual env
Navigate to your project folder and execute the following command
virtualenv venv --python=python3.10
The Python version here and in the conda env should match.
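Whether the two interpreters really match can be checked by comparing their `--version` output. The following is a sketch, assuming both interpreters print a line of the form `Python x.y.z`; `major_minor` and `same_python_version` are hypothetical helper names.

```shell
# Extract "major.minor" from the output of `python --version`,
# e.g. "Python 3.10.12" becomes "3.10".
major_minor() {
    printf '%s\n' "$1" | sed -E 's/^Python ([0-9]+\.[0-9]+).*/\1/'
}

# Compare two interpreters; both arguments are paths to python binaries.
same_python_version() {
    a="$(major_minor "$("$1" --version 2>&1)")"
    b="$(major_minor "$("$2" --version 2>&1)")"
    [ "$a" = "$b" ]
}
```

For example, `same_python_version venv/bin/python "$HOME/miniconda3/envs/cuda11.2.2_py3.10/bin/python" && echo "versions match"`, where the conda path is an assumption and depends on where Miniconda was installed.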
Install packages
Go into the virtual env
source venv/bin/activate
Then install tensorflow (compatible package versions are listed in the TensorFlow installation guide) and tensorrt
pip install nvidia-pyindex
pip install tensorflow==x.y.z nvidia-tensorrt nvidia-cudnn-cuxx==a.b.c.def
Unfortunately, if TensorRT 7.*.*.* is needed (not available on PyPI), you must manually link the installed libnvinfer and libnvinfer_plugin to version 7, e.g.
ln -s /home/andri/Projects/tf_agents_tutorial/venv/lib/python3.10/site-packages/tensorrt/libnvinfer.so.8 /home/andri/Projects/tf_agents_tutorial/venv/lib/python3.10/site-packages/tensorrt/libnvinfer.so.7
ln -s /home/andri/Projects/tf_agents_tutorial/venv/lib/python3.10/site-packages/tensorrt/libnvinfer_plugin.so.8 /home/andri/Projects/tf_agents_tutorial/venv/lib/python3.10/site-packages/tensorrt/libnvinfer_plugin.so.7
Be aware that your paths may differ from the given paths. Your paths can be found via
find . -iname "*libnvinfer*"
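To avoid typing the absolute paths twice, the links can be created relative to the directory that `find` reports. This is a sketch only: `link_trt7` is a hypothetical helper, and the `.so.8` file names are taken from the example above and may differ for other TensorRT releases.

```shell
# Create version-7 symlinks next to the version-8 TensorRT libraries.
# link_trt7 is a hypothetical helper; it takes the tensorrt directory.
link_trt7() {
    trt_dir="$1"
    for lib in libnvinfer libnvinfer_plugin; do
        # Only link when the .so.8 file exists and no .so.7 is present yet.
        if [ -e "$trt_dir/$lib.so.8" ] && [ ! -e "$trt_dir/$lib.so.7" ]; then
            # Relative target: the link resolves inside trt_dir itself.
            ln -s "$lib.so.8" "$trt_dir/$lib.so.7"
        fi
    done
}
```

Called from the project folder, this would look like `link_trt7 venv/lib/python3.10/site-packages/tensorrt`. Because the link targets are relative, the links stay valid if the project folder is moved.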
Modify activation script
In order for tensorflow to find the CUDA dependencies, their locations must be made available in the virtual environment.
To do this, add these lines at the bottom of the activation script (venv/bin/activate):
# tf env
conda_tf_env="/home/andri/miniconda3/envs/cuda11.2.2_py3.10"
_OLD_LD_LIBRARY_PATH="$LD_LIBRARY_PATH"
export _OLD_LD_LIBRARY_PATH
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$conda_tf_env/lib:/home/andri/Projects/tf_agents_tutorial/venv/lib/python3.10/site-packages/tensorrt/"
export LD_LIBRARY_PATH
_OLD_XLA_FLAGS="$XLA_FLAGS"
export _OLD_XLA_FLAGS
XLA_FLAGS="--xla_gpu_cuda_data_dir=$conda_tf_env"
export XLA_FLAGS
Be aware that your paths may differ from the given paths.
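One detail of the lines above: when LD_LIBRARY_PATH is empty before activation, joining with ":" leaves an empty first entry, which the dynamic linker interprets as the current working directory. A minimal sketch of a join that avoids this follows; `append_path` is a hypothetical helper and the directories are placeholders.

```shell
# Join an existing path list and a new entry without leaving an empty
# element when the existing list is empty.
# append_path is a hypothetical helper, not a standard shell command.
append_path() {
    if [ -z "$1" ]; then
        printf '%s\n' "$2"
    else
        printf '%s:%s\n' "$1" "$2"
    fi
}

# Placeholder directories, for illustration only.
append_path "" "/opt/env/lib"
append_path "/usr/local/lib" "/opt/env/lib"
```

In the activation script this could be used as `LD_LIBRARY_PATH="$(append_path "$LD_LIBRARY_PATH" "$conda_tf_env/lib")"`.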
Additionally, add the following lines to the deactivate() function in the activation script:
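# tf env
# deactivate also runs once when the activation script is sourced, before
# the _OLD_* variables exist, so only restore values that were actually saved.
if [ -n "${_OLD_LD_LIBRARY_PATH+_}" ]; then
    LD_LIBRARY_PATH="$_OLD_LD_LIBRARY_PATH"
    export LD_LIBRARY_PATH
    unset _OLD_LD_LIBRARY_PATH
fi
if [ -n "${_OLD_XLA_FLAGS+_}" ]; then
    XLA_FLAGS="$_OLD_XLA_FLAGS"
    export XLA_FLAGS
    unset _OLD_XLA_FLAGS
fi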
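After editing the activation script, it is worth checking that activating and deactivating leaves the shell as it was. The following is a sketch of such a check; `check_roundtrip` is a hypothetical helper, and it runs in a subshell so the current shell is not modified.

```shell
# Source an activation script in a subshell, call deactivate, and succeed
# only if LD_LIBRARY_PATH and XLA_FLAGS are back to their previous values.
# check_roundtrip is a hypothetical helper; pass a path containing a slash.
check_roundtrip() {
    (
        ld_before="${LD_LIBRARY_PATH:-}"
        xla_before="${XLA_FLAGS:-}"
        . "$1" || exit 2
        deactivate
        [ "${LD_LIBRARY_PATH:-}" = "$ld_before" ] &&
            [ "${XLA_FLAGS:-}" = "$xla_before" ]
    )
}
```

Usage from the project folder: `check_roundtrip ./venv/bin/activate && echo "environment restored"`. Running it once with LD_LIBRARY_PATH already set to some value is a useful extra test.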
Conclusion
Conda is brutally bad, so please avoid using it wherever possible.