How I Python
Describing my Python dev setup (virtualenvs, etc.) as someone who has had to work with virtualenv/pip and conda side-by-side
tl;dr
- Use conda/mamba as a pyenv replacement to manage Python interpreters
- Create isolated Python interpreters (with a mamba env) as needed (per-project, etc.)
- Continue to use pip and pypi.org like you normally would
- You now have the option to break out into the more powerful conda/mamba dependency management to leapfrog a pip install issue (like a failing Rust PEP 517 compile step or something)
I've never worked as a Python developer. In my personal projects on GitHub, I just copy project structures and build files from what I see in other GitHub projects until I incidentally learn of something better or have a specific need. I first read the Hitchhiker's Guide chapter Structuring Your Project in 2018; up until then I would generally struggle to get productive with Python, messing up import paths, lacking __init__.py files, etc.
However, as a devops person, I had to dive deep into the world of Python packaging tooling when I worked on the NVIDIA RAPIDS data science libraries.
RAPIDS is a collection of Python libraries whose core logic is implemented in C++/CUDA for GPU acceleration, with the Python layer as the high-level API. This is similar to much of the high-performance math/scientific computing landscape in Python, which is backed by C/C++ code: PyTorch, TensorFlow, NumPy, etc.
Rule 1: generally don't use your OS Python for non-OS things
When you install a standard Linux distribution, it generally ships with Python so that it can run default and extra packages (either core parts of the Linux distribution or user applications) written in Python.
In general, using this system interpreter is discouraged for your own personal development tools or developing a specific project. The main risk is that if you fuck up your system's Python by putting it in a bad state (installing a bad package, etc.), you'll be interfering with your operating system's natural operation.
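A quick way to see which interpreter you are actually about to modify is to ask Python itself. This is only a minimal sketch: the function name describe_interpreter is made up for illustration, and the conda check assumes that a conda/mamba environment keeps a conda-meta directory under its prefix.

```python
import sys
from pathlib import Path


def describe_interpreter() -> dict:
    """Report where the running Python lives and what kind of environment it is."""
    prefix = Path(sys.prefix)
    return {
        "executable": sys.executable,
        "prefix": str(prefix),
        # Inside a virtualenv/venv, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # Assumption: conda environments store package records in <prefix>/conda-meta.
        "in_conda_env": (prefix / "conda-meta").is_dir(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the executable is something like /usr/bin/python3 and both flags are False, you are most likely on the OS interpreter and should think twice before installing anything into it.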
My older solution: virtualenvs
The typical solution, and one I used to use, is virtualenvs. I would create a per-project virtualenv in ~/venvs, and I would simply do source ~/venvs/<currentproject>/bin/activate when working on a project.
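The per-project layout described above can be sketched with the standard library's venv module (rather than the third-party virtualenv tool, which behaves similarly for this purpose). The helper name create_project_venv is hypothetical, and with_pip=False is chosen here only to keep the sketch fast and offline.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_project_venv(venvs_root: Path, project: str) -> Path:
    """Create <venvs_root>/<project> and return the path to its activate script."""
    env_dir = venvs_root / project
    # with_pip=False skips bootstrapping pip; use with_pip=True for real work.
    venv.create(env_dir, with_pip=False)
    scripts = "Scripts" if sys.platform == "win32" else "bin"
    return env_dir / scripts / "activate"


if __name__ == "__main__":
    # A throwaway directory stands in for ~/venvs so the sketch leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        activate = create_project_venv(Path(tmp), "currentproject")
        print(f"source {activate}")
```

The printed line is what you would run in your shell to enter the environment.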
Conda and Mamba
While working on RAPIDS, I had to get better at conda. Conda (and the faster frontend, mamba, which I personally use) are "tools for package and environment management" - to me, they are like virtualenvs on steroids, since their most common use is to allow you to install Python packages and their C/C++ dependency libraries isolated from your system.
In a conda environment, you start with a Python version that is installed into the environment:
(system) sevagh@pop-os:~$ mamba create \
--name demo python=3.12
...
Looking for: ['python=3.12']
conda-forge/linux-64 Using cache
conda-forge/noarch Using cache
Transaction
Prefix: /home/sevagh/mambaforge/envs/demo
Updating specs:
- python=3.12
Package Version Build Channel Size
───────────────────────────────────────────────────────────────────────────────────────
Install:
───────────────────────────────────────────────────────────────────────────────────────
...
+ pip 23.3.2 pyhd8ed1ab_0 conda-forge/noarch Cached
+ python 3.12.1 hab00c5b_1_cpython conda-forge/linux-64 32MB
...
conda/mamba function as a way to shield your OS Python from the per-project Python.
However, the conda environment is more powerful than the virtualenv environment. The virtualenv is at the mercy of the C/C++ libraries installed at the OS level. That's why wheels that need their own versions of C/C++ libraries have to include copies:
$ pip install --upgrade numpy
Requirement already satisfied: numpy in /home/sevagh/mambaforge/envs/system/lib/python3.11/site-packages (1.26.2)
Collecting numpy
Downloading numpy-1.26.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.3/18.3 MB 23.2 MB/s eta 0:00:00
Installing collected packages: numpy
Attempting uninstall: numpy
Found existing installation: numpy 1.26.2
Uninstalling numpy-1.26.2:
Successfully uninstalled numpy-1.26.2
Successfully installed numpy-1.26.3
$ fd '.*\.so' /home/sevagh/mambaforge/envs/system/lib/python3.11/site-packages/numpy*
core/_multiarray_tests.cpython-311-x86_64-linux-gnu.so
core/_multiarray_umath.cpython-311-x86_64-linux-gnu.so
core/_operand_flag_tests.cpython-311-x86_64-linux-gnu.so
core/_rational_tests.cpython-311-x86_64-linux-gnu.so
core/_simd.cpython-311-x86_64-linux-gnu.so
core/_struct_ufunc_tests.cpython-311-x86_64-linux-gnu.so
core/_umath_tests.cpython-311-x86_64-linux-gnu.so
fft/_pocketfft_internal.cpython-311-x86_64-linux-gnu.so
linalg/_umath_linalg.cpython-311-x86_64-linux-gnu.so
linalg/lapack_lite.cpython-311-x86_64-linux-gnu.so
random/_bounded_integers.cpython-311-x86_64-linux-gnu.so
random/_common.cpython-311-x86_64-linux-gnu.so
random/_generator.cpython-311-x86_64-linux-gnu.so
random/_mt19937.cpython-311-x86_64-linux-gnu.so
random/_pcg64.cpython-311-x86_64-linux-gnu.so
random/_philox.cpython-311-x86_64-linux-gnu.so
random/_sfc64.cpython-311-x86_64-linux-gnu.so
random/bit_generator.cpython-311-x86_64-linux-gnu.so
random/mtrand.cpython-311-x86_64-linux-gnu.so
libs/libgfortran-040039e1.so.5.0.0
libs/libopenblas64_p-r0-0cf96a72.3.23.dev.so
libs/libquadmath-96973f99.so.0.0.0
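If you don't have fd installed, the same search can be approximated in Python. This is a rough sketch, not a replacement for fd: find_shared_libs is a name invented here, and it simply treats any file ending in .so or containing .so. in its name as a shared object.

```python
import sysconfig
from pathlib import Path


def find_shared_libs(package_dir: Path) -> list[str]:
    """Return relative paths of files that look like shared objects (.so or .so.N)."""
    hits = []
    for path in package_dir.rglob("*"):
        name = path.name
        if path.is_file() and (name.endswith(".so") or ".so." in name):
            hits.append(path.relative_to(package_dir).as_posix())
    return sorted(hits)


if __name__ == "__main__":
    # Scan every numpy* directory in site-packages, like the fd call above.
    site_packages = Path(sysconfig.get_paths()["platlib"])
    for pkg_dir in sorted(site_packages.glob("numpy*")):
        if pkg_dir.is_dir():
            for lib in find_shared_libs(pkg_dir):
                print(f"{pkg_dir.name}/{lib}")
```

On a pip-installed numpy wheel, the bundled OpenBLAS and gfortran copies should show up in the listing alongside the extension modules.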
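Meanwhile, if I do mamba install numpy, the same kinds of libraries (OpenBLAS, libgfortran) arrive as separate conda packages instead of copies buried inside numpy: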
Package Version Build Channel Size
───────────────────────────────────────────────────────────────────────────────────────
Install:
───────────────────────────────────────────────────────────────────────────────────────
+ libblas 3.9.0 20_linux64_openblas conda-forge/linux-64 Cached
+ libcblas 3.9.0 20_linux64_openblas conda-forge/linux-64 Cached
+ libgfortran-ng 13.2.0 h69a702a_0 conda-forge/linux-64 23kB
+ libgfortran5 13.2.0 ha4646dd_0 conda-forge/linux-64 1MB
+ liblapack 3.9.0 20_linux64_openblas conda-forge/linux-64 Cached
+ libopenblas 0.3.25 pthreads_h413a1c8_0 conda-forge/linux-64 Cached
+ libstdcxx-ng 13.2.0 h7e041cc_3 conda-forge/linux-64 Cached
+ numpy 1.26.3 py311h64a7726_0 conda-forge/linux-64 8MB
+ python_abi 3.11 4_cp311 conda-forge/linux-64 Cached
Upgrade:
───────────────────────────────────────────────────────────────────────────────────────
- ca-certificates 2022.12.7 ha878542_0 conda-forge
+ ca-certificates 2023.11.17 hbcca054_0 conda-forge/linux-64 Cached
- openssl 3.1.0 h0b41bf4_0 conda-forge
+ openssl 3.2.0 hd590300_1 conda-forge/linux-64 Cached
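Because those libraries are ordinary conda packages, they can be inspected after the fact. The sketch below reads the JSON records that conda keeps under the environment's conda-meta directory; conda_packages is a hypothetical helper, and it assumes each record carries "name" and "version" keys, so treat it as illustrative rather than a supported API.

```python
import json
import sys
from pathlib import Path


def conda_packages(env_prefix: Path) -> dict[str, str]:
    """Map package name -> version for every record in <env_prefix>/conda-meta."""
    packages = {}
    for record in sorted((env_prefix / "conda-meta").glob("*.json")):
        meta = json.loads(record.read_text())
        # Assumption: each record has "name" and "version"; skip anything else.
        if "name" in meta and "version" in meta:
            packages[meta["name"]] = meta["version"]
    return packages


if __name__ == "__main__":
    # sys.prefix is the environment root when run from a conda/mamba env.
    for name, version in conda_packages(Path(sys.prefix)).items():
        print(name, version)
```

Run from an env where numpy came from mamba, this should list libopenblas and friends as first-class entries, which is exactly what a pip wheel cannot do.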
Managing independent Python interpreters with conda/mamba
Instead of the popular pyenv tool, I prefer to use conda.
Remember: mamba is a faster drop-in replacement for conda
When I first install my OS (Pop!_OS 22.04 NVIDIA driver edition is my daily driver), I install mamba and a few settings from my dotfiles:
$ cat bash/.bashrc
...
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/sevagh/mambaforge/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/home/sevagh/mambaforge/etc/profile.d/conda.sh" ]; then
        . "/home/sevagh/mambaforge/etc/profile.d/conda.sh"
    else
        export PATH="/home/sevagh/mambaforge/bin:$PATH"
    fi
fi
unset __conda_setup
if [ -f "/home/sevagh/mambaforge/etc/profile.d/mamba.sh" ]; then
    . "/home/sevagh/mambaforge/etc/profile.d/mamba.sh"
fi
# <<< conda initialize <<<
mamba activate system
My condarc file:
channels:
- conda-forge
#auto_activate_base: false
#changeps1: false
I create an env named system and have that as the default activated env in my bashrc file. This is for my personal dev tools: things like yt-dlp, pympress, grip. Mamba by default has a base environment that I don't care for, which is why I use my own system environment as a catchall/"daily driver" Python. When I open a new terminal instance, I always have (system) printed by mamba so I'm aware that:
- mamba is active and working
- The current version of Python, pip, and everything else Python-related is pointing to the mamba copies:
(system) sevagh@pop-os:~$ which python
/home/sevagh/mambaforge/envs/system/bin/python
(system) sevagh@pop-os:~$ which pip
/home/sevagh/mambaforge/envs/system/bin/pip
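The same sanity check can be scripted. This is a minimal sketch, assuming the active environment's root is exported as CONDA_PREFIX (which conda/mamba activation sets); tools_outside_env is a made-up name, and it only looks at the first match on PATH, just like which does.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def tools_outside_env(env_prefix: Path, tools=("python", "pip"),
                      path: Optional[str] = None) -> list[str]:
    """Return the tools whose first match on PATH is not inside env_prefix."""
    outside = []
    prefix = env_prefix.absolute()
    for tool in tools:
        found = shutil.which(tool, path=path)
        # Symlinks are deliberately not resolved: only the PATH entry matters.
        if found is None or prefix not in Path(found).absolute().parents:
            outside.append(tool)
    return outside


if __name__ == "__main__":
    active = os.environ.get("CONDA_PREFIX")
    if active is None:
        print("no conda/mamba environment is active")
    else:
        print("tools resolving outside the env:", tools_outside_env(Path(active)))
```

An empty list means python and pip both point at the mamba copies.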
Working on projects
For projects, I create a new conda/mamba env, but I don't actually use conda or mamba for dependencies. These days I prefer requirements.txt files and pyproject.toml. The best part is that I don't need a virtualenv, since I'm already in the isolated Python interpreter created for that project.
This lets me start off with the slightly easier pip tools while keeping the option of breaking out into the more complicated (but more powerful) conda/mamba package distribution.
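Put together, the per-project workflow is two commands: create the env, then pip install into it. The sketch below only builds and prints those commands instead of running them. The function name, the default Python version, and the use of the run subcommand (to invoke the env's pip without activating it) are illustrative assumptions, not something prescribed above.

```python
import shlex


def project_setup_commands(env_name, python_version="3.12",
                           requirements="requirements.txt", tool="mamba"):
    """Build (but do not run) the commands for the env-per-project workflow."""
    return [
        # 1. An isolated interpreter for the project, managed by conda/mamba.
        [tool, "create", "--yes", "--name", env_name, f"python={python_version}"],
        # 2. Dependencies still come from PyPI, installed by the env's own pip.
        [tool, "run", "--name", env_name,
         "python", "-m", "pip", "install", "-r", requirements],
    ]


if __name__ == "__main__":
    for command in project_setup_commands("demo"):
        print(shlex.join(command))
```

If a pip install later fails on a compiled dependency, the escape hatch is to install that one package with mamba into the same env and carry on with pip for the rest.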