Ah, more complex than I thought: "venvstacks allows you to package Python applications and all their dependencies into a portable, deterministic format, without needing to include copies of these large Python frameworks in every application archive.", and in "Caveats and Limitations" (please, all projects should have one): "does NOT support combining arbitrary virtual environments with each other".
Is there a helper to merge venv1 and venv2, or to create a venv2 that uses venv1's dependencies, so that on load both are merged?
The hard part is figuring out what "merge" means for your use case. If there's a definite set of packages that should be in the environment, all already at definite locations on the local drive, there are many possible approaches (copying, `.pth` files, hard links, and symlinks should also work) to stitching together the venv you want. But you can't just feed the individual package paths to Python, because `sys.path` entries are the places Python will look in for top-level package folders (and top-level module `.py` files), not the paths to individual importable things.
More importantly, at runtime you can only have one version of a given package, because the imports are resolved at runtime. Pip won't put multiple versions of the same library into the same environment normally; you can possibly force it to (or more likely, explicitly do it yourself) but then everyone that wants that library will find whichever version gets `import`ed and cached first, which will generally be whichever one is found first on `sys.path` when the first `import` statement is reached at runtime. (Yes, the problem is the same if you only have one venv in the first place, in the sense that your dependency graph could be unsolvable. But naively merging venvs could mean not noticing the problem until, at runtime, something tries to import something else, gets an incompatible version, and fails.)
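For instance, the `.pth` route is only a few lines. A minimal sketch, assuming venv1 lives at a known location and you run this with venv2's interpreter (the paths are hypothetical), and it inherits every caveat above:

```python
# Sketch: make venv2 also see venv1's packages by dropping a .pth file into
# venv2's site-packages. Each line of a .pth file that names a directory is
# appended to sys.path at interpreter startup. Paths below are hypothetical.
import sysconfig
from pathlib import Path

venv1_site = Path("/opt/venvs/venv1/lib/python3.12/site-packages")  # assumed layout
venv2_site = Path(sysconfig.get_paths()["purelib"])  # site-packages of the venv running this script

(venv2_site / "venv1_overlay.pth").write_text(f"{venv1_site}\n")

# Caveat in action: if both site-packages directories contain somepkg, whichever
# one appears first on sys.path supplies the copy that gets imported and cached
# in sys.modules; the other copy is never loaded.
```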
To create a venv2 that uses venv1, define PEP 735 dependency groups in your pyproject.toml. Specifically, a group can include other groups.
uv supports groups and can create a venv with the desired group set: https://docs.astral.sh/uv/concepts/dependencies/#development...
For example, there can be a "dev" group that includes the "test", "mkdocs", and "nuitka" groups (Nuitka wants to be run with the venv it builds a binary for, so it lives in its own group to keep that venv minimal).
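Roughly what that looks like in pyproject.toml under PEP 735 (the group contents here are just placeholders):

```toml
[dependency-groups]
test = ["pytest"]
mkdocs = ["mkdocs"]
nuitka = ["nuitka"]
# "dev" pulls in the other groups by reference rather than repeating them
dev = [
    { include-group = "test" },
    { include-group = "mkdocs" },
    { include-group = "nuitka" },
]
```

With a reasonably recent uv, something like `uv sync --group nuitka` should then produce a venv with just that group layered on top of the project's own dependencies.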
My understanding is that the entire reason venv exists is that Python's library system is nothing but dependency spaghetti: whatever is needed by one project conflicts with whatever is needed by another, so you have to give each project a bespoke library environment where those conflicts won't interfere with one another.
From that perspective, "merging" them directly defeats the purpose. What is needed is a better library ecosystem.
venvs are used to isolate groups of dependencies, but it's not just about conflicts. Many other languages expect you to do the same thing; people just complain less, for some combination of reasons: the language statically resolves imports and can support multiple versions of a library in the same environment; the ecosystem has conventions that let tooling detect the "current environment" more reliably; the standard installer isn't itself a library that defaults to appearing in every environment and installing specifically into "its own" environment; and (possibly the biggest one) they don't have to worry about building and installing complex multi-language projects where the one they're using is just providing a binding.
An important reason for using them is to test deployment: if your code works in a venv that only has specific things installed (not just some default "sandbox" where you install everything), then you can be sure that you didn't forget to list a dependency in the metadata. Plus you can test with ranges of versions of your dependencies and confirm which ones your library will work with.
They're also convenient for making neat, isolated installations for applications. Pipx wraps pip and venv to do this, and as I understand it there's similarly uvx for uv. This is largely about making sure you avoid "interference", but also about pinning specific versions of dependencies. It also lowers the bar somewhat for your users: they still need to have a basic idea of what Python is and know that they have a suitable version of Python installed, but you can tell them to install Pipx if they don't have it and run a single install command, instead of having to wrestle with both pip and venv and then also know how to access the venv's console scripts that might not have been put on PATH.
It's crazy to me that in 2025 we still haven't figured out Python dependency management.
People want to be able to have a free-wheeling NodeJS-like ecosystem (in a language that shares the whole "`import`s are tracked down at runtime rather than being statically linked" thing, which in turn means you can only practically have one version of the same dependency in a given environment) while also having pieces that are, well, as complex as SciPy, Tensorflow etc. are. And they want to integrate all the missing functionality without anyone ever getting their workflow broken, because everyone still remembers the 2->3 transition. They want this even though the traditional `setup.py`-only way of doing things was abhorrently designed (i.e. you have to run arbitrary code at install time to get any metadata at all - including metadata about build dependencies, the absence of which might cause that code to fail).
Of course we haven't figured it out yet.
I love the fact that almost every single answer to your comment is a completely different take.
I didn't realize how bad it was until I recently looked into writing a Python Qt application. There are two competing Python bindings for Qt: PyQt and PySide. The latter appears to be more freely licensed and more active, so that seemed the way to go.
They have a compatibility matrix. It's mostly red. https://wiki.qt.io/Qt_for_Python#Python_compatibility_matrix
How do you even make your stuff dependent on/broken with specific Python versions? I mean how in hell?
The fact that venv is so widely used in Python was always an indication that not all was well dependency-wise, but it doesn't seem like it's optional anymore.
My guess would be that because it's an integration with a sprawling native library, there is a lot of code which touches Python's own C API. That is not entirely stable, and it has a lot more source stability than binary stability:
https://docs.python.org/3/c-api/stable.html
As time passes, it makes sense to support only recent versions of both Qt and Python, hence that matrix.
All I see is that newer library versions drop support for old Python versions (which seems natural, unless you have unlimited resources). For example, if you need Python 2 support, use a correspondingly old version.
The problem is that people are asking for more and more out of these tools. The tools we had last year are great at solving all the problems we had 20 years ago. But as long as we keep coming up with new problems, we're going to keep needing new tools.
We figured, but people can't stand still and just Do Shit (TM).
Poetry works fine and solves dependencies and peer dependencies. But I guess the JavaScript-style churn of 'every week a new tool that's better than the previous ones' has arrived in Python land too.
Poetry in theory works fine, but I found it a pain to use on Windows. As in flaky. A project works, then a dependency can't be found, so I reboot and then it can be. The tools to manage and diagnose why it is not installing to the right env are non-existent. I am probably not holding it right... but should I really need to go down a rabbit hole just to install dependencies? It is a non-issue in NPM, .NET's NuGet, Haskell Stack, Ruby, etc.
Debian had it figured out, then they had to go and ruin it all by hopping on the venv bandwagon.
Dependency management and imports (without a proper namespace solution) make me angry beyond reason. I love and hate Python.
Don't worry, there's still a couple of months left until then.
In fairness, this is a response to a need which became common only relatively recently
Contrary to popular belief, Python itself doesn't have a dependency problem.
The native-ABI PyObject / CUDA .so/.dll stuff had way too many serious problems.
Other languages have the same problem too; think of something like cgo or JNI.
What are you talking about? Everyone just uses OmegaStar.
What do you mean? Just venv your venv.
It would be nice if Python could actually support this.
But 'venv detection' is conventionally (everywhere) done by a stupid, never-designed mechanism involving inspecting argv0. At the same time, most Python packaging tools make various assumptions like "if I'm in a venv, I can write into it", or "there will only ever be one level of venvery, so I'm not in a venv iff I'm looking at the system Python".
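For reference, the runtime checks most tools fall back on look roughly like this; note that neither of them can express "a venv layered on another venv":

```python
# The de facto "am I in a venv?" checks. sys.base_prefix points at the Python
# installation the venv was created from; VIRTUAL_ENV is only set when an
# activate script was sourced. Neither says anything about nesting.
import os
import sys

in_venv = sys.prefix != sys.base_prefix
activated = "VIRTUAL_ENV" in os.environ
print(in_venv, activated)
```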
Nesting/layering venvs has never been supported by any Python dependency management tool except for experiments that were rejected by more senior people in the community, the maintainers of major dependency management tools, etc.
Same kind of thing goes for allowing multiple versions of a dependency in the dependency tree at runtime.
There's little appetite for fixing Python's dependency management issues in a principled or general way, because that would require breaking things or adding limitations. You can't have dependencies partially managed by arbitrary code that runs only at install time and also build good packaging tools, for instance.
Python venv tools are the new JS frameworks: every day some other ridiculous tool is born.
And yes, everything is already solved with: due diligence (non-existent in scientific community) and nix.
nix mentioned let's go
In all seriousness, it is a bit tiresome when you come back to the python world and see a completely fragmented ecosystem (poetry, pdm, pip-tools, uv, and the older ways).
These tools are attempting to solve different problems (really, different, overlapping sets of problems) - because there isn't even really agreement on what the problems are.
My personal view is that "one standard tool" makes sense for users (i.e. people who will install applications, install libraries in order to create their own personal projects, etc.) but not for developers (i.e. people who will publish their code, or need to care about rigorous testing, published documentation etc.). The latter require the users' install tool, plus specific-scoped tools according to their personal preferences and their personal conception of what the problems are in "development" that need solving. That is, they should be able to build their own toolchain, although they might start with a standard recommendation.
Pip's scope has already crept in the sense that it exposes a `wheel` subcommand. It needs to be able to build wheels in order to install from sdists, but exposing this functionality means providing... half of a proper build frontend. (PyPA already offers a complete, simple, proper build frontend, called `build`; if `pip wheel` is part of your deployment process, you should probably be using `build` instead.)
Can you clarify your second sentence? I'm having a hard time understanding the point you're making.
https://news.ycombinator.com/item?id=42032174
Nix is still a nuisance to use, especially if you do not know the Nix language (but even if you do).
Couple more resources:
Github Repo: https://github.com/lmstudio-ai/venvstacks
Blog post: https://lmstudio.ai/blog/venvstacks
Docs: https://venvstacks.lmstudio.ai/
Changed to the blog post from https://pypi.org/project/venvstacks/, since it gives more background. Thanks!
So, Docker container image, without OverlayFS benefits?
Python’s Cambrian explosion of dependency management continues. Maybe one day via some strange machine form of genetic algorithms a canonical system will emerge.
uv is clearly becoming the norm
I was expecting this to be containers under the hood, instead it seems the idea is basically to bundle your virtualenv and ship it to prod? I think I'll stick with Docker...
And with just 3 layers: Runtime, Framework, Application. But at least you are not switching tools, and it presumably prevents the situation where you install LARGE_v1.1, then install TINY_v2.2 in a later layer, which quietly upgrades LARGE to v1.2 and your Docker images are now twice the size.
Does this support non-python dependencies too e.g. gstreamer? It wasn't clear to me.
I'm very wary of a new tool announcement that doesn't appear to mention why the existing tools/solutions were not sufficient. Which gap does this fill?
Edit: the answer to my second question is at https://venvstacks.lmstudio.ai/design/
Why would this be an advantage over Astral's uv?
None. FWIW, this exists because they wanted to ship a whole Python environment with an Electron app.
I watch my CI pipeline spend a minute or two creating the virtualenv and pip install into it several times a day, despite the fact that nothing has changed (no change to requirements.txt, etc), and I wish there was a well-supported way to package up a venv into some kind of artifact (keyed by Python version, host platform and package versions) so we only had to rebuild it if something changed. This sounds a bit like that, but also sounds like something more complicated than what I was looking for.
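If the CI system can cache arbitrary artifacts, computing that key is the easy part. A sketch (exactly which files get folded into the hash is an assumption; adjust to taste):

```python
# Sketch of a cache key for a prebuilt venv: hash everything that should force
# a rebuild -- the exact interpreter, the platform, and the pinned requirements.
import hashlib
import platform
import sys
from pathlib import Path

def venv_cache_key(requirements: str = "requirements.txt") -> str:
    h = hashlib.sha256()
    h.update(sys.version.encode())             # exact Python version/build
    h.update(platform.system().encode())       # OS
    h.update(platform.machine().encode())      # CPU architecture
    h.update(Path(requirements).read_bytes())  # pinned package versions
    return h.hexdigest()[:16]

if __name__ == "__main__":
    print(venv_cache_key())  # use as the artifact name; rebuild the venv only on a cache miss
```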
Gitlab CI can do that. https://docs.gitlab.com/ee/ci/caching/#compute-the-cache-key...
Is it re-solving for dependencies every time? Have you checked where the bottleneck is? (Don't CI pipeline tools hold onto Pip's download cache? Maybe this is part of why Setuptools gets so many downloads...)
I thought setup-python handled that already, no? https://github.com/actions/setup-python?tab=readme-ov-file#c...
There's also the pip-compatible mode of uv which is much, much faster than pip.
Put it in a container?
But can it do GDAL.
(This is the Doom equivalent joke for Python environments!)
Here is one piece of knowledge earned the hard way. Do not use the official python GDAL bindings (`pip install GDAL`) unless you really must. Go with rasterio and don't look back.
"Go with rasterio and don't look back"
Excellent advice. Add fiona and shapely for manipulating vector data, and pyproj for projections and coordinate systems. Yes there are corner cases where installing 'real' GDAL is still needed, but for most common use cases you can avoid it.
Yes, I have generally tried to do that. But sometimes there are third party libraries that depend on GDAL and then there's no way around it.
What's wrong with Pyenv & Poetry?
Nothing per se. But there are a few different workflows and project types they either don't support or make very difficult.
If pyenv and poetry solve all your problems then it's a perfectly fine setup.
I must be missing something because I understood that pip cached packages already.
Perhaps it still creates a copy of the package files in the virtual environment, thus the cache only saves repeated downloads and not local disk space. If that's the case then this does look really useful.
Pip caches downloads, but in the sort of filesystem database that git uses. You can't even directly pull .whl files (nor unzipped folders) out of it. Every virtual environment gets its own, unzipped copy of the wheels represented in that database. So yes, it's only "saving repeated downloads". (And it will still talk to PyPI by default to see if there's a newer version. There's an outstanding issue to add a proper "offline mode": https://github.com/pypa/pip/issues/8057)
It doesn't need to work like that. A lot of libraries would work fine directly from the wheel, because they're essentially renamed zip files and Python knows how to import from the archive contents directly. But this doesn't work if the package is supposed to come with any mutable data, nor if it tries to use ordinary file I/O for its immutable data (you're supposed to use a standard library helper for that, but awareness is poor and it's a hassle anyway). Long ago, the "egg" format expected you to set a "zip-safe" flag (via Setuptools, back when it was your actual packaging tool rather than just a behind-the-scenes helper wrapped in multiple backwards-compatibility layers) so that installers could choose to leave the archive zipped. But I don't know that Pip ever actually used that information, and it was easy to get wrong.
But more importantly, the contents of the virtual environment could be referenced from a cache that contained actual wheels (packed or unpacked) by hard links, `.pth` files (with a slight performance hit) or symlinks (I'm pretty sure; haven't tested).
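For the curious, the "Python knows how to import from the archive contents directly" part is plain zipimport, and the standard-library helper for bundled data is `importlib.resources`. A sketch with a hypothetical wheel path, which only works for pure-Python wheels whose code avoids raw file I/O on its own files:

```python
# A wheel is a zip archive, and zipimport already lets sys.path entries point
# at archives -- so a simple pure-Python wheel can be imported without ever
# being unpacked onto disk.
import sys

sys.path.insert(0, "wheel-cache/six-1.16.0-py2.py3-none-any.whl")  # hypothetical path
import six           # loaded from inside the archive
print(six.__file__)  # the reported path points inside the .whl
```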
I'm using Poetry because of the default lock-file usage (there's no global mutable state; all changes require updating the lock file). I wish this functionality could be incorporated into Poetry directly.
For a variety of reasons this season is a good time to look at `uv`.
See this by the author of Rye:
https://lucumr.pocoo.org/2024/8/21/harvest-season/
"Unified" packaging:
https://astral.sh/blog/uv-unified-python-packaging
Since you mentioned uv and the topic is virtual environments...
I am using uv and it seems great.
I don't understand the difference between using "python -m venv .venv; source .venv/bin/activate" and creating a venv with uv and then running the same source command. What does uv offer/integrate that's not already present in Python's venv module?
It replaces pip and uses the venv module under the hood, if I understand correctly.
Got it. I'm just unsure whether I can use the other uv features, like pin, with it? That feature feels like it was added without the benefits or trade-offs being documented. At least I don't see them.
Poetry works so well for me at the moment, I prefer to let uv cook
I just took a look at the uv feature list, and the feature I want most is Python version management. I'm using micromamba to install all the Python versions I'm interested in (3.8 - 3.12) and then tell Poetry about the Python location.
But like you said, Poetry is working so well that I'll wait a little longer before jumping ship.
Rye wraps uv and adds python version management, among other things.
How do venvstacks differ from PyInstaller and other bundlers?
Why not just use PDM?
venv is admitting that there no longer is a Python language, only multitudes of pythons, one for each application you want to run.