Hey, creator here. Thanks for sharing this!
Uncloud[0] is a container orchestrator without a control plane. Think multi-machine Docker Compose with automatic WireGuard mesh, service discovery, and HTTPS via Caddy. Each machine just keeps a p2p-synced copy of cluster state (using Fly.io's Corrosion), so there's no quorum to maintain.
I’m building Uncloud after years of managing Kubernetes in small envs and at a unicorn. I keep seeing teams reach for K8s when they really just need to run a bunch of containers across a few machines with decent networking, rollouts, and HTTPS. The operational overhead of k8s is brutal for what they actually need.
A few things that make it unique:
- uses the familiar Docker Compose spec, no new DSL to learn
- builds and pushes your Docker images directly to your machines without an external registry (via my other project unregistry [1]; quick sketch below)
- imperative CLI (like Docker) rather than declarative reconciliation. Easier mental model and debugging
- works across cloud VMs, bare metal, even a Raspberry Pi at home behind NAT (all connected together)
- minimal resource footprint (<150 MB RAM)
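Re the registry-less push: unregistry lets you push an image straight to a remote Docker engine over SSH, something like this (check the unregistry README for the exact usage):

    docker pussh myapp:latest user@remote-host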
[0]: https://github.com/psviderski/uncloud
[1]: https://github.com/psviderski/unregistry
This is a cool tool, I like the idea. But the way `uc machine init` works under the hood is really scary. Lots of `curl | bash` run as root.
While I would love to test this tool, this is not something I would run on any machine :/
+1 on this
I wanted to try it out but was put off by this[0]. It’s just straight up curl | bash as root from raw.githubusercontent.com.
If this is the install process for a server (and not just for the CLI) I don’t want to think about security in general for the product.
Sorry, I really wanted to like this, but pass.
[0] https://github.com/psviderski/uncloud/blob/ebd4622592bcecedb...
"I keep seeing teams reach for K8s when they really just need to run a bunch of containers across a few machines"
Since k8s is very effective at running a bunch of containers across a few machines, it would appear to be exactly the correct thing to reach for. At this point, running a small k8s operation, with k3s or similar, has become so easy that I can't find a rational reason to look elsewhere for container "orchestration".
I can only speak for myself, but I considered a few options, including "simple k8s" like [Skate](https://skateco.github.io/), and ultimately decided to build on uncloud.
It was as much personal "taste" as anything, and I would describe the choice as similar to preferring JSON over XML.
For whatever reason, kubernetes just irritates me. I find it unpleasant to use. And I don't think I'm unique in that regard.
100%. I’m really not sure why K8S has become the complexity boogeyman. I’ve seen CDK apps or docker compose files that are way more difficult to understand than the equivalent K8S manifests.
Managing hundreds or thousands of containers across hundreds or thousands of k8s nodes has a lot of operational challenges.
Especially in-house on bare metal.
But that's not what anyone is arguing here, nor what uncloud is about (it seems to me, at least). It's about a simpler HA multi-node setup with a single-digit or low-double-digit number of containers.
Talos has made this super easy in my experience.
I don't think that argument matches with "they just need to run a bunch of containers across a few machines".
That’s awesome if k3s works for you, nothing wrong with this. You’re simply not the target user then.
If you already know k8s, this is probably true. If you don't it's hard to know what bits you need, and need to learn about, to get something simple set up.
Indeed, it seems like a knee-jerk response without justification. k3s is pretty damn minimal.
Very cool! I think I'll have some opportunity soon to give it a shot; I have just the set of projects that have been needing a tool like this. One thing I think I'm missing after perusing the docs, however, is: how does one onboard other engineers to the cluster after it has been set up? And similarly, how does deployment from a CI/CD runner work? I don't see anything about how to connect to an existing cluster from a new machine, or at least nothing that I'm recognizing.
There isn't a CLI command for adding a connection (independently of adding a new machine/node) yet, but connections live in a simple config file (`~/.config/uncloud/config.yaml`) that you can copy or create manually for now. It looks like this:
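(Sketch from memory, so the exact keys may differ; compare against the file `uc machine init` generates on your own machine.)

    clusters:
      default:
        connections:
          - ssh://ubuntu@203.0.113.10   # placeholder hosts; tried in order
          - ssh://ubuntu@203.0.113.11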
And you really just need one entry for typical use. The subsequent entries are only used if the previous node(s) are down. For CI/CD, check out this GitHub Action: https://github.com/thatskyapplication/uncloud-action.
You can either specify one of the machines' SSH targets in the config.yaml or pass one directly to the `uc` CLI, e.g.

    uc --connect user@host deploy
How's this similar to and different from Kamal? https://kamal-deploy.org/
I took some inspiration from Kamal, e.g. the imperative model, but Kamal is more of a deployment tool.
In addition to deployments, Uncloud handles clustering: it connects machines and containers together. Service containers can discover other services via internal DNS and communicate directly over the secure overlay network without opening any ports on the hosts.
As far as I know, Kamal doesn't provide an easy way for services to communicate across machines.
Services can also be scaled to multiple replicas across machines.
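As a quick sketch (names and images made up), two services that may land on different machines can still talk by service name:

    services:
      web:
        image: ghcr.io/example/web
        environment:
          # internal DNS resolves "api" to its container IPs on the overlay
          API_URL: http://api:8080
      api:
        image: ghcr.io/example/api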
Thanks! I noticed afterwards that you mention Kamal in your readme, but you may want to add a comparison section you can link to, where you compare your solution to others.
Are you working on this full time and if so, how are you funding it? Are you looking to monetize this somehow?
Thank you for the suggestion!
I’m working full time on this, yes. I'm funding it from my savings at the moment and don't have plans for any external funding or VC.
For monetisation, I'm considering building a self-hosted and managed (SaaS) web UI for managing remote clusters and the apps on them, with value-added PaaS-like features.
That sounds interesting, maybe I could help on the business side of things somehow. I'll email you my calendar link.
Awesome, will reach out!
This is neat. Regarding clustering: can this work with distributed Erlang/Elixir?
Thanks for both great tools! Just one thing I didn't understand: the request flow. Imagine we have 10 servers, where one request goes to server 1 and another goes to server 7, for example. Since it's zero downtime, how does it know that server 5 is updating, so that no request goes there until it's back up?
I think there are two different cases here. Not sure which one you’re talking about.
1. External requests, e.g. from the internet via the reverse proxy (Caddy) running in the cluster.
The rollout works at the container level, not the server level. Each container registers itself in Caddy, so Caddy knows which containers to forward and distribute requests to.
When doing a rollout, a new version of the container is started first and registers in Caddy, then the old one is removed. This is repeated for each service container. This way, at any time there are running containers serving requests.
Nothing tells a server that requests shouldn't go there. Uncloud just updates the upstreams in the Caddy config to send requests only to the containers that are up and healthy.
2. Service-to-service requests within the cluster. In this case, a service DNS name is resolved to a list of IP addresses (running containers), and the client decides which one to send a request to, or whether to distribute requests among them.
When the service is updated, the client needs to resolve the name again to get the up-to-date list of IPs. Many HTTP clients handle this automatically, so using http://service-name as an endpoint typically just works. But zero downtime still has to be handled by the client in this case.
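For example, from inside any container in the cluster (service name and port made up):

    # resolves via the internal DNS to the currently healthy "api" containers;
    # a fresh lookup after a rollout returns the new container IPs
    curl http://api:8080/health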
Awesome tool! Does it provide some of the basic features that you would get from running a control plane?
Like automatically rescheduling a container on another server if a server goes down? Or deploying to the least filled server first if you have set resource limits on your containers?
Thank you! That's actually the trade-off.
There is no automatic rescheduling in Uncloud, by design, at least for now. We will see how far we can get without it.
If you want your service to tolerate a host going down, you should deploy multiple replicas of that service on multiple machines in advance. The `uc scale` command can be used to run more replicas of an already deployed service.
Longer term, I'm thinking we can have a concept of primary/standby replicas for services that can only have one running replica, e.g. databases. Something similar to how Fly.io does this: https://fly.io/docs/apps/app-availability/#standby-machines-...
Deploying to the least filled machine first is doable but not supported right now. By default, Uncloud picks the first machine randomly and tries to distribute replicas evenly among all available machines. You can also manually specify which target machine(s) each service should run on in your Compose file, as sketched below.
I want to avoid recreating the complexity of placement constraints, (anti-)affinity, etc. that makes K8s hard to reason about. There is a huge class of apps that need more or less static infra, manual placement, and a certain level of redundancy. That's what I'm targeting with Uncloud.
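A placement sketch in Compose (the extension key shown is illustrative; check the Compose reference in the repo for the exact name):

    services:
      db:
        image: postgres:16
        x-machines:   # pin this service to a specific machine
          - machine-1
      web:
        image: ghcr.io/example/web   # no pin: replicas spread across machines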
You have a diagram that shows a multi-provider setup for a domain. Where would routing to either machine happen? As in, which IP would you use on the DNS side?
Not OP, but you could do "simple" DNS load balancing between both endpoints.
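E.g., publish one A record per machine and let resolvers rotate between them (placeholder IPs):

    app.example.com.  300  IN  A  203.0.113.10   ; machine at provider A
    app.example.com.  300  IN  A  198.51.100.20  ; machine at provider B

Keep the TTL low so a dead machine drops out of rotation reasonably quickly.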
Neat. As you include quite a few tools for services to be reachable together (not necessarily to the outside), do you also have tooling to make those services more interoperable?
Do you have an example of what you mean? I'm not entirely clear on your question.
So it's a kind of better Docker Swarm? It's interesting, but honestly I'd rather have something declarative so I can use it with Pulumi. Would it be complicated to add a declarative engine on top of the tool? One that discovers which services are already up, diffs them against the new declaration, and handles the changes?
Does it support IPv6?
There is an open issue confirming that enabling IPv6 for containers works: https://github.com/psviderski/uncloud/issues/126 But it isn't enabled by default.
What specifically do you mean by IPv6 support?
> What specifically do you mean by IPv6 support?
This question does not make sense. It is equivalent to asking "What specifically do you mean by IPv4 support?"
These days both protocols must be supported, and if there is a blocker it should be clearly mentioned.
How do you want to allocate IPv6 addresses to containers? It turns out there are lots of answers. Some people even want to do IPv6 NAT.
haha, uncloud does have a control plane: the mind of the person running "uc" CLI commands
> I’m building Uncloud after years of managing Kubernetes
did you manage Kubernetes, or did you make the fateful mistake of managing microk8s?
Having spent most of my career in kubernetes (usually managed by cloud), I always wonder when I see things like this, what is the use case or benefit of not having a control plane?
To me, the control plane is the primary feature of kubernetes and one I would not want to go without.
I know this describes operational overhead as a reason, but how it relates to the control plane is not clear to me. Even managing a few hundred nodes and maybe 10,000 containers (relatively small), I update once a year and the managed cluster updates machine images and versions automatically. Are people trying to self-host Kubernetes for production cases, and is that where this pain comes from?
Sorry if it is a rude question.
Not rude at all. The benefit is a much simpler model where you simply connect machines in a network where every machine is equal. You can add more, remove some. No need to worry about an HA 3-node centralised “cluster brain”. There isn’t one.
It's a similar experience to when a cloud provider manages the control plane for you. But you have to worry about its availability when you host everything yourself. Losing etcd quorum results in an unusable cluster.
Many people want to avoid this, especially when running at a smaller scale like a handful of machines.
The cluster network can even partition, and each partition continues to operate, allowing you to deploy/update apps in each partition individually.
That's essentially what we all did in the pre-k8s era with Chef and Ansible, but without the boilerplate and reinventing the wheel, and using the learnings from k8s and friends.
If you are a small operation, trying to self-host k3s or k8s or any number of out-of-the-box installations (which are probably at least as complex as Docker Compose swarms) presents, for any non-trivial production case, similar monitoring and availability problems to the ones you'd get with off-the-shelf cloud provider managed services, except the managed solutions come without the pain in the ass. Except you don't have a control plane.
I have managed custom server clusters in a self-hosted situation. The problems are hard, but if you're small, why would you reach for such a solution in the first place? You'd be better off paying for a managed service. What situation forces so many people to reach for self-hosted Kubernetes?
k3s uses SQLite, so no etcd.
It can use SQLite (single server); for a cluster it can use Postgres or MySQL as an external datastore, or embedded etcd.
> a few hundred nodes and maybe 10,000 containers, relatively small
That feels not small to me. For something I'm working on I'll probably have two nodes and around 10 containers. If it works out and I get some growth, maybe that will go up to, say, 5-7 nodes and 30 or so containers? I dunno. I'd like some orchestration there, but k8s feels way too heavy even for my "grown" case.
I feel like there are potentially a lot of small businesses at this sort of scale?
Kubernetes is not only an orchestrator but a scheduler.
It's a way to run arbitrary processes on a bunch of servers.
But what if your processes are known beforehand? Then you don't need a scheduler, nor an orchestrator.
What if it's just your web app with two containers and nothing more?
> Are people trying to self host kubernetes
Of course they are…? That's half the point of k8s: if you want to self-host, you can. But it's just like backups: if you never try it, you should assume you can't do it when you need to.
Try it on bare metal where you're managing the distributed storage and the hardware and the network and the upgrades too :)
Why would you want to do that though?
On cloud, in my experience, you are mostly paying for compute with managed Kubernetes instances. The overhead and price is almost never Kubernetes itself, but the compute and storage you are provisioning, which, thanks to the control plane, you have complete control over. What am I missing?
I wouldn't dare, with a small shop, try to self-host a production Kubernetes solution unless I was under duress. But I just don't see what the control plane has to do with it. It's the feature that makes Kubernetes worth it.
Tinkerbell / MetalKube, ClusterAPI, Rook, Cilium?
A control plane makes controlling machines easier, that's the point of a control plane.
If not K8S, why not Nomad (https://github.com/hashicorp/nomad)?
Nomad still has a tangible learning curve, which (in my very biased opinion) is almost non-existent with Uncloud assuming the user has already heard about Docker and Compose.
Nomad is great, but you will still end up with a control plane.
Isn't Nomad pretty much dead now?
They had quite a few releases in the last year, so it's not dead, that's for sure, but it's unclear how many new customers they're able to sign up. And with IBM in charge, it's also unclear at what moment they will lose interest.
Does it support a way to bundle things close to each other, for example, not having a database container hosted in a different datacenter than the web app?
The `compose.yaml` spec for services lets you specify which machines to deploy it on, so you could target the database and web app to the same machine (or subset of machines).
There is also an internal DNS for service discovery and it supports a `nearest.` prefix, which will preferentially use instances of a service running on the same machine. For example, I run a globally replicated NATS service and then connect to it from other services using the `nearest.nats.internal` address to connect to the machine-local NATS node.
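For instance, in the Compose file of a consuming service it might look like this (image and env var are made up; 4222 is the standard NATS port):

    services:
      worker:
        image: ghcr.io/example/worker
        environment:
          # "nearest." prefers the NATS replica on the same machine
          NATS_URL: nats://nearest.nats.internal:4222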
As a happy user of Coolify, what's the difference between these two?
Even Coolify lets you add as many machines as you want and then manage Docker containers on all machines from one Coolify installation.
Vendor lock-in; Kubernetes is future-proof.
Wow this looks really cool. Congratulations!
How does this compare to k3s?
Uncloud is not a Kubernetes distribution and doesn't use K8s primitives (although there are of course some similarities). It's closer to Compose/Swarm in how you declare and manage your services. Which has pros and cons depending on what you need and what your (or your team's) experience with Kubernetes is.
How does this compare to something like Dokku?
Is dokku multi node?
This is extremely interesting to me. I've been using Docker Swarm, but there is this growing feeling of staleness. Dokku feels a bit too light, K8s absolutely too heavy. This proposition hits my sweet spot, especially the part where I keep my existing Docker Compose declarations.
Looks lovely. I'll definitely give it a try when the time comes.
I'm using Dokploy, would that be very similar, or quite different?