Show HN: Xata, open-source Postgres platform with copy-on-write branches

(github.com)

3 points | by tudorg 6 hours ago ago

1 comments

tudorg 6 hours ago

Hi HN,

This is Tudor from Xata. You can think of Xata as an open-source, self-hosted, alternative to Aurora/Neon. Highlight features are:

- Fast copy-on-write branching.

- Automatic scale-to-zero and wake-up on new connections.

- 100% Vanilla Postgres. We run upstream Postgres, no modifications.

- Production grade: high availability, read replicas, automatic failover/switchover, upgrades, backups with PITR, IP filtering, etc.

You can self-host it, or you can use our [cloud service](https://xata.io).

Background story: we exist as a company for almost 5 years, offered a Postgres service from the start, and have launched several different products and open source projects here on HN before, including pgroll and pgstream. About a year and half ago, we’ve started to rearchitect our core platform from scratch. It is running in production for almost an year now, and it’s serving customers of all sizes, including many multi-TB databases.

One of our goals in designing the new platform was to make it cloud independent and with a careful selection of dependencies. Part of the reason was for us to be able to offer it in any cloud, and the other part is the subject of the announcement today: we wanted to have it open source and self-hostable.

Use cases: We think Xata OSS is appropriate for two use cases:

- get fast your preview / testing / dev / ephemeral environments with realistic data. We think for many companies this is a better alternative to seed or synthetic data, and allows you to catch more classes of bugs. Combined with anonymization, especially in the world of coding agents, this is an important safety and productivity enabler.

- offer an internal PGaaS. The alternative we usually see at customers is that they use a Kubernetes operator to achieve this. But there’s more to a Postgres platform than just the operator. Xata is more opinionated and comes with APIs and CLI.

Technical details: We wanted from the start to offer CoW branching and vanilla Postgres. This basically meant that we wanted to do CoW at the storage layer, under Postgres. We’ve have tested a bunch of storage system for performance and reliability and ultimately landed on using OpenEBS. OpenEBS is an umbrella project for more storage engines for Kubernetes, and the one that we use is the replicated storage engine (aka Mayastor).

Small side note on separation of storage from compute: since the introduction of PlanetScale Metal, there has been a lot of discussion about the performance of local storage. We had these discussions internally as well, and what’s nice about OpenEBS is that it actually supports both: there are local storage engines and over-the-network storage engines. For our purpose of running CoW branches, however, the advantages of the separation are pretty clear: it allows spreading the compute to multiple nodes, while keeping the storage volumes colocated, which is needed for CoW. So for now the Xata platform is focused on this, but it’s entirely possible to run Xata with local storage: basically a storage-class change away.

Another small side note: while Mayastor is serving us well, and it’s what we recommend for OSS installations, we have been working on our own storage engine in parallel (called Xatastor). It is the key to having sub-second branching and wake-up times and we’ll release it in a couple of weeks.

For the compute layer, we are building on top of CloudNativePG. It’s a stable and battle-tested operator covering all the production great concerns. We did add quite a lot of services around it, though: our custom SQL gateway, a “branch” operator, control plane and authentication services, etc.

The end result is what we think is an opinionated but flexible Postgres platform. More high level and easier to use than a K8s operator, and with a lot of battery included goodies.

Let us know if you have any questions!