I evaluated D1 for a project a few months ago and found that global performance was pretty terrible. I don't know exactly what the issue with their architecture is, but if you look at the time-to-first-byte numbers here[1], you can see that even for the D1 demo database the numbers outside Europe are abysmal, and even within Europe a TTFB of > 200ms isn't great.
This post helps frontend developers understand some basic DB pitfalls, but I wouldn't use D1 regardless. If you can figure out how to use D1 as a frontend dev, you can use a hosted Postgres solution and get much more power and performance.
[1] https://speedvitals.com/ttfb-test?url=https://northwind.d1sq...
Having used D1 in production for over a year on multiple projects, I can confirm that response times for simple queries regularly reach 400ms and beyond. On top of that, there are constant network and connection issues, plus a plethora of internal errors. I too don't recommend it for production unless you're working on a toy project.
Same. North America performance (US and Mexico) had ~200ms+ latency per query, spiking to 500ms or higher in the test application I made using Workers and D1. Their support channel was a Discord, so I posted in it and never got a reply.
I was surprised because Cloudflare’s central messaging is that their network is fast, and disappointed because I’m a happy user of their other products (domains, DNS, Pages, and R2).
> I was surprised because Cloudflare’s central messaging is that their network is fast, and disappointed because I’m a happy user of their other products (domains, DNS, Pages, and R2).
I've glanced through D1's docs and I immediately noticed system traits like:
- database stored in a single primary region where all writes need to go,
- cold starts involve opening a connection to D1,
- cache misses in local replicas involve fetching data back from the primary region,
- D1 is built upon SQLite, which I don't think supports write concurrency well,
- D1 doesn't actively cache results from the primary region at the edge, so you'll have cache misses pretty frequently,
Etc.
These traits don't scream performance.
My take from reading some docs is that you've got to partition your data properly, likely per-user. Then hopefully most of that user's interactions are within the same datacentre.
That’s what the docs say, but if you try to do this you quickly realize that the docs are living in a pipe dream. It’s not possible to set up per-user data in D1. In theory you probably could, but the DX infrastructure to make it possible is non-existent - you have to explicitly bind each database into your Worker. At best you could try to manually shard data, but that has a lot of drawbacks. Or maybe have the Worker republish itself whenever a new user is registered? That seems super dangerous and unlikely to work in a concurrent fashion without something like a DO (Durable Object) to synchronize everything (you don’t want to publish multiple Workers at once with disjoint bindings & you probably want to batch updates to the Worker).
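For readers who haven't used Workers, here is a minimal sketch of what "explicitly bind each database" means, assuming the standard Workers TypeScript types; the binding and table names below are hypothetical:

    // Sketch: every D1 database a Worker can reach is a statically declared binding.
    // There is no API to open an arbitrary D1 database by name at runtime, which is
    // what true per-user databases would require.
    interface Env {
      TENANT_ALICE_DB: D1Database; // hypothetical per-tenant bindings
      TENANT_BOB_DB: D1Database;
      // ...adding a tenant means adding a binding and republishing the Worker
    }

    export default {
      async fetch(request: Request, env: Env): Promise<Response> {
        // The binding is chosen in code at deploy time, not looked up per user.
        const row = await env.TENANT_ALICE_DB
          .prepare("SELECT count(*) AS n FROM orders")
          .first();
        return Response.json(row);
      },
    };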
When I asked on Discord, someone from Cloudflare confirmed that a DO is indeed the only way to do tenancy-based sharding (you give the DO a name to obtain a handle to the specific DO instance to talk to), but the difference in DX between DO and D1 is quite stark; D1 has better DX in many ways but can’t scale, while DO can scale but has terrible DX.
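For contrast, the name-based DO routing described above looks roughly like this (a sketch with hypothetical names; the Durable Object class that actually owns the storage is omitted):

    // Sketch: derive a Durable Object id from the tenant name and forward the
    // request to that specific instance, which owns that tenant's storage.
    interface Env {
      TENANT_DO: DurableObjectNamespace; // hypothetical DO binding
    }

    export default {
      async fetch(request: Request, env: Env): Promise<Response> {
        const tenant = new URL(request.url).searchParams.get("tenant") ?? "default";
        const id = env.TENANT_DO.idFromName(tenant); // same name -> same DO instance
        const stub = env.TENANT_DO.get(id);
        return stub.fetch(request); // the DO runs the actual queries against its storage
      },
    };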
> My take from reading some docs is that you've got to partition your data properly, likely per-user.
I don't think your take makes sense. I'll explain why.
Cloudflare's doc on D1's service limits states that paid plans have a hard limit of 50k databases per account. That's roomy for sharding, but you still end up with a database service where each database is hosted in a single data center while its clients are served from any of the >300 data centers, and where cache misses still require pulling data from the primary region. Hypothetically, sharding does buy you less write contention, but even in read-heavy applications you still end up with all >300 data centers having to pull data from the primary region whenever a single worker does a write.
I am running 2 production apps on Cloudflare Workers, both using D1 for primary storage. I found the performance OK, especially after enabling Smart Placement [1].
1. https://developers.cloudflare.com/workers/configuration/smar...
> If you can figure out how to use D1 as a frontend dev, you can use a hosted Postgres solution and get much more power and performance.
After a cursory glance at Cloudflare D1's description and service limits, I was left with the impression that the use cases are not those of a generic RDBMS. I mean, the limit for each database is 10GB, and apparently a D1 database resides in a single location. Of course this means >200ms to even hit the server when requests don't come from the same data center.
Perhaps this is one of those things where an underperforming implementation is released to minimize time-to-market, and then they work on performance. However, for a distributed service this suggests to me that this solution only works as ad-hoc storage of complex relational data for high-read, low-write scenarios, where you have Cloudflare Workers query D1 but either aggressively cache the response or store query results in Cloudflare KV to avoid the round-trip cost.
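A rough sketch of that high-read, low-write shape, assuming standard Workers bindings (the KV/D1 binding names, table, and TTL are made up):

    // Cache-aside: serve reads from KV at the edge, fall back to D1 on a miss.
    interface Env {
      CACHE: KVNamespace; // hypothetical KV binding
      DB: D1Database;     // hypothetical D1 binding
    }

    async function getTopProducts(env: Env): Promise<unknown> {
      const cached = await env.CACHE.get("top-products", "json");
      if (cached !== null) return cached; // edge hit, no trip to the primary region

      const { results } = await env.DB
        .prepare("SELECT id, name FROM products ORDER BY sales DESC LIMIT 10")
        .all();

      // A short TTL keeps results reasonably fresh while absorbing most reads.
      await env.CACHE.put("top-products", JSON.stringify(results), { expirationTtl: 60 });
      return results;
    }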
> Of course this means >200ms to even hit the server when requests don't come from the same data center.
What world are you living in?
When using a European server from Europe I expect a < 10ms ping, less than 100ms for an East Coast server, and less than 150ms for West Coast ones. The only case where I'd expect more than 200ms (and not a lot more) is when hitting Australian servers.
It doesn't even work well for high-read/low-write scenarios; that was my issue with it. If I'm caching everything in Cloudflare KV anyway, why would I want to use a terribly performing SQL store for cold operations? Much better to just use a hosted Postgres and cache things in CF KV if that isn't fast enough.
We also evaluated D1 and have been utterly disappointed by the performance. It was unacceptable for us (as for any serious production app).
I was actually very impressed by how slow and bad it is.
> If you can figure out how to use D1 as a frontend dev, you can use a hosted Postgres solution and get much more power and performance.
If your application server and your hosted Postgres server live in different buildings, you'll also be disappointed in performance and pay 10x the price of D1.
I see it the same as serverless, which is almost-but-not-quite a backend in the same way that D1 is almost-but-not-quite a database.
Your persistent servers may have a larger fixed cost up-front, but you can save on engineering hours otherwise spent handling serverless foot-guns.
If you introduce a Cartesian explosion of database queries on your persistent instance, it'll fairly quickly just crash and give you downtime. On your serverless version, it'll just keep going for as long as it can, and you won't find out until an eye-watering bill lands in your inbox at the end of the month.
Hard downtime is not desirable, but that 10x initial cost can be dwarfed by an errant commit that leads to a practically endless execution loop on your API or DB.
Another fun limitation is that a transaction cannot span multiple D1 requests, so you can't select from the database, execute application logic, and then write to the database in an atomic way. At most, you can combine multiple statements into a single batch request that is executed atomically.
When I needed to ensure atomicity in such a multi-part "transaction", I ended up making a batch request where the first statement in the batch checks a precondition and forces a JSON parsing error if the precondition is not met, aborting the rest of the batch statements:

    SELECT
      IIF(<precondition>, 1, json_extract("inconsistent", "$")) AS consistent
    FROM ...
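From the Worker side, that trick would look roughly like this (a sketch only; it assumes, as the parent describes, that an error in any batched statement aborts the rest of the batch, and the table, columns, and binding name are hypothetical):

    // Sketch: the first statement errors out (malformed JSON) when the precondition
    // fails, which aborts the batch, so the UPDATE only runs when the check passes.
    interface Env {
      DB: D1Database; // hypothetical binding
    }

    async function reserveSeat(env: Env, seatId: number, userId: number): Promise<boolean> {
      try {
        await env.DB.batch([
          env.DB.prepare(
            "SELECT IIF(reserved_by IS NULL, 1, json_extract('not json', '$')) AS consistent " +
            "FROM seats WHERE id = ?"
          ).bind(seatId),
          env.DB.prepare("UPDATE seats SET reserved_by = ? WHERE id = ?").bind(userId, seatId),
        ]);
        return true;
      } catch {
        return false; // precondition failed (or some other D1 error)
      }
    }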
I was lucky here. For anything more complex, one would probably need to create tables to store temporary values and translate a lot of application logic into SQL statements to achieve atomicity.
Has anyone tried analyzing the performance of Durable Objects with SQL storage? Is it as bad as D1?
I'm pretty sure D1 and DO build upon each other. I'm not sure exactly how, but I do remember reading about them using shared infra.
I am also curious to see these D1 vs DO comparisons if someone has done them.
The optimizations listed in the article address pitfalls common to all serverless databases. Unless you are super diligent about writing queries to your database, it's going to be costly. The only real applications I've found so far are small projects where fewer than 5 tables are needed and no JOINs are required. That means projects like page visitor counts, mailing lists, and website pageview tracking are a perfect fit for serverless databases.
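For the pageview-tracking kind of workload, the whole data layer can be one narrow table and one insert per request, something like this (hypothetical binding and schema):

    // Minimal pageview tracking in a Worker: a single INSERT per request, no JOINs.
    interface Env {
      DB: D1Database; // hypothetical D1 binding
    }

    export default {
      async fetch(request: Request, env: Env): Promise<Response> {
        const path = new URL(request.url).pathname;
        await env.DB
          .prepare("INSERT INTO pageviews (path, ts) VALUES (?, ?)")
          .bind(path, Date.now())
          .run();
        return new Response("ok");
      },
    };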
I used Mongo serverless a few years ago when it was first released. I didn't know how the pricing worked, so I wasn't aware of how much full table scans would cost me, even on a small collection with 100k records...
For example, in logdy.dev[1] I'm using D1 to collect all of the things listed above and it works like a charm with Cloudflare Workers. Just last week I published a post on how to export D1 and analyze it with Meatabase[2]; for the next post I think I'm going to describe the whole stack.
[1] https://logdy.dev/ [2] https://logdy.dev/blog/post/how-to-connect-cloudflare-d1-sql...
>with Meatabase
They mean Metabase, though Meatabase could get interesting as a product
https://www.metabase.com
Interesting. How does the performance compare to KV or Durable Objects?
D1 is at least as fast as KV, and in some cases faster: https://github.com/bruceharrison1984/kv-d1-benchmark