Below I'm discussing compressed size rather than how "fast" it is to copy databases.
Yeah, there are indexes. And even without indexes there is an entire b-tree sitting above the data. So we're weighing the benefits of having a domain-dependent compression (the binary format) vs dropping all of the derived data. I'm not sure how that will go, but let's try one.
Here is an sqlite file containing metadata for Apple's Photos application:
About 6% smaller for the dump vs the original binary (but there are a bunch of indexes in this one). For me, I don't think it'd be worth the small space savings to spend the extra time doing the dump.
With indexes dropped and vacuumed, the compressed binary is 8% smaller than compressed text (despite btree overhead):
566177792 May 1 09:09 photos_noindex.sqlite
262067325 May 1 09:09 photos_noindex.sqlite.gz
About 13.5% smaller than compressed binary with indices. And one could re-add the indices on the other side.
Yup, these results are pretty consistent with what I'd expect (and why I noted the impact of indices), because even string data has a lot of superfluous information when expressed in the dump ("INSERT INTO foo ...") - I would expect all of that to exceed any bookkeeping within the btree. And non-string values like blobs or numbers are going to be stored more efficiently than in the dump, which is a text encoding (or even hex for blobs) that is going to blow things up further.
The main point is to skip the indices, which you have to do pre-compression.
When I do stuff like this, I stream the dump straight into gzip. (You can usually figure out a way to stream directly to the destination without an intermediate file at all.)
Plus this way it stays stored compressed at its destination. If your purpose is backup rather than a poor man's replication.
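A minimal sketch of that streaming setup, assuming the sqlite3 CLI on the source host and ssh access to the destination (host names and paths are hypothetical):

# dump, compress and land on the remote host in one pipeline, no temp file
sqlite3 app.db .dump | gzip -c | ssh backup@remote 'cat > /backups/app-dump.sql.gz'
# later, on the remote side
gunzip -c /backups/app-dump.sql.gz | sqlite3 restored.db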
The main point was decreasing the transfer time - if rsync -z makes it short enough, it doesn't matter if the indices are there or not, and you also skip the step of re-creating the DB from the text file.
The point of the article is that it does matter if the indices are there. And indices generally don't compress very well anyways. What compresses well are usually things like human-readable text fields or booleans/enums.
If working from files on disk that happen not to be cached, the speed differences are likely to disappear, even on many NVMe disks.
(It just so happens that the concatenation of all text-looking .tar files I happen to have on this machine is roughly a gigabyte (though I did the math for the actual size)).
Looks like it depends heavily on the choice of file, but I see good performance on both compressible and uncompressible files. Small files tend to perform (relatively) badly though. Here is a sample of 3 large files with different compression ratios:
Ain't no way zstd compresses at 5+, even at -1. That's the sort of throughput you see on lz4 running on a bunch of cores (either half a dozen very fast ones, or 12~16 merely fast ones).
Valve has different needs than most. Their files rarely change, so they only need to do the expensive compression once, and they save a ton in bandwidth/storage, along with the fact that their users are more tolerant of download responsiveness.
This will always be something you have to determine for your own situation. At least at my work, CPU cores are plentiful, IO isn't. We rarely have apps that need more than a fraction of the CPU cores (barring garbage collection). Yet we are often serving fairly large chunks of data from those same apps.
Depends. Run a benchmark on your own hardware/network. ZFS uses in-flight compression because CPUs are generally faster than disks. That may or may not be the case for your setup.
What? Compression is absolutely essential throughout computing as a whole, especially as CPUs have gotten faster. If you have compressible data sent over the network (or even on disk / in RAM) there's a good chance you should be compressing it. Faster links have not undercut this reality in any significant way.
Whether or not to compress data before transfer is VERY situationally dependent. I have seen it go both ways, and the real-world results do not always match intuition. At the end of the day, if you care about performance, you still have to do proper testing.
(This is the same spiel I give whenever someone says swap on Linux is or is not always beneficial.)
He absolutely should be doing this, because by using rsync on a compressed file he's giving up the whole point of using rsync, which is the rolling-checksum-based algorithm that allows it to transfer only the diffs.
In DuckDB you can do the same but export to Parquet, this way the data is an order of magnitude smaller than using text-based SQL statements. It's faster to transfer and faster to load.
That's not it. This only exports the table's data, not the database. You lose the index, comments, schemas, partitioning, etc... The whole point of OP's article is how to export the indices in an efficient way.
Also I wonder how big your test database is and what its schema looks like. For large tables Parquet is way more efficient than a 20% reduction.
If there are UUIDs, they're 36 bytes each as text and 16 bytes as binary in Parquet. And then if they repeat, you can use a dictionary in your Parquet file to store the 16 bytes only once.
It's also worth trying to use brotli instead of zstd if small files is your goal.
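A rough sketch of the DuckDB-to-Parquet route mentioned a couple of comments up, assuming DuckDB with its sqlite extension available; database, table and file names are hypothetical:

duckdb <<'SQL'
INSTALL sqlite; LOAD sqlite;
ATTACH 'app.db' AS src (TYPE sqlite);
-- one Parquet file per table; repeat for each table you want to sync
COPY (SELECT * FROM src.orders) TO 'orders.parquet' (FORMAT parquet);
SQL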
SQLite has a session extension, which will track changes to a set of tables and produce a changeset/patchset which can patch a previous version of an SQLite database.
I have yet to see a single SQLite binding supporting this, so it’s quite useless unless you’re writing your application in C, or are open to patching the language binding.
In one of my projects I have implemented my own poor man’s session by writing all the statements and parameters into a separate database, then sync that and replay. Works well enough for a ~30GB database that changes by ~0.1% every day.
Well, my upcoming Elixir wrapper of a Rust wrapper of SQLite (heh, I am aware how it sounds) will support it. I am pretty sure people do find it useful and would use it. If not, the 1-2 days of hobby coding to deliver it are not something I would weep over.
I have updated the Lua binding to support the session extension (http://lua.sqlite.org/home/timeline?r=session) and it's been integrated into the current version of cosmopolitan/redbean. This was partially done to support application-level sync of SQLite DBs, however this is still a work in progress.
Ignoring for the moment issues of syncing a database where the source DB may be running inserts/updates:
if one dumps tables as separate CSV files/streams and uses DuckDB to convert them to individual Parquet files, the rsync should run faster, since hopefully not every table is modified between syncs. There is an obvious overhead in the back-and-forth conversions, but DuckDB can directly export a database to SQLite.
I have not tested it myself, so this is just brainstorming.
Last but not least: when compressing/decompressing text dumps use igzip or pigz if you want to speed things up. Also benchmark the compression levels.
If you're regularly syncing from an older version to a new version, you can likely optimize further using gzip with "--rsyncable" option. It will reduce the compression by ~1% but make it so differences from one version to the next are localized instead of cascading through the full length of the compression output.
Another alternative is to skip compression of the dump output, let rsync calculate the differences from a previous uncompressed dump to the current dump, then have rsync compress the change sets it sends over the network (rsync -z).
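A small sketch of both variants, assuming GNU gzip built with --rsyncable support; file names and hosts are hypothetical:

# variant 1: keep a compressed dump whose blocks stay aligned between versions
sqlite3 app.db .dump | gzip --rsyncable > dump.sql.gz
rsync -av dump.sql.gz backup@remote:/backups/

# variant 2: keep the dump uncompressed on both sides, compress only on the wire
sqlite3 app.db .dump > dump.sql
rsync -avz dump.sql backup@remote:/backups/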
This reminds me of something I did back in 2008, we were using Postgres, not SQLite, and we needed to get a backup from one production machine onto several other machines. Copying saturated the network for an hour - And we needed to copy four times if I remember right. We didn't want to wait four hours.
I found this tool laying around "udpcast" and used it to send the backup over the network just once to all the destinations at the same time.
Does the author not know that rsync can use compression (rsync -z | --compress | --compress-level=<n> ), or does he not think it worthwhile to compare that data point?
I just tried some comparisons (albeit with a fairly small sqlite file). The text compressed to only about 84% of the size of the compressed binary database, which isn't negligible, but not necessarily worth fussing over in every situation. (The binary compressed to 7.1%, so it's 84% relative to that).
bzip2 performed better on both formats; its compression of the binary database was better than gzip's compression of the text (91.5%), and bzip2's text was better than its binary (92.5%).
Though that is not available inside rsync, it indicates that if you're going with an external compression solution, maybe gzip isn't the best choice if you care about every percentage reduction.
If you don't care about every percentage reduction, maybe just rsync compression.
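For anyone who wants to repeat that comparison on their own data, a quick measurement sketch (file names are hypothetical, compression levels left at the defaults):

sqlite3 app.db .dump > dump.sql
for f in app.db dump.sql; do
  for c in gzip bzip2; do
    printf '%s + %s: ' "$f" "$c"
    "$c" -c "$f" | wc -c
  done
done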
One thing worth mentioning is that if you are updating the file, rsync will only compress what is sent. To replicate that with the text solution, you will have to be retaining the text on both sides to do the update between them.
I've seen a suggestion several times to compress the data before sending. If remote means in the same data center, there's a good chance compressing the data is just slowing you down. Not many machines can gzip/bzip2/7zip at better than the 1 gigabyte per second you can get from 10 Gbps networks.
I recently set up some scripts to do this and it wasn't quite as simple as I had hoped. I had to pass some extra flags to pg_restore for --no-owner --no-acl, and then it still had issues when the target db has data in it, even with --clean and --create. And sometimes it would leave me in a state where it dropped the database and had trouble restoring, and so I'd be totally empty.
What I ended up doing is creating a new database, pg_restore'ing into that one with --no-owner and --no-acl, forcibly dropping the old database, and then renaming the new to the old one's name. This has the benefit of not leaving me high and dry should there be an issue with restoring.
I used to work at a company that had a management interface that used sqlite as its database; its multi-node / failover approach was also just... copying the file and rsyncing it. I did wonder about data integrity though - what if the file is edited while it's being copied over? But there are probably safeguards in place.
Anyway I don't think the database file size was really an issue, it was a relatively big schema but not many indices and performance wasn't a big consideration - hence why the backend would concatenate query results into an XML file, then pass it through an xml->json converter, causing 1-2 second response times on most requests. I worked on a rewrite using Go where requests were more like 10-15 milliseconds.
But, I still used sqlite because that was actually a pretty good solution for the problem at hand; relatively low concurrency (up to 10 active simultaneous users), no server-side dependencies or installation needed, etc.
SQLite has a write-ahead log (WAL). You can use Litestream on top of that. You get single RW, multiple readers (you lose the C in CAP), and can promote a reader when the writer fails.
Clearly, when you don’t transfer indexes, you will not have to transfer as much data.
However, the tradeoff is that the database is going to have to do more work to regenerate those indexes when you reconstruct it from the text dump at the destination (as opposed to the case where the indexes were included)
Why not just compress the whole database using `gzip` or `lz4` before rsyncing it instead? `zstd` works too but seems like it had a bug regarding compressing file with modified content.
Better yet, split your sqlite file into smaller pieces. It's not like it needs to contain all the app data in a single sqlite file.
I have recently discovered a tool called mscp which opens multiple scp threads to copy down large files. It works great for speeding up these sorts of downloads.
zstd would be a better choice. It’s bonkers fast (especially when used with multithreading) and still compresses better than gzip. Alternatively, I’d recommend looking into bzip3, but I’m not sure if it would save time.
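A quick way to check that trade-off on your own data, assuming zstd is installed (-T0 uses all cores; the file name is hypothetical):

time gzip -k -6 app.db        # leaves app.db.gz
time zstd -k -T0 -3 app.db    # leaves app.db.zst
ls -l app.db.gz app.db.zst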
isn't this rather obvious? doesn't everyone do this when it makes sense? obviously, it applies to other DBs, and you don't even need to store the file (just a single ssh from dumper to remote undumper).
if retaining the snapshot file is of value, great.
I'd be a tiny bit surprised if rsync could recognize diffs in the dump, but it's certainly possible, assuming the dumper is "stable" (it probably is, because it's walking the tables as trees). The amount of change detected by rsync might actually be a useful thing to monitor.
How long does this procedure take in comparison to the network transfer?
My first try would've been to copy the db file first, gzip it and then transfer it, but I can't tell whether compression would be that useful on the binary format.
The sqlite file format (https://www.sqlite.org/fileformat.html) does not talk about compression, so I would wager unless you are storing already compressed content (media maybe?) or random numbers (encrypted data), it should compress reasonably well.
Quite simply, I have a table with 4 columns -- A, B, C, D. Each column is just an 8-byte integer. It has hundreds of millions of rows. It has an index on B+C+D, an index on C+D, and one on D.
All of these are required because the user needs to be able to retrieve aggregate data based on range conditions around lots of combinations of the columns. Without all the indices, certain queries take a couple minutes. With them, each query takes milliseconds to a couple seconds.
I thought of every possible way to avoid having all three indices, but it just wasn't possible. It's just how performant data lookup works.
You shouldn't assume people are being careless with indices. Far too often I see the opposite.
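A minimal sketch of the schema being described, with hypothetical table and index names:

sqlite3 metrics.db <<'SQL'
CREATE TABLE samples(a INTEGER, b INTEGER, c INTEGER, d INTEGER);
CREATE INDEX idx_bcd ON samples(b, c, d);
CREATE INDEX idx_cd  ON samples(c, d);
CREATE INDEX idx_d   ON samples(d);
-- the kind of range query the three indexes exist to serve
SELECT count(*), sum(a) FROM samples WHERE c BETWEEN 10 AND 20 AND d < 5;
SQL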
How well does just the sqlite database gzip? The indexes are a lot of redundant data, so you're going to get some efficiencies there; probably less locality of data than the text file though, so maybe less?
Since sqlite is just a simple file-level-locking DB, I'm pretty shocked they don't have an option to let the indexes be stored in separate files, for all kinds of obvious and beneficial reasons - like the fact that you could easily exclude them from backups if they were separate, and you could make them "rebuild" just by deleting them. Probably their reason for keeping everything internal has to do with being sure indexes are never out of sync, but that could just as easily be accomplished with hashing algos.
I’ve been looking into a way to replicate a SQLite database and came across the LiteFS project by Fly.io. Seems like a solid drop-in solution backed by FUSE and Consul. Anybody used it in production? My use case is high availability between multiple VMs.
Pretty good point. I just wonder if databases in general can be perfectly reconstructed from a text dump. For instance, do the insertion orders change in any of the operations between dumping and importing?
> If it takes a long time to copy a database and it gets updated midway through, rsync may give me an invalid database file. The first half of the file is pre-update, the second half file is post-update, and they don’t match. When I try to open the database locally, I get an error
Of course! You can't copy the file of a running, active db receiving updates, that can only result in corruption.
For replicating sqlite databases safely there is
https://github.com/benbjohnson/litestream
"For replicating sqlite databases safely there is (litestream) ..."
A reminder that litestream can run over plain old SFTP[1] which means you can stream database replication to just about any UNIX endpoint over SSH.
I have a favorite[2] but any SFTP server will do ...
[1] https://github.com/benbjohnson/litestream/issues/140
[2] https://www.rsync.net/resources/notes/2021-q3-rsync.net_tech...
What's the closest counterpart in the Postgres ecosystem?
log shipping[1] or logical replication[2]
[1] https://www.postgresql.org/docs/current/warm-standby.html [2] https://www.postgresql.org/docs/current/logical-replication....
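For the logical-replication route, a minimal sketch of the two sides, assuming PostgreSQL 10+ with wal_level = logical on the publisher; database, table and connection details are hypothetical:

# on the publisher
psql -d app <<'SQL'
CREATE PUBLICATION app_pub FOR TABLE orders, customers;
SQL
# on the subscriber (the tables must already exist there with the same schema)
psql -d app <<'SQL'
CREATE SUBSCRIPTION app_sub
  CONNECTION 'host=primary dbname=app user=replicator'
  PUBLICATION app_pub;
SQL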
> You can't copy the file of a running, active db receiving updates, that can only result in corruption
To push back against "only" -- there is actually one scenario where this works. Copying a file or a subvolume on Btrfs or ZFS can be done atomically, so if it's an ACID database or an LSM tree, in the worst case it will just roll back. Of course, if it's multiple files you have to take care to wrap them in a subvolume so that all of them are copied in the same transaction; simply using `cp --reflink=always` won't do.
Possibly freezing the process with SIGSTOP would yield the same result, but I wouldn't count on that
It can't be done without fs specific snapshots - otherwise how would it distinguish between a cp/rsync needing consistent reads vs another sqlite client wanting the newest data?
I would assume cp uses ioctl (with atomic copies of individual files on filesystems that support CoW like APFS and BTRFS), whereas sqlite probably uses mmap?
So not a "naive" cp, but at least it calls a shared ioctl that is implemented by multiple filesystems.
I was trying to find evidence that reflink copies are atomic and could not, and LLMs seem to think they are not. So at best it may be a Btrfs-only feature?
From Linux kernel documentation:
https://man7.org/linux/man-pages/man2/ioctl_ficlone.2.html
> Clones are atomic with regards to concurrent writes, so no locks need to be taken to obtain a consistent cloned copy.
I'm not aware of any of the filesystems that use it (Btrfs, XFS, Bcachefs, ZFS) that deviate from expected atomic behavior, at least with single files being the atom in question for `FICLONE` operation.
Thanks! I similarly thought that a file clone would work but I couldn't confirm it.
Obligatory "LVM still exists and snapshots are easy enough to overprovision for"
Taking an LVM snapshot and then copying the sqlite database from that is sufficient to keep it from being corrupted, but you can have incomplete transactions that will be rolled back during crash recovery.
The problem is that LVM snapshots operate at the block device level and only ensure there are no torn or half-written blocks. It doesn't know about the filesystem's journal or metadata.
To get a consistent point-in-time snapshot without triggering crash-recovery and losing transactions, you also need to lock the sqlite database or filesystem from writes during the snapshot.
PRAGMA wal_checkpoint(FULL);
BEGIN IMMEDIATE; -- locks out writers
/* trigger your LVM snapshot here */
COMMIT;

You can also use fsfreeze to get the same level of safety. Bonus - validate the snapshotted db file with:
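A minimal sketch of one way to validate the copied snapshot, assuming the sqlite3 CLI is available where the snapshot is mounted (the path is hypothetical):

sqlite3 /mnt/snap/app.db 'PRAGMA integrity_check;'   # prints "ok" if the file is sound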
What does "losing transactions" mean? Some transactions will have committed before your backup and some will have committed after and therefore won't be included. I don't see the problem you are trying to solve?

Whether a transaction had started and gets transparently rolled back, or you had been prevented from starting it, what is the difference to you? Either way, you have a point-in-time snapshot, and that time is the latest commit before the LVM snapshot.
You're discussing this in terms of "safety" and that doesn't seem right to me.
The built-in .backup command is also intended as an official tool for making “snapshotted” versions of a live db that can be copied around.
This leverages a dedicated on-line backup API: https://sqlite.org/backup.html
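A short sketch of that built-in route, assuming the sqlite3 CLI; file names and host are hypothetical:

# .backup uses the online backup API, so it is safe against a live writer
sqlite3 live.db ".backup live-copy.db"
rsync -av live-copy.db backup@remote:/backups/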
While I run and love litestream on my own system, I also like that they have a pretty comprehensive guide on how to do something like this manually, via built-in tools: https://litestream.io/alternatives/cron/
Litestream is really cool! I'm planning to use it to back up and restore my SQLite at the container level, just like that ex-Google guy who started a KVM startup and had a flood in his warehouse while on vacation did, if I'm not mistaken. I would link the perfect guide he wrote here but there's 0 chance I'll find it. If you understand the reference please post the link.
Haha, that sounds like me. Here's the writeup you're talking about:
https://mtlynch.io/litestream/
And here's the flooding story:
https://mtlynch.io/solo-developer-year-6/#the-most-terrifyin...
Sidenote: I still use Litestream in every project where I use SQLite.
Litestream looks interesting but they are still in beta, and seem to have not had a release in over a year, although SQLite doesn't move that quickly.
Is Litestream still an active project?
Despite the beta label and lack of a 1.x release, I would consider the project pretty stable. We've used it in production for over 18 months to support an offline-first point of sale system. We haven't had any issues with Litestream.
That's great, thanks for sharing!
seems like a new release is being worked on: https://github.com/benbjohnson/litestream/pull/636
>You can't copy the file of a running, active db receiving updates, that can only result in corruption
There is a slight 'well akshully' on this. A DB flush and FS snapshot where you copy the snapshotted file will allow this. MSSQL VSS snapshots would be an example of this.
Similarly you can rsync a Postgres data directory safely while the db is running, with the caveat that you likely lose any data written while the rsync is running. And if you want that data, you can get it with the WAL files.
It’s been years since I needed to do this, but if I remember right, you can clone an entire pg db live with a `pg_backup_start()`, rsync the data directory, pg_backup_stop() and rsync the WAL files written since backup start.
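A rough sketch of that sequence, assuming PostgreSQL 15+ (older releases spell these pg_start_backup()/pg_stop_backup()); paths, hosts and the label are hypothetical, and a real restore also needs the label data returned by pg_backup_stop():

# the start/stop calls must happen in the same session, hence the single psql run
psql -U postgres -d app <<'EOF'
SELECT pg_backup_start('rsync-copy');
\! rsync -a /var/lib/postgresql/data/ backup@remote:/restore/data/
SELECT pg_backup_stop();
EOF
# then ship the WAL written while the copy ran
rsync -a /var/lib/postgresql/data/pg_wal/ backup@remote:/restore/data/pg_wal/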
For moving DBs where I'm allowed minutes of downtime I do rsync (slow) first from the live, while hot, then just stop that one, then rsync again (fast) then make the new one hot.
Works a treat when other (better) methods are not available.
MSSQL also offers a virtual backup device interface that 3rd party tools can implement.
https://learn.microsoft.com/en-us/sql/relational-databases/b...
"If it takes a long time to copy a database and it gets updated midway through, rsync may give me an invalid database file"
Wot? There are multiple ways of snapshotting/checkpointing, starting at the virty level and working on down the stack through the application level.
If the corruption is detectable and infrequent enough for your purposes, then it does work, with a simple “retry until success” loop. (That’s how TCP works, for example.)
Not all corruption is detectable. You could make a copy during a transaction where only a subset of the transaction's saved pages are persisted but all branch & leaf pages are pointed to correctly. That would give you a state of the database that never actually existed and break atomicity.
> Not all corruption is detectable.
Well, I don't know rsync that well. If you're saying it doesn't detect changes to files while it's being copied, then I'll believe you.
And, as far as I know, it's impossible to detect absolutely all corruption.
But you can pretty easily detect, e.g., that a file has or has not changed since before you copied it to after, on a system with a basically functioning filesystem and clock, with a reasonable/useful level of confidence.
> Of course! You can't copy the file of a running, active db receiving updates, that can only result in corruption
Do people really not understand how file storage works? I cannot rightly apprehend the confusion of ideas that would produce an attempt to copy a volatile database without synchronization and expect it to work.
The confusion of ideas here is understandable IMO: people assume everything is atomic. Databases of course famously have ACID guarantees. But it's easy for people to assume copying is also an atomic operation. Honestly if someone works too much with databases and not enough with filesystems it's a mistake easily made.
> I cannot rightly apprehend the confusion of ideas
I see you are a man of culture.
Charles Babbage is smart, but either he lacks empathy to understand other people or he's just saying that deliberately for comedic effect.
It was early days... very early days. He didn't have the benefit of trying to help his (metaphorical) grandparents get their emails or worked under a manager who thinks 2023-era ChatGPT is only slightly less reliable than the Standard Model of Physics, if not slightly more.
Oh he definitely lacked empathy.
But things haven't improved much. Today we have "prompt engineers" whose only job is to input the right question in order to get the right answer.
SQLite has an official tool for this, fwiw: https://www.sqlite.org/rsync.html
It works at the page level:
> The protocol is for the replica to send a cryptographic hash of each of its pages over to the origin side, then the origin sends back the complete content of any page for which the hash does not match.
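Usage is a single command; a sketch assuming the binary is installed on both ends (host and paths are hypothetical):

# only pages whose hashes differ cross the wire
sqlite3_rsync /data/app.db backup@remote:/backups/app.db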
Yeah, but unfortunately the SQLite team doesn't include that tool with their "autotools" tarball, which is what most distros (and brew) use to package SQLite. The only way to use the tool is to compile it yourself.
Yeah, that’s a bummer. It does appear to be in nixpkgs, though:
Realistically, are you using SQLite if you can’t compile and source control your rev of the codebase? Is that really a big deal?
Yes, it's extremely common to be using it and not even be compiling anything yourself, let alone C or any support libraries.
`sqlite3_rsync` must be installed on the remote host too, so now you're cross-compiling for all the hosts you manage. It also must be installed into the PATH the ssh uses, which for a number of operating systems doesn't include /usr/local/bin. So I guess you're now placing your sshd config under configuration management to allow that.
These tasks aren't that challenging but they sure are a yak shave.
How to copy databases between computers? Just send a circle and forget about the rest of the owl.
As others have mentioned, an incremental rsync would be much faster, but what bothers me the most is that he claims that sending SQL statements is faster than sending the database while COMPLETELY omitting the fact that you have to execute these statements. And then run /optimize/. And then run /vacuum/.
Currently I have a scenario in which I have to "incrementally rebuild" a database from CSV files. While in my particular case recreating the database from scratch is more optimal - despite heavy optimization it still takes half an hour just to run the batch inserts on an empty in-memory database, create the indexes, etc.
I hope you've found https://stackoverflow.com/questions/1711631/improve-insert-p...
It's a very good writeup on how to do fast inserts in sqlite3
Yes! That was actually quite helpful.
For my use case (recreating in-memory from scratch) it basically boils down to three points: (1) journal_mode = off (2) wrapping all inserts in a single transaction (3) indexes after inserts.
For whatever it's worth I'm getting 15M inserts per minute on average, and topping around 450k/s for trivial relationship table on a stock Ryzen 5900X using built-in sqlite from NodeJS.
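A condensed sketch of those three points, assuming the plain sqlite3 CLI and a hypothetical table plus a header-less CSV file:

sqlite3 build.db <<'SQL'
PRAGMA journal_mode = OFF;
CREATE TABLE items(id INTEGER, name TEXT);
.mode csv
BEGIN;
.import items.csv items
COMMIT;
CREATE INDEX idx_items_name ON items(name);  -- index only after the bulk load
SQL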
Would it be useful for you to have a SQL database that’s like SQLite (single file but not actually compatible with the SQLite file format) but can do 100M/s instead?
Which database might that be?
I have been looking for a replacement for SQLite for years -- admittedly, not very actively; embedded databases are just a hobby obsession and my life did not allow me much leisure time in the last few years -- and still couldn't find one.
The written-in-Rust `sled` database is more like a key-value store and I had partial successes with it but it's too much work making a KV store a relational database.
I have a PoC KV store that does > 150M writes/s (Rust) using a single core (for 8 bytes of data inserted - it gets bandwidth-limited quite quickly, even on the latest NVMe PCIe 5 disks). The plan is to have it support application-level RAID0 out of the box so that you could spread the data across multiple disks, but of course that's something you have to set up up-front when you create the DB.
I then would add a SQL engine on top - not sure how much SQL would slow things down but hopefully not much. I haven't found anyone who's interested in anything like that though.
And yes I realize this is several orders of magnitude more performance than any other DB out there.
KV doesn't cut it for my use case. After the import I'm running several queries to remove unwanted data. I was briefly contemplating filtering while importing, but (a) I couldn't really handle relationships - hence two distinct steps, remove and then remove orphans, (b) even if I could, it's much cleaner code to have a single simple function to import a table and then clean one-liners to remove the weeds, and (c) I need SQL queries later on.
Honestly I don't see much use for yet-another-sqlite.
The premise of having 100M/s writes instead of 500k/s sounds a bit unrealistic, but at the same time, while simply importing tuples and completely ignoring stuff like foreign keys, I'm only utilizing one core. I had on my todo list an experiment to run these imports in parallel into different databases and then merge them somehow, but I ran out of time. Again, a 10Gb sqlite database is quite large.
On the other hand, I think the adoption and the fact that you can take your db basically anywhere and it will run right out of the box is something you can't ignore. I was briefly looking at pglite but I don't really see benefits apart from a niche use case when you really need that compatibility with big brother.
And then sqlite has so many hidden gems, like the scenario where you can use a sqlite file hosted on http like a remote database! I can post my 10Gb database on S3 and run count(*) on main table and it will only take like 40kb of bandwidth.
> Honestly I don't see much use for yet-another-sqlite.
Agreed. I want something better than SQLite, something that learns from it and upgrades it further.
> On the other hand, I think the adoption and the fact that you can take your db basically anywhere and it will run right out of the box is something you can't ignore.
Absolutely. That's why I am soon finishing my Elixir -> Rust -> SQLite library since I happen to believe most apps don't even need a dedicated DB server.
> I was briefly looking at pglite but I don't really see benefits apart from a niche use case when you really need that compatibility with big brother.
I would probably easily pay 1000 EUR next month if I could have SQLite with PostgreSQL's strict schema. That's the one weakness of SQLite that I hate with a passion. I know about strict mode. I am using it. Still not good enough. I want "type affinity" gone forever. It's obviously a legacy feature and many people came to rely on it.
Hence I concluded that SQLite will never change and something newer should arrive at one point. Though how do you beat the (likely) millions of tests that SQLite has? You don't... but we have to start somewhere.
> And then sqlite has so many hidden gems, like the scenario where you can use a sqlite file hosted on http like a remote database! I can post my 10Gb database on S3 and run count(*) on main table and it will only take like 40kb of bandwidth.
Admittedly I never saw the value in that; to me it just seems like you are having a remote database again, at which point why not just go for PostgreSQL, which is stricter and has far fewer surprises. But that's my bias towards strictness and catching bugs at the door and not 10 km down the road.
@types
I hear you. As someone who lost the rest of his hair during the last 10 years talking to frontend kids claiming that types are for grannies - I'm on your side. But that said, sqlite has a niche application and you can enforce types at the app layer, the same way you do it with web apps. Trust, but verify. Better - don't trust at all. At the end of the day, the way I see it, it's just like protobuffers et al. - you put some data into a black-box stream of bytes and it's your responsibility to ensure correctness on both ends.
@serverless
It's twofold. On one hand you have the ability to move faster. On the other you have fewer moving parts that need maintenance and can break. Plus, for me personally, it's the default mindset. Let me give you an example - in 2025 most online shops still have filters that trigger a full reload of the web page. When I'm clicking on TVs in a shop I don't need to reload the webpage every time I click on something; the app could easily get the whole stock in a single JSON and filter the results on the fly while I'm fiddling with the filters. Sure, it doesn't work for Amazon, but it works for 95% of shops online. Yet no one is doing it. Why?
My point - I'm looking for a way to simplify processes, and for some niche applications it's just more convenient.
Not really.
I tested a couple of different approaches, including pglite, but node finally shipped native sqlite with version 23 and it's fine for me.
I'm a huge fan of serverless solutions, and one of the absolute hidden gems about sqlite is that you can publish the database on an http server and query it extremely efficiently from a client.
I even have a separate miniature benchmark project I thought I might publish, but then I decided it's not worth anyone's time. x]
It's worth noting that the data in that benchmark is tiny (28MB). While this varies between database engines, "one transaction for everything" means keeping some kind of allocations alive.
The optimal transaction size is difficult to calculate so should be measured, but it's almost certainly never beneficial to spend multiple seconds on a single transaction.
There will also be weird performance changes when the size of data (or indexed data) exceeds the size of main memory.
Hilarious, 3000+ votes for a Stack Overflow question that's not a question. But it is an interesting article. Interesting enough that it gets to break all the rules, I guess?
It's a (quite old) community wiki post. These do (and especially did back then) work and are treated differently.
yes, but they punt on this issue:
CREATE INDEX then INSERT vs. INSERT then CREATE INDEX
i.e. they only time INSERTs, not the CREATE INDEX after all the INSERTs.
As with any optimization, it matters where your bottleneck is. Sounds like theirs is bandwidth, while CPU/Disk IO is plentiful, since they mentioned that downloading a 250MB database takes a minute, whereas I just grabbed a 2GB SQLite test database from a work server in 15 seconds thanks to 1Gbps fiber.
30 minutes seems long. Is there a lot of data? I've been working on bootstrapping sqlite dbs off of lots of json data, and by holding a list of values and then inserting 10k at a time, I've found a good perf sweet spot where I can insert plenty of rows (millions) in minutes. I had to use some tricks with bloom filters and LRU caching, but I can build a 6 gig db in like 20ish minutes now.
It's roughly 10Gb across several CSV files.
I create a new in-mem db, run the schema and then import every table in one single transaction (my testing showed that it doesn't matter whether it's a single batch or multiple single inserts, as long as they are part of a single transaction).
I do a single string replacement on every CSV line to handle an edge case. This results in roughly 15 million inserts per minute (give or take, depending on table length and complexity). 450k inserts per second is a magic barrier I can't break.
I then run several queries to remove unwanted data, trim orphans, add indexes, and finally run optimize and vacuum.
Here's quite recent log (on stock Ryzen 5900X):
Millions of rows in minutes does not sound OK, unless your tables have a large number of columns. A good rule is that SQLite's insertion performance should be at least 1% of the sustained max write bandwidth of your disk; preferably 5% or more. On my last bulk table insert I was seeing 20%+ sustained; that came to ~900k inserts/second for an 8-column INT table (small integers).
Saying that 30 minutes seems long is like saying that 5 miles seems far.
The recently released sqlite_rsync utility uses a version of the rsync algorithm optimized to work on the internal structure of a SQLite database. It compares the internal data pages efficiently, then only syncs changed or missing pages.
Nice tricks in the article, but you can more easily use the builtin utility now :)
I blogged about how it works in detail here: https://nochlin.com/blog/how-the-new-sqlite3_rsync-utility-w...
Also note:
sqlite3_rsync is now built into the rsync.net platform.
… just added last week and not rolled out in all regions but … all initial users reported it worked exactly as they expected it to.
Demands increasing page size if you sync frequently (bandwidth).
sqlite_rsync can only be used in WAL mode. A further constraint of WAL mode is that the database file must be stored on local disk. Clearly, you'd want to do this almost all the time, but for the times this is not possible this utility won't work.
I just checked in an experimental change to sqlite3_rsync that allows it to work on non-WAL-mode database files, as long as you do not use the --wal-only command-line option. The downside of this is that the origin database will block all writers while the sync is going on, and the replica database will block both readers and writers during the sync, because to do otherwise requires WAL mode. Nevertheless, being able to sync DELETE-mode databases might well be useful, as you observe.
If you are able, please try out this enhancement and let me know if it solves your problem. See <https://sqlite.org/src/info/2025-05-01T16:07Z> for the patch.
Update: This enhancement is now on trunk and will be included in the 3.50.0 release of SQLite due out in about four weeks.
WAL mode works on many network filesystems provided it's being written from a single host at a time.
I'm not sure I understand your comment. Regardless of WAL or network filesystem usage, the sqlite file cannot be written to from multiple processes simultaneously. Am I missing something here, or did you misstate?
Saving to a text file is inefficient. I save sqlite databases using VACUUM INTO, like this:
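Presumably something along these lines (a sketch; the destination path is a placeholder):

  -- writes a vacuumed, self-contained copy of the live database to a new file
  VACUUM INTO '/backups/my_database_copy.db';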
From https://sqlite.org/lang_vacuum.html :
It's cool, but it does not address the issue of indexes mentioned in the original post. Not carrying index data over the slow link was the key idea. The VACUUM INTO approach keeps indexes.
A text file may be inefficient as is, but it's perfectly compressible, even with primitive tools like gzip. I'm not sure the SQLite binary format compresses equally well, though it might.
SQLite tosses out the SQL once it is parsed into bytecode. Using text is just going to take longer, even though I’m sure it works great.
You can modify the database before vacuuming by making a new in-memory database, copying selected tables into it, and then vacuuming that to disk.
This should be the accepted answer.
> A text file may be inefficient as is, but it's perfectly compressible, even with primitive tools like gzip. I'm not sure the SQLite binary format compresses equally well, though it might.
I hope you're saying that because of indexes? I think you may want to revisit how compression works to fix your intuition. Text+compression will always be larger and slower than equivalent binary+compression, assuming the text and binary represent the same contents. Why? Binary is less compressible as a percentage, but it starts off smaller in absolute terms, which results in a smaller absolute output. A way to think about it is information theory: binary should generally represent the data more compactly already, because the structure lives in the code. Compression is about replacing common structure with shorter references, and it works better if there's a lot of redundant structure. However, while text has a lot of redundant structure, that's actually bad for the compressor, because it has to find that structure and process more data to do so. Additionally, it is using generic mathematical techniques to remove that structure, which are generically optimal but not as optimal as removing that structure by hand via a binary format.
There's some nuance here because the text represents slightly different things than the raw binary SQLite (how to restore the data in the db vs. the precise relationships + data structures for allowing insertion/retrieval). But I'd still expect it to end up smaller compressed for non-trivial databases.
Below I'm discussing compressed size rather than how "fast" it is to copy databases.
Yeah, there are indexes. And even without indexes there is an entire b-tree sitting above the data. So we're weighing the benefits of domain-dependent compression (the binary format) vs. dropping all of the derived data. I'm not sure how that will go, but let's try one.
Here is an sqlite file containing metadata for Apple's Photos application:
Doing a VACUUM INTO, then gzip -k photos.sqlite (this took 20 seconds); sqlite3 -readonly photos.sqlite .dump > photos.dump (10 seconds); gzip -k photos.dump (21 seconds). About 6% smaller for the dump vs. the original binary (but there are a bunch of indexes in this one). For me, I don't think it'd be worth the small space savings to spend the extra time doing the dump.
With indexes dropped and vacuumed, the compressed binary is 8% smaller than the compressed text (despite btree overhead):
About 13.5% smaller than the compressed binary with indices. And one could re-add the indices on the other side.
Yup, these results are pretty consistent with what I'd expect (& why I noted the impact of indices), because even string data carries a lot of superfluous information when expressed as dump SQL ("INSERT INTO foo ...") - I would expect all of that to exceed any bookkeeping within the btree. And non-string values like blobs or numbers are going to be stored more efficiently than in the dump, which is a text encoding (or even hex for blobs), which blows things up further.
Brilliant. >60% savings. 700mb? wow.
Is that really necessary?
Depending on the bandwidth at the target site, which may be pretty remote, and not exposing a public internet service.
Ah no, I meant “is the snark necessary?” to the parent comment. I enjoyed the read!
Does that preserve the indexes? As the TFA mentioned, the indexes are why the sqlite files are huge.
You're right. It does. I never thought about it until you asked.
I think it won't preserve the index data, but it will recreate the indexes when running the text SQL.
I was surprised that he didn't try to use the on-the-fly compression provided by rsync:
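A sketch of what that would look like (host and paths are placeholders):

  # -z compresses file data in flight; --compress-level trades CPU for bandwidth
  rsync -avz --compress-level=6 user@server:/var/db/my_database.db ./my_database.db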
Probably it's faster to compress to gzip and transfer later. But it's nice to have the possibility to improve the transfer with a flag.
Or better yet, since they cite corruption issues, sqlite3_rsync (https://sqlite.org/rsync.html) with -z
sqlite transaction- and WAL-aware rsync with inflight compression.
The main point is to skip the indices, which you have to do pre-compression.
When I do stuff like this, I stream the dump straight into gzip. (You can usually figure out a way to stream directly to the destination without an intermediate file at all.)
Plus this way it stays stored compressed at its destination, if your purpose is backup rather than a poor man's replication.
The main point was decreasing the transfer time - if rsync -z makes it short enough, it doesn't matter if the indices are there or not, and you also skip the step of re-creating the DB from the text file.
The point of the article is that it does matter if the indices are there. And indices generally don't compress very well anyways. What compresses well are usually things like human-readable text fields or booleans/enums.
I believe compression is only good on slow speed networks.
It would have to be one really fast network... zstd compresses and decompresses at 5+ GB (bytes, not bits) per second.
I just tested on a ramdisk:
If working from files on disk that happen not to be cached, the speed differences are likely to disappear, even on many NVMe disks. (It just so happens that the concatenation of all the text-looking .tar files I have on this machine is roughly a gigabyte, though I did the math for the actual size.)
Looks like it depends heavily on the choice of file, but I see good performance on both compressible and incompressible files. Small files tend to perform (relatively) badly, though. Here is a sample of 3 large files with different compression ratios:
Ain't no way zstd compresses at 5+, even at -1. That's the sort of throughput you see from lz4 running on a bunch of cores (either half a dozen very fast ones, or 12~16 merely fast ones).
Where are you getting this performance? On the average computer that is far from the speed you'll see.
Valve tends to take a different view...
Valve has different needs than most. Their files rarely change, so they only need to do the expensive compression once, and they save a ton in bandwidth/storage, along with the fact that their users are more tolerant of slow downloads.
Is the network only doing an rsync? Then you are probably right.
For every other network, you should compress as you are likely dealing with multiple tenants that would all like a piece of your 40Gbps bandwidth.
In your logic, you should not compress as multiple tenants would all like a piece of your CPU.
This will always be something you have to determine for your own situation. At least at my work, CPU cores are plentiful, IO isn't. We rarely have apps that need more than a fraction of the CPU cores (barring garbage collection). Yet we are often serving fairly large chunks of data from those same apps.
Depends. Run a benchmark on your own hardware/network. ZFS uses in-flight compression because CPUs are generally faster than disks. That may or may not be the case for your setup.
What? Compression is absolutely essential throughout computing as a whole, especially as CPUs have gotten faster. If you have compressible data sent over the network (or even on disk / in RAM) there's a good chance you should be compressing it. Faster links have not undercut this reality in any significant way.
Whether or not to compress data before transfer is VERY situationally dependent. I have seen it go both ways, and the real-world results do not always match intuition. At the end of the day, if you care about performance, you still have to do proper testing.
(This is the same spiel I give whenever someone says swap on Linux is or is not always beneficial.)
or used --remove-source-files so they didn't have to ssh back to rm
He absolutely should be doing this, because by using rsync on a compressed file he's defeating the whole point of using rsync, which is the rolling-checksum-based algorithm that allows it to transfer only the diffs.
In DuckDB you can do the same but export to Parquet, this way the data is an order of magnitude smaller than using text-based SQL statements. It's faster to transfer and faster to load.
https://duckdb.org/docs/stable/sql/statements/export.html
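A sketch of that route with the DuckDB CLI (database and directory names are placeholders):

  # writes the schema plus one Parquet file per table into export_dir
  duckdb my_database.duckdb "EXPORT DATABASE 'export_dir' (FORMAT PARQUET);"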
you can do it with a command line like this:
In my test database this is about 20% smaller than the gzipped text SQL statements.
That's not it. This only exports the table's data, not the database. You lose the index, comments, schemas, partitioning, etc... The whole point of OP's article is how to export the indices in an efficient way.
You'd want to do this:
Also I wonder how big your test database is, and what its schema is. For large tables Parquet is way more efficient than a 20% reduction.
If there are UUIDs, they're 36 bytes each in text mode and 16 bytes as binary in Parquet. And then if they repeat, you can use a dictionary in your Parquet to store the 16 bytes only once.
It's also worth trying brotli instead of zstd if small files are your goal.
SQLite has a session extension, which will track changes to a set of tables and produce a changeset/patchset that can patch a previous version of an SQLite database.
https://www.sqlite.org/sessionintro.html
I have yet to see a single SQLite binding supporting this, so it’s quite useless unless you’re writing your application in C, or are open to patching the language binding.
In one of my projects I have implemented my own poor man’s session by writing all the statements and parameters into a separate database, then syncing that and replaying. Works well enough for a ~30GB database that changes by ~0.1% every day.
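A sketch of that idea; the table and column names here are made up:

  -- statement log kept in a separate database; the application appends a row
  -- alongside every write, then the log is synced and replayed on the replica
  CREATE TABLE IF NOT EXISTS change_log (
    id       INTEGER PRIMARY KEY,
    sql_text TEXT NOT NULL,   -- the statement that was executed
    params   TEXT NOT NULL    -- its bound parameters, e.g. JSON-encoded
  );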
Well, my upcoming Elixir wrapper of a Rust wrapper of SQLite (heh, I am aware how it sounds) will support it. I am pretty sure people do find it useful and would use it. If not, the 1-2 days of hobby coding to deliver it are not something I would weep over.
I have updated the Lua binding to support the session extension (http://lua.sqlite.org/home/timeline?r=session) and it's been integrated into the current version of cosmopolitan/redbean. This was partially done to support application-level sync of SQLite DBs, however this is still a work in progress.
There are at least two SQLite bindings for Go.
https://github.com/crawshaw/sqlite
https://github.com/eatonphil/gosqlite/
Ended up with the latter, but did have to add one function binding in C, to inspect changesets.
I'm open to adding it to my driver, if people consider it essential.
Every extra bit makes AOT compiling the Wasm slower (impacting startup time).
I also wanna keep the number of variants reasonable, or my repo blows up.
Add your votes for additional features to this issue: https://github.com/ncruces/go-sqlite3/issues/126
Have you used that? I've read the documentation but I don't think I've ever heard from anyone who uses the extension.
I have, at least to confirm it does what it says on the tin.
Idea for an offline-first app, where each app install can pull a changeset and apply it to its local db.
I really wish SQLite had some default way of doing change data capture via session or something similar.
Ignoring for the moment issues of syncing a database where the source DB may be running inserts/updates:
if one dumps tables as separate CSV files/streams and uses DuckDB to convert them to individual Parquet files, then rsync should run faster, since hopefully not every table is modified between syncs. There is an obvious overhead from the back-and-forth conversions, but DuckDB can directly export a database to SQLite. I have not tested it myself, so it is just brainstorming.
Last but not least: when compressing/decompressing text dumps use igzip or pigz if you want to speed things up. Also benchmark the compression levels.
If you're regularly syncing from an older version to a new version, you can likely optimize further using gzip with "--rsyncable" option. It will reduce the compression by ~1% but make it so differences from one version to the next are localized instead of cascading through the full length of the compression output.
Another alternative is to skip compression of the dump output, let rsync calculate the differences from a previous uncompressed dump to the current dump, then have rsync compress the change sets it sends over the network. (rsync -z)
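A sketch of the --rsyncable variant (paths and host are placeholders):

  # dump, compress with rsync-friendly block resets, then let rsync diff the .gz
  sqlite3 my_database.db .dump | gzip --rsyncable > dump.sql.gz
  rsync -av dump.sql.gz user@server:/backups/dump.sql.gz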
I think this could be a single pipeline?
ssh username@server "sqlite3 my_remote_database.db .dump | gzip -c" | gunzip -c | sqlite3 my_local_database.db
gzip/gunzip might also be redundant if using ssh compression with -oCompression=on or -C on the ssh call
My first thought, too. It also seems somewhat glaringly obvious that it needs a `pv` in there, as well.
Great example of Cunningham's Law: https://en.wikipedia.org/wiki/Ward_Cunningham#:~:text=%22Cun...
Wait... why would you even think about rsyncing a database that can get changed while being copied?
Isn't this a case for proper database servers with replication?
Or if it's an infrequent process done for dev purposes just shut down the application doing writes on the other side?
This reminds me of something I did back in 2008, we were using Postgres, not SQLite, and we needed to get a backup from one production machine onto several other machines. Copying saturated the network for an hour - And we needed to copy four times if I remember right. We didn't want to wait four hours.
I found this tool lying around called "udpcast" and used it to send the backup over the network just once, to all the destinations at the same time.
https://kylecordes.com/2008/multicast-your-db-backups-with-u...
Does the author not know that rsync can use compression (rsync -z | --compress | --compress-level=<n> ), or does he not think it worthwhile to compare that data point?
I just tried some comparisons (albeit with a fairly small sqlite file). The text compressed to only about 84% of the size of the compressed binary database, which isn't negligible, but not necessarily worth fussing over in every situation. (The binary compressed to 7.1%, so it's 84% relative to that).
bzip2 performed better on both formats; its compression of the binary database was better than gzip's compression of the text (91.5%), and bzip2's text was better than its binary (92.5%).
Though that is not available inside rsync, it indicates that if you're going with an external compression solution, maybe gzip isn't the best choice if you care about every percentage reduction.
If you don't care about every percentage reduction, maybe just rsync compression.
One thing worth mentioning is that if you are updating the file, rsync will only compress what is sent. To replicate that with the text solution, you will have to be retaining the text on both sides to do the update between them.
I am sure you can just pipe all this so you don't have to use an intermediate gunzipped file.
Just ssh the machine, dump the SQL and load it back into SQLite locally.
rsync will transmit only the delta between the source and destination.
Believe there are command-line flags necessary to make it do that though: --inplace --no-whole-file
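For example (a sketch; paths are placeholders):

  # --no-whole-file forces the delta algorithm, --inplace updates the target file directly
  rsync -av --inplace --no-whole-file my_database.db user@server:/backups/my_database.db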
I've seen a suggestion several times to compress the data before sending. If remote means in the same data center, there's a good chance compressing the data is just slowing you down. Not many machines can gzip/bzip2/7zip at better than the 1 gigabyte per second you can get from 10 Gbps networks.
One of the coolest things you can do with Postgresql is pipe pg_dump straight into psql connected to another cluster on another host.
I recently set up some scripts to do this and it wasn't quite as simple as I had hoped. I had to pass some extra flags to pg_restore for --no-owner and --no-acl, and then it still had issues when the target db had data in it, even with --clean and --create. And sometimes it would leave me in a state where it had dropped the database and then had trouble restoring, so I'd be left with nothing at all.
What I ended up doing is creating a new database, pg_restore'ing into that one with --no-owner and --no-acl, forcibly dropping the old database, and then renaming the new to the old one's name. This has the benefit of not leaving me high and dry should there be an issue with restoring.
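A sketch of that flow; the hosts, database names, and the custom dump format are assumptions:

  # dump in custom format so pg_restore can apply --no-owner/--no-acl
  pg_dump -Fc -h old-host -U app source_db > source_db.pgdump
  createdb -h new-host -U app app_db_new
  pg_restore -h new-host -U app --no-owner --no-acl -d app_db_new source_db.pgdump
  # swap the fresh database in only after the restore succeeded
  psql -h new-host -U app -d postgres -c "DROP DATABASE IF EXISTS app_db;"
  psql -h new-host -U app -d postgres -c "ALTER DATABASE app_db_new RENAME TO app_db;"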
SQLite provides the sqlite3_rsync command to safely copy databases: https://sqlite.org/rsync.html
I used to work at a company that had a management interface that used sqlite as its database; its multi-node / failover approach was also just... copying the file and rsyncing it. I did wonder about data integrity though, what if the file is edited while it's being copied over? But there's probably safeguards in place.
Anyway I don't think the database file size was really an issue, it was a relatively big schema but not many indices and performance wasn't a big consideration - hence why the backend would concatenate query results into an XML file, then pass it through an xml->json converter, causing 1-2 second response times on most requests. I worked on a rewrite using Go where requests were more like 10-15 milliseconds.
But, I still used sqlite because that was actually a pretty good solution for the problem at hand; relatively low concurrency (up to 10 active simultaneous users), no server-side dependencies or installation needed, etc.
SQLite has a write-ahead log (WAL). You can use Litestream on top of that. You get single RW, multiple readers (you lose the C in CAP), and can promote a reader when the writer fails.
>I did wonder about data integrity though, what if the file is edited while it's being copied over? But there's probably safeguards in place.
You could do a filesystem snapshot and copy from that, but neither cp nor rsync is atomic.
sqlite3 has a backup API for this, which you can invoke using the .backup command in the sqlite3 CLI.
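From the CLI that's roughly (file names are placeholders):

  # uses the online backup API, so it is safe to run against a live database
  sqlite3 live.db ".backup 'live-copy.db'"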
And then there is also https://www.sqlite.org/rsync.html
You can save time by using `zcat` instead of `cat` and skipping the `gunzip my_local_database.db.txt.gz` step.
Why would anyone use gzip instead of zstd in 2025? zstd is superior in every dimension.
gzip is a legacy algorithm that imo only gets used for compatibility with legacy software that understands nothing but gzip.
You could speed up by using pigz (parallel gzip) too.
If you're going to use a less universal tool for compression you might as well go with zstd.
You don’t need cat at all for the restore. Can simply do:
sqlite3 data/database.db < "{backup_file}"
It all depends what you’re optimizing for.
Clearly, when you don’t transfer indexes, you will not have to transfer as much data.
However, the tradeoff is that the database is going to have to do more work to regenerate those indexes when you reconstruct it from the text dump at the destination (as opposed to the case where the indexes were included)
That makes zero sense. Incremental backup via rsync/sqlite3_rsync should always be faster.
For incremental backups sure, but I think OP's solution would win for one-off snapshots.
In case someone is looking for a proper way of making a db copy in PostgreSQL rather than SQLite, pg_basebackup is the proper way: https://www.postgresql.org/docs/current/app-pgbasebackup.htm...
"Skip the index and turn it on later" is an old trick from mainframes and DB2.
He could’ve skipped the intermediate steps and made it a one-liner by piping the sql dump over ssh right into the sqlite3 reconstruction command.
???
Why not just compress the whole database using `gzip` or `lz4` before rsyncing it instead? `zstd` works too, but it seems like it had a bug regarding compressing files with modified content.
Better yet, split your sqlite file into smaller pieces. It's not like it needs to contain all the app data in a single sqlite file.
I have recently discovered a tool called mscp which opens multiple scp threads to copy down large files. It works great for speeding up these sorts of downloads.
https://github.com/upa/mscp
zstd would be a better choice. It’s bonkers fast (especially when used with multithreading) and still compresses better than gzip. Alternatively, I’d recommend looking into bzip3, but I’m not sure if it would save time.
I guess for me it is obvious that you don't try to copy a running DB, only a backup.
So I see basic stuff needs to be repeated as people still miss those kinds of things.
But I learned that you can easily dump SQLite to a text file - neat!
isn't this rather obvious? doesn't everyone do this when it makes sense? obviously, it applies to other DBs, and you don't even need to store the file (just a single ssh from dumper to remote undumper).
if retaining the snapshot file is of value, great.
I'd be a tiny bit surprised if rsync could recognize diffs in the dump, but it's certainly possible, assuming the dumper is "stable" (it probably is, because it's walking the tables as trees). The amount of change detected by rsync might actually be a useful thing to monitor.
How long does this procedure take in comparison to the network transfer?
My first try would've been to copy the db file first, gzip it and then transfer it, but I can't tell whether compression will be that useful on the binary format.
The sqlite file format (https://www.sqlite.org/fileformat.html) does not talk about compression, so I would wager unless you are storing already compressed content (media maybe?) or random numbers (encrypted data), it should compress reasonably well.
Native compression in sqlite is offered as a closed and licensed extension.
https://sqlite.org/com/cerod.html
I wonder if there's a way to export to parquet files? They are designed to be extremely compact.
DuckDB can read and write both SQLite and Parquet.
I'm curious how your indices are twice the data. Sounds like you just put indices on anything you see.
I definitely have databases like this.
It's not carelessness, it's performance.
Quite simply, I have a table with 4 columns -- A, B, C, D. Each column is just an 8-byte integer. It has hundreds of millions of rows. It has an index on B+C+D, an index on C+D, and one on D.
All of these are required because the user needs to be able to retrieve aggregate data based on range conditions around lots of combinations of the columns. Without all the indices, certain queries take a couple minutes. With them, each query takes milliseconds to a couple seconds.
I thought of every possible way to avoid having all three indices, but it just wasn't possible. It's just how performant data lookup works.
You shouldn't assume people are being careless with indices. Far too often I see the opposite.
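A sketch of the shape described above (names are made up):

  CREATE TABLE events (a INTEGER, b INTEGER, c INTEGER, d INTEGER);
  CREATE INDEX idx_bcd ON events (b, c, d);
  CREATE INDEX idx_cd  ON events (c, d);
  CREATE INDEX idx_d   ON events (d);
  -- each index stores its key columns plus the rowid, so with hundreds of
  -- millions of narrow rows the indexes can easily outweigh the table itself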
Sometimes using mmap for SQL queries can speed them up (if you want to avoid adding too many indexes or cannot modify the db).
Set this on the connection:
PRAGMA mmap_size=…
https://www.sqlite.org/mmap.html
I think this works because it removes data copying between the OS and your process.
Your process reads data from the OS to execute the query.
With mmap the data reads have less overhead, so a full table scan may be fast enough.
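For example (the size here is just an illustrative value):

  -- allow up to 1 GiB of the database file to be memory-mapped for reads
  PRAGMA mmap_size = 1073741824;
  -- querying it back shows the value that actually took effect (it may be capped)
  PRAGMA mmap_size;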
Hah they need to try harder then, I have seen more than 20x the data volume in systems where people are both paranoid and ignorant, a dangerous combo!
Doesn't this just push the runtime into index recomputation on the destination database?
Yes, however they seem to have a pretty slow internet connection
> Downloading a 250MB database from my web server takes about a minute over my home Internet connection
So for the original 3.4GB database that's nearly 15 minutes of waiting for the download.
I usually use scp for this case; sometimes the rsync versions are not compatible between the 2 machines.
`zfs snapshot` and `zfs send -i` is hard to beat for DB backup.
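A sketch, with pool/dataset names and snapshot labels as placeholders:

  # take a point-in-time snapshot, then send only blocks changed since the previous one
  zfs snapshot tank/db@2025-05-02
  zfs send -i tank/db@2025-05-01 tank/db@2025-05-02 | ssh backup-host zfs receive -F tank/backup/db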
How well does just the sqlite database gzip? The indexes are a lot of redundant data, so you're going to get some efficiencies there; probably less locality of data than the text file, though, so maybe less?
What technologies we have in 2025!
This is basically the way every other database is moved around.
Since sqlite is just a simple file-level-locking DB, I'm pretty shocked they don't have an option to let the indexes be stored in separate files, for all kinds of obvious and beneficial reasons, like the fact that you could easily exclude them from backups if they were, and you could make them "rebuild" just by deleting them. Probably their reason for keeping it all internal has to do with being sure indexes are never out of sync, but that could just as easily be accomplished with hashing algos.
I’ve been looking into a way to replicate a SQLite database and came across the LiteFS project by Fly.io. Seems like a solid drop-in solution backed by FUSE and Consul. Anybody used it in production? My use case is high availability between multiple VMs.
Nice!
Pretty good point. I just wonder if databases in general can be perfectly reconstructed from a text dump. For instance, do the insertion orders change in any of the operations between dumping and importing?
Very neat walkthrough, clear commands, and I appreciate the explanations as to why this may help in OP's case.
I'm surprised sqlite is duplicating data to make indexes? Surely it would just be manipulating groups of pointers?
it's a file - what am I missing? scp host:path .
The entire point of the article is to answer this specific question.
All this obsession with making processes like this faster
When is a guy supposed to get a coffee and stretch his legs anymore?