Lots of talk about must‑have features and backups here...
BUT there's another piece that makes or breaks these tools... whether they can build a community around them and stick around for years...
Open‑source cloud storage projects come and go when maintainers burn out... a sustainable business model or strong contributor base matters as much as technical checklists...
ALSO interoperability is underrated... if your drive can speak WebDAV or S3 and plug into existing identity systems, teams are more likely to try it...
In the end people want something that won't vanish after the honeymoon... that's harder than adding a progress bar...
Indeed. "S3 compatible" is the state of the art for object storage imho. As long as you can talk to a storage system that supports the basic S3 primitives, longevity is improved and there is no lock-in. You can use S3 proper, Wasabi, Backblaze B2, local storage exposing an S3 API, etc. Any replacement is mostly drop-in, assuming it can read, scan, and index existing objects.
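To make the "basic S3 primitives" point concrete, here is a minimal Python sketch. The `ObjectStore` interface, `MemoryStore`, and `migrate` are hypothetical names made up for illustration, not any project's real API; a real backend would wrap an actual S3 client behind the same four calls.

```python
from typing import Dict, Iterator, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal surface: the handful of S3-style primitives."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def delete(self, key: str) -> None: ...
    def list_keys(self, prefix: str = "") -> Iterator[str]: ...


class MemoryStore:
    """In-memory stand-in; a real backend would talk to S3, B2, Wasabi, etc."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)

    def list_keys(self, prefix: str = "") -> Iterator[str]:
        return iter(sorted(k for k in self._objects if k.startswith(prefix)))


def migrate(src: ObjectStore, dst: ObjectStore, prefix: str = "") -> int:
    """Copy every object under prefix from src to dst; return how many were copied."""
    copied = 0
    for key in src.list_keys(prefix):
        dst.put(key, src.get(key))
        copied += 1
    return copied
```

If the application only ever goes through something like `ObjectStore`, swapping providers comes down to one `migrate` call plus a config change, which is roughly what "no lock-in" means in practice.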
Seafile is the only good-enough thing I've found so far for self-hosted file sync. But it is still a pain to upgrade the server version. Nextcloud and friends are a complete disaster in my opinion.
> Nextcloud and friends are a complete disaster in my opinion.
Why is that?
I've been using Nextcloud in our company and for myself, and I couldn't be happier: no issues in 3 years, all the tools and plugins I need, and sync that runs perfectly, hassle-free and performant. I thought it was generally liked up until now. I didn't try any of the alternatives though, so they might indeed be better. Though I don't have any reason to try them tbh, as NC works almost too well.
Using Nextcloud on the web feels like a state of the art 2015 PHP web UI. It is... fine. But compare it to Immich for example and they're just not playing in the same league imo
100%. Though their UI has been updated a little with the last major release.
> But compare it to Immich for example and they're just not playing in the same league imo
I mean, this doesn't make sense at all, tbf. They're literally not in the same league, as they're targeting different use cases. Nextcloud offers a MUCH broader experience, while Immich has a very clear-cut focus and does nothing outside of that. Comparing them doesn't make any sense, except if you're actually talking about the UI exclusively. Then, yes, Immich feels much more modern and smooth.
There's a lot of weird setup often required on the backend in my experience, but when it works, it works well. Until you get everything dialed in, though, it can have weird issues that don't have a clear path to a fix.
It might be better in their weird AIO solution? But I don't like the idea of giving a docker container the ability to spawn more containers. I just use one of their normal docker containers and have had to manually change a lot to make it work as they actually suggest. Like just recently I set up their notify_push plugin as it improves performance, but the provided setup instructions didn't work in my setup and I had to manually tweak several things.
Nextcloud suffers from its flexibility. It's got a lot to offer but requires dialling in to your specific use case. The mistake most admins make is to assume you can just run it without tuning; it has too many differing options to do that smoothly out of the box.
The ability to just run it in a snap has really contributed to this imho, Nextcloud is enterprise software you just happen to be able to run in your homelab.
Resilio is also pretty good, depending on your use case. (Syncthing is great too, but Resilio seems faster and better at NAT traversal in my experience.)
A bit off-topic, but is there a way I can convince various apps (Viber, WhatsApp) to use some replacement instead of Google Drive for backup? They do not offer such an option, but maybe by rooting the phone and faking the interface, or ...?
Open source drive tools live or die on three things.
1) Simple sync that never surprises.
2) Clean conflict handling you can explain to a non tech friend.
3) And zero drama upgrades.
If Twake nails those and keeps a sane on prem story with S3 and LDAP, it has a shot. The harder part is trust and docs. Clear threat model. Crisp migration guides from Drive and Dropbox. And a tiny CLI that just works on a headless box. Do these and teams will try it for real work, not just weekend tests.
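One way to get conflict handling "you can explain to a non tech friend" is the common three-way rule: compare each side against the last version both sides agreed on, and keep both copies when both changed. A minimal Python sketch of that rule follows; the function names and the conflict-copy naming scheme are illustrative assumptions, not how Twake or any other tool actually does it.

```python
import hashlib
from typing import Optional


def digest(data: Optional[bytes]) -> Optional[str]:
    """Content hash, or None when the file does not exist on that side."""
    return None if data is None else hashlib.sha256(data).hexdigest()


def decide(base: Optional[bytes], local: Optional[bytes], remote: Optional[bytes]) -> str:
    """Return one of 'in_sync', 'pull', 'push', 'conflict'.

    base is the content as of the last successful sync.
    """
    b, l, r = digest(base), digest(local), digest(remote)
    if l == r:
        return "in_sync"   # both sides already agree
    if l == b:
        return "pull"      # only the remote changed
    if r == b:
        return "push"      # only the local copy changed
    return "conflict"      # both changed differently: keep both copies


def conflict_name(filename: str, device: str) -> str:
    """Name for the extra copy kept on conflict (hypothetical naming scheme)."""
    stem, dot, ext = filename.rpartition(".")
    if not dot or not stem:
        # no extension, or a dotfile like ".bashrc": just append the suffix
        return f"{filename} (conflict from {device})"
    return f"{stem} (conflict from {device}).{ext}"
```

The explanation for the non-tech friend is then one sentence: "if you both edited it, you get both files and nothing is thrown away."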
I'd add a fourth; "Make it easy to do backups and verify they're correct".
I don't think I've ever considered a data store without that being one of my top concerns. This anxiety comes from real-life experience where the business I worked at had backups enabled for the primary data store for years, but when something finally happened and we lost some production data, we quickly discovered that the backups weren't actually possible to restore from, and had been corrupted this whole time.
Heh - I once made a little chunk of change because a former client from 10 years earlier discovered the shiny "DVD/CD" backups had succumbed to "bit-rot" and needed some source code.
I grabbed the hard-drive off the shelf, put it in an enclosure and handed them the source-code... (At the time, every time I upgraded my system, I would just keep my old drives, so... had a stack of them - buy a new external enclosure, slot it and park it.)
Depends. Even something basic like "Check if the produced artifact is a valid .zip/.tar.gz" can be enough in the beginning, probably would have prevented the issue I shared before.
Then once you grow/need higher reliability, you can start adding more advanced checks, like it has the tables/data structures you expect and so on.
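The basic "is the artifact even a valid .zip/.tar.gz" check fits in a few lines of standard-library Python. This is a sketch under the assumption that the backup is a single archive file; `archive_is_readable` is a made-up name, and passing this check only proves the archive can be read end to end, not that it contains the right data.

```python
import tarfile
import zipfile
import zlib
from pathlib import Path


def archive_is_readable(path: Path) -> bool:
    """Return True if every member of a .zip or .tar(.gz) archive can be read in full."""
    try:
        if zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as zf:
                # testzip() returns the name of the first corrupt member, or None
                return zf.testzip() is None
        if tarfile.is_tarfile(path):
            with tarfile.open(path) as tf:  # default mode detects compression
                for member in tf:
                    if member.isfile():
                        fh = tf.extractfile(member)
                        # reading to the end forces full decompression
                        while fh.read(1 << 20):
                            pass
            return True
    except (OSError, EOFError, zipfile.BadZipFile, tarfile.TarError, zlib.error):
        return False
    return False  # neither a zip nor a tar archive
```

Run it right after the backup job and fail loudly on `False`; the more advanced checks (expected tables, row counts, a real restore into a scratch instance) can be layered on later.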
"Great, first you wanted more money to buy compute and storage for dev and staging separate from production, and now you even more for 'testing backups'?!"
I’m not sure what your point is. Business continuity requires a disaster recovery plan that must be tested regularly. It might be considered slog work, but like taking out the garbage, it’s non negotiable and must be done.
I had a funny one where I somewhat regularly tested an SQL backup, then one day it didn't work. It worked the second time, the 3rd and the 4th. I have no idea why it didn't work. It turned into a permanent background process in the back of my head. The endless what-if loop.
I'd like a manual "sync now" option. Sometimes I put stuff in google drive using windows explorer and it's not immediately obvious if it is syncing, why it is or isn't, or what I need to do to make it.
I've got a theory that progress bars for main functionality tasks and the associated manual triggers in modern software are out of favor, as it creates a stage for an error to be displayed and creates expectations the customer can lean on. Less detail in errors displayed to the customer removes their ability to identify a software problem as unique or shared among others.
I think you're right and I think I insufficiently considered malice as the reason for a lot of this type of minimalism. This "SWW" message is great as it doesn't even give a hint as to whether the problem is with the server (all vendor's fault), the network (not vendor's fault), or a client fault (maybe vendor's fault, maybe customer just needs to update it). Users can just do brute force things like "Swipe up the app and open it up again" and eventually just give up.
Syncing should be in the control of users. Users should be able to trigger or abort the sync. It should also provide some sort of progress indicator.
I desperately want to be a fan of ownCloud, because it offers clients natively across Mac/Linux/mobile, but it’s such a mess. Every platform has small bugs and reliability problems that make the whole thing useless.
If you just need a web interface to your filesystem, there’s this single Go executable (https://github.com/filebrowser/filebrowser) that supports sharing and minimal user management.
Snap isn't the best way to run Nextcloud in my experience: fine for a demo or a single-user instance that isn't mission critical. Users who expect more out of it will often bump up against its limitations.
Anyone who wants to seriously use Nextcloud should look into the AIO docker containers or rolling the individual containers themselves. Nextcloud has expanded into a full groupware stack and it's expected you have an actual admin managing the system, like with any real deployment of enterprise software.
He complained about the difficulty of installing an application. He didn’t complain about establishing a personal data center.
That one line will give you the Nextcloud. Exactly one more line in snap will give you a self sign cert. Alternatively, the line below will give you remote access, a domain, and a valid certificate for your application:
It seems reasonable that someone would want to go beyond just installing software; they are presumably doing so in order to use it for its purpose. Being pedantic about the nature of the complaint (i.e. "He complained about the difficulty of installing an application. He didn’t complain about...") seems to miss the point. All of the additional steps you lay out also have their own steps to get done or decisions to be made, and when it is all said and done, it seems reasonable to imagine that things could get quite complicated.
I mean if you want a working Nextcloud instance, available through VPN with backups, then no, it doesn't get more complicated than that, actually. It is incredibly easy.
58.9% TypeScript and 32.6% JavaScript wouldn't be my first preference to implement such a high performance and throughput demanding application? Why is that?
+1 for Syncthing. I've been running it for years, after my student discount for Dropbox expired (Google drive and OneDrive were just getting traction at the time).
The mobile experience last I tried was pretty rough though. I don't really need my files on my phone and I have a web interface on my home server I can use to grab them in a pinch, but it's something to keep in mind.
Do you really need a database for this? On a unix system, you should be able to: CRUD users, CRUD files and directories, grant permissions to files or directories
Is there a decade-old software that provides a UI or an API wrapper around these features for a "Google Drive" alternative? Maybe over the SAMBA protocol?
How would you implement things like version history or shareable URLs to files without a database?
Another issue would be permissions: if I wanted to restrict access to a file to a subset of users, I’d have to make a group for that subset. Linux limits a user to 65536 supplementary groups, which could quickly be exhausted for a nontrivial number of users.
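On the shareable-URL half of that question, one database-free option is a signed token: the server puts the path and an expiry into the link and authenticates it with an HMAC. The sketch below is only an illustration; `SECRET`, the token layout, and the function names are assumptions, and the obvious limitation is that a single link cannot be revoked without storing some state somewhere.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"replace-me"  # hypothetical server-side secret, never sent to clients


def _sign(payload: bytes) -> str:
    mac = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(mac).decode().rstrip("=")


def make_share_token(path: str, expires_at: int) -> str:
    """Build '<payload>.<signature>' for a path and a Unix expiry time."""
    payload = f"{expires_at}:{path}".encode()
    body = base64.urlsafe_b64encode(payload).decode().rstrip("=")
    return f"{body}.{_sign(payload)}"


def resolve_share_token(token: str, now: Optional[int] = None) -> Optional[str]:
    """Return the shared path if the token is authentic and unexpired, else None."""
    try:
        body, sig = token.split(".")
        payload = base64.urlsafe_b64decode(body + "=" * (-len(body) % 4))
        expires, path = payload.decode().split(":", 1)
        expires_at = int(expires)
    except (ValueError, UnicodeDecodeError):
        return None  # malformed token
    # constant-time comparison of the presented and expected signatures
    if not hmac.compare_digest(sig.encode(), _sign(payload).encode()):
        return None
    if (time.time() if now is None else now) >= expires_at:
        return None
    return path
```

Version history and per-user access lists are a different story; those are exactly the kind of metadata where the "why not just use a database" argument gets stronger.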
> Filesystem or LVM snapshots immediately come to mind
I use ZFS snapshots and like them a lot for many reasons. But I don’t have any way to quickly see individual versions of a file without having to wade through a lot of snapshots where the file is the same because snapshots are at filesystem level (or more specifically in ZFS, at “dataset” level which is somewhat like a partition).
And also, because I snapshot at set intervals, there might be a version of a file that I wanted to go back to but which I don’t have a snapshot of at that exact moment. So I only have history of what the file was a bit earlier or a bit later than some specific moment.
I used to have snapshots automatically trigger every 2 minutes and snapshot cleanup automatically trigger hourly, daily, weekly and monthly. In that setup there was a fairly high chance that if I made a mistake while editing a file, I also had a version from right before the edit, as long as I discovered the mistake right away.
These days I snapshot automatically a couple of times per day and cleanup every few months with a few keystrokes. Mainly because at the moment the files I store on the servers don’t need that fine-grained snapshots.
Anyway, the point is that even if you snapshot frequently it’s not going to be particularly ergonomic to find the version you want. So maybe the “Google Drive” UI would also have to check each revision to see if they were actually modified and only show those that were. And even then it might not be the greatest experience.
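The "only show revisions that actually changed" idea can be sketched in Python, assuming the snapshots are browsable as plain directories (ZFS exposes them under `.zfs/snapshot/<name>/` in the dataset's mountpoint). `distinct_versions` is a hypothetical helper, and it expects the snapshot directories ordered oldest to newest.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List, Tuple


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def distinct_versions(snapshot_roots: Iterable[Path], relative: str) -> List[Tuple[str, Path]]:
    """Return (snapshot name, file path) for each snapshot whose copy of the
    file differs from the previous snapshot that contained it."""
    versions: List[Tuple[str, Path]] = []
    previous = None
    for root in snapshot_roots:
        candidate = root / relative
        if not candidate.is_file():
            continue  # file did not exist in this snapshot
        current = file_digest(candidate)
        if current != previous:
            versions.append((root.name, candidate))
            previous = current
    return versions
```

Hashing every copy is the slow but simple way; comparing size and modification time first would skip most of the work, at the cost of trusting timestamps.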
If you are on Windows with a Samba share hooked up to ZFS, you can actually use "Previous Versions" in File Explorer for a given folder and your snapshots will show up :) There are some guides online on setting it up.
With no command line use needed, you can:
Navigate the entire filesystem,
Create, delete, and rename files,
Edit file contents,
Edit file ownership and permissions,
Create symbolic links to files and directories,
Reorganize files through cut, copy, and paste,
Upload files by dragging and dropping,
Download files and directories.
I have no idea how this project was designed, but a) it's to be expected that disk operations can and should be cached, b) syncing file shares across multiple nodes can easily involve storing metadata.
For either case, once you realize you need to persist data then you'd be hard pressed to justify not using a database.
I don't know of one- have thought this before but with python and fsspec. Having a google drive style interface that can run on local files, or any filesystem of your choice (ssh, s3 etc) would be really great.
... well, it makes sense to be able to do a "join" with the `users` and `documents` collections, use the full expressive range of an aggregation pipeline (and it's easy to add additional indices to MongoDB collections, and have transactions, and even add replication - not easy with a generic filesystem)
put all kinds of versioned metadata on docs without coming up with strange encodings, and even though POSIX (and NodeJS) offers a lot of FS related features it probably makes sense to keep things reeeeally simple
With SAMBA you just get boring old authentication, but with SCP you need to file a Form-72B with Site Command, ensure all new users pass a Class-3 memetic hazard screening, and then hope that the account doesn't escape containment and start replicating across subnets.
Sure, it's more overhead, but you can't put a price on preventing your NAS from developing sentience.
I do, password-protected of course. It is the only "native" way I found to get server files access to my iPhone without downloading a third party app (via Files).
I really hope you lock it down to something like Tailscale so that you have a private area network and your Samba share isn’t open to the entire world.
Samba is a complicated piece of software built around protocols from the 90s. It’s designed around the old idea of physical network security where it’s isolated on a LAN and has a long long history of serious critical security vulnerabilities (eg here’s an RCE from this month https://cybersecuritynews.com/critical-samba-rce-vulnerabili...).
It seems like every network filesystem is irredeemably terrible. SMB and NFS are the stuff of security nightmares, chatty performance issues, and awkward user ID mapping. WebDAV is a joke. SSHFS is slow. You can get really crazy with CephFS or GlusterFS, and for all that complexity, you don't get much farther away from the SMB/NFS issues with those either.
Well one problem is that filesystem in general is a terrible abstraction both in terms of usability and in terms of not fitting well with how you design network applications.
I’d say Dropbox et al. are closer to a good design, but their backend is insanely optimized to make it work, and proprietary. There’s an added challenge that everything these days is behind a NAT, so you usually end up needing a central rendezvous server where nodes can find each other.
Since you’re looking at rsync where you want something closer to Dropbox, I’d say look at Syncthing. It’s designed in a way to make personal file sharing secure.
A reminder: the days when every tenant of a service (be it files, email, whatever else) automatically got an OS user account are also decades behind us.
I'm unironically convinced that a basic Samba share with Active Directory ACLs is actually probably the best possible storage system...but the UI for managing permissions sucks, and most people don't have enough access to set it up the way they want.
Like broadly, for all configuration Hashicorp Vault makes you do, you can achieve a much more useful set of permissions with a Samba fileshare and ACLs (certainly it makes it easy to grant targeted access to specific resources - and with IIS and Kerberos you even have an HTTP API).
I built something similar years ago. These are terribly hard to build, so I did a bit of digging.
1: This appears to be backed by a French company called Linagora. I don't know much about the company, but they've been around for a bit.
2: I experimented with MongoDB for the similar product, and it turned out to be very unreliable. A lot can have changed since I used MongoDB, but in general, I'm wary of any product that uses it unless there's an expectation that data is lossy.
(Which was the problem MongoDB had at the time: Their CTO only wanted to target lossy data use cases, but the people interested in using MongoDB wanted a database that was easier to use than SQL.)
I’ve had similar warnings from multiple very senior devs to never go near mongo. So better explain that choice if you’re wanting adoption. Reliability was the concern.
At the time (2010), MongoDB was intended (from the creators) for handling high volumes of data where some loss was tolerable.
What happened was that its document model, and flexible index model, made it very attractive as an easy-to-use database. I used to call it the "Visual Basic" of databases.
I think the less technical people in marketing latched on to how a lot of people found MongoDB easier to work with, and there was a lot of selling to people who it shouldn't have been sold to.
The problem was that the lossy nature of MongoDB didn't rear its ugly head until deep in a project, and the assumptions made when writing documents led to situations where operations required changing multiple documents, or to other corner cases that triggered loss in larger schemas.
Of course, if you used MongoDB as intended, which was for ingesting lots of data with some tolerance of loss, you were totally fine.
With so much surveillance I think there's a real need for E2E on anything. I just bought the basic Tutanota package - but maybe that's just my OCD acting out.
I thought the same once, but apparently some of my friends literally do not own a PC. Only tablets or phones, no USB-A in the house except maybe in TV. Oh well, time for USB-C pendrives.
Surely you jest. I love USB sticks. But they are not a proper alternative to cloud storage. For example, how do I share select files/folders with select people in other countries?
Until you lose it, break it, damage it accidentally (via high humidity, high heat, etc). Arguably, if you run Twake on some VPS, you have additional layers of redundancy by default.
> Looks like there is a single commit where a majority of the code came from.
I do this all the time, right before open sourcing a project. Basically while it's private, commit quality can be a bit rough, and if I want to open source it, I'll remove .git, make a new init commit then open source it. No one needs to see what I do in my private abode :)
The history of the development since its beginning can help a lot in studying the code, so I encourage people to avoid the single commit as much as possible.
It's much better to refactor (rebase) the messy commits, removing the personal or embarrassing stuff; although that might result in a "false" history, a series of smaller-sized commits will usually be much easier to follow than reading a whole code base all at once.
Really, I see a ton of open-source projects that do this, and it results in a lot more opacity and friction than necessary.
It results in fewer people being able to check the code and contribute to the project.
I promise you're not missing much, except some commits that are implementing something, reverting it, implementing it again slightly differently, fixing typos, replacing 80% of the codebase in one swoop and similar stupid and un-needed stuff.
If the project is from the get-go supposed to be a long-lived project (like professional development for a business) then I agree, don't smoke the entire history no matter how embarrassing it is.
But for my personal projects, I can let you know that having access to the git history before I made it FOSS will make you dumber rather than being helpful for anything, compared to one clean starting commit.
Why do you think it's embarrassing? The result is what reasonable people judge. And if you get to it through trial and error, well, that's how it's done almost every time. It's normal.
I don't? I said I remove it because it isn't useful to anyone, might even be adding more confusion than it solves, not because I'm embarrassed over anything.
If it really isn't useful, which I imagine means you committed somewhat haphazardly, ok, of course.
If there might be some usefulness hidden there (for example, trying something and then reverting it shows that you did explore it), it's also possible to place the old stuff in another repository or another branch (better the latter, unless it increases the repository's size too much)
> for example, trying something and then reverting it shows that you did explore it
True, those things tend to go into the documentation, checked into the codebase itself instead of being somewhat hidden inside the git history. Usually I end up having both an "Open Problems" (things yet to solve) and a "Tried X, this is why it didn't work" section somewhere in the documentation.
> it's also possible to place the old stuff in another repository
Yes, before the process I initially described, I usually leave a copy intact with the full-full history, but that's not what I published, just kept as an archive.
They were originally working on an MS Teams replacement, with a bunch of things in one app like Teams. (I tried it back then, it was pretty green.) Now it looks like they are focused on drive, chat and email. The old app seems deprecated, so I presume they forked it into some of this new stuff.
> If you have a US startup called X and you don't have x.com, you should probably change your name.
But they do own https://twake-drive.com/ already? What exactly is your point here? Either you misunderstand the linked article, or I do. But it seems people would be able to find that just fine if they search for it, as twake-drive.com comes up as the first result when I search for "Twake Drive".
Besides, Graham's articles are almost always geared towards startups in one way or another. This doesn't seem to be that, so not sure I'd even try to read it if I was the owner of Twake Drive.
The name is hard to convey. Try telling someone verbally how to find it without error: "Twake. No, not take - like Wake with a T, Twayke. T double you ay kay ee. Oh, and there's a hyphen in the domain. T-Wake hyphen Drive dot com."
Re: should they read it? Either you want your product to spread, or you don't.
If you're posting it on HN, you want to share it, and for it to be shared. A tough name makes it harder to share, so you have to decide if you really want your product to spread or not.
Yeah - Twake is a terrible name. Though, tbf, I wonder what the use case is for an open source cloud drive outside of pretty niche situations, esp. when the cost, in many cases, is partly the infrastructure.
It's really not clear: they seem to show a mobile app (https://static.tildacdn.com/tild3536-3661-4363-b433-35353561...) but there are no links to app stores anywhere. Seems like they ended up on HN too early; maybe we should give them some time to get their stuff together.
Given how integrated Drive and Docs are, if this doesn't have docs-like collaborative realtime document editing, for many people this is like "30% of Google Drive"
For people whose UX is dragging and dropping stuff to browser, and/or using a desktop sync client only, sure why not, the UI looks clean and familiar. But as someone who has used and still uses like 3 different similar things concurrently, the only real reason I use drive is because of the seamless zero-dependency office-like web software being part of the product.
(yes I know it's a curse too, I ended up writing a piece of software just to migrate company drive stuff to my personal drive when a company I was a cofounder in went bust to have a record ... those google docs can really only exist in Drive natively, any export is an immediate downgrade)
since it's I/O heavy an async web-oriented stack (ie. NodeJS) makes sense, and then TS is an obvious improvement over raw JS, and if the frontend is also JS/TS then at least there's some chance that expertise can be shared
The problem is such systems are also CPU heavy, with extensive hashing, encryption, and really quite a lot of general paperwork, and as such, a system that can efficiently use multiple CPUs is really important. I guarantee that plenty of Twake installs are absolutely spending a ton of time blocked on CPU, both because of the lack of multithreading, and the general 10x-slower-than-C you can expect from JavaScript on general code.
JavaScript was a poor choice that will hold the project back, just as choosing PHP for the base has done and continues to do a lot of damage to Nextcloud/ownCloud. This is not a task for a scripting language, because they're disqualified on performance. It's also not a task for dynamic typing, and using TypeScript can help with that, but it doesn't change the fact that JavaScript is just generally slow and does not play well on multiple CPUs.
This soundbite really needs to go away. It and its counterexamples don't apply in any significant measure. You can pay and still be the product, and that is often the case.
Edit: @n3t heard wrt to the turn of phrase
https://en.wikipedia.org/wiki/State_of_the_art
Running Nextcloud AIO has been reliable for me for a couple of years now.
On Android isn't it "just" a share-target? You can make a PWA that's a share-target pretty easily.
It seems that Twake is the result of Cozy Cloud joining Linagora: https://blog.cozy.io/en/from-7-july-your-cozy-cloud-begins-i...
Schrödinger's backup. Testing the backup works involves even more engineering and non creative work.
Depends. Even something basic like "Check if the produced artifact is a valid .zip/.tar.gz" can be enough in the beginning, probably would have prevented the issue I shared before.
Then once you grow/need higher reliability, you can start adding more advanced checks, like it has the tables/data structures you expect and so on.
"Great, first you wanted more money to buy compute and storage for dev and staging separate from production, and now you want even more for 'testing backups'?!"
I’m not sure what your point is. Business continuity requires a disaster recovery plan that must be tested regularly. It might be considered slog work, but like taking out the garbage, it’s non negotiable and must be done.
I had a funny one where I somewhat regularly tested an SQL backup, then one day it didn't work. It worked the second time, the third and the fourth. I have no idea why it didn't work. It turned into a permanent background process in the back of my head. The endless what-if loop.
I'd like a manual "sync now" option. Sometimes I put stuff in google drive using windows explorer and it's not immediately obvious if it is syncing, why it is or isn't, or what I need to do to make it.
I've got a theory that progress bars for main functionality tasks and the associated manual triggers in modern software are out of favor, as it creates a stage for an error to be displayed and creates expectations the customer can lean on. Less detail in errors displayed to the customer removes their ability to identify a software problem as unique or shared among others.
"Something went wrong!"
I think you're right and I think I insufficiently considered malice as the reason for a lot of this type of minimalism. This "SWW" message is great as it doesn't even give a hint as to whether the problem is with the server (all vendor's fault), the network (not vendor's fault), or a client fault (maybe vendor's fault, maybe customer just needs to update it). Users can just do brute force things like "Swipe up the app and open it up again" and eventually just give up.
Syncing should be in the control of users. Users should be able to trigger or abort a sync, and it should provide some sort of progress indicator.
As others have asked, how does it compare with Nextcloud and ownCloud? And does it have native clients for the usual suspects? Windows/Mac/Mobile...
I desperately want to be a fan of ownCloud, because it offers clients natively across Mac/Linux/mobile, but it’s such a mess. Every platform has small bugs and reliability problems that make the whole thing useless.
I tried to install nextcloud once, and it was an exercise in misery.
If you just need a web interface to your filesystem, there’s this single Go executable (https://github.com/filebrowser/filebrowser) that supports sharing and minimal user management.
I couldn't get past installing required PHP extensions, as my hosting provider doesn't allow for that.
Overall it's no WordPress instance that works everywhere.
sudo snap install nextcloud
That’s all!
Auto updates and I can bet it will not break.
In my experience, Snap isn't the best way to run Nextcloud: fine for a demo or a single-user instance that isn't mission critical, but users who expect more out of it will often bump up against its limitations.
Anyone who wants to seriously use Nextcloud should look into the AIO docker containers or rolling the individual containers themselves. Nextcloud has expanded into a full groupware stack and it's expected you have an actual admin managing the system like with any real deployment of enterprise software
Is this a joke?
There's lots more to hosting your own file share/sync tool than just standing it up.
No, it was serious!
He complained about the difficulty of installing an application. He didn’t complain about establishing a personal data center.
That one line will give you Nextcloud. Exactly one more line in snap will give you a self-signed cert. Alternatively, the line below will give you remote access, a domain, and a valid certificate for your application:
curl -fsSL https://tailscale.com/install.sh | sh
You will have a functioning personal Drive on a VPS or a computer at this point!
Toggle snapshots on VPS for backups.
Setting up services with public clouds also takes some steps.
It seems reasonable that someone would want to go beyond just installing software; they are presumably doing so in order to use it for its purpose. Being pedantic about the nature of the complaint (i.e. "He complained about the difficulty of installing an application. He didn’t complain about...") seems to miss the point. All of the additional steps you lay out also have their own steps to get done or decisions to be made, and when it is all said and done, it seems reasonable to imagine that things could get quite complicated.
I mean if you want a working Nextcloud instance, available through VPN with backups, then no, it doesn't get more complicated than that, actually. It is incredibly easy.
When hand-waving away complexity, then yes, everything looks easy. :)
IME NextCloud is a bloated PHP monster with poor performance. Twake seems to be leaner and have a narrower scope.
58.9% TypeScript and 32.6% JavaScript wouldn't be my first choice for implementing such a performance- and throughput-demanding application. Why were they chosen?
> 58.9% TypeScript and 32.6% JavaScript
Isn't that just 91.5% JavaScript?
TypeScript is not real.
Almost, but not entirely, unlike birds
It appears that the backend is written in TS, while the frontend in JS.
Personally I separate church and state by writing tests in JS and application code in TS.
If you're asking why these languages at all when this and that other language is faster, most likely it's less of a bottleneck than estimated.
Maybe ask all the startups looking to scale their TS\JS microservices "stack" using event driven architecture.
Give syncthing a go.
+1 for Syncthing. I've been running it for years, after my student discount for Dropbox expired (Google drive and OneDrive were just getting traction at the time).
The mobile experience last I tried was pretty rough though. I don't really need my files on my phone and I have a web interface on my home server I can use to grab them in a pinch, but it's something to keep in mind.
If you’re on iOS, try my (FOSS) app for Syncthing: https://github.com/pixelspark/sushitrain
Syncthing is easily the most effective FOSS I actively use. It just works and runs on everything.
Do you really need a database for this? On a Unix system, you should be able to CRUD users, CRUD files and directories, and grant permissions to files or directories.
Is there any decade-old software that provides a UI or an API wrapper around these features for a "Google Drive" alternative? Maybe over the SMB protocol (Samba)?
How would you implement things like version history or shareable URLs to files without a database?
Another issue would be permissions: if I wanted to restrict access to a file to a subset of users, I’d have to make a group for that subset. Linux supports a maximum of 65536 groups, which could quickly be exhausted for a nontrivial number of users.
As for the permissions, using ACLs would work better here. Then you don't need a separate group for every grouping.
TIL about ACLs! I think that would nicely solve the group permission issue.
The final project for my senior year filesystems class thirty years ago was to implement ACLs on top of a SunOS 4 filesystem. That was a fun project.
Write up? Code? :D
Then let me also introduce you to extended attributes, aka xattrs. That's how the data for SELinux is stored.
Back up files the way Emacs, Vim, etc. do it: a consistent scheme for naming the copies. As for shareable URLs, they could be links.
The file system is already a database.
OK, so this product will be for projects with fewer than 65k users.
For naming, just name the directory the same way on your file system.
Shareable URLs can be a hash of the path with some kind of HMAC to prevent scraping.
And yes, if you move a file, you can create a symlink to preserve the link.
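One way to read the HMAC suggestion, sketched under assumptions: the server signs the path together with an expiry timestamp, so a link can be checked (and can expire) without any stored state. The `/share/` URL layout, the `SECRET_KEY` value and the function names are all invented for this example. It does not address renames, and a single link cannot be revoked without either rotating the key or keeping some state.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import quote

# Hypothetical server-side secret; in practice, load it from configuration.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def _sign(path: str, expires: int) -> str:
    """HMAC-SHA256 over the path and expiry, so neither can be altered."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_share_link(path: str, ttl_seconds: int, now: Optional[int] = None) -> str:
    """Build a link that stays valid for ttl_seconds (URL shape is made up)."""
    issued = int(time.time()) if now is None else now
    expires = issued + ttl_seconds
    return f"/share/{quote(path)}?expires={expires}&sig={_sign(path, expires)}"


def verify_share_link(path: str, expires: int, sig: str,
                      now: Optional[int] = None) -> bool:
    """Accept the link only if it has not expired and the signature matches."""
    current = int(time.time()) if now is None else now
    if current > expires:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sig.encode(), _sign(path, expires).encode())
```

Because the signature covers both the path and the expiry, a visitor cannot extend a link's lifetime or point it at a different file without knowing the key.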
Encode paths by algorithm/encryption?
This wouldn’t be robust to moving/renaming files. It also would preclude features like having an expiration date for the URL.
Use a symlink in that case to keep the redirect.
> How would you implement things like version history
Filesystem or LVM snapshots immediately come to mind
> or shareable URLs to files without a database?
Uh... is the path to the file not already a URL? URLs are literally an abstraction of a filesystem hierarchy already.
> Filesystem or LVM snapshots immediately come to mind
I use ZFS snapshots and like them a lot for many reasons. But I don’t have any way to quickly see individual versions of a file without having to wade through a lot of snapshots where the file is the same because snapshots are at filesystem level (or more specifically in ZFS, at “dataset” level which is somewhat like a partition).
And also, because I snapshot at set intervals, there might be a version of a file that I wanted to go back to but which I don’t have a snapshot of at that exact moment. So I only have history of what the file was a bit earlier or a bit later than some specific moment.
I used to have snapshots trigger automatically every 2 minutes, with snapshot cleanup triggering hourly, daily, weekly and monthly. In that setup there was a fairly high chance that if I made a mistake while editing a file, I also had a version from right before the edit, as long as I discovered the mistake right away.
These days I snapshot automatically a couple of times per day and clean up every few months with a few keystrokes, mainly because the files I store on the servers at the moment don’t need such fine-grained snapshots.
Anyway, the point is that even if you snapshot frequently, it’s not going to be particularly ergonomic to find the version you want. So maybe the “Google Drive” UI would also have to check each revision to see whether the file was actually modified and only show those where it was. And even then it might not be the greatest experience.
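A rough sketch of that "only show revisions that actually changed" idea follows. It assumes snapshots are visible as sibling directories (ZFS exposes them under `.zfs/snapshot/<name>/` in the dataset's mountpoint) and, as a further assumption, that snapshot names sort chronologically, for example because they embed a timestamp. `distinct_versions` is a made-up helper name.

```python
import hashlib
from pathlib import Path
from typing import List, Optional, Tuple


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks to keep memory use flat."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def distinct_versions(snapshot_root: Path, relative_path: str) -> List[Tuple[str, Path]]:
    """List (snapshot name, file path) for snapshots where the file changed.

    Assumes directory names under snapshot_root sort in chronological order.
    """
    versions: List[Tuple[str, Path]] = []
    previous: Optional[str] = None
    for snap in sorted(p for p in snapshot_root.iterdir() if p.is_dir()):
        candidate = snap / relative_path
        current = file_digest(candidate) if candidate.is_file() else None
        # Keep a snapshot only when the content differs from the one before;
        # snapshots where the file is missing are skipped but reset the chain.
        if current is not None and current != previous:
            versions.append((snap.name, candidate))
        previous = current
    return versions
```

Hashing every copy is expensive on large files; a real implementation would probably compare size and modification time first and hash only when those differ.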
If you are on Windows with a Samba share hooked up to ZFS, you can actually use "Previous Versions" in File Explorer for a given folder and your snapshots will show up :) There are some guides online on setting it up.
Take a look at "cockpit", because if there were, that's where it "should" be.
https://cockpit-project.org/applications
--
> Do you really need a database for this?
I have no idea how this project was designed, but a) it's to be expected that disk operations can and should be cached, b) syncing file shares across multiple nodes can easily involve storing metadata.
For either case, once you realize you need to persist data then you'd be hard pressed to justify not using a database.
I don't know of one. I have thought about this before, but with Python and fsspec. Having a Google Drive style interface that can run on local files, or any filesystem of your choice (SSH, S3, etc.), would be really great.
... well, it makes sense to be able to do a "join" with the `users` and `documents` collections, use the full expressive range of an aggregation pipeline (and it's easy to add additional indices to MongoDB collections, and have transactions, and even add replication - not easy with a generic filesystem)
put all kinds of versioned metadata on docs without coming up with strange encodings, and even though POSIX (and NodeJS) offers a lot of FS related features it probably makes sense to keep things reeeeally simple
and it's easy to hack on this even on Windows
Perhaps they are using MongoDB GridFS instead of storing files on disk.
An SCP or FTP client maybe?
Definitely. Though SAMBA supports authentication natively. With SCP and SFTP you'll need another admin server to create users.
With SAMBA you just get boring old authentication, but with SCP you need to file a Form-72B with Site Command, ensure all new users pass a Class-3 memetic hazard screening, and then hope that the account doesn't escape containment and start replicating across subnets.
Sure, it's more overhead, but you can't put a price on preventing your NAS from developing sentience.
You expose SAMBA shares outside your home network?
I do, password-protected of course. It is the only "native" way I found to get access to server files from my iPhone without downloading a third party app (via Files).
I really hope you lock it down to something like Tailscale so that you have a private area network and your Samba share isn’t open to the entire world.
Samba is a complicated piece of software built around protocols from the 90s. It’s designed around the old idea of physical network security where it’s isolated on a LAN and has a long long history of serious critical security vulnerabilities (eg here’s an RCE from this month https://cybersecuritynews.com/critical-samba-rce-vulnerabili...).
It seems like every network filesystem is irredeemably terrible. SMB and NFS the stuff of security nightmares, chatty performance issues, and awkward user id mapping. WebDAV is a joke. SSHFS is slow. You can get really crazy with CephFS or GlusterFS, and for all that complexity, you don't get much farther way from SMB/NFS issues with those either.
My solution: Share nothing and use rsync.
Well one problem is that filesystem in general is a terrible abstraction both in terms of usability and in terms of not fitting well with how you design network applications.
I’d say Dropbox et al. are closer to a good design, but their backend is insanely optimized to make it work, and proprietary. There’s an added challenge that everything these days is behind a NAT, so you usually end up needing a central rendezvous server where nodes can find each other.
Since you’re looking at rsync where you want something closer to Dropbox, I’d say look at syncthing. It’s designed in a way to make personal file sharing secure.
I think you should figure out how to quit while you're ahead. I wouldn't expose Samba to most of the devices on my LAN, never mind the internet.
Search for wannacry. You may rethink your setup.
A reminder: the days when being a tenant of a service (files, email, whatever else) automatically meant having an OS user account on the server are also decades behind us.
I'm unironically convinced that a basic Samba share with Active Directory ACLs is actually probably the best possible storage system...but the UI for managing permissions sucks, and most people don't have enough access to set it up the way they want.
Broadly, for all the configuration HashiCorp Vault makes you do, you can achieve a much more useful set of permissions with a Samba fileshare and ACLs (certainly it makes it easy to grant targeted access to specific resources - and with IIS and Kerberos you even have an HTTP API).
Can you name a single Google Drive clone that doesn’t use a database?
Would love to see your source code for your take on this product.
The Synology Drive version mirrors the filesystem, though I’m sure it has a database for sharing metadata. Is that what they mean?
Nextcloud too.
There is a database in most if not all useful cases, but there could also be the actual files separately.
I built something similar years ago. These are terribly hard to build, so I did a bit of digging.
1: This appears to be backed by a French company called Linagora. I don't know much about the company, but they've been around for a bit.
2: I experimented with MongoDB for that similar product, and it turned out to be very unreliable. A lot may have changed since I used MongoDB, but in general, I'm wary of any product that uses it unless there's an expectation that data is lossy.
(Which was the problem MongoDB had at the time: their CTO only wanted to target lossy data use cases, but the people interested in using MongoDB wanted a database that was easier to use than SQL.)
I’ve had similar warnings from multiple very senior devs to never go near mongo. So better explain that choice if you’re wanting adoption. Reliability was the concern.
At the time (2010), MongoDB was intended (from the creators) for handling high volumes of data where some loss was tolerable.
What happened was that its document model, and flexible index model, made it very attractive as an easy-to-use database. I used to call it the "Visual Basic" of databases.
I think the less technical people in marketing latched on to how a lot of people found MongoDB easier to work with, and there was a lot of selling to people who it shouldn't have been sold to.
The problem was that the lossy nature of MongoDB didn't rear its ugly head until deep in a project, and the assumptions made when writing documents led to situations where operations required changing multiple documents, or to other corner cases that triggered loss in larger schemas.
Of course, if you used MongoDB as intended, which was for ingesting lots of data with some tolerance of loss, you were totally fine.
There's also https://cryptpad.fr/ - https://cryptpad.org/ - https://github.com/cryptpad/cryptpad
That looks great, thanks for sharing.
I would add to that list something like a splitwise alternative.
And open source too? Seems too good to be true.
I always use https://ihatemoney.org/
I think you're looking for https://spliit.app/
I don't think that's end to end encrypted.
With so much surveillance I think there's a real need for E2E on anything. I just bought the basic Tutanota package - but maybe that's just my OCD acting out.
EDIT: This is closer, and you can self-host
https://github.com/cryptoboid/splitio
But it's in JavaScript <throw up> can't win them all.
Do you feel you need E2E even when you're self hosting?
https://github.com/spliit-app/spliit
USB sticks, the alternative to the cloud.
USB sticks can fulfill part of the "2" in the 3-2-1 rule.
https://en.wikipedia.org/wiki/Backup#3-2-1_Backup_Rule
I thought the same once, but apparently some of my friends literally do not own a PC. Only tablets or phones, no USB-A in the house except maybe in TV. Oh well, time for USB-C pendrives.
Not sure how I can collaboratively edit documents thanks to a USB stick.
Surely you jest. I love USB sticks. But they are not a proper alternative to cloud storage. For example, how do I share select files/folders with select people in other countries?
Until you lose it, break it, damage it accidentally (via high humidity, high heat, etc). Arguably, if you run twake on some VPS, you have additional layers of redundancy by default.
You mean, like the DNS of AWS in us-east-1? #OhWait
Why not use Deno instead of Node.js for the backend? For a product like this could the extra security that Deno's sandbox provides help?
You could also just run the node.js process via a `systemd` service and sandbox it that way using hardening directives.
Zero percent chance I will ever trust my critical data to a mongo-backed service, personally.
With clients some of them have already made this bad decision; with my own personal files I get to avoid it.
Isn't Mongo source available too? So it sort of seems to contradict the mission of this organization to use it.
My first inclination too tbh.
And then I saw Npm references and thought “in JavaScript?!” But at least it’s typescript.
You lose JS but at least you get to keep the supply chain risks.
MongoDB used to suck. We use it at work for critical systems, and it’s been rock solid for 3+ years.
Why? Since WiredTiger became the default storage engine, it works.
Is this a fork of something? Or recently open sourced? Looks like there is a single commit where a majority of the code came from.
> Looks like there is a single commit where a majority of the code came from.
I do this all the time, right before open sourcing a project. Basically while it's private, commit quality can be a bit rough, and if I want to open source it, I'll remove .git, make a new init commit then open source it. No one needs to see what I do in my private abode :)
Ha! 100% agree! Lots of my commits have personal info even. Months or years of changes, I'd rather squash and then push publicly.
The history of the development since its beginning can help a lot in studying the code, so I encourage people to avoid the single commit as much as possible.
It's much better to refactor (rebase) the messy commits, removing the personal or embarrassing stuff; although that might result in a "false" history, a series of smaller-sized commits will usually be much easier to follow than reading a whole code base all at once.
Really, I see a ton of open-source projects that do this, and it results in a lot more opacity and friction than necessary.
It results in fewer people being able to check the code and contribute to the project.
I promise you're not missing much, except some commits that are implementing something, reverting it, implementing it again slightly differently, fixing typos, replacing 80% of the codebase in one swoop and similar stupid and un-needed stuff.
If the project is from the get-go supposed to be a long-lived project (like professional development for a business) then I agree, don't smoke the entire history no matter how embarrassing it is.
But for my personal projects, I can let you know that having access to the git history before I made it FOSS will make you dumber rather than being helpful for anything, compared to one clean starting commit.
Why do you think it's embarrassing? The result is what reasonable people judge. And if you get to it through trial and error, well, that's how it's done almost every time. It's normal.
> Why do you think it's embarrassing?
I don't? I said I remove it because it isn't useful to anyone, might even be adding more confusion than it solves, not because I'm embarrassed over anything.
If it really isn't useful, which I imagine means you committed somewhat haphazardly, ok, of course.
If there might be some usefulness hidden there (for example, trying something and then reverting it shows that you did explore it), it's also possible to place the old stuff in another repository or another branch (better the latter, unless it increases the repository's size too much)
> for example, trying something and then reverting it shows that you did explore it
True, those things tend to go into the documentation itself, checked into the codebase itself instead of being somewhat hidden inside the git history. Usually I end up having both a "Open Problems" (things yet to solve) and a "Tried X, this is why it didn't work" section somewhere in the documentation.
> it's also possible to place the old stuff in another repository
Yes, before the process I initially described, I usually leave a copy intact with the full-full history, but that's not what I published, just kept as an archive.
+1
They were originally working on an MS Teams replacement, with a bunch of things in one app like Teams. (I tried it back then; it was pretty green.) Now it looks like they are focused on drive, chat and email. The old app seems deprecated, so I presume they forked it into some of this new stuff.
If you want to increase adoption, change the name: https://www.paulgraham.com/name.html
TDrive would work
It can go wrong too.
You search that in Google with file sharing keywords and the AI will helpfully correct it to 'do you mean GDrive?'
They would've lost a prospective user to a competitor while sounding like a knockoff of some other product.
search engine "correction" to GDrive is a good point. Both Brave & Duck correct to GDrive, but Google finds a local "t-drive" product in ZA.
> If you want to increase adoption, change the name: https://www.paulgraham.com/name.html
> If you have a US startup called X and you don't have x.com, you should probably change your name.
But they do own https://twake-drive.com/ already? What exactly is your point here? Either you misunderstand the linked article, or I do. But seems people would be able to find that just fine if they search for, as twake-drive.com comes up as the first result when I search for "Twake Drive".
Besides, Graham's articles are almost always geared towards startups in one way or another. This doesn't seem to be that, so not sure I'd even try to read it if I was the owner of Twake Drive.
The name is hard to convey. Try telling someone verbally how to find it without error: "Twake. No, not take - like Wake with a T, Twayke. T double you ay kay ee. Oh, and there's a hyphen in the domain. T-Wake hyphen Drive dot com."
Re: should they read it? Either you want your product to spread, or you don't.
If you're posting it on HN, you want to share it, and for it to be shared. A tough name makes it harder to share, so you have to decide if you really want your product to spread or not.
Yeah, Twake is a terrible name. Though, to be fair, I wonder what the use case is for an open source cloud drive outside of pretty niche situations, especially when the cost, in many cases, is partly for the infrastructure.
I don't think that advice has been relevant for a while now.
It's still relevant.
Dreaming of (and working on) making the ATProto PDS capable as a backend for authn/z and storage for ideas like this.
I've definitely been more motivated to de-cloud as the tech bros capitulate as well as push their ai way too hard
Cool, who's the audience?
Does it have mobile clients?
It's really not clear: they seem to show a mobile app (https://static.tildacdn.com/tild3536-3661-4363-b433-35353561...) but there are no links to app stores anywhere. Seems like they ended up on HN too early; maybe we should give them some time to get their stuff together.
Google safe browsing violation in 3...2...
Given how integrated Drive and Docs are, if this doesn't have docs-like collaborative realtime document editing, for many people this is like "30% of Google Drive"
For people whose UX is dragging and dropping stuff to browser, and/or using a desktop sync client only, sure why not, the UI looks clean and familiar. But as someone who has used and still uses like 3 different similar things concurrently, the only real reason I use drive is because of the seamless zero-dependency office-like web software being part of the product.
(yes I know it's a curse too, I ended up writing a piece of software just to migrate company drive stuff to my personal drive when a company I was a cofounder in went bust to have a record ... those google docs can really only exist in Drive natively, any export is an immediate downgrade)
Versus Nextcloud or ownCloud?
yes :)
Why do we need another file sharing platform?
It is not a new one; it used to be called Cozy Drive.
In TypeScript, interesting. Not the obvious choice IMO but trying to keep an open mind.
Was that because of team expertise or particular aspects of TS you thought suited the domain?
since it's I/O heavy an async web-oriented stack (ie. NodeJS) makes sense, and then TS is an obvious improvement over raw JS, and if the frontend is also JS/TS then at least there's some chance that expertise can be shared
The problem is such systems are also CPU heavy, with extensive hashing, encryption, and really quite a lot of general paperwork, and as such, a system that can efficiently use multiple CPUs is really important. I guarantee that plenty of Twake installs are absolutely spending a ton of time blocked on CPU, both because of the limited multithreading and the general 10x-slower-than-C you can expect from JavaScript on general code.
JavaScript was a poor choice that will hold the project back, just as choosing PHP for the base has done and continues to do a lot of damage to Nextcloud/ownCloud. This is not a task for a scripting language, because they're disqualified on performance. It's also not a task for dynamic typing, and using TypeScript can help with that, but it doesn't change the fact that JavaScript is just generally slow and does not play well on multiple CPUs.
How does it make money? Couldn't find any "About us" page or explanation. As we all know, if it's free, you're the product.
This soundbite really needs to go away. It and its counterexamples don't apply in any significant measure. You can pay and still be the product, and that is often the case.
Open Source != Free, feels like the typical HN user should know this better than the average user.
FWIW, the people working on this project have Mission and Vision pages on their website: https://linagora.com/en/mission https://linagora.com/en/vision
Took me a whopping 17 seconds to find those two.
I’m not sure, but if major companies start using it, they’ll definitely find a way to make money from it.
Damn bro, I didn’t know gcc had been exploiting me for all these years.
GCC was a psyop to destabilize the private compiler industry.
-Someone, surely
I'm pretty sure it reads your code, bro! Sus...