Using static websites for tiny archives

(alexwlchan.net)

391 points | by ingve 9 months ago

81 comments

  • egeozcan 9 months ago

    I copy the images in my clipboard and save them in an HTML file to have single-file galleries:

    https://gist.github.com/egeozcan/b27e11a7e776972d18603222fa5...

    Live:

    https://gistpreview.github.io/?b27e11a7e776972d18603222fa523...

    Selecting via file-picker works too. Dragging usually does not. When all works, images are inserted inline as blobs.

    After adding images, if you save the page (literally File -> Save), the blobs are saved together. Don't want a part when saving (for example, to remove images)? Inspect element, remove it, save the page.

    Throw the page on some server or just double-click it on your computer/mobile.

  • meonkeys 9 months ago

    Lots of folks mentioning Markdown in the comments. +1 to that. Plain text FTW. I think a lot about my own data hoarding / archiving, and plain text is such a key part of that. Very future-proof.

    Ever since WordPerfect I've preferred more deterministic, lightly-formatted documents with some way to see formatting characters directly. Markdown is brilliant, basically a DSL (domain-specific language) for HTML.

    The key to plain text is tooling! A couple Markdown tools I haven't seen mentioned here yet (even though they've come up on HN before) are:

    https://addons.mozilla.org/en-US/firefox/addon/markdown-view... - pretty-render Markdown right in the browser

    https://casual-effects.com/markdeep/ - standalone web-friendly Markdown formatter with many features

    • tducret 9 months ago

      Hey, I made a small JS that allows you to host a markdown file directly (no pre-conversion or plugin needed). It is rendered in the browser as HTML.

      It can definitely be used in such a local website, giving the convenience of just writing plain markdown.

      => https://www.tducret.com/pure-markdown/

      (source code) https://github.com/tducret/pure-markdown

      • telgareith 9 months ago

        Looks more like you just made an HTML page that invokes "marked.min.js" from the npm CDN. Neat demo.

      • Brajeshwar 9 months ago

        This is nice. I have used https://ndossougbe.github.io/strapdown/ to quickly throw out Markdown files for people to see as rendered HTML. I love the cleanliness of your script. Thanks.

    • codazoda 9 months ago

      I use GitHub to host my markdown files. A bit more information is in this article I wrote about it. I actually have 4 or 5 similar articles with various thoughts on this. I'm trying to find a way to make it simpler, maybe even for non-technical users, but I'm not there yet.

      https://joeldare.com/using-neat-css-on-github-pages

    • Brajeshwar 9 months ago

      Same here. I have been doing more and more of Plain Text. I wrote about my thoughts on it for my archives at https://brajeshwar.com/2022/plain-text/

      Google/Search tends to send quite a lot of people looking for how to take notes with Plain Text, and they seem to have benefited from my simple write-up.

  • Rhapso 9 months ago

    I convert content to markdown and relevant images and then store them in an obsidian vault. I self-sync it with syncthing. It has quickly become a rather effective zettelkasten memory prosthetic on my laptop and phone.

    I also use Google/Facebook takeouts, reformat the results, and store and index all my human-facing correspondence in there. Text is cheap and I avoid most images. It's still under 200 MB and instantly searchable with a nice UI, and as a bunch of Markdown files it is easily portable.

    • PoignardAzur 9 months ago

      > zettelkasten memory prosthetic

      You're really going to drop these three words without any context?

      • Rhapso 9 months ago

        I was hit on the head a lot as a child. My memory isn't great, so I take a LOT of notes. Those notes and the writing/searching tools to use them are very literally a memory prosthetic.

        Zettelkasten is a methodology of organizing a LOT of notes.

        I index by topic, date and people involved. I can look up a friend and re-read every shared IM, email, and event I logged almost instantly. Faster than any website can. It's my own personal pile of papers future historians will be excited to find because they can actually read it.

        One of my biggest frustrations is that most of my note-taking tools are not permitted in my workplace for security reasons. I have to keep all my notes on their infrastructure. I'm going to lose a chunk of my brain when I change jobs someday.

        • Terr_ 9 months ago

          That reminds me (heh) of a bit from one of my favorite book-series, involving someone recovering from a kind of brain injury.

          > “It’s been so long since I had to [use a holo-map], it didn’t even occur to me. It’s like an eidetic chip you can hold in your hand. It even remembers things you never knew before. Wonderful!” He unfastened his jacket, and pulled a second device from an inner pocket, a perfectly ordinary, though obviously best-quality, business audionote filer. “She gave me this, too. It cross-references everything automatically by key word. Crude, but perfectly adequate for ordinary use. It’s nearly a prosthetic memory, Miles.”

          > The man hadn’t had to even think about taking notes for the past thirty-five years, after all. What was he going to discover next, fire? Writing? Agriculture? “All you have to remember is where you put it down.”

          > “I’m thinking of chaining it to my belt. Or possibly around my neck.”

          -- Memory (1996) by Lois McMaster Bujold

          That last "audionote filer" is looking increasingly practical in real-life, cross-referencing and all.

          • ForOldHack 9 months ago

            I had dinner with Ted Nelson. He took out a voice tape recorder and said "Note to self:..."

            Dumbledore touched a wand to his head, pulled out a thought and put it in his Pensieve.

            The book 'Memory' is optional reading for a year-long class in development. I still get things out of it.

        • sourcepluck 9 months ago

          Have you thought about writing up a lovely tutorial on this going into all the details? Seems like a lovely setup!

          • Rhapso 9 months ago

            It's in the backlog of notes labeled "blog post ideas" :'(

            I think more general tooling to "convert your assorted takeouts into a local database" is higher on my todo list. I have a bunch of python scripts I cobbled together to convert things. If we can get it all into an easy to use database, everybody could do their own things with them more easily.

          • adamhp 9 months ago

            There is a ton of content about Obsidian! Also it's a fairly intuitive interface. I'd just download it and start messing around, then check out the community plugins. If you really want to dig into notes systems, then you can Google PARA or Zettelkasten, but to me, that quickly begins to devolve into homework and needless learning curves. Just bolt on what you need it for. It's very full featured and if you feel like you're missing something, just search for a plugin.

        • bee_rider 9 months ago

          > I'm going to lose a chunk of my brain when I change jobs someday.

          Ah, that is quite sad. Do you write general impressions of the work day, at least, when you get home? I guess none of us remember all of the details of our workdays anyway.

        • deadfast 9 months ago

          I don't understand: if you are self-syncing, how would you lose your notes when switching jobs? Just exclude the relevant work directories with sensitive info, compress the vault, and ship it to a cloud provider.

          • Rhapso 9 months ago

            I have to take work-notes on security isolated hardware and the notes are owned by my employer. I can't really expand on the subject.

            • all2 9 months ago

              This has been two of my last three jobs. I was fortunate to be able to save a list parser written in the worst possible language.

    • winter_blue 9 months ago

      What do you use to sync Obsidian on your phone? Is it syncthing as well?

      • Rhapso 9 months ago

        It works great on Android. I have a laptop, my phone, and a NAS all syncing. The NAS does most of the heavy lifting. It's a little P2P data ship-of-Theseus as I replace machines over time. As long as I don't throw my laptop, phone, and NAS all in the river at once, my data is safe. The encrypted sync feature of Syncthing lets me and my so-inclined friends use each other as offsite backups. It's honestly the best open source software other than GNU apps or Linux I have ever used.

        Make sure you set up basic version control in Syncthing. I had some issues with my daily notes getting clobbered because they were autogenerated by multiple Obsidian instances.

      • Tuckerism 9 months ago

        I saw that the OP already replied, but wanted to share how I approach this myself. I have a desktop, laptop, and phone that I wanted to keep synced up, so I actually used it as an excuse to set up my own git repo on my NAS (which I wanted to do anyway).

        The only tricky part has been dealing with git on iOS. I have to use a particular app (Working Copy) and some shortcuts to get the syncing behavior consistent. But it is doable!

  • stared 9 months ago

    For personal use, I rely on Obsidian in a similar way—whenever I want to keep something (like an FB post I might want to share later), I save it along with the source link. External services can disappear anytime, so local data has the dual advantage of being owned by us and easily searchable.

    I also wrote a script to convert Kindle highlights into Markdown files. If anyone’s interested, I'd be happy to polish it a bit and share.

    For public-facing content, the Static Site Generator ecosystem keeps improving. I started with Jekyll (since it's the GitHub default), moved through Gridsome, and eventually landed on Nuxt 3 Content, which feels like the sweet spot for me. If I were starting now, I might have chosen Astro.

    In any case, the barrier to entry has never been lower. We can host sites for free on GitHub, and if custom styling is needed, AI models are incredibly helpful with CSS.

    Markdown is like JavaScript for text formatting. Despite its quirks, it just works.

    • byteknight 9 months ago

      I forked an Android app [1] to share articles to the app, which converts them to Markdown and then sends them to Obsidian. I also use a Firefox extension that uses the Obsidian extension Advanced URI to send Markdown versions of articles (with frontmatter!) to Obsidian [2].

      [1] https://github.com/IAmStoxe/obsidian-markdownr

      [2] https://addons.mozilla.org/en-US/firefox/addon/markdownload/ - There's also a Chrome extension

    • darylfritz 9 months ago

      I'm interested in your Kindle script. I'm curious about how you handle saving FB posts—do you just copy and paste the content, convert it to markdown, and store it in Obsidian? Would love to hear more about your process!

      • stared 9 months ago

        Here is the script. You need to download "My Clippings.txt" from your Kindle device. https://gist.github.com/stared/ce732ef27d97d559b34d7e294481f...

        Regarding Facebook, I do it manually only for selected posts. I tried to do it otherwise, but Facebook exports don't have data about likes (it would be helpful to filter popular content) or comments (often more important than the original post itself).
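
For readers who want to try the Kindle idea without the gist, here is a minimal sketch in Python. It is not stared's script. It assumes the commonly seen layout of "My Clippings.txt": each entry is a title line, a metadata line, a blank line and the highlighted text, with entries separated by a line of ten equals signs. The function name `clippings_to_markdown` is made up for this illustration.

```python
from collections import defaultdict

# Assumed entry separator in "My Clippings.txt" (a line of ten "=" characters).
SEPARATOR = "=========="


def clippings_to_markdown(text: str) -> dict[str, str]:
    """Group highlights by book title and render one Markdown document per book."""
    by_book: dict[str, list[str]] = defaultdict(list)
    for entry in text.split(SEPARATOR):
        lines = [line.strip() for line in entry.strip().splitlines()]
        lines = [line for line in lines if line]
        if len(lines) < 3:
            continue  # bookmarks and malformed entries have no highlight body
        title, _metadata, *body = lines
        # Some files start with a byte-order mark; drop it from the title.
        by_book[title.lstrip("\ufeff")].append(" ".join(body))
    return {
        title: f"# {title}\n\n" + "\n".join(f"> {h}\n" for h in highlights)
        for title, highlights in by_book.items()
    }
```

Each value in the returned dictionary can be written to its own `.md` file, for example inside an Obsidian vault.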

        • mgobl 9 months ago

          Do you know if this file is possible to recover from/for kindle app highlights? (ie. off an Android Tablet) Or only off a kindle device?

  • G_o_D 9 months ago

    Been doing so for 15 years. I make portable HTML with embedded images, MP3s and more, so that I don't need any special software for viewing; I just carry it in the cloud or on my phone nowadays, and you only need a browser on any device, any OS. With MP3s embedded in the HTML (yes, the size may grow large), I don't need a special music player or app, just a browser.

    Nowadays, along with HTML, I try to archive using the MHTML format instead of embedding manually.

    Run a simple http server and start browsing archives

    For images, what I do is:

    ---> Store all images in a folder

    ---> Open a localhost server

    ---> Open the folder in the browser

    ---> Using JavaScript, convert the links to <img> tags with src=link

    ---> Once the browser fetches and displays all images, Save As, and I have an embedded MHTML archive

    Or a simple bash script can be used to create HTML with img tags and links to the folder.

    Or you can manually template an MHTML file.

    But I let my browser do the heavy work; why go manual?

    Also, instead of Base64 embedding, embedding binary images directly in MHTML is more effective and consumes less memory.

    E.g. I have 15 images: MHTML (binary encoding) -> 4 MB; MHTML (Base64 encoding) -> 5 MB

    Another method I use: run python -m http.server in any folder

    Or on Linux: tree -H http://localhost:8000 (set the recursion depth)

    Then open the folder link from the server, or the tree-generated HTML, in the browser.

    In cmd, execute wget -rkpN -e robots=off http://localhost:8000

    It will recreate the folder with an index.html for you to browse; you don't need a server for viewing after that.

    Same as an export from Google, Twitter or YouTube.
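
The size difference between binary and Base64 embedding mentioned above follows from how Base64 works: every 3 bytes of input become 4 output characters, so the payload grows by about a third before any line breaks or MIME headers are counted. A small Python sketch to check this (the helper names are made up for this illustration, and real MHTML files carry other overhead too, so measured sizes will differ):

```python
import base64
import math


def base64_size(raw_bytes: int) -> int:
    """Length of the padded Base64 encoding of raw_bytes bytes, without line breaks."""
    return 4 * math.ceil(raw_bytes / 3)


def measured_ratio(payload: bytes) -> float:
    """Encoded size divided by original size, measured with the standard library."""
    return len(base64.b64encode(payload)) / len(payload)
```

For instance, `base64_size(3_000_000)` is 4,000,000, so 3 MB of image data takes about 4 MB once Base64-encoded.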

  • pomdtr 9 months ago

    I had similar thoughts, and built myself a little framework for this: https://www.smallweb.run

    The key feature it adds compared to your own setup is mapping subfolders to subdomains (+ dynamic websites, but you don't seem interested in that).

    ex: ~/smallweb/example => https://example.localhost

    We have a little discord community at https://discord.smallweb.run if anyone is interested.

    • tga 9 months ago

      It looks like you just reinvented CGI/PHP.

      • pomdtr 9 months ago

        Yeah you guessed it, smallweb is basically CGI meets Deno sandboxing + https imports.

  • zirkuswurstikus 9 months ago

    Personally, I prefer VimWiki for taking notes during my work. So it is a place to mix ideas, small documentations and snippets of things I found on the web.

    Since most of the time I want to keep articles, tutorials or nifty tricks, I like to store the entire web page. For this task, my favorite tool is SingleFile [1]. With SingleFile you can save a web page with embedded images. You can also add annotations and cut away annoying ads etc. It also supports saving a distraction-free copy of the page. I can highly recommend taking a look.

    [1] https://github.com/gildas-lormeau/SingleFile

    • paravz 9 months ago

      SingleFile is fantastic, became part of my "standard" browser install, next to uBlock and Tree Style Tab

  • ericyd 9 months ago

    I always find posts like this fascinating. I love the direction of going low tech and maintainable, but I have never once found myself spending significant time looking through old work. Photos are the one exception but I've always been fine just scrolling through my personal timeline of date-sorted photos. I used to spend more time on this sort of thing when I was younger and then at some point I just realized I'm never actually looking at it. I'd be curious to know some of the reasons people are frequently revising work from years ago?

    • famahar 9 months ago

      I rarely go back to screenshots, but whenever I do I find inspiration and have even started projects suddenly. I think what I need is an archive that also randomly surfaces one of them once a day.

    • rnewme 9 months ago

      Life revised is life lived twice.

  • justusthane 9 months ago

    For myself at least, there's no way I'd stick with this over the long run given the overhead of hand-editing an HTML file (however quick and simple) every time I needed to add an item to a collection.

    Seems like an ideal use for a very simple DIY static-site generator. Write it in Bash or Perl and it will be future-proofed forever.
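
As an illustration of how small such a generator can be, here is a sketch in Python rather than Bash or Perl. It assumes the collection is kept in a CSV file with `title` and `url` columns; that layout and the function name `render_collection` are invented for this example and are not taken from the article.

```python
import csv
import io
from html import escape


def render_collection(csv_text: str, title: str = "My collection") -> str:
    """Render CSV rows (columns: title, url) as a single static HTML page."""
    rows = csv.DictReader(io.StringIO(csv_text))
    items = "\n".join(
        # escape() protects against &, <, > and quotes in titles and URLs.
        f'  <li><a href="{escape(row["url"], quote=True)}">{escape(row["title"])}</a></li>'
        for row in rows
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en"><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head>\n"
        f"<body>\n<h1>{escape(title)}</h1>\n<ul>\n{items}\n</ul>\n</body></html>\n"
    )
```

Adding an item then means appending one line to the CSV file and re-running the script, instead of hand-editing the HTML.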

  • lloeki 9 months ago

    > Using a static website like this isn’t new – my inspiration was Twitter’s account export, which gives you a mini-website you can browse locally. I’ve seen several other social media platforms that give you a website as a human-friendly way to browse your data.

    I've read somewhere that Telegram exports work this way, you get a bunch of raw files somehow organised with directories and browsable by themselves, with a tiny local static website to browse them more conveniently.

    So different from the last such mass export I used: Google Takeout, which produces a dumb dump of cryptic xml and raw files named in some nonsensical (to the user) scheme. To this day I'm not even sure I got all the data I asked for before deleting it cloudside.

    • approxim8ion 9 months ago

      Facebook does (or at least used to when I left it a few years ago) this too. They give you a huge dump of your data and photos, with an html file to help you navigate it better.

  • massimoto 9 months ago

    I recently wrote a static site generator from AnyBox's local database, since they currently only allow for backups via iCloud which is locked down on my work laptop. I was surprised by the peace of mind it gave me to have a nice, 100% portable version of my vast bookmark/website archives.

  • corinroyal 9 months ago

    This excites me. Imagine someone not overcomplicating web tech. I've been thinking of having web sites render as epubs so we don't have to have a sysadmin on call 24/7 just so I can read.

  • lazylizard 9 months ago

    https://linux.die.net/man/1/tree

    will list your directory tree as a html file..helpful?

  • mathnmusic 9 months ago

    Strict hierarchies are indeed too rigid. What about using a tag-based file manager like TagSpaces (which is free and open-source)?

    • deafpolygon 9 months ago

      MacOS also supports tags.

  • ejddhbrbrrnrn 9 months ago

    Markdown files can be a magic low-effort way to get this. Even less fancy. Just stick an md file in there and it is easy to link to stuff. Open it in VS Code. You can go full Zettelkasten, but you can also just drop some notes around.

    • chrisweekly 9 months ago

      that's one of the things I love about Obsidian -- at the end of the day, it's "just" an extraordinarily powerful interface to Markdown files.

  • itohihiyt 9 months ago

    Why not use a wiki? Zim desktop is text-based and local-first. It doesn't handle videos, but everything else is handled. Search is good and you get the other benefits of a wiki. No mobile client that I'm aware of.

    • Pfeil 9 months ago

      Markor is an Android text editor with Zim (and other markup) support, although I never used the Zim compatibility. But it is probably worth a try.

      I like that zim is not automatically a hosted solution but a local app. I would love to see more local apps for archiving solutions and PKM. I just have some issues with the Zim app itself. It works nice for some of my use cases, but not for all. And I wish it would just use markdown (I know it has limits). Stuff like that.

      I think Zim does not really fit into the discussion because it does not rely on easily exchangeable standard software like a file explorer and browsers.

      That said, I believe that notetaking applications mostly exist because file explorers do an extremely bad job and integrating applications with them is too limited or at least too reunified. Look at what these applications offer. 80% of it is actually the task of a file explorer.

      • itohihiyt 9 months ago

        > I think Zim does not really fit into the discussion because it does not rely on easily exchangeable standard software like a file explorer and browsers.

        Zim is very much based on the file system. Each note is a text file and if it has attachments or embedded images they go into a folder named after that text note.

        Whilst not in Markdown, the markup used is easy to understand and convert. Zim itself allows you to copy a note to the clipboard in pandoc Markdown and export the note to Markdown and/or HTML (though admittedly the styling for HTML is atrocious).

        • Pfeil 9 months ago

          Yes, but you rely on a working installed software. If it is not properly maintained, you will need to switch at some point, and therefore change your current workflow. The assumption here is that file and web browsers will exist for a long time and not only make the data sustainable, but also the way you use it. Some of the other approaches shown in the comments make the browser not only the viewer, but also the tool. I am not saying zim and obsidian are completely different, but the assumption made above is significantly less likely to hold for these tools.

          I am not against Zim or Obsidian. In fact, I currently use plain Markdown with VS Code, which boils down to a similar situation. But VS Code and its extensions may be gone in a while, and then I will have to look at what to do.

  • miragecraft 9 months ago

    I'm doing the same except with the convenience of HTML includes.

    https://miragecraft.com/projects/x-include

  • crtasm 9 months ago

    >folders require you to use hierarchical organisation, and everything has to be stored in exactly one place.

    You can make aliases/shortcuts to files on MacOS, can't you?

    • kccqzy 9 months ago

      Personally I used to use the saved search feature in macOS which is much more convenient than aliases or symlinks. You can specify what to search for and then save them. The files themselves can have arbitrary text based metadata just by writing Spotlight comments for the files. For example for screenshots, I write in the spotlight comments things like from which app I took the screenshot and why I took this screenshot ("funny" or "delightful UI") and then I have a saved search that only searches for PNG files in the Screenshots folder with keyword "funny" (you get the idea). The actual files are of course still organized in folders by year just like the author.

    • prmoustache 9 months ago

      On pretty much any modern OS that uses a modern filesystem that is not exfat I believe.

  • chrisweekly 9 months ago

    Awesome post. I'm inspired to take a similar approach. Related tangent: https://sive.rs/ti

    is author/entrepreneur Derek Sivers' script for reproducing his bare-bones, low-overhead, long-term "Tech Independence" stack.

  • RadiozRadioz 9 months ago

    > folders require you to use hierarchical organisation

    I find symlinks work for this, which is what I do. I have big directories with the raw pictures dumped from my devices, then categorized directories linking to them.
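
A minimal sketch of that layout in Python, assuming a platform where symlinks are available. The directory names in the usage note and the function name `link_into_category` are made up for this illustration.

```python
import os
from pathlib import Path


def link_into_category(raw_file: Path, category_dir: Path) -> Path:
    """Create a symlink to raw_file inside category_dir and return the link path."""
    category_dir.mkdir(parents=True, exist_ok=True)
    link = category_dir / raw_file.name
    if not link.is_symlink():
        # A relative target keeps the links valid if the whole tree is moved.
        target = os.path.relpath(raw_file, start=category_dir)
        link.symlink_to(target)
    return link
```

The same raw file can then appear under several categories, for example `link_into_category(Path("raw/IMG_0001.jpg"), Path("by-topic/holiday"))` and again with `Path("by-topic/family")`, while only one copy exists on disk.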

  • mediumsmart 9 months ago

    Thank you for posting. I have had the same experience looking for a good way to organize files etc. I tested this just now and asked the oracle to write me a bash script that finds all images starting with Screenshot and lists them in an HTML file that grids them at 200px width, with click to fill the screen and a second click to dismiss. Such a good way to have an overview; going to implement that across the HD.
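
A rough Python equivalent of such a script might look like the sketch below. It is not the generated bash script, and it simplifies the behaviour: each thumbnail links to the full-size file instead of toggling a full-screen overlay. The function names and the list of image suffixes are assumptions made for this example.

```python
from html import escape
from pathlib import Path
from urllib.parse import quote

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def build_gallery(folder: Path, prefix: str = "Screenshot") -> str:
    """Return an HTML page showing matching images in a grid of 200px thumbnails."""
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.name.startswith(prefix) and p.suffix.lower() in IMAGE_SUFFIXES
    )
    items = "\n".join(
        # quote() makes file names with spaces safe to use as relative URLs.
        f'<a href="{quote(p.name)}"><img src="{quote(p.name)}" alt="{escape(p.name)}" loading="lazy"></a>'
        for p in images
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en"><head><meta charset="utf-8"><title>Screenshots</title>\n'
        "<style>\n"
        "  body { display: grid; grid-template-columns: repeat(auto-fill, 200px); gap: 8px; }\n"
        "  img { width: 200px; height: auto; }\n"
        "</style></head>\n"
        f"<body>\n{items}\n</body></html>\n"
    )


def write_gallery(folder: Path) -> Path:
    """Write the gallery next to the images as index.html and return its path."""
    out = folder / "index.html"
    out.write_text(build_gallery(folder), encoding="utf-8")
    return out
```

Running `write_gallery(Path.home() / "Desktop")` (the path is only an example) drops an `index.html` into that folder, which can be opened directly in a browser.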

  • freitzzz 9 months ago

    Really nice idea! As a data hoarder myself, I think I will follow this as a way to remind myself of the things I truly should archive :)

  • smugglerFlynn 9 months ago

    Just thought it would be cool to have a personal "data lifeboat", similar to Twitter export, for exporting Instagram

  • nyc111 9 months ago

    I use org-mode export to HTML and then ftp that to the server. Is the OP doing the same without the org-mode?

  • GrumpyNl 9 months ago

    Why the html files? You can just expose your directory structure on the web, no need for html.

  • thenoblesunfish 9 months ago

    Glad to see my own instincts here. Filesystems, text files, plain HTML, fun, long-lasting.

  • lovegrenoble 9 months ago

    nice idea

  • zoobab 9 months ago

    PDF is better for archiving, but what about videos?

    HTML really sucks for archiving.

    • klez 9 months ago

      Can you please explain what you mean that PDF is better for archiving while HTML sucks in this aspect? What aspects of the formats are you basing this on?

      • gazook89 9 months ago

        I’m not the commenter, but I imagine it just boils down to what you are archiving. In any case, I don’t think the commenter really understands what HTML is being used for here. The “preserved” material doesn’t have to be HTML; the HTML is just there to set up the directory navigation. In the blog post, they even mention that each type of material is its own website so that each website can be designed to handle that file/data type.

    • approxim8ion 9 months ago

      HTML supports video just fine and has for years. Can't imagine why it would be an issue.

  • noja 9 months ago