This website is hosted on Bluesky

(danielmangum.com)

343 points | by hasheddan 6 hours ago

63 comments

  • pfraze 5 hours ago

    Appreciated Daniel reaching out to the team about this! Hosting blobs is one of those things that will inevitably go through iterations as we understand the abuse vectors more and more, but for now it's really fun to see this kind of usage in action. The PDS is meant to be a database host in the same sense that a webserver is a website host.

    • sebmellen 2 hours ago

      Are you ever going to bring back Beaker Browser? Used to love playing around with that! Didn't realize you'd gone on to Bluesky, very neat.

    • moritonal 2 hours ago

      Congrats on finding a role at Bluesky. Beaker was such an amazing project to follow, that experience must be so useful.

  • simonw 5 hours ago

    I was curious as to the security context this runs in:

        curl -i 'https://porcini.us-east.host.bsky.network/xrpc/com.atproto.sync.getBlob?did=did:plc:j22nebhg6aek3kt2mex5ng7e&cid=bafkreic5fmelmhqoqxfjz2siw5ey43ixwlzg5gvv2pkkz7o25ikepv4zeq'
    
    Here are the headers I got back:

        x-powered-by: Express
        access-control-allow-origin: *
        cache-control: private
        vary: Authorization, Accept-Encoding
        ratelimit-limit: 3000
        ratelimit-remaining: 2998
        ratelimit-reset: 1732482126
        ratelimit-policy: 3000;w=300
        content-length: 268
        x-content-type-options: nosniff
        content-security-policy: default-src 'none'; sandbox
        content-type: text/html; charset=utf-8
        date: Sun, 24 Nov 2024 20:57:24 GMT
        strict-transport-security: max-age=63072000
    
    Presumably that ratelimit is against your IP?

    "access-control-allow-origin: *" is interesting - it means you can access content hosted in this way using fetch() from JavaScript on any web page on any other domain.

    "content-security-policy: default-src 'none'; sandbox" is very restrictive (which is good) - content hosted here won't be able to load additional scripts or images, and the sandbox tag means it can't run JavaScript either: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Co...

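    A minimal sketch of reading two of those headers, assuming the `w` parameter in `ratelimit-policy` is a window in seconds (as in the IETF RateLimit header drafts); the function names are made up for illustration, and nothing here answers whether the limit is keyed on IP:

```python
def parse_csp(header_value: str) -> dict[str, list[str]]:
    """Split a Content-Security-Policy value into {directive: [sources]}."""
    directives: dict[str, list[str]] = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if tokens:
            # Directive names are case-insensitive; source expressions are kept as-is.
            directives[tokens[0].lower()] = tokens[1:]
    return directives


def parse_ratelimit_policy(header_value: str) -> tuple[int, int]:
    """Return (quota, window) from a value such as '3000;w=300'.

    Assumption: 'w' is the window length in seconds.
    """
    quota, _, params = header_value.partition(";")
    window = 0
    for param in params.split(";"):
        name, _, value = param.strip().partition("=")
        if name == "w":
            window = int(value)
    return int(quota), window


# Header values copied from the response above.
csp = parse_csp("default-src 'none'; sandbox")
quota, window = parse_ratelimit_policy("3000;w=300")
print(csp)
print(f"quota of {quota} per {window}-second window")
```

    The `sandbox` directive shows up with an empty source list, which is the form that applies every sandbox restriction, including the one on scripts.
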
    • benatkin 4 hours ago

      Blocking/allowlisting all JavaScript is the only way [1] to have a CSP fully contain an app (no exfiltration) [2] and with prefetch that might not be enough. The author is correct at the end to suggest using WebAssembly. (Also, it still has the issue of clicking links, which can be limited to certain domains or even data: by wrapping the untrusted code in an iframe and using child-src on the parent of the iframe)

      1: https://github.com/w3c/webappsec/issues/656#issuecomment-246...

      2: https://www.w3.org/TR/CSP3/#exfiltration

    • nightpool 4 hours ago

      is the default-src necessary if you're using sandbox or is it redundant?

      • johncolanduoni 3 hours ago

        `sandbox` doesn’t affect making requests via HTML (images, stylesheets, etc.).

  • SAHChandler 3 hours ago

    I'm very hopeful for the possibility of using bluesky for blob data.

    A friend and I had considered looking into storing DOOM WADs on bluesky so that "map packs" could be shared in the same way posts are. Follow an account, a list, or a starter pack, and you could theoretically modify GZDoom or some other client to know how to search and view any WADs posted by those accounts. Like how the Steam Workshop works, except it's via bluesky. :D

  • hi_hi 3 hours ago

    Could some awesome person possibly summarise any limitations or use cases where this might not work well?

    The example provided is quite basic static text, so I'm wondering if there's a reason for that?

  • edavis 4 hours ago

    If this sort of thing interests you, check out atfile: https://github.com/electricduck/atfile

  • Retr0id 5 hours ago

    The CSP headers didn't use to be there, which I used to pop an alert(), way back. (at the time there was also a MIME whitelist, but that whitelist included image/svg+xml, which allows script execution)

  • neuroelectron 2 hours ago

    What's the license for the Bluesky data btw? Is it something free to mirror and train LLMs on?

    • anon7000 an hour ago

      So the ToS explicitly says Bluesky does NOT own your data.

      However, data on AT Proto is fully public and it’d be trivial for someone to extract the data for AI to train.

      For example, this app shows you entries hosted on the protocol: https://atproto-browser.vercel.app/at/nytimes.com

    • phs318u 2 hours ago

      Based on https://bsky.social/about/support/tos#user-content , I would answer yes. While it's not expressly called out (permitted or forbidden), my reading of the above would indicate that it's not forbidden per se, and probably permitted ("Modify or otherwise utilize User Content in any media. This includes reproducing, preparing derivative works, distributing, performing, and displaying your User Content."). I believe training an LLM falls under "utilize" and "preparing derivative works".

  • steveklabnik 5 hours ago

    Ah this is super cool! I’ve been thinking about doing this with my website, but was going to leverage the whtwind lexicon, since my site is mostly a blog. But for the front page, and anything else, I may have wanted something else.

    This is more of an unstructured approach, which is cool because it needs less specialized tooling. It has the disadvantage of being… well, just a blob. No semantic information there.

  • h4x0rr 5 hours ago

    Anyone else feel like this will be abused for phishing and/or malware distribution?

    • remram 4 hours ago

      I don't see how. This is a direct link to the author's bluesky server (PDS) so of course it is controlled by them.

      • benatkin 4 hours ago

        Lack of moderation combined with an official-sounding domain name.

        This would have to get the user to follow a link or call a phone number or something though. These are plausible. It's too bad the content-security-policy can't prevent following links.

        • extraduder_ire 2 hours ago

          Bluesky seems to use a lot of totally different domain names for each part of their infrastructure, maybe for this reason. e.g. this one is bsky.network

          While they're nowhere close on volume, they're certainly beating microsoft in terms of the rate they're adding similar looking official URLs.

        • anon7000 an hour ago

          I mean, the way AT Proto is designed, moderation primarily happens on the app layer, not the protocol layer. So on an app like Bluesky, you can have a lot of moderation. But the protocol itself allows hosting arbitrary content in a distributed/decentralized way.

    • lazystar 4 hours ago

      is there any hosting site that isn't? feels like a computing law at this point; if you build a hosting site, someone will try to use it for malicious purposes.

      • EGreg 4 hours ago

        Can’t you just make the hosting site features only be for real purposes?

        Like a link shortener which only forwards to a domain that matches the subdomain? Or only for watching videos and collecting metrics etc.

    • ineedaj0b 2 hours ago

      hehehe. I pinned it to the top research ideas. I'll get back to you on this

  • skybrian 4 hours ago

    I'm wondering whether a third-party PDS implementation should support other protocols as well. Would a combined git/PDS repo make any sense at all? (That is, it's a PDS, but it also implements enough of git to do read-only access via git commands.)

    What other protocols would make sense?

  • anacrolix 4 hours ago
  • slowhadoken 3 hours ago

    Whenever I hear about Bluesky I think about Jack Dorsey quitting their board and asking people to stay on Twitter/X.

    https://amp.theguardian.com/technology/article/2024/may/07/j...

    • crabmusket an hour ago

      What do you think about it?

      What I remember about that whole affair is that I'd really respected Jack for starting Bluesky, allowing it to be independent of Twitter (and Jay deserves a heaping of credit for pushing that!), and then losing that respect when he seemed to totally misunderstand what Bluesky had gone on to achieve.

      https://www.techdirt.com/2024/05/13/bluesky-is-building-the-...

      Jack was pushing Nostr at the time which... seems ok if you're into that. But his arguments in his interview with Mike Solana really didn't make sense to me.

      • strogonoff an hour ago

        Bluesky’s attitude seems logical and their reasoning aligns with my thoughts exactly.

        If techdirt’s article is to be believed, Dorsey’s departure has to do with going from an extreme to an extreme—from a traditional social monolith to a pure protocol—whereas Bluesky chose to pursue not only the protocol, but also “the app” as the face of that protocol for the ordinary user, and let’s face it: the ordinary user does not really care about protocols.

        My speculation about him suggesting people “stay on Twitter” is that Nostr (which he apparently is invested in now) and Twitter are orthogonal, so there is no conflict there, but Bluesky competes with both.

        Not a Bluesky user (the invite-only period has put me off for a while), but if they do not compromise on the protocol part (and there are no shenanigans unfolding, who knows, maybe Dorsey found something) their attitude seems to me to be the most reasonable for a mainstream social platform.

  • la64710 5 hours ago

    I think the AT protocol is versatile in that users can access each other's data once authenticated without any centralized service (granted the aggregators and some other things may still be centralized).

    • jazzyjackson 4 hours ago

      Is there any auth necessary to pull data from a PDS? I know the main relay is a public firehose so I would be surprised, but maybe the PDS can put relay servers on an allowlist?

      • anon7000 an hour ago

        As far as I can tell, all content on ATProto is fully public without auth
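
        A minimal sketch of what "without auth" looks like for a blob, assuming the `com.atproto.sync.getBlob` endpoint and the host, DID and CID from the curl command earlier in the thread; `get_blob_url` is a hypothetical helper, and the network call is left commented out:

```python
from urllib.parse import urlencode


def get_blob_url(pds_host: str, did: str, cid: str) -> str:
    """Build an unauthenticated getBlob URL for a blob stored on a PDS."""
    # Keep ':' unescaped so the DID reads the same as in the curl example.
    query = urlencode({"did": did, "cid": cid}, safe=":")
    return f"https://{pds_host}/xrpc/com.atproto.sync.getBlob?{query}"


url = get_blob_url(
    "porcini.us-east.host.bsky.network",
    "did:plc:j22nebhg6aek3kt2mex5ng7e",
    "bafkreic5fmelmhqoqxfjz2siw5ey43ixwlzg5gvv2pkkz7o25ikepv4zeq",
)
print(url)

# Fetching needs no Authorization header (not executed here):
# from urllib.request import urlopen
# body = urlopen(url).read()
```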

  • bbor 5 hours ago

    Pretty awesome! Convenience link to the fascinating github issue linked at the bottom, featuring Bluesky celebrity pfrazee: https://github.com/bluesky-social/atproto/issues/523

    I have a lot of hope for AT. I'm sure there's lots of smart people on HN that have done great things with the Fediverse, but this whole paradigm just seems more sustainable + realistic. Basically it gives us centralization by default, but with real decentralized support when you need it / for power users.

    • jazzyjackson 5 hours ago

      As far as sustainability goes I'm hoping for a better business model than "accept funds from Blockchain Capital" [0], some return on investment in mirroring the firehose. I can muse, a Discord alternative where some users pay to host longer videos (current limit is 60sec [1]) or Patreon where a relay takes a cut in exchange for managing access/decryption keys, or Bandcamp or some other kind of social marketplace - as it is there's no reason I couldn't do this, it is an open platform after all.

      [0] https://www.blockchaincapital.com/blog/bluesky-13m-users-and...

      [1] https://bsky.social/about/blog/09-11-2024-video

      • bbor 4 hours ago

        Yeah I’m also worried about profitability, tho not particularly concerned about that particular investor, personally; all VCs are inherently amoral profit generators. They are a “benefit corporation” like anthropic, which gives them some leeway to deny shareholder requests in the name of public good. Which is nice!

        In general I feel like social media is in the perfect spot for a huge shakeup as display ads breathe their last breath. Even if Google wins/draws out its Display Ads antitrust case and successfully implements some new interest-tagging system, I think anyone with a calculator and a newspaper subscription can read the leaves at this point; people are concerned about their data, and the money it generates is peanuts compared to more traditional advertising schemes. All of this is of course not even mentioning what I think intuitive algorithms will do (cynical or no, there’s lots of credentialed scientists saying that AGI (!!) is within reach in the coming decade, if not the coming few years).

        All that to say: I feel like they can find a way to make it work. Revenue doesn’t need to be as high anyway if you a) don’t have 1000 devs optimizing Display Ad A/B tests all day, and b) have the support of the open source community.

        • yokem55 4 hours ago

          If they can get ~100k subs to a $10/mo premium service similar to discord nitro, they are probably close to breaking even at the current scale and ops methodology. Which seems feasible.

  • tr1ll10nb1ll 5 hours ago

    unrelated probably, but it made me realize how I don't really see Hugo/Jekyll type websites anymore.

    • hipadev23 5 hours ago

      How do you even know? Don't those both just generate static html?

      • tr1ll10nb1ll 5 hours ago

        Footer. also Jekyll/Hugo sites use generator so you can mostly find it in the meta generator tag.

        Next.js sites are also a super easy find like this.

        • veqq 5 hours ago

          You can trivially remove it e.g. `disableHugoGeneratorInject = true` in `config.toml`.

      • thesdev 5 hours ago

        It says "Powered by Hugo" at the bottom of the page.

        • Zambyte 5 hours ago

          Depending on the theme.

          • rahkiin 5 hours ago

            I build my own themes and don’t include that either

    • zahlman 5 hours ago

      I see plenty of blogs generated from Markdown with tools like that.

      Has something overtaken Hugo and Jekyll in that space?

      • aryonoco 3 minutes ago

        If you like JS/TS, then Astro.

        I maintain a blog on Hugo but also host a couple of Astro ones. I think Hugo is great but to my eyes at least Astro has more active development behind it, and I also enjoy it more (probably because I know Typescript more than golang)

    • teitoklien 5 hours ago

      I build my own with Jinja2 templates, my custom python script + mistune library to parse markdown to html, and a YAML file in similar format to Hugo (the previous generator i used to use)

      I found building my own custom one with python3, much more freeing in all sorts of interesting ways, I also exposed the static site generator with a FastAPI based API to auto build my website from my notes, my cooking recipes, database records, financials, git commits, etc to build me a private protected website (via nginx auth) from anywhere, whether via sending a text message to my telegram bot, or running a Shortcuts command on my iPad, or just directly running a command from my terminal.

      It took barely a day to setup, and allows me to run interesting custom extensions in all sorts of interesting ways, and builds me a personal website curated to my interest, where the primary viewer is supposed to be me. and it exposes a public barebones website with barely any content for everyone else.

      One of these days I think i’ll expose more of it to the world.

    • dangerlibrary 5 hours ago

      I just use mkdocs for everything.

      • dv35z 3 hours ago

        Have you found a decent bare bones starter theme? I've been using MkDocs Material, and I find the theme too complicated (HTML etc) - hoping to find a super simple one that looks decent - plain - and is a good base for theming / styling. Thanks & take care.

  • leoc 5 hours ago

    https://bsky.app/profile/leocomerford.bsky.social/post/3l7v6... To help the hard of clicking, this time I have pasted it all for you:

    Leo R. Comerford ‪@leocomerford.bsky.social‬

    Why was it decided not to build on any existing content-addressable networking system (IPFS or whatever)?

    November 1, 2024 at 12:39 PM

    ‪Leo R. Comerford‬ ‪@leocomerford.bsky.social‬ · 23d

    (Not implying that this was the wrong decision, it’s a genuine question.)

    ‪dan‬ ‪@danabra.mov‬ · 23d

    actually not sure i can answer this well. paging @bnewbold.net or maybe @why.bsky.team (who worked on IPFS btw)

    ‪dan‬ ‪@danabra.mov‬ · 23d

    my guess is that we’d want data hosting to be under direct control of the user (same as web hosting) rather than peer-to-peer, want instant deletion/edits at the source, need ability to move to a different host or take content down, need grouping into collections. not sure how much IPFS could adapt

    ‪dan‬ ‪@danabra.mov‬ · 23d

    we do use some pieces from IPFS though (aside from the actual peer to peer mechanism)

    bryan newbold @bnewbold.net · 4mo

    you can basically ignore it, we don't use "IPFS" proper anywhere.

    there are strong social connections, and we borrow some tech components like CIDs (flexible hash/digest syntax) and DAG-CBOR (more-deterministic subset of CBOR, good for signing+hashing)

    Bumblefudge @bumblefudge.com · 1d

    yeah this is all accurate. bluesky remixed a lot of IPFS components and patterns in interesting ways, but the monolithic global IPFS network (with chatty DHT distribution) wouldn't make sense here, BS made an infinitely more efficient/performant distribution of bytes tailored to its use case.

    Bumblefudge @bumblefudge.com · 1d

    FWIW the IPFS foundation is working on making IPFS more modular and easily remixed for future BlueSkies, but it's a big task decomposing the monolith and reorienting the documentation and ergonomics...

    [a second reply to the first skeet:]

    ‪Uai‬ ‪@why.bsky.team‬ · 23d

    As far as im concerned (and i led ipfs development for a number of years) we are using ipfs, just a specific streamlined implementation of it. All your repo data can be imported into an ipfs node and addressed via cid

    Uai @why.bsky.team · 23d

    We dont use libp2p because for a consumer mobile app we didnt want to futz with nat traversal and connectivity and the like, but its definitely possible to build a p2p version of bluesky
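
    On the CIDs mentioned in the pasted posts: a minimal sketch of pulling one apart, using the blob CID from the curl command elsewhere in this thread. It assumes a CIDv1 in lowercase base32 (multibase prefix `b`) whose leading fields each fit in one byte, which is a simplification of the varint encoding; `decode_cid_v1_base32` is a hypothetical helper, not part of any library.

```python
import base64


def decode_cid_v1_base32(cid: str) -> dict:
    """Decode a base32 CIDv1 into version, codec, hash function and digest."""
    if not cid.startswith("b"):
        raise ValueError("expected multibase prefix 'b' (lowercase base32)")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase omits
    raw = base64.b32decode(body)
    # Simplification: treat each leading varint as a single byte (values < 128).
    version, codec, hash_fn, digest_len = raw[0], raw[1], raw[2], raw[3]
    digest = raw[4:]
    if len(digest) != digest_len:
        raise ValueError("digest length does not match the multihash header")
    return {
        "version": version,
        "codec": codec,      # 0x55 is the multicodec code for raw bytes
        "hash_fn": hash_fn,  # 0x12 is the multihash code for sha2-256
        "digest": digest,
    }


info = decode_cid_v1_base32(
    "bafkreic5fmelmhqoqxfjz2siw5ey43ixwlzg5gvv2pkkz7o25ikepv4zeq"
)
print(info["version"], hex(info["codec"]), hex(info["hash_fn"]), info["digest"].hex())
```

    Because the digest is a plain SHA-256 of the blob bytes for a raw-codec CID, the same identifier can be recomputed by anyone holding the content, whichever host served it.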

    • echelon 3 hours ago

      "skeet" is such a terrible term for this. It's like mastodon "toot"s.

      Using bodily functions as core infra terminology is off-putting and feels a bit like a juvenile boy's club. I get that some people find it funny, but it alienates people. We should just call these "posts".

      Same thing with names like CockroachDB and GIMP.

      • xeeeeeeeeeeenu 3 hours ago

        The official Bluesky FAQ says this:

        >What is a post on Bluesky called?

        >The official term is “post.”

        https://bsky.social/about/blog/5-19-2023-user-faq

      • singpolyma3 34 minutes ago

        Even better: call them tweets. That's what they are.

      • leoc 2 hours ago

        Sure, whatever: I had certainly given it approximately no thought in this case, and my personal investment in 'sk**t' is zero. I'd edit my post but I seem to have hit the timeout. I will also say that I don't think this is the most interesting or on-topic thread to pull on from my comment.

      • bbor 2 hours ago

        Hard agree -- this one is especially bad because it's gendered. We'll see what happens, but I'd put my money on "post" winning out. There's some people on Bluesky who feel absurdly strong about this because of the history (the CEO asked them not to use it so they used it more often as a joke), but they're simply outnumbered already. Such is exponential growth...

  • bargainbot3k 5 hours ago

    History repeating itself sadly.

    • steveklabnik 5 hours ago

      In what sense?

      • bargainbot3k 2 hours ago

        I foresee another migration (away from bluesky) in some N years (< 10? 15?). not even trying to be bullish or anything. Just the same shit keeps happening since the WWW came to be. People seem to mistake altruism with just good ol’ business.

        Not sure why my parent comment was flagged. I guess when you can’t formulate a response, you flag and downvote? Is that the HN way?

        • noirbot an hour ago

          What is there to respond to? "This thing may happen at some point in the future" isn't insight or commentary.

          I can formulate a response, but it's already required more thought and effort than you seemed to have put into your comment. Engagement farming and bait isn't what HN is generally for.

        • ziddoap 2 hours ago

          >Not sure why my parent comment was flagged. I guess when you can’t formulate a response, you flag and downvote? Is that the HN way?

          It was a vague and negative one-liner, with no indication of what you were insinuating or why you think that way, from a brand new account.

          If you spent 30 more seconds to expand on what history you were referring to and why you think history is repeating, it would not have been flagged.

          • bargainbot3k 2 hours ago

            Are you a mod/admin here?

            • ziddoap 2 hours ago

              No, just someone who spends too much time here and am familiar with the trends.