19 comments

  • gnabgib 2 days ago

    Small discussion (10 points, 5 days ago, 4 comments) https://news.ycombinator.com/item?id=41867208

  • lazycog512 2 days ago

    I assume if it's in the open that it's going to be scraped and fed into the system, ToS or not.

    • DrillShopper 2 days ago

      Same, though I do wish there was a way to enforce copyright against the giant megacorps (specifically on training AI) that see everything on the Internet as just part of their profit-making empire.

      Though if I copied one of their things they'd bury me in court until I was either broke or dead.

    • reginald78 2 days ago

      I don't even think it needs to be in the open. I think the endgame for things like Windows Recall is to train on data on your local machine, and I'm sure they train on things in the cloud whether it's openly available or not.

      • neodymiumphish 2 days ago

        Apple Intelligence seems well positioned to provide some of the best functionality for radical personalization. With the potential for new devices to come with additional unified memory intended to run its LLM and vector databasing / additional training, it could use your specific writing style, time certain notifications based on when you’re least/most productive, etc. My guess is that they’re going to make this advanced functionality subscription based, since the vast majority of cases will require Private Cloud instances (unless _maybe_ you’re using a device with a significantly high amount of memory and a strong enough M-series processor).

  • unsignedint 2 days ago

    Many people seem to have skewed expectations, but posting on X is no different from publishing a blog post. Unless they're taking similar actions for private posts, this isn’t too surprising. In fact, X is arguably more transparent about it. (Other platforms might not explicitly mention AI, but often include terms in their ToS that allow similar practices.)

    It wouldn’t be surprising if Facebook is doing the same, provided it only applies to public posts. Ultimately, if you don’t want your content scraped from the internet, the best defense is not to post it at all.

  • archagon 2 days ago

    If I prepend “by reading this message, you agree to not use it for AI training purposes” to my Tweet, why is that any less legitimate than the ToS I implicitly agree to by using Twitter?

  • rsynnott 2 days ago

    This seems like a particularly bad move, because:

    - The content is, er, not what you'd call high-quality.

    - Artists generally _hate_ genAI. Like, really, really, viscerally hate it. They're gonna lose whole communities over this.

  • rchaud 2 days ago

    I wonder what the ratio of "real human" posts vs mass-produced botspam is like in that dataset. Probably looks like the inside of a mortgage-backed security in 2006.

  • silisili 2 days ago

    What's it called when bots start learning primarily from other bots and get stuck in a loop, no longer acquiring any real new intelligence?

    • amenhotep a day ago

      Model collapse. "No longer acquiring any real new intelligence" would actually be a big breakthrough, I think - with current techniques we don't just stop improving, but start degrading. If LLMs are blurry jpegs of the entire corpus of human knowledge, then it's easy to imagine what happens when you start making a jpeg from a jpeg.

  • cyanydeez 2 days ago

    I'm sure in a few years X will be the dead internet.

  • ElonChrist 2 days ago

    [dead]

  • jayantbhawal 2 days ago

    tl;dr for those who don't want to open CNN:

    X's new terms of service, effective November 15, 2024, now allow the platform to use public posts to train its AI models. Users' content can be collected and adapted for various uses, which has raised privacy concerns.

    • jpl56 2 days ago

      Is it only for public posts, or also private ones?

      I wouldn't post private information in a public area, but I happen to exchange addresses or account numbers in private messages, as I would do in emails. Not on X since I'm not on the platform, but any other one will do the same if not already done (e.g. Reddit).

      • beretguy 2 days ago

        I use private Twitter posts as a password manager, so this new development is very concerning to me.

        ... just kidding. There needs to be an xkcd about something like this.

    • Sohcahtoa82 2 days ago

      > public posts [...] privacy concerns

      Someone please explain to me how someone would raise privacy concerns over things they've chosen to make public?

    • beretguy 2 days ago

      > tl;dr for those who don't want to open CNN:

      Thank you for your sacrifice. I have it blocked on a DNS level.