Open washing – why companies pretend to be open source

(theregister.com)

89 points | by Brajeshwar 5 hours ago

47 comments

  • martin-t 3 hours ago

    The second goal is muddying the waters and making people not care.

    Say you're deciding between two programs (or AI models)[0], you prefer an open source one, a colleague prefers one that just pretends to be open. You say your choice is preferable because it's open, he says the same about his choice. Then you say the dreaded "well, actually" and either you sound like a fundamentalist or an asshole.

    [0]: None of those are truly open source because they're all trained on stolen data. And see? Now I sound like a fundamentalist.

    • Spivak an hour ago

      I'm not sure why training on stolen data would disqualify them if said data was available or, at minimum, accurately specified.

      • youoy 26 minutes ago

        If the (stolen) data is available to download, OK, that would fit an accurate definition of an open AI model. But "accurately specified" would not, because you would need to trust that the person specifying it is doing so honestly. And I think we all know what happens to that honesty when economic interests are in play.

  • neilv an hour ago

    Open source was always a corporate-friendly compromise, but it seemed like some of the people involved had a lot of integrity.

    What we need is those open source people with integrity to put the smack down on those willfully abusing and destroying the terms.

    If you can't do it with trademarks/certifications/licensing/memberships/etc., do it with mainstream journalism. That might be what is happening here, except The Register has long had rare insider knowledge, and is relatively niche. You need to get the message out to everyone who's not already in the know, including lawmakers.

    (Incidentally, the FSF also has integrity, but, besides prompting open source by being zero-compromise -- which is fine in their case -- they have an additional challenge of seeming to be clinically incapable of advocacy in situations that are aligned.)

  • bubblesnort 3 hours ago

    Open source never had any of the ethics or philosophy that free software has.

    Free software > open source.

    • trehalose 3 hours ago

      Do you think, if open source never existed, if there were only free software and non-free software, we wouldn't be arguing about whether AI corporations can truly call their free models free?

      • mrweasel 3 hours ago

        Companies always seemed much more wary of "free software" as compared to open source, probably because of the ambiguous meaning of "free" in English. Honestly, that is one of the reasons we have open source as a concept.

        Companies like the flexibility in "open source", even companies who release code as GPL rarely talk about "free software", they are open source companies.

      • pessimizer 2 hours ago

        How could we? Free Software makes it clear that when you modify the Free thing and productize it, you have to share the modifications with the public under the same licensing. What's there to argue about? You're either doing that or you're not. If you find a loophole in the text, then the license gets updated, the loophole explicitly closed, and everybody who agrees moves to the new version.

        • Ekaros an hour ago

          Free is an ambiguous term. It might be free in code and price. Or it might be free in price, but closed source. It could be free for me as a private person, but not for business.

          Is freeware free software? It is a rather murky term for me.

        • arccy 2 hours ago

          based on current license choice of projects, turns out most people don't agree...

    • mistrial9 3 hours ago

      in English, the word "free" has not served well.. suggested alternative "libre" ... oh, except LOSS does not sound great! seems challenging right now.. "free" has failed IMHO .. it is literally mocked by finance people no? every adult in the US and elsewhere must pay bills.. "free" is failing as a label

      • homebrewer 2 hours ago

        Probably should have called it "freedom software" like "freedom fighter" or "freedom units" (as opposed to metric units).

        • bubblesnort 2 hours ago

          It's not too late for that.

        • Ringz 2 hours ago

          Don’t forget „Freedom Fries“.

        • anthk an hour ago

          Fair software.

      • pessimizer 2 hours ago

        Free Software has been wildly and unimaginably successful, and undergirds the world economy.

        • mistrial9 an hour ago

          certainly agree (to clarify)

  • scirob an hour ago

    an egregious example is thirdweb, who technically has the product open sourced, but it is written to not work without an API key and to phone home to their SaaS to check your API call limit.

    https://github.com/thirdweb-dev/engine?tab=readme-ov-file https://portal.thirdweb.com/engine/self-host

    It makes me sad because I was working on getting a team together to build a real open source and free alternative, but once they found thirdweb they all got discouraged, thinking that no one will understand why our real open product is different

    • josephcsible a few seconds ago

      If it's open source, can't you just fork it and remove that antifeature?

  • an_d_rew 3 hours ago

    I have worked at multiple companies that vilified open source anything, while building their entire businesses on Linux, Java, Debian, and thousands of other pieces of "OSI Approved" software.

    It's because, in my experience, the majority of businesses want to take but do not want to feel any obligation to give back or support.

    • Aeolun 3 hours ago

      Most businesses are started to earn money. Using free stuff while not giving anything away seems perfectly in line with those goals.

      • pessimizer 2 hours ago

        Which was the entire purpose of Open Source, from conception, and the only way it is distinct from other licenses. Open Source is like Free Software, except you can use it without giving anything away.

        • dragonwriter 2 hours ago

          > Open Source is like Free Software, except you can use it without giving anything away.

          No, Open Source and Free Software are two names for essentially the same thing. The Free Software Foundation has a preference for licenses which go beyond its own Free Software Definition [0] and which are also "Copyleft" [1], but does not define Free Software in a way which requires that it also be Copyleft.

          [0] https://www.gnu.org/philosophy/free-sw.en.html [1] https://www.gnu.org/licenses/copyleft.en.html

  • ahaucnx 38 minutes ago

    I believe often companies or rather decision makers are afraid of going fully open-source because they invested a lot of money into the product and are afraid some other company uses it, offers it cheaper and ultimately harms the originator.

    So even though they might believe in open-source, they put protections in place that ultimately lock it down and thus make it closed source, while trying to keep the impression of being open.

    In our journey at AirGradient towards becoming fully open-source hardware (all code and hardware licensed under CC-BY-SA), we had the same concerns but ultimately decided to go all-in and open up everything with an officially approved open-source license.

    I believe there are a few important aspects and "protections" that are open-source compatible that help companies protect their investments.

    Firstly, requiring Attribution is compatible with open-source and can help companies get a lot of visibility and competitors probably don't want to attribute another company and thus are often not likely to clone.

    Secondly, using a share-alike license also makes it unattractive for many other companies using the code.

    Lastly, I believe the code itself is often not the valuable part compared to the brand value, employees, reputation, business model, network and implicit knowledge that a company builds up.

    It really worked for us to go that way with a true open-source license and I hope many others will do it too.

    There are already some easy to understand licenses like CC in place and I do hope that they also create awareness around "open washing".

  • tzs 2 hours ago

    > The Open Source Initiative (OSI) spells it out in the Open Source Definition, and Llama 3's license – with clauses on litigation and branding – flunks it on several grounds.

    Anyone know specifically what he is talking about here?

    The only things I'm seeing that I would consider to be clauses on litigation are one that terminates your license if you sue them claiming Llama 3 or its output violates your IP, and a choice of venue and choice of forum clause.

    Several OSI approved licenses have "terminate on patent suit" clauses. Llama 3 is termination on IP suit rather than just on patent suit but I don't see anything in the OSD where that would make a difference.

    There's stuff about trademarks, which I assume are the branding clauses he mentions. But I don't see anything obvious on the OSD that such clauses violate.

    • simonw an hour ago

      The Llama 3 license has all sorts of hokey extra clauses in it:

      From https://www.llama.com/llama3/license/

      > If, on the Meta Llama 3 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.

      This seems harmless... until you ask what happens if you start a startup on top of Llama 3, do really well and later try to get acquired by one of the companies that had more than 700m active users on that date (Apple, Microsoft, Google etc)

      > You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Meta Llama 3 or derivative works thereof).

      That's a pretty huge restriction on ways you can use the models. The language "to improve any other large language model" is also incredibly vague.

      > (B) prominently display “Built with Meta Llama 3” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama 3” at the beginning of any such AI model name.

      I love this one, it means that if you fine-tune a model for erotic furry fan fiction you HAVE to call it "Llama 3 Erotic Furry Fan Fiction Writer" or similar.

    • pessimizer 2 hours ago

      But they said "several grounds" in the article. Isn't that enough? Why would you expect them to explain exactly where and how? A license is just a vibe anyway, it's the spirit that's important.

  • mirekrusin 2 hours ago

    True, this needs clarification that currently doesn't exist for large models where training costs heavy millions and binary artifact is both precious and malleable – unlike ordinary compilation.

    Regardless if – once OSI establishes their definition(s) – Meta will choose path of adherence or not, they still deserve a paragraph of praise for what they're doing.

    As a side note OSI should also recognize that in the era of giant cloud providers protection from predatory market participants is also a thing and should exist as clear licensing option. Mongo, Elastic and Redis drama could be avoided in the future if there was a clear option to protect author side sustainability without affecting open source spirit for end users.

    ps. I also believe that "Open <something>" should be protected phrase similar to how "Police", "Federal", "Government" or "Organic" is protected to not mislead the public so we don't have things like "OpenAI" nonsense.

  • Sytten 3 hours ago

    Direct consequence IMO of our failure to popularize good licenses under another concept like fair source, which sits in between open source and closed source. My small non-SaaS bootstrapped company could not survive if it was OSS, but maybe it could as fair source.

  • kvemkon 2 hours ago

    Related:

    OSI readies controversial open-source AI definition (26.10.2024)

    https://news.ycombinator.com/item?id=41951421

  • teddyh 2 hours ago

    Cue the several weasels who regularly turn up, arguing that “Open Source” can mean whatever they say it means, since they don’t accept the OSI definition.

    • ffsm8 2 hours ago

      I get where you're coming from, but there is truth to that - especially in English.

      As a random example, when is the last time you heard the term "racism" in the context of someone actually discriminating against a person according to their race? I can't even remember the last time that happened. It's always some form of perceived discrimination, but the discriminator is usually nationality, culture, etc. - never actually the race. And that's with the term racism actually including the word race.

      There are countless examples like that because the words ultimately mean what the population in general thinks they mean vs what it initially meant.

      It's admittedly frustrating, however, if you're aware of the original definition and people just randomly redefine the meaning. I've experienced that quite a few times at this point

      • teddyh an hour ago

        It’s different when it’s a regular word, used for ages. However, the term “Open Source” (as applied to software) was created by the OSI to explicitly mean exactly the OSI definition, no more, no less. The OSI definition was based on the Debian Free Software Guidelines, which Debian had to write because, IIRC, at the time not even the FSF had a strict definition of what constituted free software, and Debian needed some lines to be drawn in order to know what they did and did not want to distribute on Debian CD:s. Claiming something is “Open Source” but not OSI-approved is like claiming something is “legal” just because you personally think it’s acceptable, even when the actual law does not agree. Some terms come with strict definitions.

        • evanelias an hour ago

          > the term “Open Source” (as applied to software) was created by the OSI

          This is historical revisionism, and it's especially terrible that you'd call people "weasels" for correcting it. The term "open source" (as applied to software) was in-use prior to the existence of the OSI, and that's explicitly why the OSI wasn't able to obtain a trademark on the term. The term meant something roughly equivalent to how we use "source available" today.

          Read https://dieter.plaetinck.be/posts/open-source-undefined-part... for a really good deep-dive into the prior usage of the term.

      • JambalayaJimbo 40 minutes ago

        “Race” is itself a vague and almost meaningless term.

  • simonw 2 hours ago

    "Would it surprise you to know that according to the study, the big-name ones from Google, Meta, and Microsoft aren't? I didn't think so."

    Microsoft has a decent LLM that I'd consider to be "open source": Phi-3.5, under the MIT license: https://huggingface.co/microsoft/Phi-3.5-vision-instruct

  • rvnx 3 hours ago

    At the same time, Facebook is making some of the best efforts toward open AI, so it's a bit hard to blame them. They are not perfect, but they still shared the most important artifact created out of tens of millions of USD spent (or even more). They did not share the dataset, but it is really a major advance.

  • lordofgibbons 2 hours ago

    > The pair found that while a handful of lesser-known LLMs, such as AllenAI's OLMo and BigScience Workshop + HuggingFace with BloomZ could be considered open, most are not.

    It's absolutely wild to think the deranged BigScience RAIL license, under which the Bloom LLM was released, is open in any way shape or form. It has more user-harming restrictions than basically any other LLM license out there.

  • gradientsrneat 2 hours ago

    Article commenter points out that Meta is a funder of the OSI. We'll see if that affects how the OSI defines "open" AI models.

    I find it funny how OpenAI was only indirectly mentioned. Still, I'm glad that this columnist is taking a principled stance by arguing against one of the more borderline cases.

  • rietta 3 hours ago

    I attended the referenced talk by Dan Lorenc in Alpharetta this week. It was very interesting. He hammered on how many licenses flunk the OSI test despite claiming to be open source.

  • meehai 2 hours ago

    I think Open Weights is a better name for AI models that don't share the reproducible training scripts and data.

  • blackeyeblitzar an hour ago

    It’s easy. They’re draining the phrase “open source” of meaning while gaining by marketing themselves that way. It’s fraudulent but also just exploitative.

  • stonethrowaway 2 hours ago

    I’ve commented on these moves and jukes a few months ago. In the spirit of not reposting, the original is here: https://news.ycombinator.com/item?id=41090142

  • nmstoker 3 hours ago

    See also: "AI Washing".

    Externally done to give a kick to sales efforts.

    And internally done in an attempt to get someone with AI resources to build blatantly non-AI functions by sticking them onto something with no or very little genuine AI angle.

    • BerislavLopac 3 hours ago

      To be fair, any product that relies on "AI" in naming or advertising is "washed" in some way -- AI is simply a marketing term, not a technical one. Especially considering that it covers so many different (albeit related) things -- LLMs, image generation, computer vision, machine learning etc -- that it has become completely void of any useful meaning.

      • marcosdumay 3 hours ago

        AI is pretty much a technical term.

        It's just a very wide category that is basically meaningless nowadays because it applies to everything.

        Marketers are even restrained in using it, because applying it everywhere it could go would sound insane and cringe. But it is a technical term, that technically applies to all those things people put it on.

  • cranberryturkey 5 hours ago

    heh. i've seen this a lot lately.