Their response:
> The team that made dataroom has stated that they did not use any of papermark’s code and that dataroom was made from scratch with inspiration from existing document sharing softwares, and that this post’s allegations of us stealing code are false. [...]
The screenshots clearly show they copied whole pages verbatim, both design and texts. The founder, Nico Laqua, basically responding with "we didn't copy _code_" and not taking any responsibility says a lot about his and his company's moral code. It might not be enough to get sued. That doesn't make it right.
https://x.com/nico_laqua/status/2070158170937581951
I did an interview a couple years ago when Corgi was first hiring engineers. Nico and I ... did not click and it was probably the least smooth interview I've ever had despite it just being a phone screen.
I wouldn't be that surprised if Nico genuinely thinks "we didn't copy the code" is a reasonable defense. It would be a clear-cut rule, and extreme "shape rotator" types often have trouble with the fuzziness of things like law. In reality, copyright infringement is often more like the porn test: you know it when you see it.
I never made it to the interview phase because on the phone screen they mentioned they all work 7 days a week in office. nope nope nope nope.
We should thank companies for warning us during the interview process that they are so separated from reality (especially in the AI era).
If AI can’t make them recognize that work-life balance has value, then it’s easy to see they don’t believe the “force multiplier” BS they are peddling.
License in question: https://github.com/papermark/papermark?tab=License-1-ov-file
It is AGPL, which basically means:
You have to share the source code even when the user only interacts with the software over the network.
Any project that uses that code must also be AGPL.
There are ways to separate it and work around it. For example, using an AGPL auth server shouldn't affect the code where your business logic lives.
I am sure they could have found a way to design their product to be compliant, especially following past drama.
This is assuming the code was indeed copied, which we don't know for sure. It does look very similar, but I am not sure how that is enforced.
They probably need to sue to enforce this. I think this is actually going to be a larger issue than just Corgi. Copyright with these models really is just a mess.
What I don't understand is whether, if a lawsuit happens, the plaintiff must produce their source code for verification. Even so, a git tree is trivial to change into some other arbitrary code even if a license violation has occurred. I also heard that, if proven, the consequence is that they would lose all revenue starting from when the violation occurred.
Hey Claude, copy XYZ, make no mistakes.
The meme keeps on memeing.
What a wonderful world we live in where we can blame machines and extremely diluted processes for all things we might do wrong.
Tech will do anything to normalize theft and call it innovation.
Since the tweet is short and a lot of people aren’t reading it (Twitter links sometimes don’t work well for those without an account), I’ll quote it here:
> Hey Nico,
> It looks like you didn't vibe code your data room but stole it from Papermark's open source and enterprise-licensed code.
> We demand you take this copyright and license infringing product down immediately.
> It's not moving fast and breaking things, it's fraud.
> It makes the rest of your business questionable and the YC community look terrible.
Missing context.
What a scumbag. The replies from Nico are insane:
“Team effort”
“:praying-hands (x2)”
And so on… The audacity and complete shamelessness…
I wonder what narrative they tell themselves.
I wonder if Nico will be feeling so cocky when Papermark gets their general counsel involved. The public Twitter shaming was clearly an attempt to resolve this without litigation, but hey, if that's how Nico truly feels, guess he gets to see what's behind door #2 (a massive bill for a legal retainer).
I am curious how this will play out legally.
Surely the UI alone isn't enough to prove that source code was plagiarised?
In the event Papermark chooses to sue, how will the defendant defend themselves short of presenting their own (possibly) closed source?
> I am curious how this will play out legally
I am curious if/how YC will handle this to get ahead of earning a reputation of being a den of scammers - a few months after the Delve scandal
Most likely, Papermark would compel Corgi to disclose the source code during discovery.
I didn't realise that one could forcibly require a competitor to disclose trade secrets.
Now, IANAL of course, but I would think this sort of mechanism would be quite gameable from both sides: (i) a wealthy competitor legally forcing a promising upstart to reveal source, or (ii) a copycat working out some kind of arrangement where the code itself is licensed to them via a shell company based overseas.
As with most legal hacks, the courts figured this one out long ago :).
If someone is trying to dig into their competitor's trade secrets via discovery, the court offers multiple ways to safeguard against that. The defendant can identify information as a trade secret and ask that it be protected in some way - for example, the documents may be restricted to "Attorneys' Eyes Only", so while the plaintiff's attorneys can review the material, the plaintiffs themselves are barred from reviewing it. Or the judge themselves may get involved in an in-camera session.
There are software engineers that specialise in source code analysis that lawyers will often use in these cases. The engineers will be given access to source code in secure environments where they're not allowed to bring any device in or out. They review, analyse, and write up a report using pen and paper, that can then be reviewed by the lawyers.
Absolutely. It was very similar to one of my first jobs: "Legal Technical Analyst". Not as much time doing deep source analysis, but basically translating things for lawyers: "So as far as this claim of copyright/plagiarism... this block here, that's CS 101 stuff, that block there, that's novel, and does x, y and z".
Sounds expensive. *(sleaze ball hands)*
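For a rough sense of what a first pass at the source comparison described above could look like, here is a minimal sketch. It is only an illustration, not what forensic analysts actually use: the function names (`tokens`, `shingles`, `jaccard`) and the window size `k=4` are made up for this example. It tokenizes two snippets, ignores whitespace, and measures how many overlapping runs of tokens they share.

```python
import re


def tokens(source: str) -> list[str]:
    """Split source text into identifiers, numbers, and single punctuation marks.

    Whitespace is dropped, so reformatting alone does not change the result.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def shingles(source: str, k: int = 4) -> set[tuple[str, ...]]:
    """Return the set of all runs of k consecutive tokens (hypothetical default k=4)."""
    toks = tokens(source)
    return {tuple(toks[i:i + k]) for i in range(len(toks) - k + 1)}


def jaccard(a: str, b: str, k: int = 4) -> float:
    """Jaccard similarity of the two shingle sets: 0.0 = nothing shared, 1.0 = identical."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two snippets too short to shingle are treated as identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = "def add(x, y):\n    return x + y\n"
    reformatted = "def add(x,y):   return x+y"
    unrelated = "class Foo: pass"
    print(jaccard(original, reformatted))
    print(jaccard(original, unrelated))
```

A score like this is easy to defeat by renaming identifiers or reordering code, which is part of why a human expert still has to separate "CS 101 stuff" from the genuinely novel blocks.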
Ah another YC popcorn fest
The X link has screenshots where the two products have lots of identical pages. Is that IPable? Honestly don't know since I seem to use a lot of products that look like other products (LibreOffice, etc). But the pages for obscure things looking identical is kind of sus.
What's with this response in the Twitter thread??:
"This ain't what a C&D looks like. Implies you don't actually have a leg to stand on. Upload a copy of your official legal demand (from a lawyer) or I'll forever see your company as one who attempts to bully the competition in public"
-- https://xcancel.com/jacobhartmannx/status/207012600834729596...
Is this just trolling?!
What a bizarre complaint! It's not bullying to first try to resolve the matter informally rather than jumping straight into legal action.
Besides - who is this guy, and why does he think he's owed sight of any legal paperwork?
He seems to be a bullshitter and partially fake. Just take a look at his LinkedIn profile.
Look at his other tweets, he seems to be a sociopathic extremist
Or a troll? I'm so confused.
Folks... read the actual tweet. They literally didn't vibe code it - they copy-pasted another project.
Yeah, the title that the OP chose is misleading enough that I think it will need to get changed by the mods. Seitz isn't opining on the ethics of vibe coding in his tweet; he's pointing out that Corgi literally just stole Papermark's AGPL codebase and passed it off as vibe coding.
It's nearly word-for-word the content of the tweet. Right at the top. It isn't misleading unless you literally don't even bother to open the linked content.
Just ban users who comment without reading, I think that would go further to keep the quality of discussion high.
The number of bots/trolls responding to the title without reading the content and missing the point entirely is astounding, honestly, and I don't think any of those posts are contributing to high quality discussion. We could do without those users.
"but but but I can't/won't open twitter links" - then don't flap your yak-hole. Ignoring for a moment that the content has been reproduced in full in this thread, and another user has provided an alternative xcancel link.
It’s an intentionally misleading title, using “you” to imply that the reader is guilty of theft.
An honest title would be “Corgi didn’t vibe code it, they stole Papermark’s AGPL code”.
Sure, people should read links, but when a writer posts ragebait for engagement, there’s plenty of blame to go around.
You’re giving me too much credit if you think I was being sensationalist and trying to make it more clickworthy. I couldn't succeed in that if I tried.
I was mostly fighting the title character limit.
Ideally yes, but we know people don't RTFA - there's a reason that initialism dates back to early Slashdot.
The paraphrase is doing a lot of heavy lifting to convert it to ragebait. Had the OP gone with something like "you didn't vibe code it, you plagiarized Papermark's open source project" (may need some editing to fit under the character limit) it would have at least been more true to the original tweet.
I know I RTFA, and I know I'm not interested in discussing things with people who don't. Maybe others feel differently, because more people is better or something. Information pollution is a serious, persistent, growing problem and I'm just not inclined to be tolerant about it anymore. Mistakes are one thing, deliberate stupidity is another.
If you come to book club without reading the book, and you derail the conversation into something completely irrelevant, you're not getting invited back.
Wait just a second, that's not how to use HN. You're supposed to read the title -> get upset and write a comment -> argue.
I remember a few cases when asking an LLM to do something in the early days yielded not only the code but an author and a COPYRIGHT license.
Naturally LLM technology has moved on since then. I don't remember any recent word for word reproductions of a copyright license.
There are a lot of people lauding the technology though because it occasionally one-shots a wildly impressive example of something which...already exists.
Vibe stole it?
Probably just stole it by the looks of those screenshots.
Yeah I mean like git clone the repo then "hey LLM rip off this code, make no mistakes"
Same thing https://githubcopilotlitigation.com/
I'd suggest replacing that link with https://xcancel.com/mfts0/status/2070080422482977095
And maybe reword the submission title while they're there, though the current one is well chosen for maximizing engagement I'm sure.
Gonna have to see the agent trace on that one.
Unless you don't copy the license terms, it's impossible to "steal" open-source code. That's... sort of the point.
Many open source licenses levy restrictions upon the acceptable use of the software. Those restrictions may include attribution requirements, up to and including a requirement to include the license when redistributing the code; they may forbid using derivative works for commercial purposes; they may require the downstream project to utilize the same license. Open source is not the same thing as "anybody can do anything they want forever."
> they may forbid using derivative works for commercial purposes
The most widely used definitions of “open source” do not allow such a prohibition.
Yup, if we take OSI as the de facto authority on the definition of open source:
> 6. No Discrimination Against Fields of Endeavor
> The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
https://opensource.org/osd
> Unless you don't copy the license terms
You edited your comment while I was replying, and merely copying the license does not cover many other possible restrictions.
I didn't edit anything.
I did choose the wrong word, though. Comply, not copy.
Well, if it's my memory at fault then I apologize. My memory of the comment I replied to didn't include the initial qualifying phrase with either word choice.
So, by definition, you did edit it to change the typo.
>So, by definition, you did edit it to change the typo.
Their comment still says "copy". The comment you are replying to clarifies that they meant to type "comply", not "copy".
Since the wrong word is still there, 'by definition' they have not edited it.
Ahh, I misread it.
Papermark is AGPL; Corgi must release all its changes.
That means they're not complying with the license terms. Which would be stealing. Like I said it would be.
Copyright violation is not theft. Your effort to create something that can be effortlessly copied conveys to you no property. Society deems it beneficial to grant a time limited monopoly on copying it to spur innovation.
You wouldn’t steal a car!
Stealing a car - or anything tangible - means... the owner is very literally deprived of the benefits of owning said car/thing. Can't really say the same for a copied pattern of bits.
But I would download one.
That's not what you said. You said "copy the license terms". Copying a license isn't the same as complying with one.
Though it looks like in this case they didn't do either.
So we're in violent agreement then?
Brutally violent agreement. kicks shin, shakes hand
It's really hard to not assume this is intentional ragebait.
A cursory look reveals they aren't complying. So, as you say, they are stealing. What's the point of this comment?
Copyleft is still a thing. Right to attribution is still a thing. Please, read about it and you will discover that there is a lot of nuance to the open-source code.
You didn't code it, you stole it from open source OS and compiler maintainers
"before Bison version 1.24, Bison-generated parsers could be used only in programs that were free software."
https://www.gnu.org/software/bison/manual/html_node/Conditio...
Stealing it for your use case would take more effort vibe coding. The term is fine as is
LLM-generated code could have a very similar pattern to existing code under a stricter license that it was trained on. So, it's better to keep them to yourself instead of bothering the public.
It is not possible to steal something which doesn't obey conservation laws. Don't try to scam physics, it always wins.
Close your source if you don't want it to be read by LLM
That's not how licenses work. Papermark is AGPL.
I agree. It's a sarcastic take on the new reality. What is copying vs. writing from scratch? The line is blurred now, non-existent. You can ask an LLM to rewrite any open source project to a degree where there is no definite way to say that it's a derivative.
"If Disney wants to retain their rights to Mickey they really shouldn't be showing any images of him to the world."
Don't care. Competition is good for consumers.
It is, but this isn't competition. This is just copyright infringement.
Competition would be if these people created their own software, possibly innovating and improving it in the process. That would encourage Papermark to improve their own offering, and would create an environment where these businesses are economically incentivized to improve the product or service.
Nobody is incentivized to improve the software in question here. If copyright law doesn't protect anything, then improving your product is helping the competition and potentially hurting your business. Same is true if you're the people who did the infringement.
Let’s not even talk about the feature. Copying the entire visual design itself with superficial tweaks is pretty brazen and, frankly, incredibly lazy.
Who cares if the consumer buys it and uses it? Information is worth nothing anymore, attention is, so if they manage to capture a larger audience somehow, they win.
> Information is worth nothing anymore
What do you do for a living? For most of us in the tech industry, information being worth something (because it takes creative and intellectual labor to produce) puts food on our tables.
LLMs have produced about 95% of the code at my company, and reviewed about 70% of it, for 3 years now. Our team has downsized from 40 to 8 people in this time. My creative labor is spent writing harnesses and wrappers. When there is enough of a data distribution on this, the LLMs will be able to do that as well.
I have saved up a buffer in funds and bonds because it's going to be over at some point when the company moves from explore to exploit.
When it plays fair, sure. Not when it steals.
When competition has no rules it resorts to people banging each other over their heads with clubs.
People argue for fewer regulations until they are the ones eating crow.
When everyone is using LLMs to suggest IA, build basic UIs, dump out your startup in a day, etc. everything will look the same, even the source code. There will be no way to litigate this. Does it benefit society to force two companies to make their products look different? Where’s the outrage over all basic pencils looking the same? Let the market decide which pencils it prefers.
It sounds like you are taking a side.