GoLic, injects license into source code files

(github.com)

19 points | by kure256 2 days ago

15 comments

  • gtirloni 2 hours ago

    I've opted to simply add the SPDX license identifier [0] , just like it's done in the Linux kernel [1]

    [0] https://spdx.org/licenses/

    [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
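A minimal sketch of the one-line approach described above, assuming a `//` comment prefix and a tag on the first line (the Linux kernel uses this form for C source files). The function name `add_spdx_header` and the five-line lookahead are illustrative assumptions, not part of GoLic or any SPDX tooling, and shebang lines are not handled.

```python
import re

# Matches an SPDX tag inside any comment style, e.g. "// SPDX-License-Identifier: MIT".
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(\S.*)")


def add_spdx_header(source: str, license_id: str, comment_prefix: str = "//") -> str:
    """Return source with an SPDX tag line prepended, unless one already
    appears within the first five lines (hypothetical helper)."""
    head = source.splitlines()[:5]
    if any(SPDX_TAG.search(line) for line in head):
        return source  # already tagged; leave the file untouched
    return f"{comment_prefix} SPDX-License-Identifier: {license_id}\n{source}"


go_file = "package main\n\nfunc main() {}\n"
tagged = add_spdx_header(go_file, "Apache-2.0")
print(tagged)
```

Running the helper a second time on already tagged text returns it unchanged, so it could be applied repeatedly across a tree.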

  • jchw 4 hours ago

    Neat. There are a lot of hand-rolled implementations of this idea; it would be nice to have something to standardize on. I am sure it can be done with custom templates, but a good idea IMO might be supporting declarations using SPDX IDs. You see them in some source code, e.g. KDE source code. More info here:

    https://spdx.dev/learn/handling-license-info/

    For anyone wondering if this (license information in source files) is necessary, I think the answer is "maybe". Some licenses (e.g. Apache 2) seem to be written such that the license itself requires the disclaimer, and even having copyright information (e.g. users that make substantial contributions adding the name of whoever is assigned the copyright for their contribution to the header) is a good idea. I used to be against this for aesthetic reasons, viewing it somewhat similarly to those annoying corporate email footers, but over time it's become more obvious to me that it not only is great for keeping the license very explicit everywhere but may also be legally a good idea. (IANAL.)

    • Tomte 2 hours ago

      There is a related oddity in GPL 2: when distributing modified source code you must put a mini changelog in the file itself (clause 2. b).

      Nobody does that anymore. We have git, and before that we had SVN and CVS and so on.

      The legal commentaries I know simply say "in principle that requirement is legally valid and you must do it this way, in practice no programmer seems to do that, so shrug".

      • jchw 2 hours ago

        > There is a related oddity in GPL 2: when distributing modified source code you must put a mini changelog in the file itself (clause 2. b).

        Are you referencing the right license/clause? I don't see how GPLv2's clause 2. b would require this.

        > Nobody does that anymore. We have git, and before that we had SVN and CVS and so on.

        Yep, I will agree with you that pretty much nobody actually does this, and it does not seem like it is an obstacle so far, e.g. I have not seen legal contention over this, mostly just discussion.

        And honestly, writing a mini-changelog definitely seems like overkill with version control, and perhaps in most cases version control metadata is a perfectly acceptable substitute. However, since the file(s) might be distributed outside of version control where the version control data might not be present (e.g. a release tarball), having at least the copyright information in each file seems useful. Whether it satisfies the "prominent notice that the file is changed" requirement is actually not 100% certain, but I can't imagine it puts you in a worse position to do so.

        • Tomte 2 hours ago

          Yes, I meant 2. a:

          "You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change."

      • tempfile an hour ago

        I don't think clause 2b says that, do you perhaps mean clause 2a? In GPLv2 that says:

        > a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.

        while in GPLv3 it says

        > a) The work must carry prominent notices stating that you modified it, and giving a relevant date.

        Note that git/svn are not always relevant. In particular, it is not uncommon to distribute release code with the .git directory stripped - this does not excuse you from the requirement.

        Having said that, I had a quick glance at Linux, and picking a file at random it did have a copyright header, but certainly not one that included a record of every change. https://github.com/torvalds/linux/blob/master/init/calibrate...

        So it doesn't look like this is peculiar to v2. But it does seem like people don't follow the letter of the requirement. I wonder if the FSF has ever clarified this.

    • tempfile 2 hours ago

      > for anyone wondering if this is necessary

      It's not necessary (for you). But if you want to share your work, it can be very important for those people you share it with!

      The reason is quite simple. Some downstream user, who should be able to use your code, may not have it conveyed to them in the way you imagine. They might only receive a single file out of the project, since that's what they needed. Technically, the person who distributed the file to them has violated the license by not sending the license text in-band. This is not good for the community, since it produces confusion about who is allowed to use free software -- it should be everyone, not just people who understand the details of licensing.

      • VonGallifrey 10 minutes ago

        > Technically, the person who distributed the file to them has violated the license by not sending the license text in-band.

        That person might also just send parts of a file instead of the entire file with the license at the top.

    • chrismorgan 3 hours ago

      > Some licenses (e.g. Apache 2) seem to be written such that the license itself requires the disclaimer

      I don’t believe this is actually true. These instructions are not in the actual license terms and conditions, but rather in a distinct information section after that. In GPL-3.0, “How to Apply These Terms to Your New Programs”; in Apache-2.0, “How to apply the Apache License to your work”. My understanding is that such prescriptions and sections are not normative.

      From a pure copyright law perspective: no, you definitely don’t need to put the license in each source file. Copyright is automatic these many years, so you don’t even need a copyright line; and license can be (and practically always is) independent of the code. This would obviously¹ hold for warranty disqualification too. Bigger businesses may like to do it for their own convenience of license management, and individuals or groups may like to do it on their work in case individual files are lifted (… as distinct from smaller units, or even taking a file and stripping the header), but basically the era where this sort of thing could arguably be relevant as a mandate is long-past.

      There’s even more definitely no need for a dozen lines. If you really want to put anything, one line for a copyright declaration and one line for SPDX-License-Identifier seems fair.

      As for copyright lines⸺ bah, they’re such a bunch of drivel. The way people use year ranges, or just bump the year, it’s almost all such legal nonsense. As in, “if this stuff actually mattered, you’d probably have lost your copyright protection” nonsense. The fact of the matter is that copyright year stuff wasn’t designed for such easily-edited stuff as software. It was designed for “first edition copyright 1925; renewed 1935; second edition copyright 1945”, that kind of thing.

      —⁂—

      ¹ “Obviously” here means what a normal person would mean; but I acknowledge that some jurisdictions sometimes hold positions that are obvious nonsense.

      • jchw 2 hours ago

        Firstly, if you are not a legal professional, please temper your statements somewhat. There are very few absolutes when it comes to interpreting legalese, and what is true for one jurisdiction may not be true for all of them. (Obligatory: yes, I am aware of the existence of international copyright treaties/Berne Convention.) That said, as I denoted, I am (also?) not a lawyer. I will try to follow my own advice and not make any strong claims I am not qualified to make.

        Now with that said...

        > I don’t believe this is actually true. These instructions are not in the actual license terms and conditions, but rather in a distinct information section after that. In GPL-3.0, “How to Apply These Terms to Your New Programs”; in Apache-2.0, “How to apply the Apache License to your work”. My understanding is that such prescriptions and sections are not normative.

        There are a few lines in Apache 2 itself which are normative and make reference to obligations regarding notices attached directly to source files. Most notably, see section 4.a[1]:

        > You must cause any modified files to carry prominent notices stating that You changed the files; [...]

        Of course, you can argue about what might qualify or not qualify here, e.g. maybe Git metadata is good enough, but then a tarball produced by your Git host of choice would suddenly violate the copyright license obligations.

        > From a pure copyright law perspective: no, you definitely don’t need to put the license in each source file. Copyright is automatic these many years, so you don’t even need a copyright line; and license can be (and practically always is) independent of the code.

        I have no idea what you mean by this. First of all, of course you don't have to put the entire copyright license into each source file; this is solely about copyright notices, which typically point to a more complete LICENSE/NOTICE file. Secondly,

        > license can be (and practically always is) independent of the code.

        I'm not sure what this means. Each file of a project either needs to be in the public domain or has to be covered under some kind of copyright license for anyone (aside from the original authors) to be able to distribute it. At least from a conceptual sense, the code and the copyright license are definitely not independent (This still holds with dual licensing schemes.)

        > There’s even more definitely no need for a dozen lines. If you really want to put anything, one line for a copyright declaration and one line for SPDX-License-Identifier seems fair.

        Some projects do basically just do this, but many of them do both. The SPDX identifier is great because it is machine-readable, and it might help you if you're ever in a situation where it's necessary.

        > As for copyright lines⸺ bah, they’re such a bunch of drivel. The way people use year ranges, or just bump the year, it’s almost all such legal nonsense. As in, “if this stuff actually mattered, you’d probably have lost your copyright protection” nonsense. The fact of the matter is that copyright year stuff wasn’t designed for such easily-edited stuff as software. It was designed for “first edition copyright 1925; renewed 1935; second edition copyright 1945”, that kind of thing.

        While registering copyrights or including copyright notices explicitly is not necessary to have protection under copyright law, my layperson understanding is that it does indeed afford you additional protection under law in some cases. My understanding is that since the Berne Convention, pretty much anywhere on Earth anything you produce that's eligible for copyright protection does implicitly get it. However, if you ever actually wind up in court over copyright issues, the lack of clarity on what licenses go where could possibly create reasonable doubt. It's a lot of risk for what ultimately amounts to an aesthetic concern.

        P.S.: Also, just so it's clear, I am mainly concerned about the copyright notices because they explicitly denote the copyright license, not because they denote the existence of copyright protection. This is especially nice to have when people contribute patches to your projects, so that it can be as explicit as possible that their contributions are under the same license as the original file.

        I will stress again that I am not a legal professional, I don't study law, and at best I have occasionally spoken to people who do. However, I haven't found anyone that would disagree that it is a good idea to provide a full-fat copyright notice when possible. All else the same, it's just good hygiene at a cost of a kilobyte or so per file.

        [1]: https://www.apache.org/licenses/LICENSE-2.0#redistribution

        • kelnos 6 minutes ago

          > I have no idea what you mean by this. First of all, of course you don't have to put the entire copyright license into each source file; this is solely about copyright notices, which typically point to a more complete LICENSE/NOTICE file.

          You don't even need a copyright notice. Under current US law (and that of many other countries that have harmonized their copyright regimes), copyright is automatic, and requires no notice at all. Obviously it is better/safer for you to put a copyright notice on everything you create, but it is not required. You will hold copyright on a work you create and distribute today, even if you don't put your name or the magic "Copyright 2024 $NAME" bit anywhere on it.

          I think there's some confusion here about copyright vs. licensing: those two things are independent, except of course that you need to hold the copyright to be able to set licensing terms. The default, without any provided license, is the most restrictive one: that no one else has any rights toward the covered work.

          > While registering copyrights or including copyright notices explicitly is not necessary to have protection under copyright law, my layperson understanding is that it does indeed afford you additional protection under law in some cases.

          Registering does, yes. In the US, at least, you cannot file suit to protect your copyright unless it is registered. I don't believe the presence or absence of a copyright notice matters one bit; ultimately regardless of the presence or absence of a copyright notice, if it ends up in court, everyone will be providing more documentation as to the provenance of the work in question.

          The dates you list in the copyright notice are perhaps useful for readers, but if a dispute ends up in court, part of the proceedings may involve determining the creation date/year of any bits under dispute, independent of what you put in the copyright notice. But in practice, with software, considering the stupid-long time things remain under copyright these days, the exact date doesn't matter much, as pretty much anything anyone is going to bring a dispute about is going to still be under copyright.

          > However, if you ever actually wind up in court over copyright issues, the lack of clarity on what licenses go where could possibly create reasonable doubt

          "Beyond a reasonable doubt" is a standard applied to criminal cases; civil cases have a lower burden of proof. But yes, you're always better off with things documented than not. A judge/jury will certainly appreciate a plaintiff or defendant who has all their ducks in a row vs. one that is disorganized. But they'll also be cognizant of (and try to determine) what the copyright holder intended to do, even if the documentation doesn't spell it out as clearly as one would like.

          The law is not a computer; you're not likely to get away with copyright infringement just because the copyright holder missed some detail.

          > I am mainly concerned about the copyright notices because they explicitly denote the copyright license

          No they don't. A "copyright notice"[0] is simply something that says "Copyright $YEAR $NAME". A "copyright license" is a list of terms that give people more rights to use and distribute the work than they would get under copyright law.

          You don't need to put licensing information in each source file. If everything in a project is released under the same license, noting the license in a single place is fine. If different files have different licenses, of course you'll need to note things specifically; though, again, if you don't want to put licensing information in each file, you can still note in a single place which files are under which license.

          But as I mentioned above, more documentation is better/safer than less. Personally I put licensing information in every source file of the things I release, even if every file is under the same license as the project as a whole.

          [0] https://en.wikipedia.org/wiki/Copyright_notice#Form_of_notic...

        • tempfile an hour ago

          > > From a pure copyright law perspective: no, you definitely don’t need to put the license in each source file.

          > I have no idea what you mean by this.

          I think they interpret your original comment as "do you need to do this [to obtain copyright protection]? Maybe" rather than "do you need to do this [to comply with the license terms]? Maybe", which is what I think you intended. If I'm right, this explains a lot of the stuff in their comment that doesn't really make sense.

  • dangoor an hour ago

    Ideally, this would follow the format of reuse.software so that there's a machine-readable standard for these: https://reuse.software

    I'm working on tooling that involves automated reading of this info, and it's a lot easier if the tools don't have to do fuzzier matching.
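To illustrate why fixed tags make automated reading easier, here is a rough sketch of extracting the two header tags that the REUSE convention relies on, `SPDX-License-Identifier` and `SPDX-FileCopyrightText`. The helper name `read_spdx_tags`, the regular expression, and the sample header (including the name "Jane Doe") are assumptions made up for illustration; this is not the commenter's tooling or the official REUSE tool.

```python
import re

# Capture the tag name and its value, ignoring a trailing block-comment
# terminator such as "*/" or "-->" if one is present.
TAG = re.compile(
    r"(SPDX-License-Identifier|SPDX-FileCopyrightText):\s*(.+?)\s*(?:\*/|-->)?\s*$"
)


def read_spdx_tags(text: str) -> dict[str, list[str]]:
    """Collect SPDX tag values from file contents (hypothetical helper)."""
    tags: dict[str, list[str]] = {
        "SPDX-License-Identifier": [],
        "SPDX-FileCopyrightText": [],
    }
    for line in text.splitlines():
        match = TAG.search(line)
        if match:
            tags[match.group(1)].append(match.group(2))
    return tags


# Hypothetical sample file header, invented for this example.
sample = (
    "// SPDX-FileCopyrightText: 2024 Jane Doe <jane@example.com>\n"
    "// SPDX-License-Identifier: Apache-2.0\n"
    "\n"
    "package main\n"
)
tags = read_spdx_tags(sample)
print(tags)
```

Because the tag names are fixed strings, an exact match like this is enough; free-form license headers would instead need the fuzzier matching mentioned above.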

  • JamesCoyne 3 hours ago

    Could use a (2021) in the title. No activity since then in the repo

  • IshKebab 2 hours ago

    There's no legal reason to do this.