10 Years of HexaPDF

(gettalong.org)

55 points | by thunderbong 4 days ago

31 comments

  • nona 3 days ago

    I looked at HexaPDF several years ago, and really liked what I saw.

    However, my only major issue with it was the difficulty (back then) of laying out PDFs. Most people I worked with found it a lot easier to lay out in HTML/CSS and convert to PDF from there, so we went that way.

    If I were to look for a way to manipulate existing PDFs I'd definitely use HexaPDF. I'll re-evaluate its (more recently) improved layout capabilities and consider it for my next project. I also think the dual-licensing model is fair.

    Congrats on the 10-year/1.0 milestone.

    • gettalong 3 days ago

      Thanks!

      I agree that laying out PDFs could be made easier by using a declarative mechanism instead of coding. However, I'm still not sure what the best way would be to do that. Using HTML/CSS for this and doing it right would entail implementing something like PrinceXML...

      With the current layout model you have the possibility to implement precise layouts, like the one needed by Swiss QR bills (see https://x.com/_gettalong/status/1748823670368117154), or just define the general layout and let HexaPDF decide the final position (see https://hexapdf.gettalong.org/examples/pdfa.html).

      If you have any ideas on how laying out PDFs could be made simpler, I'm all ears!

    • bn-l 2 days ago

      How did you go from HTML to PDF?

  • poulpy123 3 days ago

    There is a side note that really shocked me:

    > Normally one would need to pay ISO to get a standards document.

    What is the point of a standard that is not publicly available?

    • RadiozRadioz 3 days ago

      https://www.iso.org/footer-links/frequently-asked-questions-...

      "Developing, publishing and maintaining ISO standards incurs a cost, and revenues from selling them helps ISO and its members to cover an important part of these costs. Charging for standards allows us to ensure that they are developed in an impartial environment and therefore meet the needs of all stakeholders for which the standard is relevant. This is essential if standards are to remain effective in the real world."

      • tonyedgecombe 3 days ago

        >Charging for standards allows us to ensure that they are developed in an impartial environment

        I don't think the PDF standard was developed in an impartial environment.

        • aredox 3 days ago

          Here are the members of ISO Technical Committee 171 (ISO/TC 171), Document management applications, Subcommittee SC 2, which is in charge of developing and updating the PDF standards:

          https://www.iso.org/committee/53674.html?view=participation

        • dotancohen 3 days ago

          PDF? I'd agree about that "open" document format that MS Word uses (the standard literally states that some things should be implemented "like MS Office does") but not PDF.

      • kjksf 3 days ago

        You should be able to see when someone is gaslighting you. They charge because they want and can.

        The above justification might apply to some of their standards but certainly not to PDF.

        PDF was developed by Adobe. They make lots of money selling PDF tools and licensing PDF software.

        PDF didn't have to be a standard. It could be a proprietary format.

        Adobe wanted the adoption of PDF, so they made PDF a standard so that they can sell more PDF tools.

        It didn't have to be an ISO standard. Adobe could have released it as an open spec with some sort of permissive license.

        Adobe wanted it to be an ISO standard to piggyback on the "respectability" of the International Organization for Standardization, to buy the impression that it's an open standard.

        But ISO makes money by selling standards, so we ended up in this situation where the spec for an "open" standard that Adobe wants to be adopted costs $100+ to buy.

        I guess Adobe at some point figured out this is stupid and made the spec available for free.

    • andai 3 days ago

      A lot of regulations exist largely to make it more difficult for individuals and small companies to compete.

      I don't know if that's the explanation here (maybe there's good reasons for it?) but it's the first thing that came to mind.

      • 082349872349872 3 days ago

        According to https://www.iso.org/standard/51502.html the standard would cost ~$250, so basically 1 hour of dev time. I can't think of anything competitive I might do as an individual/SME in this space that would come in at under 2 weeks, so compared to actually doing the work, paying for the standard would be in the noise.

        Keep in mind that ISO traditionally dealt more with industrial standards; eg something like https://www.iso.org/standard/40447.html costs ~$180, but I doubt anyone who'd be attempting to compete in the "Apparatus for industrial gamma radiography" world would consider that expense as any more burdensome than buying office consumables.

        • andai 3 days ago

          One hour of dev time? Are you paid half a million a year?

          • 082349872349872 3 days ago

            I have been in the past; currently I consult part time, so that's both (a) fully burdened, and (b) not the rate I'd choose if I still wanted 1'800 hours/year.

            Chop it down by a factor of 2-4, and the question still remains: is $250 significant compared to the amount of work you're going to have to do to be competitive in the PDF-document processing space?

            (for quick hacks, reverse engineering is good enough. you only need the standard once you're processing enough client documents that you need to know all about the long tail of possible but unlikely constructs your code may encounter, after all)

            • andai 3 days ago

              The answer is yes. It's a lot of money for what it is.

              ISO is the international standards organization. A software developer in Indonesia earns $500 per month. Half your salary for one document is indeed a lot.
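
The figures traded in this exchange can be checked with a few lines of arithmetic. The sketch below is only illustrative: the ~$250 price, the 1,800 hours/year, the "factor of 2-4" reduction, and the $500/month salary are taken from the comments above as stated, and the function names are made up for this example.

```python
# Figures quoted in the thread; none of them are measured values.
STANDARD_PRICE_USD = 250          # approximate price of the ISO document
BILLABLE_HOURS_PER_YEAR = 1_800   # hours/year mentioned by the consultant


def implied_annual_income(hourly_rate: float,
                          hours: int = BILLABLE_HOURS_PER_YEAR) -> float:
    """Annual income if every billable hour is paid at hourly_rate."""
    return hourly_rate * hours


def price_share_of_monthly_salary(price: float, monthly_salary: float) -> float:
    """Fraction of one month's salary that the price represents."""
    return price / monthly_salary


# "One hour of dev time" priced at the cost of the standard.
full_rate = implied_annual_income(STANDARD_PRICE_USD)

# The same rate "chopped down by a factor of 2-4".
reduced = [implied_annual_income(STANDARD_PRICE_USD / f) for f in (2, 4)]

# The standard's price relative to a $500/month salary.
share = price_share_of_monthly_salary(STANDARD_PRICE_USD, 500)

print(full_rate, reduced, share)
```

Under these assumptions, $250/hour corresponds to $450,000 a year (the "half a million" in the reply), the reduced rates to $225,000 and $112,500, and the document costs half of a $500 monthly salary.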

              • 082349872349872 3 days ago

                Wow. True; it probably doesn't help that ISO HQ is in Geneva.

                How much is 500g emping for you all? From a swiss source it's $25. That might give us a rough idea of price differentials...

                [This is all moot though, because the PDF standard is available gratis.]

                • gradschoolfail 2 days ago

                  ISO docs seem to me to be one of those club antigoods.. paying for them doesnt necessarily make them less destructive (dependent on timescales tho, i do believe that standards are an unalloyed good if they are patched with a timeconstant low compared to what standards bodies consider optimal)

            • andai 3 days ago

              Also, could I ask for advice to increase my hourly rate? Mine is closer to $25. Granted I am in Europe, so the ceiling here will be quite different, but I'd love to hear how you did that!

              I hear (even for those based in US) that location plays a huge part, but even so, I assume specialization also played a big part?

              • 082349872349872 3 days ago

                Yes, specialisation over decades. Perhaps almost as important: I started in the US, and still work almost exclusively for US clients.

                In a world where I can work almost as easily 12 time zones away as 2, I'm not sure why these huge continental disparities still exist, but they do.

        • poulpy123 2 days ago

          it's true that the price is not relevant for most companies, but it makes it unavailable for most individuals interested in the topic

  • andai 3 days ago

    >commercially available library

    Are there any examples of this? How do you monetize a DLL file? One time purchase and later upgrades?

    It seems like you'd have to put it behind an API and charge for usage, though I don't have a good overview of what the other options are.

    • alemanek 3 days ago

      This used to be much more common. I worked on a product in 2004 that licensed a library around providing a nice SDK for different payment processors. Effectively making it super simple to switch or route payments to different processing gateways.

      We paid on a per domain basis and also paid a yearly maintenance fee to get updates. Nothing stopped us from just using the library in ways that violated the license but we didn’t. The threat of lawsuit is an effective deterrent if the ethical concerns aren’t.

      In my experience for B2B the people that are going to steal from you were never going to pay anyways.

      • tonyedgecombe 3 days ago

        >In my experience for B2B the people that are going to steal from you were never going to pay anyways.

        I used to think that but eventually the evidence pointing to the opposite was overwhelming. It wasn't just small companies either, some of them you will have heard of.

    • gettalong 3 days ago

      The library is dual-licensed as AGPL plus a commercial license. So everything is in the open and can be tested and tried out under the AGPL. Once the library is used in a commercial context, you nearly always need to buy the commercial licenses to stay compliant. This is how it generally works.

      What the commercial license does is a different thing. You could charge once OR once and for every upgrade to an (arbitrarily defined) new major version OR each year via a subscription OR ... It is really up to you and how you want to handle this.

      • pabs3 3 days ago

        The AGPL is pretty easy to comply with in a commercial context even if you are using it as part of a SaaSS product.

        Just either use the code unmodified, or release your modifications to customers, or to the public in general.

        Do the businesses buying commercial licenses just not understand the AGPL license? or are their development processes not rigorous enough to ensure compliance? The AGPL includes some easy ways to be forgiven for accidental violations, so that should not be a problem in almost all cases. So only deliberate non-compliance should be an issue.

        • gettalong 3 days ago

          I'm not a lawyer but I think you are mistaken in this regard. One indication for this is that otherwise some major companies would have problems.

          For example, the GPL FAQ has the following passage in the item titled "What is the difference between an 'aggregate' and other kinds of 'modified versions'?" (https://www.gnu.org/licenses/gpl-faq.en.html#MereAggregation):

          > If the modules are included in the same executable file, they are definitely combined in one program. If modules are designed to run linked together in a shared address space, that almost surely means combining them into one program.

          A combined work needs to be distributed under the AGPL, an aggregated work does not. Since Ruby is interpreted, the HexaPDF code loaded by the application would run in the same address space, and thus the result would be a combined work.

          The following two links are also relevant: https://opensource.stackexchange.com/questions/5003/agplv3-s... and https://opensource.stackexchange.com/questions/5010/can-i-us...

          • pabs3 2 days ago

            Just add an AGPL command-line interface, or a daemon wrapping the library and you have a process boundary. That doesn't necessarily create a derivative work boundary, but it probably would if it is generic enough to be useful to everyone.

            • gettalong 2 days ago

              Yes, creating a binary and calling that would circumvent the AGPL. But then everything will be more complex and slower.

              Also, doing this extra work and developing the binary is probably more expensive than just buying a commercial license.
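
The "process boundary" discussed in this subthread can be sketched mechanically: the application never loads the library into its own address space and instead launches a separate program and reads its output. The sketch below is a minimal illustration, not HexaPDF's API and not a statement about license compliance (that is precisely what the two commenters disagree on). The helper name `run_tool` is hypothetical, and a Python one-liner stands in for a real command-line PDF tool.

```python
import subprocess
import sys
from typing import Sequence


def run_tool(command: Sequence[str], timeout: float = 30.0) -> str:
    """Run an external program as a child process and return its stdout.

    The caller and the tool share no address space; they communicate only
    through command-line arguments, exit status, and standard streams.
    """
    result = subprocess.run(
        list(command),
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        timeout=timeout,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


# Stand-in for a real PDF command-line tool (assumption for this demo).
demo_output = run_tool([sys.executable, "-c", "print('processed')"])
print(demo_output.strip())
```

Every call spawns a new process and passes data through arguments, files, or pipes, which is the added complexity and slowdown mentioned in the reply above.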

  • aidog 3 days ago

    Note: This is by the author of the popular kramdown markdown library.

  • 4silvertooth 4 days ago

    >Another thing that was imported to me was - and still is

    Typo in the article: the word "imported" should be "important".

  • pabs3 3 days ago

    What do the terms of your commercial license allow customers to do and what do they prevent them from doing?