Dear sir, you have built a compiler (2022)

(rachit.pl)

135 points | by azhenley 3 days ago

49 comments

  • pjungwir an hour ago

    I've seen this a lot when someone wants to add "workflow automation" or "scripting" to their app. The most success I've had is embedding either Lua or JavaScript (preferably Lua) with objects/functions from the business domain available to the user's script. This is what games do too. I think it's a great way to dodge most of the work. For free you can support flow control, arbitrary boolean expressions, math, etc.

  • quantadev 3 hours ago

    In software development it's pretty important to know when to build "on top" of something else, and when to start from scratch.

    Lots of developers will find it much more interesting, challenging, rewarding and just plain fun to develop something from scratch, even when there are better things that already exist.

    They'll cleverly manipulate and convince the boss, against the better discretion of their elder developers, that they can do it, and if they're one of the better developers, the boss won't want to risk losing them so they'll agree to the escapade.

    Then said escapade turns into a shambles, as predicted by the elder devs, and the developer who created the mess simply quits and moves to some other job, in search of more fun and greener pastures. Any developer with decades of experience has probably seen this same pattern multiple times.

    • wheybags 2 hours ago

      This is a sentiment that I've seen expressed in comment sections many times. I've been programming professionally now for 10 years, and it just doesn't resonate with my experience. Problems with build systems for external dependencies, package managers, and underfeatured / overcomplicated / buggy third party dependencies have been by far the worst issue in my career, compared to problems with homebrewed systems.

      I'm not saying you're wrong, I don't doubt that many people have the opposite experience. It just makes me feel a bit alien when I read comments like this.

      • marcosdumay 2 hours ago

        I've been there, on both sides, with homebrew ideas pushed from up and down, some that worked nicely, and some that were complete disasters...

        And I agree with you. The problems with third party dependencies are way worse than any in-house complete disaster.

        But that happens almost certainly because everybody is severely biased into adding dependencies. Make people biased into NIH again, and the homebrew systems will become the largest problems again.

      • quantadev an hour ago

        This kind of thing admittedly isn't as pervasive in the last decade as it was in the two before, so if you've been a dev only since 2014 you may not have seen it. The old people like me will get it tho.

    • rectang 36 minutes ago

      > even when there are better things that already exist

      That's a "big if".

      Lots of times what's there is a nightmarish tangle of technical debt left by previous greenfield devs. The dev who gets to maintain and evolve this dreck is the sucker, scapegoated for ever slower development.

      Canonical example: on-call AWS engineers working hellish overtime to close tickets on one of AWS's many terribad fragile codebases.

    • PittleyDunkin an hour ago

      > Lots of developers will find it much more interesting, challenging, rewarding and just plain fun to develop something from scratch, even when there are better things that already exist.

      This is true; but after enough years in the industry you learn to correlate success with laziness. This is well-discussed and arguably obvious but on an emotional level it takes a long time to fully sink in. We were all once developers with outsized ambitions and awareness we can flee to greener pastures.

      • quantadev 9 minutes ago

        I've said in the past "The best developers are the laziest ones". We don't want to do a bit of unnecessary work at all.

        But at the same time, since I spend almost every waking hour of my spare time coding, the word lazy still isn't quite accurate in every way either.

    • chii 2 hours ago

      > against the better discretion of their elder developers

      why are the juniors calling the shots over the elder developers?

      • quantadev an hour ago

        Lots of times it's just ordinary office politics, or the boss likes one person more than another, or isn't "technical" enough to know when he's being manipulated. Often managers aren't developers themselves, so they don't know which developer is giving them the best advice when two developers disagree.

        • taeric an hour ago

          And to be fair, oftentimes it is against the senior's judgement, but if it's off the critical path it can be a decent gamble. The hubris of junior engineers accomplishes a lot.

          • quantadev an hour ago

            Yes, it's true great developers can 'reinvent' things and do a great job of it, but the problem is that every line of code in a project is an efficiency drag forever moving forward. It always has to be maintained, updated, and managed by someone.

            Developers should be measured by how many lines of great code they can delete, not how many lines of great code they create (<-- but don't take this literally of course, it's just making a point)

            • taeric an hour ago

              Right. My "to be fair" is largely similar to "devil's advocate."

              The caveat of, "off critical path" is a heavy lift, too.

              My view was to give projects a form of risk budget. If possible, do things same way as last time. Any deviation is a risk. Can have rewards, sure. But if there was a known way to do it already, be budgeted to pivot back.

  • iamthepieman 5 hours ago

    I get the solution for this and I know what all the terms mean. But I don't understand the problem. Whether it's facetious or hyperbole or whatever, I just don't get who or what circumstances this is addressing.

    This is written like a Jeopardy answer. I just don't know what the question is.

    Can anyone enlighten me?

    • throwaheyy 4 hours ago
      • anyonecancode 2 hours ago

        Is it just me, or would "single page applications" fit easily in the "examples" section there? Sounds kind of trollish when I write it out, but honestly, it fits pretty well, right? We threw away all the things the browser gives you for free and re-implemented the back button, history, etc. in JavaScript. (And it's somewhat fractal, as within the frameworks we use to do this you'll frequently see people re-implementing things the framework already does.)

        • btown an hour ago

          An interesting perspective here is this amazing 2015 blog post from the Figma co-founder: https://www.figma.com/blog/building-a-professional-design-to...

          > "Our editor is written in C++ and cross-compiled to JavaScript using the emscripten cross-compiler... Pulling this off was really hard; we've basically ended up building a browser inside a browser."

          Now sure, Figma's an exception, but it's an illustrative one. For most single-page apps, it's an interesting question. Is the web browser a monolithic platform where if you reimplement any of its layout engines etc. you're reinventing the wheel? Or, is it a set of libraries that can be chosen from at will, that of course happen to all work together to provide sane defaults, but by no means are required or expected to all be put into use simultaneously?

          I tend to think of the web platform as the latter. Just because there's something in the "standard library" so-to-speak doesn't mean I'm forced to use it - the real question is whether it's something stable that won't force me and the team to yak-shave to maintain it. Mature JS/TS libraries are no worse than the browser in this regard!

        • crdrost an hour ago

          I'd mostly agree. SPAs can do things hard to do with URLs and form inputs; for instance, chess would be harder to program in the browser without JS (though I remember an opening explorer which worked that way). Or if you think about popping up a modal or a toast. But a lot of the functionality is duplicated.

          This gets even more extreme now that you can have wasm on a canvas... The language that you're compiling from doesn't understand the semantics of a back button either!

        • ameliaquining an hour ago

          It's true that some of the Web platform's downsides, rooted in its split identity between being a document library and being an operating system, are kind of similar to this antipattern, if you squint a bit, although they tend not to be as bad because the outer platform is much more robustly engineered than the average enterprise app.

          The key difference is that, in the Web platform's case, there's not actually a better alternative on offer. Even with these awkwardnesses, it's still a better app delivery platform than desktop or mobile OSes, because it's dramatically less fragmented, has a more convenient "installation" story (https://xkcd.com/1367/), and has a better security model (at least compared to desktop OSes). So people need to write rich web apps with arbitrary behavior in it, which requires it to be arbitrarily customizable.

          Contrast an enterprise app, where the lesson of the "inner-platform effect" idea is that code changes to the outer platform aren't as costly as you think, compared to unmaintainable configuration that interacts in complex ways with the platform primitives. So it's best to allow only customization simple enough to not pose maintainability challenges, and eat the cost of an outer-platform code change whenever you need anything more complicated. But Web developers don't have the option of getting browsers to add new code every time they want to add a complex new feature to their app, so browsers need to support a rich enough set of primitives that those features are already possible.

          The other way to resolve the tension would be to get rid of the document-library features and instead double down on being an operating system, perhaps based on WebAssembly and <canvas> instead of HTML+CSS+JavaScript, like Flutter for Web uses. But of course people are using the document library, and in some cases it's the easiest way to do something, even at the cost of a little bit of redundancy at intermediate levels of customization.

          What SPA critics typically want, of course, is for most sites to be satisfied with less feature-richness so as to fit more easily into the document-library model. But the platform has to support everyone's use cases, not just those of people who like HN's minimalist style. (I can't find it now, but there was a great comment on HN awhile ago that said something like: "A lot of HN users basically wish the internet was like how it was in the 90s, except with broadband. But in this respect, we're unusual; most users like features and slick UIs.")

      • iamthepieman 4 hours ago

        That makes total sense. I have been tempted to do this in the past. Fortunately, time and resource constraints have kept it to mostly sane, maintainable, performant configurations until I learned that I would never create the system I wanted and that it was probably better that I didn't anyway. I guess I've been lucky and didn't even know it.

    • franciscop 3 hours ago

      I've come very close to being the "sir" mentioned in the article (for hobby stuff only though); luckily I always caught myself and was able to stop before it was too late. I decided at some point that I do not want to build a compiler/transpiler, and if I do some day, I want it to be a conscious decision and not an accident like in the article.

      It starts innocently, e.g. doing some template files and replacing some simple values, then you start to have to do more replacements and more "smart" parsing and then at some point it's too late, as the article suggests.

      TBF, I did put together a transpiler from PHP to JS, but I didn't build it, just found the different pieces that luckily fit together and hacked around it enough that it could run in the browser.

    • MathMonkeyMan 4 hours ago

      When writing programs that take other programs as inputs, and/or produce other programs as outputs, it's tempting to treat the program as only slightly more structured than its textual representation.

      The problem is that unless your use case is very limited and is guaranteed to stay that way, supporting more and more language constructs will quickly turn your code into a mess.

      Compiler design as we learn it (lex/parse, syntax tree, semantic checks, transforms, lowering to codegen) is _the_ solution to the problem of dealing with computer programs as inputs and outputs. Trying to do something less is like solving a dynamic programming problem without knowing dynamic programming: it will only work for a restricted set of inputs.
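
      A minimal sketch of those stages in Python, for a toy arithmetic language only. The names (`lex`, `parse`, `lower`, `run`, `Num`, `BinOp`) are made up for illustration, and the semantic-check stage is skipped because this language has nothing to check:

```python
import operator
import re
from dataclasses import dataclass

@dataclass
class Num:            # syntax tree leaf: a numeric literal
    value: float

@dataclass
class BinOp:          # syntax tree node: left <op> right
    op: str
    left: object
    right: object

TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")

def lex(src):
    """Lexing: turn source text into a flat list of token strings."""
    src, tokens, pos = src.rstrip(), [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}: {src[pos]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(tokens):
    """Parsing: recursive descent with the usual precedence of * / over + -."""
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else None
    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok
    def atom():
        tok = take()
        if tok == "(":
            node = expr()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or tok in "+-*/)":
            raise SyntaxError(f"unexpected token: {tok!r}")
        return Num(float(tok))
    def term():
        node = atom()
        while peek() in ("*", "/"):
            node = BinOp(take(), node, atom())
        return node
    def expr():
        node = term()
        while peek() in ("+", "-"):
            node = BinOp(take(), node, term())
        return node
    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"trailing token: {peek()!r}")
    return tree

def lower(node):
    """Lowering: flatten the tree into instructions for a tiny stack machine."""
    if isinstance(node, Num):
        return [("push", node.value)]
    return lower(node.left) + lower(node.right) + [("op", node.op)]

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def run(code):
    """Execute the lowered instructions and return the final value."""
    stack = []
    for kind, arg in code:
        if kind == "push":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

result = run(lower(parse(lex("1 + 2 * (3 - 1)"))))
```

      Each stage only sees the output of the previous one, so adding a construct means touching the tree and the stages that care about it rather than patching string handling everywhere.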

  • DHaldane 4 hours ago

    It's ok to build a compiler sometimes -- it's just very important to make that choice intentionally

    • mattgreenrocks 4 hours ago

      Indeed, the safer thing is to actually build a few toy compilers on the side so you can get a sense for what they are good for, and what level of effort is required to build and maintain it.

      Keeping them locked up in the "scary CS" closet only ends up stunting your growth.

      • SAI_Peregrinus 22 minutes ago

        I like to write toy compilers or interpreters as an exercise when learning a new language. Usually for a Forth or Lisp or one of the Turing tarpit languages. It requires some of the most common bits of programming: I/O, lexing, parsing (both of source and of arguments to the compiler), file handling, and some common algorithms & data structures (can't have an AST without a tree).

    • lmm 3 hours ago

      I'd say the opposite. Building a compiler incrementally, driven by clear needs at each step, as described in the article, tends to work out better than trying to build a compiler from day 1.

      • pizza 2 hours ago

        Is there a meta-library someone somewhere has already written for when I just want to write 20% of a compiler, and it more or less takes care of the other 80%, the common compiler-building-related things I’m likely to need to do?

  • vishnugupta 19 minutes ago

    There’s an insider joke at Uber that if you start out building a configuration manager you’ll end up with a full-blown version control system.

  • Pedro_Ribeiro 3 hours ago

    Having recently built 90% of a compiler by mistake, I felt like this post was written specifically about me. Hilarious writing, congrats to the author.

    • crdrost an hour ago

      There is also the opposite: Ed Kmett is on record as saying that he had a million ideas for his personal programming language, and all of the cool stuff it was going to do, but then he ran into Haskell and said “Let’s be real, whatever amazing language I make isn't gonna be half as put together as this and I have this one already in front of me...”

  • pvg 4 hours ago
  • taeric an hour ago

    I am not clear on why reaching for an existing compiler's AST would ever be top of list?

    Don't get me wrong. I think many language design points should be used more. But starting from scratch makes a ton of sense. Skip the parsing stage and build up supported AST style constructs of your own.

    Done simply, this is basically the command pattern. Keep execution separate from declaration and you should be fine?

    Sure, you may want a parser for a dedicated serialization language some day. Hard to think you need to start there?

    But starting with the full AST of an existing language feels like a terrible idea. In any world.
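
    A rough sketch of the "command pattern, no parser" approach in Python: commands are plain data declared up front, and a separate `execute` function interprets them. The `SetField` and `When` commands and the record fields are hypothetical, invented only to show the shape:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SetField:
    """Declaration only: says what to set, does nothing by itself."""
    name: str
    value: object

@dataclass(frozen=True)
class When:
    """Declaration only: run `then` if the field currently equals `equals`."""
    field: str
    equals: object
    then: tuple

def execute(commands, record):
    """Execution lives here, apart from the declarations above.

    Returns a new dict; the input record is not modified.
    """
    result = dict(record)
    for cmd in commands:
        if isinstance(cmd, SetField):
            result[cmd.name] = cmd.value
        elif isinstance(cmd, When):
            if result.get(cmd.field) == cmd.equals:
                result = execute(cmd.then, result)
        else:
            raise TypeError(f"unknown command: {cmd!r}")
    return result

# The "program" is built directly as data structures, with no parsing stage.
workflow = (
    SetField("status", "new"),
    When("country", "DE", then=(SetField("vat", 0.19),)),
)

german = execute(workflow, {"country": "DE"})
american = execute(workflow, {"country": "US"})
```

    If a textual format is ever needed, a parser only has to produce these same objects, and `execute` stays as it is.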

  • neilv 3 hours ago

    I had this kind of risk in mind when I wrote a server-side "HTML template" feature for Racket.

    The template language intentionally only handles static chunks of HTML, escaping of values, and a few safety guards.

    Everything else (including the usual template language behavior like iterating over a collection/stream, such as from a database query result) is done with arbitrary normal Racket language, which the template feature's implementation doesn't have to know about nor handle specially.

    https://www.neilvandyke.org/racket/html-template/

    More recently (for employability reasons, or under-resourced startup pragmatics), doing Python with Flask, JavaScript with SvelteKit, and Swift with SwiftUI, I still miss the clean simplicity and available power that I had with Scheme/Racket.
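
    To illustrate the split described above (the template layer handles only static HTML and escaping, and iteration is ordinary host-language code), here is a loose analogue in Python. This is not the Racket library's API; `Safe` and `html` are hypothetical names, and the sketch assumes templates contain no literal braces:

```python
from html import escape

class Safe(str):
    """Marks a string as HTML that has already been escaped."""

def html(template, **values):
    """Fill a static HTML chunk, escaping every value not marked Safe."""
    escaped = {
        key: value if isinstance(value, Safe) else escape(str(value))
        for key, value in values.items()
    }
    return Safe(template.format(**escaped))

# Iteration is plain Python; the template function knows nothing about loops.
names = ["Ada", "<b>Bob</b>"]
rows = Safe("".join(html("<li>{name}</li>", name=n) for n in names))
page = html("<ul>{rows}</ul>", rows=rows)
```

    The template function never grows loops or conditionals of its own, which is the part that tends to turn into an accidental language.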

  • swyx 3 hours ago

    I wrote something similar recently: Oops! you built a database https://news.ycombinator.com/item?id=34941650

    direct link https://dx.tips/oops-database

  • tn1 2 hours ago

    Many older .NET applications saved programmers from this by providing "C# scripts". The framework includes the compiler and then it's trivial to use the compiled artifact. You can still do it by including the Roslyn libraries. I don't see it as much anymore, or it's some half-baked Python or Lua interface.

    • Nuzzerino an hour ago

      The new Roslyn incremental generator API is pretty good these days but not well documented yet. I’ve been using it with json-schema to save a lot of boilerplate and provide a more intuitive declarative framework in a large side project.

  • dgfitz 3 hours ago

    Man, the Yocto framework could do with a read-over of this.

    • kazinator an hour ago

      Right? It generates scripts which are then executed to do builds.

      Debugging broken Yocto builds can be a nightmare.

      You end up stepping into the build directory environment and manually running the generated code. You try some fixes in it, and then guess on which fragment in what bitbake file to backport that fix into.

    • Joel_Mckay 3 hours ago

      "He who must not be named" has a curse upon that name.

      Binary optimization through stripped static-embedding and packing is a level of evil even I find upsetting. lol... =3

  • layer8 4 hours ago

    But can it send email?

  • teaearlgraycold 3 hours ago

    I know of someone that did this for a bespoke form definition language to drive onboarding. Tens of thousands of lines, months of delays, and a bus factor of 1 later it was all eventually ripped out and replaced with plain old page templates. When your 10 question onboarding flow has a back-end class named “PredicateEvaluator” something is wrong.

  • akshayshah 4 hours ago

    I enjoyed the article, but the unintentional Easter egg at the end left me in stitches: the link to “If Architects had to work like Programmers” just 404s, which feels spot on.

    • IncRnd 3 hours ago

      The text to "If Architects had to work like Programmers" has been around for 20-30 years.

      You can find the text at many websites. Here is one from 1997: https://www.inf.ed.ac.uk/teaching/courses/seoc2/1997_1998/ar... It's a fun read.

      • gfody 2 hours ago

        most of this holds up really well except for

        > Do not worry at this time about acquiring the resources to build the house itself. Your first priority is to develop detailed plans and specifications. Once I approve these plans, I would expect the house to be under roof within 48 hours.

        ..nowadays would be more like:

        > The MVP should be move-in-ready ASAP at which point we shall move into the house and live there while you complete the remaining requirements.

    • fragmede 3 hours ago

      It's not a subtle joke, the link was captured on archive.is: https://archive.is/r4l6C

  • fragmede 3 hours ago

    So at what point does Kubernetes become justified?

    • yuliyp 2 hours ago

      What at all does that have to do with this post about not-compilers?