Comparison of configuration file languages (2016)

(gist.github.com)

21 points | by aleyan 4 days ago

17 comments

  • itohihiyt 18 hours ago

    I love me an INI. By far, IMO, the best human-readable config syntax. Sure, it's got some gotchas: like all old formats (CSV), there's no spec. But if I have the choice, I'd go INI. It's simple and leaves the choices up to the program that's reading the file, because everything is a string.
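
    A minimal sketch of the "everything is a string" point, using Python's standard `configparser`. The section and key names are made up for illustration; the file itself carries no types, and the reading program picks them.

```python
import configparser

# Hypothetical INI content; nothing here is typed.
INI_TEXT = """
[server]
port = 8080
debug = yes
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
section = parser["server"]

raw_port = section["port"]           # always a plain string
port = section.getint("port")        # the program decides it is an int
debug = section.getboolean("debug")  # the program decides "yes" means True
```

    The same file could just as well be read with `port` kept as a string; that choice lives in the program, not in the config.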

    To me there's no difference between this article's argument that with INI people would have to remember the idiosyncrasies of the Python implementation's comment handling, and people having to know and learn the correct syntax of TOML. I'd say remembering when and where you can comment is easier, too.
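
    For reference, a small sketch of one such comment idiosyncrasy in Python's `configparser`: full-line comments are skipped, but trailing comments are kept as part of the value unless `inline_comment_prefixes` is set. The section and key names are invented for the example.

```python
import configparser

INI_TEXT = """
[paths]
; a full-line comment is ignored
cache = /tmp/cache ; this trailing text is kept by default
"""

# Default behaviour: no inline comment prefixes are recognised.
default_parser = configparser.ConfigParser()
default_parser.read_string(INI_TEXT)
default_value = default_parser["paths"]["cache"]

# Opting in to ';' as an inline comment marker.
lenient_parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
lenient_parser.read_string(INI_TEXT)
lenient_value = lenient_parser["paths"]["cache"]
```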

    Either way it's personal preference. I do occasionally like to reread this though (because I'm boring): https://github.com/madmurphy/libconfini/wiki/An-INI-critique...

    • codeflo 16 hours ago

      You’d probably be surprised how often people have to create config files programmatically. With properly specified formats, you can serialize the configuration, which means nesting and string escapes are taken care of automatically. With half-baked custom formats, you have to resort to string replacement and praying.

      Having said that, there’s an official RFC for CSV, and INI files are de-facto specified by Microsoft’s implementation.
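
      A rough illustration of the difference between serializing and string replacement, using JSON from the Python standard library. The function names and the sample value are hypothetical.

```python
import json

def render_naive(name):
    # String templating: breaks as soon as the value contains a quote.
    return '{"greeting": "' + name + '"}'

def render_serialized(name):
    # A real serializer handles the escaping for us.
    return json.dumps({"greeting": name})

def is_valid_json(text):
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

tricky = 'He said "hi"'
naive_ok = is_valid_json(render_naive(tricky))
serialized_ok = is_valid_json(render_serialized(tricky))
```

      With the embedded quotes, only the serialized version parses back to the original value.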

      • itohihiyt 15 hours ago

        Yeah, half baked anything is going to screw you over. INI files are certainly programmatically serialisable, otherwise they wouldn't exist. It does move the datatypes to the program rather than being encoded in the config though, which adds to program overhead. Horses for courses though, with greater flexibility comes greater potential to f*uk it up.

    • aeurielesn 12 hours ago

      TOML is a monstrosity and I'm terrified Python is backing it up.

      I'd love to see more HCL.

  • OskarS 18 hours ago

    More and more the last couple of years, I’ve started to realize that “configuration languages” are just a bad idea. Please, for the love of all that is holy, just let us use a real programming language. For simple declarative configuration, it’s no more difficult to use Python (or Lua or whatever) than YAML, and for complex configuration (looking at you, YAML files for GitLab CI), it’s an absolute godsend to be able to use real if-statements and for-loops. If security/runtime is a concern, you can sandbox it or use something like Starlark.

    JSON is a fine over-the-wire format, but can’t we leave YAML in the dustbin and just do it “properly” from now on? Please?

    • talideon 16 hours ago

      Yes, Starlark is a relatively sane option, but the benefit of a language like that isn't simply that it's sandboxed, but also that it's not Turing complete because it only allows for primitive recursion and thus is guaranteed to terminate.

      If you want a sensible compromise, it's to use something like Cue or Dhall to generate JSON configuration that's actually consumed by the software. There's also Jsonnet, but I've never been a fan.

    • 9dev 17 hours ago

      Hard pass. There is a lot of software I am forced to run without wanting to involve myself with it. If your tool is so complex to warrant a need for a full programming language in its config files, to me that smells of another issue, but generally I just want a set of knobs to tweak behaviour.

      I’m so over having to learn Go templates for a metrics exporter, or the need to write booleans as True or False for python apps, or some obscure Erlang syntax for RabbitMQ, or the batshit crazy APT config file syntax, or Apache's weird XML files… Just settle for a simple format and offer to load custom modules for those that need them.

      • OskarS 15 hours ago

        > but generally I just want a set of knobs to tweak behaviour.

        But you can do that so easily as well! Like, take this YAML (substitute equivalent JSON or TOML, specific configuration language is irrelevant):

            options: 
              someOption: 14
              nested:
                foo: "some string"
                bar: true
        
        I don't see how that's so much easier than this version in Lua:

            options = {
                someOption = 14,
                nested = {
                    foo = "some string",
                    bar = true
                }
            }
        
        And when you do need to get a bit more complex, you have the tools to manage that complexity. Like, if you have a bunch of very similar things (say, job definitions in CI/CD), you want a way to not repeat yourself so you only have to edit that once. YAML has a tool for that, "anchors" [1], but it's obscure and a bit hard to use. In a real programming language, it's trivial: just stick the repetitive parts in a variable and use that. Or define a simple function in case most of the stuff is repeated, but some is not, so you can do

            foo = generateRepetativeConfig('specificOption')
        
        Or maybe you can just do that with a simple for-loop. And if you want to do something like a string substitution, you have a whole dang language with the tools you need to do it.
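
        A sketch of that idea in Python rather than Lua. The job fields, image name, and `make_job` helper are all invented for illustration and do not correspond to any particular CI system.

```python
import json

def make_job(name, extra_script=None):
    """Build one job; the shared parts live in exactly one place."""
    script = ["make deps", f"make {name}"]
    if extra_script:
        script.append(extra_script)
    return {"image": "builder:latest", "stage": name, "script": script}

# The repetitive jobs come from a loop; the odd one out gets an extra step.
jobs = {f"{name}-job": make_job(name) for name in ["build", "test", "lint"]}
jobs["deploy-job"] = make_job("deploy", extra_script="make notify")

# The consuming tool could still be handed plain data, e.g. as JSON.
config_text = json.dumps(jobs, indent=2)
```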

        Dhall and languages like that are a big improvement, but really, I just want a normal programming language. We have decades of experience now with managing complexity in computer systems, and programming languages have evolved robust systems for handling that.

        I agree that sandboxing and security can be a real concern for some of this stuff, in which case Starlark is perfect, though you can sandbox languages like Lua or various Lisps as well.

        [1]: https://support.atlassian.com/bitbucket-cloud/docs/yaml-anch...

    • aziis98 18 hours ago

      I hope one day people will realize that configuration languages can just be implemented by adding a "--run-total-sandboxed" flag to their favorite language, i.e. a flag that disables while loops and recursion (making the language non-Turing-complete) and runs the program in a sandbox without direct external access.

      This would be far better than all random configuration languages out there and far more extensible.
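
      No such flag exists in Python, but the static half of the idea can be roughed out with the standard `ast` module: parse the config source and refuse certain constructs. The `FORBIDDEN` list and `check_restricted` name are assumptions for this sketch. It crudely rules out user-defined recursion by banning function definitions, it does not prove termination (a `for` loop over an endless iterator would still pass), and it provides no sandboxing at all.

```python
import ast

# Node types this sketch refuses; an illustrative list, not a complete policy.
FORBIDDEN = (
    ast.While, ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda,
    ast.Import, ast.ImportFrom,
)

def check_restricted(source: str) -> list[str]:
    """Return a list of violations found in the config source."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, FORBIDDEN):
            problems.append(
                f"line {node.lineno}: {type(node).__name__} is not allowed"
            )
    return problems

ok_source = "ports = [8000 + i for i in range(3)]\n"
bad_source = "while True:\n    pass\n"
```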

    • d0mine 15 hours ago

      Programming language as config is bad because it makes it easy to do the wrong thing (too much power). Friction can be good.

      Ideally, the less power, the better:

      - .env :: simple flat untyped key/value pairs. In practice, something like Pydantic can introduce hierarchy and type validation for the config
      - json :: simple, few data types. Can be read/written by humans
      - toml :: easier to edit by humans
      - json5 :: not in Python stdlib
      - yaml :: may require discipline, to avoid turing tarpit
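
      A small sketch of the first item: flat, untyped key/value pairs lifted into a typed, nested structure. It uses stdlib dataclasses as a stand-in for Pydantic, and a deliberately simplified `.env` parser (no quoting or `export` handling). All key and class names are hypothetical.

```python
from dataclasses import dataclass

ENV_TEXT = """
DB_HOST=localhost
DB_PORT=5432
DEBUG=true
"""

@dataclass
class Database:
    host: str
    port: int

@dataclass
class Settings:
    db: Database
    debug: bool

def parse_env(text):
    """Parse KEY=VALUE lines into a flat dict of strings."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs

def load_settings(text):
    """Add hierarchy and types on top of the flat string pairs."""
    raw = parse_env(text)
    return Settings(
        db=Database(host=raw["DB_HOST"], port=int(raw["DB_PORT"])),
        debug=raw["DEBUG"].lower() in {"1", "true", "yes"},
    )

settings = load_settings(ENV_TEXT)
```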

      As an intermediate between pure config and programming languages, jsonnet language can be used to generate json.

      Turing complete: Ruby, Python, Lua, Lisp, etc. Avoid these for simple configs; they can be used as extension languages/DSLs.

    • Findecanor 17 hours ago

      That to me feels too dangerous. You'd need to sandbox every piece of config file you use.

      I've been sketching on a shell/scripting language/system for a while and I've been thinking of it being in-effect three sub-languages, one nested within the other. At the lowest level: pure data, middle level: pure functional expressions, top level: imperative commands. I'd think that most use-cases for a configuration file would be satisfied by the lowest level, and some by the middle, but absolutely none would need the highest. Then a config file would be "sandboxed" by the language syntax, with very little risk of it breaking out.
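
      As a loose analogy for the lowest, "pure data" level (not the commenter's own language, which isn't shown here), Python's `ast.literal_eval` accepts only literals and rejects anything that would execute. The helper names are made up for this sketch.

```python
import ast

def load_pure_data(source: str):
    """Evaluate a Python literal only; executable expressions are rejected."""
    return ast.literal_eval(source)

def is_pure_data(source: str) -> bool:
    try:
        ast.literal_eval(source)
    except (ValueError, SyntaxError):
        return False
    return True

config = load_pure_data('{"retries": 3, "hosts": ["a", "b"], "verbose": False}')
```

      A function call such as `__import__("os").getcwd()` is refused at this level, because the restriction sits in what the loader accepts rather than in a runtime sandbox.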

    • derriz 17 hours ago

      I disagree - I do not want to involve execution to interpret the contents of a config file. I at least want to be able to trivially determine if two config components are equal by simple visual inspection.

    • otabdeveloper4 15 hours ago

      It doesn't take very long for your configuration-in-a-real-language DSL to need a configuration language of its own.

      What now?

  • oezi 18 hours ago

    If JSON allowed comments (trailing commas would be nice too), I think it would be a strong contender for a config format as well.
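
    To make the limitation concrete, here is how a strict parser (Python's standard `json` module) treats both cases. The sample documents are invented.

```python
import json

def parses(text):
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

plain = '{"port": 8080}'
with_comment = '{"port": 8080 // default port\n}'
with_trailing_comma = '{"port": 8080,}'

results = {
    "plain": parses(plain),
    "comment": parses(with_comment),
    "trailing_comma": parses(with_trailing_comma),
}
```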

    • anon7000 18 hours ago

      Many configuration files for tools in the JS ecosystem actually do allow both of those through json5 (https://json5.org/)

    • massifist 16 hours ago

      Also, allowing unquoted names.
