25 comments

  • rtpg 2 hours ago

    The lack of naming seems to indicate a fundamental misunderstanding of how LLM coding agents are successful, and just makes me doubt anything about this project being useful and workable.

    • svachalek an hour ago

      Yeah it seems based on 2023 research which is ancient, back when we didn't have coding agents at all, and on some 1980s sci fi concepts of "how machines think" (beedeeboop) rather than the all too human coding agents we have.

      If I had to design one of these, I'd go for:

      1. Token minimization (which may be circular, I'm sure tokens are selected for these models at least in part based on syntax of popular languages)

      2. As many compile time checks as possible (good for humans, even better for machines with limited context)

      3. Maximum locality. That is, a feature can largely be written in one file, rather than bits and pieces all over the codebase. Because of how context and attention work. This is the one I don't see much in commercially popular languages. It's more of a declarative thing, "configuration driven development".

      • Octoth0rpe 37 minutes ago

        > That is, a feature can largely be written in one file, rather than bits and pieces all over the codebase.

        This seems to be at odds with the goal of token minimization. Lots of small files that are narrowly scoped means less has to be loaded into context when making a change, right?

        Throwing out another idea: I wonder if we could see some kind of equivalent of c header files for more modern languages so that an llm just has to read the equivalent of a .h file to start using a library.

        • lesam 5 minutes ago

          I think AST aware code reading is criminally underused by agents - you don't need a header file if you can see a listing of all the functions in a library.

          Similarly, I don't read the whole file a function is in while editing it in an IDE, why should a coding agent get the whole file polluting its context by default?

      • drivingmenuts 29 minutes ago

        > all too human coding agents

        There is no actual thought occurring. Arguably, we can say the same about a lot of humans at any given moment, but with machines there never is. It's all statistics.

  • danpalmer 2 hours ago

    > The empirical literature shows that models are particularly vulnerable to naming-related errors like choosing misleading names, reusing names incorrectly, and losing track of which name refers to which value.

    I think Vera might be missing something here. In my experience, LLMs code better the less of a mental model you need, vs the more is in text on the page.

    Go – very little hidden, everything in text on the page, LLMs are great. Java, similar. But writing Haskell, it's pretty bad, Erlang, not wonderful. You need much more of a mental model for those languages.

    For Vera, not having names removes key information that the model would have, and replaces it with mental modelling of the stack of arguments.

    • robviren an hour ago

      I too have found the models do well with Go. I will say despite the backwards compatibility guarantee library API changes, what counts as "good" patterns, and new language additions do add some friction to the experience. Almost always works but it can be a bit inconsistent in how the code shows up.

    • rapind 2 hours ago

      > But writing Haskell, it's pretty bad,

      I’m surprised by this. Most likely significant white space is a big part of the problem (LLMs seem horrible at white space). Functional with types has been a win for me with Gleam.

    • sornaensis 2 hours ago

      I'm curious what issues you had with haskell? I have had the opposite experience and find them dreadful at Java et al.

      Surely, denser languages should be better for LLMs?

      • hgoel 33 minutes ago

        The context window also limits how deeply the model can "think", and it does this in natural language. So a language suited to LLMs would have balanced density, if it's too dense, the model spends many tokens working through the logic, if it's too sparse, it spends many tokens to read/write the code.

        I think in the context of already trained LLMs, the languages most suited to LLMs are also the ones most suited to humans. Besides just having the most code to train on, humans also face similar limitations, if the language is too dense they have to be very careful in considering how to do something, if it's too sparse, the code becomes a pain to maintain.

      • zem 24 minutes ago

        my (uninformed) speculation is that you want resilience and error correction, which implies some level of redundancy rather than pure density.

      • danpalmer an hour ago

        Density is a double edged sword. On the one hand you want to minimise context usage, but on the other hand more text on the page means more that the LLM can work with.

  • 2001zhaozhao 2 hours ago

    > There are no variable names. @Int.0 is the most recent Int binding; @Int.1 is the one before.

    You already lost me here. There's a reason variable names are a thing in programming, and that's to semantically convey meaning. This matters no matter whether a human is writing the code or a LLM.

    • ycombinatornews 26 minutes ago

      Same here, reminds of JIRA’s field_17190 in MCP responses instead of description (and in similar excel-like systems)

      Good luck managing hallucinations on that context

  • solomonb 2 hours ago

    I think Hindley Milner (for decidability) + Linear Types (for resource management) + Refinement Types (for lightly asserting invariants) + Delimited Continuation based Effects (for tracking effectful code) + Unison style Content Addressability (for corralling code changes, documentation, and tests) would make a really nice language for an LLM.

  • unignorant 2 hours ago

    This isn't my project, but I shared it here because it has a few important ideas I've been thinking about in my own work. Effect type systems in particular are a really good fit for LLMs because they allow you to reason very precisely about a program's capabilities before runtime (basically, using the type system for capability proofs). This helps you trust agent-created code (for example, you know it can't do IO), or, if the code does require certain capabilities, run it in a sandbox (e.g., mock network or filesystem). This kind of language design also provides a safer foundation for complex meta-systems of agents-that-create-agents, depending on how the runtime is implemented, though Vera may be somewhat limited in that particular respect.

    The major design decision I'm a little skeptical about is removing variable names; it would be interesting to see empirical data on that as it seems a bit unintuitive. I would expect almost the opposite, that variable names give LLMs some useful local semantics.

  • hybrid_study an hour ago

    I love the ## Why README section! Every repo should have one :-)

  • hyperhello 2 hours ago

    > Division by zero is not a runtime error — it is a type error. The compiler checks every call site to prove the divisor is non-zero.

    Elaborate a little here.

    • hgoel 23 minutes ago

      Presumably an analyzer that makes it an error to not have an immediately traceable zero check.

      C# can do something similar with null references. It can require you to indicate which arguments and variables are capable of being null, and then compiler error/warning if you pass it to something that expects a non-null reference without a null check.

      • hyperhello 14 minutes ago

        But that’s because null is a static type. Zero isn’t a static type. How can I know if a calculation produces zero if I can’t predict the result of it at compile time?

        • hgoel 3 minutes ago

          It's about if there's a possibility of it being zero. Of course there's no way to tell at compile time that a value will definitely be zero.

          So, in pseudocode

          int div(int a, int b): return a / b;

          Would probably be a compile time error, but

          int div(int a, int b): return b == 0 ? ERR : (a /b);

          Would not, or at least that's what I'd expect.

  • sas41 an hour ago

    I find the claims regarding LLMs and their mistake prone nature around variable names very confusing.

    It appears that me and creator have had vastly different experiences with LLMs and their capabilities with complex code bases and complicated business logic.

    My observations point to LLMs being much more successful when variables and methods have explicit, detailed names, it's the best way to keep them on track and minimize the chance of confusion, next closest thing being explicit comments and inline documentation.

    Poorly named and poorly documented things in a codebase only cause it to reason more on what it could be, often reaching a (wrong) conclusion, wasting tokens, wasting time.

    Perhaps this diversion in philosophy is due to fundamental differences in how we view the tool at hand.

    I do not trust the machine, as such I review it's output, and if the variables lacked names, that would be significantly harder. But if I had a "Jesus, take the wheel!" attitude, perhaps I'd care far less.

  • DonHopkins 2 hours ago

    This is exactly the wrong approach. LLMs are good at writing programming languages they already know, that are well represented in the training data, not at writing programming languages that they have never seen before, so that you have to include the entire programming language manual and lots of example code in every prompt.

  • ginko 2 hours ago

    Is there any evidence that using structural references rather than names allows large language models to generate better code? This bit just feels like obfuscation for obfustcation’s sake.

    • Dragon-Hatcher 2 hours ago

      I've read the FAQ (https://github.com/aallan/vera/blob/main/FAQ.md) that provides the justification for this and it is, IMO, fairly weak. The main argument is that misleading names can confuse models. I have no problem believing this bit I'm not sure why we should assume code will have misleading names. In fact, the same document says that in tests they've had LLMs mix up the indices, which is exactly the problem I would foresee. It seems especially messy that the name for the same variable will change in different places in the code. The utility of De Bruijn indices is easy substitutability of expressions, which seems like totally the wrong thing to optimize for in a programming language.

      Edit: the more I think about it the more this seems like a really bad idea. Three more issues come to mind: 1) it becomes impossible to grep for a variable, which I know agents do all the time. 2) editing code at the top of the function, say introducing a new variable, can require editing all the code in the rest of the function, even if it was semantically unchanged! 3) they say it is less context for the LLM to track but now, instead of just having to know the name of one variable, you have to keep track of every other variable in the function