Launch HN: Hypercubic (YC F25) – AI for COBOL and Mainframes

(hypercubic.ai)

29 points | by sai18 2 hours ago

13 comments

  • gregsadetsky 10 minutes ago

    I submitted this [0] story a few weeks ago, which led to some discussion and then got flagged, since (I think) people were unsure how verifiable the COBOL stats were (the submitted page had more than a tinge of self-promotional language).

    I was curious to ask you, as domain experts, if you could speak more to the "70% of the Fortune 500 still run on mainframes" stat you mentioned.

    Where do these numbers come from? And also, does it mean that those 70% of Fortune 500s literally run/maintain 1k-1M+ LoC of COBOL? Or do these companies depend on a few downstream specialized providers (financial, aviation/logistics, etc.) which do rely on COBOL?

    Like, is it COBOL all the way down, or is everything built in different ways, but basically on top of 3 companies, and those 3 companies are mostly doing COBOL?

    Thanks!

    [0] https://news.ycombinator.com/item?id=45644205

  • le-mark 9 minutes ago

    I’ve witnessed two legacy migration projects; both failed. One was a source translation to Java; it failed because they didn’t have the expertise to manage production Java applications and pulled the plug. The other was a rewrite that went over budget and was cancelled.

    > HyperDocs ingests COBOL, JCL, and PL/I codebases to generate documentation, architecture diagrams, and dependency graphs.

    Lots of tools available that do this already without AI.

    > The goal is to build digital “twins” of the experts on how they debug, architect, and maintain these systems in practice.

    That will be a neat trick; will the output be more than a sparsely populated wiki?

    My experience is there’s not a lot of will or money to follow these things through.

  • zurfer an hour ago

    I heard a story once about how you migrate these old systems:

    1. You build a super extensive test suite of input-output pairs.

    2. You do a "line by line" reimplementation in Java (well, banks like it).

    3. You run the test suite and track your progress.

    4. When you get to 100 percent, you send the same traffic to both systems and shadow-run the new implementation. Depending on how that goes, you either give up, go back to working on the implementation, or finally switch to the new system.

    This is obviously super expensive and slow, in order to minimize any sort of risk for systems that usually handle billions or trillions of dollars.
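
    To make steps 3 and 4 concrete, here's a minimal sketch of a parity harness (the class and method names are hypothetical, not from any real migration toolkit): replay the golden pairs against the new code, and in production keep the legacy answer authoritative while logging any divergence from the shadow copy.

      import java.util.List;
      import java.util.function.Function;

      // One recorded legacy transaction: the input sent in and the
      // output the COBOL system produced for it (step 1's golden pairs).
      record GoldenPair(String input, String expectedOutput) {}

      public class ParityHarness {

          // Step 3: replay every golden pair against the new Java
          // implementation and report the fraction that matches exactly.
          static double matchRate(List<GoldenPair> suite,
                                  Function<String, String> newImpl) {
              long hits = suite.stream()
                  .filter(p -> p.expectedOutput().equals(newImpl.apply(p.input())))
                  .count();
              return (double) hits / suite.size();
          }

          // Step 4: shadow run. Callers only ever see the legacy answer;
          // the candidate gets the same input, and any divergence is
          // logged for investigation instead of reaching production.
          static String shadowRun(String input,
                                  Function<String, String> legacy,
                                  Function<String, String> candidate) {
              String authoritative = legacy.apply(input);
              try {
                  String shadow = candidate.apply(input);
                  if (!authoritative.equals(shadow)) {
                      System.err.printf("MISMATCH in=%s legacy=%s new=%s%n",
                              input, authoritative, shadow);
                  }
              } catch (RuntimeException e) {
                  System.err.println("shadow threw on " + input + ": " + e);
              }
              return authoritative;
          }
      }

    The key property is that the legacy system stays the source of truth until the mismatch log goes quiet.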

    • aayn an hour ago

      Yeah, we've heard the same "big bang" story a bunch of times. However, step 1 (the extensive test suite) is often missing and is something you'd have to build as part of the modernization effort. Overall, it's a pretty hairy and difficult problem.
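
      For what it's worth, one way to retrofit that missing suite is to record it off live traffic before changing anything. A rough sketch, assuming a hypothetical string-in/string-out legacy entry point:

        import java.io.IOException;
        import java.nio.file.Files;
        import java.nio.file.Path;
        import java.nio.file.StandardOpenOption;
        import java.util.function.Function;

        // Hypothetical recorder: wraps the legacy entry point so every
        // production call also appends an (input, output) pair to disk,
        // building the golden suite that step 1 assumes already exists.
        public class GoldenRecorder {
            private final Path log;

            public GoldenRecorder(Path log) { this.log = log; }

            public Function<String, String> wrap(Function<String, String> legacy) {
                return input -> {
                    String output = legacy.apply(input);
                    try {
                        Files.writeString(log,
                                input + "\t" + output + System.lineSeparator(),
                                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
                    } catch (IOException e) {
                        // Recording must never break production traffic.
                        System.err.println("golden log write failed: " + e);
                    }
                    return output;
                };
            }
        }

      Tab-separated strings are a toy, of course; real mainframe transactions would mean capturing full record layouts, which is exactly why step 1 is the expensive part.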

  • koolba 29 minutes ago

    Neat stuff. For more esoteric environments that could use this type of automated leg up, check out MUMPS: https://en.wikipedia.org/wiki/MUMPS

    There's a bunch of mainly legacy hospital and government (primarily VA) systems that run on it. And where there's big government systems, there's big government dollars.

    • sai18 14 minutes ago

      Thanks for sharing. It seems MUMPS is just as old and entrenched as some of the COBOL systems!

  • jclay an hour ago

    Exciting work! I’ve often wondered if an LLM with the right harness could restore and optimize an aging C/C++ codebase. It would be quite compelling to get an old game engine running again on a modern system.

    I would expect most of these systems come with very carefully guarded access controls. It also strikes me as a uniquely difficult challenge to track down the decision maker who is willing to take the risk on revamping these systems (AI or not). Curious to hear more about what you’ve learned here.

    Also curious to hear how LLMs perform on a language like COBOL that likely doesn’t have many quality samples in the training data.

    • sai18 an hour ago

      Thank you!

      The decision makers we work with are typically modernization leaders and mainframe owners — usually director or VP level and above. There are a few major tailwinds helping us get into these enterprises:

      1. The SMEs who understand these systems are retiring, so every year that passes makes the systems more opaque.

      2. There’s intense top-down pressure across Fortune 500s to adopt AI initiatives.

      3. Many of these companies are paying IBM 7–9 figures annually just to keep their mainframes running.

      Modernization has always been a priority, but the perceived risk was enormous. With today’s LLMs, we’re finally able to reduce that risk in a meaningful way and make modernization feasible at scale.

      You’re absolutely right about COBOL’s limited presence in training data compared to languages like Java or Python. Because COBOL is highly structured and readable, current reasoning models get us to an acceptable level of performance, where it's now valuable to use them for these tasks. For near-perfect accuracy (95%+), we see a large opportunity to build domain-specific frontier models purpose-built for these legacy systems.

  • 1970-01-01 an hour ago

    This is weird. Mainframes are the dinosaurs of tech. Never would I think, 'let's add some AI' to a mainframe. It would be like adding opposable thumbs to dinosaurs. You are still solving the wrong problem, but it sure will be interesting watching the systems cope.

    • sai18 22 minutes ago

      Looks like it's already been pointed out below. We’re not applying AI to these systems; IBM is already pursuing those initiatives (https://research.ibm.com/blog/spyre-for-z).

      Our focus is different: we’re using AI to understand these 40+ year-old black box systems and capture the knowledge of the SMEs who built and maintain them before they retire. There simply aren’t enough engineers left who can fully understand or maintain these systems, let alone modernize them.

      The COBOL talent shortage has been a challenge for decades, and it’s only becoming more severe.

    • koolba 32 minutes ago

      IIUC, what's being solved here is not the mainframe; it's the lack of knowledge transfer about what the heck that software is doing and how it works. Anything that drives down the cost of answering those types of questions, whether for debugging or replacing, is going to be worth a lot to the dinosaurs running these systems.

  • mkoubaa an hour ago

    I'm excited for you, but nervous about the kinds of bugs this might introduce

    • sai18 42 minutes ago

      Given how far back these systems go, the real challenge isn’t just the code or the lack of documentation, but the tribal knowledge baked into them. A lot of the critical logic lives in conventions, naming patterns, and unwritten rules that only long-time SMEs understand.

      Using AI and a few different modalities of information that exist about these systems (existing code, docs, AI-driven interviews, and workflow capture), we can triangulate and extract that tribal knowledge.