A few months ago I saw a post on LinkedIn where someone fed the leading LLMs a counter-intuitively drawn circuit with 3 capacitors in parallel and asked what the total capacitance was. Not a single one got it right: they all said the caps were in series (they were not), and they even got the series capacitance calculation wrong. I couldn’t believe they whiffed it, so I checked myself, and sure enough I got the same results as the author. I tried all kinds of prompt magic to get the right answer… no dice.
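For reference, the arithmetic the models fumbled is trivial: parallel capacitances simply add, while series capacitances combine reciprocally. A quick sketch (using three equal 10 µF caps purely for illustration; the post didn't give values):

```python
def parallel_capacitance(caps):
    # Capacitors in parallel simply add.
    return sum(caps)

def series_capacitance(caps):
    # Capacitors in series combine reciprocally: 1/C_total = sum(1/C_i).
    return 1 / sum(1 / c for c in caps)

caps = [10e-6, 10e-6, 10e-6]            # three 10 uF caps, in farads
print(parallel_capacitance(caps))       # ~30 uF: the correct answer
print(series_capacitance(caps))         # ~3.33 uF: what it would be in series
```

The models managed to get both the topology and the (wrong) series formula wrong, which is the impressive part.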
I also saw an ad for an AI tool designed to help you understand schematics. The pitch shows what looks like a fairly generic guitar distortion pedal circuit; the tool correctly identifies a capacitor as blocking DC but fails to mention that it also functions as part of an RC high-pass filter. I chuckled when the voiceover proudly claimed “they didn’t even teach me this in 4 years of Electrical Engineering!” (Really? They don’t teach how capacitors block DC and how RC filters work????)
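The high-pass behavior the tool missed is one formula: the -3 dB corner of an RC high-pass is f_c = 1/(2πRC). A sketch with illustrative values (not taken from the ad; a 22 nF input cap and 10 kΩ resistor are just plausible pedal-input numbers):

```python
import math

def rc_highpass_cutoff(r_ohms, c_farads):
    # -3 dB corner frequency of a first-order RC high-pass filter.
    return 1 / (2 * math.pi * r_ohms * c_farads)

# Hypothetical pedal input stage: 22 nF series cap into 10 kOhm.
print(rc_highpass_cutoff(10e3, 22e-9))  # ~723 Hz corner
```

So the same "DC blocking" cap also shapes which bass frequencies reach the clipping stage, which is very much first-year EE material.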
If you’re in this space you probably need to compile your own carefully curated codex and train something more specialized. The general purpose ones struggle too much.
I studied mechatronics and did reasonably well... but in any electrical class I would just scrape by. I loved it but was apparently not suited to it. I remember a whole unit basically about transistors. On the software/mtrx side we were so happy treating MOSFETs as digital. Having to analyse them in more depth did my head in.
I had a similar experience, except Mechanical Engineering being my weakest area. Computer Science felt like a children's game compared to fluid dynamics...
I don’t mind LLMs in the ideation and learning phases, which aren’t reproducible anyway. But I still find it hard to believe engineers of all people are eager to put a slow, expensive, non-deterministic black box right at the core of extremely complex systems that need to be reliable, inspectable, understandable…
You find it hard to believe that non-deterministic black boxes at the core of complex systems are eager to put non-deterministic black boxes at the core of complex systems?
Yes I do! Is that some sort of gotcha? If I can choose between having a script that queries the db and generates a report and “Dave in marketing” who “has done it for years”, I’m going to pick the script. Who wouldn’t? Until machines can reliably understand, operate and self-correct independently, I’d rather not give up debuggability and understandability.
I think this comment and the parent comment are talking about two different things. One of you is talking about using nondeterministic ML to implement the actual core logic (an automated script or asking Dave to do it manually), and one of you is talking about using it to design the logic (the equivalent of which is writing that automated script).
LLMs are not good at actually doing the processing; they are not good at math or even text processing at a character level. They often get logic wrong. But they are pretty good at looking at patterns and finding creative solutions to new inputs (or at least what can appear creative, even if philosophically it’s more pattern matching than creativity). So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread and edit, and which a standard deterministic computer could just run verbatim to actually do the processing. Eventually maybe even Dave’s proofreading would be superfluous.
Tying this back to the original article, I don’t think anyone is proposing having an LLM inside a chip that processes incoming data in a non-deterministic way. The article is about using AI to design the chips in the first place. But the chips would still be deterministic, the equivalent of the script in this analogy. There are plenty of arguments to make about LLMs not being good enough for that, not being able to follow the logic or optimize it, or come up with novel architectures. But the shape of chip design/Verilog feels like something that, with enough effort, an AI could likely be built to be pretty good at. All of the knowledge that those smart, knowledgeable engineers who are good at writing Verilog have built up can almost certainly be represented in some AI form, and I wouldn’t bet against AI getting to a point where it can be helpful, similarly to how Copilot currently is with code completion. Maybe not perfect anytime soon, but good enough that we could eventually see a path to 100%. It doesn’t feel like there’s a fundamental reason this is impossible on a long enough time scale.
> So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit
Right, and there’s nothing fundamentally wrong with this, nor is it a novel method. We’ve been joking about copying code from stack overflow for ages, but at least we didn’t pretend that it’s the peak of human achievement. Ask a teacher the difference between writing an essay and proofreading it.
Look, my entire claim from the beginning is that understanding is important (epistemologically, it may be what separates engineering from alchemy, but I digress). Practically speaking, if we see larger and larger pieces of LLM-written code, it will be similar to Dave and his incomprehensible VBA script. It works, but nobody knows why. Don’t get me wrong, this isn’t new at all. It’s an ever-present wet blanket that slowly suffocates engineering ventures that don’t pay attention and actively resist. In that context, uncritically inviting a second wave of monkeys to the nuclear control panels is what baffles me.
> We’ve been joking about copying code from stack overflow for ages
Tangent for a slight pet peeve of mine:
"We" did joke about this, but probably because most of our jobs are not in chip design. "We" also know the limits of this approach.
The fact that Stack Overflow is the most SEO optimised result for "how to center div" (which we always forget how to do) doesn't have any bearing on the times when we have an actual problem requiring our attention and intellect. Say diagnosing a performance issue, negotiating requirements and how they subtly differ in an edge case from the current system behaviour, discovering a shared abstraction in 4 pieces of code that are nearly but not quite the same.
I agree with your posts here, the Stack Overflow thing in general is just a small hobby horse I have.
>If I can choose between having a script that queries the db and generates a report and “Dave in marketing” who “has done it for years”
If you could that would be nice wouldn't it? And if you couldn't?
If people were saying, "let's replace Casio Calculators with interfaces to GPT" then that would be crazy and I would wholly agree with you but by and large, the processes people are scrambling to place LLMs in are ones that typical machines struggle or fail and humans excel or do decently (and that LLMs are making some headway in).
You're making the wrong distinction here. It's not Dave vs your nifty script. It's Dave or nothing at all.
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist.
You compare it to the things it's meant to replace: humans. How well can the LLM do this compared to Dave?
I'm a non-deterministic black box who teaches complex deterministic machines to do stuff and leverages other deterministic machines as tools to do the job.
I like my job.
My job also involves cooperating with other non-deterministic black boxes (colleagues).
I can totally see how artificial non-deterministic black boxes (artificial colleagues) may be useful to replace/augment the biological ones.
For one, artificial colleagues don't get tired and I don't accidentally hurt their feelings or whatnot.
In any case, I'm not looking forward to replacing my deterministic tools with the fuzzy AI stuff.
Intuitively at least it seems to me that these non-deterministic black boxes could really benefit from using the deterministic tools for pretty much the same reasons we do as well.
Can you actually follow through with this line? I know there are literally tens of thousands of comments just like this at this point, but if you have the chance, could you explain what you think it means? What should we take from it? Just unpack it a little bit for us.
An interpretation that makes sense to me: humans are non-deterministic black boxes already at the core of complex systems. So in that sense, replacing a human with AI is not unreasonable.
I’d disagree, though: humans are still easier to predict and understand (and trust) than AI, typically.
With humans we have a decent understanding of what they are capable of. I trust a medical professional to provide me with medical advice and an engineer to provide me with engineering advice. With an LLM, it can be unpredictable at times, and it can make errors in ways that you would not imagine. Take the following examples from my tool, which show how GPT-4o and Claude 3.5 Sonnet can screw up.
In one example, GPT-4o could not tell that GitHub was spelled correctly.
I still believe LLMs are a game changer, and I'm currently working on what I call a "Yes/No" tool, which I believe will make trusting LLMs a lot easier (for certain things, of course). The basic idea is that the "Yes/No" tool will let you combine models, samples and prompts to come to a Yes or No answer.
Based on what I've seen so far, a model can easily screw up, but it is unlikely that all will screw up at the same time.
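The core of the idea is just majority voting across models and samples. A minimal sketch, with the model calls stubbed out as plain functions (real ones would be API calls; `yes_no`, the threshold, and the stubs are all illustrative, not the actual tool):

```python
from collections import Counter

def yes_no(models, prompt, samples=3, threshold=0.5):
    """Ask every model the same prompt several times, then
    majority-vote the answers down to a single Yes or No."""
    votes = []
    for model in models:
        for _ in range(samples):
            answer = model(prompt).strip().lower()
            votes.append("yes" if answer.startswith("y") else "no")
    tally = Counter(votes)
    return "Yes" if tally["yes"] / len(votes) > threshold else "No"

# Stub "models" for illustration only.
always_yes = lambda prompt: "Yes"
always_no = lambda prompt: "No"
print(yes_no([always_yes, always_yes, always_no], "Is GitHub spelled correctly?"))  # Yes
```

With real models the interesting knobs are the threshold and how you weight models that are known to be stronger on a given task.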
It's actually a great topic - both humans and LLMs are black boxes. And both rely on patterns and abstractions that are leaky. And in the end it's a matter of trust, like going to the doctor.
But we have had extensive experience with humans, so it is normal for our trust in them to be better defined; LLMs will come to be better understood as well. There is no central understander or truth, and that is the interesting part: it's a "blind men and the elephant" situation.
We are entering the nondeterministic programming era, in my opinion. LLM applications will be designed with the idea that we can't be 100% sure, and whatever solution provides the most safeguards will probably be the winner.
Because people are not saying "let's replace Casio Calculators with interfaces to GPT!"
By and large, the processes people are scrambling to place LLMs in are ones that typical machines struggle or fail and humans excel or do decently (and that LLMs are making some headway in).
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist. It's nonsensical actually. You compare it to the performance of the beings it's meant to replace or augment - humans.
Replacing non-deterministic black boxes with potentially better performing non-deterministic black boxes is not some crazy idea.
Sure. I mean, humans are very good at building businesses and technologies that are resilient to human fallibility. So when we think of applications where LLMs might replace or augment humans, it’s unsurprising that their fallible nature isn’t a showstopper.
Sure, EDA tools are deterministic, but the humans who apply them are not. Introducing LLMs to these processes is not some radical and scary departure, it’s an iterative evolution.
Ok yeah. I think the thing that trips me up with this argument is just: yes, when you regard humans in a certain neuroscientific frame and consider things like consciousness or language or will, they are fundamentally nondeterministic. But that isn't the frame of mind of the human engineer who does the work or even validates it. When the engineer is working, they aren't seeing themselves as some black box which they must feed input and get output; they are thinking about the things in themselves, justifying their work to themselves and others. Just because you can place yourself in some hypothetical third person here, one that oversees the model and the human and says "huh, yeah, they are pretty much the same, huh?", doesn't actually tell us anything about what's happening on the ground in either case, if you will. At the very least, this same logic would imply fallibility is one-dimensional and always statistical; "the patient may be dead, but at least they got a new heart." Like, isn't it important to be in love, not just be married? To borrow some Kant, shouldn't we still value what we can do when we think as if we aren't just some organic black box machines? Is there even a question there? How could it be otherwise?
It's really just that the "in principle" part of the overall implication of your comment, and so many others, doesn't make sense. It's very much cutting off your nose to spite your face. How could science itself be possible, much less engineering, if this is how we decided things? If we regarded ourselves always from the outside? How could we even be motivated to debate whether we should get the computers to design their own chips? When would something actually happen? At some point, people do have ideas, in a full, if false, transparency to themselves, that they can write down and share and explain. This is not only the thing that has gotten us this far; it is the very essence of why these models are so impressive in the certain ways that they are. It doesn't make sense to argue for the fundamental cheapness of the very thing you are ultimately trying to defend. And it imposes this strange perspective where we are not even living inside our own (phenomenal) minds anymore, where it fundamentally never matters what we think, no matter our justification. It's weird!
I'm sure you have a lot of good points and stuff, I just am simply pointing out that this particular argument is maybe not the strongest.
I took it to be a joke that the description "slow, expensive, non-deterministic black boxes" can apply to the engineers themselves. The engineers would be the ones who would have to place LLMs at the core of the system. To anyone outside, the work of the engineers is as opaque as the operation of LLMs.
In a reductive sense, this passage might as well read "You find it hard to believe that entropy is the source of other entropic reactions?"
No, I'm just disappointed in the decision of Black Box A and am bound to be even more disappointed by Black Box B. If we continue removing thoughtful design from our systems because thoughtlessness is the default, nobody's life will improve.
100% agree. While I can’t find all the sources right now, [1] and its references could be a good starting point for further exploration. I recall there being a proof or conjecture suggesting that it’s impossible to build an "LLM firewall" capable of protecting against all possible prompts, though my memory might be failing me.
> Edit: I believe that LLM's are eminently useful to replace experts (of all people) 90% of the time.
What do you mean by "expert"?
Do you mean the pundit who goes on TV and says "this policy will be bad for the economy"?
Or do you mean the seasoned developer who you hire to fix your memory leaks? To make your service fast? Or cut your cloud bill from 10M a year to 1M a year?
Experts of the kind that will be able to talk for hours about the academic consensus on the status quo without once considering how the question at hand might challenge it? Quite likely.
Experts capable of critical thinking and reflecting on evidence that contradicts their world model (and thereby retraining it on the fly)? Most likely not, at least not in their current architecture with all its limitations.
I know nothing about chip design.
But saying "Applying AI to field X won't work, because X is complex, and LLMs currently have subhuman performance at this" always sounds dubious.
VCs are not investing in the current LLM-based systems to improve X, they're investing in a future where LLM based systems will be 100x more performant.
Writing is complex, LLMs once had subhuman performance, and yet.
Digital art. Music (see suno.AI)
There is a pattern here.
I didn't get into this in the article, but one of the major challenges with achieving superhuman performance on Verilog is the lack of high-quality training data. Most professional-quality Verilog is closed source, so LLMs are generally much worse at writing Verilog than, say, Python. And even still, LLMs are pretty bad at Python!
That's probably where there's a big advantage to being a company like Nvidia, which has both the proprietary chip design knowledge/data and the resources/money and AI/LLM expertise to work on something specialized like this.
I strongly doubt this; they don't have enough training data either. You are confusing (I think) the scale of their success with the amount of Verilog they possess.
I.e., I think you are wildly underestimating the scale of training data needed, and wildly overestimating the amount of Verilog code possessed by Nvidia.
GPUs work by having moderate-complexity cores (in the scheme of things) that are replicated 8000 times or whatever.
That does not require having 8000 times as much useful verilog, of course.
The folks who have 8000 different chips, or 100 chips that each do 1000 things, would probably have orders of magnitude more Verilog to use for training.
That’s what your VC investment would be buying; the model of “pay experts to create a private training set for fine tuning” is an obvious new business model that is probably under-appreciated.
If that’s the biggest gap, then YC is correct that it’s a good area for a startup to tackle.
AI still has subhuman performance for art. It feels like the Venn diagram of people who are bullish on LLMs and people who don't understand logistic curves is a circle.
I like this reasoning. It is shortsighted to say that LLMs aren’t well-suited to something (because we cannot tell the future) but it is not shortsighted to say that LLMs are well-suited to something (because we cannot tell the future)
I kinda suspect that things that are expressed better with symbols and connections than with text will always be a poor fit for large LANGUAGE models. Turning what is basically a graph into a linear stream of text descriptions to tokenize and jam into an LLM has to be an incredibly inefficient and not very performant way of letting “AI” do magic on your circuits.
Ever try to get ChatGPT to play Scrabble? Ever try to describe the board to it and then all the letters available to you? Even its fancy-pants o1-preview performs absolutely horribly. Either my prompting completely sucks or an LLM is just the wrong tool for the job.
It’s great at scoring something you just created, provided you tell it what bonuses apply to which words and letters. But it has absolutely no concept of the board at all. You cannot use it to optimize your next move based on the board and your letters.
… I mean you might if you were extremely verbose about every letter on the board and every available place to put your tiles, perhaps avoiding coordinates and instead describing each word, its neighbors and relationships to bonus squares. But that just highlights how bad a tool an LLM is for scrabble.
Anyway, I’m sure schematics are very similar. Maybe someday we will invent good machine learning models for such things, but an LLM isn’t it.
> Writing is complex, LLMs once had subhuman performance,
And now they can easily replace mediocre human performance, and since they are tuned to provide answers that appeal to humans, that is especially true for these subjective-value use cases. Chip design doesn't seem very similar. It seems like a case where specifically trained tools would be of assistance. For some things, as much as generalist LLMs have surprised with their skill at specific tasks, it is very hard to see how training on a broad corpus of text could outperform specific tools. As for the first paragraph: do you really think it is dubious to say that a model trained on text won't outperform Stockfish at chess?
I worked on the Qualcomm DSP architecture team for a year, so I have a little experience with this area but not a ton.
The author here is missing a few important things about chip design. Most of the time spent and work done is not writing high-performance Verilog. Designers spend a huge amount of time answering questions, writing documentation, copying around boilerplate, reading obscure manuals and diagrams, etc. LLMs can already help with all of those things.
I believe that LLMs in their current state could help design teams move at least twice as fast, and better tools could probably change that number to 4x or 10x even with no improvement in the intelligence of models. Most of the benefit would come from allowing designers to run more experiments and try more things, to get feedback on design choices faster, to spend less time documenting and communicating, and spend less time reading poorly written documentation.
Author here -- I don't disagree! I actually noted this in the article:
> Well, it turns out that LLMs are also pretty valuable when it comes to chips for lucrative markets -- but they won’t be doing most of the design work. LLM copilots for Verilog are, at best, mediocre. But leveraging an LLM to write small snippets of simple code can still save engineers time, and ultimately save their employers money.
I think designers getting 2x faster is probably optimistic, but I also could be wrong about that! Most of my chip design experience has been at smaller companies, with good documentation, where I've been focused on datapath architecture & design, so maybe I'm underestimating how much boilerplate the average engineer deals with.
Regardless, I don't think LLMs will be designing high-performance datapath or networking Verilog anytime soon.
At large companies with many designers, a lot of time is spent coordinating and planning. LLMs can already help with that.
As far as design/copilot goes, I think there are reasons to be much more optimistic. Existing models haven't seen much Verilog. With better training data it's reasonable to expect that they will improve to perform at least as well on Verilog as they do on python. But even if there is a 10% chance it's reasonable for VCs to invest in these companies.
I’m actually curious if there even is a large enough corpus of Verilog out there. I have noticed that even tools like Copilot tend to perform poorly when working with DSLs that are majority open source code (on GitHub no less!) where the practical application is niche. To put this in other terms, Copilot appears to _specialize_ on languages, libraries and design patterns that have wide adoption, but does not appear to be able to _generalize_ well to previously unseen or rarely seen languages, libraries, or design patterns.
Anyway that’s largely anecdata/sample size of 1, and it could very well be a case of me holding the tool wrong, but that’s what I observed.
Design automation tooling startups have it incredibly hard: first, customers won't buy from startups, and second, the space of possible exits via acquisition is tiny.
I agree with most of the technical points of the article.
But there may still be value in YC calling for innovation in that space. The article correctly shows that there is no easy win in applying LLMs to chip design. Either the market for a given application is too small, in which case LLMs can help but who cares; or the chip is too important, in which case you'd rather use the best engineers. Unlike software, we're not getting much of a long-tail effect in chip design. Taping out a chip is just not something a hacker can do, and even playing with an FPGA has a high cost of entry compared to hacking on your PC.
But if there was an obvious path forward, YC wouldn't need to ask for an innovative approach.
You could say it is the naive arrogance of the beginner's mind.
Seen here as well, when George Hotz attempted to overthrow the chip companies with his plan for an AI chip https://geohot.github.io/blog/jekyll/update/2021/06/13/a-bre... little realizing the complexity involved. To his credit, he quickly pivoted to being a software and tiny-box maker.
Even obvious can be risky. First it's nice to share the risk, second more investments come with more connections.
As for the LLM boom: I think we'll finally realize that an LLM combined with algorithms can do much more than an LLM alone. 'Algorithms' is probably a bad word here; I mean assisting tools like databases, algorithms, other models. Then, for example, only the access API needs to be trained into the LLM instead of the whole dataset.
I know several founders who went through YC in the chip design space, so even if the people running YC don't have a chip design background, just like VCs, they learn from hearing pitches of the founders who actually know the space.
The way I read that, I think they're saying hardware acceleration of specific algorithms can be 100 times faster and more efficient than the same algorithm in software on a general purpose processor, and since automated chip design has proven to be a difficult problem space, maybe we should try applying AI there so we can have a lower bar to specialized hardware accelerators for various tasks.
I do not think they mean to say that an AI would be 100 times better at designing chips than a human, I assume this is the engineering tradeoff they refer to. Though I wouldn't fault anyone for being confused, as the wording is painfully awkward and salesy.
I also think OP is missing the point saying the target applications are too small of a market to be worth pursuing.
They’re too small to pursue any single one as the market cap for a company, but presumably the fictional AI chip startup could pursue many of these smaller markets at once. It would be a long tail play, wouldn’t it?
YC doesn't care whether it "makes sense" to use an LLM to design chips. They're as technically incompetent as any other VC, and their only interest is to pump out dogshit startups in the hopes it gets acquired. Gary Tan doesn't care about "making better chips": he cares about finding a sucker to buy out a shitty, hype-based company for a few billion. An old school investment bank would be perfect.
YC is technically incompetent and isn't about making the world better. Every single one of their words is a lie and hides the real intent: make money.
First, VCs don't get paid when "dogshit startups" get acquired, they get paid when they have true outlier successes. It's the only way to reliably make money in the VC business.
Second, want to give any examples of "shitty, hype-based compan[ies]" (I assume you mean companies with no real revenue traction) getting bought out for "a few billion".
Third, investment banks facilitate sales of assets, they don't buy them themselves.
Maybe sit out the conversation if you don't even know the basics of how VC, startups, or banking work?
They (YC) are interested in the use of LLMs to make the process of designing chips more efficient. Nowhere do they talk about LLMs actually designing chips.
I don't know anything about chip design, but like any area in tech I'm certain there are cumbersome and largely repetitive tasks that can't easily be done by algorithms but can be done with human oversight by LLMs. There's efficiency to be gained here if the designer and operator of the LLM system know what they're doing.
I'd want to know about the results of these experiments before casting judgement either way. Generative modeling has actual applications in the 3D printing/mechanical industry.
One of the consistent problems I'm seeing over and over again with LLMs is people forgetting that they're limited by the training data.
Software engineers get hyped when they see the progress in AI coding and immediately begin to extrapolate to other fields—if Copilot can reduce the burden of coding so much, think of all the money we can make selling a similar product to XYZ industries!
The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on. We've spent the last 20+ years writing millions and millions of lines of code that we published on the internet, not to mention answering questions on Stack Overflow (which still has 3x as many answers as all other Stack Exchanges combined [0]), writing technical blogs, hundreds of thousands of emails in public mailing lists, and so on.
Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do. Ethics of the mass harvesting aside, it's simply not possible for an LLM to have the same skill level in ${insert industry here} as they do with software, so you can't extrapolate from Copilot to other domains.
Yes this is EXACTLY it, and I was discussing this a bit at work (financial services).
In software, we've all self taught, improved, posted Q&A all over the web. Plus all the open source code out there. Just mountains and mountains of free training data.
However software is unique in being both well paying and something with freely available, complete information online.
A lot of the rest of the world remains far more closed, almost an apprenticeship system. In my domain that's things like company fundamental analysis, algo/quant trading, etc. There are lots of books you can buy from the likes of Dalio, but no real (good) step-by-step research and investment process information online.
Likewise, I'd imagine heavily patented/regulated/IP-driven industries like chip design, drug design, etc. are similarly closed. Maybe companies using an LLM on their own data internally could make something of it, but it's also quite likely there is no 'data' so much as tacit knowledge handed down over time.
>The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on... millions of lines of code that we published on the internet...
> Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do.
You are correct! There's lots of information available publicly about certain things like code, and writing SQL queries. But other specialized domains don't have the same kind of information trained into the heart of the model.
But importantly, this doesn't mean LLMs can't provide significant value in these other, more niche domains. They still can, and I do this every day in my day job. But it's a lot of work. We (as AI engineers) have to deeply understand the specialized domain knowledge. The basic process is this:
1. Learn how the subject matter experts do the work.
2. Teach the LLM to do this, using examples, giving it procedures, walking it through the various steps and giving it the guidance and time and space to think. (Multiple prompts, recipes if you will, loops, external memory...)
3. Evaluation, iteration, improvement
4. Scale up to production
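Stripped to its bones, step 2's "recipe" is structured prompting plus a loop. A toy sketch, with the LLM call stubbed out as a plain function (the situation names, steps, and `run_recipe` itself are all hypothetical illustrations, not my production setup):

```python
def run_recipe(llm, case, situations):
    """Classify the case into one of the expert-defined situations,
    then walk through that situation's procedure step by step,
    carrying forward notes as external memory."""
    kinds = ", ".join(situations)
    kind = llm(f"Which situation is this case: {kinds}?\n\n{case}")
    notes = []
    for step in situations.get(kind, []):
        notes.append(llm(f"Step: {step}\nCase: {case}\nNotes so far: {notes}"))
    return kind, notes

# Stub LLM for illustration; a real one would be an API call.
def stub_llm(prompt):
    return "refund" if "Which situation" in prompt else "done"

situations = {"refund": ["verify order", "issue refund"], "complaint": ["escalate"]}
kind, notes = run_recipe(stub_llm, "Customer wants money back", situations)
```

The hard part is never this scaffolding; it's getting the subject matter expert to articulate the situations and steps in the first place.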
In many domains I work in, it can be very challenging to get past step 1. If I don't know how to do it effectively, I can't guide the LLM through the steps. Consider an example question like "what are the top 5 ways to improve my business" -- the subject matter experts often have difficulty teaching me how to do that. If they don't know how to do it, they can't teach it to me, and I can't teach it to the agent. Another example that will resonate with nerds here is being an effective Dungeons and Dragons DM. But if I actually learn how to do it, and boil it down into repeatable steps, and use GraphRAG, then it becomes another thing entirely. I know this is possible, and expect to see great things in that space, but I estimate it'll take another year or so of development to get it done.
But in many domains, I get access to subject matter experts that can tell me pretty specifically how to succeed in an area. These are the top 5 situations you will see, how you can identify which situation type it is, and what you should do when you see that you are in that kind of situation. In domains like this I can in fact make the agent do awesome work and provide value, even when the information is not in the publicly available training data for the LLM.
There's this thing about knowing a domain area well enough to do the job, but not having enough mastery to teach others how to do the job. You need domain experts that understand the job well enough to teach you how to do it, and you as the AI engineer need enough mastery over the agent to teach it how to do the job as well. Then the magic happens.
When we get AGI, we can move past this limitation of needing to know how to do the job ourselves. Until then, this is how we deliver impact with agents.
This is why I say that even if LLM technology does not improve at all beyond where it was a year ago, we still have many years' worth of untapped potential for AI. It just takes a lot of work, and most engineers today don't understand how to do that work, principally because they're too busy saying today's technology can't do it rather than trying to learn how.
> 1. Learn how the subject matter experts do the work.
This will get harder over time, I think, as the low-hanging-fruit domains are picked; the barrier will be people, not technology. Especially if the moat for that domain or company is the very knowledge you are trying to acquire. (In some industries that knowledge isn't the moat, and using AI to shed more jobs is a win.) Most industries that don't have their workings public on the internet share a couple of characteristics that will make step 1 on your list extremely difficult. The biggest is that every person on the street now knows, through mainstream news, that it's not great to be a software engineer right now, and most media outlets point straight at "AI". "Sucks to be them," I've heard people say; what was once a profession of respect is now "how long do you think you have? 5 years? What will you do instead?"
This creates massive resistance, and potentially outright lies, when providing AI developers with information - there is a precedent for what happens if you do, and it isn't good for the person or company with the knowledge. Doctors' associations, apprenticeship schemes, and industry bodies I've worked with are all now starting to care a lot more about information security because of "AI", and about proprietary methods of working, lest AI accidentally "train on them". It has definitely boosted the demand for cyber people again around here, as one example.
> You are correct! There's lots of information available publicly about certain things like code, and writing SQL queries. But other specialized domains don't have the same kind of information trained into the heart of the model.
The nightmare of anyone who studied and invested in a skill set, according to most people you would meet. I think most practitioners will consciously ensure that the lack of data to train on stays that way for as long as possible - even if it eventually gets there, the slower it happens and the more out of date the data is, the more valuable that person's human skill and economic value remain. How many people would have contributed to open source if they had known LLMs were coming, for example? Some may have, but I think there would have been fewer, all else being equal. Maybe quite a bit less code, to the point that AI would have been delayed further - tbh, if Google had known that LLMs could scale to be what they are, they wouldn't have let that "attention" paper be released either, IMO. Anecdotally, even the blue-collar workers I know are now hesitant to let anyone near their methods of working and their craft - survival, family, etc. come first. In the end, work is a means to an end for most people.
Unlike us techies, who I find at times are not "rational economic actors", many non-tech professionals don't see AI as an opportunity - they see it as a threat they need to counter. At best, they think they need to adopt AI before others have it, and make sure no one else gets it. "No one wants this, but if you don't do it others will and you will be left behind" is a common statement from people I've chatted to. One person likened it to a nuclear arms race - not a good thing, but if you don't do it you will be under threat later.
> This will get harder I think over time as low hanging fruit domains are picked - the barrier will be people not technology. Especially if the moat for that domain/company is the knowledge you are trying to acquire (NOTE: Some industries that's not their moat and using AI to shed more jobs is a win).
Also consider that quite a lot of subject matter experts simply are not AI fanboys - not because they fear losing their jobs to AI, but because they consider the whole AI hype insanely annoying and infuriating. To get them to work with an AI startup, you will thus have to pay them quite a lot of money.
This is a great article, but the main principle at YC is to assume that technology will continue progressing at an exponential rate and then think about what that would enable. Their proposals always assume the startups will ride some kind of Moore's Law for AI, and hardware synthesis is an obvious use case. So the assumption is that in 2 years there will be a successful AI hardware-synthesis company, and all they're trying to do is get ahead of the curve.
I agree they're probably wrong but this article doesn't actually explain why they're wrong to bet on exponential progress in AI capabilities.
I think the problem with this particular challenge is that it is incredibly non-disruptive to the status quo. There are already hundreds of billions flowing into using LLMs, as well as GPUs, for chip design. Nvidia has of course laid the groundwork with its cuLitho efforts. This kind of research area is very hot in the research world as well. It's by no means difficult to pitch to a VC. So why should YC back it? I'd love to see YC identifying areas where VC dollars are not flowing. Unfortunately, the other challenges are mostly the same - govtech, civictech, defense tech. These are all areas where VC dollars are now happily flowing since companies like Anduril made them plausible.
As a former chip designer (been 16 years, but looks like tools and our arguments about them haven't changed much), I'm both more and less optimistic than OP:
1. More because fine-tuning with enough good Verilog as data should let the LLMs do better at avoiding mediocre Verilog (existing chip companies have more of this data already though). Plus non-LLM tools will remain, so you can chain those tools to test that the LLM hasn't produced Verilog that synthesizes to a large area, etc
2. Less because when creating more chips for more markets (if that's the interpretation of YC's RFS), the limiting factor will become the cost of using a fab (mask sets cost millions), and then integrating onto a board/system the customer will actually use. A half-solution would be if FPGAs embedded in CPUs/GPUs/SiPs on our existing devices took off
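The tool-chaining idea in point 1 is straightforward to sketch. A minimal guardrail, assuming a report format modeled loosely on yosys's `stat` output; a real flow would invoke the synthesis tool via subprocess and parse its actual report, and the budget number here is invented.

```python
import re

# Guardrail sketch: after an LLM emits Verilog, run it through a synthesis
# tool and reject designs that blow past an area budget. The report below
# is a hand-written stand-in shaped like yosys `stat` output.
CELL_BUDGET = 500

def cell_count(stat_report: str) -> int:
    m = re.search(r"Number of cells:\s+(\d+)", stat_report)
    if m is None:
        raise ValueError("no cell count found in report")
    return int(m.group(1))

def passes_area_check(stat_report: str, budget: int = CELL_BUDGET) -> bool:
    return cell_count(stat_report) <= budget

sample_report = """
=== top ===
   Number of wires:                742
   Number of cells:                389
"""
```

The LLM never has to be trusted about area: the deterministic tool chain is the judge, and failing designs just get regenerated.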
I don't know the space well enough, but I think the missing piece is that YC 's investment horizon is typically 10+ years. Not only LLMs could get massively better, but the chip industry could be massively disrupted with the right incentives. My guess is that that is YC's thesis behind the ask.
Even the serious idea that the article thinks could work is throwing the unreliable LLMs at verification! If there's any place you can use something that doesn't work most of the time, I guess it's there.
Only if it fails in the same way. LLMs and the multi-agent approach operate under the assumption that they are programmable agents, and each agent is a trade-off among failure modes. If you can string them together, and if the output is easily verified, it can be a great fit for the problem.
Once it was spices. Then poppies. Modern art. The .com craze. Those blockchain ape images. Blockchain. Now LLM.
All of these had a bit of true value and a whole load of bullshit. Eventually the bullshit disappears and the core remains, and the world goes nuts about the next thing.
Exactly. I’ve seen this enough now to appreciate that oft repeated tech adoption curve. It seems like we are in “peak expectations” phase which is immediately followed by the disillusionment and then maturity phase.
If your LLM is producing a proof that can be checked by another program, then there’s nothing wrong with their reliability. It’s just like playing a game whose rules are a logical system.
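A toy version of that game, with the "prover" standing in for an unreliable LLM and multiplication as the deterministic checker (the factoring task and function names are purely illustrative):

```python
import random

# Generate-and-verify sketch: the prover may be unreliable, but the
# deterministic checker makes any accepted output trustworthy.
def unreliable_prover(n: int) -> tuple[int, int]:
    a = random.randint(2, n - 1)     # usually wrong, like a hallucinating model
    return a, n // a

def checker(n: int, proof: tuple[int, int]) -> bool:
    a, b = proof
    return a * b == n and a > 1 and b > 1

def prove_with_retries(n: int, attempts: int = 10_000):
    for _ in range(attempts):
        proof = unreliable_prover(n)
        if checker(n, proof):        # only checked proofs are accepted
            return proof
    return None
```

The asymmetry is the whole point: verifying a candidate is cheap and exact even when producing one is unreliable, so the system as a whole never emits an unchecked answer.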
That’s because we are still waiting for the 2008 bubble to pop, which was re-inflated by the 2020 bubble. It’s going to be bad. People will blame Trump; Harris would be eating the same shit sandwich.
I think the operators are learning how to hype-edge. You find that sweet spot between promising and 'not just there yet' where you can take lots of investments and iterate forward just enough to keep it going.
It doesn't matter if it can't actually 'get there' as long as people still believe it can.
Come to think about it, a socioeconomic system dependent on population and economic growth is at a fundamental level driven by this balancing act: "We can solve every problem if we just forge ahead and keep enlarging the base of the pyramid - keep reproducing, keep investing, keep expanding the infrastructure".
It's similar in regular programming - LLMs are better at writing test code than actual code. Mostly because checking is simpler than solving (P vs NP, etc.), but I think also because it's less obvious when test code doesn't work.
Replace all asserts with expected == expected and most people won't notice.
LLMs are pretty damn useful for generating tests, getting rid of a lot of tedium, but yeah, it's the same as human-written tests: if you don't check that your test doesn't work when it shouldn't (not the same thing as just writing a second test for that case - both those tests need to fail if you intentionally screw with their separate fixtures), then you shouldn't have too much confidence in your test.
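The point about checking that a test can actually fail is easy to demonstrate. A self-contained sketch (the function names are invented for illustration):

```python
# A test only carries information if it FAILS on a broken implementation.
def add(a, b):
    return a + b

def broken_add(a, b):
    return a - b          # deliberately wrong implementation

def tautological_test(impl) -> bool:
    expected = 5
    return expected == expected        # passes no matter what impl does

def real_test(impl) -> bool:
    return impl(2, 3) == 5

# The tautological test cannot tell the implementations apart;
# the real one can.
assert tautological_test(add) and tautological_test(broken_add)
assert real_test(add) and not real_test(broken_add)
```

Mutation-testing tools automate exactly this check: they break the code on purpose and flag any test that still passes.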
If LLMs can generate a test for you, it's because it's a test that you shouldn't need to write. They can't test what is really important, at all.
Some development stacks are extremely underpowered for code verification, so they do patch the design issue. Just like some stacks are underpowered for abstraction and need patching by code generation. Both of those solve an immediate problem, in a haphazard and error-prone way, by adding burden on maintenance and code evolution linearly to how much you use it.
And worse, if you rely too much on them they will lead your software architecture and make that burden superlinear.
BTW, I obviously didn't just type "make a lexer and multi-pass parser that returns multiple errors and then make a single-line instance of a Monaco editor with error reporting, type checking, syntax highlighting and tab completion".
I put it together piece-by-piece and with detailed architectural guidance.
> Replace all asserts with expected ==expected and most people won't notice.
Those tests were very common back when I used to work in Ruby on Rails and automatically generating test stubs was a popular practice. These stubs were often just converted into expected == expected tests so that they passed, and then left like that.
I mean, define ‘better’. Even with actual human programmers, tests which do not in fact test the thing are already a bit of an epidemic. A test which doesn’t test is worse than useless.
I had a discussion with a manager at a client last week and was trying to run him through some (technical) issues relating to challenges an important project faces.
His immediate response was that maybe we should just let ChatGPT help us decide the best option. I had to bite my tongue.
OTOH, I'm more and more convinced that ChatGPT will replace managers long before it replaces technical staff.
Yes, that's how we progress. This is how the internet boom happened as well: everything became .com, then the real workable businesses were left and all the unworkable things were gone.
Recently I came across someone advertising an LLM to generate fashion magazine shoots in Pakistan at 20-25% of the cost. It hit me then that they are undercutting the fashion shoots of a country like Pakistan, which is already 90-95% cheaper than most Western countries. This AI is replacing the work of 10-20 people.
The annoying part: a lot of money can be funneled into these unworkable businesses in the process, crypto being a good example. And these unworkable businesses tend to keep trying to get their way to the money somehow regardless. The most recent example was funneling money from Russia into Trump's campaign.
> The annoying part, a lot of money could be funneled into these unworkable businesses in the process, crypto being a good example
There was a thread here about why Y Combinator invests in several competing startups. The answer is that success is often more about connections and politics than the product itself. And crypto, yes, is a good example of this. Musk will get his $1B in bitcoins back for sure.
> Most recent example was funneling money from Russia into Trump’s campaign.
LLMs have powered products used by hundreds of millions, maybe billions. Most experiments will fail and that's okay, arguably even a good thing. Only time will tell which ones succeed
This makes complete sense from an investor’s perspective, as it increases the chances of a successful exit. While we focus on the technical merits or critique here on HN/YC, investors are playing a completely different game.
To be a bit acerbic, and inspired by Arthur C. Clarke, I might say: "Any sufficiently complex business could be indistinguishable from Theranos".
Theranos was not a "complex business". It was deliberate fraud and deception, with investors who were just gullible. The investors should have demanded to see concrete results.
I expected you to take this with a grain of salt but also to read between the lines: while some projects involve deliberate fraud, others may simply lack coherence and inadvertently follow the principles of the greater fool theory [1]. The use of ambiguous or indistinguishable language often blurs the distinction, making it harder to differentiate outright deception from an unsound business model.
Mostly because they were not making claims that sentient microwaves that would cook your food for you were just around the corner which then the most respected media outlets parroted uncritically.
Fuzzy logic rice cookers are the result of an unrelated fad in 1990s Japanese engineering companies. They added fuzzy controls to everything from cameras to subways to home appliances. It's not part of the current ML fad.
I mean, they were at one point making pretty extravagant claims about microwaves, but to a less credulous audience. Trouble with LLMs is that they look like magic if you don’t look too hard, particularly to laypeople. It’s far easier to buy into a narrative that they actually _are_ magic, or will become so.
I feel like what makes this a bit different from just regular old sufficiently advanced technology is the combination of two things:
- LLMs are extremely competent at surface-level pattern matching and manipulation of the type we'd previously assumed that only AGI would be able to do.
- A large fraction of tasks (and by extension jobs) that we used to, and largely still do, consider to be "knowledge work", i.e. requiring a high level of skill and intelligence, are in fact surface-level pattern matching and manipulation.
Reconciling these facts raises some uncomfortable implications, and calling LLMs "actually intelligent" lets us avoid these.
> I knew it was bullshit from the get-go as soon as I read their definition of AI agents.
That is one spicy article, it got a few laughs out of me. I must agree 100% that Langchain is an abomination, both their APIs as well as their marketing.
This is not my domain, so my knowledge is limited, but I wonder if chip designers have some sort of standard library of ready-to-use components. Do you have to design e.g. an ALU every time you design a new CPU, or is there some standard component to use? I think having proven components that can be glued together at a higher level may be the key to productivity here.
Returning to LLMs: I think the problem here may be that there is simply not enough learning material for the LLM. Verilog, compared to C, is a niche with little documentation and even less open-source code. If open hardware were more popular, I think LLMs could learn to write better Verilog code. Maybe the key is to persuade hardware companies to share their closed-source code to train LLMs, for the industry's benefit?
The most common thing you see shared is something called IP, which does stand for intellectual property, but in this context you can think of it like buying ICs that you integrate into your design (i.e., you wire them up). You can also get the Verilog, but that is usually used for verification rather than for taping out the peripheral. This is because the company you buy the IP from will tape out the design for a specific node in order to guarantee the specifications. Examples of this would be everything from ARM cores to UART and SPI controllers, as well as pretty much anything you could buy as a standalone IC.
Or learning through self-play. Chip design sounds like an area where (this would be hard!) a sufficiently powerful simulator and/or FPGA could allow reinforcement learning to work.
Current LLMs can’t do it, but the assumption that that’s what YC meant seems wildly premature.
I did my PhD on trying to use ML for EDA (de novo design/topology generation, because deepmind was doing placement and I was not gonna compete with them as a single EE grad who self taught ML/optimization theory during the PhD).
In my opinion, part of the problem is that training data is scarce (real-world designs are literally called "IP" in the industry, after all...), but more than that, circuit design is basically program synthesis, which means it's _hard_. Even if you try to be clever, dealing with graphs and designing discrete objects involves many APX-hard/APX-complete problems, which is _FUN_ on the one hand, but also means it's tricky to just scale through, when the object you are trying to produce is a design that can cost millions if there's a bug...
I disagree with most of the reasoning here, and think this post misunderstands the opportunity and economic reasoning at play here.
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
This is very obviously not the intent of the passage the author quotes. They are clearly talking about the speedup that can be gained from ASICs for a specific workload, eg dedicated mining chips.
> High-level synthesis, or HLS, was born in 1998, when Forte Design Systems was founded
This sort of historical argument is akin to arguing “AI was bad in the 90s, look at Eliza”. So what? LLMs are orders of magnitude more capable now.
> Ultimately, while HLS makes designers more productive, it reduces the performance of the designs they make. And if you’re designing high-value chips in a crowded market, like AI accelerators, performance is one of the major metrics you’re expected to compete on.
This is the crux of the author's misunderstanding.
Here is the basic economics explanation: creating an ASIC for a specific use is normally cost-prohibitive because the cost of the inputs (chip design) is much higher than the outputs (performance gains) are worth.
If you can make ASIC design cheaper on the margin, and even if the designs are inferior to what an expert human could create, then you can unlock a lot of value. Think of all the places an ASIC could add value if the design was 10x or 100x cheaper, even if the perf gains were reduced from 100x to 10x.
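Back-of-the-envelope, that argument looks like this; all figures are invented purely for illustration, not real quotes.

```python
# Toy break-even model for the "cheaper ASIC design unlocks value" claim.
def breakeven_units(design_cost, mask_cost, value_per_unit):
    """Units you must ship before the ASIC pays for itself."""
    return (design_cost + mask_cost) / value_per_unit

mask_cost = 2_000_000        # mask set on a mature node (assumed)
value_per_unit = 40          # perf/power savings per chip vs. a GPU (assumed)

today = breakeven_units(10_000_000, mask_cost, value_per_unit)   # expert team
cheaper = breakeven_units(100_000, mask_cost, value_per_unit)    # 100x cheaper design
# Even if the cheap design is worse and the per-unit value drops to $10,
# it can still break even at volumes many niche markets can reach.
cheaper_worse = breakeven_units(100_000, mask_cost, 10)
```

With these made-up numbers, the break-even volume falls from 300k units to about 52k, and stays at 210k even with 4x worse per-unit value: the design cost, not design quality, is what gates most niches.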
The analogous argument is “LLMs make it easier for non-programmers to author web apps. The code quality is clearly worse than what a software engineer would produce but the benefits massively outweigh, as many domain experts can now author their own web apps where it wouldn’t be cost-effective to hire a software engineer.”
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers. While LLMs are capable of writing functional Verilog sometimes, their performance is still subhuman. [...] LLMs primarily pump out mediocre Verilog code.
What is the quality of Verilog code output by humans? Is it good enough so that a complex AI chip can be created? Or does the human need to use tools in order to generate this code?
I've got the feeling that LLMs will be capable of doing everything a human can do, in terms of thinking. But there shouldn't be an expectation that an LLM does everything at once, which in this context would mean thinking about the chip and creating the final files in a single pass and without external help. And by external help I don't mean us humans, but specialized tools that also generate additional data (like embeddings) which the LLM (or another LLM) can use in the next pass to evaluate the design. And once we humans have spent enough time creating these additional tools, there will come a time when LLMs will also be able to create improved versions of them.
I mean, when I once randomly checked the content of a file in The Pile, I found a Craigslist "ad" for an escort offering her services. No chip-generating AI needs to have this in its parameters in order to do its job. So there is a lot of room for improvement, and this improvement will come over time. Such an LLM doesn't need to know that much about humans.
If cryptocurrency mining could be significantly optimized (one of the example goals in the article) wouldn't that just destroy the value of said currency?
I think this whole article is predicated on misinterpreting the ask. It wasn't for the chip to take 100x less power, it was for the algorithm the chip implements. Modern synthesis tools and optimisers extensively look for design patterns, the same way software compilers do. That's why there are recommended inference patterns. I don't think it's unreasonable to expect an LLM to expand the capture range of these patterns to cover even suboptimal HDL. As a simple example, maybe a designer got really turned around and is doing some crazy math, and the LLM can go "uh, that's just addition my guy, I'll fix that for you."
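That "it's just addition" rewrite is the same move a compiler peephole pass makes. A toy version over Python ASTs; real synthesis optimizers work on netlists, so this only illustrates the pattern-capture idea.

```python
import ast

# Toy peephole rewriter in the spirit of synthesis-tool pattern matching:
# recognize `x + x` and rewrite it to `2 * x`.
class AddToMul(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)     # rewrite children first
        if (isinstance(node.op, ast.Add)
                and ast.dump(node.left) == ast.dump(node.right)):
            return ast.BinOp(left=ast.Constant(2), op=ast.Mult(),
                             right=node.left)
        return node

def simplify(expr: str) -> str:
    tree = ast.parse(expr, mode="eval")
    tree = ast.fix_missing_locations(AddToMul().visit(tree))
    return ast.unparse(tree)
```

The pitch for LLMs here is widening the net: a hand-written pass only catches the exact shapes its author anticipated, while a model might recognize the same intent written in a convoluted way.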
Was surprised this comment was this far down. I re-read the YC ask three times to make sure I wasn’t crazy. Dude wrote the whole article based on a misunderstanding.
This heavily overlaps with my current research focus for my Ph.D., so I wanted to provide some additional perspective to the article. I have worked with Vitis HLS and other HLS tools in the past to build deep learning hardware accelerators. Currently, I am exploring deep learning for design automation and using large language models (LLMs) for hardware design, including leveraging LLMs to write HLS code. I can also offer some insight from the academic perspective.
First, I agree that the bar for HLS tools is relatively low, and they are not as good as they could be. Admittedly, there has been significant progress in the academic community to develop open-source HLS tools and integrations with existing tools like Vitis HLS to improve the HLS development workflow. Unfortunately, substantial changes are largely in the hands of companies like Xilinx, Intel, Siemens, Microchip, MathWorks (yes, even Matlab has an HLS tool), and others that produce the "big-name" HLS tools. That said, academia has not given up, and there is considerable ongoing HLS tooling research with collaborations between academia and industry. I hope that one day, some lab will say "enough is enough" and create an open-source, modular HLS compiler in Rust that is easy to extend and contribute to—but that is my personal pipe dream. However, projects like BambuHLS, Dynamatic, MLIR+CIRCT, and XLS (if Google would release more of their hardware design research and tooling) give me some hope.
When it comes to actually using HLS to build hardware designs, I usually suggest it as a first-pass solution to quickly prototype designs for accelerating domain-specific applications. It provides a prototype that is often much faster or more power-efficient than a CPU or GPU solution, which you can implement on an FPGA as proof that a new architectural change has an advantage in a given domain (genomics, high-energy physics, etc.). In this context, it is a great tool for academic researchers. I agree that companies producing cutting-edge chips are probably not using HLS for the majority of their designs. Still, HLS has its niche in FPGA and ASIC design (with Siemens's Catapult being a popular option for ASIC flows). However, the gap between an initial, naive HLS design implementation and one refined by someone with expert HLS knowledge is enormous. This gap is why many of us in academia view the claim that "HLS allows software developers to do hardware development" as somewhat moot (albeit still debatable; there is ongoing work on new DSLs and abstractions for HLS tooling which are quite slick and promising). Because of this gap, unless you have team members or grad students familiar with optimizing and rewriting designs to fully exploit HLS benefits while avoiding the tools' quirks and bugs, you won't see substantial performance gains. All that to say, I don't think it is fair to completely write off HLS as a lost cause or a failure.
Regarding LLMs for Verilog generation and verification, there's an important point missing from the article that I've been considering since around 2020 when the LLM-for-chip-design trend began. A significant divide exists between the capabilities of commercial companies and academia/individuals in leveraging LLMs for hardware design. For example, Nvidia released ChipNeMo, an LLM trained on their internal data, including HDL, tool scripts, and issue/project/QA tracking. This gives Nvidia a considerable advantage over smaller models trained in academia, which have much more limited data in terms of quantity, quality, and diversity. It's frustrating to see companies like Nvidia presenting their LLM research at academic conferences without contributing back meaningful technology or data to the community. While I understand they can't share customer data and must protect their business interests, these closed research efforts and closed collaborations they have with academic groups hinder broader progress and open research. This trend isn't unique to Nvidia; other companies follow similar practices.
On a more optimistic note, there are now strong efforts within the academic community to tackle these problems independently. These efforts include creating high-quality, diverse hardware design datasets for various LLM tasks and training models to perform better on a wider range of HLS-related tasks. As mentioned in the article, there is also exciting work connecting LLMs with the tools themselves, such as using tool feedback to correct design errors and moving towards even more complex and innovative workflows. These include in-the-loop verification, hierarchical generation, and ML-based performance estimation to enable rapid iteration on designs and debugging with a human in the loop. This is one area I'm actively working on, both at the HDL and HLS levels, so I admit my bias toward this direction.
For more references on the latest research in this area, check out the proceedings from the LLM-Aided Design Workshop (now evolving into a conference, ICLAD: https://iclad.ai/), as well as the MLCAD conference (https://mlcad.org/symposium/2024/). Established EDA conferences like DAC and ICCAD have also included sessions and tracks on these topics in recent years. All of this falls within the broader scope of generative AI, which remains a smaller subset of the larger ML4EDA and deep learning for chip design community. However, LLM-aided design research is beginning to break out into its own distinct field, covering a wider range of topics such as LLM-aided design for manufacturing, quantum computing, and biology—areas that the ICLAD conference aims to expand on in future years.
Generative models are bimodal: at certain tasks they are terrible, and at certain tasks they are better than humans. The key is to recognize which is which.
And much more important:
- LLMs can suddenly become more competent when you give them the right tools, just like humans. Ever try to drive a nail without a hammer?
- Models with spatial and physical awareness are coming and will dramatically broaden what’s possible
It’s easy to get stuck on what LLMs are bad at. The art is to apply an LLM’s strengths to your specific problem, often by augmenting the LLM with the right custom tools written in regular code.
I've driven a nail with a rock, a pair of pliers, a wrench, even with a concrete wall and who knows what else!
I didn't need to be told if these can be used to drive a nail, and I looked at things available, looked for a flat surface on them and good grip, considered their hardness, and then simply used them.
So if we only give them the "right" tools, they'll remain very limited, because we won't have thought of all the possible jobs they'd appear to know how to do but actually don't.
The problem is exactly that: they "pretend" to know how to drive a nail but not really.
Please don’t do this, Zach. We need to encourage more investment in the overall EDA market not less. Garry’s pitch is meant for the dreamers, we should all be supportive. It’s a big boat.
Would appreciate the collective energy being spent instead on adding to or refining Garry's request.
The article seems to be based on the current limitations of LLMs. I don't think YC and other VCs are betting on what LLMs can do today, I think they are betting on what they might be able to do in the future.
As we've seen in the recent past, it's difficult to predict what the possibilities are for LLMs and which limitations will hold. Currently it seems pure scaling won't be enough, but I don't think we've reached the limits of synthetic data and reasoning.
>The article seems to be based on the current limitations of LLMs. I don't think YC and other VCs are betting on what LLMs can do today, I think they are betting on what they might be able to do in the future.
Do we know what LLMs will be able to do in the future? And even if we know, the startups have to work with what they have now, until that future comes. The article states that there's not much to work with.
Most successful startups were able to make the thing that they wanted to make, as a startup, with existing tech. It might have a limited market that was expected to become less limited (a web app in 1996, say), but it was possible to make the thing.
This idea of “we’re a startup; we can’t actually make anything useful now, but once the tech we use becomes magic any day now we might be able to make something!” is basically a new phenomenon.
Most? I can list tens of them easily. For example what advancements were required for Slack to be successful? Or Spotify (they got more successful due to smartphones and cheaper bandwidth but the business was solid before that)? Or Shopify?
Slack bet on ubiquitous, continuous internet access. Spotify bet on bandwidth costs falling to effectively zero. Shopify bet on D2C rising because improved search engines, increased internet shopping (itself a result of several tech trends plus demographic changes).
For a counterexample I think I’d look to non-tech companies. OrangeTheory maybe?
The notion of a startup gaining funding to develop a fantasy into reality is relatively new.
It used to be that startups would be created to do something different with existing tech or to commercialise a newly-discovered - but real - innovation.
Tomorrow, LLMs will be able to perform slightly below-average versions of whatever humans are capable of doing tomorrow. Because they work by predicting what a human would produce based on training data.
This severely discounts the fact that you’re comparing a model that _knows the average about everything_ to a single human’s capabilities. Also, they can do it instantly, instead of having to coordinate many humans over long periods of time. You can’t straight up compare one LLM to one human.
it seems that's sufficient to do a lot of things better than the average human - including coding, writing, creating poetry, summarizing and explaining things...
Many professions are far less digital than software, protect IP more, and are much more akin to an apprenticeship system.
2) the adaptability of humans in learning vs any AI
Think about how many years we have been trying to train cars to drive, while humans manage it with a 50-hour training course.
3) humans ability to innovate vs AIs ability to replicate
A lot of creative work is adaptation, but humans do far more than that in synthesizing different ideas to create completely new works. Could an LLM produce the 37th Marvel movie? Yes probably. Could an LLM create.. Inception? Probably not.
Because YCombinator is all about r-selecting startup ideas, and making it back on a few of them generating totally outsized upside.
I think that LLMs are plateauing, but I'm less confident that this necessarily means the capabilities we're using LLMs for right now will also plateau. That is to say it's distinctly possible that all the talent and money sloshing around right now will line up a new breakthrough architecture in time to keep capabilities marching forward at a good pace.
But if I had $100 million, and could bet $200 thousand that someone can make me billions on machine learning chip design or whatever, I'd probably entertain that bet. It's a numbers game.
> But if I had $100 million, and could bet $200 thousand that someone can make me billions on machine learning chip design or whatever, I'd probably entertain that bet. It's a numbers game.
Problem with this reasoning is twofold: start-ups will overfit to getting your money instead of creating real advances; competition amongst them will drive up the investment costs. Pretty much what has been happening.
> I think they are betting on what they might be able to do in the future.
Yeah, blind hope and a bit of smoke and mirrors.
> but I don't think we've reached the limits with synthetic data
Synthetic data, at least for visual stuff, can in some cases provide the majority of training data. For $work, we can have, say, 100k synthetic video sequences to train a model, which can then be fine-tuned on say 2k real videos. That gets it to slightly under the same quality as if it was trained on pure real video.
So I'm not that hopeful that synthetic data will provide a breakthrough.
I think the current architecture of LLMs is the limitation. They are fundamentally a sequence machine and are not capable of short- or medium-term learning. Context windows kinda make up for that, but they don't alter the starting state of the model.
LLMs have a long way to go in the world of EDA.
A few months ago I saw a post on LinkedIn where someone fed the leading LLMs a counter-intuitively drawn circuit with 3 capacitors in parallel and asked what the total capacitance was. Not a single one got it correct: not only did they say the caps were in series (they were not), they even got the series capacitance calculations wrong. I couldn’t believe they whiffed it, so I checked myself, and sure enough I got the same results as the author. I tried all types of prompt magic to get the right answer… no dice.
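For anyone checking at home, the arithmetic the models kept botching is one-liner territory; a minimal sketch (the component values here are made up for illustration, not from the LinkedIn post):

```python
def parallel_capacitance(caps):
    # Capacitors in parallel simply add.
    return sum(caps)

def series_capacitance(caps):
    # Capacitors in series combine like resistors in parallel:
    # 1/C_total = sum(1/C_i)
    return 1.0 / sum(1.0 / c for c in caps)

caps_uF = [10.0, 10.0, 10.0]  # three hypothetical 10 uF caps
print(parallel_capacitance(caps_uF))  # 30.0 (uF)
print(series_capacitance(caps_uF))    # ~3.33 (uF)
```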
I also saw an ad for an AI tool that’s designed to help you understand schematics. In its pitch, it shows what looks like a fairly generic guitar distortion pedal circuit and does manage to correctly identify a capacitor as blocking DC, but it fails to mention that the cap also functions as part of an RC high-pass filter. I chuckled when the voiceover proudly claimed “they didn’t even teach me this in 4 years of Electrical Engineering!” (Really? They don’t teach how capacitors block DC and how RC filters work?)
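For the record, the bit the ad glossed over is textbook first-year material: a series capacitor feeding a resistor to ground forms a first-order high-pass with corner frequency f_c = 1/(2πRC). A quick sketch, using hypothetical pedal-input-stage values (not taken from the ad's schematic):

```python
import math

def rc_cutoff_hz(r_ohms, c_farads):
    # -3 dB corner frequency of a first-order RC filter:
    # f_c = 1 / (2 * pi * R * C)
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

# Hypothetical input stage: 22 nF series cap into a 1 Mohm resistor.
# Corner lands around 7 Hz, i.e. it passes the whole guitar band
# while still blocking DC.
print(round(rc_cutoff_hz(1e6, 22e-9), 1))
```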
If you’re in this space you probably need to compile your own carefully curated codex and train something more specialized. The general purpose ones struggle too much.
I still have nightmares about the entry level EE class I was required to take for a CS degree.
RC circuits man.
I studied mechatronics and did reasonably well... but in any electrical class I would just scrape by. I loved it but was apparently not suited to it. I remember a whole unit basically about transistors. On the software/mtrx side we were so happy treating MOSFETs as digital. Having to analyse them in more depth did my head in.
I had a similar experience, except Mechanical Engineering being my weakest area. Computer Science felt like a children's game compared to fluid dynamics...
“Oh shit I better remember all that matrix algebra I forgot already!”
…Then takes a class on anything with 3d graphics… “oh shit matrix algebra again!”
…then takes a class on machine learning “urg more matrix math!”
EEs actually had a head start on ML, especially those who took signal processing.
How many words are in The Art of Electronics? Could you give that as context and see if it might help?
I don’t mind LLMs in the ideation and learning phases, which aren’t reproducible anyway. But I still find it hard to believe engineers of all people are eager to put a slow, expensive, non-deterministic black box right at the core of extremely complex systems that need to be reliable, inspectable, understandable…
You find it hard to believe that non-deterministic black boxes at the core of complex systems are eager to put non-deterministic black boxes at the core of complex systems?
Yes I do! Is that some sort of gotcha? If I can choose between having a script that queries the db and generates a report and “Dave in marketing” who “has done it for years”, I’m going to pick the script. Who wouldn’t? Until machines can reliably understand, operate and self-correct independently, I’d rather not give up debuggability and understandability.
I think this comment and the parent comment are talking about two different things. One of you is talking about using nondeterministic ML to implement the actual core logic (an automated script or asking Dave to do it manually), and one of you is talking about using it to design the logic (the equivalent of which is writing that automated script).
LLMs are not good at actually doing the processing; they are not good at math or even text processing at a character level. They often get logic wrong. But they are pretty good at looking at patterns and finding creative solutions to new inputs (or at least what can appear creative, even if philosophically it’s more pattern matching than creativity). So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit, and which a standard deterministic computer could just run verbatim to actually do the processing. Eventually maybe even Dave’s proofreading would be superfluous.
Tying this back to the original article, I don’t think anyone is proposing having an LLM inside a chip that processes incoming data in a non-deterministic way. The article is about using AI to design the chips in the first place. But the chips would still be deterministic, the equivalent of the script in this analogy. There are plenty of arguments to make about LLMs not being good enough for that, not being able to follow the logic or optimize it, or come up with novel architectures. But the shape of chip design/Verilog feels like something that, with enough effort, an AI could likely be built to be pretty good at. All of the knowledge that those smart, knowledgeable engineers who are good at writing Verilog have built up can almost certainly be represented in some AI form, and I wouldn’t bet against AI getting to a point where it can be helpful similarly to how Copilot currently is with code completion. Maybe not perfect anytime soon, but good enough that we could eventually see a path to 100%. It doesn’t feel like there’s a fundamental reason this is impossible on a long enough time scale.
> So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit
Right, and there’s nothing fundamentally wrong with this, nor is it a novel method. We’ve been joking about copying code from stack overflow for ages, but at least we didn’t pretend that it’s the peak of human achievement. Ask a teacher the difference between writing an essay and proofreading it.
Look, my entire claim from the beginning is that understanding is important (epistemologically, it may be what separates engineering from alchemy, but I digress). Practically speaking, if we see larger and larger pieces of LLM written code, it will be similar to Dave and his incomprehensible VBA script. It works, but nobody knows why. Don’t get me wrong, this isn’t new at all. It’s an ever-present wet blanket that slowly suffocates engineering ventures who don’t pay attention and actively resist. In that context, uncritically inviting a second wave of monkeys to the nuclear control panels, that’s what baffles me.
> We’ve been joking about copying code from stack overflow for ages
Tangent for a slight pet peeve of mine:
"We" did joke about this, but probably because most of our jobs are not in chip design. "We" also know the limits of this approach.
The fact that Stack Overflow is the most SEO optimised result for "how to center div" (which we always forget how to do) doesn't have any bearing on the times when we have an actual problem requiring our attention and intellect. Say diagnosing a performance issue, negotiating requirements and how they subtly differ in an edge case from the current system behaviour, discovering a shared abstraction in 4 pieces of code that are nearly but not quite the same.
I agree with your posts here, the Stack Overflow thing in general is just a small hobby horse I have.
>If I can choose between having a script that queries the db and generates a report and “Dave in marketing” who “has done it for years”
If you could that would be nice wouldn't it? And if you couldn't?
If people were saying, "let's replace Casio Calculators with interfaces to GPT" then that would be crazy and I would wholly agree with you but by and large, the processes people are scrambling to place LLMs in are ones that typical machines struggle or fail and humans excel or do decently (and that LLMs are making some headway in).
You're making the wrong distinction here. It's not Dave vs your nifty script. It's Dave or nothing at all.
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist.
You compare it to the things it's meant to replace - humans. How well can the LLM do this compared to Dave?
> by and large, the processes people are scrambling to place LLMs in are ones that typical machines struggle or fail
I'm pretty sure they are scrambling to put them absolutely anywhere it might save or make a buck (or convince an investor that it could)
I'm a non-deterministic black box who teaches complex deterministic machines to do stuff and leverages other deterministic machines as tools to do the job.
I like my job.
My job also involves cooperating with other non-deterministic black boxes (colleagues).
I can totally see how artificial non-deterministic black boxes (artificial colleagues) may be useful to replace/augment the biological ones.
For one, artificial colleagues don't get tired and I don't accidentally hurt their feelings or whatnot.
In any case, I'm not looking forward to replacing my deterministic tools with the fuzzy AI stuff.
Intuitively at least it seems to me that these non-deterministic black boxes could really benefit from using the deterministic tools for pretty much the same reasons we do as well.
Yes. One does not have to do with the other.
Can you actually like follow through with this line? I know there are literally tens of thousands of comments just like this at this point, but if you have chance, could you explain what you think this means? What should we take from it? Just unpack it a little bit for us.
An interpretation that makes sense to me: humans are non-deterministic black boxes already at the core of complex systems. So in that sense, replacing a human with AI is not unreasonable.
I’d disagree, though: humans are still easier to predict and understand (and trust) than AI, typically.
With humans we have a decent understanding of what they are capable of. I trust a medical professional to provide me with medical advice and an engineer to provide me with engineering advice. With LLMs, it can be unpredictable at times, and they can make errors in ways you would not imagine. Take the following examples from my tool, which show how GPT-4o and Claude 3.5 Sonnet can screw up.
In this example, GPT-4o cannot tell that GitHub is spelled correctly:
https://app.gitsense.com/?doc=6c9bada92&model=GPT-4o&samples...
In this example, Claude cannot tell that GitHub is spelled correctly:
https://app.gitsense.com/?doc=905f4a9af74c25f&model=Claude+3...
I still believe LLM is a game changer and I'm currently working on what I call a "Yes/No" tool which I believe will make trusting LLMs a lot easier (for certain things of course). The basic idea is the "Yes/No" tool will let you combine models, samples and prompts to come to a Yes or No answer.
Based on what I've seen so far, a model can easily screw up, but it is unlikely that all will screw up at the same time.
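As a rough sketch of the kind of cross-model voting being described (the `ask_fns` wrappers here are hypothetical stand-ins for model calls, not the actual tool's API):

```python
def ensemble_yes_no(question, ask_fns, threshold=0.5):
    """Ask several independent models the same yes/no question
    and return (majority verdict, fraction of yes votes).
    `ask_fns` is a list of callables returning True or False;
    in practice each would wrap a different model/prompt/sample."""
    votes = [ask(question) for ask in ask_fns]
    yes_fraction = sum(votes) / len(votes)
    return yes_fraction > threshold, yes_fraction

# Stand-in "models" for illustration: two say yes, one says no.
models = [lambda q: True, lambda q: True, lambda q: False]
verdict, frac = ensemble_yes_no("Is 'GitHub' spelled correctly?", models)
print(verdict, round(frac, 2))  # True 0.67
```

The premise is exactly the one stated above: any single model can screw up, but it's unlikely that all of them screw up on the same input at the same time.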
It's actually a great topic - both humans and LLMs are black boxes. And both rely on patterns and abstractions that are leaky. And in the end it's a matter of trust, like going to the doctor.
But we have had extensive experience with humans, it is normal to have better defined trust, LLMs will be better understood as well. There is no central understander or truth, that is the interesting part, it's a "Blind men and the elephant" situation.
We are entering the nondeterministic programming era, in my opinion. LLM applications will be designed with the idea that we can't be 100% sure, and whatever solution can provide the most safeguards will probably be the winner.
Because people are not saying "let's replace Casio Calculators with interfaces to GPT!"
By and large, the processes people are scrambling to place LLMs in are ones that typical machines struggle or fail and humans excel or do decently (and that LLMs are making some headway in).
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist. It's nonsensical actually. You compare it to the performance of the beings it's meant to replace or augment - humans.
Replacing non-deterministic black boxes with potentially better performing non-deterministic black boxes is not some crazy idea.
Sure. I mean, humans are very good at building businesses and technologies that are resilient to human fallibility. So when we think of applications where LLMs might replace or augment humans, it’s unsurprising that their fallible nature isn’t a showstopper.
Sure, EDA tools are deterministic, but the humans who apply them are not. Introducing LLMs to these processes is not some radical and scary departure, it’s an iterative evolution.
Ok yeah. I think the thing that trips me up with this argument, then, is just: yes, when you regard humans in a certain neuroscientific frame and consider things like consciousness or language or will, they are fundamentally nondeterministic. But that isn't the frame of mind of the human engineer who does the work or even validates it. When the engineer is working, they aren't seeing themselves as some black box which they must feed input and get output; they are thinking about the things in themselves, justifying their work to themselves and others. Just because you can place yourself in some hypothetical third person here, one that oversees the model and the human and says "huh, yeah, they are pretty much the same, huh?", doesn't actually tell us anything about what's happening on the ground in either case, if you will. At the very least, this same logic would imply fallibility is one-dimensional and always statistical; "the patient may be dead, but at least they got a new heart." Like, isn't it important to be in love, not just be married? To borrow some Kant, shouldn't we still value what we can do when we think as if we aren't just some organic black box machines? Is there even a question there? How could it be otherwise?
It's really just that the "in principle" part of the overall implication of your comment and so many others doesn't make sense. It's very much cutting off your nose to spite your face. How could science itself be possible, much less engineering, if this is how we decided things? If we regarded ourselves always from the outside? How could we even be motivated to debate whether we get the computers to design their own chips? When would something actually happen? At some point, people do have ideas, in a full, if false, transparency to themselves, that they can write down and share and explain. This is not only the thing that has gotten us this far, it is the very essence of why these models are so impressive in the certain ways that they are. It doesn't make sense to argue for the fundamental cheapness of the very thing you are ultimately trying to defend. And it imposes this strange perspective where we are not even living inside our own (phenomenal) minds anymore, where it fundamentally never matters what we think, no matter our justification. It's weird!
I'm sure you have a lot of good points and stuff, I just am simply pointing out that this particular argument is maybe not the strongest.
I took it to be a joke that the description "slow, expensive, non-deterministic black boxes" can apply to the engineers themselves. The engineers would be the ones who would have to place LLMs at the core of the system. To anyone outside, the work of the engineers is as opaque as the operation of LLMs.
In a reductive sense, this passage might as well read "You find it hard to believe that entropy is the source of other entropic reactions?"
No, I'm just disappointed in the decision of Black Box A and am bound to be even more disappointed by Black Box B. If we continue removing thoughtful design from our systems because thoughtlessness is the default, nobody's life will improve.
100% agree. While I can’t find all the sources right now, [1] and its references could be a good starting point for further exploration. I recall there being a proof or conjecture suggesting that it’s impossible to build an "LLM firewall" capable of protecting against all possible prompts, though my memory might be failing me.
[1] https://arxiv.org/abs/2410.07283
You mean, like humans have been for many decades now.
Edit: I believe that LLMs are eminently useful to replace experts (of all people) 90% of the time.
> Edit: I believe that LLMs are eminently useful to replace experts (of all people) 90% of the time.
What do you mean by "expert"?
Do you mean the pundit who goes on TV and says "this policy will be bad for the economy"?
Or do you mean the seasoned developer who you hire to fix your memory leaks? To make your service fast? Or cut your cloud bill from 10M a year to 1M a year?
Experts of the kind that will be able to talk for hours about the academic consensus on the status quo without once considering how the question at hand might challenge it? Quite likely.
Experts capable of critical thinking and reflecting on evidence that contradicts their world model (and thereby retraining it on the fly)? Most likely not, at least not in their current architecture with all its limitations.
Change "replace" to "supplement" and I agree. The level of non-determinism is just too great at this stage, imo.
People believed that about expert systems in the 1980s as well.
I don't know if they "eminently" anything at the moment, thats why you feel the need to make the comment, right?
I know nothing about chip design. But saying "Applying AI to field X won't work, because X is complex, and LLMs currently have subhuman performance at this" always sounds dubious.
VCs are not investing in the current LLM-based systems to improve X, they're investing in a future where LLM based systems will be 100x more performant.
Writing is complex, and LLMs once had subhuman performance there, and yet. Digital art. Music (see Suno.ai). There is a pattern here.
I didn't get into this in the article, but one of the major challenges with achieving superhuman performance on Verilog is the lack of high-quality training data. Most professional-quality Verilog is closed source, so LLMs are generally much worse at writing Verilog than, say, Python. And even still, LLMs are pretty bad at Python!
That's probably where there's a big advantage to being a company like Nvidia, which has both the proprietary chip design knowledge/data and the resources/money and AI/LLM expertise to work on something specialized like this.
I strongly doubt this - they don't have enough training data either - you are confusing (I think) the scale of their success with the amount of Verilog they possess.
I.e., I think you are wildly underestimating the scale of training data needed, and wildly overestimating the amount of Verilog code possessed by Nvidia.
GPUs work by having moderate-complexity cores (in the scheme of things) that are replicated 8000 times or whatever. That does not require having 8000 times as much useful Verilog, of course.
The folks who have 8000 different chips, or 100 chips that each do 1000 things, would probably have orders of magnitude more verilog to use for training
That’s what your VC investment would be buying; the model of “pay experts to create a private training set for fine tuning” is an obvious new business model that is probably under-appreciated.
If that’s the biggest gap, then YC is correct that it’s a good area for a startup to tackle.
AI still has subhuman performance for art. It feels like the venn diagram of people who are bullish on LLMs and people who don't understand logistic curves is a circle.
I like this reasoning. It is shortsighted to say that LLMs aren’t well-suited to something (because we cannot tell the future) but it is not shortsighted to say that LLMs are well-suited to something (because we cannot tell the future)
I kinda suspect that things that are expressed better with symbols and connections than with text will always be a poor fit for large LANGUAGE models. Turning what is basically a graph into a linear stream of text descriptions to tokenize and jam into an LLM has to be an incredibly inefficient and not very performant way of letting “AI” do magic on your circuits.
Ever try to get ChatGPT to play Scrabble? Ever try to describe the board to it and then all the letters available to you? Even its fancy-pants o1-preview performs absolutely horribly. Either my prompting completely sucks or an LLM is just the wrong tool for the job.
It’s great for asking you to score something you just created provided you tell it what bonuses apply to which words and letters. But it has absolutely no concept of the board at all. You cannot use to optimize your next move based on the board and the letters.
… I mean you might if you were extremely verbose about every letter on the board and every available place to put your tiles, perhaps avoiding coordinates and instead describing each word, its neighbors and relationships to bonus squares. But that just highlights how bad a tool an LLM is for scrabble.
Anyway, I’m sure schematics are very similar. Maybe somebody we will invent good machine learning models for such things but an LLM isn’t it.
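To make the serialization problem concrete, here's a toy sketch of what "describing the board" ends up looking like; even a tiny hypothetical grid turns into a verbose linear stream that the model then has to reassemble into spatial relationships token by token (a schematic netlist has the same shape, only worse):

```python
# A tiny 5x5 "board" with one word placed vertically.
board = [
    list("....."),
    list("..C.."),
    list("..A.."),
    list("..T.."),
    list("....."),
]

def board_to_prompt(board):
    """Flatten a 2D grid into the kind of linear description
    you'd have to feed an LLM. Adjacency, which is free to a
    human eye, must be re-derived from coordinate text."""
    parts = []
    for r, row in enumerate(board):
        for c, cell in enumerate(row):
            if cell != ".":
                parts.append(f"Row {r}, column {c}: letter {cell}")
    return "; ".join(parts)

print(board_to_prompt(board))
# Row 1, column 2: letter C; Row 2, column 2: letter A; Row 3, column 2: letter T
```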
> Writing is complex, LLMs once had subhuman performance,
And now they can easily replace mediocre human performance, and since they are tuned to provide answers that appeal to humans, that is especially true for these subjective-value use cases. Chip design doesn't seem very similar. It seems like a case where specifically trained tools would be of assistance. For some things, as much as generalist LLMs have surprised with skill at specific tasks, it is very hard to see how training on a broad corpus of text could outperform specific tools. As for your first paragraph: do you really think it is not dubious to expect a model trained on text to outperform Stockfish at chess?
I worked on the Qualcomm DSP architecture team for a year, so I have a little experience with this area but not a ton.
The author here is missing a few important things about chip design. Most of the time spent and work done is not writing high-performance Verilog. Designers spend a huge amount of time answering questions, writing documentation, copying around boilerplate, reading obscure manuals and diagrams, etc. LLMs can already help with all of those things.
I believe that LLMs in their current state could help design teams move at least twice as fast, and better tools could probably change that number to 4x or 10x even with no improvement in the intelligence of models. Most of the benefit would come from allowing designers to run more experiments and try more things, to get feedback on design choices faster, to spend less time documenting and communicating, and spend less time reading poorly written documentation.
Author here -- I don't disagree! I actually noted this in the article:
> Well, it turns out that LLMs are also pretty valuable when it comes to chips for lucrative markets -- but they won’t be doing most of the design work. LLM copilots for Verilog are, at best, mediocre. But leveraging an LLM to write small snippets of simple code can still save engineers time, and ultimately save their employers money.
I think designers getting 2x faster is probably optimistic, but I also could be wrong about that! Most of my chip design experience has been at smaller companies, with good documentation, where I've been focused on datapath architecture & design, so maybe I'm underestimating how much boilerplate the average engineer deals with.
Regardless, I don't think LLMs will be designing high-performance datapath or networking Verilog anytime soon.
Thanks for the reply!
At large companies with many designers, a lot of time is spent coordinating and planning. LLMs can already help with that.
As far as design/copilot goes, I think there are reasons to be much more optimistic. Existing models haven't seen much Verilog. With better training data it's reasonable to expect that they will improve to perform at least as well on Verilog as they do on python. But even if there is a 10% chance it's reasonable for VCs to invest in these companies.
I’m actually curious if there even is a large enough corpus of Verilog out there. I have noticed that even tools like Copilot tend to perform poorly when working with DSLs that are majority open source code (on GitHub no less!) where the practical application is niche. To put this in other terms, Copilot appears to _specialize_ on languages, libraries and design patterns that have wide adoption, but does not appear to be able to _generalize_ well to previously unseen or rarely seen languages, libraries, or design patterns.
Anyway that’s largely anecdata/sample size of 1, and it could very well be a case of me holding the tool wrong, but that’s what I observed.
YC is just spraying & praying AI, like most investors
Design automation tooling startups have it incredibly hard: first, customers won't buy from startups, and second, the space of possible exits via acquisitions is tiny.
And liable to make money at it, on a "greater fool" basis - a successful sale (exit) is not necessarily a successful, profitable company ...
In the case of YC, their stake is so low that they don't really get any upside unless it's a successful, profitable company.
I agree with most of the technical points of the article.
But there may still be value in YC calling for innovation in that space. The article is correctly showing that there is no easy win in applying LLMs to chip design. Either the market for a given application is too small, then LLMs can help but who cares, or the chip is too important, in which case you'd rather use the best engineers. Unlike software, we're not getting much of a long tail effect in chip design. Taping out a chip is just not something a hacker can do, and even playing with an FPGA has a high cost of entry compared to hacking on your PC.
But if there was an obvious path forward, YC wouldn't need to ask for an innovative approach.
you could say it is the naive arrogance of the beginner mind.
Seen here as well when George Hotz attempted to overthrow the chip companies with his plan for an AI chip https://geohot.github.io/blog/jekyll/update/2021/06/13/a-bre... little realizing the complexity involved. To his credit, he quickly pivoted into being a software and tinybox maker.
> But if there was an obvious path forward
Even obvious can be risky. First it's nice to share the risk, second more investments come with more connections.
As for the LLM boom: I think we'll finally realize that an LLM with algorithms can do much more than just an LLM. 'Algorithms' is probably a bad word here; I mean assisting tools like databases, algorithms, other models. Then only the access API needs to be trained into the LLM instead of the whole dataset, for example.
> But if there was an obvious path forward, YC wouldn't need to ask for an innovative approach.
How many experts do YC have on chip design?
I know several founders who went through YC in the chip design space, so even if the people running YC don't have a chip design background, just like VCs, they learn from hearing pitches of the founders who actually know the space.
The way I read that, I think they're saying hardware acceleration of specific algorithms can be 100 times faster and more efficient than the same algorithm in software on a general purpose processor, and since automated chip design has proven to be a difficult problem space, maybe we should try applying AI there so we can have a lower bar to specialized hardware accelerators for various tasks.
I do not think they mean to say that an AI would be 100 times better at designing chips than a human, I assume this is the engineering tradeoff they refer to. Though I wouldn't fault anyone for being confused, as the wording is painfully awkward and salesy.
That’s my read too, if I’m being generous.
I also think OP is missing the point saying the target applications are too small of a market to be worth pursuing.
They’re too small to pursue any single one as the market cap for a company, but presumably the fictional AI chip startup could pursue many of these smaller markets at once. It would be a long tail play, wouldn’t it?
YC doesn't care whether it "makes sense" to use an LLM to design chips. They're as technically incompetent as any other VC, and their only interest is to pump out dogshit startups in the hopes it gets acquired. Gary Tan doesn't care about "making better chips": he cares about finding a sucker to buy out a shitty, hype-based company for a few billion. An old school investment bank would be perfect.
YC is technically incompetent and isn't about making the world better. Every single one of their words is a lie and hides the real intent: make money.
First, VCs don't get paid when "dogshit startups" get acquired, they get paid when they have true outlier successes. It's the only way to reliably make money in the VC business.
Second, want to give any examples of "shitty, hype-based compan[ies]" (I assume you mean companies with no real revenue traction) getting bought out for "a few billion".
Third, investment banks facilitate sales of assets, they don't buy them themselves.
Maybe sit out the conversation if you don't even know the basics of how VC, startups, or banking work?
They (YC) are interested in the use of LLMs to make the process of designing chips more efficient. Nowhere do they talk about LLMs actually designing chips.
I don't know anything about chip design, but like any area in tech I'm certain there are cumbersome and largely repetitive tasks that can't easily be done by algorithms but can be done with human oversight by LLMs. There's efficiency to be gained here if the designer and operator of the LLM system know what they're doing.
Except that’s now a very standard pitch for technology across basically any industry, and cheapens the whole idea of YC presenting a grand challenge.
Nvidia is trying something similar: https://blogs.nvidia.com/blog/llm-semiconductors-chip-nemo/
I'd want to know about the results of these experiments before casting judgement either way. Generative modeling has actual applications in the 3D printing/mechanical industry.
One of the consistent problems I'm seeing over and over again with LLMs is people forgetting that they're limited by the training data.
Software engineers get hyped when they see the progress in AI coding and immediately begin to extrapolate to other fields—if Copilot can reduce the burden of coding so much, think of all the money we can make selling a similar product to XYZ industries!
The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on. We've spent the last 20+ years writing millions and millions of lines of code that we published on the internet, not to mention answering questions on Stack Overflow (which still has 3x as many answers as all other Stack Exchanges combined [0]), writing technical blogs, sending hundreds of thousands of emails to public mailing lists, and so on.
Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do. Ethics of the mass harvesting aside, it's simply not possible for an LLM to have the same skill level in ${insert industry here} as they do with software, so you can't extrapolate from Copilot to other domains.
[0] https://stackexchange.com/sites?view=list#answers
Yes this is EXACTLY it, and I was discussing this a bit at work (financial services).
In software, we've all self-taught, improved, and posted Q&A all over the web. Plus all the open source code out there. Just mountains and mountains of free training data.
However software is unique in being both well paying and something with freely available, complete information online.
A lot of the rest of the world remains far more closed, almost an apprenticeship system. In my domain, things like company fundamental analysis, algo/quant trading, etc. There are lots of books you can buy from the likes of Dalio, but no real (good) step-by-step research and investment process information online.
Likewise, I'd imagine heavily patented/regulated/IP-bound industries like chip design, drug design, etc. are similarly closed. Maybe companies using an LLM on their own data internally could make something of it, but it's also quite likely there is no 'data' so much as tacit knowledge handed down over time.
Yep, and this is also the reason LLMs could probably work well for a lot more things if we did have the data.
>The problem with this extrapolation is that the software industry is pretty much unique in the amount of information about its inner workings that is publicly available for training on... millions of lines of code that we published on the internet...
> Nearly every other industry (with the possible exception of Law) produces publicly-visible output at a tiny fraction of the rate that we do.
You are correct! There's lots of information available publicly about certain things like code, and writing SQL queries. But other specialized domains don't have the same kind of information trained into the heart of the model.
But importantly, this doesn't mean the LLM can't provide significant value in these other more niche domains. They still can, and I provide this every day in my day job. But it's a lot of work. We (as AI engineers) have to deeply understand the special domain knowledge. The basic process is this:
1. Learn how the subject matter experts do the work.
2. Teach the LLM to do this, using examples, giving it procedures, walking it through the various steps and giving it the guidance and time and space to think. (Multiple prompts, recipes if you will, loops, external memory...)
3. Evaluation, iteration, improvement
4. Scale up to production
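The recipe loop in steps 2-4 can be sketched roughly like this. All names here (`run_recipe`, the stub "model", the toy task) are hypothetical illustrations of the structure described above - expert-derived steps, external memory, and an evaluation gate - not any vendor's API:

```python
def run_recipe(task, steps, call_llm, evaluate, max_iters=8):
    """Walk a model through expert-derived steps, with external memory
    and an evaluation gate, iterating until the output is acceptable."""
    memory = []                              # external memory across passes
    draft = task
    for _ in range(max_iters):
        for step in steps:                   # step 2: guide through each step
            draft = call_llm(step, draft, memory)
            memory.append(draft)
        ok, feedback = evaluate(draft)       # step 3: evaluate
        if ok:
            return draft                     # step 4: ship it
        memory.append(feedback)              # iterate with reviewer feedback
    raise RuntimeError("recipe did not converge")

# Toy demonstration with a deterministic stub standing in for the model:
# each pass "tightens" the draft; accept once it is short enough.
stub = lambda step, draft, memory: draft[: len(draft) // 2] or draft
evaluate = lambda draft: (len(draft) <= 4, "too long, tighten it")

result = run_recipe("a long rambling task description", ["tighten"], stub, evaluate)
assert len(result) <= 4
```

The point is that the domain knowledge lives in `steps` and `evaluate`, which is exactly what you need the subject matter experts for.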
In many domains I work in, it can be very challenging to get past step 1. If I don't know how to do it effectively, I can't guide the LLM through the steps. Consider an example question like "what are the top 5 ways to improve my business" -- the subject matter experts often have difficulty teaching me how to do that. If they don't know how to do it, they can't teach it to me, and I can't teach it to the agent. Another example that will resonate with nerds here is being an effective Dungeons and Dragons DM. But if I actually learn how to do it, and boil it down into repeatable steps, and use GraphRAG, then it becomes another thing entirely. I know this is possible, and expect to see great things in that space, but I estimate it'll take another year or so of development to get it done.
But in many domains, I get access to subject matter experts that can tell me pretty specifically how to succeed in an area. These are the top 5 situations you will see, how you can identify which situation type it is, and what you should do when you see that you are in that kind of situation. In domains like this I can in fact make the agent do awesome work and provide value, even when the information is not in the publicly available training data for the LLM.
There's this thing about knowing a domain area well enough to do the job, but not having enough mastery to teach others how to do the job. You need domain experts that understand the job well enough to teach you how to do it, and you as the AI engineer need enough mastery over the agent to teach it how to do the job as well. Then the magic happens.
When we get AGI, we can proceed past this limitation of needing to know how to do the job ourselves. Until then, this is how we provide impact using agents.
This is why I say that even if LLM technology does not improve any more beyond where it was a year ago, we still have many years worth of untapped potential for AI. It just takes a lot of work, and most engineers today don't understand how to do that work-- principally because they're too busy saying today's technology can't do that work rather than trying to learn how to do it.
> 1. Learn how the subject matter experts do the work.
This will get harder over time, I think, as the low-hanging-fruit domains are picked - the barrier will be people, not technology. Especially if the moat for that domain/company is the very knowledge you are trying to acquire (note: in some industries that's not the moat, and using AI to shed more jobs is a win). Most industries that don't have public workings on the internet share a couple of characteristics that will make Task 1 on your list extremely difficult. The biggest is that every person on the street now knows, through the mainstream news, that it's not great to be a software engineer right now, and most media outlets point straight to "AI". "It sucks to be them," I've heard people say - what was once a profession of respect is now "how long do you think you have? 5 years? What will you do instead?"
This creates massive resistance - and potentially outright lies - when providing AI developers with information: there is a precedent for what happens if you do, and it isn't good for the person/company with the knowledge. Doctors' associations, apprenticeship schemes, and industry bodies I've worked with are all now starting to care a lot more about information security because of "AI", and about proprietary methods of working, lest AI accidentally "train on them". It's definitely boosted the demand for cyber people again around here, as an example.
> You are correct! There's lots of information available publicly about certain things like code, and writing SQL queries. But other specialized domains don't have the same kind of information trained into the heart of the model.
The nightmare of anyone that studied and invested in a skill set, according to most people you would meet. I think most practitioners will be careful to ensure that the lack of data to train on stays that way for as long as possible - even if it eventually gets there, the slower it happens and the more out of date it is, the more useful that person's skill and economic value remains. How many people would have contributed to open source if they knew LLMs were coming, for example? Some may have, but I think there would have been fewer, all else being equal. Maybe quite a bit less code, to the point that AI would have been delayed further - tbh, if Google had known that LLMs could scale to be what they are, they wouldn't have let that "attention" paper be released either, IMO. Anecdotally, even the blue-collar workers I know are now hesitant to let anyone near their methods of working and their craft - survival, family, etc. come first. In the end, after all, work is a means to an end for most people.
Unlike us techies, whom I find at times not to be "rational economic actors", many non-tech professionals don't see AI as an opportunity - they see it as a threat they need to counter. At best they think they need to adopt AI before others do, and make sure no one else has it. "No one wants this, but if you don't do it others will and you will be left behind" is a common statement among people I've chatted to. One person likened it to a nuclear arms race - not a good thing, but if you don't do it you will be under threat later.
> This will get harder I think over time as low hanging fruit domains are picked - the barrier will be people not technology. Especially if the moat for that domain/company is the knowledge you are trying to acquire (NOTE: Some industries that's not their moat and using AI to shed more jobs is a win).
Also consider that there exist quite a lot of subject matter experts who simply are not AI fanboys - not because they are afraid of their job because of AI, but because they consider the whole AI hype to be insanely annoying and infuriating. To get them to work with an AI startup, you will thus have to pay them quite a lot of money.
This is a great article but the main principle at YC is to assume that technology will continue progressing at an exponential rate and then thinking about what it would enable. Their proposals are always assuming the startups will ride some kind of Moore's Law for AI and hardware synthesis is an obvious use case. So the assumption is that in 2 years there will be a successful AI hardware synthesis company and all they're trying to do is get ahead of the curve.
I agree they're probably wrong but this article doesn't actually explain why they're wrong to bet on exponential progress in AI capabilities.
I think the problem with this particular challenge is that it is incredibly non-disruptive to the status quo. There are already hundreds of billions flowing into using LLMs, as well as GPUs, for chip design. Nvidia has of course laid the groundwork with its cuLitho efforts. This kind of research area is very hot in the academic world as well. It's by no means difficult to pitch to a VC. So why should YC back it? I'd love to see YC identifying areas where VC dollars are not flowing. Unfortunately, the other challenges are mostly the same - govtech, civictech, defense tech. These are all areas where VC dollars are now happily flowing since companies like Anduril made it plausible.
As a former chip designer (been 16 years, but looks like tools and our arguments about them haven't changed much), I'm both more and less optimistic than OP:
1. More because fine-tuning with enough good Verilog as data should let the LLMs do better at avoiding mediocre Verilog (existing chip companies have more of this data already though). Plus non-LLM tools will remain, so you can chain those tools to test that the LLM hasn't produced Verilog that synthesizes to a large area, etc
2. Less because when creating more chips for more markets (if that's the interpretation of YC's RFS), the limiting factor will become the cost of using a fab (mask sets cost millions), and then integrating onto a board/system the customer will actually use. A half-solution would be if FPGAs embedded in CPUs/GPUs/SiPs on our existing devices took off
I don't know the space well enough, but I think the missing piece is that YC's investment horizon is typically 10+ years. Not only could LLMs get massively better, but the chip industry could be massively disrupted with the right incentives. My guess is that that is YC's thesis behind the ask.
They want to throw LLMs at everything even if it does not make sense. Same is true for all the AI agent craze: https://medium.com/thoughts-on-machine-learning/langchains-s...
It feels like the entire world has gone crazy.
Even the serious idea that the article thinks could work is throwing the unreliable LLMs at verification! If there's any place you can use something that doesn't work most of the time, I guess it's there.
Only if it fails in the same way. LLMs and the multi-agent approach operate under the assumption that they are programmable agents and each agent is more of a trade off against failure modes. If you can string them together, and if the output is easily verified, it can be a great fit for the problem.
This happens all the time.
Once it was spices. Then poppies. Modern art. The .com craze. Those blockchain ape images. Blockchain. Now LLM.
All of these had a bit of true value and a whole load of bullshit. Eventually the bullshit disappears and the core remains, and the world goes nuts about the next thing.
Exactly. I’ve seen this enough now to appreciate that oft repeated tech adoption curve. It seems like we are in “peak expectations” phase which is immediately followed by the disillusionment and then maturity phase.
If your LLM is producing a proof that can be checked by another program, then there’s nothing wrong with their reliability. It’s just like playing a game whose rules are a logical system.
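A minimal sketch of that generate-and-verify pattern, using CNF-SAT assignments as a stand-in domain: the proposer (an LLM, a human, anything) is untrusted, and only the small deterministic checker needs to be correct. The function names are made up for illustration:

```python
def check_assignment(clauses, assignment):
    """Deterministically verify a CNF-SAT assignment.
    clauses: list of clauses, each a list of signed ints
    (positive = variable must be true, negative = false)."""
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )

# Formula: (x1 or x2) and (not x1 or x2)
clauses = [[1, 2], [-1, 2]]

untrusted_guess = {1: True, 2: True}   # imagine an LLM produced this
assert check_assignment(clauses, untrusted_guess)      # accepted

bad_guess = {1: True, 2: False}
assert not check_assignment(clauses, bad_guess)        # rejected
```

The checker never needs to know *how* the guess was produced, which is why an unreliable generator is fine in front of a reliable verifier.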
This is typical of any hype bubble. Blockchain used to be the answer to everything.
What's after this? Because I really do feel the economy is standing on a cliff right now. I don't see anything after this that can prop stocks up.
That’s because we are still waiting for the 2008 bubble to pop, which was inflated further by the 2020 bubble. It’s going to be bad. People will blame Trump; Harris would be eating the same shit sandwich.
It’s gonna be bad.
The post-quantum age. Companies will go post-quantum.
I think the operators are learning how to hype-edge. You find that sweet spot between promising and 'not just there yet' where you can take lots of investments and iterate forward just enough to keep it going.
It doesn't matter if it can't actually 'get there' as long as people still believe it can.
Come to think about it, a socioeconomic system dependent on population and economic growth is at a fundamental level driven by this balancing act: "We can solve every problem if we just forge ahead and keep enlarging the base of the pyramid - keep reproducing, keep investing, keep expanding the infrastructure".
It's similar in regular programming - LLMs are better at writing test code than actual code. Mostly because it's simpler (P vs NP etc), but I think also because it's less obvious when test code doesn't work.
Replace all asserts with expected == expected and most people won't notice.
LLMs are pretty damn useful for generating tests, getting rid of a lot of tedium, but yeah, it's the same as human-written tests: if you don't check that your test doesn't work when it shouldn't (not the same thing as just writing a second test for that case - both those tests need to fail if you intentionally screw with their separate fixtures), then you shouldn't have too much confidence in your test.
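A minimal illustration of the failure mode mentioned above: the vacuous version passes even against buggy code, while the real assertion catches the bug (toy `add` function, deliberate bug):

```python
def add(a, b):
    return a - b   # deliberate bug

def vacuous_test():
    expected = 5
    # The classic non-test: compares the expectation to itself,
    # so it passes no matter what add() does.
    assert expected == expected

def real_test():
    assert add(2, 3) == 5

vacuous_test()          # passes silently despite the bug

try:
    real_test()
    caught = False
except AssertionError:
    caught = True
assert caught           # the real assertion exposes the bug
```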
If LLMs can generate a test for you, it's because it's a test that you shouldn't need to write. They can't test what is really important, at all.
Some development stacks are extremely underpowered for code verification, so they do patch the design issue. Just like some stacks are underpowered for abstraction and need patching by code generation. Both of those solve an immediate problem, in a haphazard and error-prone way, by adding burden on maintenance and code evolution linearly to how much you use it.
And worse, if you rely too much on them they will lead your software architecture and make that burden superlinear.
Claude wrote the harness and pretty much all of these tests, eg:
https://github.com/williamcotton/search-input-query/blob/mai...
It is a good test suite and it saved me quite a bit of typing!
In fact, Claude did most of the typing for the entire project:
https://github.com/williamcotton/search-input-query
BTW, I obviously didn't just type "make a lexer and multi-pass parser that returns multiple errors and then make a single-line instance of a Monaco editor with error reporting, type checking, syntax highlighting and tab completion".
I put it together piece-by-piece and with detailed architectural guidance.
> Replace all asserts with expected == expected and most people won't notice.
It’s too resource intensive for all code, but mutation testing is pretty good at finding these sorts of tests that never fail. https://pitest.org/
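A toy sketch of the mutation-testing idea that tools like pitest automate at scale: mutate the code under test and check that the suite fails on each mutant. A mutant that survives every run means the tests aren't asserting anything real. The tiny suite here is hypothetical:

```python
import operator

def run_tests(add):
    """A tiny test suite parameterized over the function under test."""
    try:
        assert add(2, 3) == 5
        assert add(-1, 1) == 0
        return True    # suite passed
    except AssertionError:
        return False   # suite caught the mutant

original = operator.add
# Mutants: common operator/off-by-one mutations of the original.
mutants = [operator.sub, operator.mul, lambda a, b: a + b + 1]

killed = sum(not run_tests(m) for m in mutants)
assert run_tests(original)       # suite passes on the real code
assert killed == len(mutants)    # every mutant is caught -> tests have teeth
```

A suite of `expected == expected` asserts would kill zero mutants, which is exactly the signal mutation testing surfaces.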
> Replace all asserts with expected == expected and most people won't notice.
Those tests were very common back when I used to work in Ruby on Rails and automatically generating test stubs was a popular practice. These stubs were often just converted into expected == expected tests so that they passed and then left like that.
I mean, define ‘better’. Even with actual human programmers, tests which do not in fact test the thing are already a bit of an epidemic. A test which doesn’t test is worse than useless.
> They want to throw LLMs at everything [..]
Oh yes.
I had a discussion with a manager at a client last week and was trying to run him through some (technical) issues relating to challenges an important project faces.
His immediate response was that maybe we should just let ChatGPT help us decide the best option. I had to bite my tongue.
OTOH, I'm more and more convinced that ChatGPT will replace managers long before it replaces technical staff.
Yes, that's how we progress. This is how the internet boom happened as well: everything became .com, then the real workable businesses were left and all the unworkable things were gone.
Recently I came across someone advertising an LLM to generate fashion magazine shoots in Pakistan at 20-25% of the cost. It hit me then that they are undercutting fashion shoots in a country like Pakistan, which are already 90-95% cheaper than in most Western countries. This AI is replacing the work of 10-20 people.
The annoying part, a lot of money could be funneled into these unworkable businesses in the process, crypto being a good example. And these unworkable businesses tend to try to continue getting their way into the money somehow regardless. Most recent example was funneling money from Russia into Trump’s campaign.
> The annoying part, a lot of money could be funneled into these unworkable businesses in the process, crypto being a good example
There was a thread here about why ycombinator invests into several competing startups. The answer is success is often more about connections and politics than the product itself. And crypto, yes, is a good example of this. Musk will get his $1B in bitcoins back for sure.
> Most recent example was funneling money from Russia into Trump’s campaign.
Musk again?
It really feels like we’re close to the end of the current bubble now; the applications being trotted out are just increasingly absurd.
https://archive.ph/dLp6t
LLMs have powered products used by hundreds of millions, maybe billions. Most experiments will fail and that's okay, arguably even a good thing. Only time will tell which ones succeed
This makes complete sense from an investor’s perspective, as it increases the chances of a successful exit. While we focus on the technical merits or critique here on HN/YC, investors are playing a completely different game.
To be a bit acerbic, and inspired by Arthur C. Clarke, I might say: "Any sufficiently complex business could be indistinguishable from Theranos".
Theranos was not a "complex business". It was deliberate fraud and deception, with investors that were just gullible. The investors should have demanded to see concrete results.
I expected you to take this with a grain of salt but also to read between the lines: while some projects involve deliberate fraud, others may simply lack coherence and inadvertently follow the principles of the greater fool theory [1]. The use of ambiguous or indistinguishable language often blurs the distinction, making it harder to differentiate outright deception from an unsound business model.
[1] https://en.wikipedia.org/wiki/Greater_fool_theory
Isn't that the case with every new tech. There was a time in which people tried to cook everything in a microwave
When did OpenMicroWave promise to solve every societal problem if we just gave it enough money to build a larger microwave oven?
Microwave sellers did not become trillion dollar companies off that hype
Mostly because the marginal cost of microwaves was not close to zero.
Mostly because they were not making claims that sentient microwaves that would cook your food for you were just around the corner which then the most respected media outlets parroted uncritically.
Even rice cookers started doing this by advertising "fuzzy logic".
Fuzzy logic rice cookers are the result of an unrelated fad in 1990s Japanese engineering companies. They added fuzzy controls to everything from cameras to subways to home appliances. It's not part of the current ML fad.
I mean, they were at one point making pretty extravagant claims about microwaves, but to a less credulous audience. Trouble with LLMs is that they look like magic if you don’t look too hard, particularly to laypeople. It’s far easier to buy into a narrative that they actually _are_ magic, or will become so.
I feel like what makes this a bit different from just regular old sufficiently advanced technology is the combination of two things:
- LLMs are extremely competent at surface-level pattern matching and manipulation of the type we'd previously assumed that only AGI would be able to do.
- A large fraction of tasks (and by extension jobs) that we used to, and largely still do, consider to be "knowledge work", i.e. requiring a high level of skill and intelligence, are in fact surface-level pattern matching and manipulation.
Reconciling these facts raises some uncomfortable implications, and calling LLMs "actually intelligent" lets us avoid these.
> I knew it was bullshit from the get-go as soon as I read their definition of AI agents.
That is one spicy article, it got a few laughs out of me. I must agree 100% that Langchain is an abomination, both their APIs as well as their marketing.
please dont post a link that is behind a paywall !!
https://archive.is/dLp6t
It is a registration wall I think.
Same result. Information locks are verboten.
As annoying as I find them, on this site they're in fact not: https://news.ycombinator.com/item?id=10178989
Please don't complain about paywalls: https://news.ycombinator.com/item?id=10178989
When I think of AI in chip design, optimizations like these come to mind,
https://optics.ansys.com/hc/en-us/articles/360042305274-Inve...
https://optics.ansys.com/hc/en-us/articles/33690448941587-In...
This is not my domain so my knowledge is limited, but I wonder if chip designers have some sort of standard library of ready-to-use components. Do you have to design e.g. an ALU every time you design a new CPU, or is there some standard component to use? I think having proven components that can be glued together at a higher level may be the key to productivity here.
Returning to LLMs: I think the problem here may be that there is simply not enough learning material. Verilog, compared to C, is a niche with little documentation and even less open source code. If open hardware were more popular, I think LLMs could learn to write better Verilog. Maybe the key is to persuade hardware companies to share their closed source code to train LLMs for the industry's benefit?
There are component libraries, though they're usually much lower level than an ALU. For example Synopsys Designware:
https://www.synopsys.com/dw/buildingblock.php
The most common thing you see shared is something called IP which does mean intellectual property, but in this context you can think of it like buying ICs that you integrate into your design (ie you wire them up). You can also get Verilog, but that is usually used for verification instead of taping out the peripheral. This is because the company you buy the IP from will tape out the design for a specific node in order to guarantee the specifications. Examples of this would be everything from arm cores to uart and spi controllers as well as pretty much anything you could buy as a standalone IC.
Or learning through self-play. Chip design sounds like an area where (this would be hard!) a sufficiently powerful simulator and/or FPGA could allow reinforcement learning to work.
Current LLMs can’t do it, but the assumption that that’s what YC meant seems wildly premature.
Software folk underestimating hardware? Surely not.
I did my PhD on trying to use ML for EDA (de novo design/topology generation, because deepmind was doing placement and I was not gonna compete with them as a single EE grad who self taught ML/optimization theory during the PhD).
In my opinion, part of the problem is that training data is scarce (real-world designs are literally called "IP" in the industry, after all...), but more than that, circuit design is basically program synthesis, which means it's _hard_. Even if you try to be clever, dealing with graphs and designing discrete objects involves many APX-hard/APX-complete problems, which is _FUN_ on the one hand, but also means it's tricky to just scale through when the object you are trying to produce is a design that can cost millions if there's a bug...
The whole concept of "request for startup" is entirely misguided imo.
YC did well because they were good at picking ideas, not generating them.
I disagree with most of the reasoning here, and think this post misunderstands the opportunity and economic reasoning at play here.
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers.
This is very obviously not the intent of the passage the author quotes. They are clearly talking about the speedup that can be gained from ASICs for a specific workload, eg dedicated mining chips.
> High-level synthesis, or HLS, was born in 1998, when Forte Design Systems was founded
This sort of historical argument is akin to arguing “AI was bad in the 90s, look at Eliza”. So what? LLMs are orders of magnitude more capable now.
> Ultimately, while HLS makes designers more productive, it reduces the performance of the designs they make. And if you’re designing high-value chips in a crowded market, like AI accelerators, performance is one of the major metrics you’re expected to compete on.
This is the crux of the author's misunderstanding.
Here is the basic economics explanation: creating an ASIC for a specific use is normally cost-prohibitive because the cost of the inputs (chip design) is much higher than the outputs (performance gains) are worth.
If you can make ASIC design cheaper on the margin, and even if the designs are inferior to what an expert human could create, then you can unlock a lot of value. Think of all the places an ASIC could add value if the design was 10x or 100x cheaper, even if the perf gains were reduced from 100x to 10x.
The analogous argument is “LLMs make it easier for non-programmers to author web apps. The code quality is clearly worse than what a software engineer would produce but the benefits massively outweigh, as many domain experts can now author their own web apps where it wouldn’t be cost-effective to hire a software engineer.”
> If Gary Tan and YC believe that LLMs will be able to design chips 100x better than humans currently can, they’re significantly underestimating the difficulty of chip design, and the expertise of chip designers. While LLMs are capable of writing functional Verilog sometimes, their performance is still subhuman. [...] LLMs primarily pump out mediocre Verilog code.
What is the quality of Verilog code output by humans? Is it good enough so that a complex AI chip can be created? Or does the human need to use tools in order to generate this code?
I've got the feeling that LLMs will eventually be capable of doing everything a human can do, in terms of thinking. But there shouldn't be an expectation that an LLM does everything at once - which, in this context, would mean reasoning about the chip and producing the final files in a single pass without external help. And by external help I don't mean us humans, but specialized tools that also generate additional data (like embeddings) which the LLM (or another LLM) can use in the next pass to evaluate the design. Once we humans have spent enough time creating these tools, there will come a time when LLMs can create improved versions of them too.
I mean, when I once randomly checked the content of a file in The Pile, I found a Craigslist "ad" for an escort offering her services. No chip-generating AI needs to have this in its parameters in order to do its job. So there is a lot of room for improvement, and this improvement will come over time. Such an LLM doesn't need to know that much about humans.
hi, this is my article! thanks so much for the views, upvotes, and comments! :)
If cryptocurrency mining could be significantly optimized (one of the example goals in the article) wouldn't that just destroy the value of said currency?
No, they all have difficulty-adjustment algorithms.
https://en.bitcoin.it/wiki/Difficulty
The bottleneck for LLMs is fast, large memory, not compute power.
Whoever is recommending investing in better chip (ALU) design hasn't done even a basic analysis of the problem.
Tokens per second = memory bandwidth divided by model size.
I think this whole article is predicated on misinterpreting the ask. It wasn't for the chip to take 100x less power; it was for the algorithm the chip implements. Modern synthesis tools and optimisers extensively look for design patterns, the same way software compilers do - that's why there are recommended inference patterns. I think it's not unreasonable to expect an LLM to expand the capture range of these patterns to cover suboptimal HDL. As a simple example, maybe a designer got really turned around and is doing some crazy math, and the LLM can go "uh, that's just addition my guy, I'll fix that for you."
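A toy version of such a rewrite, in Python rather than HDL: recognize that a convoluted bit-twiddling loop is just addition, and prove the proposed rewrite by exhaustive check over a narrow datapath width (feasible for 8-bit operands; the function names are made up for illustration):

```python
def convoluted(a, b, width=8):
    """Carry-free 'add' built from XOR and shifted AND, iterated -
    the kind of roundabout logic a tool might flag as plain addition."""
    mask = (1 << width) - 1
    while b:
        carry = (a & b) << 1
        a, b = (a ^ b) & mask, carry & mask
    return a

def simple(a, b, width=8):
    """The rewrite a pattern matcher would propose."""
    return (a + b) & ((1 << width) - 1)

# Exhaustive equivalence check over the full 8-bit operand space,
# analogous to how a rewrite rule gets validated before being applied.
assert all(
    convoluted(a, b) == simple(a, b)
    for a in range(256) for b in range(256)
)
```

For wider datapaths an exhaustive check stops being feasible and you'd reach for formal equivalence checking, which is exactly the kind of non-LLM tool you'd want gating any LLM-proposed rewrite.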
Was surprised this comment was this far down. I re-read the YC ask three times to make sure I wasn’t crazy. Dude wrote the whole article based on a misunderstanding.
Thanks... I had more points earlier but I guess people changed their mind and decided they liked it better his way idk
Thank god humans are superior at chip design, especially when you have dozens of billions of dollars behind you, just like Intel. Oh wait.
This heavily overlaps with my current research focus for my Ph.D., so I wanted to provide some additional perspective to the article. I have worked with Vitis HLS and other HLS tools in the past to build deep learning hardware accelerators. Currently, I am exploring deep learning for design automation and using large language models (LLMs) for hardware design, including leveraging LLMs to write HLS code. I can also offer some insight from the academic perspective.
First, I agree that the bar for HLS tools is relatively low, and they are not as good as they could be. Admittedly, there has been significant progress in the academic community to develop open-source HLS tools and integrations with existing tools like Vitis HLS to improve the HLS development workflow. Unfortunately, substantial changes are largely in the hands of companies like Xilinx, Intel, Siemens, Microchip, MathWorks (yes, even Matlab has an HLS tool), and others that produce the "big-name" HLS tools. That said, academia has not given up, and there is considerable ongoing HLS tooling research with collaborations between academia and industry. I hope that one day, some lab will say "enough is enough" and create an open-source, modular HLS compiler in Rust that is easy to extend and contribute to—but that is my personal pipe dream. However, projects like BambuHLS, Dynamatic, MLIR+CIRCT, and XLS (if Google would release more of their hardware design research and tooling) give me some hope.
When it comes to actually using HLS to build hardware designs, I usually suggest it as a first-pass solution to quickly prototype designs for accelerating domain-specific applications. It provides a prototype that is often much faster or more power-efficient than a CPU or GPU solution, which you can implement on an FPGA as proof that a new architectural change has an advantage in a given domain (genomics, high-energy physics, etc.). In this context, it is a great tool for academic researchers. I agree that companies producing cutting-edge chips are probably not using HLS for the majority of their designs. Still, HLS has its niche in FPGA and ASIC design (with Siemens's Catapult being a popular option for ASIC flows). However, the gap between an initial, naive HLS design implementation and one refined by someone with expert HLS knowledge is enormous. This gap is why many of us in academia view the claim that "HLS allows software developers to do hardware development" as somewhat moot (albeit still debatable—there is ongoing work on new DSLs and abstractions for HLS tooling which are quite slick and promising). Because of this gap, unless you have team members or grad students familiar with optimizing and rewriting designs to fully exploit HLS benefits while avoiding the tools' quirks and bugs, you won't see substantial performance gains. All that to say, I don't think it is fair to completely write off HLS as a lost cause or a failure.
Regarding LLMs for Verilog generation and verification, there's an important point missing from the article that I've been considering since around 2020 when the LLM-for-chip-design trend began. A significant divide exists between the capabilities of commercial companies and academia/individuals in leveraging LLMs for hardware design. For example, Nvidia released ChipNeMo, an LLM trained on their internal data, including HDL, tool scripts, and issue/project/QA tracking. This gives Nvidia a considerable advantage over smaller models trained in academia, which have much more limited data in terms of quantity, quality, and diversity. It's frustrating to see companies like Nvidia presenting their LLM research at academic conferences without contributing back meaningful technology or data to the community. While I understand they can't share customer data and must protect their business interests, these closed research efforts and closed collaborations they have with academic groups hinder broader progress and open research. This trend isn't unique to Nvidia; other companies follow similar practices.
On a more optimistic note, there are now strong efforts within the academic community to tackle these problems independently. These efforts include creating high-quality, diverse hardware design datasets for various LLM tasks and training models to perform better on a wider range of HLS-related tasks. As mentioned in the article, there is also exciting work connecting LLMs with the tools themselves, such as using tool feedback to correct design errors and moving towards even more complex and innovative workflows. These include in-the-loop verification, hierarchical generation, and ML-based performance estimation to enable rapid iteration on designs and debugging with a human in the loop. This is one area I'm actively working on, both at the HDL and HLS levels, so I admit my bias toward this direction.
For more references on the latest research in this area, check out the proceedings from the LLM-Aided Design Workshop (now evolving into a conference, ICLAD: https://iclad.ai/), as well as the MLCAD conference (https://mlcad.org/symposium/2024/). Established EDA conferences like DAC and ICCAD have also included sessions and tracks on these topics in recent years. All of this falls within the broader scope of generative AI, which remains a smaller subset of the larger ML4EDA and deep learning for chip design community. However, LLM-aided design research is beginning to break out into its own distinct field, covering a wider range of topics such as LLM-aided design for manufacturing, quantum computing, and biology—areas that the ICLAD conference aims to expand on in future years.
IDK about LLMs there either.
A non-LLM monte carlo AI approach: "Pushing the Limits of Machine Design: Automated CPU Design with AI" (2023) https://arxiv.org/abs/2306.12456 .. https://news.ycombinator.com/item?id=36565671
A useful target for whichever approach is most efficient at IP-feasible design:
From https://news.ycombinator.com/item?id=41322134 :
> "Ask HN: How much would it cost to build a RISC CPU out of carbon?" (2024) https://news.ycombinator.com/item?id=41153490
Generative models are bimodal - in certain tasks they are crazy terrible, and in certain tasks they are better than humans. The key is to recognize which is which.
And much more important:
- LLMs can suddenly become more competent when you give them the right tools, just like humans. Ever try to drive a nail without a hammer?
- Models with spatial and physical awareness are coming and will dramatically broaden what’s possible
It’s easy to get stuck on what LLMs are bad at. The art is to apply an LLM’s strengths to your specific problem, often by augmenting the LLM with the right custom tools written in regular code.
> Ever try to drive a nail without a hammer?
I've driven a nail with a rock, a pair of pliers, a wrench, even with a concrete wall and who knows what else!
I didn't need to be told whether these could be used to drive a nail; I looked at what was available, looked for a flat surface and a good grip, considered hardness, and then simply used them.
So if we only give them the "right" tools, they'll remain limited by what we think of; they'll appear to know how to do jobs that they actually can't.
The problem is exactly that: they "pretend" to know how to drive a nail but not really.
Those are all tools!! Congratulations.
If you’re creative enough to figure out different tools for humans, you are creative enough to figure out different tools for LLMs
The "naive", all-or-nothing view on LLM technology is, frankly, more tiring than the hype.
Please don’t do this, Zach. We need to encourage more investment in the overall EDA market not less. Garry’s pitch is meant for the dreamers, we should all be supportive. It’s a big boat.
Would appreciate the collective energy being spent instead towards adding to or refining Garry’s request.
Had to nop out at "just next token prediction". This article isn't worth your time.
The article seems to be based on the current limitations of LLMs. I don't think YC and other VCs are betting on what LLMs can do today; I think they are betting on what they might be able to do in the future.
As we've seen in the recent past, it's difficult to predict what the possibilities are for LLMs and which limitations will hold. Currently it seems pure scaling won't be enough, but I don't think we've reached the limits with synthetic data and reasoning.
>The article seems to be based on the current limitations of LLMs. I don't think YC and other VCs are betting on what LLMs can do today, I think they are betting on what they might be able to do in the future.
Do we know what LLMs will be able to do in the future? And even if we know, the startups have to work with what they have now, until that future comes. The article states that there's not much to work with.
Show me a successful startup that was predicated on the tech they’re working with not advancing?
Most successful startups were able to make the thing that they wanted to make, as a startup, with existing tech. It might have a limited market that was expected to become less limited (a web app in 1996, say), but it was possible to make the thing.
This idea of “we’re a startup; we can’t actually make anything useful now, but once the tech we use becomes magic any day now we might be able to make something!” is basically a new phenomenon.
Most? I can list tens of them easily. For example what advancements were required for Slack to be successful? Or Spotify (they got more successful due to smartphones and cheaper bandwidth but the business was solid before that)? Or Shopify?
Slack bet on ubiquitous, continuous internet access. Spotify bet on bandwidth costs falling to effectively zero. Shopify bet on D2C rising because improved search engines, increased internet shopping (itself a result of several tech trends plus demographic changes).
For a counterexample I think I’d look to non-tech companies. OrangeTheory maybe?
The notion of a startup gaining funding to develop a fantasy into reality is relatively new.
It used to be that startups would be created to do something different with existing tech or to commercialise a newly-discovered - but real - innovation.
Every single software service that has ever provided an Android or iOS application, for starters.
Tomorrow, LLMs will be able to perform slightly below-average versions of whatever humans are capable of doing tomorrow. Because they work by predicting what a human would produce based on training data.
This severely discounts the fact that you’re comparing a model that _knows the average about everything_ to a single human’s capability. Also they can do it instantly, instead of having to coordinate many humans over long periods of time. You can’t straight up compare one LLM to one human.
"Knows the average relationship amongst all words in the training data" ftfy
it seems that's sufficient to do a lot of things better than the average human - including coding, writing, creating poetry, summarizing and explaining things...
A human specialized in any of those things vastly outperforms the average human, let alone an LLM.
It's worth considering
1) all the domains where there is no training data
Many professions are far less digital than software, protect IP more, and are much more akin to an apprenticeship system.
2) the adaptability of humans in learning vs any AI
Think about how many years we have been trying to train cars to drive, but humans do it with a 50-hour training course.
3) humans ability to innovate vs AIs ability to replicate
A lot of creative work is adaptation, but humans do far more than that in synthesizing different ideas to create completely new works. Could an LLM produce the 37th Marvel movie? Yes, probably. Could an LLM create... Inception? Probably not.
You could replace “LLM” in your comment with lots of other technologies. Why bet on LLMs in particular to escape their limitations in the near term?
Because YCombinator is all about r-selecting startup ideas, and making it back on a few of them generating totally outsized upside.
I think that LLMs are plateauing, but I'm less confident that this necessarily means the capabilities we're using LLMs for right now will also plateau. That is to say it's distinctly possible that all the talent and money sloshing around right now will line up a new breakthrough architecture in time to keep capabilities marching forward at a good pace.
But if I had $100 million, and could bet $200 thousand that someone can make me billions on machine learning chip design or whatever, I'd probably entertain that bet. It's a numbers game.
> But if I had $100 million, and could bet $200 thousand that someone can make me billions on machine learning chip design or whatever, I'd probably entertain that bet. It's a numbers game.
Problem with this reasoning is twofold: start-ups will overfit to getting your money instead of creating real advances; competition amongst them will drive up the investment costs. Pretty much what has been happening.
> I think they are betting on what they might be able to do in the future.
Yeah, blind hope and a bit of smoke and mirrors.
> but I don't think we've reached the limits with synthetic data
Synthetic data, at least for visual stuff, can in some cases provide the majority of training data. For $work, we can have, say, 100k video sequences to train a model, which can then be fine-tuned on, say, 2k real videos. That gets it to slightly under the same quality as if it were trained on purely real video.
So I'm not that hopeful that synthetic data will provide a breakthrough.
I think the current architecture of LLMs is the limitation. They are fundamentally sequence machines and are not capable of short- or medium-term learning. Context windows kinda make up for that, but they don't alter the starting state of the model.