Making LeCun report to Wang was the most boneheaded move imaginable. But… I suppose Zuckerberg knows what he wants, which is AI slopware and not truly groundbreaking foundation models.
In industry research, someone in a chief position like LeCun should know how to balance long-term research with short-term projects. However, for whatever reason, he consistently shows hostility toward LLMs and engineering projects, even though Llama and PyTorch are two of the most influential projects from Meta AI. His attitude doesn’t really match what is expected from a Chief position at a product company like Facebook. When Llama 4 got criticized, he distanced himself from the project, stating that he only leads FAIR and that the project falls under a different organization. That kind of attitude doesn’t seem suitable for the face of AI at the company. It's not a surprise that Zuck tried to demote him.
I would pose the question differently: under his leadership, did Meta achieve good outcomes?
If the answer is yes, then it's better to keep him, because he has already proved himself and you can win in the long term. With Meta's pockets, you can always create a new department specifically for short-term projects.
If the answer is no, then nothing to discuss here.
I believe the fact that Chinese models are beating the crap out of Llama means it's a huge no.
Zuck did this on purpose, humiliating LeCun so he would leave. Despite being proved wrong about LLM capabilities such as reasoning, LeCun remained extremely negative, not exactly inspiring leadership for the Meta AI team. He had to go.
That was obviously him getting sidelined. And it's easy to see why.
LLMs get results. None of Yann LeCun's pet projects do. He had ample time to prove that his approach is promising, and he didn't.
He is also not very interested in LLMs, and that seems to be Zuck's top priority.
Yeah, I think LeCun is underestimating the impact that LLMs and Diffusion models are going to have, even considering the huge impact they're already having. That's no problem, as I'm sure whatever LeCun is working on is going to be amazing as well, but an enterprise like Facebook can't have their top researcher work on risky things when there are surefire paths to success still available.
I politely disagree - it is exactly an industry researcher's purpose to do the risky things that may not work, simply because the rest of the corporation cannot take such risks but must walk on more well-trodden paths.
Corporate R&D teams are there to absorb risk, innovate, disrupt, create new fields, not for doing small incremental improvements. "If we know it works, it's not research." (Albert Einstein)
I also agree with LeCun that LLMs in their current form are a dead end. Note that this does not mean that I think we have already exploited LLMs to the limit; we are still at the beginning. We also need to create an ecosystem in which they can operate well: for instance, to combine LLMs with Web agents better we need a scalable "C2B2C" (customer delegated to business to business) micropayment infrastructure, because these systems have already begun talking to each other, and in the longer run nobody will offer their APIs for free.
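To make the micropayment idea a bit more concrete, here is a minimal sketch of what a delegated agent-to-API payment handshake might look like. Everything in it is assumed for illustration: the 402 flow, the X-Payment-Receipt header, and the MicropaymentWallet client are invented, not an existing standard.

    # Hypothetical sketch of an agent calling a paid API on a user's behalf.
    # The 402 handshake, the X-Payment-Receipt header, and MicropaymentWallet
    # are all invented for illustration; no such standard exists today.
    import requests

    class MicropaymentWallet:
        """Toy stand-in for a wallet the user has delegated to their agent."""
        def __init__(self, budget_cents):
            self.budget_cents = budget_cents

        def pay(self, invoice_id, amount_cents):
            if amount_cents > self.budget_cents:
                raise RuntimeError("over budget, escalate to the user")
            self.budget_cents -= amount_cents
            return f"receipt-{invoice_id}"  # would be a signed receipt in practice

    def call_paid_api(url, wallet):
        resp = requests.get(url)
        if resp.status_code == 402:  # "Payment Required"
            invoice = resp.json()
            receipt = wallet.pay(invoice["id"], invoice["amount_cents"])
            resp = requests.get(url, headers={"X-Payment-Receipt": receipt})
        resp.raise_for_status()
        return resp.json()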
I work on spatial/geographic models, inter alia, which by coincidence is one of the directions mentioned in the LeCun article. I do not know what his reasoning is, but mine was/is: LMs are language models, and should (only) be used as such. We need other models - in particular a knowledge model (KM/KB) to cleanly separate knowledge from text generation - right now it looks to me like only that will solve hallucination.
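One way to read "cleanly separate knowledge from text generation" is a pipeline like the toy sketch below, where facts come only from the knowledge model and the language model merely verbalizes them. The knowledge_base dict and the llm_paraphrase stand-in are invented placeholders, not a real system.

    # Toy sketch: knowledge retrieval is separated from text generation.
    # `knowledge_base` and `llm_paraphrase` are invented placeholders; the point
    # is only that facts come from the KB and the LM just verbalizes them.
    knowledge_base = {
        ("Eiffel Tower", "height_m"): 330,
        ("Eiffel Tower", "city"): "Paris",
    }

    def lookup(entity, relation):
        """Return a fact from the knowledge model, or None; never guess."""
        return knowledge_base.get((entity, relation))

    def llm_paraphrase(entity, relation, value):
        # Stand-in for a language model constrained to verbalize the given fact.
        return f"The {relation.replace('_', ' ')} of the {entity} is {value}."

    def answer(entity, relation):
        value = lookup(entity, relation)
        if value is None:
            return "I don't know."  # refuse instead of hallucinating
        return llm_paraphrase(entity, relation, value)

    print(answer("Eiffel Tower", "height_m"))
    # -> The height m of the Eiffel Tower is 330.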
Knowledge models, like ontologies, always seem suspect to me; like they promise a schema for crisp binary facts, when the world is full of probabilistic and fuzzy information loosely categorized by fallible humans based on an ever slowly shifting social consensus.
Everything from the sorites paradox to leaky abstractions; everything real defies precise definition when you look closely at it, and when you try to abstract over it, to chunk up, the details have an annoying way of making themselves visible again.
You can get purity in mathematical models, and in information systems, but those imperfectly model the world and continually need to be updated, refactored, and rewritten as they decay and diverge from reality.
These things are best used as tools by something similar to LLMs, models to be used, built and discarded as needed, but never a ground source of truth.
> it is exactly a researcher's purpose to do the risky things that may not work
Maybe at a university, but not at a trillion-dollar company. The chief scientist's job there is to lead risky things that will work, to please the shareholders.
They knew what Yann LeCun was when they hired him. If anything, those brilliant academics who have done what they're told and loyally pursued corporate objectives the way the corporation wanted (e.g. Karpathy when he was at Tesla) haven't had great success either.
LLMs and Diffusion solve a completely different problem than world models.
If you want to predict future text, you use an LLM. If you want to predict future frames in a video, you go with Diffusion. But what both of them lack is object permanence. If a car isn't visible in the input frame, it won't be visible in the output. But in the real world, there are A LOT of things that are invisible (image) or not mentioned but only implied (text) that still strongly affect the future. Every kid knows that when you roll a marble behind your hand, it'll come out on the other side. But LLMs and Diffusion models routinely fail to predict that, as for them the object disappears when it stops being visible.
Based on what I heard from others, world models are considered the missing ingredient for useful robots and self-driving cars. If that's halfway accurate, it would make sense to pour A LOT of money into world models, because they will unlock high-value products.
Sure, if you only consider the model they have no object permanence. However you can just put your model in a loop, and feed the previous frame into the next frame. This is what LLM agent engineers do with their context histories, and it's probably also what the diffusion engineers do with their video models.
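In code, "put your model in a loop" is roughly the sketch below; predict_next is a stand-in for a single LLM or diffusion step, and the point is just that memory lives in the loop, not in the model.

    # Sketch of the "model in a loop" idea: the model call is stateless, but the
    # loop carries state forward by appending each output to the context.
    # `predict_next` is an invented stand-in for one LLM or diffusion step.
    def predict_next(context):
        """Placeholder for a single model call: context in, next item out."""
        return f"frame_or_token_{len(context)}"

    def rollout(initial_context, steps):
        context = list(initial_context)
        for _ in range(steps):
            nxt = predict_next(context)  # the model only ever sees the context...
            context.append(nxt)          # ...so "memory" lives here, in the loop
        return context

    print(rollout(["frame_0"], steps=3))
    # -> ['frame_0', 'frame_or_token_1', 'frame_or_token_2', 'frame_or_token_3']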
Messing with the logic in the loop and combining models has enormous potential, but it's more engineering than research, and it's just not the sort of work that LeCun is interested in. I think the conflict lies there: Facebook is an engineering company, and a possible future of AI lies in AI engineering rather than AI research.
I think world models are the way to go for superintelligence. One of the patents I saw already going in this direction for autonomous mobility is https://patents.google.com/patent/EP4379577A1, where synthetic data generation (visualization) is the missing step in terms of our human intelligence.
I thoroughly disagree; I believe world models will be critical in some respects for text generation too. A predictive world model can help validate your token predictions. Take a look at the Code World Model, for example.
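For a very crude picture of what "a world model validating token predictions" could mean (this is not how the Code World Model actually works, just the shape of the idea): the LLM proposes candidate continuations and a world model, here a trivial "does it even compile" check standing in for a real simulator, filters out the ones that lead to an inconsistent state. Both propose_candidates and simulate are invented placeholders.

    # Crude sketch: the LLM proposes continuations, a stand-in "world model"
    # (here just a syntax check) filters out candidates that lead nowhere.
    # Both functions are invented placeholders, not the real Code World Model.
    def propose_candidates(prefix):
        return [prefix + " x = 1", prefix + " x = )"]  # one valid, one broken

    def simulate(candidate):
        """Stand-in world model: does this continuation yield a valid state?"""
        try:
            compile(candidate, "<candidate>", "exec")
            return True
        except SyntaxError:
            return False

    def validated_step(prefix):
        return [c for c in propose_candidates(prefix) if simulate(c)]

    print(validated_step("if True:\n   "))
    # -> ['if True:\n    x = 1']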
> but an enterprise like Facebook can't have their top researcher work on risky things when there's surefire paths to success still available.
Bell Labs
Unless I've missed a few updates, much of the JEPA stuff didn't really bear a lot of fruit in the end.
Hard to tell.
The last time LeCun disagreed with the AI mainstream was when he kept working on neural net when everyone thought it was a dead end. He might be entirely right in his LLM scepticism. It's hardly a surefire path. He didn't prevent Meta from working on LLM anyway.
The issue is more that his position is not compatible with short-term investor expectations, and that's fatal at a company like Meta in the position LeCun occupies.
While I agree with your point, “Superintelligence” is a far cry from what Meta will end up delivering with Wang in charge. I suppose that, at the end of the day, it’s all marketing. What else should we expect from an ads company :?
The Meta Super-Intelligence can dwell in the Metaverse with the 23 other active users there.
Yeah honestly I'm with the LLM people here
If you think LLMs are not the future then you need to come with something better
If you have a theoretical idea that's great, but take it to at least GPT-2 level first before writing off LLMs
Theoretical people love coming up with "better ideas" that fall flat or have hidden gotchas when they get to practical implementation
As Linus says, "talk is cheap, show me the code".
Do you? Or is it possible to acknowledge a plateau in innovation without necessarily having an immediate solution cooked-up and ready to go?
Are all critiques of the obvious decline in physical durability of American-made products invalid unless they figure out a solution to the problem? Or may critics of a subject exist without necessarily being accredited engineers themselves?
LLMs are probably always going to be the fundamental interface; the problem they solved was related to the flexibility of human languages, allowing us to have decent mimicries.
And while we've been able to approximate the world behind the words, it's just full of hallucinations, because the AIs lack axiomatic systems beyond a lot of manually constructed machinery.
You can probably expand the capabilities by attaching things to the front-end, but I suspect that Yann is seeing limits to this and wants to go back and build up from the back-end of world reasoning, and then _among other things_ attach LLMs at the front-end (but maybe on equal terms with vision models, allowing for seamless integration of LLM interfacing _combined_ with vision for proper autonomous systems).
Why not both? LLMs probably have a lot more potential than what is currently being realized, but so do world models.
LLMs are the present. We will see what the future holds.
Of course the challenge with that is it's often not obvious until after quite a bit of work and refinement that something else is, in fact, better.
Isn't that exactly why he's starting a new company?
Well, we will see if Yann can.
When I first saw their LLM integration on Facebook I thought the screenshot was fake and a joke
Zuckerberg knows what he wants but he rarely knows how to get it. That's been his problem all along. Unlike others he isn't scared to throw ridiculous amounts of money at a problem though and buy companies who do things he can't get done himself.
Would love to have been a fly on the wall during one of their 1:1’s.
Yes, that was such a bizarre move.
LeCun, who's been saying LLMs are a dead end for years, is finally putting his money where his mouth is. Watch for LeCun to raise an absolutely massive VC round.
So not his money ;)
Good. The world model is absolutely the right play in my opinion.
AI agents like LLMs make great use of pre-computed information. Providing a comprehensive but efficient world model (one where more detail is available wherever one is paying more attention given a specific task) will definitely eke out new autonomous agents; a rough sketch of what such an interface might look like is below.
Swarms of these, acting in concert or with some hive mind, could be how we get to AGI.
I wish I could help, world models are something I am very passionate about.
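As promised above, here is a rough, purely hypothetical sketch of a task-dependent, level-of-detail world model interface. The WorldModel class, its key scheme, and the detail levels are invented for illustration, not any existing system.

    # Hypothetical interface: the agent requests more detail only where its
    # current task focuses attention. Everything here is invented for illustration.
    from dataclasses import dataclass

    @dataclass
    class Observation:
        region: tuple  # e.g. a named place or a bounding box
        detail: int    # 0 = coarse summary, higher = finer detail
        facts: list

    class WorldModel:
        def __init__(self, store):
            self.store = store  # pre-computed facts keyed by (region, detail)

        def query(self, region, attention):
            detail = 2 if attention >= 0.5 else 0  # spend detail where it matters
            return Observation(region, detail, self.store.get((region, detail), []))

    store = {
        (("kitchen",), 0): ["there is a counter"],
        (("kitchen",), 2): ["mug at x=0.4 m, handle facing east", "counter height 0.9 m"],
    }
    wm = WorldModel(store)
    print(wm.query(("kitchen",), attention=0.9).facts)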
Can you explain this “world model” concept to me? How do you actually interface with a model like this?
He needs a patient investor and realized Zuck is not that. As someone who delivers product and works a lot with researchers I get the constant tension that might exist with competing priorities. Very curious to see how he does, imho the outcome will be either of the extremes - one of the fastest growing companies by valuation ever or a total flop. Either way this move might advance us to whatever end state we are heading towards with AI.
Will be interesting to see how he fares outside the ample resources of Meta: Personnel, capital, infrastructure, data, etc. Startups have a lot of flexibility, but a lot of additional moving parts. Good luck!
I would love to join his startup, if he hires me, and there are many such people like me, and more talented.
Fei-Fei Li also recently founded a new AI startup called World Labs, which focuses on creating AI world models with spatial intelligence to understand and interact with the 3D world, unlike current LLMs that primarily process 2D images and text. Almost exactly the same focus as Yann LeCun's new venture as stated in the parent article.
The writing was on the wall when Zuck hired Wang. That combined with LeCun's bearish sentiment on LLMs led to this.
Interesting that he isn't just working with Fei-Fei Li if he's really interested in 'world models'.
Exactly where my mind turned. It's interesting how the AI OGs (Fei-Fei Li and LeCun) think world models are the way forward.
Working under LeCun but outside of Zuckerberg's sphere of influence sure sounds like a dream job.
Really? From where I'm standing LeCun is a pompous researcher who had early success in his career, and has been capitalizing on that ever since. Have you read any of his papers from the last 20 years? 90% of his citations are to his own previous papers. From there, he missed the boat on LLMs and is now pretending everyone else is wrong so that he can feel better about it.
His research group have introduced some pretty impactful research and open source models.
https://ai.meta.com/research/
"These models aim to replicate human reasoning and understanding of the physical world, a project LeCun has said could take a decade to mature."
What an insane time horizon to define success. I suppose he easily can raise enough capital for that kind of runway.
That guy has survived the AI winter. He can wait 10 years for yet another breakthrough. [but the market can’t]
https://en.wikipedia.org/wiki/AI_winter
We're at most in an "AI Autumn" right now. The real Winter is yet to come.
We have already been through winter. For those of us old enough to remember, the OP was making a very clear statement.
Winter is a cyclical concept, just like all the other seasons. It will be no different here; the pendulum swings back and forth. The unknown factor is the length of the cycle.
Java Spring.
Google summer.
AI autumn.
Nuclear winter.
A pretty short time horizon for actual research. Interesting to see it combined with the SV/VC world, though.
I suspect he sees a lot of scattered pieces of fundamental research outside of LLMs that he thinks could be integrated into a core within a year. The 10 years is to temper investors (leeway he can buy with his track record) and to fine-tune and work out the kinks when actually integrating everything, since there may be non-obvious issues.
Zuck is a business guy, understandable that this isn't going to fly with him
What is going on at meta?
Soumith probably knew about LeCun.
I’m taking a second look at my PyTorch stack.
During his years at Meta, LeCun failed to deliver anything of real value to stockholders, and may have demotivated people working on LLMs; he repeatedly said, "If you are interested in human-level AI, don’t work on LLMs."
His stance is understandable, but hardly the best way to rally a team that needs to push current tech to the limit.
The real issue: Meta is *far behind* Google, Anthropic, and OpenAI.
A radical shift is absolutely necessary - regardless of how much we sympathize with LeCun’s vision.
According to Grok, these were LeCun's real contributions at Meta (2013–2025):
- PyTorch – he championed a dynamic, open-source framework; now powers 70%+ of AI research
- LLaMA 1–3 – his open-source push; he even picked the name
- SAM / SAM 2 – born from his "segment anything like a baby" vision
- JEPA (I-JEPA, V-JEPA) – his personal bet on non-autoregressive world models
Everything else (Movie Gen, LLaMA 4, Meta AI Assistant) came after he left or was outside his scope.
What the hell does Mark see in Wang? Wang was born into a family whose parents got Chinese government scholarships to study abroad but secretly stayed in the US, and then the guy turns super anti-China. From any angle, this dude just doesn't seem reliable at all.
> Wang was born into a family whose parents got Chinese government scholarships to study abroad but secretly stayed in the US, and then the guy turns super anti-China.
All I'm hearing is he's a smart guy from a smart family?
He is very smart, but Mark is not. Ever since Wang joined Meta, way too many big-name AI scientists have bounced because of him. US AI companies have at least half their researchers being Chinese, and now they've stuck this ultimate anti-China hardliner in charge. I just don't get what the hell Meta's up to (and a lot of the time it ends up affecting non-Chinese scientists too). Being anti-China? Fine, whatever, but don't let it tank your own business and products first.
All I'm hearing is unreliable grifter from a family of unreliable grifters.
Everybody has figured out that LLMs no longer have a real expanding research horizon. Most progress now will likely come from tweaks to the data and lots of hardware. That's OpenAI's strategy.
And they also have extreme limitations that only world models or RL can fix.
Meta can't fight Google (an integrated supply chain, from TPUs to their own research lab) or OpenAI (brand awareness, the best models).
You gotta give it to Meta. They were making AI slop before AI even existed.
Change my mind, Facebook was never invented by Zuck's genius
All he's been responsible for is making it worse
He’s an incredible operator and has managed to acquire and grow an astounding number of successful businesses under the Meta banner. That is not trivial.
Almost every company in Facebook's position in 2005 would have disappeared into irrelevance by now.
Somehow it's one of the most valuable businesses in the world instead.
I don't know him, but, if not him, who else would be responsible for that?
He definitely has horrible product instincts, but he also bought insta and whatsapp at what were, back then, eye-watering prices, and these were clearly massive successes in terms of killing off threats to the mothership. Everything since then, though…
I know but isn't "massive success" rubbing up against antitrust here? The condition was "Don't share data with Facebook"
But wait they're just about to get AGI why would he leave???
LeCun always said that LLMs do not lead to AGI.
Can anyone explain to me the non-$$ logic for one working towards AGI, aside from misanthropy?
The only other thing I can imagine is not very charitable: intellectual greed.
It can't just be that, can it? I genuinely don't understand. I would love to be educated.
That's the old dream of creating life, becoming God. Like the Golem, Frankenstein...
I'm working toward AGI. I hope AGI can be used to automate work and make life easier for people.
>> non-$$ logic [...] aside from misanthropy
> I hope AGI can be used to automate work
You people need a PR guy, I'm serious. OpenAI is the first company I've ever seen that comes across as actively trying to be misanthropic in its messaging. I'm probably too old-fashioned, but this honestly sounds like Marlboro launching the slogan "lung cancer for the weak of mind".
Who’s gonna pay for that inference?
It’s going to take money, what if your AGI has some tax policy ideas that are different from the inference owners?
Why would they let that AGI out into the wild?
Let’s say you create AGI. How long will it take for society to recover? How long will it take for people of a certain tax ideology to finally say oh OK, UBI maybe?
The last part is my main question. How long do you think it would take our civilization to recover from the introduction of AGI?
Edit: sama gets a lot of shit, but I have to admit at least he used to work on the UBI problem, orb and all. However, those days seem very long gone from the outside, at least.
How old are you?
That's what they've been selling us for the past 50 years and nothing has changed; all the productivity gains were pocketed by the elite.
He also said other things about LLMs that turned out to be either wrong or easily bypassed with some glue. While I understand where he comes from, and that his stance is purely research-y and theory-driven, at the end of the day his positions were wrong.
Previously, he very publicly and strongly said:
a) LLMs can't do math. They trick us in poetry but that's subjective. They can't do objective math.
b) they can't plan
c) by the very nature of the autoregressive architecture, errors compound. So the longer you go in your generation, the higher the error rate, and at long contexts the answers become utter garbage (rough arithmetic in the sketch below).
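Just to spell out the arithmetic behind (c): if each token is wrong independently with some fixed probability, the chance of a fully clean sequence decays exponentially with length. The 1% per-token figure below is made up purely for illustration, and the independence assumption is exactly what better training regimes seem to break.

    # Back-of-the-envelope version of the compounding-error argument in (c).
    # eps is a made-up per-token error rate; independence is the key assumption.
    eps = 0.01
    for n in (10, 100, 1000):
        p_clean = (1 - eps) ** n
        print(f"n={n:4d}  P(no error) = {p_clean:.3f}")
    # n=  10  P(no error) = 0.904
    # n= 100  P(no error) = 0.366
    # n=1000  P(no error) = 0.000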
All of these were proven wrong, 1-2 years later. "a" at the core (gold at IMO), "b" w/ software glue and "c" with better training regimes.
I'm not interested in the will it won't it debates about AGI, I'm happy with what we have now, and I think these things are good enough now, for several usecases. But it's important to note when people making strong claims get them wrong. Again, I think I get where he's coming from, but the public stances aren't the place to get into the deep research minutia.
That being said, I hope he gets to find whatever it is that he's looking for, and I wish him success in his endeavours. Between him, Fei-Fei Li and Ilya, something cool has to come out of the small shops. Heck, I'm even rooting for the "let's commoditise LoRA training" thing that Mira's startup seems to be going for.
That's true but I also think despite being wrong about the capabilities of LLMs, LeCun has been right in that variations of LLMs are not an appropriate target for long term research that aims to significantly advance AI. Especially at the level of Meta.
I think transformers have been proven to be general purpose, but that doesn't mean that we can't use new fundamental approaches.
To me it's obvious that researchers are acting like sheep as they always do. He's trying to come up with a real innovation.
LeCun has seen how new paradigms have taken over. Variations of LLMs are not the type of new paradigm that serious researchers should be aiming for.
I wonder if there can be a unification of spatial-temporal representations and language. I am guessing diffusion video generators already achieve this in some way. But I wonder if new techniques can improve the efficiency and capabilities.
I assume the Nested Learning stuff is pretty relevant.
Although I've never totally grokked transformers and LLMs, I always felt that MoE was the right direction and besides having a strong mapping or unified view of spatial and language info, there also should somehow be the capability of representing information in a non-sequential way. We really use sequences because we can only speak or hear one sound at a time. Information in general isn't particularly sequential, so I doubt that's an ideal representation.
So I guess I am kind of in the "variations of transformers" camp myself, to be honest.
But besides being able to convert between sequential discrete representations and less discrete non-sequential representations (maybe you have tokens but every token has a scalar attached), there should be lots of tokenizations, maybe for each expert. Then you have experts that specialize in combining and translating between different scalar-token tokenizations.
Like automatically clustering problems or world model artifacts or something and automatically encoding DSLs for each sub problem.
I wish I really understood machine learning.
Zuck is definitely an idiot and MSL is an expensive joke, but LeCun hasn’t been relevant in a decade at this point.
No doubt his pitch deck will be the same garbage slides he’s been peddling in every talk since the 2010’s.
Why do you say it is garbage? I watched some of his videos on YT and it looks interesting. I can't judge whether it's good or really good, but it didn't sound like garbage at all.