A stubborn computer scientist accidentally launched the deep learning boom

(arstechnica.com)

114 points | by rbanffy 2 days ago ago

27 comments

afh1 a day ago

Related discussion, 7 days ago: https://news.ycombinator.com/item?id=42057139

I may be wrong, but I disagree with a lot of the spin being pushed in this article, although of course Hinton deserves a lot of the credit for keeping the field alive and igniting the "deep learning" explosion.

- The ImageNet dataset / competition pre-dates neural net entrants, and I'm not aware that Fei Fei created it in anticipation of such. The reason that AlexNet (2012 ILSVRC/ImageNet entry) made such an impact was that it beat all non-ANN entrants by such a huge margin that it was impossible to ignore, and pretty much killed all ongoing attempts to hand-design transform-invariant image features (such as SIFT).

- While NVidia deserve credit for enabling GPU-compute with CUDA, it was pure luck that ANNs subsequently took off, and became the primary use of CUDA (which then added ANN specific libraries such as cuDNN).

- AlexNet (ImageNet 2012) was certainly what started the ANN boom, which Hinton/LeCun/Bengio then decided to re-brand as "deep learning" to escape any historically negative association of "neural networks". However, while AlexNet demonstrated the power of a large (& deep) neural net, I don't think it's fair to say that it was responsible the current/waning belief in LLM "scaling laws". The immediate aftermath of AlexNet wasn't bigger datasets, but attempts to build bigger/better ANNs to do better on the ImageNet benchmark. The LLM "scaling laws" originated from OpenAI's GPT-1 and GPT-2 where unexpected model capabilities lead to experimentation of scaling, with Sutskever and Amodei being two of the earliest believers. Sutskever has recently said that he thinks that transformer scaling has plateaued, and has started his own company (SSI) pursuing a different approach.

I don't think we can say that Hinton accidentally created the deep learning boom. He always (to his huge credit) believed in ANNs, pushed it into the public eye with AlexNet, created the "Deep Learning" branding, and generally promoted it until it got too powerful for his liking.

[-]

pavon a day ago

I don't see any of that in the article, in fact quite the opposite. The article explicitly states that neither Fei Fei nor NVidia were thinking about neural networks when creating ImageNet or CUDA, rather it is spinning the serendipity of the three unrelated efforts coming together to create deep learning, and the idea that none of the three would have happened if it weren't for people pushing ideas that went contrary to conventional wisdom.

> I don't think we can say that Hinton accidentally created the deep learning boom.

Yeah, the headline is unclear, but I read it as referring to Fei Fei being the "accidental" contributor to deep learning, not Hinton.

qrios a day ago

> Huang argued that the simple existence of CUDA would enlarge the supercomputing sector. This view was not widely held, and by the end of 2008, Nvidia’s stock price had declined by seventy percent…

This is a good example of how investor behaviour can only quantitatively project what will happen in the future. Huang's bet on GPUs for high performance computing made sense in the long term.

Intel didn't have the staying power with the i860[1] a decade earlier (and of course had no idea how to offer decent developer tools). I tried really hard to develop meaningful and executable programmes with an 8bit card (DSM). CUDA was a revelation for me.

[1] https://www.geekdot.com/intel-80860/

[-]

trynumber9 a day ago

The stock value didn't drop because of CUDA but because most stocks and tech stocks dropped in 2008. AMD was also down 80%, Intel down 40%.

[-]

magicalhippo a day ago

I was going to say anyone should have noted that stocks being down in 2008 means nothing due to the global financial crisis[1].

But then I realized the author might have been a toddler at that time so might not have lived through it, making it easier to forget.

How time flies...

[1]: https://en.wikipedia.org/wiki/2007%E2%80%932008_financial_cr...

VyseofArcadia a day ago

> Nvidia invented the GPU in 1999

This is just plain not true.

Who edited this? Did no one even bother to skim the Wikipedia page on GPUs?

[-]

giobox a day ago

Interestingly Nvidia make the exact same claim in 1999 when launching the Geforce 256.

"3-D graphics systems maker nVidia (NVDA) Tuesday unveiled its next generation graphics accelerator, which it hopes will deflect some of the attention from new console gaming machines. The GeForce 256, which nVidia calls the world's first graphics processing unit, will be the successor to its TNT 2 line of graphics chips."

> https://money.cnn.com/1999/08/31/technology/nvidia/

Reading deeper, it seems like Nvidia considered it the first because it was the first GPU to ship with hardware accelerated T&L:

"GeForce 256 was marketed as "the world's first 'GPU', or Graphics Processing Unit", a term Nvidia defined at the time as "a single-chip processor with integrated transform, lighting, triangle setup/clipping, and rendering engines that is capable of processing a minimum of 10 million polygons per second"."

> https://en.wikipedia.org/wiki/GeForce_256

I would also consider earlier NVidia parts like the TNT2 to be GPUs, personally, not sure I agree with this claim either. I certainly called my TNT2 a graphics card back when I bought one in the 90s.

[-]

p_l a day ago

It wasn't even first GPU with hw t&l, it was first PC GPU with hw T&L, because legacy of gaming use meant that major graphic cards on PCs started from texturing side while hw t&l is where workstation GPUs started out.

[-]

VyseofArcadia a day ago

Like when Henry Ford invented the wheel*.

* The first wheel with an axle**.

** The first wheel with an axle for horseless carriages.

f1shy a day ago

The whole thing is absolutely utter crap. As early as 2002 there was, in the university where I studied a BIG push into NN. The term “deep learning” is also lousy and marketing.

[-]

Chabsff a day ago

Isn't "deep learning" specifically referring to the breakthroughs that finally started addressing the issues preventing architectures with more than a single hidden layer from achieving anything?

Unless I'm getting all of that wrong, it doesn't sound like that bad of a term. I get that it has since been used and abused to absurdity. But it's not like it came out of nowhere at first.

[-]

f1shy a day ago

That is deep neural network. More than 2 hidden layers. The “learning” is marketing

mp05 a day ago

My professor told me learning is considered "deep" if your NN has more than two layers.

IAmGraydon a day ago

Yes and to add a bit more color for those who are curious, the term "GPU" was coined by Sony in reference to the 32-bit Sony GPU (designed by Toshiba) in the PlayStation video game console, released in 1994. That said, the first devices that could be qualified as GPUs came about in the 1970s.

baq a day ago

> Who edited this?

chatgpt, probably.

parshimers a day ago

yeah, you could argue that the RIVA128 was the first successful PC GPU that didn't require a separate video card, but that's not the same thing.

[-]

p_l a day ago

succesful consumer PC GPU, if that.

davideg a day ago

The Alignment Problem: Machine Learning and Human Values[1] by Brian Christian is a nice overview of some of this history and exploration of some of the pitfalls in different ML approaches. It's really interesting reading it now given what's been happening with OpenAI and ChatGPT, etc.

[1] https://en.wikipedia.org/wiki/The_Alignment_Problem

a day ago

[deleted]

julianeon a day ago

She wrote a book incidentally, I haven't read it but I'm sure it's worth checking out: The Worlds I See.

throw0101b a day ago

See "Geoffrey Hinton in conversation with Fei-Fei Li" where the two got together and looked back at the developments over the last ~10 years, including why and how they did their part:

* https://www.youtube.com/watch?v=QWWgr2rN45o

prettyStandard a day ago

In college 2004-2008 I was ridiculed a few times for thinking Neural Networks were interesting. The way I saw it was human intelligence was built on neural network. All we were missing were sufficiently accurate models of neurons or scale.

In middle school I estimated to rival the processing power of a human brain you would need the processing power of an Empire State building full of Pentium 3 processors. It was a pretty rough calculation. I'm having a hard time remembering just how it went.

lol

[-]

neofrommatrix a day ago

I had a similar experience when I wanted to focus on NN research in 2007. My advisor had the wrong incentives. FML.

area51org a day ago

While I do like a lot of what AI brings us, it also scares the hell out of me and makes me fear more for the future of humanity than I ever have.

"Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should." —"Dr. Ian Malcolm" (Jeff Goldblum), Jurassic Park

yapyap a day ago

Lol someone likes Nvidia

trhway a day ago

in 15 years from :

"He then reached out to Nvidia. “I sent an e-mail saying, ‘Look, I just told a thousand machine-learning researchers they should go and buy Nvidia cards. Can you send me a free one?’ ” Hinton told me. “They said no.”"

to Altman saying he needs $7T and people eagerly lining up to give the money. Moore law for the venture money in tech.

And each time the new tech wave is higher. Somebody somewhere is already working on that next "trillions dollar something in 10-15 years". Quantum computing based NNs? Or may be just robots everywhere.