I feel that the Zen used in the West and the Zen in East Asia are quite different. I think the Western Zen is probably the one from the 1970s book Zen and the Art of Motorcycle Maintenance. It usually carries a sense of equanimity and beginner's mind. But in East Asia, Zen actually emphasizes aimlessness or non‑purposefulness.
The point where I really feel the difference is that Western Zen seems to be about how to train the self to become stronger, whereas actual Seon (Zen) in East Asia is about going with nature, letting go of the self, and allowing things to flow. In the actual practice of Seon, it's about doubting the self, letting go of attachments, and realizing that achievement, comparison, and the desire for control are all just fleeting. There's a famous phrase, 'Banghachak (放下著)': let it all go.
If anything, I think ancient Roman Stoicism feels more like Zen than Western Zen does.
So that's fascinating. When I saw this article, I was expecting it to be about whether we should give up the desire for success, but instead it took a completely different direction, which was surprising.
Similarly, the Western idea of Stoicism seems to focus mostly on controlling or even suppressing your emotions (at least on the surface), while the Stoicism you rightly call "Roman" (thanks for that, btw) is much more holistic and more of an ethical framework.
Who doesn't call stoicism Roman?
Most pop stoics focus on the Greeks :P
> Who doesn't call stoicism Roman?
The Greeks?
Thank you for the correction.
When I read the title I thought it was about running machine learning algorithms on AMD/Zen processors
> Zen actually emphasizes aimlessness or non‑purposefulness
The visual metaphor from Taoists is being like 'uncarved wood'. Western Zen has been bastardised and commercialised, whereas one can look into Taoism to find many of the same concepts that, by virtue of their own simplicity, have remained timeless. The "problem", so to speak, with Zen is its association with Buddhism, which has a long and intricate history and body of works attached to it, yet moves towards the same simplicity and spontaneity as Taoism.
In the words of Alan Watts, it all starts with the eternal Tao; all other religions are for people that need the same ideas overcomplicated with too many words.
You seem to know quite a lot about the East. Buddhism and Taoism are a bit different, of course, but your understanding is largely in line with how Eastern popular thought actually sees things. It seems like you've done a fair amount of business with Easterners.
Would either of you have a recommendation on where to start learning about either?
My journey into this world started with Watts' "The Way of Zen", and later, with his posthumous book "Tao: The Watercourse Way"
And I am a big fan of Ron Hogan's "Getting Right with Tao" translation/modern interpretation of the Tao Te Ching.
Lao Tzu: Tao Te Ching (translated by Ursula K. Le Guin)
The Way of Lao Tzu (Wing-tsit Chan)
I am just another western poser that has sought peace of mind reading Eastern philosophy. I am no expert.
"To be done with doing", from Ursula K. Le Guin's Earthsea novels, always struck me as such a powerful phrase. An entire state of mind boiled down to 5 words. But then again I remember her saying eastern philosophy greatly influenced her writing, if I'm not mistaken.
To be done with doing, would appear to require passive income?
Stories similar to Le Guin's series have long existed in Asia. There's a strong Taoist influence, you see: more specifically, Chinese-style Taoism rather than a Buddhist perspective.
From the viewpoint of '不立文字 (Bù lì wén zì): truth is not confined to language; language is merely the finger pointing at the truth' — this is closer to Taoism than to Zen. In fact, the Chinese worldview runs deep throughout her worldbuilding. Le Guin's take on 'magic' reflects a profound understanding of Eastern philosophy. The reason Ged doesn't use magic lightly is precisely a matter of balance, and (without giving away spoilers) the final confrontation between Ged and the Shadow is essentially about embracing one's own dark side — which shows a deep grasp of Taoist thought.
Personally, I also love the Earthsea series. The philosophy underlying that world is exactly the kind that resonates especially well with East Asian readers
Ha, wow, thanks for the refinement. Indeed use of language (especially at the end with the dragons) is a very important theme.
And I agree, it's more than excellent. The judicious magic, the way she manages to naturally - without it becoming a sermon - describe acts of kindness as the biggest miracles, is great.
Highly recommended.
Kind of like when I had dark bread in Asia, it was white bread with food coloring.
Some things don't transfer well.
That good ol' Protestant work ethic. Idle hands are the Devil's playthings!
Given AI’s impact on society, I read this more as Zen And The Practice of Kamikaze.
Around 2015, I found myself managing back end and machine learning engineers (not researchers) at the same time. Many of the back end engineers wanted to do more ML. Some of them did well when given a chance, but others wanted to revert to back end within a few months. At the same time, one of the ML leaders wanted to step away from ML and only do back end work to support ML.
As I studied these dynamics, something occurred to me... Different people need to see signs of success at different frequencies. Because of the nature of our product, measuring the performance of a new/updated model required the model to be live for at least a full calendar month. So, between initial work and final analysis, it was often a 2 month wait or more. For many back end tasks, you can build a quick prototype, run it to see if it works, and be on your way - the signals come all day long. The varying frequency needs of different people went a long way to determining which of them liked working on ML.
This is sort of a manager's version of feature engineering. ;-) The people on that team taught me a lot!
I saw the same thing and always wondered how you can manage it effectively.
I had a team of data engineers that wanted to do more data science, and 2 data scientists that both wanted to be data engineers (one of them argued that everyone wants to be a DS and so it was too crowded, saying that they could make more money as a DE).
I also remember a specific instance where, one day, my friend ranted about how he needs to step away from pure front end and that it's a dead end career (he was quite good at it too!) and then the next day at lunch a colleague started complaining about how front end developers get all the credit and he's considering moving.
Stepping away from the work to find inspiration, to give the subconscious time to process everything and present ideas to your conscious mind, is necessary. I try to pick a wild or almost outlandish idea from time to time, because if I only try what I think will work, then I'm not doing my job.
I think this also stems from ML being more like biology or alchemy and less like math or programming (where you can get down to the first principles, abstractions are rock solid, and non-determinism is limited in scope).
excellent essay. what a great read.
like the author said, so much of 'success' or 'progress' (in research but of course also across disciplines) depends upon temperament. just straight up having a good attitude about things. the skills that make a good researcher could not be more transferable: patience, innate curiosity, and a resilience against failure.
that said, these skills are increasingly rare/at a premium given our culture of minimizing discomfort tolerance via hyperconvenience. people have a harder and harder time waiting or failing.
> If you want to solve a problem, the tried-and-true path to success is to attempt a solution, try it, reach a bottleneck, try to solve it, and only reach for literature when you’ve run out of ideas yourself.
I've found this to be the right balance between using your creativity and getting stuck too long
Perhaps I've been deep in my own issues for too long, but it seems to me that the author is trying to say "don't trust the current evaluation suites too much"; scores only reflect a small part of the problem. What's interesting is discovering a new, stable evaluation metric, doing something new based on it, and having that new thing yield some unexpected intelligent results
I have some coworkers that are similar in everything--education, work ethic, and intelligence--but some of them tick out ML ideas that work like clockwork, while others get hits rarely if ever. I cannot tell what makes it work for some and not others. Both groups' ideas sound equally good.
Sometimes a coworker will be an ML star for a year or two, but then suddenly run out of steam. It's brutal to watch.
I used to think most smart people had similar distributions of good ideas, and it was just that the hardest working tried out all 50 of their ideas to pick out the 2 good ones. But I've seen smart and hardworking people have a hit rate of 0.
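A rough back-of-the-envelope check on the numbers above, as a sketch only. It assumes (hypothetically) that every idea succeeds independently with the same 2-in-50 chance; the independence assumption and the helper name `prob_zero_hits` are illustrative and not something the comment states.

```python
def prob_zero_hits(hit_rate: float, n_ideas: int) -> float:
    """Chance that none of n_ideas independent attempts succeeds."""
    return (1.0 - hit_rate) ** n_ideas


# Assumed numbers taken from the comment: 2 good ideas out of 50 tried.
hit_rate = 2 / 50
p_none = prob_zero_hits(hit_rate, 50)
print(f"P(no hits in 50 ideas) = {p_none:.2f}")
```

Under these assumptions, roughly one in eight equally capable people would see no hits at all from 50 ideas, so a run of misses may say less about the person than it appears to.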
It's not just ML research; that's just human nature.
We like to see hard-working, God-fearing people minting raw knowledge from Mount Olympus itself, whereby each shard of crystalline insight is carved meticulously by the Apprentice over the course of a productive and morally pure career.
The reality is it's some skill plus the occasional drive-by of an unknown force of nature, hitting you on the head with a shattered fragment of insight whose provenance you'll remain completely ignorant of. I'd say we just revert back to invoking the muses. It was a fine explanation.
That's the nature of research. You try every idea that may be a good avenue and only a handful work out, if at all. That's why quantifying research credibility via publication and citation counts inherently leads to toxic work cultures. The best ideas must be given time to be discovered, not forced out and contorted to fit the requirements of a journal.
this is part of why I think most researchers get less productive over time... Someone gets some big result during grad school or early career, gets some big job from it, and then struggles to get new results of similar quality :shrug:
With ML in particular, there's also the sheer volume of people basically all looking at (essentially) the same problems... so it's kind of like monkeys with typewriters spamming ideas until some work.
In spirituality it is believed that ideas and inspirations aren't our own, that our mind is like an LLM that gets prompted by higher beings. In research everyone has high param count minds, trained for many years by studying. But just like LLMs by themselves are useless at creating new original work, no matter the compute you have available, so the mind cannot create anything new without "inspiration".
Wow, this makes ML sound even more like voodoo than I thought. Can you give examples of what the nature of these ideas is?
It revolves around the sentiment of "go deeper" - but I think it is a double-edged sword. Sure, entropy, tensors and gradients are important - and yes, they are pretty much requirements.
But from what I see, it is the opposite - a lot (if not virtually all) progress in the last decade of deep learning was not because of a fundamental idea, but incremental, experimentally-verified practice. Even though I think there is good intuition for why ReLU is better than sigmoid (tl;dr: last layer is log(sigmoid) ~ ReLU, putting anything different inside kills the gradient), the original paper by Hinton himself was more or less "because it trains 3x faster".
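One way to read the "log(sigmoid) ~ ReLU" intuition, as a minimal sketch: the negative log of a sigmoid is the softplus function, which closely tracks ReLU away from zero, while the sigmoid's own gradient shrinks towards zero for large inputs. This interpretation and the function names below are assumptions for illustration, not taken from any of the papers mentioned.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softplus(x: float) -> float:
    # Identity: softplus(x) = log(1 + e^x) = -log(sigmoid(-x))
    return math.log1p(math.exp(x))


def relu(x: float) -> float:
    return max(0.0, x)


def sigmoid_grad(x: float) -> float:
    # Derivative of the sigmoid: s * (1 - s), which vanishes as |x| grows.
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x: float) -> float:
    # Derivative of ReLU: constant 1 for positive inputs, 0 otherwise.
    return 1.0 if x > 0 else 0.0


for x in (-10.0, -2.0, 2.0, 10.0):
    print(
        f"x={x:6.1f}  softplus={softplus(x):.5f}  relu={relu(x):.5f}  "
        f"sigmoid'={sigmoid_grad(x):.2e}  relu'={relu_grad(x):.0f}"
    )
```

For large positive inputs the ReLU gradient stays at 1 while the sigmoid gradient becomes tiny, which is the usual "saturation" explanation for slower training; it does not by itself account for any specific speed-up figure.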
Re-thinking fundamentals might help, but "let's change the fundamentals" is rarely how it works. Even the most seminal papers, e.g. AlexNet and "Attention Is All You Need", are refinements of existing ideas that show how those ideas help.
Machine learning is an experimental science. Many mathematically cool ideas do not work. Many engineering ones do.
> I've tweeted before that one of the most important traits in a researcher is healthy paranoia. Be paranoid!
I have seen so many PhDs burned out to cinders; I don't think it is any better a piece of advice than "depression is good for philosophers". Sure, be a relentless explorer.
> In short, holding on to ideas for too long can actually be counterproductive. Stay open-minded and refuse to let ego cloud your judgement.
Which I think is true.
This is gold!!!!