> The corresponding row vector is denoted by x^T when we need to distinguish them. We can also ignore the transpose for readability, if the shape is clear from context.
I am tilting at windmills, but I am continually annoyed at the sloppiness of mathematicians in writing. Fine, you don’t like verbosity, but for didactic purposes, please do not assume the reader is equipped to know that variable x actually implies variable y.
All that being said, the writing style from the first chapter is very encouraging at how approachable this will be.
This 3-page classic [1], "Learning representations by back-propagating errors", captures most of the core ideas and explains them in a manner anyone with a basic calculus background can understand.
[1] https://gwern.net/doc/ai/nn/1986-rumelhart-2.pdf
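For readers rusty on the calculus, here is a minimal sketch of the chain-rule idea behind back-propagation. It is not taken from the paper: it assumes a toy network with one input, one sigmoid hidden unit, one sigmoid output, and a squared-error loss, and all names (`forward`, `backprop`, `numeric_grads`) are illustrative. The finite-difference check is only a sanity test of the hand-derived gradients.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w1, w2, x):
    """Toy network (an assumption for illustration): x -> h -> y, both sigmoid."""
    h = sigmoid(w1 * x)
    y = sigmoid(w2 * h)
    return h, y


def loss(w1, w2, x, target):
    """Squared error between the network output and the target."""
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2


def backprop(w1, w2, x, target):
    """Gradients of the loss w.r.t. w1 and w2, derived with the chain rule."""
    h, y = forward(w1, w2, x)
    # Error signal at the output: dL/dy * dy/dz, using sigmoid'(z) = y * (1 - y).
    delta_out = (y - target) * y * (1.0 - y)
    grad_w2 = delta_out * h
    # Propagate the error back through w2 to the hidden unit.
    delta_hidden = delta_out * w2 * h * (1.0 - h)
    grad_w1 = delta_hidden * x
    return grad_w1, grad_w2


def numeric_grads(w1, w2, x, target, eps=1e-6):
    """Central finite differences, used only to check the analytic gradients."""
    g1 = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
    g2 = (loss(w1, w2 + eps, x, target) - loss(w1, w2 - eps, x, target)) / (2 * eps)
    return g1, g2


# Arbitrary illustrative values.
w1, w2, x, target = 0.5, -0.3, 1.5, 1.0
analytic = backprop(w1, w2, x, target)
numeric = numeric_grads(w1, w2, x, target)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed pairs should agree to several decimal places; the only calculus involved is the derivative of the sigmoid and repeated use of the chain rule.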
I took calculus over 30 years ago and have never really used it -- I'll put your conjecture to the test (sample size: 1). Will let you know if it holds ;-).
It is weird to be honest. I first learned Coq and then started taking upper level maths classes. My group theory proofs were panned by my TAs as overly verbose, very precise, and I was specializing on H_1 and H_2s everywhere and having IHns flying around like crazy because I could not fathom how one proves things without formally connecting things up.
Then my profs told me I was not “wrong”, but proofs or expositions are to most mathematicians not programs (ha! How did I not know. You teach me natural deduction and expect me not to program?), more like convincing arguments/prose. At some point one abstracts.
> I am tilting at windmills, but I am continually annoyed at the sloppiness of mathematicians in writing. Fine, you don’t like verbosity, but for didactic purposes, please do not assume the reader is equipped to know that variable x actually implies variable y.
I am a practicing mathematician who felt the same way you did when I started, and who still writes their papers in a way that many of my colleagues feel is gallingly pedantic. With that as my credentials, I hope I may say that it can be much worse as a reader to read something where every detail is spelled out, because a bit of syntactic sugar begins to seem as important as the heart of an argument. Where the dividing line is between precision and obfuscation depends on the reader, and so inevitably will leave some readers on the wrong side, but a trade-off does have to be made somewhere.
Could there be a compromise where the verbosity is kept but the key points are highlighted, grouped, or presented in a different color?
I would certainly appreciate it if math papers were more explicit and "hand-holding", but I understand why trained mathematicians would find that tedious.
> Could there be a compromise where the verbosity is kept but the key points are highlighted, grouped, or presented in a different color?
There's no reason except inertia why there couldn't be. Lamport actually proposed a system for this: https://lamport.azurewebsites.net/pubs/lamport-how-to-write.....
I wish the formality would be included in an appendix — as someone who has had to implement a lot of things (and more than once, found errors).
But I agree with your general point: understanding the recipe and general thrust of the approach is often more important, because even if the exact proof misses some technical detail, that can often be patched.
> I wish the formality would be included in an appendix — as someone who has had to implement a lot of things (and more than once, found errors).
Indeed. Lamport says that this was part of what inspired his interest in formal proofs: https://mathoverflow.net/questions/35727/community-experienc....
Wow, kudos to the author. Very easy to digest, beautifully crafted, and it takes the time to explain concepts that most places take for granted.
Well, kudos to your one-line comment too because now I am encouraged to actually read this.
This looks like a good practical companion for a more theoretical text, such as Deep Learning by Bishop.
Beautifully formatted and has the right combination of code and theory for noobs like me. Strong vibes for Simone right now, hero of the people.
It would be nice if arXiv included a small-layout PDF or native EPUB option for e-readers. Now that they serve the TeX files and are experimenting with HTML, it feels like a natural step.
And I just bought the physical book...
I do that all the time to support authors, plus the physicality of a tangible book is irreplaceable. In fact, I did that just today with a different book.
Glad to see JAX featured alongside PyTorch. JAX still feels like the best-kept secret in deep learning.
Website of the author with more material and lab sessions:
https://www.sscardapane.it/alice-book/
https://sscardapane.notion.site/Guided-lab-sessions-18c25bd1...
Although I love this, it's not peer reviewed and I don't trust arXiv.
Actually, it is peer reviewed following the standard practice for books: some other people read it and provided feedback as evidenced by the Acknowledgments section.
It’s more a book than academic research.
The funny thing about books is that authors in free societies are allowed to self-publish whatever they want. The norms are different and, frankly, more democratic and with less gatekeeping.
People are submitting corrections: https://www.sscardapane.it/assets/alice/errata_list.pdf
arXiv is a preprint server that the scientific community has trusted for decades. Papers there often undergo peer review later, and many top ML researchers publish their work there first for faster dissemination.
Damn, this is beefy. I'll need a month at ten pages a day. Thanks, looks awesome. Could eventually add diffusion too.