This is basically a PyTorch library for executing computations over dynamic ranges that exceed float64's limits, including on GPUs.
I can see how it could be useful when you really need it. Thank you for sharing it on HN.
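To make the use case concrete, here's the failure mode the library targets, illustrated with plain PyTorch (nothing repo-specific; the numbers and names here are mine):

    import torch

    # A chain of 10,000 multiplicative growth factors whose running
    # product overflows float64. Summing logs instead stays finite.
    g = torch.rand(10_000, dtype=torch.float64) * 4.0  # factors in (0, 4)
    print(g.prod())       # inf: the product exceeds float64's range
    print(g.log().sum())  # log of the same product, finite (~3.9e3)

As I understand it, the library does this kind of log-space bookkeeping for you, extended to matrix operations, autograd, and GPUs.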
I tried the sample code for estimating Lyapunov exponents in parallel. It worked on the first try, and it was much faster than existing methods, as advertised. It's nice to come across something that works as advertised on the first try!
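For anyone unfamiliar, the largest Lyapunov exponent is the long-run average of log |f'(x_t)| along a trajectory, which is exactly the kind of quantity whose intermediate products blow past float range. A toy serial version for the logistic map (textbook math, not the repo's sample code):

    import torch

    # Logistic map x' = r x (1 - x); its largest Lyapunov exponent is
    # the trajectory average of log |f'(x)| = log |r (1 - 2 x)|.
    r, T = 4.0, 100_000
    x = torch.empty(T, dtype=torch.float64)
    x[0] = 0.2
    for t in range(T - 1):
        x[t + 1] = r * x[t] * (1.0 - x[t])
    lam = torch.log(torch.abs(r * (1.0 - 2.0 * x))).mean()
    print(lam)  # ~ ln 2 = 0.693... for r = 4

The repo's version, as I understand it, replaces the serial accumulation with a parallel scan over log-domain products, which is where the speedup comes from.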
The high-dynamic-range RNN stuff may be interesting to others, but it's not for me. In my book, Transformers have won. Nowadays it's so easy to whip up a small Transformer with a few lines of Python, and it will work well on anything you throw at it.
To the best of our knowledge, this is the first time anyone has successfully trained a non-diagonal RNN computed in parallel, via prefix scan, without requiring any form of stabilization. We abstained from claiming as much out of an abundance of caution.
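To sketch the mechanism in stock PyTorch (an illustration, not our library's API, and restricted to positive matrix entries so real logs suffice; the manuscript's complex-log construction lifts that restriction):

    import torch

    def log_matmul_exp(log_A, log_B):
        # Matrix product carried out entirely in log space:
        # C_ij = logsumexp_k (log_A_ik + log_B_kj).
        return torch.logsumexp(
            log_A.unsqueeze(-1) + log_B.unsqueeze(-3), dim=-2)

    def log_prefix_matmul(log_mats):
        # Inclusive prefix products of a (T, n, n) stack of log-domain
        # transition matrices by recursive doubling (Hillis-Steele):
        # O(log T) steps, each a batched log_matmul_exp. Afterwards,
        # out[t] is the log of A_t @ ... @ A_0.
        out, shift = log_mats, 1
        T = log_mats.shape[0]
        while shift < T:
            out = torch.cat(
                [out[:shift], log_matmul_exp(out[shift:], out[:-shift])],
                dim=0)
            shift *= 2
        return out

    # Transition matrices whose entries span hundreds of orders of
    # magnitude; the scanned prefixes stay finite in log space.
    log_A = 500.0 * torch.randn(64, 8, 8, dtype=torch.float64)
    log_P = log_prefix_matmul(log_A)

Because every intermediate stays in log space, the scan needs no rescaling or normalization step, which is the stabilization that goes away.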
Hmm, how does this compare to things like
https://github.com/cjdoris/LogarithmicNumbers.jl
or
https://github.com/cjdoris/HugeNumbers.jl
(Apart from the PyTorch impl)
In particular, it feels like storing the complete complex number is a bit silly, since we know a priori that the imaginary part exponentiates to ±1. Wouldn't that mean we have wasted 31 bits? (= 32 − 1, since only one bit is needed for the sign.)
That being said, this representation is of course very useful in certain scenarios, when you know that the dynamic range of your numbers is very large, but as far as I can tell it's not exactly super novel, unless I'm missing something!
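For concreteness, here is the representation under discussion, using only stock PyTorch complex ops:

    import torch

    # A real x stored as log(x) over C: the real part is log|x| and the
    # imaginary part is the phase, 0 or pi for real inputs, which
    # exponentiates back to the sign, i.e. +1 or -1.
    x = torch.tensor([-3.0, 2.0], dtype=torch.float64)
    z = torch.log(x.to(torch.complex128))
    print(z)                  # [1.0986+3.1416j, 0.6931+0.0000j]
    print(torch.exp(z).real)  # recovers [-3., 2.] up to rounding

So for purely real data, the imaginary component indeed carries only one bit of information.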
The manuscript formally defines GOOMs as a set of mathematical objects, shows that floating-point formats are a special case of GOOMs, and notes that they extend prior work on logarithmic number systems (LNSs), which go back to at least the early 1970s. That is, LNSs are a special case of GOOMs too. Defining and naming GOOMs enables reasoning about all possible special cases in the abstract. In practice, each implementation makes different trade-offs.
The formal definition stops short of inducing an isomorphism between GOOMs and R, to allow for the possibility of transformations that leverage the structure of the complex plane, e.g., deep learning models that process data in C and apply a final transformation from C to GOOMs, thereby allowing the data to be exponentiated to R. The library in this repository makes implementing such a model trivial, because it ensures that backpropagation works seamlessly over C, over GOOMs, and across mappings between C, GOOMs, and floats.
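Here is the kind of round trip we mean, sketched with stock torch ops rather than the library's own functions:

    import torch

    # Gradients flow from R, through a complex log representation, and
    # back to R; a 4-fold product becomes a sum of logs along the way.
    x = torch.tensor([-0.5, 3.0], dtype=torch.float64, requires_grad=True)
    z = torch.log(x.to(torch.complex128))  # complex log representation
    y = torch.exp(4.0 * z).real            # x**4, computed via logs
    y.sum().backward()
    print(x.grad)  # matches d/dx x**4 = 4 x**3, i.e. [-0.5, 108.]

The library packages this so that mappings between C, GOOMs, and floats stay differentiable inside larger models.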
Take a look at the selective-resetting algorithm in the manuscript too. To the best of our knowledge, it's a new algorithm, but we opted not to claim as much, out of an abundance of caution. You will appreciate reading about it.
A different approach to the same problem: see https://posithub.org/ or https://en.wikipedia.org/wiki/Unum_(number_format)
Yes. See this comment for context: https://news.ycombinator.com/item?id=45611863
Does it mean I can use the word “zillion” in a professional scientific context?
repo:
https://github.com/glassroom/generalized_orders_of_magnitude
We'll add that link to the toptext as well. Thanks!
Thank you for things like this; these kinds of tweaks and choices significantly enhance news.yc.