A different floating-point hack makes exp() (and consequently tanh) easier to compute in hardware. You cast the input to an int and take the first 2 bits of what would be the mantissa as an index. LUT[Index] and LUT[Index+1] from your 5-entry table are then used to either lerp or polynomially approximate the function, with the remaining mantissa bits as the interpolation parameter.
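Here is a rough Python sketch of how I read the scheme above; it is one interpretation, not a description of any particular hardware. Assumptions that are mine: the input is first scaled by log2(e) so the table only has to cover 2^f on [0, 1], the fixed-point width `FRAC_BITS` is 16, the table holds 2^(k/4), and the names `exp_lut` and `tanh_lut` are made up for illustration. It uses the lerp variant rather than a polynomial.

```python
import math

FRAC_BITS = 16  # assumed fixed-point width of the fractional part
# 5 entries: the endpoints of 4 equal segments of 2^f on [0, 1]
LUT = [2.0 ** (k / 4) for k in range(5)]

def exp_lut(x: float) -> float:
    """Approximate exp(x) with a 5-entry table and linear interpolation."""
    # exp(x) = 2^(x * log2(e)); convert that exponent to fixed point.
    t = math.floor(x * math.log2(math.e) * (1 << FRAC_BITS))
    int_part = t >> FRAC_BITS            # floor of the exponent, also for negatives
    frac = t & ((1 << FRAC_BITS) - 1)    # the bits that "would be the mantissa"
    index = frac >> (FRAC_BITS - 2)      # top 2 bits select a segment, 0..3
    rest = frac & ((1 << (FRAC_BITS - 2)) - 1)
    w = rest / (1 << (FRAC_BITS - 2))    # remaining bits as a weight in [0, 1)
    m = LUT[index] + w * (LUT[index + 1] - LUT[index])  # lerp inside the segment
    return math.ldexp(m, int_part)       # m * 2^int_part

def tanh_lut(x: float) -> float:
    """tanh(x) = (e^(2x) - 1) / (e^(2x) + 1), built on the table-based exp."""
    e2x = exp_lut(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)
```

With only four segments the lerp is coarse (relative error of a few tenths of a percent in exp), which is why a small polynomial per segment is the other option mentioned above.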
A different approach, refining the square-root-based sigmoid with a polynomial, is in my blog post "a few of my favorite sigmoids" [1]. I can't say which is faster without benchmarking, but I'm pretty sure its worst-case error is better than that of any of the fast approximations.
[1]: https://raphlinus.github.io/audio/2018/09/05/sigmoid.html
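For readers who want the shape of the idea without clicking through, here is a minimal sketch. It relies on the identity tanh(x) = sinh(x) / sqrt(1 + sinh(x)^2), so feeding a polynomial approximation of sinh into the square-root sigmoid y / sqrt(1 + y^2) approximates tanh. The coefficients below are plain Taylor terms of sinh that I picked for illustration; they are not the fitted coefficients from the blog post, and the name `tanh_sqrt` is hypothetical.

```python
import math

def tanh_sqrt(x: float) -> float:
    """Approximate tanh(x) as y / sqrt(1 + y^2) with y a polynomial in x."""
    x2 = x * x
    # y ~ sinh(x) = x + x^3/6 + x^5/120 + ...  (Taylor terms, an assumption here)
    y = x * (1.0 + x2 * (1.0 / 6.0 + x2 / 120.0))
    # The square-root sigmoid keeps the result inside (-1, 1) for any finite y.
    return y / math.sqrt(1.0 + y * y)
```

Even with these unfitted coefficients the result stays within about 1e-3 of tanh on [-6, 6]; a properly fitted polynomial should do better.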
There’s an analysis of the Schraudolph approximation of the exponential function (along with an improvement upon it) that someone might find interesting at https://typ.dev/attention#affine-cast
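For context, a minimal sketch of the basic Schraudolph trick in its float32 form: scale x by 2^23 / ln 2, add the exponent bias, and reinterpret the resulting integer as an IEEE 754 float. This is the uncorrected version as I understand it, not the improved variant from the linked analysis; the name `exp_schraudolph` and the `shift` parameter (a tunable correction constant, left at 0 here) are my own labels.

```python
import math
import struct

SCALE = (1 << 23) / math.log(2)  # one unit of the float32 exponent field per ln(2)
BIAS = 127 << 23                 # float32 exponent bias, shifted into position

def exp_schraudolph(x: float, shift: int = 0) -> float:
    """Approximate exp(x) by building the float32 bit pattern directly.

    Only valid while the result is a normal positive float32
    (roughly -87 < x < 88).
    """
    i = int(SCALE * x) + BIAS - shift
    # Reinterpret the 32-bit integer as a float32.
    return struct.unpack("<f", struct.pack("<i", i))[0]
```

With `shift = 0` the mantissa bits linearly interpolate between powers of two, so the result is never meaningfully below exp(x) and overshoots by up to roughly 6%. Choosing a nonzero `shift` trades that one-sided error for a smaller two-sided one.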
Looks interesting. It should start with a definition of the hyperbolic tangent. As it stands, the definition only appears about two-thirds of the way through, in the middle of a discussion of computing exp(x).