Using Faster Exponential Approximation

In some applications, speed matters more than precision, and you may be willing to sacrifice some accuracy for performance.

In neural networks, where the exponential function $e^n$ is evaluated with n usually small (less than 2, for instance), you can avoid the expensive exp() provided by math.h (other programming languages provide similar built-in functions).

The exponential function $e^x$ can be defined as the following limit:

$e^x = \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n$

In practice, n cannot go to infinity, but a large n gives reasonably good accuracy.
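
How large n needs to be can be estimated from the logarithm of the approximation. The following is a standard Taylor-series sketch, valid for |x| < n; it is not part of the original derivation:

```latex
\begin{align*}
\ln\left(1 + \frac{x}{n}\right)^{n}
  &= n \ln\left(1 + \frac{x}{n}\right)
   = x - \frac{x^{2}}{2n} + \frac{x^{3}}{3n^{2}} - \cdots \\
\frac{\left(1 + x/n\right)^{n}}{e^{x}}
  &= \exp\left(-\frac{x^{2}}{2n} + \cdots\right)
   \approx 1 - \frac{x^{2}}{2n}
\end{align*}
```

So the relative error grows roughly as $x^2 / (2n)$: at x = 5 that is about 25/512, or roughly 5%, for n = 256, and 25/2048, or roughly 1.2%, for n = 1024.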

For example, if we put $n = 256$, then we can obtain the 256th power by squaring $1 + \frac{x}{256}$ 8 times, due to the fact that $256 = 2^8$.

With this in mind, we can come up with the following approximation:

inline
double exp1(double x) {
  x = 1.0 + x / 256.0;              // 1 + x/n with n = 256
  x *= x; x *= x; x *= x; x *= x;   // 8 squarings raise the value
  x *= x; x *= x; x *= x; x *= x;   // to the power 2^8 = 256
  return x;
}

We can also square a few more times, with a correspondingly larger n, to increase the accuracy.

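// Note: math.h (C99) already declares exp2(); rename this function
// if that header is included in the same file.
inline
double exp2(double x) {
  x = 1.0 + x / 1024;               // 1 + x/n with n = 1024
  x *= x; x *= x; x *= x; x *= x;   // 10 squarings raise the value
  x *= x; x *= x; x *= x; x *= x;   // to the power 2^10 = 1024
  x *= x; x *= x;
  return x;
}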
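
Both functions follow the same pattern, which can be sketched generically. The helper below is hypothetical (the name exp_approx and the parameter k are assumptions, not from the post): it picks n = 2^k and squares k times, so k = 8 corresponds to n = 256 and k = 10 to n = 1024.

```cpp
#include <cassert>

// Hypothetical helper: approximates e^x as (1 + x / 2^k)^(2^k),
// using k squarings instead of 2^k - 1 multiplications.
static inline double exp_approx(double x, int k) {
  assert(k >= 0 && k < 31);                     // keep 1 << k within int range
  x = 1.0 + x / static_cast<double>(1 << k);    // 1 + x/n with n = 2^k
  for (int i = 0; i < k; ++i) {
    x *= x;                                     // each squaring doubles the exponent
  }
  return x;
}
```

The hand-unrolled versions avoid the loop, so this form is probably better suited to experimenting with different values of k than to the hot path itself.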

Now you have the pattern. Next, we need to test how accurate these approximations are:

[Figure: exp() from math.h plotted against the exp 256 and exp 1024 approximations]

The plot above shows 3 curves: the exp provided by math.h, the exp 256 and the exp 1024. They show very good agreement for inputs smaller than 5.

We plot the difference to make it easier to see.

[Figure: difference between exp() from math.h and the two approximations]

Wow, it really can be a faster alternative if the required inputs are smaller than 5. For negative inputs, the difference won’t be so noticeable because the values themselves are too tiny to be seen in the graph.
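
The agreement can also be checked numerically rather than by eye. The sketch below is an assumption-laden example: exp256 repeats the body of exp1 under a different name, and max_rel_error is a hypothetical helper that samples the relative error against exp() from the standard library.

```cpp
#include <cassert>
#include <cmath>

// Same body as exp1 in the post (n = 256), renamed for this sketch.
static inline double exp256(double x) {
  x = 1.0 + x / 256.0;
  x *= x; x *= x; x *= x; x *= x;
  x *= x; x *= x; x *= x; x *= x;
  return x;
}

// Hypothetical helper: largest relative error of f against std::exp,
// sampled at steps + 1 evenly spaced points in [lo, hi].
static double max_rel_error(double (*f)(double), double lo, double hi,
                            int steps) {
  assert(steps > 0 && lo <= hi);
  double worst = 0.0;
  for (int i = 0; i <= steps; ++i) {
    double x = lo + (hi - lo) * i / steps;
    double err = std::fabs(f(x) - std::exp(x)) / std::exp(x);
    if (err > worst) worst = err;
  }
  return worst;
}
```

Calling max_rel_error(exp256, 0.0, 5.0, 100) gives a single number to compare against whatever tolerance the application can accept. Relative error is used here because the absolute difference hides the error for negative inputs, where the values themselves are tiny.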

The exp 256 is 360 times faster than the traditional exp and the exp 1024 is 330 times faster than the traditional exp.
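
Speed-up figures like these depend heavily on the compiler, optimization flags and hardware, so it is worth measuring on your own setup. The following is only a minimal timing sketch: time_calls and lib_exp are hypothetical helpers, exp256 repeats the body of exp1, and calling through a function pointer prevents inlining, which may understate the gain.

```cpp
#include <cassert>
#include <cmath>
#include <ctime>

// Same body as exp1 in the post (n = 256), renamed for this sketch.
static double exp256(double x) {
  x = 1.0 + x / 256.0;
  x *= x; x *= x; x *= x; x *= x;
  x *= x; x *= x; x *= x; x *= x;
  return x;
}

// Thin wrapper so the library exp has a plain function-pointer type.
static double lib_exp(double x) { return std::exp(x); }

// Hypothetical helper: CPU seconds spent on `calls` invocations of f.
// The volatile sink keeps the compiler from discarding the results.
static double time_calls(double (*f)(double), long calls) {
  assert(calls > 0);
  volatile double sink = 0.0;
  std::clock_t start = std::clock();
  for (long i = 0; i < calls; ++i) {
    sink = sink + f(1.0 + (i % 1024) * 1e-3);  // inputs vary within [1, 2.023]
  }
  std::clock_t stop = std::clock();
  return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Dividing time_calls(lib_exp, N) by time_calls(exp256, N) for a large N gives a rough speed-up ratio for one particular build configuration.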

–EOF (Coding For Speed) —

6 Comments

Juan L.

I did some tests of these functions; however, they do not seem to be much faster than the exp function from the library. I tested 70 million calls to each one of the functions and it yielded the following results:

library time: 2.21931
e2 time: 2.06819
e1 time: 1.65665

I can see at most a two times increase in performance. How did you test that it was 300 times faster?

- ACMer
  
  I think I used Code::Blocks with the GCC compiler…
  
Bruno

Benchmark             Time(ns)  CPU(ns)  Iterations
---------------------------------------------------
bench_math_exp/1            95       94     7241354
bench_math_exp/10           95       95     7499946
bench_math_exp/15           94       94     7500027
bench_math_exp/20           94       94     7500027
bench_exp256/1              17       17    40384693
bench_exp256/10             17       17    41176471
bench_exp256/15             17       17    39622567
bench_exp256/20             17       17    40384460
bench_exp1024/1             10       10    72413543
bench_exp1024/10             9        9    65624795
bench_exp1024/15            10       10    74999464
bench_exp1024/20             9        9    72413543

This is exp of 1, 10, 15 and 20, using Google Benchmark and -O3. It is odd that exp1024 is faster than exp256; if I use -O0, exp1024 is slower, and I don’t know why this is happening.

- Bruno
  
  well, it’s broken the whitespaces .-.
  
Antoni Gual

A good idea for an embedded system where a math library is not available. Not practical on a present-day PC.

Chris Harvey

Thank you! I was looking for a way to speed up the expensive exp(-x^2) operation in the non-local means denoising algorithm for image processing, where x represents patch similarity. This approximation worked very well. If two patches are similar, then x is small, and the approximation is very accurate. If x is larger than some threshold, then the patches are dissimilar, and so x is large and exp(-x^2) can be approximated as zero directly. (In fact, this latter approximation is necessary as otherwise the contributions from dissimilar patches build up because the approximation is poorer and there are significant artefacts.)

I tested this with n=1024 on a 3715*2527 image with patch 7*7 and search window 21*21 and this reduced the computation time from 307.87 s to 15.0171 s when using a naive NLM calculation, so that’s a 20.5 times speed-up. I didn’t get the 300+ times speed-up you mentioned (I have more going on than just exp calculations), but I’m still pleased with this.

If you have suggestions for an even faster calculation, I am eager to know. (I need to avoid a look-up table as the discretization destroys the data.)
