performance - call function



mickey
28th February 2007, 03:04
void calc(int i) {
    pp = (i + val) * 200 / s;
    .....................
}

// NUMBER may be 1..n (typically from 1000 to 10000, but it can grow beyond 100 000)
for (int j = 0; j < NUMBER; ++j)
    for (int i = 0; i < NUMBER; ++i)
        calc(i);

Hi, my question is: would it be better to unfold the calc function (i.e. paste its body) inside the 2nd for loop? Is there any problem with making that function call so many times?
thanks

camel
28th February 2007, 09:06
If you only use the calc function inside of the loop, try declaring it

static inline void calc(int i) {...}
This way the compiler will check to see if it can inline it into the loop.

The static here is the non-member ("free function") static, which means that the function is only visible within its compilation unit. Together with the inline, it tells the compiler it can go ahead and do a lot of stuff (i.e. optimize) with the function that it might otherwise not dare to do :-)

If calc is a member function, it would not work this way of course. But you can make the function "inline" nonetheless :-)
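
A minimal sketch of what an inlined member function could look like. The class name Calculator and the result() accessor are made up for illustration; only the formula comes from the question. Note that "inline" is only a hint, the compiler still decides for itself.

```cpp
// Hypothetical class for illustration; only the formula is from the question.
class Calculator {
public:
    Calculator(int v, int divisor) : pp(0), val(v), s(divisor) {}

    // Defined inside the class body, so it is implicitly inline.
    void calc(int i) { pp = (i + val) * 200 / s; }

    // Declared here, defined below with an explicit "inline".
    int result() const;

private:
    int pp;
    int val;
    int s;
};

// An out-of-class definition that lives in a header needs the inline keyword,
// otherwise including the header from several .cpp files gives duplicate definitions.
inline int Calculator::result() const { return pp; }
```

Either form makes the function body visible to the compiler at the call site, which is the precondition for inlining it into the loop.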

mickey
28th February 2007, 14:35
Sorry, I don't understand the last part. calc is a member function.....

camel
28th February 2007, 15:01
The keyword "static" has different meanings in C++, depending on where it is used.

Since your calc is a member function, the meaning of static that I would use is not available to you. (Because adding static in front of a member function makes the function a "static member function", not a "function with local visibility to the compilation unit".)
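
To make the two meanings concrete, here is a small sketch. The names scaleLocal and Counter are invented for this example and do not come from the code in this thread.

```cpp
// Meaning 1: static on a free function gives it internal linkage,
// i.e. it is only visible inside this compilation unit.
static int scaleLocal(int i) { return i * 2; }

class Counter {
public:
    // Meaning 2: a static member function belongs to the class, not to an object.
    // It has no "this" pointer and can only use static members directly.
    static int next() { return ++count; }

private:
    static int count;
};

// Static data members need one definition outside the class.
int Counter::count = 0;
```

Calling scaleLocal(21) gives 42, and Counter::next() is called without any Counter object.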

What you still could do is to make calc an inlined function...

But if that does not work for you there is still a (rather hacky) method you could use...(not nice, but better than duplicating your code)


//BEGIN: HEADER FILE
class TestClass
{
public:
    TestClass() : pp(0), val(0), s(200) {}
    void calc(int i);

    void containsLoop();
private:
    int pp;
    int val;
    int s;
};
//END: HEADER FILE



//BEGIN: IMPLEMENTATION FILE

// You need to add all the member variables that you use as pass-by-reference parameters.
// The compiler will check if it makes sense to inline.
static inline void calcStaticImpl(const int i, int &pp, int &val, int &s)
{
    pp = (i + val) * 200 / s;
    //etc
}

void TestClass::calc(int i)
{
    calcStaticImpl(i, pp, val, s);
}

static const int NUMBER = 10000;

void TestClass::containsLoop()
{
    for (int j = 0; j < NUMBER; ++j) {
        for (int i = 0; i < NUMBER; ++i) {
            calcStaticImpl(i, pp, val, s);
        }
    }
}

int main(int argc, char* argv[])
{
    TestClass test;

    test.calc(1);

    test.containsLoop();

    return 0;
}


But check first if that function call is really a problem for you. (Profilers are your friends ;-)

wysota
28th February 2007, 15:17
You can always convert the method to a macro.

camel
28th February 2007, 15:27
"You can always convert the method to a macro."

Yeah...but I like static inline better...it makes finding errors much much easier :-)

And besides...according to the GCC manual, "An Inline Function is As Fast As a Macro" (http://gcc.gnu.org/onlinedocs/gcc/Inline.html)

And since this is one of the easier optimizations...I would imagine this is true for most compilers :-)
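
One example of the kind of error I mean, as a small sketch (SQUARE_MACRO, squareInline and nextValue are made-up names for illustration): a macro pastes its argument text in every place it is used, so an argument with side effects runs more than once, while an inline function evaluates it exactly once.

```cpp
// Macro version: the argument text is pasted in twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Inline function version: the argument is evaluated once, with type checking.
static inline int squareInline(int x) { return x * x; }

// Helper with a visible side effect: it counts how often it is called.
static int callCount = 0;
static int nextValue() { ++callCount; return 3; }

// Returns how many times nextValue() ran when passed to the macro.
int macroCalls()
{
    callCount = 0;
    int r = SQUARE_MACRO(nextValue());  // expands to ((nextValue()) * (nextValue()))
    (void)r;
    return callCount;
}

// Returns how many times nextValue() ran when passed to the inline function.
int inlineCalls()
{
    callCount = 0;
    int r = squareInline(nextValue());
    (void)r;
    return callCount;
}
```

Here macroCalls() reports two calls and inlineCalls() reports one. With a function you also get proper error messages and can step into it in a debugger.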

mickey
28th February 2007, 18:29
"But check first if that function call is really a problem for you."

my question was this!!

camel
28th February 2007, 18:43
"my question was this!!"

Sure,

and it is always a good idea to allow for inlining (hence the tips).

It is almost never a good idea to duplicate code (which is what I gathered you wanted to do).


But I do not consider the "static inline" thing really nice code, so you should first check whether the call itself is a problem. That is nothing we can answer without more information.

Is the small calculation you have in there the only thing? => the overhead of the function call is probably noticeable...good idea to work towards inlining.

Do you perhaps access a database? => The function call itself isn't your problem in this case; trying to have the compiler inline the function wouldn't do you any good, besides making your code uglier.

Yes, those are two extremes, but both would be possible from what I have seen...


Take a look at a profiler of your choice and make an informed decision :-)

camel
28th February 2007, 19:29
By the way, in the case of the little test program I just posted, the decision would be easy (I made some small modifications):

Do it, but only if you like the coding style, because it does not matter that much... 6.5% speed up (when you only have this one formula)... or to put it into perspective:


10000000000 iterations in callLoop() in 245 seconds
10000000000 iterations in inlineLoop() in 229 seconds

This was done with an optimized build, i.e. "-O2 -march=opteron"


10 billion function calls will cost you. But whether that is enough to warrant uglier code at this point...I do not know...remember, the more work you do inside of the function, the less the relative savings will be...

For a pure debug build this looks as follows:


10000000000 iterations in callLoop() in 441 seconds
10000000000 iterations in inlineLoop() in 353 seconds

As you see, here the difference is much more pronounced...but recompiling optimized is probably still a better option in this case ;-)
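
For anyone who wants to try something similar: the following is only a rough sketch of how such a comparison could be set up, not the program that produced the numbers above. It assumes a C++11 compiler for std::chrono, and it uses the GCC/Clang noinline attribute to force a real call (on other compilers NO_INLINE expands to nothing, so both loops may end up identical). The helper secondsFor and the checksum are additions for illustration.

```cpp
#include <chrono>

// GCC/Clang-specific way to forbid inlining; an assumption of this sketch.
#if defined(__GNUC__)
#define NO_INLINE __attribute__((noinline))
#else
#define NO_INLINE
#endif

// Same formula as in the thread, once as a forced call and once inlinable.
NO_INLINE int calcCall(int i, int val, int s) { return (i + val) * 200 / s; }
static inline int calcInline(int i, int val, int s) { return (i + val) * 200 / s; }

// Each loop returns a checksum so the compiler cannot drop the work entirely.
long long callLoop(int n, int val, int s)
{
    long long sum = 0;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            sum += calcCall(i, val, s);
    return sum;
}

long long inlineLoop(int n, int val, int s)
{
    long long sum = 0;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            sum += calcInline(i, val, s);
    return sum;
}

// Runs loop(n, 1, 200), stores its checksum and returns the elapsed seconds.
template <typename F>
double secondsFor(F loop, int n, long long &checksum)
{
    auto start = std::chrono::steady_clock::now();
    checksum = loop(n, 1, 200);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Call secondsFor(callLoop, n, sum) and secondsFor(inlineLoop, n, sum) with the same n, check that both checksums agree, and compare the times. An optimizer may still transform the inlinable loop heavily, so treat the result as a rough indication and confirm with a profiler on the real code.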