PDA

View Full Version : Qt5.7/C++ - Playing with 'auto' keyword and chrono - loop problem?



jimbo
4th January 2017, 12:00
Hello,

Playing with the 'auto' keyword.
I am trying to do a test on speed with vectors against arrays.
As you can see using a vector works fine, using an array does'nt.
Arrays always gives a time of zero.
Can some one please point out my error?

Thanks




auto v = {10, 1, 2, 3, 4, 5};

auto start_time3 = std::chrono::high_resolution_clock::now();
for (int s = 0; s < 250000; s++)
f3 (2147483647, v);
auto end_time3 = std::chrono::high_resolution_clock::now();

auto time3 = end_time3 - start_time3;
std::cout << "f3 took " << std::chrono::duration_cast<std::chrono::nanoseconds>(time3).count()

//f3 took 46879000 to run.

void MainWindow::f3(int n, std::vector<int> v) {
int k;
for (int i = 0; i < n; i++)
{
for (int i = 0; i <= v.size(); i++)
{
//k = v.at(i); //ok both of these are similar times
k = v[i]; //ok
//qDebug() << k << i;
}
}
}



int a[] = {10, 1, 2, 3, 4, 5};

auto start_time4 = std::chrono::high_resolution_clock::now();
for (int s = 0; s < 250000; s++)
f4 (2147483647, a, (sizeof(a)/sizeof(a[0])));
auto end_time4 = std::chrono::high_resolution_clock::now();

auto time4 = end_time4 - start_time4;
std::cout << "f4 took " << std::chrono::duration_cast<std::chrono::nanoseconds>(time4).count()

//f4 took 0 to run.

void MainWindow::f4(int n, int arr[], int size) {
int k;
for (int i = 0; i < n; i++)
{
for (int i = 0; i <= size - 1; i++)
{
k = arr[i]; //ok
//qDebug() << k << i;
}
}
}

anda_skoa
4th January 2017, 15:18
Does it also give you "0" when you do the qDebug in side the loop?
And the qDebug() output is printed every time?

Cheers,
_

jimbo
4th January 2017, 18:02
Hello anda_skoa,

Thanks for your reply.
The debug output prints every time.
I changed some of the loop counts and it ran as expected.
I guess one/some of the 'chrono' internal counters was/were over-flowing.
I'll have a play with other timer periods.

Regards

anda_skoa
4th January 2017, 18:24
I changed some of the loop counts and it ran as expected.
I guess one/some of the 'chrono' internal counters was/were over-flowing.


Ah, good.

One thing I always check when such functions apparently don't need any time to execute is that they have some result or side effect, e.g. like the qDebug().
Otherwise one can't be sure that the compiler isn't detecting that it does nothing and just optimizes it away.

Cheers,
_

jimbo
5th January 2017, 10:59
Hello,

I've found a way to turn off optimisation for a function.
This seems to work, the time output shows.


#pragma GCC push_options
#pragma GCC optimize ("O0") //capital 0 - zero
void MainWindow::f4(int n, int arr[], int size) {
int k;
for (int j = 0; j < n; j++)
{
for (int i = 0; i <= size - 1; i++)
{
k = arr[i]; //ok
//qDebug() << k << i << j;
}
}
}
#pragma GCC pop_options
Regards

jimbo
6th January 2017, 19:39
Hello,

Some simple speed tests.

Regards


void f0(int n, std::vector<int> v);
void f1(int n, std::vector<int> v);
void f2(int n, std::vector<int> v);
void f3(int n, std::vector<int> v);
void f4(int n, int arr[], int size);
void f5(int n, int arr[], int size);


//**********
double d;
//auto v = {10, 1, 2, 3, 4, 5, 6, 7, 8, 9};
std::vector<int> v = {10, 1, 2, 3, 4, 5, 6, 7, 8, 9};
std::cout << "vector\n";
//**********
auto start_time = std::chrono::high_resolution_clock::now();
//for (int t = 0; t < 100; t++)
for (auto t = 0; t != 100; ++t)
{
for (int s = 0; s < 1000; s++)
f0 (1000, v);
}
auto end_time = std::chrono::high_resolution_clock::now();
auto time = end_time - start_time;
d = std::chrono::duration_cast<std::chrono::microseconds>(time).count();
std::cout << "f0 took " << (d / 100) << " microseconds.\n";
//**********
auto start_time6 = std::chrono::high_resolution_clock::now();
for (int t = 0; t < 100; t++)
//for (auto t = 0; t != 100; ++t)
{
for (int s = 0; s < 1000; s++)
f0 (1000, v);
}
auto end_time6 = std::chrono::high_resolution_clock::now();
auto time6 = end_time6 - start_time6;
d = std::chrono::duration_cast<std::chrono::microseconds>(time6).count();
std::cout << "f6 took " << (d / 100) << " microseconds.\n";
//**********
auto start_time1 = std::chrono::high_resolution_clock::now();
for (int t = 0; t < 100; t++)
{
for (int s = 0; s < 1000; s++)
f1 (1000, v);
}
auto end_time1 = std::chrono::high_resolution_clock::now();
auto time1 = end_time1 - start_time1;
d = std::chrono::duration_cast<std::chrono::microseconds>(time1).count();
std::cout << "f1 took " << (d / 100) << " microseconds.\n";
//**********
auto start_time2 = std::chrono::high_resolution_clock::now();
for (int t = 0; t < 100; t++)
{
for (int s = 0; s < 1000; s++)
f2 (1000, v);
}
auto end_time2 = std::chrono::high_resolution_clock::now();
auto time2 = end_time2 - start_time2;
d = std::chrono::duration_cast<std::chrono::microseconds>(time2).count();
std::cout << "f2 took " << (d / 100) << " microseconds reverse.\n";
//**********
auto start_time3 = std::chrono::high_resolution_clock::now();
for (int t = 0; t < 100; t++)
{
for (int s = 0; s < 1000; s++)
f3 (1000, v);
}
auto end_time3 = std::chrono::high_resolution_clock::now();
auto time3 = end_time3 - start_time3;
d = std::chrono::duration_cast<std::chrono::microseconds>(time3).count();
std::cout << "f3 took " << (d / 100)
<< " microseconds.\n";
//**********
int a[] = {10, 1, 2, 3, 4, 5, 6, 7, 8, 9};
std::cout << "\narray\n";
//**********
auto start_time4 = std::chrono::high_resolution_clock::now();
for (int t = 0; t < 100; t++)
{
for (int s = 0; s < 1000; s++)
f4 (1000, a, (sizeof(a)/sizeof(a[0])));
}
auto end_time4 = std::chrono::high_resolution_clock::now();
auto time4 = end_time4 - start_time4;
d = std::chrono::duration_cast<std::chrono::microseconds>(time4).count();
std::cout << "f4 took "
<< (d / 100)
<< " microseconds.\n";
//**********
auto start_time5 = std::chrono::high_resolution_clock::now();
for (int t = 0; t < 100; t++)
{
for (int s = 0; s < 1000; s++)
f5 (1000, a, (sizeof(a)/sizeof(a[0])));
}
auto end_time5 = std::chrono::high_resolution_clock::now();
auto time5 = end_time5 - start_time5;
d = std::chrono::duration_cast<std::chrono::microseconds>(time5).count();
std::cout << "f5 took "
<< (d / 100)
<< " microseconds reverse.\n";
//**********
std::cout << "\n";


#pragma GCC push_options
#pragma GCC optimize ("O0")
void MainWindow::f0(int n, std::vector<int> v) {
int k;
for (int i = 0; i < n; i++)
{
for (const int& i : v) // access by const reference
{
k = i;
}
}
}
#pragma GCC pop_options

#pragma GCC push_options
#pragma GCC optimize ("O0")
void MainWindow::f1(int n, std::vector<int> v) {
int k;
for (int i = 0; i < n; i++)
{
for (auto i : v) // access by value, the type of i is int
{
k = i;
}
}
}
#pragma GCC pop_options

#pragma GCC push_options
#pragma GCC optimize ("O0")
void MainWindow::f2(int n, std::vector<int> v) {
int k;
for (int i = 0; i < n; i++)
{
for (auto i = v.end() - 1; i >= v.begin(); --i) //reverse
{
k = *i;
//qDebug() << k << *i;
}
}
}
#pragma GCC pop_options

#pragma GCC push_options
#pragma GCC optimize ("O0")
void MainWindow::f3(int n, std::vector<int> v) {
int k;
for (int i = 0; i < n; i++)
{
for (int i = 0; i <= v.size(); i++) //conventional style
{
//k = v.at(i); //ok both of these are similar times
k = v[i]; //ok
//qDebug() << k << i;
}
}
}
#pragma GCC pop_options

#pragma GCC push_options
#pragma GCC optimize ("O0")
void MainWindow::f4(int n, int arr[], int size) {
int k;
for (int i = 0; i < n; i++)
{
for (int i = 0; i <= size - 1; i++)
{
k = arr[i]; //ok
//qDebug() << k << i << j;
}
}
}
#pragma GCC pop_options

#pragma GCC push_options
#pragma GCC optimize ("O0")
void MainWindow::f5(int n, int arr[], int size) {
int k;
for (auto i = 0; i != n; ++i)
{
for (auto i = size - 1; i >= 0; --i) //reverse
{
k = arr[i]; //ok
//qDebug() << arr[i];
}
}
}
#pragma GCC pop_options


vector
f0 took 171214 microseconds.
f6 took 171078 microseconds.
f1 took 147593 microseconds.
f2 took 154707 microseconds reverse.
f3 took 87834.4 microseconds.

array
f4 took 71596.9 microseconds.
f5 took 66486.2 microseconds reverse.

d_stranz
6th January 2017, 23:36
Interesting. You should also try this using std:: vector<int>:: const_iterator() and std:: vector<int>:: iterator() (added whitespace to avoid the "smiley effect"). In our testing, iterator-based access was up to an order of magnitude slower than pointer or operator[] based access in MSVC++.

jimbo
10th January 2017, 12:08
Hello d_stranz,

I don't quite understand the use of 'const'.
However I ran test 'f7' along with those already posted.

Regards


auto start_time7 = std::chrono::high_resolution_clock::now();
for (int t = 0; t < 100; t++)
{
for (int s = 0; s < 1000; s++)
f7 (1000, v);
}
auto end_time7 = std::chrono::high_resolution_clock::now();
auto time7 = end_time7 - start_time7;
d = std::chrono::duration_cast<std::chrono::microseconds>(time7).count();
std::cout << "f7 took " << (d / 100) << " microseconds.\n";

#pragma GCC push_options
#pragma GCC optimize ("O0")
void MainWindow::f7(int n, std::vector<int> v) {
int k;
for (int i = 0; i < n; i++)
{
for (std::vector<int>::iterator it = v.begin() ; it != v.end(); ++it)
{
k = *it;
//qDebug() << *it << k;
}
}
}
#pragma GCC pop_options

vector
f0 took 168825 microseconds.
f6 took 167124 microseconds.
f1 took 136416 microseconds.
f2 took 155180 microseconds reverse.
f3 took 89297.5 microseconds.
f7 took 128919 microseconds. //***

array
f4 took 71238 microseconds.
f5 took 66366.4 microseconds reverse.

d_stranz
10th January 2017, 15:47
Even more interesting. The Microsoft C++ implementation of STL iterators is definitely slower than the GCC version.

The const_iterator returns a const reference to the value stored in the vector. This guarantees to the compiler that the caller will not modify the value and thus might allow it to generate more optimized (i.e. faster) code than using the non-const iterator.

jimbo
10th January 2017, 16:47
Hello d_stranz,

OS - Windows 10 - 32 bit

g++ -v
gcc version 5.3.0 (i686-posix-dwarf-rev0, Built by MinGW-W64 project)

Regards

d_stranz
11th January 2017, 03:54
When I get some free time I will run your tests using MSVC++ in Visual Studio 2013 / Windows 10 x64 and post my findings. Good work here.

jimbo
11th January 2017, 12:54
Hello,

OS - Windows 7 - 32 Bit
Same machine - dual boot system

g++ -v
gcc version 5.3.0 (i686-posix-dwarf-rev0, Built by MinGW-W64 project)


vector
f0 took 150927 microseconds.
f6 took 158602 microseconds.
f1 took 135515 microseconds.
f2 took 160152 microseconds reverse.
f3 took 85401.3 microseconds.
f7 took 137652 microseconds.

array
f4 took 70551.1 microseconds.
f5 took 65513.5 microseconds reverse.

Regards

d_stranz
11th January 2017, 16:06
Can you please make a ZIP file of your current code and attach it to a reply? There is code scattered in various places in this post, and I would like to be sure to use the same thing you are using.

jimbo
11th January 2017, 18:47
Hello,

Enclosed 'Tests.txt'

Regards