Memory aligned QVector().data()



sto
17th November 2010, 16:07
Hello there. I'm using some SSE optimizations on a large QVector and I need the QVector().data() pointer to be 16-byte memory aligned. My code goes something like this:


QVector<float> *vec = new QVector<float>();

if (((uintptr_t)vec->data() & 0x0F) == 0) {
    // great, we got 16-byte aligned data
}
else {
    // try again or use backup code
}

My code works, but whether I can use the optimizations is completely random, and so far I couldn't find a way to make QVector return a 16-byte aligned data pointer. Any ideas?
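The alignment test above can be pulled into a small helper. This is only a minimal sketch using the standard library; the name isAligned is hypothetical and not a Qt function, and the alignment is assumed to be a power of two (16 for SSE).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when `ptr` is a multiple of `alignment` bytes.
// `alignment` must be a non-zero power of two (16 for aligned SSE access).
inline bool isAligned(const void *ptr, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}
```

With this, the check in the original snippet would read isAligned(vec->data(), 16).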

Timoteo
17th November 2010, 17:19
This isn't really a Qt question but one about fundamental C++, namely operator overloading and, more specifically, operator new.

high_flyer
18th November 2010, 10:22
You might find this interesting.
It's not about QVector, but I think the problem and its solutions are probably similar, if not the same.
http://ompf.org/forum/viewtopic.php?f=11&t=686

sto
18th November 2010, 10:54
@Timoteo - Not so.

On one hand QVector uses the malloc/free functions for the memory allocation of its internal array (which is of type QVectorTypedData<T> -> check "qvector.h") and the way it aligns the memory is specified (as in "it's not your default malloc"; it does some memory alignment, just not the one I want).

On the other hand modifying the Qt source code to force a specific alignment is not an option so I was wondering if there were any other options available instead of using hacks.

So no, it's definitely not a C++ question.

@high_flyer - thank you for the link, unfortunately my problem is directly related to Qt because I want to use QVector's functionality. I dug a bit through Qt's sources and if I find anything worthwhile I'll post it here.

high_flyer
18th November 2010, 11:06
Quote: "On the other hand modifying the Qt source code to force a specific alignment is not an option so I was wondering if there were any other options available instead of using hacks."
You can subclass QVector.


Quote: "unfortunately my problem is directly related to Qt because I want to use QVector's functionality."
Yes, but the problem in the link and yours are the same.
They have a problem with how std::vector aligns (or rather does not align) its data; yours is with QVector.

So I think Timoteo is not that wrong - you need to add code to align your data in QVector the way you need it.
I agree that overloading new would be too much, but a special "allocAlignedQVector()" function, or subclassing QVector as QAlignedQVector where you implement this, would be in order.

sto
18th November 2010, 12:36
The problem is not the same. For std::vector you can create custom memory allocators that do the job; I am not aware of such functionality for QVector. Unfortunately the problem is particular to Qt.

As far as I can see, inheriting from QVector does not work; the data is stored in a QVectorData object (which also has the functions for all memory allocation) that is, of course, private, so I can't touch it.

Overloading new doesn't seem to solve anything because neither QVector nor QVectorData use it internally (they do use in-place new but that's useless, the memory is already malloc-ed).

Timoteo
18th November 2010, 21:42
Yeah, I was referring to defining an operator new for your contained types, but you say that QVector uses C-style allocations internally. Is QVector::fromStdVector feasible? I haven't looked at the source for it, so you may end up with the same situation.

ChrisW67
18th November 2010, 23:45
QVector::fromStdVector() copies the data from the standard vector, so alignment will not be preserved.

I'd be inclined to use std::vector with a custom allocator myself. Something along these lines: http://stackoverflow.com/questions/2340311/posix-memalign-for-stdvector
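A rough sketch of what such an allocator could look like, assuming a C++11 compiler. AlignedAllocator is a hypothetical name, not part of Qt or the standard library, and instead of calling posix_memalign it over-allocates with std::malloc and rounds the address up, which is one portable way to get the same effect.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <new>
#include <vector>

// Hypothetical minimal allocator: over-allocates with std::malloc and hands
// out the first address in the block that is a multiple of Alignment. The
// pointer returned by malloc is stored just below that address so that
// deallocate() can pass it back to std::free.
template <typename T, std::size_t Alignment>
struct AlignedAllocator
{
    static_assert((Alignment & (Alignment - 1)) == 0, "Alignment must be a power of two");
    static_assert(Alignment >= alignof(T), "Alignment is too small for T");
    static_assert(Alignment >= alignof(void *), "Alignment is too small for the bookkeeping slot");

    typedef T value_type;

    // Needed because of the non-type template parameter.
    template <typename U>
    struct rebind { typedef AlignedAllocator<U, Alignment> other; };

    AlignedAllocator() noexcept {}
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment> &) noexcept {}

    T *allocate(std::size_t n)
    {
        const std::size_t extra = Alignment + sizeof(void *);
        if (n > (std::numeric_limits<std::size_t>::max() - extra) / sizeof(T))
            throw std::bad_alloc();

        void *raw = std::malloc(n * sizeof(T) + extra);
        if (!raw)
            throw std::bad_alloc();

        // Leave room for the bookkeeping slot, then round up to Alignment.
        const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
        const std::uintptr_t aligned =
            (start + Alignment - 1) & ~static_cast<std::uintptr_t>(Alignment - 1);

        reinterpret_cast<void **>(aligned)[-1] = raw;
        return reinterpret_cast<T *>(aligned);
    }

    void deallocate(T *p, std::size_t) noexcept
    {
        if (p)
            std::free(reinterpret_cast<void **>(p)[-1]);
    }
};

template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A> &, const AlignedAllocator<U, A> &) noexcept
{
    return true;
}

template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A> &, const AlignedAllocator<U, A> &) noexcept
{
    return false;
}
```

A std::vector<float, AlignedAllocator<float, 16> > should then keep data() on a 16-byte boundary across reallocations, at the cost of up to Alignment + sizeof(void *) extra bytes per allocation.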

sto
19th November 2010, 09:31
Just to clarify - I could use std::vector and everything would work fine but for the sake of uniformity (using Qt throughout the whole project) and science I wanted to know if it could be done in Qt. Two things:

1) I want to use a QVector<float> but that only creates a memory alignment of sizeof(float), which in my case is 4 (SSE needs 16 to work at its best). I could use a hack such as


#include <xmmintrin.h> // __m128

typedef union {
    __m128 stub; // sizeof(__m128) == 16 bytes (128 bits)
    float f[4];
} float4;

QVector<float4> vec;

Q_ASSERT(((uintptr_t)vec.data() & 0xF) == 0); // the low four bits are always zero

but this creates extra complexity because I don't always want to store a number of values that's a multiple of 4.
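To give an idea of the extra bookkeeping, here is a minimal sketch of the index arithmetic such a block-based vector would need. All function names are hypothetical, and the logical element count has to be tracked separately from the number of float4 blocks.

```cpp
#include <cstddef>

// Number of 4-float blocks needed to hold `count` floats (rounds up).
inline std::size_t blocksFor(std::size_t count) { return (count + 3) / 4; }

// Block that holds logical element i, and its lane inside that block.
inline std::size_t blockOf(std::size_t i) { return i / 4; }
inline std::size_t laneOf(std::size_t i) { return i % 4; }

// Unused lanes in the last block; these must be ignored (or zeroed)
// by any code that processes whole blocks.
inline std::size_t paddingFor(std::size_t count) { return blocksFor(count) * 4 - count; }
```

Logical element i would then live at vec[blockOf(i)].f[laneOf(i)], and storing 5 floats would take 2 blocks with 3 padding lanes.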


2) Here's a part of the QVector::append() function.


template <typename T>
void QVector<T>::append(const T &t)
{
    ...
    if (QTypeInfo<T>::isComplex)
        new (p->array + d->size) T(copy);
    ...
}

It uses in-place new, specifying the memory address QVector expects for my value to have. I could use some hacks here as well (use a custom type, overload in-place new, etc) but this is much more complex than simply using another container.

I guess there's no easy way to make Qt use custom memory alignment but if I find anything I'll let you know. Thanks for the help.

high_flyer
19th November 2010, 09:41
Just a comment:
The Wikipedia (http://en.wikipedia.org/wiki/Data_structure_alignment) definition:

"A memory address a is said to be n-byte aligned when n is a power of two and a is a multiple of n bytes."

conflicts with your statement:

"I don't always want to store a number of values that's a multiple of 4."

sto
19th November 2010, 10:30
Quote: "I don't always want to store a number of values [inside my vector] that's a multiple of 4."
There, that should make things clearer. My statement is related to the hack where each element of the vector stores 4 floating point values instead of just one so that QVector gives the proper data alignment.

If it's still not clear I'll quote this as well:

Quote: "...I need the QVector().data() pointer to be 16-byte memory aligned"
QVector allocates a contiguous memory region for its values, and the pointer resulting from that allocation needs to be 16-byte aligned. The memory alignment has nothing to do with the number of elements inside the vector.

I simply want QVector<float> to return a 16-byte aligned data pointer but it won't because sizeof(float) == 4 (in my case) and not 16. Actually things get more complicated because Qt seems to use the Q_ALIGNOF macro to determine the alignment. So, for example, regular 32-bit code:



struct test0 { float f;};
struct test1 { float f; double d;};
struct test2 { float f[4]; };
union test3 { float f[4]; __m128 spfp; };

qDebug() << sizeof(test0) << Q_ALIGNOF(test0);
qDebug() << sizeof(test1) << Q_ALIGNOF(test1);
qDebug() << sizeof(test2) << Q_ALIGNOF(test2);
qDebug() << sizeof(test3) << Q_ALIGNOF(test3);

For Visual Studio, Q_ALIGNOF defaults to the __alignof operator; I assume the other compilers behave the same way. The first value in each qDebug line is the size of the structure/union and the second is the alignment Qt decides to use. The output is as follows:



4 4    // on my machine sizeof(float) == 4 and sizeof(double) == 8

16 8   // sizeof is 16 (4 + 4 bytes of padding + 8; Visual C++ places the double on an 8-byte boundary)
       // Q_ALIGNOF returns the alignment of the most strictly aligned member (double)

16 4   // sizeof is 16 (4 floats)
       // Q_ALIGNOF returns only the alignment of the element type (float), which is 4

16 16  // sizeof is 16 and Q_ALIGNOF sees the __m128 and returns 16; unfortunately now I'm forced to store 4 values at once (even if I only need 1, 2 or 3, so a QVector using this type would always store a multiple of 4 floating point values)
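The same effect can be checked at compile time with the standard C++11 alignof operator, used here in place of Q_ALIGNOF. This is only a sketch: the type names are hypothetical, and Test3 uses alignas(16) as a stand-in for the __m128 union so that the snippet does not depend on SSE headers. The assertions hold on the mainstream ABIs I am aware of, but the standard does not strictly guarantee them.

```cpp
#include <cstddef>

struct Test0 { float f; };
struct Test2 { float f[4]; };

// Hypothetical stand-in for the float/__m128 union: alignas(16) requests
// 16-byte alignment explicitly instead of inheriting it from __m128.
struct alignas(16) Test3 { float f[4]; };

// A struct is only as strictly aligned as its most strictly aligned member.
static_assert(alignof(Test0) == alignof(float), "single float member");

// An array member does not raise the alignment, whatever its total size.
static_assert(alignof(Test2) == alignof(float), "array of four floats");
static_assert(sizeof(Test2) == 4 * sizeof(float), "no padding expected");

// Only an explicit request (or a member such as __m128) raises it to 16.
static_assert(alignof(Test3) == 16, "explicitly over-aligned");
static_assert(sizeof(Test3) % 16 == 0, "size is a multiple of the alignment");
```

So a container that sizes its alignment from the element type will treat four packed floats exactly like a single float unless the type itself asks for more.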