PDA

View Full Version : how to confirm the value of one QList in other QList quickly?



weixj2003ld
7th June 2012, 04:32
Now,I have two group values(the type of it are int),and each group has many values,for example,100000.
I put them in two QList seperately,now I want to know each value in QList A exists in QList B or not quickly?
How to do?
Could you tell me if there is other method,that is,QList not used?

ChrisW67
7th June 2012, 05:37
You could just brute force search the lists using QList::contains().

If the lists are sorted then you could do something like:


QList<int>::const_iterator a = listA.constBegin();
QList<int>::const_iterator b = listB.constBegin();
while (a != listA.constEnd() && b != listB.constEnd()) {
if (*a > *b)
++b;
else {
if (*a == *b)
qDebug() << *a;
++a;
}
}

If the lists contain no repeats then you could merge the lists, sort the compound list, and look for duplicates.


You could use a QHash<int, int> or QMap<int, int> for list B (value is just 1 or the number of instances of the key) to speed up lookup.
You could use an SQL database and query.
Lots of things affect the possibilities. Do you know anything about the values? Are they in order and does order matter? Are there duplicate values? What level of duplication? What form is the data in before you put it in the lists? Are you interested only in existence or number of instances?

weixj2003ld
8th June 2012, 01:42
the two group values are both in order.Could you tell me the more efficient method?

ChrisW67
8th June 2012, 02:18
There is no perfect method... just one that works well enough.

Try the code I posted and see if it is fast enough. On my machine it gets through two sorted 100000 entry lists of random numbers , building a third list, in 1 or 2 milliseconds (and I think it even gives the correct answer for some definition of correct).

amleto
8th June 2012, 12:45
http://www.cplusplus.com/reference/algorithm/set_difference/

wysota
8th June 2012, 20:48
Try the code I posted and see if it is fast enough.

I don't think you can get faster than O(n+m) complexity which this algorithm has. At least not without knowing characteristics of the data and focusing only on SISD approach (you could make it faster with SIMD).