QRegExp lastIndexOf always minimal?



skyphyr
11th December 2006, 19:16
Hi All,

I think this may be a bug, but perhaps it's something I've missed.

It seems that when you use lastIndexOf, the match won't be greedy. I think that's because of the way it works: it checks for a match at each index, working its way backwards, and stops at the first index that matches.

So say I've got a filename:
coolfilename02.153.ext
I want to split this into
"coolfilename02." , "153", ".ext"
So I used the expression ([0-9]+)(\..*)
running from the last index of the expression [0-9]+\.

The problem is that as soon as the backward search reaches "3.ext" it reports success, whereas to be greedy it should keep going until it reaches "153.ext".

Any suggestions for a clean way to fix it, or should I just dig into the implementation of QRegExp and submit a fix upstream? I know I could just implement the backward searching myself, but doing it in such an ugly way seems like a bad choice compared with fixing it in Qt. (Assuming it's broken and I haven't just missed something.)

Cheers,

Alan.

danadam
11th December 2006, 20:22
You could search for the last index of \\.[0-9]+\\. and add 1 to the result:
#include <QString>
#include <QRegExp>
#include <QtDebug>

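int main() {
    QString filename = "terefere.123.ext";
    // The backward search stops at the first index that matches: the "3" in "3."
    qDebug() << ( filename.lastIndexOf(QRegExp("[0-9]+\\.")) );
    // Requiring a leading dot forces the match to cover the whole number;
    // add 1 to skip that dot.
    qDebug() << ( filename.lastIndexOf(QRegExp("\\.[0-9]+\\.")) + 1 );
}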
As a result you get:

[danadam@lappy]$ ./test
11
9

wysota
11th December 2006, 20:30
Hmm... this is indeed strange, but I can believe the behaviour is intentional.

You may obtain the result you want using this code:

#include <QString>
#include <QRegExp>

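int main(int argc, char **argv){
    QString string = "coolfilename02.153.ext";
    // The trailing $ anchors the match to the end of the string, and
    // [^\\.]* keeps the second group from spanning more than one dot.
    QRegExp rx("([0-9]+)(\\.[^\\.]*)$");
    rx.setMinimal(true);
    rx.indexIn(string);
    // Pass the text through "%s" so it is never treated as a format string.
    qDebug("%s", qPrintable(QString("0: %1 1: %2 2: %3").arg(rx.cap(0)).arg(rx.cap(1)).arg(rx.cap(2))));
    return 0;
}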

skyphyr
12th December 2006, 11:06
Thanks guys, it seems I've got a whole bunch of solutions now :-)

Another one hit me last night while trying to fall asleep (why does code not working do that to me?):

[^0-9][0-9]+\.

Thanks again,

Alan.