What is faster, QRegExp, or QByteArray::indexOf



_Stefan
15th September 2010, 15:48
Hello all,

I am trying to do some stream parsing in which I would like to parse HTTP headers, but I guess this question is more general.

The parsing is time-critical, so I want the fastest way to do it. String comparison is always a nasty thing. What would be, for example, the fastest way to find the end of the HTTP header section (the empty CRLF line)?

I guess I could do it by searching for the CRLF in the part of the data stream I just read, or I could use a regular expression. QRegExp probably uses string comparison, whereas QByteArray may compare byte values, which would be faster?

Or any other ideas? Has someone encountered this problem before who can give me a heads-up on what the best solution would be here?

Thanks

wysota
15th September 2010, 16:43
I would expect them to be more or less equally fast in this particular situation.

SixDegrees
15th September 2010, 16:54
Impossible to say, without knowing the implementation details of the regexp package you're using. The only way to be sure would be testing in a profiler.

Note that there are a number of string-searching algorithms that are faster than brute-force linear search (look up the Boyer-Moore algorithm, for example). Good regexp packages tend to use such algorithms, but the regexp object itself introduces some overhead.
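For illustration, here is a minimal sketch of a Boyer-Moore search using the C++17 standard library rather than Qt. The function name findWithBoyerMoore is made up for this example, and the -1 return value is only chosen to mirror the indexOf() convention. It is not a claim about what QRegExp or QByteArray do internally.

```cpp
#include <algorithm>
#include <functional>
#include <string_view>

// Hypothetical helper: returns the offset of the first occurrence of
// `needle` in `haystack`, or -1 if it is absent.
long findWithBoyerMoore(std::string_view haystack, std::string_view needle)
{
    if (needle.empty())
        return 0; // an empty pattern matches at the start

    // The searcher builds its skip tables once from the needle; that setup
    // cost only pays off for longer patterns or many repeated searches.
    std::boyer_moore_searcher searcher(needle.begin(), needle.end());

    auto it = std::search(haystack.begin(), haystack.end(), searcher);
    if (it == haystack.end())
        return -1;
    return static_cast<long>(it - haystack.begin());
}
```

For a pattern as short as a CRLF pair, the table setup is likely to cost more than it saves, so this is mainly of interest for longer needles.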

Honestly, I would code up the simplest thing I could think of, then run the whole program through a profiler ONLY IF performance was a demonstrable issue. Only at that point would I consider retooling the string search, and then only if it proved to be a significant bottleneck. Premature optimization is the root of all evil.

wysota
15th September 2010, 17:51
If he's searching for a two-character string, then sophisticated algorithms are useless. The regexp won't bring any benefit here either: compiling the state machine adds overhead that negates any potential gain from using a regular expression. I'd go for the simplest available solution, which is QByteArray::indexOf().
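A minimal sketch of that "simplest available solution", written against the C++17 standard library so it builds without Qt. The name headerSectionEnd is hypothetical. With a QByteArray, the equivalent lookup would be roughly buffer.indexOf("\r\n\r\n").

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns the offset of the first byte after the blank
// line that ends the HTTP header section, or -1 if the terminator has not
// arrived in the buffer yet.
long headerSectionEnd(std::string_view buffer)
{
    constexpr std::string_view terminator = "\r\n\r\n";

    const std::size_t pos = buffer.find(terminator);
    if (pos == std::string_view::npos)
        return -1; // keep reading from the stream and try again

    return static_cast<long>(pos + terminator.size());
}
```

When data arrives in chunks, the terminator can straddle two reads, so the search has to run over the accumulated buffer (or at least restart three bytes before the previous end), not over the newest chunk alone.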

_Stefan
15th September 2010, 18:03
I created a little test application, to see the difference between searching methods.

I am catching the stream data from a QIODevice, so that gives me a QByteArray.
I tried four ways of searching with QByteArray::indexOf, passing a QRegExp, a QString, a QByteArray and a const char*.

Here are the results for 100,000 simple searches for CRLFCRLF in a QByteArray (times in ms):
testRegExp: 812
testString: 266
testByte: 204
testChar: 140
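The test application itself was not posted. A measurement of this kind could be sketched roughly as follows, using std::chrono and std::string_view in place of the Qt classes. BenchResult and timeSearches are made-up names, and the numbers it produces are not comparable with the ones above.

```cpp
#include <chrono>
#include <cstddef>
#include <string_view>

// Hypothetical result type for this sketch.
struct BenchResult {
    long long elapsedMs; // wall-clock time for all iterations
    std::size_t hits;    // successful searches, so the loop has a visible result
};

// Runs the same search `iterations` times and reports the total time.
BenchResult timeSearches(std::string_view haystack, std::string_view needle,
                         int iterations)
{
    using Clock = std::chrono::steady_clock;

    std::size_t hits = 0;
    const auto start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        if (haystack.find(needle) != std::string_view::npos)
            ++hits;
    }
    const auto stop = Clock::now();

    const auto ms =
        std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
    return {static_cast<long long>(ms.count()), hits};
}
```

Each variant under test would be timed with the same harness, the same input and the same build settings. An optimizing compiler may simplify a loop this regular, so results from such a micro-benchmark are best treated as a rough guide.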

So I guess the simplest and fastest way is to just use the const char* overload of indexOf.
I'm guessing that is going to be fast enough.

Thanks for thinking with me ;)
Maybe this test result can help someone else someday!