Hi all,

I was working on a project to parse C++ code, when I encountered strange regex behaviour.

I created a regex to match an if statement and its following 'true' section.
  "\b(if\s*\(([^\(\)]*|\([^\(\)]*\))*\)\s*)(\{([^\{\}]*|\{[^\{\}]*\})*\}|[^;\{\}]*;)"

Let me explain what's going on here: the regex searches for an occurrence of 'if' followed by 0 or more whitespace characters, followed by a '(' and its matching ')' (allowing one level of nested parentheses), followed by 0 or more whitespace characters. All of this is stored in the first capture group.
It then looks for a '{' and its matching '}' (again allowing one level of nesting); if there is no such block, it matches up to and including the first ';' instead.

So when using this regexp, one should be able to find the if condition by checking the first captured text, cap(1).
In practice, however, I have found that I sometimes get a positive match, meaning QRegExp::indexIn() returns a value >= 0, but matchedLength() is in fact 0 and the captured texts are empty. On other occasions everything works as expected.
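
To cross-check whether the pattern itself is at fault, here is a minimal sketch that runs the same expression through std::regex (ECMAScript syntax) instead of QRegExp. This is a different engine, so it does not reproduce the QRegExp behaviour; it only shows what I expect the groups to contain. The names IfMatch and findIf are made up for this sketch, and the zero-length check mirrors the condition described above.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical result type for this sketch (not part of Qt or my parser).
struct IfMatch {
    std::string whole;  // the entire matched if statement
    std::string head;   // group 1: "if", the parenthesised condition, trailing whitespace
    std::string body;   // group 3: a braced block, or a single statement ending in ';'
};

// Hypothetical helper: find the first if statement in `code` using the
// pattern from this post. Returns false when nothing usable was matched.
bool findIf(const std::string& code, IfMatch& out) {
    // Raw string literal, so the backslashes need no extra escaping.
    static const std::regex re(
        R"re(\b(if\s*\(([^\(\)]*|\([^\(\)]*\))*\)\s*)(\{([^\{\}]*|\{[^\{\}]*\})*\}|[^;\{\}]*;))re");
    assert(re.mark_count() == 4);  // the pattern has four capture groups

    std::smatch m;
    if (!std::regex_search(code, m, re)) {
        return false;
    }
    // Every match must at least contain "if", so a zero-length match
    // would point at an engine problem rather than at the input.
    if (m.length(0) == 0) {
        return false;
    }
    out.whole = m.str(0);
    out.head = m.str(1);
    out.body = m.str(3);
    return true;
}
```

For the input `if (x > 0) { y = f(x); }` I would expect head to be `if (x > 0) ` (including the trailing space, since the `\s*` sits inside group 1) and body to be `{ y = f(x); }`. Note that the nested quantifiers can backtrack heavily on input with unbalanced parentheses, so I would only feed this small, well-formed snippets.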

As regular expressions can be hard to get right, I thought I'd post this here first before filing a bug report with Qt.

Please let me know if I have overlooked something!

Thanks in advance!

- Arjan