
Thread: QRegExp matchedLength seems incorrect

  1. #1
    Join Date
    Jul 2009
    Posts
    4
    Qt products
    Qt4
    Platforms
    MacOS X Unix/X11 Windows

    QRegExp matchedLength seems incorrect

    Hi all,

    I was working on a project to parse C++ code, when I encountered strange regex behaviour.

    I created a regex to match an if statement and its following 'true' section.
    Qt Code:
    "\b(if\s*\(([^\(\)]*|\([^\(\)]*\))*\)\s*)(\{([^\{\}]*|\{[^\{\}]*\})*\}|[^;\{\}]*;)"

    Let me explain what's going on here: the regex searches for an occurrence of 'if' followed by zero or more whitespace characters, then a '(' and its matching ')', then zero or more whitespace characters. All of this is stored in the first capture group.
    Then it searches for the first '{' and its matching '}'; if these cannot be found, it searches for the first ';'.

    So when using this regexp, one should be able to find the if condition by checking the first captured text (cap(1)).
    In practice, however, I have found that I sometimes get a positive match, meaning QRegExp::indexIn returns >= 0, while matchedLength() is 0 and the captured texts are empty. On other occasions everything works as expected.
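
    To make the expected behaviour easier to check outside the full program, here is a minimal sketch that runs the same pattern through std::regex instead of QRegExp. That swap is an assumption made so the example builds without Qt; the two engines are not identical, so this only illustrates what the pattern is meant to capture, not how QRegExp behaves. The names IfMatch and findIfStatements are hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical value type: roughly what indexIn(), matchedLength() and
// cap(1) would be expected to report for one successful search.
struct IfMatch {
    std::ptrdiff_t index;   // start of the whole match in the input
    std::ptrdiff_t length;  // length of the whole match
    std::string condition;  // first capture group: "if( ... )" plus trailing whitespace
};

// Collect every non-overlapping match of the if-statement pattern.
// Assumption: std::regex (ECMAScript grammar) stands in for QRegExp here.
std::vector<IfMatch> findIfStatements(const std::string& text) {
    // Raw string literal, so the backslashes reach the regex engine unchanged.
    static const std::regex pattern(
        R"re(\b(if\s*\(([^\(\)]*|\([^\(\)]*\))*\)\s*)(\{([^\{\}]*|\{[^\{\}]*\})*\}|[^;\{\}]*;))re");

    std::vector<IfMatch> found;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        // Copy position, length and capture text out as plain values.
        found.push_back({m[0].first - text.begin(), m[0].length(), m[1].str()});
    }
    return found;
}
```

    For an input such as "a a() { a; if( b ) b; if( c ) { c; } d; }" this returns two entries: "if( b ) b;" at index 11 with length 10, and "if( c ) { c; }" at index 22 with length 14. In both cases the first capture is non-empty ("if( b ) " and "if( c ) "), which is the result a positive match should always come with.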

    As regular expressions can be hard to get right, I thought I'd post this here first before filing a bug report with Qt.

    Please let me know if I have overlooked something!

    Thanks in advance!

    - Arjan

  2. #2
    Join Date
    Jul 2008
    Location
    Germany
    Posts
    503
    Thanks
    11
    Thanked 76 Times in 74 Posts
    Qt products
    Qt4 Qt5
    Platforms
    Unix/X11 Windows

    Re: QRegExp matchedLength seems incorrect

    Hi, there is a hint in Qt docs:
    The C++ compiler transforms backslashes in strings, so to include a \ in a regexp, you will need to enter it twice, i.e. \\. To match the backslash character itself, you will need four: \\\\.
    Or did you just omit the double backslashes for better readability?

    Ginsengelf
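
    As a small illustration of the escaping rule quoted above, the sketch below writes the start of the if pattern twice: once with doubled backslashes and once as a C++11 raw string literal. Raw string literals are an assumption about the compiler in use (they need C++11 or later); the function names are hypothetical.

```cpp
#include <string>

// Doubled backslashes: the compiler turns each "\\" into a single '\',
// so the regex engine receives \bif\s*\(
std::string ifPrefixEscaped() {
    return "\\bif\\s*\\(";
}

// The same pattern as a raw string literal (C++11 or later): no doubling needed.
std::string ifPrefixRaw() {
    return R"re(\bif\s*\()re";
}

// Pattern that matches one literal backslash: four backslashes in the
// source become two characters in the string, which the regex engine
// reads as an escaped backslash.
std::string literalBackslashPattern() {
    return "\\\\";
}
```

    Both ifPrefixEscaped() and ifPrefixRaw() yield the same nine-character string, and literalBackslashPattern() yields a two-character string.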

  3. #3

    Re: QRegExp matchedLength seems incorrect

    Originally posted by Ginsengelf:
        Hi, there is a hint in Qt docs:
        Or did you just omit the double backslashes for better readability?

    Yup, I'll spare you the 'real' string.

    As posted, the regexp does work; it's just that in some instances it does not, and then it behaves as described in the first post. Any ideas on that?

  4. #4

    Re: QRegExp matchedLength seems incorrect

    Could you post an example string where the expression does not work properly?

  5. #5

    Re: QRegExp matchedLength seems incorrect

    I've been testing using the string "a a() { a; if( b ) b; if( c ) { c; } d; }".

    In a standalone test, this works as expected (even though valgrind complains about invalid reads in QRegExp::matchedLength / cap).

    In my program however, the match sometimes fails. I suspect it has something to do with the fact that the string is being looked at recursively, combined with the fact that the QRegExp objects are copied to a QVector.

    I have not been able to create a 'bare' test application in which this fails, so I suspect there is something wrong in the regex engine, which only fails in specific conditions.
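
    One defensive pattern that may help narrow this down is to copy the match position, length and captured texts into plain values right after the search, so that later copies of the regex object or changes to its pattern cannot affect them. The sketch below shows the idea with std::regex as a stand-in for QRegExp (an assumption, so it builds without Qt); it does not diagnose what QRegExp is doing here. MatchSnapshot and searchAndSnapshot are hypothetical names.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical value type that owns everything it reports, so it stays
// valid regardless of what later happens to the regex object.
struct MatchSnapshot {
    bool matched;
    std::ptrdiff_t index;               // -1 when there is no match
    std::ptrdiff_t length;              // 0 when there is no match
    std::vector<std::string> captures;  // captures[0] is the whole match
};

// Run one search and immediately copy the results out of the match object.
MatchSnapshot searchAndSnapshot(const std::regex& re, const std::string& text) {
    std::smatch m;
    if (!std::regex_search(text, m, re)) {
        return {false, -1, 0, {}};
    }
    MatchSnapshot snap{true, m.position(0), m.length(0), {}};
    for (const auto& sub : m) {
        snap.captures.push_back(sub.str());  // owned copy of each capture
    }
    return snap;
}
```

    With this shape, the regex object can be stored in a container, copied, or given a new pattern afterwards, and the snapshot taken earlier is unaffected. If the QRegExp version still reports a match with zero length under the same discipline, that would point more strongly at the engine than at object lifetime in the calling code.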

  6. #6

    Re: QRegExp matchedLength seems incorrect

    I have been able to create a testcase!

    If anyone sees what exactly is going on here, I would like to know.

    You can find the source code here.

    And here's the valgrind log:
    ==3771== Memcheck, a memory error detector.
    ==3771== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
    ==3771== Using LibVEX rev 1884, a library for dynamic binary translation.
    ==3771== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
    ==3771== Using valgrind-3.4.1-Debian, a dynamic binary instrumentation framework.
    ==3771== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
    ==3771== For more details, rerun with: -v
    ==3771==
    ==3771== My PID = 3771, parent PID = 3639. Prog and args are:
    ==3771== ./RegExpTest
    ==3771==
    ==3771== Invalid read of size 4
    ==3771== at 0x40CB52C: QRegExp::matchedLength() const (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x804BF51: RegExpTest::RegExpTest(QString const&, int const&, int const&, RegExpTest::Type const&) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x8049253: main (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== Address 0x4667048 is 1,728 bytes inside a block of size 1,764 free'd
    ==3771== at 0x4025DFA: free (vg_replace_malloc.c:323)
    ==3771== by 0x40D08CC: (within /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x40D0A6B: QRegExp::setPattern(QString const&) (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x804A14E: RegExpTest::findBlocks(int) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x804BF51: RegExpTest::RegExpTest(QString const&, int const&, int const&, RegExpTest::Type const&) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x8049253: main (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771==
    ==3771== Invalid read of size 4
    ==3771== at 0x40D4BA2: QRegExp::capturedTexts() const (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x40D4DD7: QRegExp::cap(int) const (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x40D4E7F: QRegExp::cap(int) (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x804A6F1: RegExpTest::findBlocks(int) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x804BF51: RegExpTest::RegExpTest(QString const&, int const&, int const&, RegExpTest::Type const&) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x8049253: main (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== Address 0x4667048 is 1,728 bytes inside a block of size 1,764 free'd
    ==3771== at 0x4025DFA: free (vg_replace_malloc.c:323)
    ==3771== by 0x40D08CC: (within /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x40D0A6B: QRegExp::setPattern(QString const&) (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x804A14E: RegExpTest::findBlocks(int) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x804BF51: RegExpTest::RegExpTest(QString const&, int const&, int const&, RegExpTest::Type const&) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x8049253: main (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771==
    ==3771== Invalid read of size 4
    ==3771== at 0x40D4BB4: QRegExp::capturedTexts() const (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x40D4DD7: QRegExp::cap(int) const (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x40D4E7F: QRegExp::cap(int) (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x804A6F1: RegExpTest::findBlocks(int) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x804BF51: RegExpTest::RegExpTest(QString const&, int const&, int const&, RegExpTest::Type const&) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x8049253: main (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== Address 0x4667044 is 1,724 bytes inside a block of size 1,764 free'd
    ==3771== at 0x4025DFA: free (vg_replace_malloc.c:323)
    ==3771== by 0x40D08CC: (within /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x40D0A6B: QRegExp::setPattern(QString const&) (in /usr/lib/libQtCore.so.4.5.0)
    ==3771== by 0x804A14E: RegExpTest::findBlocks(int) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x804BF51: RegExpTest::RegExpTest(QString const&, int const&, int const&, RegExpTest::Type const&) (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771== by 0x8049253: main (in /home/arjan/C++/RegExpTest/RegExpTest)
    ==3771==
    ==3771== ERROR SUMMARY: 37 errors from 3 contexts (suppressed: 33 from 1)
    ==3771== malloc/free: in use at exit: 0 bytes in 0 blocks.
    ==3771== malloc/free: 4,856 allocs, 4,856 frees, 849,164 bytes allocated.
    ==3771== For counts of detected errors, rerun with: -v
    ==3771== All heap blocks were freed -- no leaks are possible.
    Last edited by Arjan; 16th July 2009 at 20:03.
