Thread: How do I use QRegExp to split an expression

  1. #1
    Join Date
    Dec 2006
    Posts
    426
    Thanks
    8
    Thanked 18 Times in 17 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11

    Hi,

    I need to split a string such as "Stage1 <= 4.4e-05 || Stage == 1.2 && Comp >= 1.4e+03 || A+e-C > D" to get all the variables in the expression.

    For the example string, the result should be "Stage1", "4.4e-05", "Stage", "1.2", "Comp", "1.4e+03", "A", "e", "C", "D"

    I am using QRegExp rx( "[+\\-*/(),<>&=| ]" ) to split it.

    However, it also splits "4.4e-05" and "1.4e+03". How can I write the QRegExp so that it splits the string without breaking scientific notation?

    Thanks!


    Here is sample code

    Qt Code:
    1. #include <QStringList>
    2. #include <QRegExp>
    3. #include <QDebug>
    4.  
    5. int main()
    6. {
    7. QString str( "Stage1 <= 4.4e-05 || Stage == 1.2 && Comp >= 1.4e+03 || A+e-C > D" );
    8. qDebug() << "str =" << str;
    9.  
    10. QRegExp rx( "[+\\-*/(),<>&=| ]" );
    11. QStringList strList = str.split( rx, QString::SkipEmptyParts );
    12.  
    13. qDebug() << "strList =" << strList;
    14. }

  2. #2
    Join Date
    Jan 2008
    Location
    Alameda, CA, USA
    Posts
    5,230
    Thanks
    302
    Thanked 864 Times in 851 Posts
    Qt products
    Qt5
    Platforms
    Windows

    You could try making two passes. In the first pass, do not use "+" or "-" in your reg exp. Take the string list result from pass 1 and examine each entry to see if it matches a reg exp for a number (you can search online for suitable regular expressions). If it matches, keep it. If not, then submit the substring to a second pass that splits on your original reg exp in Line 10.

    It is hard to write a single regular expression that will match arbitrary string expressions in a single pass. This is why the lex / yacc and flex / bison tools exist.
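
The two-pass idea above can be sketched as follows. This is only an illustration under some assumptions: it uses std::regex from the standard library instead of QRegExp so that it builds without Qt, the number pattern is one of many possible ones, and the names splitSkipEmpty and twoPassSplit are made up for this example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on every match of `sep`, dropping empty pieces.
static std::vector<std::string> splitSkipEmpty(const std::string& text, const std::regex& sep)
{
    std::vector<std::string> parts;
    std::sregex_token_iterator it(text.begin(), text.end(), sep, -1), end;
    for (; it != end; ++it)
        if (it->length() > 0)
            parts.push_back(it->str());
    return parts;
}

// Hypothetical function: the two passes described in the post above.
std::vector<std::string> twoPassSplit(const std::string& expr)
{
    // Pass 1 separators: the original character class without '+' and '-'.
    static const std::regex noSign(R"([*/(),<>&=| ])");
    // Pass 2 separators: only the signs.
    static const std::regex sign(R"([-+])");
    // Assumed number pattern: digits, optional fraction, optional signed exponent.
    static const std::regex number(R"((?:\.\d+|\d+\.?\d*)(?:[eE][+-]?\d+)?)");

    std::vector<std::string> tokens;
    for (const std::string& piece : splitSkipEmpty(expr, noSign)) {
        if (std::regex_match(piece, number)) {
            tokens.push_back(piece);  // a complete number: keep it whole
        } else {
            // not a number on its own: now split on the signs as well
            for (const std::string& sub : splitSkipEmpty(piece, sign))
                tokens.push_back(sub);
        }
    }
    return tokens;
}
```

With the string from post #1, twoPassSplit returns the ten pieces the question asks for, because every number there is surrounded by spaces. It relies on that: an input such as "aa+4.4e-05" reaches pass 2 as a single piece that is not a number, so it is still cut into "aa", "4.4e" and "05".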

  3. #3

    I did use two passes, but it still fails. I used the following string as a test:
    "a-aa+bb+4.4e-05-1.2e+2"

    It should split into "a", "aa", "bb", "4.4e-05", "1.2e+2", but it doesn't. Please help.

    Here is my code:
    Qt Code:
    #include <QStringList>
    #include <QRegExp>
    #include <QDebug>

    static QStringList getFormulaVarList( const QString& formula )
    {
        QString::SplitBehavior behavior = QString::SkipEmptyParts;

        QRegExp opRx( "[*/()<>&=| ]" );
        QRegExp plusMinus( "[+\\-]" );

        QStringList strList;
        // pass 1 - split on every operator except + and -
        foreach( const QString& str, formula.split( opRx, behavior ) ) {
            bool ok;
            str.toDouble( &ok );
            if ( !ok ) {
                // pass 2 - the piece is not a number on its own, so split it on +/-
                // (this still cuts "4.4e-05" when it is glued to a name, as in "aa+4.4e-05")
                strList << str.split( plusMinus, behavior );
            }
        }
        strList.removeDuplicates();

        // drop plain numbers and math.* names, keeping only the variables
        QStringList result;
        foreach( const QString& str, strList ) {
            bool ok;
            str.toDouble( &ok );
            if ( !ok && !str.startsWith( "math.", Qt::CaseInsensitive ) ) {
                result << str;
            }
        }
        result.removeDuplicates();

        return result;

    } // getFormulaVarList

    int main( int argc, char** argv )
    {
        //QString formula( "Stage1 <= 4.4e-05 || Stage == 1.2 && Comp >= 1.4e+03 || A+e-C > D" );
        //QString formula( "a * aa+4.4e-05 + math.log( b )" );
        //QString formula( "a * aa + 4.4e-05 + math.log( b )" );
        QString formula( argv[ 1 ] );

        qDebug() << "formula =" << formula;

        qDebug() << "items =" << getFormulaVarList( formula );
    }

  4. #4

    I think you should read the documentation on QRegularExpression and especially read the Perl tutorial linked to in that doc. Your regular expressions are much too simple to match the kind of strings you have as input, and I do not think you can do it with a single regular expression.

    "a-aa+bb+4.4e-05-1.2e+2"
    You might also read this.
    You can see that even parsing simple expressions like this one takes a lot of code to recognize the symbols, operators, and constants and convert that into tokens.

  5. #5

    That code doesn't seem to be right.

    I downloaded the C/C++ code and tested it with "4.4e-05 + 1"; it gives
    Result = -0.6
    .....

    The problem is to split on "+" and "-" when they are math operators, but not when they are part of scientific notation such as "4.4e-05".


    Added after 48 minutes:


    I used 3 passes and it appears to work:

    Qt Code:
    #include <QStringList>
    #include <QRegExp>
    #include <QDebug>

    static QStringList getFormulaVarList( const QString& formula )
    {
        QString::SplitBehavior behavior = QString::SkipEmptyParts;

        // split the formula in 3 passes
        // [eE] rather than [e|E]: inside a character class, | is a literal character
        QRegExp noScientificRx( "\\d+[eE][+-]\\d+" );
        QRegExp opRx( "[*/()<>&=| ]" );
        QRegExp plusMinusRx( "[+\\-]" );

        QStringList varList;
        // pass 1 - remove scientific notation
        foreach( const QString& scientific, formula.split( noScientificRx, behavior ) ) {
            // pass 2 - remove math operator
            foreach( const QString& str, scientific.split( opRx, behavior ) ) {
                bool ok;
                str.toDouble( &ok );
                if ( !ok ) {
                    // pass 3 - remove +/-
                    varList << str.split( plusMinusRx, behavior );
                }
            }
        }
        varList.removeDuplicates();

        QStringList result;
        foreach( const QString& str, varList ) {
            // finally remove numbers and math objects
            bool ok;
            str.toDouble( &ok );
            if ( !ok && !str.startsWith( "math.", Qt::CaseInsensitive ) ) {
                result << str;
            }
        }
        result.removeDuplicates();

        return result;

    } // getFormulaVarList

    int main( int argc, char** argv )
    {
        //QString formula( "Stage1 <= 4.4e-05 || Stage == 1.2 && Comp >= 1.4e+03 || A+e-C > D" );
        //QString formula( "a * aa+4.4e-05 + math.log( b )" );
        //QString formula( argv[ 1 ] );
        QString formula;
        if ( argc == 1 ) {
            formula = "a*aa+4.4e-05+math.log( b )";
        } else {
            formula = argv[ 1 ];
        }

        qDebug() << "formula =" << formula;
        qDebug() << "variables =" << getFormulaVarList( formula );
    }


    Added after 5 minutes:


    I reduced it to 2 passes:

    Qt Code:
    1. #include <QStringList>
    2. #include <QRegExp>
    3. #include <QDebug>
    4.  
    5. static QStringList getFormulaVarList( const QString& formula )
    6. {
    7. QString::SplitBehavior behavior = QString::SkipEmptyParts;
    8.  
    9. // split the formula in 2 passes
    10. QRegExp scientificRx( "\\d+[eE][+-]\\d+" );
    11. QRegExp opRx( "[*/()<>&=|+\\- ]" );
    12.  
    13. QStringList varList;
    14. // pass 1 - remove scientific notation
    15. foreach( const QString& noScientific,
    16. formula.split( scientificRx, behavior ) ) {
    17. // pass 2 - remove math operator
    18. foreach( const QString& str,
    19. noScientific.split( opRx, behavior ) ) {
    20. bool ok;
    21. str.toDouble( &ok );
    22. if ( !ok ) {
    23. varList << str;
    24. }
    25. }
    26. }
    27. varList.removeDuplicates();
    28.  
    29. QStringList result;
    30. foreach( const QString& str, varList ) {
    31. // finally remove numbers and math objects
    32. bool ok;
    33. str.toDouble( &ok );
    34. if ( !ok && !str.startsWith( "math.", Qt::CaseInsensitive ) ) {
    35. result << str;
    36. }
    37. }
    38. result.removeDuplicates();
    39.  
    40. return result;
    41.  
    42. } // getFormulaVarList
    43.  
    44. int main( int argc, char** argv )
    45. {
    46. //QString formula( "Stage1 <= 4.4e-05 || Stage == 1.2 && Comp >= 1.4e+03 || A+e-C > D" );
    47. //QString formula( "a * aa+4.4e-05 + math.log( b )" );
    48. //QString formula( argv[ 1 ] );
    49. QString formula;
    50. if ( argc == 1 ) {
    51. formula = "a*aa+4.4e-05+math.log( b )";
    52. } else {
    53. formula = argv[ 1 ];
    54. }
    55.  
    56. qDebug() << "formula =" << formula;
    57. qDebug() << "variables =" << getFormulaVarList( formula );
    58. }
    Last edited by lni; 13th January 2016 at 06:48.

  6. #6

    //QString formula( "Stage1 <= 4.4e-05 || Stage == 1.2 && Comp >= 1.4e+03 || A+e-C > D" );
    //QString formula( "a * aa+4.4e-05 + math.log( b )" );
    Well, good for you. What is the output from line 57 for these two inputs?

  7. #7

    "Stage1 <= 4.4e-05 || Stage == 1.2 && Comp >= 1.4e+03 || A+e-C > D"
    gives ("Stage1", "Stage", "Comp", "A", "e", "C", "D")

    "a * aa+4.4e-05 + math.log( b )"
    gives ("a", "aa", "b")

    This is what I need. I need all variables in the formula so I can give those inputs to the script engine.

  8. #8
    Join Date
    Oct 2009
    Posts
    483
    Thanked 97 Times in 94 Posts
    Qt products
    Qt4 Qt5
    Platforms
    Unix/X11 Windows

    Forget QRegExp and QRegularExpression and spend a few tens of minutes to learn a real lexer generator such as Flex (or Flex++).

  9. #9

    I'm having a change of heart; I still recommend that you use a lexer generator, but here is a solution based on QRegularExpression.

    The thing to pay attention to is that, even if you are interested in identifiers only, you have to parse floating-point numbers too, because you need to identify which occurrences of "e" belong to a number and which are part of an identifier.

    Anyway, here is a function that prints all the numbers and identifiers in the string s:
    Qt Code:
    #include <QDebug>
    #include <QRegularExpression>
    #include <QString>

    void printMatches(const QString &s) {
        // first alternative: identifier; second: number with optional fraction and exponent
        static QRegularExpression varOrNumMatcher("[a-zA-Z_][a-zA-Z_\\d]*|(?:\\.\\d+|\\d+\\.?\\d*)(?:[eE][+-]?\\d+)?");
        QRegularExpressionMatchIterator i = varOrNumMatcher.globalMatch(s);
        while (i.hasNext())
            qDebug() << i.next().capturedRef();
    }
    Identifiers are made of underscores, ASCII letters and digits, and must not begin with a digit.
    Numbers have an optional integer part, an optional fractional part, and an optional exponent (introduced by 'e' or 'E' and an optional sign).
    This simplistic lexer does not parse negative numbers, and does not recognize built-in identifiers like "math" and "log", from your last example.
    You'll have to take it from here.

    Notice that a regular expression is OK for lexing, but cannot help you parse recursive expressions (such as arithmetic expressions with parentheses). You need a context-free grammar for that. See Flex and Bison.
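
If only the variable names are needed, a regular expression of the same shape can be used to walk over the tokens and keep just the identifiers. A rough sketch, assuming std::regex instead of QRegularExpression so that it compiles without Qt; the function name collectIdentifiers, the handling of dotted names and the case-sensitive "math." filter are assumptions of this sketch, not something taken from the posts above.

```cpp
#include <regex>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns each variable name once, in order of first appearance.
std::vector<std::string> collectIdentifiers(const std::string& expr)
{
    // Alternative 1 (captured): an identifier, optionally dotted ("math.log").
    // Alternative 2: a number with optional fraction and exponent. Matching the
    // whole number consumes the "e" of "4.4e-05", so it is never taken as a name.
    static const std::regex token(
        R"re(([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)|(?:\.\d+|\d+\.?\d*)(?:[eE][+-]?\d+)?)re");

    std::vector<std::string> result;
    std::set<std::string> seen;
    for (std::sregex_iterator it(expr.begin(), expr.end(), token), end; it != end; ++it) {
        if (!(*it)[1].matched)
            continue;  // the token is a number, not an identifier
        const std::string name = (*it)[1].str();
        if (name.compare(0, 5, "math.") == 0)
            continue;  // assumed convention: names under "math." are built-ins
        if (seen.insert(name).second)
            result.push_back(name);  // first time this name is seen
    }
    return result;
}
```

For "a * aa+4.4e-05 + math.log( b )" this yields a, aa and b, and for "a-aa+bb+4.4e-05-1.2e+2" it yields a, aa and bb. Like the function above, it is only a lexer: it does not check that the expression is well formed.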

  10. #10

    I'm having a change of heart; I still recommend that you use a lexer generator
    That was my first thought, but there is so much overhead to learning those tools that the OP might as well spend the time using regular expressions in Qt. Agreed that if the expressions ever get more complex than the examples posted or if there are precedence rules / recursive expressions, a grammar will be needed. It appears that all the OP needs at this point are the identifiers and not an expression tree for evaluation.
