Remove all spaces in string except spaces in quotes



bnosam
21st June 2014, 14:35
Let's say I have a string like this:


@BEGIN:4; 17, 1;1, "This is an example of text."; 3; 2; 5;11, LABEL;end;@LABEL:7, 1;8;end;@END

What would be an easy way to remove all whitespace except in the part between the quotes ("This is an example of text.")?

Alundra
21st June 2014, 15:19
QString has what you want: http://qt-project.org/doc/qt-5/QString.html
You can use right(), left(), indexOf() and lastIndexOf().

bnosam
21st June 2014, 18:38
Is there an easier way to do this? Potentially there could be many quoted strings in it that I don't want the spaces removed from. I'm not even sure how to approach this.

Like:


@BEGIN:4; 17, 1;1, "This is some text right here."; 3; 18;1, "This is more text."; 3; 18;1, "Another set of text right here."; 3; 2; 5;11, LABEL;end;@LABEL:7, 1;8;end;@END

anda_skoa
22nd June 2014, 09:14
There are several ways to approach this.

One way is to "parse" the input. You walk through the string in a loop.
If the loop body encounters a quote, it toggles a quote flag. If it encounters a space and the quote flag is not on, it removes the space.
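
The flag-toggling loop described above could be sketched roughly as follows. To keep the sketch self-contained it uses std::string rather than QString, and the function name stripSpacesOutsideQuotes is made up for illustration. It also assumes the input contains no escaped quotes; every double quote toggles the flag.

```cpp
#include <string>

// Hypothetical helper (not from the thread): copies `input` into the result,
// dropping every space that appears outside a double-quoted section.
// Assumption: quotes are never escaped, so each '"' toggles the flag.
std::string stripSpacesOutsideQuotes(const std::string &input)
{
    std::string result;
    result.reserve(input.size());

    bool inQuotes = false;              // the "quote flag"
    for (char ch : input) {
        if (ch == '"')
            inQuotes = !inQuotes;       // toggle on every quote character
        if (ch == ' ' && !inQuotes)
            continue;                   // skip spaces outside quotes
        result += ch;                   // keep everything else, quotes included
    }
    return result;
}
```

Building a new string instead of removing characters in place avoids shifting the tail of the string on every removed space. With QString the same loop should carry over by iterating over QChar values instead of char.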

Another way, as Alundra suggested, is to search for the quotes.
You search for the first quote starting at position 0. You can then use replace (or remove) to strip all spaces from the part of the string before that quote and append it to the result. Then you search for the next quote, using the first quote's position + 1 as the start, and append that quoted substring to the result unchanged. Then you repeat with the remaining input string.

Cheers,
_

jefftee
24th June 2014, 07:30
Not sure how elegant the following code is, but it seems to work for the example string you gave, as well as for other test strings I used:


// Example to remove spaces from string except when properly quoted
QString str = "@BEGIN:4; 17, 1;1, \"This is \\\"some text\\\" right here.\"; 3; 18;1, \"This is more text.\"; 3; 18;1, \"Another set of text right here.\"; 3; 2; 5;11, LABEL;end;@LABEL:7, 1;8;end;@END";
QString outstr;

int pos = 0;
int q1 = str.indexOf("\"");
// skip escaped quotes
while (q1 != -1 && str.mid(q1-1,2) == "\\\"")
    q1 = str.indexOf("\"", q1+1);

if (q1 == -1)
{
    // no quotes in string, so just remove all spaces
    outstr += str.mid(0).remove(" ");
}
else
{
    while (q1 != -1)
    {
        int q2 = str.indexOf("\"", q1+1);
        // skip escaped quotes
        while (q2 != -1 && str.mid(q2-1,2) == "\\\"")
            q2 = str.indexOf("\"", q2+1);
        if (q2 == -1)
        {
            // unbalanced quotes, so strip all spaces from current pos to end of string (or return an error, etc)
            outstr += str.mid(pos).remove(" ");
            q1 = -1; // cause loop to break because unbalanced quote
        }
        else
        {
            // found balanced quote, so strip spaces before quote and append
            // quoted portion, then look for next quoted section and continue looping
            outstr += str.mid(pos, q1-pos).remove(" ");
            outstr += str.mid(q1,q2-q1+1);
            pos = q2 + 1;
            q1 = str.indexOf("\"", pos);
            // skip escaped quotes
            while (q1 != -1 && str.mid(q1-1,2) == "\\\"")
                q1 = str.indexOf("\"", q1+1);
            if (q1 == -1)
            {
                // no more quoted text, remove spaces from rest of string
                outstr += str.mid(pos).remove(" ");
            }
        }
    }
}

qDebug("   str='%s'", qPrintable(str));
qDebug("outstr='%s'", qPrintable(outstr));


For the case where the code encounters unbalanced quotes, I simply treat everything from the current position to the end of the string as unquoted and strip its spaces, which seems reasonable since that part is not properly quoted. For your use case, you may want to return an error value instead.
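
If you would rather reject unbalanced input than strip it, something along these lines might do. This is only a sketch: it uses std::string instead of QString so it stands alone, the name stripSpacesStrict and the bool/out-parameter interface are made up for illustration, and it treats a quote as escaped only when the character right before it is a backslash (the same simple rule used for skipping escaped quotes).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): removes spaces outside quoted
// sections and reports unbalanced quotes instead of guessing.
// Returns true and fills `output` on success; returns false and leaves
// `output` untouched if an opening quote is never closed.
// Assumption: a quote directly preceded by a backslash is an escaped quote.
bool stripSpacesStrict(const std::string &input, std::string &output)
{
    std::string result;
    result.reserve(input.size());

    bool inQuotes = false;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const char ch = input[i];
        const bool escaped = (i > 0 && input[i - 1] == '\\');
        if (ch == '"' && !escaped)
            inQuotes = !inQuotes;       // only unescaped quotes toggle the flag
        if (ch == ' ' && !inQuotes)
            continue;                   // drop spaces outside quotes
        result += ch;
    }

    if (inQuotes)
        return false;                   // unbalanced: last quote was never closed

    output = result;
    return true;
}
```

The caller can then decide what to do on failure, for example log the offending line or fall back to stripping all spaces.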

Output that shows the input string and the output string:


// input string
str='@BEGIN:4; 17, 1;1, "This is \"some text\" right here."; 3; 18;1, "This is more text."; 3; 18;1, "Another set of text right here."; 3; 2; 5;11, LABEL;end;@LABEL:7, 1;8;end;@END'
// output string
outstr='@BEGIN:4;17,1;1,"This is \"some text\" right here.";3;18;1,"This is more text.";3;18;1,"Another set of text right here.";3;2;5;11,LABEL;end;@LABEL:7,1;8;end;@END'


Hope that helps,

Jeff

jefftee
26th June 2014, 03:37
Here is one more version that uses regular expressions and removes the kludge used to skip over escaped quotes (the regex uses a negative lookbehind, so it doesn't match escaped quotes):




// remove spaces in string outside of quotes

QString str = "@BEGIN:4; 17, 1;1, \"This is some text right here.\"; 3; 18;1, \"This is more text.\"; 3; 18;1, \"Another set of text right here.\"; 3; 2; 5;11, LABEL;end;@LABEL:7, 1;8;end;@END";
QString outstr;

// column ruler (hundreds, tens, units) to show offsets into the string
qDebug("....................................................................................................1.........1.........1.........1.........1.........1.........1.........1.........");
qDebug("..........1.........2.........3.........4.........5.........6.........7.........8.........9.........0.........1.........2.........3.........4.........5.........6.........7.........");
qDebug("012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789");
qDebug("%s", qPrintable(str));

// initialize all three indexes explicitly
int pos = 0, q1 = 0, q2 = 0;

// matches a double quote that is not preceded by a backslash
QRegularExpression re("(?<![\\\\])\"");

q1 = str.indexOf(re);

if (q1 == -1)
{
    // no quotes in string, so just remove all spaces
    outstr += str.mid(0).remove(" ");
}
else
{
    while (q1 != -1 && q2 != -1)
    {
        q2 = str.indexOf(re, q1+1);
        if (q2 == -1)
        {
            // unbalanced quotes, so strip all spaces from current pos to end of string (or return an error, etc)
            outstr += str.mid(pos).remove(" ");
        }
        else
        {
            // found balanced quote, so strip spaces before quote and append
            // quoted portion, then look for next quoted section and continue looping
            outstr += str.mid(pos, q1-pos).remove(" ");
            outstr += str.mid(q1,q2-q1+1);
            pos = q2 + 1;
            q1 = str.indexOf(re, pos);
            if (q1 == -1)
            {
                // no more quoted text, remove spaces from rest of string
                outstr += str.mid(pos).remove(" ");
            }
        }
    }
}

qDebug("%s", qPrintable(outstr));


And the output that shows the original string and the output string after removing spaces (with a header to show column offset):



....................................................................................................1.........1.........1.........1.........1.........1.........1.........1.........
..........1.........2.........3.........4.........5.........6.........7.........8.........9.........0.........1.........2.........3.........4.........5.........6.........7.........
012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
@BEGIN:4; 17, 1;1, "This is some text right here."; 3; 18;1, "This is more text."; 3; 18;1, "Another set of text right here."; 3; 2; 5;11, LABEL;end;@LABEL:7, 1;8;end;@END
@BEGIN:4;17,1;1,"This is some text right here.";3;18;1,"This is more text.";3;18;1,"Another set of text right here.";3;2;5;11,LABEL;end;@LABEL:7,1;8;end;@END