Regular expression help!



ConkerX
30th August 2011, 19:19
Hi,
I need help writing a regular expression.
It should validate a string according to the following rules:

1- The first character must be a-z or A-Z.
2- After the first character, a-z, A-Z, 0-9, . and - are allowed.
3- A sequence of periods (.) is not allowed (e.g. "qt..framework" is not valid).
5- The string can be of any length.
6- The end of the string can't be . or -.

I tried to write my own regexp and the result was "(([a-zA-Z][-a-zA-Z0-9]*(\\.[-a-zA-Z0-9]+)*)[^-.]+$)".
My main problem is validating the end of the string, i.e. checking whether it ends in - or . there.
When I add "[^-.]+$" I have two problems: first, I get an Intermediate result from
QRegExpValidator::validate(), and second, when I enter only a single character to be
validated I get an Intermediate result too.

Can someone help me?
Thanks :)

ars
30th August 2011, 20:16
If I understand you correctly, your string can only consist of the characters a-z, A-Z, 0-9, "." and "-". According to 6 above, your string then ends in one of the characters a-z, A-Z or 0-9. So you could replace "[^-.]+$" with "[a-zA-Z0-9]$". Maybe that helps.

SixDegrees
30th August 2011, 20:22
Get rid of all those parentheses. You don't need them unless you're trying to capture sub-matches, and the way they're arranged they're not going to do that in any meaningful way.

ConkerX
30th August 2011, 20:49
Thanks ars,
I already tried this and it still gives me the "Intermediate" result when I try, for example, "test-" or "t".

Here is how I substituted your suggestion:
(([a-zA-Z][-a-zA-Z0-9]*(\\.[-a-zA-Z0-9]+)*)[a-zA-Z0-9]$)

Sorry, I posted the wrong regexp.

This is the right one:
(([a-zA-Z][-a-zA-Z0-9]*(\\.[-a-zA-Z0-9]+)*)[^-.]+$)


Thanks for the help, SixDegrees.

I can't remove the parentheses; if I do, I get wrong results.
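
One way to check the point about parentheses is to compare the posted pattern with a copy that drops only the two outer pairs. The sketch below is an illustration under assumptions, not code from the thread: it uses standard C++ std::regex (ECMAScript syntax) as a stand-in for QRegExp, and the names kWithOuterParens, kWithoutOuterParens and sameVerdicts are hypothetical. The inner pair is kept because the * has to apply to the whole \.[-a-zA-Z0-9]+ unit; removing that pair as well changes the pattern, which may be where the wrong results came from.

```cpp
#include <regex>
#include <string>
#include <vector>

// Pattern as posted in the thread, with both outer pairs of parentheses.
static const std::regex kWithOuterParens(
    "(([a-zA-Z][-a-zA-Z0-9]*(\\.[-a-zA-Z0-9]+)*)[^-.]+$)");

// Same pattern with only the two outer pairs removed. The inner pair stays
// because the trailing * must repeat the whole "\\.[-a-zA-Z0-9]+" unit.
static const std::regex kWithoutOuterParens(
    "[a-zA-Z][-a-zA-Z0-9]*(\\.[-a-zA-Z0-9]+)*[^-.]+$");

// Hypothetical helper: true when both patterns give the same whole-string
// verdict (std::regex_match) for every sample.
bool sameVerdicts(const std::vector<std::string>& samples) {
    for (const std::string& s : samples) {
        if (std::regex_match(s, kWithOuterParens) !=
            std::regex_match(s, kWithoutOuterParens)) {
            return false;
        }
    }
    return true;
}
```

Capturing groups on their own do not change which strings match, so sameVerdicts should report agreement for any list of samples, including "t" and "test-".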

Lykurg
30th August 2011, 20:58
You get the Intermediate state because that is the correct result: the input could very likely become valid with the next character typed. That is how a validator works. And why is that a problem? You can check for !QValidator::Acceptable, or use QRegExp::exactMatch.

As for the case where only one character is typed (if that input should be accepted), you need something like this:
(your pattern | ^[0-9a-zA-Z]{1}$)

ConkerX
30th August 2011, 22:09
Thanks for the help, Lykurg.

I didn't know about the QRegExp::exactMatch method; maybe I will use it.

Your pattern ^[0-9a-zA-Z]{1}$ doesn't fit my rules. The user can type "test--" and this string is not valid.
With my pattern (([a-zA-Z][-a-zA-Z0-9]*(\\.[-a-zA-Z0-9]+)*)[^-.]+$) I almost get the right result.
I only have a problem with '.' and '-' at the end of the string and with a single-letter string.
E.g. for "a", my pattern gives false with exactMatch and Intermediate with validate(), although "a" alone is a valid string.

Examples of valid and invalid strings:
2test = not valid
-test = not valid
.test = not valid
test = not valid
test = valid
test2 = valid
testX = valid
test. = not valid
test-.-. = not valid
t = valid
test..t = not valid
te.s.t = valid
t.es-t = valid
tes--t = valid
t2est = valid
te3443st = valid

Lykurg
30th August 2011, 23:33
Oh boy, what I was trying to tell you was:
QRegExp rx("^([a-zA-Z]+([-a-zA-Z0-9]|\\.[-a-zA-Z0-9]+)*|[a-zA-Z]{1})$");
QStringList teststrings = QString("2test -test .test test test2 testX test. test-.-. t test..t te.s.t t.es-t tes--t t2est te3443st").split(" ");
for (int i = 0; i < teststrings.count(); ++i)
    qWarning() << teststrings.at(i) << rx.exactMatch(teststrings.at(i));

"2test" false
"-test" false
".test" false
"test" true
"test2" true
"testX" true
"test." false
"test-.-." false
"t" true
"test..t" false
"te.s.t" true
"t.es-t" true
"tes--t" true
"t2est" true
"te3443st" true

...or even simpler:
QRegExp rx("^[a-zA-Z]+([-a-zA-Z0-9]|\\.[-a-zA-Z0-9]+)*$");
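
The shorter pattern works because [a-zA-Z]+ followed by zero repetitions of the group already matches a single letter, so the extra |[a-zA-Z]{1} branch adds nothing. A minimal sketch to confirm that on the fifteen test strings, assuming standard C++ std::regex (ECMAScript syntax) as a stand-in for QRegExp; the names kWithSingleLetterBranch, kShort, threadSamples and countDisagreements are hypothetical:

```cpp
#include <regex>
#include <string>
#include <vector>

// First pattern from the post above, with the extra single-letter branch.
static const std::regex kWithSingleLetterBranch(
    "^([a-zA-Z]+([-a-zA-Z0-9]|\\.[-a-zA-Z0-9]+)*|[a-zA-Z]{1})$");

// Shorter pattern from the post above.
static const std::regex kShort(
    "^[a-zA-Z]+([-a-zA-Z0-9]|\\.[-a-zA-Z0-9]+)*$");

// The fifteen test strings used in the post above.
std::vector<std::string> threadSamples() {
    return {"2test", "-test", ".test", "test", "test2", "testX", "test.",
            "test-.-.", "t", "test..t", "te.s.t", "t.es-t", "tes--t",
            "t2est", "te3443st"};
}

// Counts the samples on which the two patterns give different
// whole-string verdicts.
int countDisagreements() {
    int disagreements = 0;
    for (const std::string& s : threadSamples()) {
        if (std::regex_match(s, kWithSingleLetterBranch) !=
            std::regex_match(s, kShort)) {
            ++disagreements;
        }
    }
    return disagreements;
}
```

If countDisagreements() returns 0, the two patterns are interchangeable for this list, which is consistent with the output printed above.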

ConkerX
31st August 2011, 14:55
Thanks Lykurg for your help!

I was surprised by your result, but when I tried more words I found an error with some strings I had forgotten to add to the list.

Add these strings to your list:

test-- = not valid (with your pattern it is valid)
test- = not valid (with your pattern it is valid)
test.. = not valid (your pattern handles this one correctly; I added it so I don't forget to test it :D)


QStringList teststrings = QString("2test -test .test test test2 testX test. test-.-. t test..t te.s.t t.es-t tes--t t2est te3443st test-- test- test..").split(" ");

Thanks for your dedication in helping me! :p

Lykurg
31st August 2011, 15:49
Well, adding the exceptions for "-" is your part. Also, should "aa.-bb" be valid?
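
To see what is still missing, the shorter pattern can be run against the extra strings and "aa.-bb". This is a sketch under assumptions: standard C++ std::regex (ECMAScript syntax) stands in for QRegExp, and acceptedByShortPattern is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// Shorter pattern from earlier in the thread. The hyphen is part of both
// character classes, so it is allowed anywhere after the first letter:
// directly after a period and at the very end of the string too.
bool acceptedByShortPattern(const std::string& s) {
    static const std::regex re(
        "^[a-zA-Z]+([-a-zA-Z0-9]|\\.[-a-zA-Z0-9]+)*$");
    return std::regex_match(s, re);  // whole-string match
}
```

With this pattern "test-", "test--" and "aa.-bb" are accepted, while "test.." is rejected. The period is already constrained by what must follow it; the hyphen is not.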

ConkerX
31st August 2011, 16:26
That is exactly my question: how to disallow a - or . at the end of the string. I tried [^-.]+$ at the end of my pattern, but then the string "t" becomes invalid.

My pattern:

"(([a-zA-Z][-a-zA-Z0-9]*(\\.[-a-zA-Z0-9]+)*)[^-.]+$)"

The string "aa.-bb" is not valid either.
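
The behaviour described in this post can be reproduced with a small check. The sketch assumes standard C++ std::regex (ECMAScript syntax) as a stand-in for QRegExp; acceptedByOriginal and acceptedWithAlnumEnding are hypothetical helper names. The second helper uses the [a-zA-Z0-9]$ ending suggested earlier in the thread, for comparison.

```cpp
#include <regex>
#include <string>

// Pattern from this post: ends with [^-.]+, i.e. at least one more
// character that is anything except '-' or '.'.
bool acceptedByOriginal(const std::string& s) {
    static const std::regex re(
        "(([a-zA-Z][-a-zA-Z0-9]*(\\.[-a-zA-Z0-9]+)*)[^-.]+$)");
    return std::regex_match(s, re);
}

// Variant with the ending replaced by [a-zA-Z0-9], as suggested earlier.
bool acceptedWithAlnumEnding(const std::string& s) {
    static const std::regex re(
        "(([a-zA-Z][-a-zA-Z0-9]*(\\.[-a-zA-Z0-9]+)*)[a-zA-Z0-9]$)");
    return std::regex_match(s, re);
}
```

Both variants reject "t" and "test-" and accept "ta" and "aa.-bb". They differ on a string such as "t!": the negated class [^-.] lets any other character through at the end, while [a-zA-Z0-9] does not.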

Lykurg
31st August 2011, 16:47
In your approach a single t is consumed by the first [a-zA-Z], and then the pattern requires at least one more character through [^-.]+. That is the problem. So why don't you try to understand my last solution and simply alter it by applying the same rules you have for . to - as well? As a hint:

Characters to be deleted: --
Characters to be added: )(|-
Maybe you also need: 1}{ but I don't think so.

Go ahead!
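
The thread does not state the final pattern. One possible reading of the hint, offered only as an assumption, is to remove the hyphen from both character classes and let a hyphen take the same position as the escaped period, giving ^[a-zA-Z]+([a-zA-Z0-9]|(\.|-)[a-zA-Z0-9]+)*$. The sketch below tries that reading with standard C++ std::regex (ECMAScript syntax) as a stand-in for QRegExp; acceptedByHintReading is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// One reading of the hint (an assumption, not a pattern from the thread):
// '.' and '-' are treated alike, and each must be followed by at least one
// letter or digit. That rules out a trailing '.' or '-' and any run of
// two separators such as "..", "--" or ".-".
bool acceptedByHintReading(const std::string& s) {
    static const std::regex re(
        "^[a-zA-Z]+([a-zA-Z0-9]|(\\.|-)[a-zA-Z0-9]+)*$");
    return std::regex_match(s, re);  // whole-string match
}
```

Under this reading "t", "te.s.t" and "t.es-t" are accepted, and "test-", "test--", "test..", "test-.-." and "aa.-bb" are rejected. It also rejects "tes--t", which the earlier list marks as valid, so a pattern that must allow repeated hyphens in the middle would need a different adjustment.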