Is there any ready-made string similarity check in Qt?
I've read about Levenshtein distance or soundex, is there anything like this built in?
According to this thread (sorry, German language) I've implemented a counting of small substrings that gives me quite reasonable results for Names (which I need to compare) when I take substrings of n=2 chars and an 80% threshold.
I'd like to know if there is a better routine fast enough that I can just take and use.
This is what I use right now (corrections on style and speed are welcome!)
header:
Code:
implementation:
Code:
//Iterates substrings in groups of n chars from a und finds these in b. //The number of hits is then divided by the length of the shorter string. //To properly take word beginnings and endings into account //spaces are being inserted before and after the strings. { if (a.isEmpty()||b.isEmpty()) return false; qreal hits=0; QString part; for (int i=0;i<a.count()-(n-1);i++) { part=a.mid(i,n); if (b.contains(part,caseSense)) hits++; } if (a.length()<b.length()) return (percentage < (100*hits/(a.length()-(n-1)))); else return (percentage < (100*hits/(b.length()-(n-1)))); }
For the name "Markus Bertram" I get these results:
- Bertram, Markus - 93,3
- Markus E. Bertram - 100
- Marcus Emil Bertram - 86,7
- marc bertram - 84,6 (case-insensitive)
- Martin Bertram - 73,3 (false)
- Martin Bergmann - 46,7 (false)