
QUrl and EUC-JP



Ignacio Serantes
20th February 2009, 15:52
Hi!

I have the following bash script to URL-encode Japanese text:



echo "$UTF8_TEXT" | iconv -f UTF-8 -t EUC-JP | od --width=512 -t x1 -A n | sed -e 's/ 0a$//g' -e 's/ /\%/g'


I take a Japanese name in UTF-8, convert it to EUC-JP with "iconv" and, finally, generate the % encoding using "od" and "sed". For example:

UTF-8 text: 宇多田ヒカル
encoded text: %b1%a7%c2%bf%c5%c4%a5%d2%a5%ab%a5%eb

I have been trying to do the same with QUrl in JavaScript for a week, but without success :(. My last attempt was:



var codecName = new QByteArray("EUC-JP");
var codec = new QTextCodec.codecForName(codecName);
var url = new QUrl("http://music.goo.ne.jp/lyric/db.php");
url.addQueryItem("a", codec.fromUnicode(artist));
url.addQueryItem("k", codec.fromUnicode(title));
url.addQueryItem("l", "");
url.addQueryItem("s", "");
url.addQueryItem("c", "");
url.addQueryItem("submit", "");


When I try to download the HTML page, the server-side search fails because of the incorrect encoding. The site requires % encoding in EUC-JP to work.
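One plausible reason for the mismatch, sketched in plain JavaScript rather than QtScript: a URL API that percent-encodes text as UTF-8 produces a different byte sequence from the EUC-JP one the site expects. The snippet uses the standard `encodeURIComponent` as a stand-in for that UTF-8 behaviour (an assumption about what happens inside QUrl, not something confirmed in this thread) and compares the result with the EUC-JP string from the example above.

```javascript
// Expected EUC-JP percent-encoding, copied from the bash example.
var eucJpExpected = "%b1%a7%c2%bf%c5%c4%a5%d2%a5%ab%a5%eb";

// encodeURIComponent always percent-encodes the UTF-8 bytes of a string.
var utf8Encoded = encodeURIComponent("宇多田ヒカル").toLowerCase();

// Each encoded byte takes three characters ("%xx").
// UTF-8 needs 3 bytes per character here, EUC-JP only 2,
// so the two query strings cannot match.
var utf8ByteCount = utf8Encoded.length / 3;
var eucJpByteCount = eucJpExpected.length / 3;

console.log(utf8Encoded);
console.log(utf8ByteCount, eucJpByteCount);
```

If that is what happens, the query has to be built from already-encoded EUC-JP bytes instead of being passed as text.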

For context, I'm trying to convert a bash script that I use to download Japanese lyrics from music.goo.ne.jp into an Amarok 2 script.

Thanks in advance.

Ignacio Serantes
22nd February 2009, 16:19
I solved my problem using Amarok.Lyrics.fromUtf8(), QUrl.setEncodedQuery() and a custom function.


// encodedArtist is assumed to be built the same way from the artist name.
var encodedTitle = new QByteArray(Amarok.Lyrics.fromUtf8(title, "EUC-JP"));
var url = new QUrl(URL_SEARCH + "/lyric/db.php");
url.setEncodedQuery(new QByteArray("a=" + ba2p(encodedArtist)));




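// Percent-encode every byte of a QByteArray.
// toHex is a custom helper that is not shown in this post.
function ba2p(ba) {
    var str = '';
    for (var i = 0; i < ba.length(); i++) {
        str += '%' + toHex(ba.at(i));
    }
    return str;
}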
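
The `toHex` helper is not shown in the post. A minimal sketch of what it could look like, in plain JavaScript, assuming `ba.at(i)` yields a numeric byte value that may be negative when the underlying `char` is signed. The name `bytesToPercent` and the plain array input are hypothetical stand-ins for `ba2p` and `QByteArray`, so the example can run outside Amarok.

```javascript
// Hypothetical helper: format one byte as two lowercase hex digits.
function toHex(b) {
    var v = b & 0xff;  // map signed values (-128..-1) to 128..255
    return (v < 16 ? "0" : "") + v.toString(16);
}

// Stand-in for ba2p that takes a plain array of byte values.
function bytesToPercent(bytes) {
    var str = "";
    for (var i = 0; i < bytes.length; i++) {
        str += "%" + toHex(bytes[i]);
    }
    return str;
}

// EUC-JP bytes of 宇多田ヒカル, taken from the encoded text in the first post.
var sample = [0xb1, 0xa7, 0xc2, 0xbf, 0xc5, 0xc4,
              0xa5, 0xd2, 0xa5, 0xab, 0xa5, 0xeb];
console.log(bytesToPercent(sample));
// prints "%b1%a7%c2%bf%c5%c4%a5%d2%a5%ab%a5%eb"
```

The zero padding matters for bytes below 0x10, which would otherwise produce a one-digit escape such as `%a` instead of `%0a`.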