liuyanghejerry
4th June 2011, 05:50
Hi, I'm using XQuery in Qt to read HTML. The query is:
declare variable $inputDoc external;
doc($inputDoc)/tbody
When I run it against this document:
<tbody>
<xx>asd</xx>
</tbody>
the query is valid and works, but when I run it against this one:
<tbody>
<tr class="epRowTwo">
<td colspan="2" class="c"><img src="/images/cds/137/main.png" alt="Main Image">
</td>
</tr>
<tr class="epRowOne">
<td class="b">Title</td>
<td>
<div style="width:300px;">
風のメッセージ/このゆびとまれ (通常盤)
<br>
<div style="margin-left:2em;">
<b>English:</b> Message of the Wind / Follow Me (Regular Version)<br><b>Japanese (Romanized):</b> Kaze no Message / Kono Yubi Tomare (Regular Version)<br><b>Japanese (Trans):</b> Message of the Wind / Follow Me (Regular Version)<br>
</div>
</div>
</td>
</tr>
<tr class="epRowTwo">
<td class="b">Artist</td>
<td><div style="width:300px;">水橋舞 / あきよしふみえ (Mai Mizuhashi / Akiyoshi Fumie)</div></td>
</tr>
<tr class="epRowOne">
<td class="b">Catalog #</td>
<td>ZMCP-4082</td>
</tr>
<tr class="epRowTwo">
<td class="b">Release Date</td>
<td>2008-05-28</td>
</tr>
<tr class="epRowOne">
<td class="b">Language</td>
<td>Japanese</td>
</tr>
<tr class="epRowTwo">
<td class="b"># of Discs</td>
<td>1</td>
</tr>
<tr class="epRowOne">
<td class="b"># of Tracks</td>
<td>5</td>
</tr>
<tr class="epRowTwo">
<td class="b">Price/MSRP</td>
<td>1,365円</td>
</tr>
<tr class="epRowOne">
<td class="b">Run Time</td>
<td>20:06</td>
</tr>
<tr class="epRowTwo">
<td class="b">Your Rating</td>
<td>You must be logged in to rate.</td>
</tr>
<tr class="epRowOne">
<td class="b">Avg Rating</td>
<td><span class="cd-137">9.0000</span> (<span class="votescount-cd-">5</span>)</td>
</tr>
<tr class="epRowTwo">
<td class="b">Description</td>
<td>
<div style="width:300px;">
A Limited Edition CD+DVD version was also released on the same day.
</div>
</td>
</tr>
</tbody>
the query is always reported as invalid.
I don't understand why.
Is HTML a kind of XML?
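
One way to narrow this down is to check whether each input is well-formed XML at all. The sketch below is not Qt code: it assumes Python's standard `xml.etree.ElementTree` as a stand-in for a strict XML parser, and the helper name `is_well_formed` and the shortened sample strings are made up for illustration. It suggests a likely cause (the unclosed `<img>`, and by the same logic the unclosed `<br>` tags), not a confirmed diagnosis of what Qt reports.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# The small sample that works: every element is closed.
simple = "<tbody><xx>asd</xx></tbody>"

# Shortened version of the failing sample: <img> is never closed,
# which is allowed in HTML but not in XML.
html_like = (
    '<tbody><tr><td>'
    '<img src="/images/cds/137/main.png" alt="Main Image">'
    '</td></tr></tbody>'
)

# Same markup with the element self-closed, as XHTML would require.
xhtml_like = (
    '<tbody><tr><td>'
    '<img src="/images/cds/137/main.png" alt="Main Image"/>'
    '</td></tr></tbody>'
)

print(is_well_formed(simple))      # True
print(is_well_formed(html_like))   # False
print(is_well_formed(xhtml_like))  # True
```

If the same pattern holds for the full document, closing or self-closing the `<img>` and `<br>` elements (or running the page through an HTML-to-XHTML cleaner first) would be the thing to try before passing it to `doc()`.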