XML, SAX2, QXmlContentHandler::characters( const QString& ch ) problem



yellowmat
5th October 2006, 14:54
Hi everybody !

I'm developing a program that reads an XML file and extracts its data using SAX2. It works quite well: the function QXmlContentHandler::characters(const QString& ch) of my handler is called for the content of each element, but it is also called at each line end. Maybe it is reporting the carriage return and/or line feed characters? Has anyone else run into this?

Here is the content of my XML file (I edited it with WordPad under Windows) :

<?xml version="1.0" encoding="ISO-8859-1"?>
<phonebook>
<person>
<type>Worker</type>
<name>NAME_1</name>
<firstname>FIRST_NAME_1</firstname>
<phone>0123456789</phone>
</person>
<person>
<type>Student</type>
<name>NAME_2</name>
<firstname>FIRST_NAME_2</firstname>
<phone>9876543210</phone>
</person>
</phonebook>


The code of my handler is :


bool CMySaxContentHandler::startElement(const QString& namespaceURI, const QString& localName,
                                        const QString& qName, const QXmlAttributes& attribute)
{
    // Go one level deeper: four spaces, matching the remove(0, 4) calls below.
    szIndent += "    ";
    szMessage = szIndent + "<" + qName + ">";
    qDebug("%s", szMessage.ascii());

    return true;
}

bool CMySaxContentHandler::endElement(const QString& namespaceURI, const QString& localName, const QString& qName)
{
    // Print the closing tag at the element's own level, then step back out.
    szMessage = szIndent + "</" + qName + ">";
    qDebug("%s", szMessage.ascii());
    szIndent.remove(0, 4);

    return true;
}

bool CMySaxContentHandler::characters(const QString& ch)
{
    // Character data is printed one level deeper than its enclosing element.
    szIndent += "    ";
    szMessage = szIndent + ch;
    qDebug("%s", szMessage.ascii());
    szIndent.remove(0, 4);

    return true;
}

... where szIndent and szMessage are two QString members.

The code used to parse my XML file is the following :


QXmlSimpleReader xsr;
CMySaxContentHandler handler;
xsr.setContentHandler(&handler);
// Use a named QFile: the input source reads from it through a pointer.
QFile file("test_without_dtd.xml");
QXmlInputSource xis(&file);
xsr.parse(&xis);


and it produces the following output :

<phonebook>


<person>


<type>
Worker
</type>


<name>
NAME_1
</name>


<firstname>
FIRST_NAME_1
</firstname>


<phone>
0123456789
</phone>


</person>


<person>


<type>
Student
</type>


<name>
NAME_2
</name>


<firstname>
FIRST_NAME_2
</firstname>


<phone>
9876543210
</phone>


</person>


</phonebook>


How can I tell whether the string passed to characters() should be processed, when I cannot be sure whether it is element content or something else ?

Thanks for your help.

jacek
5th October 2006, 17:45
Maybe it will be enough if you check with a regexp whether ch contains only whitespace followed by '\n'?
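
One way to read this suggestion, sketched in plain standard C++ so that it compiles without Qt: treat a chunk as ignorable when every character in it is whitespace. The helper name isIgnorableWhitespace is hypothetical and std::string stands in for QString; inside the handler the equivalent test would presumably be ch.stripWhiteSpace().isEmpty() in Qt 3 (ch.trimmed().isEmpty() in Qt 4).

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true when the chunk holds nothing but whitespace
// (spaces, tabs, '\r', '\n'), i.e. the formatting between two tags.
// An empty chunk also counts as ignorable.
bool isIgnorableWhitespace(const std::string& chunk)
{
    for (std::string::size_type i = 0; i < chunk.size(); ++i) {
        // Cast to unsigned char: std::isspace is undefined for negative values.
        if (!std::isspace(static_cast<unsigned char>(chunk[i])))
            return false;
    }
    return true;
}
```

With this check, characters() would return early for the "\r\n" chunks reported between elements and still print chunks such as "Worker" or "0123456789".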

yellowmat
6th October 2006, 08:54
OK, I have tried what you advised and it works.

I did :


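// Print the chunk only if it contains no line feed.
// (A local variable is enough here; it does not need to be static.)
int i = ch.find( QRegExp("\n"), 0 );

if( i == -1 )
{
    szIndent += "    ";
    szMessage = szIndent + ch;
    qDebug("%s", szMessage.ascii());
    szIndent.remove(0, 4);
}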
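
A caveat with this filter, sketched in standard C++ (std::string in place of QString, and both helper names are hypothetical): testing for the presence of '\n' also discards real text that happens to span several lines, whereas a whitespace-only test keeps it. For the phonebook file above the two behave the same, since no value contains a line break.

```cpp
#include <string>

// Filter used in the post above: skip any chunk that contains a line feed.
bool skippedByNewlineTest(const std::string& chunk)
{
    return chunk.find('\n') != std::string::npos;
}

// Alternative: skip only chunks made up entirely of whitespace.
bool skippedByBlankTest(const std::string& chunk)
{
    return chunk.find_first_not_of(" \t\r\n") == std::string::npos;
}
```

For a chunk such as "12 Main St\nSpringfield" the first test would skip the text and the second would keep it, so the whitespace-only variant is probably the safer choice if element values may ever contain line breaks.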


and the output is now :


<phonebook>
<person>
<type>
Worker
</type>
<name>
NAME_1
</name>
<firstname>
FIRST_NAME_1
</firstname>
<phone>
0123456789
</phone>
</person>
<person>
<type>
Student
</type>
<name>
NAME_2
</name>
<firstname>
FIRST_NAME_2
</firstname>
<phone>
9876543210
</phone>
</person>
</phonebook>

Thanks jacek