QTcpSocket write data with packet loss - Idiot proof sample does not work!



jboban
25th March 2013, 13:05
I have a lot of Qt, Linux, C++ and Web programming experience, but this case is blowing my mind! :mad:

This code works at many sites, on many different machines and with many different XML files, but it fails with some files on some machines, with no apparent pattern. I have tried everything: blocking and non-blocking sockets, with and without a socket flush.

On the server side, I dump the bad TCP stream with tcpdump (http://manpages.ubuntu.com/manpages/lucid/man8/tcpdump.8.html), e.g.
tcpdump -w <bin_dump.log> host <client_IP>
and then analyze it in Wireshark (http://www.wireshark.org/). As far as I can tell, the server missed one packet (maybe the first one) and could not receive the whole message.

On the client side, I made a binary dump of the TCP stream that was sent and sent it again with
dd if=./tcp.bin > /dev/tcp/<server_ip>/<server_port>
and then everything is fine. So the problem is in my client-side application.

In such cases the only workaround is to manually add one extra character to an XML tag and so change the size of the file, e.g. from 7604 bytes to 7605 bytes; then the file is sent. Yet files both smaller and bigger than that one are sent without problems. There is no rule about the size: sometimes it happens with a file of 823 bytes, sometimes with one of 12666 bytes.

The relevant code is:

/**
 * Send XML file
 * @param sMsg Message prefix
 * @param sFileName XML file name
 */
void TXmlSender::tcpSendXmlFile(const QString& sMsg, const QString& sFileName)
{
    QString sXml;
    xmlFileToString(sFileName, sXml);

    TXmlTCPCli *tcpCli = new TXmlTCPCli(m_sHostIP, m_nHostPort, this);
    connect(tcpCli, SIGNAL(onResponse(const QString&, const QString&)),
            this, SLOT(slot_onSendXmlFile(const QString&, const QString&)));

    QString sSend = m_sLokalID + ": " + sMsg + " " + sXml;
    tcpCli->sendMsgReq(sSend);
}

TXmlTCPCli is a trivial TCP client class, and its main part is:

/**
 * SLOT: On connected to host
 */
void TXmlTCPCli::onConnected()
{
    QTcpSocket *socket = qobject_cast<QTcpSocket *>(sender());
    if (socket == 0) {
        return;
    }
    QByteArray bArrMsg;
    QDataStream out(&bArrMsg, QIODevice::WriteOnly);
    out.setVersion(QDataStream::Qt_4_0);

    // Reserve two bytes for the block size, serialize the message,
    // then go back and fill in the real size.
    out << (quint16)0;
    out << m_sMsg;
    out.device()->seek(0);
    out << (quint16)(bArrMsg.size() - (int)sizeof(quint16));

    while (!bArrMsg.isEmpty()) {
        if (!socket->isWritable()) {
            LOG("Error! Could not write to socket2!");
            break;
        }
        qint32 sentNow = socket->write(bArrMsg);
        if (sentNow < 0) {
            LOG("Error! sendMsgImpl: Could not write to socket!");
            break;
        }
        else {
            bArrMsg.remove(0, sentNow);
            // socket->flush();
            LOG("sendMsgImpl: Sent " << sentNow << ", Rest: " << bArrMsg.size());
        }
    }
    socket->disconnectFromHost();
}

/**
 * SLOT: On disconnected from host
 */
void TXmlTCPCli::onDisconnected()
{
    QTcpSocket *socket = qobject_cast<QTcpSocket *>(sender());
    if (socket != 0) {
        socket->deleteLater();
    }
    this->deleteLater();
}

When this code hits one of those files, the server waits and waits, then times out and disconnects. The rest of the code works fine: it sends TCP messages, receives TCP messages, etc. The application also has a QTcpServer listening on the same port. The client machines were Mint32 Maya and Ubuntu 11.10, and Qt is v4.8.1.

How can I resolve this? How can I debug QTcpSocket at a lower level? Why are many files sent normally while others go through only after a change in size? A Qt bug? A memory leak? Any hint? :rolleyes:

wysota
25th March 2013, 20:52
As far as I can tell, the server missed one packet (maybe the first one) and could not receive the whole message.
How did you reach this conclusion?

jboban
25th March 2013, 21:01
Wireshark tells me so. Here is the screenshot. If you need it, I can send you the Wireshark log file, too.
[Attachment 8853: Wireshark screenshot]

wysota
25th March 2013, 21:38
If a segment was missed, it would be retransmitted. I think your log only tells you that wireshark has missed it, not that it wasn't transferred. If you have the full conversation between the server and the client logged, you can verify that by looking at the returning ACK segments.

jboban
25th March 2013, 21:59
Yes, but my server side application has missed it, too. That's my problem. All the facts tell me that the data did not arrive at the server. And when I add just one extra character to the source XML file, it is sent. Really strange. What's more, when the binary dump of the TCP buffer is sent with dd, it is sent successfully.

ChrisW67
25th March 2013, 22:39
Which byte is missing? The first byte of the size, the first byte of the string payload, the last byte of either, some other byte? Do you insert and receive a correct size figure? Is there an off-by-one error in the receiver?

wysota
25th March 2013, 22:54
Yes, but my server side application has missed it, too.
I don't believe that. If that was the case, no data would arrive at the server application at all. Not a single byte. And after reaching the number of bytes equal to the window size of your TCP connection (which is a bit less than 1kB) the sending end would halt transmission waiting for acknowledgements from the receiving side and would retransmit all data from the beginning if no such ack was received. The way TCP works, you either get all data from the beginning of the stream up to a certain point in time or you don't get anything. It is not possible that some data in the middle (or in the beginning) gets "skipped".


That's my problem. All the facts tell me that the data did not arrive at the server. And when I add just one extra character to the source XML file, it is sent. Really strange. What's more, when the binary dump of the TCP buffer is sent with dd, it is sent successfully.

Have a look at this thread: http://www.qtcentre.org/threads/34082-NetworqDebugger

Download the program, build it and run your connection through it (e.g. using the SOCKS5 mode). See what gets sent from your application into the network (the "Local" field). It's best if you set up the proxy on a different machine than the one your client operates from; this will make sure the data actually leaves your client machine.

jboban
25th March 2013, 23:00
Not a single byte, but a whole packet of bytes is missing. The packet size was received correctly. On the client side I have:

sendMsgImpl: Sent 9198, Rest: 0

And on server side:

25.03.2013 22:56:30 onReadData::blockSize: 9196, Socket: 21
25.03.2013 22:56:30 onReadData::bytesAvailable: 1406
25.03.2013 22:56:30 onReadData::bytesAvailable: 2814
25.03.2013 22:56:30 onReadData::bytesAvailable: 4222
25.03.2013 22:56:30 onReadData::bytesAvailable: 5630
25.03.2013 22:56:30 onReadData::bytesAvailable: 7038

Then it waits and waits and times out... It never reaches the size.

Now, when I add just one extra character to the source XML file on the client side, the file is sent fine and on the server side I get:

25.03.2013 23:08:41 onReadData::blockSize: 9198, Socket: 20
25.03.2013 23:08:41 onReadData::bytesAvailable: 1406
25.03.2013 23:08:41 onReadData::bytesAvailable: 2814
25.03.2013 23:08:41 onReadData::bytesAvailable: 4222
25.03.2013 23:08:41 onReadData::bytesAvailable: 5630
25.03.2013 23:08:41 onReadData::bytesAvailable: 7038
25.03.2013 23:08:41 onReadData::bytesAvailable: 8446
25.03.2013 23:08:41 onReadData::bytesAvailable: 9198

wysota
25th March 2013, 23:19
Since you are getting data on the receive end, no packet is missed.


Where does the "blockSize" value come from? Is it the first two bytes you read from the socket? Do you actually read any data from the socket?

By the way, your client code is potentially wrong in that you disconnect from the peer without waiting for the data to be written to the device. The fact that write() returns doesn't mean the data was actually sent from your app. Either use the bytesWritten() signal or the blocking waitForBytesWritten() call.

jboban
26th March 2013, 00:27
Since you are getting data on the receive end, no packet is missed.
That is so only in the second case, when I manually add one extra character and grow the source file by one byte.

Where does the "blockSize" value come from? Is it the first two bytes you read from the socket? Do you actually read any data from the socket?
Yes, first two bytes are blockSize:

out << (quint16)0;
out << m_sMsg;
out.device()->seek(0);
out << (quint16)(bArrMsg.size() - (int)sizeof(quint16)); // <-- Here is the blockSize
I actually wait for blockSize bytes to be available and then read the whole message. Look at the onReadData::bytesAvailable lines in my last post. It's on the server side, in the onReadData() slot connected to the socket's readyRead() signal.

By the way, your client code is potentially wrong in that you disconnect from the peer without waiting for the data to be written to the device.
I've tried that too, but got the same result. I think you are wrong: in non-blocking TCP communication I don't need to wait for the bytes to be written. The manual for disconnectFromHost() (http://qt-project.org/doc/qt-4.8/qabstractsocket.html#disconnectFromHost) says:

Attempts to close the socket. If there is pending data waiting to be written, QAbstractSocket will enter ClosingState and wait until all data has been written.

wysota
26th March 2013, 00:47
That is so only in the second case, when I manually add one extra character and grow the source file by one byte.
No. That's the case when the application on the receiving end of a TCP stream receives data later in the stream than the supposedly "missed" data. We're talking transport layer here. What happens in the application layer is a different issue and wireshark has nothing to do with this since it doesn't understand your application protocol. I don't know how detailed your knowledge of the networking stack is but it is important what terms you use to refer to protocol data units -- "packets" are data units of the network layer (i.e. Internet Protocol (IP)), "segments" are data units of the TCP protocol (transport layer), let's not mix the two. Thus I can assure you neither packets nor segments are missing from your TCP stream. It could be that the last segments were not received by the peer but that's easy to verify using wireshark - the sender should get a TCP segment with the ACK flag set and an acknowledgment number equal to its own initial sequence number (the one sent in its SYN) plus one, incremented by the size of the data sent, possibly followed by an exchange of segments with the FIN flag set that closes the connection. You can use wireshark to check the last ACK segment you get from the server to see how much data was accepted by the server.
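For checking that last ACK by hand, the arithmetic can be sketched in plain C++ as below. This is only an illustration: the function names are made up for this sketch, and it relies on the standard TCP rules that a SYN and a FIN each consume one sequence number and that sequence numbers wrap modulo 2^32.

```cpp
#include <cassert>
#include <cstdint>

// Acknowledgment number the sender should see once the peer has accepted
// `payloadBytes` bytes of data. The SYN consumes one sequence number, and
// unsigned 32-bit arithmetic wraps the same way TCP sequence numbers do.
std::uint32_t expectedAck(std::uint32_t initialSeq, std::uint32_t payloadBytes)
{
    return initialSeq + 1u + payloadBytes;
}

// Same value once the sender's FIN has been acknowledged as well
// (a FIN consumes one more sequence number).
std::uint32_t expectedAckAfterFin(std::uint32_t initialSeq, std::uint32_t payloadBytes)
{
    return expectedAck(initialSeq, payloadBytes) + 1u;
}
```

With Wireshark's relative sequence numbers, where the initial sequence number is displayed as 0, a fully accepted 9198-byte message (the size from the client log earlier in the thread) would show a final data ACK of 9199, or 9200 once the client's FIN is acknowledged too. A smaller last ACK would show how many bytes the server's TCP stack actually accepted.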


Yes, first two bytes are blockSize:

out << (quint16)0;
out << m_sMsg;
out.device()->seek(0);
out << (quint16)(bArrMsg.size() - (int)sizeof(quint16)); // <-- Here is the blockSize
That's the sending side, I'm asking about the receiving side.


I actually wait for blockSize bytes to be available and then read the whole message. Look at my last post onReadData::bytesAvailable: ... It's on the server side, in the onReadData() slot connected to the socket's readyRead() signal.
It doesn't mean you are reading any data. We only know it sits in your socket. If the receiving buffer is full, transmission is halted until space becomes available. Thus it is important that you actually read data from the socket. Since we don't have access to the code of your reader, I'm asking you whether you are calling read (or readAll) on the socket anywhere after you receive the readyRead() signal.

A totally separate thing is that you're mimicking the inherently broken code of the fortune cookie server example, which is not fit to be treated as a generic data transfer protocol. The code will simply fail to work e.g. in a case where the data you're trying to send is larger than 16382 (or even 16378) bytes. Since you're not checking the data size anywhere nor doing anything to prevent such a situation, your code is inherently broken as well.

jboban
26th March 2013, 05:13
it is important what terms you use to refer to protocol data units -- "packets" are data units of the network layer (i.e. Internet Protocol (IP)), "segments" are data units of the TCP protocol (transport layer), let's not mix the two.
OK, I mean segments, or my data chunks. I can now confirm that the "missed" data is the last segment sent. You are right about that. Thank you.


It doesn't mean you are reading any data. We only know it sits in your socket.
I have now changed the server code so that it actually reads all available bytes, but the same thing still happens.

Again, when I try dd if=./tcp.bin > /dev/tcp/<server_ip>/<server_port> everything is fine. So the server side is just fine. The problem is on the client side.

[Attachment 8855]


The code will simply fail to work e.g. in a case where the data you're trying to send is larger than 16382 (or even 16378) bytes.
Where does that limitation come from? As far as I can see, there is a quint16 (64k) size limit, which allows me to transfer files of up to 32k with DataStream serialization.

What is your suggestion? How should I change my protocol at the network layer?

wysota
26th March 2013, 07:22
So the server side is just fine. The problem is on the client side.
That's not so certain. Your TCP stack may be using different TCP options for both connections. Nevertheless I really suggest you use waitForBytesWritten() to make sure it is not at fault. You can always remove it if it doesn't help. If it doesn't work then tap into the stream between the sender and the receiver like I advised a couple of posts ago to see if all data leaves the sender.


Where does that limitation come from? As far as I can see, there is a quint16 (64k) size limit, which allows me to transfer files of up to 32k with DataStream serialization.
Ekhm... sorry, it was late when I was writing my post :) Yes, 64kB - 6B = 65530B. Still, this problem (and all the others) remains.
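The 65530B figure can be reproduced with a little arithmetic. The sketch below is plain C++ rather than Qt, its function names are made up for illustration, and it assumes the documented QDataStream format for a non-null QString: a 4-byte length field followed by the text as UTF-16, two bytes per code unit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Bytes written for a non-null string of `utf16Units` UTF-16 code units:
// a 4-byte length field plus 2 bytes per code unit (assumed format, see above).
std::size_t serializedStringBytes(std::size_t utf16Units)
{
    return 4 + 2 * utf16Units;
}

// True when that byte count can be stored in a 16-bit size prefix.
bool fitsInQuint16Prefix(std::size_t utf16Units)
{
    return serializedStringBytes(utf16Units) <= std::numeric_limits<std::uint16_t>::max();
}

// What ends up in the prefix if the sender casts without checking:
// only the low 16 bits of the real size survive.
std::uint16_t prefixAfterCast(std::size_t utf16Units)
{
    return static_cast<std::uint16_t>(serializedStringBytes(utf16Units));
}
```

Under that assumption the blockSize of 9196 logged by the server corresponds to a message of 4596 characters, the largest message that still fits is 32765 characters (65530 bytes of text), and a message of 32766 characters would announce a block size of 0.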


What is your suggestion? How should I change my protocol at the network layer?
It depends on what your application is doing. The most basic thing would be to send the size of the data into the socket (e.g. in network byte order) and then start streaming the data itself chunk by chunk.
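As a rough illustration of that idea, here is a minimal sender-side sketch in plain C++ (no Qt). The helper names encodeFrame and splitIntoChunks are hypothetical, and the 4-byte big-endian length field is just one possible choice for "the size in network byte order"; nothing in the thread fixes the field width.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: one frame = 4-byte big-endian payload length + payload.
std::vector<std::uint8_t> encodeFrame(const std::string& payload)
{
    if (payload.size() > 0xFFFFFFFFull)
        throw std::length_error("payload does not fit into a 32-bit length field");
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());

    std::vector<std::uint8_t> frame;
    frame.reserve(4 + payload.size());
    frame.push_back(static_cast<std::uint8_t>(n >> 24)); // most significant byte first
    frame.push_back(static_cast<std::uint8_t>(n >> 16));
    frame.push_back(static_cast<std::uint8_t>(n >> 8));
    frame.push_back(static_cast<std::uint8_t>(n));
    frame.insert(frame.end(), payload.begin(), payload.end());
    return frame;
}

// Hypothetical helper: cut a byte sequence into pieces of at most `chunkSize`
// bytes, so that each piece can be handed to the socket separately.
std::vector<std::vector<std::uint8_t>> splitIntoChunks(
    const std::vector<std::uint8_t>& data, std::size_t chunkSize)
{
    assert(chunkSize > 0);
    std::vector<std::vector<std::uint8_t>> chunks;
    for (std::size_t off = 0; off < data.size(); off += chunkSize) {
        const std::size_t len = std::min(chunkSize, data.size() - off);
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(off);
        chunks.emplace_back(first, first + static_cast<std::ptrdiff_t>(len));
    }
    return chunks;
}
```

In a Qt client each chunk would presumably be written to the socket in turn, with the next one sent only after the previous one has been reported as written; that part is left out here because it depends on how the application drives its event loop.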

jboban
26th March 2013, 08:57
tap into the stream between the sender and the receiver
I have already tried waitForBytesWritten(), without success. How can I use your NetworqDebugger? If I choose SOCKS5 (with a modification of my client), I get the content I sent, but in the console your program loops with the message "QNativeSocketEngine::write() was not called in QAbstractSocket::ConnectedState" and drives the CPU load to 100%. Otherwise, NetworqDebugger connects to my server side and sends wrong data, e.g. a wrong packet size and then only 1 data byte.

The most basic thing would be to send the size of the data into the socket (e.g. in network byte order) and then start streaming the data itself chunk by chunk.
I thought that is what I'm already doing.

wysota
26th March 2013, 09:28
If I choose SOCKS5 (with a modification of my client), I get the content I sent, but in the console your program loops with the message "QNativeSocketEngine::write() was not called in QAbstractSocket::ConnectedState" and drives the CPU load to 100%.
It might suggest one of the ends disconnects before all the data is written.


Otherwise, NetworqDebugger connects to my server side and sends wrong data, e.g. a wrong packet size and then only 1 data byte.
The proxy doesn't modify your data in any way. If it sends incorrect data, it means it has received incorrect data.


I thought that is what I'm already doing.
No. You are sending a blob containing two bytes of serialized integer followed by serialized byte array with data. You're not sending any chunks and you're not encoding the data size properly. You're totally ignoring the fact that size may not fit into 16 bits but you're then serializing all of the data that can be of arbitrary size (e.g. 16GB). I'm assuming you're doing the same thing on the other end -- reading two bytes, interpreting it as blob size and then waiting in a loop for this amount of data to come in. At best it will put your peers out of sync, at worst it will DOS your receiver (if no data ever comes in) and possibly your sender (as you may run out of RAM trying to send a very large chunk of data that is much larger than 64kB).
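On the receiving side, those failure modes can be avoided by buffering whatever arrives, extracting only complete messages, and rejecting an announced size above a sane limit. The following is a minimal sketch in plain C++ (no Qt); the FrameReader class, its method names and the 4-byte big-endian length prefix are assumptions made for illustration, not code from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical receiver-side buffer for frames of the form
// "4-byte big-endian payload length + payload".
class FrameReader {
public:
    explicit FrameReader(std::uint32_t maxPayload) : maxPayload_(maxPayload) {}

    // Append bytes exactly as they arrive from the socket (any chunking).
    // Returns false once a frame announces more than maxPayload bytes, so the
    // caller can drop the connection instead of waiting for data forever.
    bool feed(const std::uint8_t* data, std::size_t len)
    {
        assert(data != nullptr || len == 0);
        if (failed_) return false;
        buffer_.insert(buffer_.end(), data, data + len);

        while (buffer_.size() >= 4) {
            const std::uint32_t n = (std::uint32_t(buffer_[0]) << 24) |
                                    (std::uint32_t(buffer_[1]) << 16) |
                                    (std::uint32_t(buffer_[2]) << 8) |
                                     std::uint32_t(buffer_[3]);
            if (n > maxPayload_) { failed_ = true; return false; }
            if (buffer_.size() - 4 < n) break; // payload not complete yet

            const auto first = buffer_.begin() + 4;
            const auto last = first + static_cast<std::ptrdiff_t>(n);
            messages_.emplace_back(first, last);  // copy out one whole message
            buffer_.erase(buffer_.begin(), last); // consume header and payload
        }
        return true;
    }

    const std::vector<std::string>& messages() const { return messages_; }
    std::size_t pendingBytes() const { return buffer_.size(); }

private:
    std::uint32_t maxPayload_;
    bool failed_ = false;
    std::vector<std::uint8_t> buffer_;
    std::vector<std::string> messages_;
};
```

In a Qt server the bytes passed to feed() would be whatever a read on the socket returns each time readyRead() fires, so the socket's own buffer is drained on every signal rather than left to fill up. A timeout for a frame that never completes would still be needed on top of this.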

jboban
27th March 2013, 00:07
You're totally ignoring the fact that size may not fit into 16 bits but you're then serializing all of the data that can be of arbitrary size (e.g. 16GB).
This is a specific case and I don't have big files, but you are right. I have to redesign the complete protocol and hope it will work then. Thank you very much for your help; I will feel free to ask for help again.

wysota
27th March 2013, 07:23
I hope that when you do that, your remaining problems with the transfer will be solved automatically as a side effect of improving the protocol.