Strange QTcpSocket behavior (debugged with Ethereal)



mdecandia
10th May 2007, 15:51
Hi All,
I've debugged the TCP communication done by QTcpSocket and found a problem that was already posted to the Qt lists but that nobody replied to. Every time I connect to a QTcpServer, the connection is made twice: the first attempt is reset, and the second one succeeds.
The post I'm referring to is this one:

http://lists.trolltech.com/qt-interest/2006-10/thread00472-0.html

------------------------------------------------------------------------
Hi There,

Okay, actually I asked Trolltech support but have had no ticket number for over two
hours, and I am in a rush now, so perhaps someone has experienced the same
before.


There is an interesting thing going on TCP channels with both Qt version
4.1.4 and 4.2.0; the test platform was windows xp sp 2 on the client side
(multiple computers).

The implementation is using a QTcpSocket inside a class based on QThread. We
found during testing that the QTcpSocket opens a ghost connection in the
test case (as well as in our real life application).

For example, Ethereal shows the connection as it is first opened from the
client to the server, with the following byte stream (SYN):
0000 00 10 f3 06 82 f9 00 0e 7b cb 99 72 08 00 45 00 ........ {..r..E.
0010 00 30 1a 83 40 00 80 06 c4 81 8a f9 02 c7 8a f9 .0..@... ........
0020 03 0a 05 14 07 d4 fd 2b 89 bc 00 00 00 00 70 02 .......+ ......p.
0030 c0 00 13 8b 00 00 02 04 05 b4 01 01 04 02 ........ ......

Then the server sees the connection request and responds (SYN, ACK):
0000 00 0e 7b cb 99 72 00 10 f3 06 82 f9 08 00 45 00 ..{..r.. ......E.
0010 00 2c 01 10 00 00 ff 06 9e f8 8a f9 03 0a 8a f9 .,...... ........
0020 02 c7 07 d4 05 14 6e 87 f8 82 fd 2b 89 bd 60 12 ......n. ...+..`.
0030 10 00 71 77 00 00 02 04 05 b4 8a f9 ..qw.... ....

Suddenly, the client that opened the connection drops it (RST):
0000 00 10 f3 06 82 f9 00 0e 7b cb 99 72 08 00 45 00 ........ {..r..E.
0010 00 28 1a 85 00 00 80 06 04 88 8a f9 02 c7 8a f9 .(...... ........
0020 03 0a 05 14 07 d4 fd 2b 89 bd fd 2b 89 bd 50 04 .......+ ...+..P.
0030 00 00 79 62 00 00 ..yb..

Then the client opens another connection, which goes fine. This means extra
network load and also confuses our server, so we must find a solution for
the problem. Remarkably, when QTcpSocket was removed and native O/S calls
were used, the ghost socket problem disappeared. The TCP stack settings look
okay, and the success of the native calls suggests that the problem is
related to the sample code and probably to Qt.

The issue is clearly reproducible and happens every single time after the
first connect, namely if and when connectToHost and waitForConnected are used.

Please take a look at the following simple example code so that you can
test it on your own boxes. In the debug output, it is clearly visible that only
every other port number is used. It is possible to reproduce the problem even in
step-by-step mode, so this must not be a timing issue.

// Fragment from a QThread subclass (msleep() is a QThread member).
// HostName, Port and MessageFactory are defined elsewhere in the application.
QTcpSocket *Socket = new QTcpSocket();
for (int i = 0; i < 10; i++) {

    qDebug("Connecting to host...");
    Socket->connectToHost(HostName, Port);
    if (!Socket->waitForConnected(5000)) {
        qDebug("%s", qPrintable(QString("TEST_Connection error: %1.").arg(Socket->errorString())));
    } else {
        qDebug("Connection established");
        Socket->dumpObjectInfo();
        // %1 added so that arg() has a placeholder to fill with the local port.
        qDebug("%s", qPrintable(QString("Port: %1").arg(Socket->localPort())));

        msleep(1000);

        kone::bmtcp::Message *Message =
            MessageFactory.getFactor().createHeartBeatMsg();

        qint64 WrittenBytes = Socket->write((const char*)Message->_buffer, Message->_bufferSize);
        Socket->flush();

        if (WrittenBytes != Message->_bufferSize) {
            qDebug("%s", qPrintable(QString("SendTcpMessage failed. Bytes written: %1 instead of %2 bytes.")
                                        .arg(WrittenBytes).arg(Message->_bufferSize)));
        } else {
            qDebug("Sending was OK");
        }

        msleep(1000);

        Socket->disconnectFromHost();
        Socket->close();
        qDebug("Connection to server has been closed.");
    }

    msleep(1000);
}

It does not even help if I create (new) and delete the QTcpSocket every time
in the for loop; the observed behavior is the same: two connections are opened to
the host server, and the first one gets terminated by the QTcpSocket-based
client within about 1-2 ms.

Please find attached a small Ethereal trace (the address ending in .199 is the
client and .10 is the server).

Bye,
Sandor
------------------------------------------------------------------------

I have the same problem with Qt 4.2.3.

Does anybody have experience with this?

Thanks,
Michele De Candia

slcotter
10th May 2007, 16:03
Hey, this is the exact same problem I'm having!

For me, I can live with the simple workaround of calling connectToHost(...) twice instead of just once.

mdecandia
10th May 2007, 16:13
I've resolved it with your workaround but, IMHO, this is a programming absurdity.
Thanks

Michele

mdecandia
10th May 2007, 16:46
Continuing the debugging with Ethereal, I see many CRC errors on TCP segments. If you want to test it, the problem also occurs with the fortune example.

Any idea?
Thanks,

Michele

mdecandia
11th May 2007, 09:01
I think that the workaround doesn't resolve the problem. The connection is still reset some of the time, as in the original code.

slcotter
11th May 2007, 16:43
I believe that. The program prints an error to the terminal saying that connectToHost has already been called, but for some reason it has a positive effect on establishing the connection. It's not 100%, but it's functional enough for me to move forward with my implementation. Luckily for me, I don't need it to be 100%. It would be nice, but I can live with this.

If you have the option, you may want to try the previous version of Qt, as I never encountered these issues while I was using that version (4.2.2), though that was on different hardware.

Thoosle
26th May 2007, 18:55
One thing I've noticed, though I haven't tested it and can't see why it should make a difference, is that the example code I've seen using connectToHost() followed by waitForConnected() creates the socket on the stack, not the free store. My code creates the socket using "new", and I did have the problem you are seeing when using connectToHost() followed by waitForConnected(): I saw the TCP three-packet handshake being screwed up, same as you.

What I did was alter my code to call connectToHost() with the socket's error signal connected to a slot that handles any error conditions emitted by the socket, and not use waitForConnected() at all. This way waitForConnected(), which seemed to be the cause of the problem, was eliminated (along with the weird handshake), and I get the benefit of reporting and using the available socket errors.

IMHO, putting the error() signal to use is a better and more versatile way to accomplish what waitForConnected() is intended to do, without whatever it is that causes this problem.

Finally, I suspect waitForConnected() isn't broken and we're just doing something we shouldn't, but I have no idea what that is. Again, though, I think employing the error() signal is better anyway.

wysota
26th May 2007, 19:01
Could you prepare a minimal compilable example reproducing the problem?

Thoosle
26th May 2007, 19:22
My last post wasn't entirely clear. In my socket client class I create a QTcpSocket object on the free store and, using signals and slots, connect the socket's "connected" signal to one slot and the socket's error() signal to another slot that handles any error conditions that might arise. I call connectToHost() in a method that simply returns and does nothing else. When the connection completes successfully, the "connected" signal is emitted and calls a slot that then continues the communication with the server. But if the connection fails for some reason, the "connected" signal won't be emitted, and the socket's error() signal will call the slot that processes the error condition instead.

It's a little more work, but it adds a lot more utility and eliminates using connectToHost() followed by waitForConnected(), which for whatever reason is causing this weird initial connection handshake.

Thoosle
26th May 2007, 19:57
Hi Wysota,

The server I use closes down for the weekend until later tomorrow, so maybe I'll write and test this with something minimal then. But in testing this yesterday I commented out my code down to either of the following two cases. It's as minimal as it gets.

in the parent object constructor...... QTcpSocket *pSocket = new QTcpSocket(this);

then in a parent object method the first case.....

    pSocket->connectToHost(host, port);

    if (!pSocket->waitForConnected(1000))
    {
        return false;
    }

or the second case with just connectToHost() alone.....

    pSocket->connectToHost(host, port);

Comparing these two cases and looking at the initial TCP three-packet exchange with Wireshark (Ethereal), the latter works properly. The former results in the client first sending the initial packet with the SYN flag set and a return (source) port of, say, 30000. But this is followed by it sending another SYN packet with the return port incremented by 1. It's as if the first attempt failed and so it tries again. The third packet is the server sending SYN and ACK in answer to the first request. The client then sends RST to close that connection!! All as if waitForConnected() is somehow adversely affecting things, and I have no idea why.

wysota
26th May 2007, 20:24
Can't you use... I don't know... www.google.com:80 as the server?

The reason I ask about a compilable example is that you can (1) verify your own code and make sure you can reproduce the problem without any other code influencing it and (2) we can work on the same code and try to reproduce the problem ourselves. In many cases just implementing the minimal example is enough as the author discovers the problem himself (as the code is minimal and doesn't try to do anything else - see reason1).

Thoosle
26th May 2007, 23:02
wysota,

Yes, of course what was I thinking??? I wrote the following:

#include <stdio.h>
#include <stdlib.h>
#include <QtNetwork>

int main(int argc, char *argv[])
{
    QTcpSocket *pSocket = new QTcpSocket;

    pSocket->connectToHost("www.google.com", 80);

    if (pSocket->waitForConnected(1000))
    {
        printf("\nconnected!\n");
    }

    return EXIT_SUCCESS;
}

This runs and exits normally and prints "connected!". The captured packet exchange looks normal, but I get the following two run-time messages:

QObject::connect: Cannot connect (null)::aboutToQuit() to QHostInfoAgent::cleanup()
QObject::connect: Cannot connect (null)::destroyed(QObject *) to QHostInfoAgent::cleanup()

but if you comment out:

    if (pSocket->waitForConnected(1000))
    {
        printf("\nconnected!\n");
    }

then the code runs the same as above but without the QObject... messages.

wysota
26th May 2007, 23:06
I think there is nothing to worry about here. Try creating a QApplication object at the beginning of main. The warnings will probably disappear.

So the bottom line is the transmission is fine? If so, then compare this code to your original application and try to spot differences.

Thoosle
26th May 2007, 23:21
wysota,

Yes, the packet exchange seems solid now. Question: could it possibly be a server related thing like a timing issue of some sort with the host? The reason I ask is because in my app when I say I've commented everything down to a few lines of code, I'm not kidding, I did just that. I don't see anything else that could be involved, but, of course what I'm not seeing is probably where the answer is :) The only thing I haven't done is to try that code, with a different server. Which is next on my list...... I'll report if I find anything useful........thanks!!

Thoosle
26th May 2007, 23:38
wysota,

I just tried the entire offending project code, except that I edited the host and port values in connectToHost() to point to google.com port 80, and it connected with a perfect handshake 5 times in a row. There is no way the server I've been connecting to would do that. No conclusions here, just FYI. Tomorrow I'm going to try the simplified code with the server in question when it comes back online and see what that does.

wysota
27th May 2007, 00:26
Does the "presumed faulty" server run IIS, by any chance?

Thoosle
27th May 2007, 03:34
I have no idea......

wysota
27th May 2007, 08:19
Forget it, it doesn't matter. I somehow assumed you were using the HTTP port which is surely a false assumption...

What kind of service (application) do you connect to? Did you write it yourself or is it some well known daemon?

Thoosle
27th May 2007, 22:46
Hello Witold,

First, FYI I just tried the minimal code with the server in question and it had a weird initial handshake 5 out of 6 times.....apparently it's not Qt at all but the server.......

I'll try to answer your question: I have a small lan here and it has a windows xp machine and Linux machine. I trade index futures on the XP machine with a java app(provided by the brokerage) that connects to the brokerage server in New York and I use this app to place orders. Also on the XP machine is a charting application(from a different company) to chart the futures data and I pay an annual fee for this charting app. This charting app connects(via local loop) using tcp/ip to the java trading app(all on the xp machine) to get historical and live streaming data.

The java app for trading is the interface to my brokerage and is the "server" that I'm connecting to. They(the brokerage) provide an API so I can write my own trading and charting apps if I so choose. The java trading app is the interface to the brokerage that takes care of my logging in and trading, etc and any app I might write will have to communicate with and through this trading app. And since this trading client app is a Java app I can also run it on my linux machine and it works great there.(haven't checked for this prob. on linux).

What I'm working on is a Linux/Qt charting app to replace the one I'm paying for. This would allow me to move my trading/charting to my Linux machine...one of the last things I do on that machine and part of my goal to get away from windows(and save myself a little money as well). So, in the meantime as I work on my app, the server that I'm connecting to is, as I've said, just the xp machine 3 feet to my right that's running the java trading client(server).

Thoosle
28th May 2007, 02:20
I think I have a better understanding of what the problem is although I can't say why it is.....only the server knows for sure.....

Looking at Wireshark and the initial TCP handshake, I see now that after the initial SYN packet is sent it takes approximately 1 msec for the server to respond with SYN and ACK. But, before the server responds (about 300 usec after the first attempt), the client has apparently concluded that the connection didn't work, and so it increments its listening port number and sends another SYN packet to try to connect again. So, by the time the server responds (1 msec after the initial attempt), the client is no longer listening on the port it originally sent. And after the server tries to respond to the first connection attempt, the client sends RST to cancel that first attempt, since it's no longer listening there!! Shortly after that the server catches up and replies with SYN/ACK to the second attempt. From that point communication proceeds normally.

I think this is exactly what was happening to the original posters on this thread. For whatever reason, the server I'm connecting with is not responding to the first connection attempt fast enough!! I worked around this in my code by putting connectToHost() in its own method that simply invokes it and returns. The socket's connected or error signal then invokes either a connected method or a method that deals with the failed connection attempt.

The question at this point is: is the server not responding fast enough, or is the connection attempt unreasonable in its expectations regarding timing? Anyone have thoughts on this? Is the server simply slow, or do I need to configure something relative to the socket or Linux or whatever?

wysota
28th May 2007, 08:36
"But, before the server responds (about 300 usec after the first attempt), the client has apparently concluded that the connection didn't work, and so it increments its listening port number and sends another SYN packet to try to connect again."
TCP doesn't work that way. Timeouts are much longer, and the SYN should be repeated on the same port. Using a different port suggests the previous try was abandoned.


"And after the server tries to respond to the first connection attempt, the client sends RST to cancel that first attempt, since it's no longer listening there!!"
This is normal behaviour. You could try to use iptables (or similar) to stop the client machine from sending RST (-j DROP instead of -j REJECT) and see what happens then.


"Shortly after that the server catches up and replies with SYN/ACK to the second attempt. From that point communication proceeds normally."
So why doesn't it happen with the second attempt?


"I think this is exactly what was happening to the original posters on this thread. For whatever reason, the server I'm connecting with is not responding to the first connection attempt fast enough!!"
There is no such thing as "fast enough" with TCP. The initial TCP timeout can often be changed by tweaking system settings, but it's surely bigger than 0.1 s. Otherwise you could never connect to lagging or distant servers (connections through a satellite link would not be possible either).


"The question at this point is: is the server not responding fast enough, or is the connection attempt unreasonable in its expectations regarding timing? Anyone have thoughts on this? Is the server simply slow, or do I need to configure something relative to the socket or Linux or whatever?"

I suggest you try with different servers. If you were connecting to IIS, it has a well-known "feature" of cheating the statistics, and you might be experiencing that. But since you're connecting to some Java software, it's either a problem with that software or with the operating system. Does the same happen (on the same target box) when using the Qt servers from the networking examples?

Thoosle
28th May 2007, 16:42
I agree, if the first connection attempt were given enough time. I should have said fast enough for whatever it is that's causing the first attempt to throw in the towel and try again!! The thing is, these timing issues, with the first connection attempt failing and then being retried, don't always happen. The initial exchange works fine sometimes but fails most of the time. And I'm talking about the minimal code app I wrote for testing. If I comment out the statement with waitForConnected(1000), then the initial packet exchange seems rock solid, acting properly each time I try it.

Here's another clue: I configured the Java app to run on the Linux machine and pointed the minimal test code at 127.0.0.1. Wireshark shows rock solid connections even with the waitForConnected(1000) line in place, suggesting this possibly has something to do with the XP machine.

Also, on the XP machine I run Norton Internet Security. I turned it all off, including the Windows firewall (which was never enabled anyway), and got the failing exchange as usual.

I haven't tried a test of Qt client/server example code from the Linux machine to the windows machine, I'm not setup to build Qt apps on windows at the moment but it's about the only test left!!

wysota
28th May 2007, 18:24
At least it clearly indicates the problem is with the host machine.

Thoosle
28th May 2007, 20:44
yes sir! I think it does. Witold, thanks for all the good comments and insight.....also, putting together this forum.....a really valuable resource for beginners like myself..