
How to use QNetworkAccessManager to crawl many web pages at the same time?



momo
11th August 2015, 03:27
Hello everyone,

I want to use QNetworkAccessManager to crawl some web pages on one website. The page URLs follow a regular pattern, so I wrote some code like this:


QNetworkAccessManager* pManager = new QNetworkAccessManager(this);
connect(pManager, SIGNAL(finished(QNetworkReply*)), this, SLOT(read(QNetworkReply*)));

int iPageCount = 2000;
for (int i = 0; i < iPageCount; ++i)
{
    QNetworkRequest request;
    request.setUrl(QUrl(QString("http://www.abc.com/%1.html").arg(i)));

    pManager->get(request);
}

But the code did not work correctly: I cannot get any data in read(QNetworkReply*).
Is there a mistake in my code? Or is there another solution for what I need?

jefftee
11th August 2015, 06:47
Do you have an event loop running?

The requests will all be performed asynchronously, so until you return from whatever method you're executing above and re-enter your event loop, you won't receive any signals.

anda_skoa
11th August 2015, 08:33
> But the code did not work correctly: I cannot get any data in read(QNetworkReply*).
> Is there a mistake in my code?

You forgot to post the code of that method. Hard to tell if you have a mistake in your code without seeing it.

Cheers,
_

momo
13th August 2015, 09:36
> Do you have an event loop running?
>
> The requests will all be performed asynchronously, so until you return from whatever method you're executing above and re-enter your event loop, you won't receive any signals.

You mean that I need to set a flag to block the "for" loop until I have received the previous reply?


void class1::fun1()
{
    QNetworkAccessManager* pManager = new QNetworkAccessManager(this);
    connect(pManager, SIGNAL(finished(QNetworkReply*)), this, SLOT(read(QNetworkReply*)));

    int iPageCount = 2000;
    for (int i = 0; i < iPageCount; ++i)
    {
        QNetworkRequest request;
        request.setUrl(QUrl(QString("http://www.abc.com/%1.html").arg(i)));

        pManager->get(request);

        // Wait here until read() has handled the reply for this request.
        m_bFlag = true;

        while (m_bFlag)
        {
            qApp->processEvents();
        }
    }
}

void class1::read(QNetworkReply* pReply)
{
    QByteArray array = pReply->readAll();
    pReply->abort();
    pReply->close();
    pReply->deleteLater();
    // Let the loop in fun1() issue the next request.
    m_bFlag = false;
}



> You forgot to post the code of that method. Hard to tell if you have a mistake in your code without seeing it.
>
> Cheers,
> _

That code just reads the byte array and calls a function to parse it, like this:


void class1::read(QNetworkReply* pReply)
{
    QByteArray array = pReply->readAll();

    pReply->abort();
    pReply->close();
    pReply->deleteLater();

    parse(array);
}

void class1::parse(const QByteArray& array)
{
    ...
}

anda_skoa
13th August 2015, 11:06
> You mean that I need to set a flag to block the "for" loop until I have received the previous reply?

No.
The loop just schedules downloads: each call to QNetworkAccessManager::get() puts a request into the QNAM's internal queue, which it then processes once event processing commences.

So you need to make sure that the execution flow returns to the event loop after the method is finished.
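
To illustrate the scheduling behaviour, here is a minimal sketch that models it in plain C++ without Qt. ToyEventLoop, ToyManager, Result and scheduleAndRun are hypothetical names made up for this illustration, not Qt API, and the real QNetworkAccessManager does actual network I/O and is considerably more involved. The only point shown is that get() queues work, and the "finished" callback runs once the event loop gets to process the queue.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an event loop: posted tasks are only queued,
// and they run when run() is called, not at the moment they are posted.
class ToyEventLoop {
public:
    void post(std::function<void()> task) { tasks_.push(std::move(task)); }

    // Drain the queue and return how many tasks were executed.
    int run() {
        int executed = 0;
        while (!tasks_.empty()) {
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop();
            task();
            ++executed;
        }
        return executed;
    }

private:
    std::queue<std::function<void()>> tasks_;
};

// Hypothetical stand-in for a network manager: get() only schedules work.
class ToyManager {
public:
    ToyManager(ToyEventLoop& loop, std::function<void(const std::string&)> onFinished)
        : loop_(loop), onFinished_(std::move(onFinished)) {}

    void get(const std::string& url) {
        // The "finished" callback is deferred until the loop processes it.
        loop_.post([this, url] { onFinished_("reply for " + url); });
    }

private:
    ToyEventLoop& loop_;
    std::function<void(const std::string&)> onFinished_;
};

// Number of replies seen before and after the event loop has run.
struct Result {
    std::size_t beforeLoop;
    std::size_t afterLoop;
};

Result scheduleAndRun(int pageCount) {
    ToyEventLoop loop;
    std::vector<std::string> replies;
    ToyManager manager(loop, [&replies](const std::string& reply) {
        replies.push_back(reply);
    });

    // Same shape as the loop in the question: this only schedules requests.
    for (int i = 0; i < pageCount; ++i) {
        manager.get("http://www.abc.com/" + std::to_string(i) + ".html");
    }

    Result result;
    result.beforeLoop = replies.size();  // nothing has been delivered yet
    loop.run();                          // "returning to the event loop"
    result.afterLoop = replies.size();   // every scheduled reply delivered
    return result;
}
```

In this model, scheduleAndRun(3) sees no replies right after the "for" loop and three replies once the loop has run, which is why no flag or blocking is needed inside the "for" loop itself.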



> That code just reads the byte array and calls a function to parse it, like this:

So read(QNetworkReply*) is being called but "array" is empty?
Have you checked the error() of "pReply"?

Cheers,
_