How to use QNetworkAccessManager to crawl many webpages at the same time?
Hello everyone,
I want to use QNetworkAccessManager to crawl some webpages on one website. The URLs of the pages follow a regular pattern, so I wrote some code like this:
Code:
QNetworkAccessManager* pManager = new QNetworkAccessManager(this);
connect(pManager, SIGNAL(finished(QNetworkReply*)), this, SLOT(read(QNetworkReply*)));

int iPageCount = 2000;
for (int i = 0; i < iPageCount; ++i)
{
    QNetworkRequest request;
    request.setUrl(QString("http://www.abc.com/%1.html").arg(QString::number(i)));
    pManager->get(request);
}
But the code did not work correctly; I cannot get any data in read(QNetworkReply*).
Is there any mistake in my code? Or is there another solution for what I need?
Re: How to use QNetworkAccessManager to crawl many webpages at the same time?
Do you have an event loop running?
The requests will all be performed asynchronously, so until you return from whatever method you're executing above and re-enter your event loop, you won't receive any signals.
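For illustration, here is a minimal standalone sketch (hypothetical URLs; assumes Qt 5 or later for the functor-style connect) that queues the requests and then enters the event loop, which is where the transfers actually run and the finished() signals get delivered:

```cpp
#include <QCoreApplication>
#include <QDebug>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QUrl>

int main(int argc, char* argv[])
{
    QCoreApplication app(argc, argv);

    QNetworkAccessManager manager;
    int remaining = 5;

    // Functor-style connect; a SIGNAL/SLOT connect to a slot of your
    // own QObject subclass works just as well.
    QObject::connect(&manager, &QNetworkAccessManager::finished,
                     [&](QNetworkReply* reply) {
        if (reply->error() == QNetworkReply::NoError)
            qDebug() << reply->url() << "->" << reply->readAll().size() << "bytes";
        else
            qDebug() << reply->url() << "failed:" << reply->errorString();
        reply->deleteLater();
        if (--remaining == 0)
            app.quit();  // leave the event loop once every reply is in
    });

    // This only queues the requests; nothing is transferred yet.
    for (int i = 0; i < 5; ++i)
        manager.get(QNetworkRequest(QUrl(QString("http://www.abc.com/%1.html").arg(i))));

    // The downloads run (and finished() fires) only inside this event loop.
    return app.exec();
}
```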
Re: How to use QNetworkAccessManager to crawl many webpages at the same time?
Quote:
Originally Posted by
momo
But the code did not work correctly; I cannot get any data in read(QNetworkReply*).
Is there any mistake in my code?
You forgot to post the code of that method. Hard to tell if you have a mistake in your code without seeing it.
Cheers,
_
Re: How to use QNetworkAccessManager to crawl many webpages at the same time?
Quote:
Originally Posted by
jefftee
Do you have an event loop running?
The requests will all be performed asynchronously, so until you return from whatever method you're executing above and re-enter your event loop, you won't receive any signals.
You mean that I need to set a flag to block the "for" loop until I have received the previous reply?
Code:
void class1::fun1()
{
    QNetworkAccessManager* pManager = new QNetworkAccessManager(this);
    connect(pManager, SIGNAL(finished(QNetworkReply*)), this, SLOT(read(QNetworkReply*)));

    int iPageCount = 2000;
    for (int i = 0; i < iPageCount; ++i)
    {
        QNetworkRequest request;
        request.setUrl(QString("http://www.abc.com/%1.html").arg(QString::number(i)));
        pManager->get(request);

        m_bFlag = true;
        while (m_bFlag)
        {
            qApp->processEvents();
        }
    }
}

void class1::read(QNetworkReply* pReply)
{
    pReply->abort();
    pReply->close();
    pReply->deleteLater();
    m_bFlag = false;
}
Quote:
Originally Posted by
anda_skoa
You forgot to post the code of that method. Hard to tell if you have a mistake in your code without seeing it.
Cheers,
_
That code just reads the byte array and calls a function to parse it, like this:
Code:
void class1::read(QNetworkReply* pReply)
{
    pReply->abort();
    pReply->close();
    pReply->deleteLater();
    parse(array);
}

void class1::parse(const QByteArray& array)
{
    ...
}
Re: How to use QNetworkAccessManager to crawl many webpages at the same time?
Quote:
Originally Posted by
momo
You mean that I need to set a flag to block the "for" loop until I have received the previous reply?
No.
The loop just schedules downloads: each call to QNetworkAccessManager::get() puts a request into the QNAM's internal queue, which it then processes once event processing resumes.
So you need to make sure that the execution flow returns to the event loop after the method has finished.
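Concretely, a non-blocking version of the code in this thread could look like the following sketch (m_pManager and m_iPending are hypothetical members of class1, which must declare Q_OBJECT for the connect to work; QNAM itself limits how many connections it opens per host and works through the rest of the queue on its own):

```cpp
// Sketch only: queue everything, return, and let the event loop
// drive the downloads.
void class1::fun1()
{
    m_pManager = new QNetworkAccessManager(this);
    connect(m_pManager, SIGNAL(finished(QNetworkReply*)),
            this, SLOT(read(QNetworkReply*)));

    m_iPending = 2000;
    for (int i = 0; i < m_iPending; ++i)
    {
        QNetworkRequest request(QUrl(QString("http://www.abc.com/%1.html").arg(i)));
        m_pManager->get(request);  // only queues the request
    }
    // Return immediately: QNAM serves a few requests at a time and
    // emits finished() for each one from the event loop.
}

void class1::read(QNetworkReply* pReply)
{
    if (pReply->error() == QNetworkReply::NoError)
        parse(pReply->readAll());
    pReply->deleteLater();

    if (--m_iPending == 0)
        qDebug() << "all pages fetched";
}
```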
Quote:
Originally Posted by
momo
That code just reads the byte array and calls a function to parse it, like this:
So read(QNetworkReply*) is being called but "array" is empty?
Have you checked the error() of "pReply"?
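For reference, a read() slot that checks the reply's error before parsing might look like this sketch (hypothetical; note that it calls readAll() before releasing the reply, because calling abort() or close() on the reply first, as the code above does, discards the downloaded data):

```cpp
void class1::read(QNetworkReply* pReply)
{
    if (pReply->error() != QNetworkReply::NoError)
    {
        qDebug() << "request failed:" << pReply->url() << pReply->errorString();
    }
    else
    {
        const QByteArray array = pReply->readAll();  // grab the payload first
        parse(array);
    }
    pReply->deleteLater();  // no abort()/close() needed on a finished reply
}
```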
Cheers,
_