Thread: How to use QNetworkAccessManager to crawl many web pages at the same time?

  1. #1
    Join Date
    Feb 2014
    Posts
    28
    Thanks
    6
    Qt products
    Qt4
    Platforms
    Windows

    Hello everyone,

    I want to use QNetworkAccessManager to crawl a number of pages on one website. The page URLs follow a regular pattern, so I wrote code like this:

    Qt Code:
    // One manager for all requests; every finished reply is delivered to read().
    QNetworkAccessManager* pManager = new QNetworkAccessManager(this);
    connect(pManager, SIGNAL(finished(QNetworkReply*)), this, SLOT(read(QNetworkReply*)));

    // Queue one GET request per page.
    int iPageCount = 2000;
    for (int i = 0; i < iPageCount; ++i)
    {
        QNetworkRequest request;
        request.setUrl(QString("http://www.abc.com/%1.html").arg(QString::number(i)));

        pManager->get(request);
    }

    But the code does not work correctly: I cannot get any data in read(QNetworkReply*).
    Is there a mistake in my code, or is there another way to do what I need?

  2. #2
    Join Date
    Dec 2009
    Location
    New Orleans, Louisiana
    Posts
    791
    Thanks
    13
    Thanked 153 Times in 150 Posts
    Qt products
    Qt5
    Platforms
    MacOS X

    Do you have an event loop running?

    The requests will all be performed asynchronously, so until you return from whatever method you're executing above and re-enter your event loop, you won't receive any signals.
    I write the best type of code possible, code that I want to write, not code that someone tells me to write!

  3. #3
    Join Date
    Jan 2006
    Location
    Graz, Austria
    Posts
    8,416
    Thanks
    37
    Thanked 1,544 Times in 1,494 Posts
    Qt products
    Qt3 Qt4 Qt5
    Platforms
    Unix/X11 Windows

    Quote: Originally posted by momo
    But the code does not work correctly: I cannot get any data in read(QNetworkReply*).
    Is there a mistake in my code?

    You forgot to post the code of that method. It is hard to tell whether your code has a mistake without seeing it.

    Cheers,
    _

  4. #4
    Join Date
    Feb 2014
    Posts
    28
    Thanks
    6
    Qt products
    Qt4
    Platforms
    Windows

    Quote: Originally posted by jefftee
    Do you have an event loop running?

    The requests will all be performed asynchronously, so until you return from whatever method you're executing above and re-enter your event loop, you won't receive any signals.

    Do you mean that I need to set a flag so that the "for" loop blocks until I have received the previous reply?
    Qt Code:
    void class1::fun1()
    {
        QNetworkAccessManager* pManager = new QNetworkAccessManager(this);
        connect(pManager, SIGNAL(finished(QNetworkReply*)), this, SLOT(read(QNetworkReply*)));

        int iPageCount = 2000;
        for (int i = 0; i < iPageCount; ++i)
        {
            QNetworkRequest request;
            request.setUrl(QString("http://www.abc.com/%1.html").arg(QString::number(i)));

            pManager->get(request);

            // Block here, pumping events by hand, until read() clears the flag.
            m_bFlag = true;

            while (m_bFlag)
            {
                qApp->processEvents();
            }
        }
    }

    void class1::read(QNetworkReply* pReply)
    {
        QByteArray array = pReply->readAll();
        pReply->abort();
        pReply->close();
        pReply->deleteLater();
        m_bFlag = false;  // lets the loop in fun1() continue
    }

    Quote: Originally posted by anda_skoa
    You forgot to post the code of that method. Hard to tell if you have a mistake in your code without seeing it.

    Cheers,
    _

    That code just reads the byte array and calls a function to parse it, like this:
    Qt Code:
    void class1::read(QNetworkReply* pReply)
    {
        // Take whatever data the reply holds, then release the reply.
        QByteArray array = pReply->readAll();

        pReply->abort();
        pReply->close();
        pReply->deleteLater();

        parse(array);
    }

    void class1::parse(const QByteArray& array)
    {
        ...
    }

  5. #5
    Join Date
    Jan 2006
    Location
    Graz, Austria
    Posts
    8,416
    Thanks
    37
    Thanked 1,544 Times in 1,494 Posts
    Qt products
    Qt3 Qt4 Qt5
    Platforms
    Unix/X11 Windows

    Quote: Originally posted by momo
    Do you mean that I need to set a flag so that the "for" loop blocks until I have received the previous reply?

    No.
    The loop just schedules downloads: each call to QNetworkAccessManager::get() puts a request into the QNAM's internal queue, which it then processes once event processing commences.

    So you need to make sure that the execution flow returns to the event loop after the method is finished.
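
    To make the "schedule now, process later" behaviour concrete, here is a toy model in plain C++ that does not depend on Qt. It is only a sketch and not how QNAM is implemented: ToyEventLoop, ToyManager, CrawlResult and crawl() are made-up names, and a "reply" is just a string. It shows that nothing has been delivered when get() returns, and that the replies arrive once the loop runs.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an event loop (hypothetical, not a Qt class):
// post() only queues work, and nothing executes until run() is called.
class ToyEventLoop {
public:
    void post(std::function<void()> task) { tasks_.push(std::move(task)); }

    // Drain the queue; returns how many tasks were executed.
    std::size_t run() {
        std::size_t executed = 0;
        while (!tasks_.empty()) {
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop();
            task();
            ++executed;
        }
        return executed;
    }

private:
    std::queue<std::function<void()>> tasks_;
};

// Toy stand-in for a network manager (hypothetical): get() schedules the
// "download" on the loop and returns immediately.
class ToyManager {
public:
    ToyManager(ToyEventLoop& loop,
               std::function<void(const std::string&)> onFinished)
        : loop_(loop), onFinished_(std::move(onFinished)) {}

    void get(const std::string& url) {
        loop_.post([this, url] { onFinished_("reply for " + url); });
    }

private:
    ToyEventLoop& loop_;
    std::function<void(const std::string&)> onFinished_;
};

// Number of replies seen before and after the loop was allowed to run.
struct CrawlResult {
    std::size_t beforeLoop;
    std::size_t afterLoop;
};

// Schedule pageCount requests in a for loop, then let the loop process them.
CrawlResult crawl(int pageCount) {
    ToyEventLoop loop;
    std::vector<std::string> replies;
    ToyManager manager(loop, [&replies](const std::string& reply) {
        replies.push_back(reply);
    });

    for (int i = 0; i < pageCount; ++i) {
        manager.get("http://www.abc.com/" + std::to_string(i) + ".html");
    }

    CrawlResult result;
    result.beforeLoop = replies.size();  // requests are queued, none processed
    loop.run();                          // the "return to the event loop" step
    result.afterLoop = replies.size();   // every queued request has been handled
    return result;
}
```

    With this model, crawl(2000) returns beforeLoop == 0 and afterLoop == 2000. In a real Qt program the part played by loop.run() is QCoreApplication::exec() (or QApplication::exec()), which is why the method that calls get() has to return before any finished() signal can arrive.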

    Quote: Originally posted by momo
    That code just reads the byte array and calls a function to parse it, like this:

    So read(QNetworkReply*) is being called but "array" is empty?
    Have you checked the error() of "pReply"?

    Cheers,
    _
