Thread: Reading large unsorted file

  1. #1
    Re: Reading large unsorted file

    Quote Originally Posted by GSS View Post
    I'll give QtConcurrent a try.
    There is no point in doing that: QtConcurrent is for parallelizing multiple tasks, and you only have one task.

    Quote Originally Posted by Lesiok View Post
    As I see it, you don't reset map2, so after each new line map2 contains all lines from the beginning up to the current one.
    I think the first line in the while loop should be:
    Qt Code:
    map2.clear();
    No, every loop iteration is a single record.

    The loop needs to work with map3 and only with map3.

    Something like
    Qt Code:
    map3[list1[0]].insertMulti(list1[1], list1[2]);
    i.e. retrieve or create the inner map for key "list1[0]", then insert the current inner pair, allowing multiple values for key "list1[1]".
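
    A minimal sketch of that idea, written against the C++ standard library so it compiles without Qt. This is an assumption made for illustration: std::map stands in for the outer QMap, std::multimap stands in for an inner QMap filled via insertMulti(), and the helper name addRecord and the three-field record layout are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for the thread's map3: outer key -> (inner key -> values).
// std::multimap keeps duplicate inner keys, which is the role insertMulti() plays in Qt.
using InnerMap = std::multimap<std::string, std::string>;
using OuterMap = std::map<std::string, InnerMap>;

// Hypothetical helper: 'fields' corresponds to list1, i.e. one split input line.
void addRecord(OuterMap &map3, const std::vector<std::string> &fields)
{
    if (fields.size() < 3)
        return; // skip malformed lines instead of indexing out of range

    // operator[] returns the inner map for fields[0], creating an empty one if needed.
    map3[fields[0]].emplace(fields[1], fields[2]);
}
```

    With this shape there is no separate map2 at all: each line touches exactly one inner map, and nothing is copied per iteration.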

    Since the result is written back into files again, one could probably even avoid the encoding and decoding and just work with QByteArray instead of QString.
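
    To illustrate working on raw bytes, here is a hedged sketch in standard C++, where std::string is used purely as a byte container in place of QByteArray. The tab delimiter and the function name splitBytes are assumptions, since the thread does not show the file format. In Qt, QIODevice::readLine() already returns a QByteArray and QByteArray::split(char) does the splitting, so no hand-written loop would be needed there.

```cpp
#include <string>
#include <vector>

// Splits one raw line on a single-byte separator without decoding the text.
// Multi-byte sequences (e.g. UTF-8) pass through untouched.
std::vector<std::string> splitBytes(const std::string &line, char sep)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = line.find(sep, start);
        if (pos == std::string::npos) {
            fields.push_back(line.substr(start)); // last (or only) field
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}
```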

    Is the machine's physical RAM large enough to do that without swapping?
    Such a long running time sounds as if the machine started swapping.

    Cheers,
    _
    Last edited by anda_skoa; 18th July 2016 at 11:08.

  2. #2
    Re: Reading large unsorted file

    Quote Originally Posted by Lesiok View Post
    As I see it, you don't reset map2, so after each new line map2 contains all lines from the beginning up to the current one.
    I think the first line in the while loop should be:
    Qt Code:
    map2.clear();
    Well spotted. This reduced the running time to 40 minutes.

    Quote Originally Posted by anda_skoa View Post
    The loop needs to work with map3 and only with map3.

    Something like
    Qt Code:
    map3[list1[0]].insertMulti(list1[1], list1[2]);
    i.e. retrieve or create the inner map for key "list1[0]", then insert the current inner pair, allowing multiple values for key "list1[1]".

    Since the result is written back into files again, one could probably even avoid the encoding and decoding and just work with QByteArray instead of QString.

    Is the machine's physical RAM large enough to do that without swapping?
    Such a long running time sounds as if the machine started swapping.

    Cheers,
    _
    I'm still running with your suggestions so I cannot tell if it will take less than the 40 minutes.

    Regarding memory, it doesn't seem to be an issue. The file is almost 400 MB and the application uses less than 3 GB while running. Anyway, I'm not familiar with the concept of "swap memory".

    Using QByteArray seems to be interesting but how could I then sort the entries and put everything back to files?

    Thanks!

    EDIT: Maybe memory is an issue. I just got a "Segmentation fault" and only half of the lines were read. Perhaps this happened now and not with the previous attempts because I have other applications running now.

    EDIT2:
    Quote Originally Posted by anda_skoa View Post
    The loop needs to work with map3 and only with map3.

    Something like
    Qt Code:
    map3[list1[0]].insertMulti(list1[1], list1[2]);
    I ran it on only 1/4 of the lines and it took 10 minutes, which extrapolates to about 40 minutes, the same as with the correction suggested by Lesiok.
    Last edited by GSS; 18th July 2016 at 13:21.

  3. #3
    Re: Reading large unsorted file

    Quote Originally Posted by Lesiok View Post
    Sorry anda_skoa, you're wrong.
    Maybe, but I don't think so :-)

    Quote Originally Posted by Lesiok View Post
    map2 is declared outside the loop, so every iteration appends further elements.
    Yes.

    Quote Originally Posted by Lesiok View Post
    Adding a clear() significantly shortened the duration of the process.
    Yes, and it also makes map3 contain only the last map2 for each key, which will always be a single entry, since map2 is cleared for every line in the file.

    My understanding was that the program should keep all entries for a "map3" key, not just the last pair.
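
    The difference can be shown with a small standard-library sketch. The original loop is not quoted in this thread, so buildWithClear() is only an assumed reconstruction of what the loop does after adding clear(); the Record struct and both function names are hypothetical, and std::map / std::multimap stand in for the Qt containers.

```cpp
#include <map>
#include <string>
#include <vector>

struct Record {
    std::string outerKey; // list1[0]
    std::string innerKey; // list1[1]
    std::string value;    // list1[2]
};

using InnerMap = std::multimap<std::string, std::string>;
using OuterMap = std::map<std::string, InnerMap>;

// Assumed shape of the loop with clear(): map2 holds one pair per line,
// and assigning it into map3 overwrites whatever that key held before.
OuterMap buildWithClear(const std::vector<Record> &records)
{
    OuterMap map3;
    InnerMap map2; // declared outside the loop, as described in the thread
    for (const Record &r : records) {
        map2.clear();
        map2.emplace(r.innerKey, r.value);
        map3[r.outerKey] = map2; // replaces the previous inner map
    }
    return map3;
}

// Working on map3 directly keeps every pair for an outer key.
OuterMap buildNested(const std::vector<Record> &records)
{
    OuterMap map3;
    for (const Record &r : records)
        map3[r.outerKey].emplace(r.innerKey, r.value);
    return map3;
}
```

    For two input lines that share the same outer key, the first variant ends up with one inner entry for that key and the second with two.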

    Quote Originally Posted by GSS View Post
    Regarding memory, it doesn't seem to be an issue. The file is almost 400 MB and the application uses less than 3 GB while running. Anyway, I'm not familiar with the concept of "swap memory".
    Swap is a file or partition on disk to which the operating system "swaps" memory that is currently not used by the running application, e.g. memory used by another application which is currently not running.
    See https://en.wikipedia.org/wiki/Swap_memory

    Getting a system into a state where it starts swapping heavily will bring it "to its knees", since there is a lot of I/O overhead with a memory device that is slow compared to RAM.

    Quote Originally Posted by GSS View Post
    Using QByteArray seems to be interesting but how could I then sort the entries and put everything back to files?
    I am not sure what you mean. QFile is a QIODevice subclass, so it has read/write functions for QByteArray.

    Cheers,
    _

  5. #4
    Re: Reading large unsorted file

    Quote Originally Posted by anda_skoa View Post
    Yes, and it also makes map3 contain only the last map2 for each key, which will always be a single entry, since map2 is cleared for every line in the file.

    My understanding was that the program should keep all entries for a "map3" key, not just the last pair.
    You are absolutely right. It has to be as you say, or it will replace rather than append.

    Quote Originally Posted by anda_skoa View Post
    I am not sure what you mean. QFile is a QIODevice subclass, so it has read/write functions for QByteArray.
    Sorry, I thought you were referring to QFile::readAll() and working from there.

    I will explore the use of QByteArray to avoid the conversion to QString.

    Thank you very much!

  6. #5
    Re: Reading large unsorted file

    Quote Originally Posted by anda_skoa View Post
    No, every loop iteration is a single record.
    Sorry anda_skoa, you're wrong. map2 is declared outside the loop, so every iteration appends further elements. Adding a clear() significantly shortened the duration of the process.
