Thread: QFile::exists( filename ) slow

  1. #1
    Join Date
    Dec 2006
    Posts
    426
    Thanks
    8
    Thanked 18 Times in 17 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11

    Hello, I have about 3,000,000 small files in a directory tree that need to be transferred by FTP over the network.

    I use QNetworkAccessManager to handle the job. About 1,000,000 files had already been transferred to the other computer, so I tarred those files, untarred them on the target computer, and now I do:
    Qt Code:
    foreach ( const QString &filename, fileList ) {
        if ( !QFile::exists( filename ) ) {
            // start the FTP transfer of filename using QNetworkAccessManager
        }
    }

    However, QFile::exists( filename ) takes a lot of time (about 0.02 seconds) per file. If I kill the program and restart it, the calls for files that were already checked by QFile::exists( filename ) return quickly (about 0.001 seconds), which is 20 times faster.

    How can I solve this problem?

    Thanks.
    Last edited by lni; 23rd February 2015 at 16:31.

  2. #2
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    The difference comes from the fact that the second time your system reads from cache without directly accessing the disk. Performance may heavily depend on the filesystem.
    Your biological and technological distinctiveness will be added to our own. Resistance is futile.

    Please ask Qt related questions on the forum and not using private messages or visitor messages.


  3. #3

    Quote: Originally Posted by wysota
        The difference comes from the fact that the second time your system reads from cache without directly accessing the disk. Performance may heavily depend on the filesystem.
    But I am not reading the content of the file; I am merely checking whether the file exists.

    I am using CentOS 6.6. Is there such a thing as a file index in Linux?

    Thanks

    EDIT: I rebooted the machine. QFile::exists() is still quick for those files that were previously checked.
    Last edited by lni; 23rd February 2015 at 18:04.

  4. #4

    Quote: Originally Posted by lni
        But I am not reading the content of the file; I am merely checking whether the file exists.
    This still involves accessing and caching the inode.

        I am using CentOS 6.6. Is there such a thing as a file index in Linux?
    Depends on what software you have installed.

        EDIT: I rebooted the machine. QFile::exists() is still quick for those files that were previously checked.
    A soft reboot will likely not invalidate the on-disk cache.

  5. #5

    Quote: Originally Posted by wysota
        This still involves accessing and caching the inode.

        Depends what software you have installed.

        A soft reboot will likely not invalidate the on-disk cache.
    I am not familiar with the system or the kernel. Could you please tell me whether there is a way to improve the access time, or what software I could install to help?

    It appears that if I call QFile::exists() on all 3 million files, they will all be quick to access afterward. I don't think all 3 million files would be kept in the cache, would they?

    Many thanks.
    Last edited by lni; 23rd February 2015 at 21:00.

  6. #6

    Quote: Originally Posted by lni
        I am not familiar with the system or the kernel. Could you please tell me whether there is a way to improve the access time, or what software I could install to help?
    There is no general, instant acme-improve-my-access-times solution. Checking whether 1M files exist will take time if you do it again and again every time you run your program. What you can do is list all the files in the directory and check your files against that list, preferably while other files are being transferred over the network.

        It appears that if I call QFile::exists() on all 3 million files, they will all be quick to access afterward. I don't think all 3 million files would be kept in the cache, would they?
    You are not accessing the files but the directory they reside in. And sure, it will all fit into the cache easily.

  7. #7

    I gave up on the file-exists check and decided to save the data into a MySQL database.

    However, I have ended up with a 5 GB database that is still growing. How can I compress the data or otherwise decrease its size when saving to the database? Thanks!

    Here is my pseudo code:

    Qt Code:
    QByteArray compressed = qCompress( fileContent.toAscii(), 9 ); // fileContent is a QString
    QByteArray b64 = compressed.toBase64(); // Base64 text so the data can be embedded in the SQL string

    QString sql = QString( "INSERT INTO myTable ( data ) values ( '%1' );" ).arg( QString( b64 ) );

    query.exec( sql );
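
One size factor in the snippet above is the Base64 step: Base64 represents every 3 input bytes as 4 characters, so the stored text is about a third larger than the compressed bytes. A small sketch of that arithmetic follows; the helper name base64Length is hypothetical and is not a Qt function.

```cpp
#include <cstddef>

// Length of the padded Base64 text produced for rawBytes bytes of input:
// every group of 3 bytes (the last group padded) becomes 4 characters.
constexpr std::size_t base64Length(std::size_t rawBytes)
{
    return 4 * ((rawBytes + 2) / 3);
}
```

For example, 300 compressed bytes become 400 Base64 characters. Binding the compressed QByteArray to a BLOB column through QSqlQuery::prepare() and bindValue() would avoid that growth, assuming the table schema can be changed; the thread does not say whether it can.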

  8. #8

    We don't know what you are saving into the database, so it is hard to suggest anything. But what exactly was the reason for using QFile::exists() anyway?

  9. #9

    Quote: Originally Posted by wysota
        We don't know what you are saving into the database, so it is hard to suggest anything. But what exactly was the reason for using QFile::exists() anyway?
    I am trying to save text files into the database, so each one is essentially a QString.

    I want to download more than 3,000,000 small text files over the network. QFile::exists() was used to check whether a file already exists, so that I don't need to download it again. Previously those files were saved into a directory tree. Now I save them into a database, which is at 5 GB and counting. I am trying to find a way to compress the QString (or QByteArray) as much as possible.

  10. #10

    Why not simply have a registry of files you already downloaded and only download those that are not in the registry? You don't need to check whether each file exists or not. Just read what you already downloaded and proceed from there. As for the database, saving individual files into individual records of a database is basically a bad idea. If all files are text files, then it is best to compress them all together using some smart data structure tailored to compressing text. Much depends on what you want to do with the files once you download them.
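
A registry like the one described above could be as simple as a text file with one downloaded name per line. The sketch below is one possible shape, not a prescribed design: the function names loadRegistry and recordDownloaded and the one-name-per-line format are assumptions, and it uses the C++ standard library rather than Qt so that it compiles on its own.

```cpp
#include <fstream>
#include <string>
#include <unordered_set>

// Load the names recorded by earlier runs, one per line.
// A missing registry file simply yields an empty set.
std::unordered_set<std::string> loadRegistry(const std::string& registryPath)
{
    std::unordered_set<std::string> done;
    std::ifstream in(registryPath);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty())
            done.insert(line);
    }
    return done;
}

// Append one name after its download has finished, so that a restart
// can pick up where the previous run stopped.
bool recordDownloaded(const std::string& registryPath, const std::string& name)
{
    std::ofstream out(registryPath, std::ios::app);
    out << name << '\n';
    out.flush();
    return static_cast<bool>(out);
}
```

At startup the program would call loadRegistry() once, skip every name found in the returned set, and call recordDownloaded() as each transfer completes. Reopening the file for every name keeps the sketch short; a long-running downloader would more likely keep the stream open.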

    Edit: Better yet, why not simply use rsync instead of writing your own software?
    Last edited by wysota; 25th February 2015 at 23:32.