
Thread: QT SQL advice

  1. #1
    Join Date
    Dec 2010
    Posts
    55
    Thanks
    1
    Qt products
    Qt4
    Platforms
    Unix/X11

    QT SQL advice

    My software is doing the following
    1. read in several large txt data files
    2. perform statistics

    I do this by parsing each large txt file and inserting its contents into a separate SQLite db. On subsequent runs, I check whether the db file exists; if it does, I read the db file instead.

    I have a main in-memory db.
    After reading each txt file (and creating its db), I execute ATTACH to add each db to the main db. By doing all of this in a couple of transactions, I've been able to get this to go fairly quickly.
    I set the cache size very large on each db to try to improve performance.

    The statistics I compute are mainly various SUM queries over different selections, followed by some averages. This goes extremely slowly, and I'm trying to figure out how to speed it up.

    Questions
    1. Do I need to "close" each db at some point to ensure the cache is "flushed"?
    2. If I just created the DB, I would think most if not all of it should be in cache. When I do an "ATTACH" and access the DB via the main in-memory db, does it use that cache?
    3. When I run my program (and the DBs are already created), the entire db needs to be read from the HD, since my sums will touch essentially every element at least once.
    4. Is using a SQLITE database even the best approach? Should I just store everything in data structures instead?
    Running:
    RHEL 5.4
    Python 2.7.2
    Qt 4.7.4
    SIP 4.7.8
    PyQt 4.7
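The workflow described in this post can be sketched roughly as follows. This is a minimal illustration using Python's standard `sqlite3` module rather than the Qt SQL classes, and the table name `samples`, the helper names, the schema names `file0`, `file1`, ... and the sample rows are all assumptions made for the example, not the poster's actual code.

```python
import os
import sqlite3
import tempfile


def build_file_db(db_path, rows):
    """Create a per-file SQLite database once; later runs reuse the file."""
    if os.path.exists(db_path):
        return  # already parsed on an earlier run
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, so the bulk insert is one transaction
        conn.execute(
            'CREATE TABLE samples (id INTEGER, "range" INTEGER, '
            'group1 INTEGER, value REAL)'
        )
        conn.executemany("INSERT INTO samples VALUES (?, ?, ?, ?)", rows)
    conn.close()


def open_main_with_attached(db_paths):
    """Open an in-memory main database and attach each per-file database."""
    main = sqlite3.connect(":memory:")
    for i, path in enumerate(db_paths):
        # The file name can be bound as a parameter; the schema name cannot,
        # so it is generated here (hypothetical names file0, file1, ...).
        main.execute("ATTACH DATABASE ? AS file%d" % i, (path,))
    return main


# Hypothetical stand-in for the rows parsed from one txt file.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "file0.db")
build_file_db(path, [(1, 10, 1, 0.5), (1, 20, 1, 1.5), (2, 10, 2, 4.0)])

main = open_main_with_attached([path])
total = main.execute("SELECT SUM(value) FROM file0.samples").fetchone()[0]
print(total)
```

Queries against an attached database go through the connection that attached it, so each table is addressed as `schema.table`.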

  2. #2
    Join Date
    Mar 2009
    Location
    Brisbane, Australia
    Posts
    7,729
    Thanks
    13
    Thanked 1,610 Times in 1,537 Posts
    Qt products
    Qt4 Qt5
    Platforms
    Unix/X11 Windows
    Wiki edits
    17

    Re: QT SQL advice

    If the actual SQL is the speed problem (you can check by executing the same query outside your program), then good places to look are any WHERE clauses you have and the joins between tables. For large data sets, well-placed indexes are your friend. Without specifics it is hard to be more targeted.

    If this is actually a Qt issue, then you need to show what your code is doing that is so slow.

  3. #3
    Join Date
    Dec 2010
    Posts
    55
    Thanks
    1
    Qt products
    Qt4
    Platforms
    Unix/X11

    Re: QT SQL advice

    Actually, I am not currently doing any joins. I haven't created any indices yet; I forgot about that. Hopefully that will help. But I did have a few general questions:
    1. I set up my DB with a large cache size. At what point does everything get written to disk? Immediately once the transaction is completed?
    2. If I use one db connection to create my db (with a large cache), then attach it to a new db on a new connection, will the cache be inherited, or is the whole thing read from disk?
    3. I don't really understand what a transaction does. Does it just keep everything in cache and wait to write to disk until you commit? I understand how this would help with multiple INSERTs, but is it useful with multiple SELECT queries?

  4. #4
    Join Date
    Mar 2009
    Location
    Brisbane, Australia
    Posts
    7,729
    Thanks
    13
    Thanked 1,610 Times in 1,537 Posts
    Qt products
    Qt4 Qt5
    Platforms
    Unix/X11 Windows
    Wiki edits
    17

    Re: QT SQL advice

    You do not need explicit transactions around SELECT queries, only around queries that modify data. SQLite (like any similar RDBMS) uses transaction logging to keep its data files consistent in the face of multiple users and possible abnormal termination. When a transaction is committed, the pending changes are written permanently to the main data file on disk before control is returned. SQLite uses temporary files to track transactions in progress.

    I do not know whether there is one cache or many, but you should not need to know. The cache exists to minimise the need to read recently used data from disk by keeping the most recently used blocks in memory. SQLite manages the cache content internally. If you have huge tables and run operations that read them completely (e.g. SELECT max(blah) FROM foo; with no indexes), then no cache smaller than the table will help much.
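The commit behaviour described above can be observed with a small experiment. This is only a sketch, assuming Python's standard `sqlite3` module, SQLite's default journal mode, and a hypothetical two-column `samples` table: a second connection does not see rows inserted by the first until the first commits.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE samples (id INTEGER, value REAL)")
writer.commit()

reader = sqlite3.connect(db_path)


def count_rows(conn):
    # fetchall() runs the statement to completion, so no read lock lingers.
    return conn.execute("SELECT COUNT(*) FROM samples").fetchall()[0][0]


# The first INSERT implicitly opens a transaction on the writer connection.
writer.executemany(
    "INSERT INTO samples VALUES (?, ?)", [(1, 0.5), (2, 1.5), (3, 2.5)]
)
seen_before_commit = count_rows(reader)  # pending rows are not visible yet

writer.commit()  # the pending changes are written to the database file
seen_after_commit = count_rows(reader)

print(seen_before_commit, seen_after_commit)

writer.close()
reader.close()
```

Grouping many INSERTs into one transaction pays the cost of that final write once instead of once per row, which is why it helps bulk loading.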

  5. #5
    Join Date
    Dec 2010
    Posts
    55
    Thanks
    1
    Qt products
    Qt4
    Platforms
    Unix/X11

    Re: QT SQL advice

    Well, I have lots of large txt files that I need to parse.
    I'm thinking that if I parse it once and save it to a database file, then it will be faster to read the database file in future runs of the program instead of reparsing the txt file. I can also delete the large txt files.

    So I'm inserting data into database files (with a large cache).
    Then attaching them to a memory database to do processing.

    So on that first run, when the data should all be in cache and I attach the database and read it via the memory database, does that read come from memory or from disk?

    My table structure is:
    id INTEGER
    range INTEGER
    group1 INTEGER
    value REAL

    I create an index on (id, degrees, group).

    When I read, I first do a:
    SELECT DISTINCT id

    then, for each id:
    SELECT SUM(value) FROM database WHERE group=x AND id=id AND (range > y AND range < z)

    The table has about 400000 rows.
    There are about 200 distinct ids and four groups, so 800 SELECT SUM queries, whose results I then average manually.
    Each query sums about 500 numbers, although this can vary between 0 and 2000.
    Every SELECT query reads different data.
    Each SELECT query takes about 40ms,
    so the total is about 4 seconds.

  6. #6
    Join Date
    Mar 2008
    Location
    Kraków, Poland
    Posts
    1,536
    Thanked 284 Times in 279 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Re: QT SQL advice

    For these two queries you should have two indexes: one on (id) and one on (group, id).

  7. #7
    Join Date
    Dec 2010
    Posts
    55
    Thanks
    1
    Qt products
    Qt4
    Platforms
    Unix/X11

    Re: QT SQL advice

    I just remembered the "Group By" sql argument.
    so I think I can just do:
    SELECT SUM(value) FROM database WHERE group=x AND (range > y AND range < z) GROUP BY id

  8. #8
    Join Date
    Mar 2008
    Location
    Kraków, Poland
    Posts
    1,536
    Thanked 284 Times in 279 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Re: QT SQL advice

    Originally posted by enricong:
    I just remembered the "Group By" sql argument.
    so I think I can just do:
    SELECT SUM(value) FROM database WHERE group=x AND (range > y AND range < z) GROUP BY id

    Of course, but I think it should look like this:
    SELECT id, SUM(value) FROM database WHERE group=x AND (range > y AND range < z) GROUP BY id
    Without the id column you will not know which id each sum belongs to.
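To illustrate why the single GROUP BY query can replace the per-id loop, here is a small sketch using Python's standard `sqlite3` module. The table name `samples`, the sample rows and the bound values are made up for the example; the column names follow the schema given earlier in the thread, with `range` quoted because it is also an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE samples (id INTEGER, "range" INTEGER, '
    'group1 INTEGER, value REAL)'
)
rows = [
    (1, 5, 1, 1.0), (1, 6, 1, 2.0), (1, 50, 1, 100.0),
    (2, 5, 1, 4.0), (2, 7, 2, 8.0), (3, 6, 1, 16.0),
]
conn.executemany("INSERT INTO samples VALUES (?, ?, ?, ?)", rows)

x, y, z = 1, 0, 10  # group, then exclusive lower and upper range bounds

# Original approach: one SUM query per distinct id.
per_id = {}
for (row_id,) in conn.execute("SELECT DISTINCT id FROM samples").fetchall():
    total = conn.execute(
        'SELECT SUM(value) FROM samples '
        'WHERE group1 = ? AND id = ? AND "range" > ? AND "range" < ?',
        (x, row_id, y, z),
    ).fetchone()[0]
    if total is not None:  # SUM over zero matching rows yields NULL
        per_id[row_id] = total

# Single query: SQLite groups the sums by id in one pass.
grouped = dict(conn.execute(
    'SELECT id, SUM(value) FROM samples '
    'WHERE group1 = ? AND "range" > ? AND "range" < ? GROUP BY id',
    (x, y, z),
).fetchall())

print(per_id == grouped)
```

One difference to keep in mind: the loop returns NULL for an id with no matching rows, while the GROUP BY query simply omits that id from its result.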

  9. #9
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    Re: QT SQL advice

    Could you please take your queries, launch the sqlite console against your database, and execute each query prefixed with "EXPLAIN QUERY PLAN"? E.g. change "SELECT SUM(value) FROM database WHERE group=x AND (range > y AND range < z) GROUP BY id" to "EXPLAIN QUERY PLAN SELECT SUM(value) FROM database WHERE group=x AND (range > y AND range < z) GROUP BY id". Paste the results here, please.
    Your biological and technological distinctiveness will be added to our own. Resistance is futile.

    Please ask Qt related questions on the forum and not using private messages or visitor messages.
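The same check can also be run from code instead of the console. A minimal sketch, assuming Python's standard `sqlite3` module and a hypothetical `samples` table shaped like the one described in the thread; the helper name `query_plan` is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE samples (id INTEGER, "range" INTEGER, '
    'group1 INTEGER, value REAL)'
)

query = (
    'SELECT id, SUM(value) FROM samples '
    'WHERE group1 = ? AND "range" > ? AND "range" < ? GROUP BY id'
)


def query_plan(connection, sql, params=()):
    """Return the human-readable detail text of each plan step."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the description; the meaning of the
    # other columns differs between SQLite versions.
    return [row[-1] for row in rows]


plan = query_plan(conn, query, (1, 0, 10))
for step in plan:
    print(step)
```

The exact wording of each step varies between SQLite versions, so it is safer to look for keywords such as SCAN or SEARCH than to compare whole lines.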


  10. #10
    Join Date
    Dec 2010
    Posts
    55
    Thanks
    1
    Qt products
    Qt4
    Platforms
    Unix/X11

    Re: QT SQL advice

    Yes, I forgot the id when I typed up that message.

    I get the following with explain:
    SCAN TABLE database
    USE TEMP B-TREE FOR GROUP BY


    This now runs in about 80ms, so about 400X faster.
    Typically, I need to do this about 32 times. Overall, the user waits about 10-15 seconds, so it's slowing down elsewhere.
    Ideally, I'd like an instant response, but I think it's acceptable right now.
    Last edited by enricong; 5th November 2014 at 02:07.

  11. #11
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    Re: QT SQL advice

    If you get a table scan, then you are missing an index on a field used in the WHERE clause (most likely the 'group' column in your case).
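A quick way to confirm that an index removes the table scan is to compare the plan before and after creating it. The sketch below assumes Python's standard `sqlite3` module, a hypothetical `samples` table and a hypothetical index name; the (group1, id) column order follows the suggestion in post #6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE samples (id INTEGER, "range" INTEGER, '
    'group1 INTEGER, value REAL)'
)

query = (
    'EXPLAIN QUERY PLAN SELECT id, SUM(value) FROM samples '
    'WHERE group1 = 1 AND "range" > 0 AND "range" < 10 GROUP BY id'
)


def plan_text(connection):
    """Join the detail column of every plan step into one string."""
    return " | ".join(row[-1] for row in connection.execute(query).fetchall())


plan_without_index = plan_text(conn)

# Hypothetical index name; it covers the WHERE column first, then the
# GROUP BY column.
conn.execute("CREATE INDEX idx_samples_group1_id ON samples (group1, id)")
plan_with_index = plan_text(conn)

print(plan_without_index)
print(plan_with_index)
```

If the second plan still reports a scan of the table, the index was probably not created on the database the query actually runs against.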


  12. #12
    Join Date
    Dec 2010
    Posts
    55
    Thanks
    1
    Qt products
    Qt4
    Platforms
    Unix/X11

    Re: QT SQL advice

    I had a bug in the code that creates the index. Now it's about 50% faster: down to about 30-40ms from 80ms.
