
Thread: Fast serialization/deserialization

  1. #1
    Join Date
    Jan 2006
    Location
    travelling
    Posts
    1,116
    Thanks
    8
    Thanked 127 Times in 121 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Fast serialization/deserialization

    For some reason, my application must create a big hierarchical model on launch and keep it until it exits. The problem I'm facing is that this model is generated from a set of files which have to be parsed, so loading takes longer than 10 seconds, which is not really acceptable... So I thought about serializing the data into a formatted plain-text file and reading that on subsequent launches instead of parsing the files again. This improves performance, but not enough IMO (gain: 2-3 seconds). My current approach is to load the full file into memory and parse it line by line (each line "generating" a tree node). I thought about two ways to improve performance but I don't know how to apply them:
    1. Change the loading/parsing approach (is sequential reading faster than full reading?)
    2. Allocate memory for all nodes up front to reduce the time wasted in allocations/reallocations inside the deserialization loop, the main problem being that these nodes are not all of the same type...
    Any hints on how to do this? BTW any other idea that would lead to a significant performance improvement is welcome...
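For idea 2, here is a minimal sketch of one possible approach, not something taken from the application discussed here: a "bump" arena that grabs one large block up front and constructs nodes of different types inside it. `NodeArena`, `FolderNode` and `LeafNode` are hypothetical names invented for the illustration, and the arena deliberately never runs destructors, which assumes nodes that live until the program exits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical bump arena: one malloc up front, then every node is carved
// out of that block by advancing an offset. Nothing here comes from Qt.
class NodeArena {
public:
    explicit NodeArena(std::size_t bytes)
        : base_(static_cast<unsigned char*>(std::malloc(bytes))), capacity_(bytes) {
        assert(bytes > 0);
        if (!base_) throw std::bad_alloc();
    }
    ~NodeArena() { std::free(base_); }  // releases every node at once
    NodeArena(const NodeArena&) = delete;
    NodeArena& operator=(const NodeArena&) = delete;

    // Construct a T inside the arena. The arena does NOT run destructors,
    // so this suits nodes that are trivially destructible or live until exit.
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        static_assert(alignof(T) <= alignof(std::max_align_t), "over-aligned type");
        // Round the current offset up to T's alignment (malloc's block is
        // already aligned for any ordinary type).
        std::size_t start = (used_ + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start > capacity_ || sizeof(T) > capacity_ - start) throw std::bad_alloc();
        used_ = start + sizeof(T);
        return new (base_ + start) T(std::forward<Args>(args)...);
    }

    std::size_t used() const { return used_; }

private:
    unsigned char* base_;
    std::size_t capacity_;
    std::size_t used_ = 0;
};

// Two unrelated node types sharing the same arena.
struct FolderNode {
    FolderNode* parent;
    int childCount = 0;
    explicit FolderNode(FolderNode* p) : parent(p) {}
};

struct LeafNode {
    FolderNode* parent;
    double value;
    LeafNode(FolderNode* p, double v) : parent(p), value(v) {}
};
```

Usage would look like `NodeArena arena(64 * 1024 * 1024); FolderNode* root = arena.create<FolderNode>(nullptr);`. Because the offset is re-aligned per type, mixing node types costs at most a few padding bytes per node.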
    Current Qt projects : QCodeEdit, RotiDeCode

  2. #2
    Join Date
    Mar 2006
    Posts
    7
    Qt products
    Qt3 Qt4
    Platforms
    Unix/X11

    Default Re: Fast serialization/deserialization

    Well, serialization and persistence are not trivial. Take a look at:

    http://www.s11n.net/

  3. #3
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    Default Re: Fast serialization/deserialization

    What about using QDataStream?

  4. #4
    Join Date
    Jan 2006
    Location
    travelling
    Posts
    1,116
    Thanks
    8
    Thanked 127 Times in 121 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by wysota View Post
    What about using QDataStream?
    I once tried it and also examined its output (Qt Assistant db's if anyone cares...) and two things struck me:
    • The output seems to be twice as big (reading them in a plain text editor you can see a space between every character... maybe because strings are saved as Unicode instead of local 8 bit...)
    • Is it really faster than QTextStream ???
    Besides I don't ATM use QTextStream for deserialization but only for serialization (which is quite fast). As far as I understood, most of the CPU time is wasted doing memory allocation for nodes and temporary parsing variables (mostly QByteArray and QList<QByteArray>). Thus my question is : how can I reserve an amount of memory to speed up loading? (the tree typically takes around 60Mb...)

    Well, serialization and persistence are not trivial. Take a look at:
    http://www.s11n.net/
    Looks good but I don't feel like adding dependencies... Moreover my problem is not really in I/O but rather in memory management and speed enhancements around it... Ease of use is not my primary focus here. See, I'm ready to use a dirty hack if it does not break portability and shrinks loading under 3-4 seconds...
    Current Qt projects : QCodeEdit, RotiDeCode

  5. #5
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by fullmetalcoder View Post
    The output seems to be twice as big (reading them in a plain text editor you can see a space between every character... maybe because strings are saved as Unicode instead of local 8 bit...)
    Then compress all strings and store them as byte arrays.

    Is it really faster than QTextStream ???
    No, but it should occupy less space and is platform independent.

    Thus my question is : how can I reserve an amount of memory to speed up loading? (the tree typically takes around 60Mb...)
    You can use something which is called "placement new" - first you reserve a pool in memory and then when you call new, the memory doesn't have to be allocated so the constructor is called immediately. And when you free (delete) an object, its memory goes back to the pool.
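To make the idea concrete, here is a minimal sketch, assuming a fixed number of nodes of a single hypothetical type `TreeNode`. Note that placement new by itself only constructs an object at an address you hand it; reserving the storage and deciding which slot is free is bookkeeping the program has to do itself. All names below are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>  // declares placement new

struct TreeNode {
    int id;
    TreeNode* parent;
    TreeNode(int i, TreeNode* p) : id(i), parent(p) {}
};

// Storage reserved once, up front: room for kMaxNodes objects, correctly aligned.
constexpr std::size_t kMaxNodes = 4;
alignas(TreeNode) unsigned char g_storage[kMaxNodes * sizeof(TreeNode)];

// Construct a node in slot number `slot` of the reserved storage.
TreeNode* construct_in_slot(std::size_t slot, int id, TreeNode* parent) {
    assert(slot < kMaxNodes);
    void* where = g_storage + slot * sizeof(TreeNode);
    return new (where) TreeNode(id, parent);  // no allocation, constructor only
}

// Objects created with placement new must not be passed to delete: the
// destructor is called explicitly and the slot simply becomes reusable.
void destroy_in_slot(TreeNode* node) {
    node->~TreeNode();
}
```

A real pool would add a free list on top of this so that slots can be handed out and returned in any order.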

  6. #6
    Join Date
    Jan 2006
    Location
    travelling
    Posts
    1,116
    Thanks
    8
    Thanked 127 Times in 121 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by wysota View Post
    You can use something which is called "placement new" - first you reserve a pool in memory and then when you call new, the memory doesn't have to be allocated so the constructor is called immediately. And when you free (delete) an object, its memory goes back to the pool.
    Sounds great but how do I do that???
    Current Qt projects : QCodeEdit, RotiDeCode

  7. #7
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

  8. #8
    Join Date
    Feb 2006
    Location
    Romania
    Posts
    2,744
    Thanks
    8
    Thanked 541 Times in 521 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Re: Fast serialization/deserialization

    This is pretty clear and proved helpful to me ( see the Placement Syntax section ).
    http://publib.boulder.ibm.com/infoce...c05cplr199.htm

    regards

  9. #9
    Join Date
    Jan 2006
    Location
    travelling
    Posts
    1,116
    Thanks
    8
    Thanked 127 Times in 121 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Re: Fast serialization/deserialization

    Thanks for your links, they helped me understand this topic a little better. Yet, my problem isn't solved...

    1. I really can't afford placement new syntax
    2. I want my pool to be used by the whole app, including possible plugins...
    I have crafted a decent memory pool (maybe the Trolls could consider adding one to Qt to save us such efforts and potential leaks...) and did some tests using overloaded global new/delete operators. It seems to work quite well, but how can I make sure that they will be used by plugins? And is there a way to make them replace those used by external libraries such as Qt (I do quite a lot of string manipulation, which is far too allocation-heavy...)?
    Current Qt projects : QCodeEdit, RotiDeCode

  10. #10
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    Default Re: Fast serialization/deserialization

    I don't think it would be very smart to do that (I mean to replace all new calls with placement new). Maybe you could explain what you're trying to achieve and we'll find a solution together?

  11. #11
    Join Date
    Jan 2006
    Location
    travelling
    Posts
    1,116
    Thanks
    8
    Thanked 127 Times in 121 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by wysota View Post
    I don't think it would be very smart to do that (I mean to replace all new calls with placement new). Maybe you could explain what you're trying to achieve and we'll find a solution together?
    I don't want to use placement new at all... What I want is to override new/delete calls at application level so that everything goes to a pre-allocated memory pool. If I manage to do this I'll have a significant speed up (pool-managed allocation is 5-10 times faster than "on the fly" allocation). My question was more about the possibility of effectively overriding global memory allocators and a few tests showed that I can achieve what I want (I'm running under Linux). Thus, the next question is on the portability of this method...
    Current Qt projects : QCodeEdit, RotiDeCode

  12. #12
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by fullmetalcoder View Post
    I don't want to use placement new at all... What I want is to override new/delete calls at application level so that everything goes to a pre-allocated memory pool.
    Hmm... isn't it what placement new does?

    If I manage to do this I'll have a significant speed up (pool-managed allocation is 5-10 times faster than "on the fly" allocation). My question was more about the possibility of effectively overriding global memory allocators and a few tests showed that I can achieve what I want (I'm running under Linux). Thus, the next question is on the portability of this method...
    Yes, you can achieve that, but the question is: is it worth the effort? You could use the stack as the "allocator" instead of the heap, as it is much faster (there is no real allocation, the stack is already allocated for the process). Also, trying to optimize the algorithm itself might prove simpler to achieve.

  13. #13
    Join Date
    Apr 2006
    Location
    San Francisco, CA
    Posts
    186
    Thanks
    55
    Thanked 12 Times in 11 Posts
    Qt products
    Qt4
    Platforms
    MacOS X Windows

    Default Re: Fast serialization/deserialization

    Sounds like you just want to override global new?

    Qt Code:
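    #include <cstddef>  // std::size_t
    #include <new>      // std::bad_alloc

    void* my_pool_alloc(std::size_t size);  // the pool's allocation function, defined elsewhere

    void* operator new(std::size_t size)
    {
        void* p = my_pool_alloc(size);
        if (p == 0)                  // did the pool allocation succeed?
            throw std::bad_alloc();  // ANSI/ISO compliant behavior
        return p;
    }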

    I'm not sure plugins would go to this new though, I imagine operator overrides are resolved at compile time?
    I definitely agree that thousands of small new calls are very bad - has to bug the OS every time for memory, that's awful/adds up.
    But of course, you'll probably run into all sorts of trouble if you ever need to dynamically increase your pool size. I personally would prefer more restricted overrides of new (class-limited) rather than global. Then again, your current needs are probably different.
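As a rough illustration of the class-limited alternative mentioned above, here is a sketch in which only one hypothetical class, `Node`, gets its own `operator new`/`operator delete`, backed by a simple free list. The chunk size of 256 is arbitrary, the code is not thread-safe, and chunks are never handed back to the system; those are all simplifications for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Only `new Node` / `delete node` are affected; every other type keeps
// using the default global operators.
class Node {
public:
    explicit Node(int v) : value(v) {}
    int value;
    Node* firstChild = nullptr;

    static void* operator new(std::size_t size) {
        if (size != sizeof(Node)) return ::operator new(size);  // e.g. a derived class
        if (!free_list_) refill();
        FreeSlot* slot = free_list_;
        free_list_ = slot->next;
        return slot;
    }

    static void operator delete(void* p, std::size_t size) noexcept {
        if (!p) return;
        if (size != sizeof(Node)) { ::operator delete(p); return; }
        // Reuse the dead node's storage as a free-list link.
        free_list_ = new (p) FreeSlot{free_list_};
    }

private:
    struct FreeSlot { FreeSlot* next; };
    static void refill();
    static FreeSlot* free_list_;
};

Node::FreeSlot* Node::free_list_ = nullptr;

// Grab one chunk from the global allocator and thread it onto the free list.
void Node::refill() {
    constexpr std::size_t kChunk = 256;  // nodes per chunk, arbitrary
    static_assert(sizeof(Node) >= sizeof(FreeSlot), "slot too small for a link");
    static_assert(alignof(Node) >= alignof(FreeSlot), "slot misaligned for a link");
    unsigned char* chunk = static_cast<unsigned char*>(::operator new(kChunk * sizeof(Node)));
    for (std::size_t i = 0; i < kChunk; ++i)
        free_list_ = new (chunk + i * sizeof(Node)) FreeSlot{free_list_};
}
```

Since the operators are members of `Node`, code compiled elsewhere (Qt, plugins) is untouched, which sidesteps the question of how global overrides are resolved across libraries.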
    Contest deadline? :P
    Software Engineer



  14. #14
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    Default Re: Fast serialization/deserialization

    I have a question... What happens if you want to allocate your pool using "new"? What happens when you delete an object? I don't think overriding operator new is enough. You'd have to override delete as well and I don't think you can do that in a reliable way.

  15. #15
    Join Date
    Jan 2006
    Location
    travelling
    Posts
    1,116
    Thanks
    8
    Thanked 127 Times in 121 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by wysota View Post
    I have a question... What happens if you want to allocate your pool using "new"? What happens when you delete an object? I don't think overriding operator new is enough. You'd have to override delete as well and I don't think you can do that in a reliable way.
    1. As long as the global pool is not set, the global overridden new/delete operators call malloc/free
    2. I've of course overridden the global delete operator as well and it does work
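The actual pool was not posted, so the following is only a guess at the general shape of such a scheme. `demo::BumpPool` and `demo::g_pool` are hypothetical names, the pool never reuses freed blocks, and nothing here is thread-safe. The point it illustrates is that `operator delete` has to be able to tell pool pointers from malloc pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

namespace demo {

// Hypothetical pool over a caller-supplied buffer. `base` must be aligned
// for std::max_align_t. Freed blocks are not reused in this sketch.
struct BumpPool {
    unsigned char* base = nullptr;
    std::size_t capacity = 0;
    std::size_t used = 0;

    bool owns(const void* p) const {
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
        std::uintptr_t lo = reinterpret_cast<std::uintptr_t>(base);
        return base != nullptr && addr >= lo && addr < lo + capacity;
    }

    void* allocate(std::size_t size) {
        assert(base != nullptr);
        const std::size_t align = alignof(std::max_align_t);
        std::size_t start = (used + align - 1) / align * align;
        if (start > capacity || size > capacity - start) return nullptr;  // pool exhausted
        used = start + size;
        return base + start;
    }
};

BumpPool* g_pool = nullptr;  // "not set" until the application installs a pool

}  // namespace demo

// Replacement global operators: use the pool when one is installed,
// otherwise (or when the pool is full) fall back to malloc/free.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;
    void* p = demo::g_pool ? demo::g_pool->allocate(size) : nullptr;
    if (!p) p = std::malloc(size);
    if (!p) throw std::bad_alloc();
    return p;
}

void operator delete(void* p) noexcept {
    if (!p) return;
    // Pool memory is released in bulk, so there is nothing to do for it here.
    // Caveat: the pool must stay installed while any of its blocks are live.
    if (demo::g_pool && demo::g_pool->owns(p)) return;
    std::free(p);
}

void operator delete(void* p, std::size_t) noexcept { operator delete(p); }
```

With these definitions, memory obtained before the pool is installed still goes back to `free`, and pool blocks are never passed to `free` by mistake.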
    Current Qt projects : QCodeEdit, RotiDeCode

  16. #16
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by fullmetalcoder View Post
    I've of course overridden the global delete operator as well and it does work
    Are you sure this is safe? Did you override delete[] and new[] as well? There are many things that seem to work but fail under special conditions.

  17. #17
    Join Date
    Jan 2006
    Location
    travelling
    Posts
    1,116
    Thanks
    8
    Thanked 127 Times in 121 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by wysota View Post
    Are you sure this is safe? Did you override delete[] and new[] as well? There are many things that seem to work but fail under special conditions.
    I'm trying something and so far it seems quite safe. However, I know there might be issues, especially with some platforms/compilers where the operators behave differently, but I think it is worth trying... We'll see.

    And no, I did not override new[] and delete[] because :
    1. I don't use them anywhere
    2. the default implementation calls operator new(size_t), passing it n * sizeof(Type_X), so it does not really matter
    Did I do something wrong here?
    Current Qt projects : QCodeEdit, RotiDeCode

  18. #18
    Join Date
    Jan 2006
    Location
    Warsaw, Poland
    Posts
    33,359
    Thanks
    3
    Thanked 5,015 Times in 4,792 Posts
    Qt products
    Qt3 Qt4 Qt5 Qt/Embedded
    Platforms
    Unix/X11 Windows Android Maemo/MeeGo
    Wiki edits
    10

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by fullmetalcoder View Post
    I don't use them anywhere
    But you want to force alien code to use your operators as well, so it might be important to override it.

    Did I do something wrong here?
    I think you're fine, but it might be worth reimplementing [] operators as well. You never know what different compilers will do. For example allocating using new [] and deleting using delete (without []) crashes Windows applications but not Linux ones, so I suspect there might be some differences here.
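For reference, a sketch of what replacing all four global forms could look like. `demo::pool_alloc` and `demo::pool_free` are hypothetical stand-ins for the pool's entry points (backed by malloc here so the example is self-contained), and the counters exist only to show that both forms end up in the same place. One point worth stating regardless of platform: the C++ standard makes pairing `new[]` with plain `delete` undefined behavior, so whether it happens to crash is not something to rely on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

namespace demo {

// Hypothetical stand-ins for the pool; the counters are not thread-safe.
std::size_t g_allocs = 0;
std::size_t g_frees = 0;

void* pool_alloc(std::size_t size) {
    ++g_allocs;
    return std::malloc(size ? size : 1);
}

void pool_free(void* p) {
    if (!p) return;
    ++g_frees;
    std::free(p);
}

}  // namespace demo

// Scalar forms.
void* operator new(std::size_t size) {
    void* p = demo::pool_alloc(size);
    if (!p) throw std::bad_alloc();
    return p;
}
void operator delete(void* p) noexcept { demo::pool_free(p); }

// Array forms, spelled out explicitly instead of relying on the default
// versions forwarding to the scalar ones.
void* operator new[](std::size_t size) {
    void* p = demo::pool_alloc(size);
    if (!p) throw std::bad_alloc();
    return p;
}
void operator delete[](void* p) noexcept { demo::pool_free(p); }
```

The size passed to `operator new[]` may be larger than n * sizeof(T), because the compiler is allowed to add bookkeeping space for the element count.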

  19. #19
    Join Date
    Jan 2006
    Location
    travelling
    Posts
    1,116
    Thanks
    8
    Thanked 127 Times in 121 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Re: Fast serialization/deserialization

    Quote Originally Posted by wysota View Post
    But you want to force alien code to use your operators as well, so it might be important to override it.


    I think you're fine, but it might be worth reimplementing [] operators as well. You never know what different compilers will do. For example allocating using new [] and deleting using delete (without []) crashes Windows applications but not Linux ones, so I suspect there might be some differences here.
    Good point.

    Everything looked fine until now but unfortunately I'm facing kinda big trouble : my pool is not thread safe and any time threading appears (even in QLibrary for example) I get a segfault in QMutex...

    As the docs say, I've tried using a QMutex and a QMutexLocker in the alloc() and dealloc() functions, but I end up with this:
    QMutex::lock: Deadlock detected in thread -1208572208
    and the app hangs forever (without consuming CPU however)...

    Any hints? Or will I be forced to live with poor performance?
    Current Qt projects : QCodeEdit, RotiDeCode

  20. #20
    Join Date
    Apr 2006
    Location
    San Francisco, CA
    Posts
    186
    Thanks
    55
    Thanked 12 Times in 11 Posts
    Qt products
    Qt4
    Platforms
    MacOS X Windows

    Default Re: Fast serialization/deserialization

    Can you dump out which thread/function holds which mutex?
    I would first make sure that the alloc/dealloc functions work fine from multiple threads in simple test cases, before tackling the entire application, in case there are problems with them. Multithreading problems are some of the hardest things to debug.
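A stand-alone test of that kind might look like the sketch below. It uses `std::mutex` and `std::thread` from the C++ standard library instead of QMutex so that it compiles without Qt, and `LockedPool` is a hypothetical fixed-size block pool, not the pool from this thread. One design choice is deliberate: nothing executed while the lock is held allocates memory (the free list's capacity is reserved up front). If the pool sits behind the global `operator new`, an allocation made while the pool's own lock is held would re-enter the pool and try to take the same lock again; whether that is what triggers the QMutex deadlock message reported above is only a guess.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical fixed-size block pool guarded by a single mutex.
class LockedPool {
public:
    LockedPool(std::size_t block_size, std::size_t block_count)
        : storage_(block_size * block_count) {
        free_.reserve(block_count);  // so dealloc() never reallocates
        for (std::size_t i = 0; i < block_count; ++i)
            free_.push_back(storage_.data() + i * block_size);
    }

    void* alloc() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty()) return nullptr;
        void* p = free_.back();
        free_.pop_back();
        return p;
    }

    void dealloc(void* p) {
        std::lock_guard<std::mutex> lock(mutex_);
        free_.push_back(static_cast<unsigned char*>(p));  // within reserved capacity
    }

    std::size_t available() {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    std::mutex mutex_;
    std::vector<unsigned char> storage_;
    std::vector<unsigned char*> free_;
};

// Each thread performs `rounds` alloc/dealloc pairs. With one block per
// thread, alloc() must never fail, and every block must be back afterwards.
bool stress_test(std::size_t threads, std::size_t rounds) {
    LockedPool pool(64, threads);
    std::atomic<bool> failed(false);
    std::vector<std::thread> workers;
    workers.reserve(threads);
    for (std::size_t t = 0; t < threads; ++t) {
        workers.emplace_back([&pool, &failed, rounds] {
            for (std::size_t i = 0; i < rounds; ++i) {
                void* p = pool.alloc();
                if (!p) { failed = true; return; }
                pool.dealloc(p);
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return !failed.load() && pool.available() == threads;
}
```

Calling `stress_test(4, 10000)` and checking that it returns true is a cheap first check before wiring the pool into the whole application.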

    Then again, if internal Qt things like QLibrary are actually now using your memory allocator, I would not expect everything to work perfectly...
    Software Engineer


