Quote Originally Posted by wysota
I somehow doubt he'll be able to map 67 gigabytes to memory. I'd say that reading 9 million floats (which gives at least 36 megabytes of data) just has to take time, especially if one reads them one value at a time. A trivial optimization is to read all the values at once and then iterate over them.

The ordering of data in the file is poor as well: skipping 2 kB of data before each read (why 2 and not 8? 2000 4-byte values is 8 kB) significantly limits the disk cache hit ratio; even with a 32 MB cache you will get lots of cache misses. Nothing will change that, not even mapping the file to memory (unless you have enough physical RAM to map the whole file at once). If you can't change the file structure, then I suggest you invest in a faster disk (SSD?) or more RAM (96 GB should do).
You are absolutely right. It should be "pos += 8000".
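
For reference, a minimal sketch of the corrected loop, assuming one 4-byte float of interest per record followed by a block of values to skip; the QFile-based helper, its name, and the exact record layout are my assumptions, not code from this thread:

```cpp
#include <QFile>
#include <QString>
#include <QVector>

// Minimal sketch: extract one float per record from the large file.
// The 8000-byte stride is taken from the "pos += 8000" fix above;
// adjust it if the 4-byte value of interest is not part of that
// 8000-byte block (assumed layout, not confirmed in the thread).
QVector<float> readEveryValue(const QString &fileName, int count)
{
    QVector<float> values;
    QFile file(fileName);
    if (!file.open(QIODevice::ReadOnly))
        return values;

    values.reserve(count);
    qint64 pos = 0;
    for (int i = 0; i < count; ++i) {
        if (!file.seek(pos))
            break;
        float v;
        if (file.read(reinterpret_cast<char *>(&v), sizeof(v)) != sizeof(v))
            break;
        values.append(v);
        pos += 8000;  // 2000 floats * 4 bytes = 8000 bytes, not 2000
    }
    return values;
}
```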

I am restructuring the data so that it can be read block by block to maximize disk cache hits. What is the best block size to use? Is there a way to query this parameter from within the program?
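
If it helps, here is a sketch of one way to do both: QStorageInfo::blockSize() (Qt 5.4+) reports the filesystem block size, and reading sequentially in large multiples of it keeps the disk busy. The helper names, the 4096-byte fallback, and the chunk multiplier are my choices, not anything prescribed by Qt:

```cpp
#include <QByteArray>
#include <QFile>
#include <QStorageInfo>

// Ask the OS for the block size of the volume holding the file.
// QStorageInfo::blockSize() returns -1 when the size is unknown,
// so fall back to a conservative 4096 bytes in that case.
qint64 preferredBlockSize(const QString &fileName)
{
    const int bs = QStorageInfo(fileName).blockSize();
    return bs > 0 ? bs : 4096;
}

// Read the whole file sequentially in large, block-aligned chunks,
// handing each chunk to a caller-supplied function.
template <typename Fn>
bool readInBlocks(const QString &fileName, Fn process)
{
    QFile file(fileName);
    if (!file.open(QIODevice::ReadOnly))
        return false;

    // A few thousand blocks per read() keeps syscall overhead low;
    // with sequential access the OS read-ahead does the rest.
    const qint64 chunkSize = preferredBlockSize(fileName) * 2048;
    QByteArray chunk;
    chunk.resize(chunkSize);

    while (!file.atEnd()) {
        const qint64 n = file.read(chunk.data(), chunkSize);
        if (n <= 0)
            return false;
        process(chunk.constData(), n);  // only the first n bytes are valid
    }
    return true;
}
```

Usage would be something like readInBlocks(path, [&](const char *data, qint64 n) { /* parse floats from data */ });. In practice, once the reads are sequential and in the megabyte range, the exact block size matters much less than avoiding the seek-per-value pattern.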

Many thanks!