Hi all,
I need the MD5 sum of a very large file. I found this:
Code:
qDebug() << hashData.toHex(); }
but with a file size > 8 GB it uses far too much memory!
Are there any other methods?
Thanks!
Chris
QCryptographicHash::hash() is a static "helper" function; what you are looking for is QCryptographicHash::addData().
This was answered here: http://www.qtcentre.org/threads/3567...an-entire-file
That works, thanks for the hint!
Is there a way to calculate the md5 sum of a DVD (/dev/cdrom) with this method? It must read the raw DVD data, not the files on the disc!
Open and read the /dev/cdrom device directly: you get the bytes from one end of the disc to the other.
How? QFile doesn't work!
Sure it does, but QIODevice::atEnd() is not reliable on block special files so you have to adopt a slightly different approach:
This will fail to open if there is no disc in the drive. In a real program you would use a larger buffer, distinguish between bytesRead == 0 and -1 etc.
Thanks for the hint, but it doesn't work for me! The md5 sums of the image and the DVD are not the same! On the command line they are identical! Or is the code wrong?
How many bytes did you manage to read from the device?
...the while loop reads all data from dvd!?
Ok, but how? ...sorry!
Read the docs to learn what QFile::read() returns. Then use your newly acquired knowledge to count how many bytes in total were read from the file.
Ok, I used this:
Code:
    QCryptographicHash hash(QCryptographicHash::Md5);
    char buf[2048];
    int bytesRead;
    qint64 overallBytesRead = 0;
    while ((bytesRead = in.read(buf, 2048)) > 0) {
        overallBytesRead += bytesRead;
        hash.addData(buf, 2048);
    }
    in.close();
    qDebug() << "overall bytes read:" << overallBytesRead;
    qDebug() << hash.result().toHex();
} else {
    qDebug() << "Failed to open device!";
}
After it completes, overallBytesRead is "8738865152", but the file size of the image is "8738846720". It reads more bytes than there are on the DVD! I'm confused! :confused:
The hash.addData() call should be:
Code:
hash.addData(buf, bytesRead);
This is not a failing of Qt, but a failing of understanding the data you are handling. The DVD/CD will almost always contain extra padding data at the end of the supplied image to meet requirements of the specifications. You should read (up to) as many bytes from the DVD as are in the image you are trying to compare with. So, for example, a Gentoo image:
Code:
// Original image file
chrisw@newton ~ $ dd if=install-amd64-minimal-20110609.iso bs=2048 | md5sum
64900+0 records in // <<< this is the number of blocks in the image
64900+0 records out
132915200 bytes (133 MB) copied, 4.39815 s, 30.2 MB/s
3acf53667fcf1d03e98068ee4af5f4a3 -
// Reading all data from the raw device... fails
chrisw@newton ~ $ dd if=/dev/cdrom bs=2048 | md5sum
64963+0 records in
64963+0 records out
b0700288a316b71dee09ed87dce3b160 - // <<<< Not good
133044224 bytes (133 MB) copied, 39.3402 s, 3.4 MB/s
// reading correct number of blocks from the device matches
chrisw@newton ~ $ dd if=/dev/cdrom bs=2048 count=64900 | md5sum
64900+0 records in
64900+0 records out
3acf53667fcf1d03e98068ee4af5f4a3 - // <<< Sweet :)
132915200 bytes (133 MB) copied, 37.9759 s, 3.5 MB/s
Okay, I understand! But how can I do this in my code? This works for me:
Code:
    QCryptographicHash hash(QCryptographicHash::Md5);
    qint64 imageSize = fileInfo.size();
    char buf[2048];
    int bytesRead;
    qint64 overallBytesRead = 0;
    while ((bytesRead = in.read(buf, 2048)) > 0) {
        overallBytesRead += bytesRead;
        hash.addData(buf, 2048);
        if (overallBytesRead == imageSize) {
            break;
        }
    }
    in.close();
    qDebug() << "overall bytes read:" << overallBytesRead;
    qDebug() << hash.result().toHex();
} else {
    qDebug() << "Failed to open device!";
}
Is that right?
This works because the image will be an exact number of 2048-byte blocks. If you change that buffer size then you might need to handle a last partial block in both the read() call and the addData() call. For example, if you used a 10000-byte buffer with my Gentoo image of 132,915,200 bytes, the last block will be 5200 bytes and you do not want to read more than that or your hash will be affected.
I'd be inclined to do it this way:
Code:
    QCryptographicHash hash(QCryptographicHash::Md5);
    qint64 imageSize = fileInfo.size();
    const int bufferSize = 10000;
    char buf[bufferSize];
    int bytesRead;
    // qMin() needs both arguments to be the same type, hence the explicit qint64
    int readSize = int(qMin<qint64>(imageSize, bufferSize));
    while (readSize > 0 && (bytesRead = in.read(buf, readSize)) > 0) {
        imageSize -= bytesRead;   // bytes of the image still to be read
        hash.addData(buf, bytesRead);
        readSize = int(qMin<qint64>(imageSize, bufferSize));
    }
    in.close();
    qDebug() << hash.result().toHex();
} else {
    qDebug() << "Failed to open device!";
}
There is always more than one way to do it.
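The partial-block arithmetic from the Gentoo example can be checked with a small standalone sketch. It is plain standard C++ with no Qt and no device access; `ReadPlan` and `planReads` are hypothetical names used only here. It walks the same "read min(remaining, bufferSize)" loop and records how many reads happen and how big the last one is.

```cpp
#include <algorithm>
#include <cstdint>

struct ReadPlan {
    std::int64_t reads = 0;         // number of read() calls needed
    std::int64_t lastReadSize = 0;  // size requested by the final read
    std::int64_t totalBytes = 0;    // sum of all requested sizes
};

// Simulates capping each read at the number of image bytes still remaining.
// Assumes every read returns exactly what was asked for.
ReadPlan planReads(std::int64_t imageSize, std::int64_t bufferSize) {
    ReadPlan plan;
    if (bufferSize <= 0) {
        return plan;  // nothing sensible to do with an empty buffer
    }
    std::int64_t remaining = imageSize;
    while (remaining > 0) {
        const std::int64_t readSize = std::min(remaining, bufferSize);
        remaining -= readSize;
        ++plan.reads;
        plan.lastReadSize = readSize;
        plan.totalBytes += readSize;
    }
    return plan;
}
```

With the 132,915,200-byte image and a 10000-byte buffer this gives 13292 reads, the last one of 5200 bytes. With a 2048-byte buffer it gives 64900 full reads, matching the block count dd reported for the image.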
The smartest thing would probably be to detect the end of the image data instead of relying on having the image size given upfront. Padding probably begins with some fixed pattern or there is some ioctl call that can return the real data size. Otherwise it wouldn't be possible to create the image in the first place.
In the case of my Gentoo CDROM the extra 63 blocks on the CD are all zero. Unfortunately, so are at least the last 100 blocks of the original image. The ends of my VirtualBox additions CD image and of a data DVD are similar. I don't claim to know the complete ins and outs of the various standards but it does seem that zero padding is done at several stages in the mastering and writing of an image (-pad option in mkisofs for example).
If the point of the exercise is to compare an image to the version on a disc then knowing the image size up front is hardly unreasonable.
Okay, I know the image size at this point. I read "image size" / 2048 blocks from DVD in a "for" loop. That works, check sums are identical!
Thank you very much to all for your help!!!:D