How To Extract A File



deekayt
30th November 2006, 10:49
If I have all the binary info of a file (.jpg, .doc, or any other file) stored in a .txt file, how do I display it in its associated program without knowing which format it is?
Say, for example, I have the complete binary info of test.pdf, including the header and other meta info, in an outdata.txt file. How do I display test.pdf after reading outdata.txt (on Windows XP, coding in C/C++)?
Or say I have the complete binary info of image.jpg, including the header and other meta info, in an outdata.txt file. How do I display image.jpg after reading outdata.txt?

wysota
30th November 2006, 14:53
Can you elaborate on that? What do you mean by "complete binary info"? Do you also have the data file itself?

deekayt
30th November 2006, 16:15
I shall attach an example.
The binary information of gun.jpg is in dk11.txt.
Now I want to show gun.jpg through dk11.txt.
I am not aware that the info is that of a jpg file. If you open dk11.txt (double-click on it), it opens in Notepad/WordPad and you see all garbled info.
This is where I want a small piece of code which first checks what file type is stored in binary form in this text file and then opens it in the associated program on Windows XP.

wysota
30th November 2006, 22:03
But this is still a JFIF (commonly known as JPEG) file... The suffix of the file name doesn't change the type of the file. If you use a stupid operating system which relies on file extensions to guess file types, then there is not much you can do. You'd have to implement your own mime-type recognition (often called mime-magic, as it "magically" guesses the file type by looking at the header) or use a library that does that.

For example, Apache or CUPS (not sure about that though) use mime-magic to guess file types, and they use a special file which contains signatures (similar to virus signatures) of file types. They then perform the operations mentioned in the signature for each file type and stop when they find a match.

In this particular example you could search for a "JFIF" string starting at byte 06h of the file. For PNG images you'd look for the string "PNG" starting at byte 01h and an "IHDR" string shortly after (it's not necessary to look for the IHDR string, but it makes it more probable that you're facing a PNG file and not a plain text file starting with "xPNG").

BTW. All files are "binary files" and contain "binary data" - even text files, so the term "binary data" doesn't really mean anything.
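The header check described above can be sketched in plain C++ roughly as follows. This is only a minimal illustration, not a replacement for a real mime-magic library: the function names guessTypeFromBytes and guessTypeFromFile and the returned type strings are made up for this example, the extra test for the FF D8 bytes at the start of a JPEG is an addition not mentioned in the post, and the IHDR offset of 0Ch assumes a standard PNG layout.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper: guesses a file type from the first bytes of a file.
// Returns "jpeg", "png" or "unknown" (the names are arbitrary labels).
std::string guessTypeFromBytes(const unsigned char* data, std::size_t size)
{
    // JFIF: "JFIF" at offset 06h. The FF D8 start-of-image marker is
    // checked as well to reduce false positives (an assumption added here).
    if (size >= 10 && data[0] == 0xFF && data[1] == 0xD8 &&
        std::memcmp(data + 6, "JFIF", 4) == 0)
        return "jpeg";

    // PNG: "PNG" at offset 01h, plus "IHDR" at offset 0Ch so that a text
    // file starting with "xPNG" is not mistaken for an image.
    if (size >= 16 && std::memcmp(data + 1, "PNG", 3) == 0 &&
        std::memcmp(data + 12, "IHDR", 4) == 0)
        return "png";

    return "unknown";
}

// Reads the first 16 bytes of `path` and runs the check on them.
// A file that cannot be opened yields "unknown".
std::string guessTypeFromFile(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    unsigned char header[16] = {};
    in.read(reinterpret_cast<char*>(header), sizeof header);
    return guessTypeFromBytes(header, static_cast<std::size_t>(in.gcount()));
}
```

Calling guessTypeFromFile("dk11.txt") on the attached example would be expected to report "jpeg" if the file really holds an unmodified JFIF image. Further formats would need their own signatures, which is exactly the bookkeeping a signature file takes care of.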

jacek
30th November 2006, 22:23
AFAIR there's a library that can detect file type, but I don't remember its name (I think I've seen it on Freshmeat).

wysota
30th November 2006, 22:55
$ ldd `which file`
linux-gate.so.1 => (0xbfffe000)
libmagic.so.1 => /usr/lib/libmagic.so.1 (0xb7f13000)
libz.so.1 => /lib/libz.so.1 (0xb7f00000)
libc.so.6 => /lib/i686/libc.so.6 (0xb7dd3000)
/lib/ld-linux.so.2 (0xb7f3d000)

I think this says it all ;)

deekayt
5th December 2006, 19:01
I need to use the code in Windows-based Qt (C++).
I am not sure how to use your code in that. Is it for Linux?
Where will I write it in the code? Presently I open the file with the following code:


void steg::showdatafile() // Shows the data file in its associated program
{
    const QString path = ui.datalineEdit->text();
    if (QFile::exists(path))
    {
        // Ask the Windows shell to open the file with its registered handler.
        QStringList args;
        args << "url.dll,FileProtocolHandler" << path;
        // startDetached() is static, so no QProcess instance is needed.
        QProcess::startDetached("rundll32.exe", args);
    }
    else
    {
        QMessageBox::warning(this, tr("FILE ERROR"),
            tr("<p><b>ERROR IN DISPLAYING DATA FILE.</b></p>"
               "<p><b>CHOOSE THE DATA FILE AND MAKE SURE THE FILE EXISTS OR CHOOSE SEPARATE FILE</b></p>"));
    }
}

That is, I am using FileProtocolHandler. But it is going to treat my file as a text file, whereas my requirement is as described above.

wysota
5th December 2006, 19:27
Well... Windows is the operating system I was talking about in one of my previous posts in this thread. If I were you, I'd check if libmagic compiles on your system. It's probably a GNU utility, so it will probably work.

See here: http://gnuwin32.sourceforge.net/packages/file.htm