Display contents of .txt file



karmo
21st January 2010, 11:42
I display the whole .txt file like this:


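char buff[1024];   // line buffer; the size is an assumption, the declaration was not shown
QFile fp("file.txt");
if (fp.open(IO_ReadOnly)) {   // IO_ReadOnly is the Qt 3 name; Qt 4 spells it QIODevice::ReadOnly
    while (!fp.atEnd()) {
        fp.readLine(buff, sizeof(buff));   // reads one line into buff
        textEdit->append(buff);
    }
    fp.close();
}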

But I would like to display the last 20 lines, or the next-to-last 20 lines. How can I achieve this?

high_flyer
21st January 2010, 12:50
One way I can think of is to first read the lines into a QStringList and then show only the last 20 items of the list.
Another way is to read the whole file into one string and then search backwards in the string for 20 '\n' (newline) characters.
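
A minimal sketch of the second idea, assuming the file contents are already in memory. It uses std::string instead of QString so that it stands alone, and the function name lastLines is made up for illustration. A newline at the very end of the text is treated as terminating the last line rather than starting an empty one.

```cpp
#include <cstddef>
#include <string>

// Returns the part of `text` that holds its last `n` lines
// (hypothetical helper, not part of Qt or the standard library).
std::string lastLines(const std::string &text, std::size_t n)
{
    if (n == 0 || text.empty())
        return std::string();

    std::size_t pos = text.size();
    // Skip the newline that terminates the final line, if there is one.
    if (text[pos - 1] == '\n')
        --pos;

    // Every newline found while walking backwards is one more line boundary.
    for (std::size_t found = 0; found < n; ++found) {
        if (pos == 0)
            return text;                  // reached the start: fewer than n lines
        pos = text.rfind('\n', pos - 1);
        if (pos == std::string::npos)
            return text;                  // fewer than n lines: return everything
    }
    return text.substr(pos + 1);          // starts just after the n-th newline from the end
}
```

This only makes sense when the file comfortably fits in memory, since the whole text has to be loaded first.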

squidge
21st January 2010, 13:01
Assuming a line length will never exceed 132 characters, you could seek to the end of the file, then seek backwards 2,640 bytes (132*20), and then read forwards to the end. Count the number of lines in your buffer, and if it's >= 20 then show the last 20, else read another 2,640 bytes and try again. That way you don't have to read the entire file.

high_flyer
21st January 2010, 14:34
Originally posted by squidge: "Assuming a line length will never exceed 132 characters, you could seek to the end of the file, then seek backwards 2,640 bytes (132*20), and then read forwards to the end. Count the number of lines in your buffer, and if it's >= 20 then show the last 20, else read another 2,640 bytes and try again. That way you don't have to read the entire file."

This method requires that you know how the file is structured, and won't be useful if you don't know that.
If you have that information (the length of a line), then you can target the correct place to start reading from by subtracting the size of 20 lines from the file's end.

JD2000
21st January 2010, 15:30
This will give you the number of lines in the file:


FILE *fp = fopen("file.txt", "r");   /* fgetc needs a FILE*, not a file name */
int c, lines = 0;
while ((c = fgetc(fp)) != EOF)
    if (c == '\n') lines++;

Subtract 20 from the result!

high_flyer
21st January 2010, 16:34
Originally posted by JD2000: "This will give you the number of lines in the file:"
But that is not what is needed.
He wants to display the last 20 lines, not to know how many there are.
Or rather, he needs to know where line N-20 begins in the file.

JD2000
21st January 2010, 17:06
But if there are, say, 25 lines in the file, you know that you want to start after line 5.
So do:


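rewind(fp);      /* back to the start of the file */
int posn = 5;    /* lines to skip: 25 in total minus the 20 wanted */
lines = 0;
while ((c = fgetc(fp)) != EOF) {
    if (lines >= posn) putchar(c);   /* now at the required position in the file: print the rest */
    if (c == '\n') lines++;
}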

wysota
22nd January 2010, 10:20
Originally posted by JD2000: "But if there are say 25 lines in the file, you know that you want to start with line 5."
But you have to read the file twice. Good luck if the file is 4GB in size and contains about 50M lines.

Reading the whole file into memory is out of the question as well.

I'd suggest to index the file by reading it and remembering positions of each newline character. Once you reach the end you know exactly how far to seek back. Then you only need to reread the part of the file you are interested in.

boudie
22nd January 2010, 11:01
If it's needed on Linux and it doesn't have to be multi-platform, then you can use 'tail -n xxx' in a sub-process to get the last xxx-lines.

JD2000
22nd January 2010, 13:54
Originally posted by wysota: "But you have to read the file twice. Good luck if the file is 4GB in size and contains about 50M lines. Reading the whole file into memory is out of the question as well."

I am currently using a Windows PC, the search facility classes large files as 1 MB and huge as 5 MB. I doubt that there is any problem reading such a file into memory.

On my system the txt files are typically much smaller, and this routine works well enough on them.

I admit that the double pass is a small drawback, but if you want a professional solution you can download the GNU code and adapt that.

The source is at http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/tail.c;h=43fd6d4128f17f008d9ac327cb9072297714f33b;hb=0482f19 and it's a mere 1700 lines of code!

wysota
22nd January 2010, 15:33
Originally posted by JD2000: "I am currently using a Windows PC, the search facility classes large files as 1 MB and huge as 5 MB. I doubt that there is any problem reading such a file into memory."
4GB is about 4000 times larger than 1MB. And I believe on some filesystems you can have even larger files.


Originally posted by JD2000: "On my system the txt files are typically much smaller and this routine works well enough on them."
But the thread author is not implementing a clone of your system. And your system works only because nobody has tried passing a much larger file to it just yet. But eventually someone will, you will see.

Coises
22nd January 2010, 17:20
Originally posted by karmo: "I would like to display the last 20 lines or the next-to-last 20 lines."

Unless you have independent information about the structure of the file (such as line lengths), you’ll need either to read the whole file, saving some information until you get to the end, or to read it backwards by using the QFile::size and QFile::seek functions.

If you read the whole file, begin by creating a circular buffer (http://en.wikipedia.org/wiki/Circular_buffer) to contain the number of elements before the end (counting the last element) at which you need to start displaying. If the file is random access, you can store the return values of QFile::pos; if the file might not be random access, or if you are confident the files will rarely be large, you can store the contents of the lines themselves (e.g., as QStrings). When you reach the end, use QFile::seek to reposition to the correct location and read the data if you’ve stored positions, or simply display the appropriate elements of the buffer if you’ve stored data.

To read backwards, pick a sensible buffer size (in this case you’ll be buffering bytes, not positions or QStrings), then QFile::seek to QFile::size minus the buffer size. Read into the buffer using QFile::read (readData is the protected implementation behind it) and parse it yourself for line endings. Remember that the first line in the buffer is not necessarily complete! If you have enough lines, display them; if not, seek buffer size bytes earlier than before (but not to a position less than zero), allocate, read and parse another buffer, and so on.

squidge
22nd January 2010, 19:38
Originally posted by high_flyer: "This method requires that you know how the file is structured, and won't be useful if you don't know that."

Actually, it doesn't. If the line length is only 20 bytes (for example), then you'll just read 2640/20 = 132 lines which is more than enough. If they are 200 bytes then you'll only read 13 lines, so you'll keep going back until the number of lines >= 20 (see my original post). The same goes for if the lines are all different sizes.

Worst case is that you read 2.5KB too much, which I don't think is a big deal on modern computers. It's much better than reading a whole 4MB file to determine how many lines are in the file.

wysota
23rd January 2010, 08:46
Originally posted by squidge: "Actually, it doesn't. If the line length is only 20 bytes (for example), then you'll just read 2640/20 = 132 lines which is more than enough. If they are 200 bytes then you'll only read 13 lines, so you'll keep going back until the number of lines >= 20 (see my original post). The same goes for if the lines are all different sizes. Worst case is that you read 2.5KB too much, which I don't think is a big deal on modern computers. It's much better than reading a whole 4MB file to determine how many lines are in the file."

I don't feel comfortable with assuming that no line will exceed 132 characters. Why 132 and not 133? Or 80? Or 900? To me the only reliable and general way is to read through the whole file, unfortunately. Whether you do it from the beginning or from the end is irrelevant. I understand the thread author wants to be able to display an arbitrary 20 lines, not only the last 20 or second-to-last 20 or first 20.

squidge
23rd January 2010, 16:18
Lines can exceed 132 characters; I just assumed that for the simple reason of buffering. If the lines are longer, the method will still work, it'll just require more read() calls. But it'll only work if you want the last 20 lines. As soon as you say "20 lines starting from any point in the file" then you really have no choice but to start from the beginning, as you don't know the number of lines, and thus can't optimise.

wysota
23rd January 2010, 17:37
Originally posted by squidge: "Lines can exceed 132 characters, I just assumed that for the simple reason of buffering."
But if you don't know the maximum line length, how much would you read?

squidge
23rd January 2010, 20:08
Originally posted by wysota: "But if you don't know the maximum line length, how much would you read?"

Depends on the requirement. If it was 20 lines, then I would read 20 * 132 as a starting point. If I don't see 20 new lines in the buffer, then I would guess at another figure depending on how many lines I did see (appending to the data already in the buffer), and keep doing that until I matched the requirement.

wysota
23rd January 2010, 20:39
Effectively reading the whole file ;) For instance if the whole file contained only a single line.