break a file



mickey
24th February 2008, 11:12
Hello,
I have a text file with, e.g., 600 lines. I have to break it into 3 parts and create 3 new files, each with 200 lines. What's the best way? Furthermore, 600 lines is a lucky case because it could just as well be 599. In that case I have to split it into 3 parts that are as equal as possible.

Thanks.

wysota
24th February 2008, 11:21
INT = 599 / 3 = 199
FRAC = 599 % 3 = 2

if FRAC !=0 then INT = INT + 1

The first two files get INT lines each (200 here) and the last one gets the rest (199).
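
A minimal sketch of this arithmetic, in Python (the thread names no language, so that choice is an assumption). The function name `chunk_sizes` is hypothetical. It uses the same INT and FRAC values but gives one extra line to each of the first FRAC files, which is a small variation on the rule above: for 599 lines the result is identical, while for 598 lines it yields 200, 199, 199 rather than 200, 200, 198.

```python
def chunk_sizes(total_lines, parts):
    """Return one size per output file; sizes differ by at most one line."""
    # base is INT and extra is FRAC in the pseudocode above.
    base, extra = divmod(total_lines, parts)
    # The first `extra` files take one additional line each.
    return [base + 1 if i < extra else base for i in range(parts)]


print(chunk_sizes(600, 3))
print(chunk_sizes(599, 3))
print(chunk_sizes(598, 3))
```

The sizes always add up to the total, so no line is lost or duplicated when the file is cut at these boundaries.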

mickey
24th February 2008, 11:38
Yes, but what I was wondering is how to retrieve the number of lines... what's the quickest method? Simply counting the lines before splitting?

wysota
24th February 2008, 11:44
The most straightforward way is not to retrieve the number of lines at all. If the way the lines are divided doesn't matter, you can split the file the same way you deal cards: one line to file 1, the next to file 2, the next to file 3, the next to file 1 again, and so on. If you need the first 200 lines to go to file 1, the next 200 to file 2 and the remaining 199 to file 3, then you'll have to count the lines, either by reading all of them into a list and taking its length (if the file is small) or by scanning the file for newline characters and counting them (adding 1 if the last line has no trailing newline).
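
Both ideas can be sketched roughly as follows, again in Python as an assumed language. The names `deal_round_robin` and `count_lines` are hypothetical, and in-memory streams stand in for real files so the sketch is self-contained.

```python
import io


def deal_round_robin(lines, outputs):
    """Deal lines like cards: line 0 to outputs[0], line 1 to outputs[1], ..."""
    for index, line in enumerate(lines):
        outputs[index % len(outputs)].write(line)


def count_lines(stream, block_size=64 * 1024):
    """Count lines in a binary stream by scanning blocks for newline bytes."""
    newlines = 0
    last_byte = b"\n"  # an empty stream counts as zero lines
    while True:
        block = stream.read(block_size)
        if not block:
            break
        newlines += block.count(b"\n")
        last_byte = block[-1:]
    # Add 1 only when the final line lacks a trailing newline.
    return newlines + (0 if last_byte == b"\n" else 1)


source = io.StringIO("a\nb\nc\nd\ne\n")
outs = [io.StringIO() for _ in range(3)]
deal_round_robin(source, outs)
print([out.getvalue() for out in outs])
print(count_lines(io.BytesIO(b"a\nb\nc")))
```

With real files, the same functions would take objects returned by `open()`; `count_lines` expects the file to be opened in binary mode (`"rb"`).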

mickey
24th February 2008, 11:53
EDIT: I have to put a percentage of the lines into file1, file2 and file3 (e.g. I have double p1=0.5, p2=0.4, p3=0.1), but, importantly, the lines must be chosen at random...

wysota
24th February 2008, 12:17
Two choices then:
1. read all lines into memory, shuffle them and write them out to the files in the given percentages
2. read lines one by one and write to a random file but always check that you don't go over the desired percentage limit (you'll have to keep the number of lines already stored in each file). In the end you should reach the desired percentages.
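
A rough sketch of both choices, in Python as an assumed language. The names `split_shuffled` and `split_streaming` are hypothetical, and lists stand in for the output files. For the second choice, the per-file limits (`quotas`) are taken as line counts the caller has already worked out from the percentages. Weighting the random pick by the room left in each file is an added assumption here, not something stated above; it keeps every line equally likely to land in any free slot.

```python
import random


def split_shuffled(lines, fractions, rng=random):
    """Choice 1: shuffle everything in memory, then cut consecutive slices."""
    shuffled = list(lines)
    rng.shuffle(shuffled)
    total = len(shuffled)
    parts, start, cumulative = [], 0, 0.0
    for i, fraction in enumerate(fractions):
        cumulative += fraction
        # The last part takes whatever is left, so rounding never drops a line.
        is_last = i == len(fractions) - 1
        end = total if is_last else round(cumulative * total)
        parts.append(shuffled[start:end])
        start = end
    return parts


def split_streaming(lines, quotas, rng=random):
    """Choice 2: send each line to a random part that still has room."""
    parts = [[] for _ in quotas]
    remaining = list(quotas)  # free slots left in each part
    indices = list(range(len(quotas)))
    for line in lines:
        if sum(remaining) == 0:
            raise ValueError("more lines than the quotas allow")
        # Parts with no room left have weight 0 and are not picked.
        chosen = rng.choices(indices, weights=remaining)[0]
        parts[chosen].append(line)
        remaining[chosen] -= 1
    return parts


rng = random.Random(0)  # fixed seed only to make the demo repeatable
lines = [f"line {n}\n" for n in range(10)]
print([len(p) for p in split_shuffled(lines, [0.5, 0.4, 0.1], rng)])
print([len(p) for p in split_streaming(lines, [5, 4, 1], rng)])
```

The first function needs the whole file in memory; the second reads one line at a time but needs the total line count up front to turn the percentages into quotas.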