Hi, I need help implementing this logic.

I have an input file in the format below:

AAAAA HsfAgr345S 001 pos gfdgojog
AAAAA HsfAgr346S 001 pos gfdgojog
AAAAA HsfAgr347S 001 pos gfdgojog
BBBBB POSgfgh571 002 ipo postalap
BBBBB POSgfgh572 002 ipo postalap
BBBBB POSgfgh573 002 ipo postalap
BBBBB POSgfgh574 002 ipo postalap
BBBBB POSgfgh575 002 ipo postalap

  1. I have to split the input file into two output files based on 001 and 002, i.e., I write the 001 records to one file and the 002 records to another. - I am done with this.

  2. No more than 50 records per file. If the input exceeds 50 records, I write to another 001 or 002 file, appending 'n' (1...n) to the end of the filename. - I am done with this.

  3. All records for a member (either AAAAA or BBBBB) must be included in the same output file
    (i.e., if the 50th record would split up a member's records, then split the file at the last record of the previous member). - I need help with this logic.

For example, if BBBBB POSgfgh573 002 ipo postalap is the 50th line of an output file, I have to take all the BBBBB records (the last five lines of the input file) and write them to the next output file.

Please help me with this logic.

Assuming you have random access to the input file, you could do something like this: When you read a line that represents a new member (i.e., it's different than the last line's member), hang onto the data in memory for a second. Scan the file, counting lines that represent that member. If there are too many lines for that member to fit in what remains of the current file, close it and start a new one; otherwise, keep using the same file. Either way, write out the in-memory record, seek back to the next one, and continue reading records.
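A minimal sketch of the look-ahead step described above, in Python. The helper name count_run is hypothetical, and it assumes the file is opened in binary mode, the member ID is the first whitespace-separated field, and a member's records are contiguous. It counts the lines for one member starting at the current position, then seeks back so normal reading can continue.

```python
def count_run(f, member):
    """Count consecutive lines for `member` from the current position, then seek back.

    Assumes `f` is a seekable binary file and `member` is a bytes value
    matching the first whitespace-separated field of each line.
    """
    start = f.tell()              # remember where the member's records begin
    count = 0
    while True:
        line = f.readline()
        # Stop at end of file or at the first line for a different member.
        if not line or line.split()[:1] != [member]:
            break
        count += 1
    f.seek(start)                 # rewind so the caller can read the records again
    return count
```

The caller would compare the returned count with the room left in the current output file and start a new file before writing if the member does not fit.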

If you have the memory available, you could also keep reading in records and only write them out once you have the last one for a member.
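The buffering approach could look roughly like this in Python. This is only a sketch under some assumptions: the member ID is the first whitespace-separated field, there are no blank lines, and member records are contiguous. The names member_key and split_by_member are made up for illustration. A member with more records than the limit ends up alone in an oversized chunk here, which may or may not be what you want.

```python
from itertools import groupby

MAX_RECORDS = 50  # limit stated in the question


def member_key(line):
    # Assumption: the member ID is the first whitespace-separated field.
    return line.split()[0]


def split_by_member(lines, max_records=MAX_RECORDS):
    """Pack whole member groups into chunks of at most max_records lines."""
    chunks = []
    current = []
    for _, group in groupby(lines, key=member_key):
        records = list(group)  # buffer all of one member's records
        # Start a new chunk if this member would overflow the current one.
        if current and len(current) + len(records) > max_records:
            chunks.append(current)
            current = []
        current.extend(records)
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be written to its own numbered output file.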

All of this is assuming member records are stored contiguously; if they can come in any (dis)order, that's another conversation.

You have already done it. According to your sample file, all AAAAAs belong to the 001 group and all BBBBBs belong to the 002 group. Mission accomplished, if the layout of your example is correct.

If not, gusano79 is approaching it correctly. In addition, the only way you are going to know if there is room in the current output file is to keep a total of the rows you have added.

You will need to keep in memory both the number of rows already in the output file and the number of member rows you need to add to that file.

You also did not address what should happen if a member has more than 50 rows.
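As a rough illustration of that bookkeeping, here is a small Python sketch. The class OutputTracker and its method place are hypothetical names, and the default limit of 50 comes from the question. It keeps the running row count for the current output file and decides which numbered file a member's rows should go to.

```python
class OutputTracker:
    """Tracks how many rows the current output file holds (hypothetical helper)."""

    def __init__(self, limit=50):
        self.limit = limit
        self.file_index = 1  # the suffix n appended to the output filename
        self.rows = 0        # rows already written to the current file

    def place(self, member_rows):
        """Return the index of the file that should receive a member's rows."""
        # Roll over only if the file already has rows and the member will not fit.
        if self.rows and self.rows + member_rows > self.limit:
            self.file_index += 1
            self.rows = 0
        self.rows += member_rows
        return self.file_index
```

Note that a member with more rows than the limit is simply given a file of its own in this sketch; the original requirements do not say how that case should be handled.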

Good luck!


Yes gusano, I am trying your suggestion. I will post the result soon.
