I have one large XML file (600 MB to 850 MB) named in the format "cells_yyyymmdd_hhmi.xml". Every day I will get a new file with a new date in its name, so there should be a general way to read it and cut it.

For example, the file for 7th January is cells_20140107_154016.xml.

The goal is to split the file into smaller parts with a shell script and then run an operation on each part. It would also be great if anyone could suggest how to check the file size and, if the file is too big, make 4 parts instead of 3.

What I have done so far:

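# Part 1 already starts with the file's own opening tags; add the closing tags.
head -n 1125000 cells_20140107_154016.xml > PART1.xml
echo "</details></cells>" >> PART1.xml

# Part 2 needs both the opening and the closing tags.
echo "<cells><details>" > PART2.xml
sed -n '1125001,2250000p' cells_20140107_154016.xml >> PART2.xml
echo "</details></cells>" >> PART2.xml

# Part 3 is assumed to run to the end of the file, so it keeps the original closing tags.
echo "<cells><details>" > PART3.xml
sed -n '2250001,3480000p' cells_20140107_154016.xml >> PART3.xml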

The main task is to make this work in general, for any day's file.

Expected output:

head -n 1125000 filename.xml > PART1.xml
echo "</details></cells>" >> PART1.xml

echo "<cells><details>" > PART2.xml
sed -n '1125001,2250000p' filename.xml >> PART2.xml
echo "</details></cells>" >> PART2.xml

echo "<cells><details>" > PART3.xml
sed -n '2250001,3480000p' filename.xml >> PART3.xml

I hope I am clear.

Thanks in advance for your time and input.

Edited by PriteshP23: code

Last Post by rubberman

Look at the "split" command. Its man page explains how to cut a file into pieces by line count or by size, and it produces all the pieces in one pass.
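As a rough illustration of that suggestion, here is a minimal sketch that combines "split -l" with the tag wrapping from the question. The function name split_cells, the 700 MB threshold, and the PART_ chunk prefix are assumptions made for this example, not anything given in the thread. It also assumes the file starts with a "<cells><details>" line, ends with a "</details></cells>" line, and that no record straddles a chunk boundary, which is the same limitation the head/sed approach has.

```shell
#!/bin/sh
# split_cells FILE [LIMIT_BYTES]  (hypothetical helper, not from the thread)
# Assumes FILE starts with a "<cells><details>" line and ends with a
# "</details></cells>" line.
split_cells() {
    file=$1
    limit=${2:-734003200}                   # assumed threshold: 700 MB

    size=$(( $(wc -c < "$file") ))          # file size in bytes
    total=$(( $(wc -l < "$file") ))         # number of lines
    [ "$total" -gt 0 ] || return 1          # nothing to split

    # 4 parts for a file above the limit, otherwise 3
    if [ "$size" -gt "$limit" ]; then parts=4; else parts=3; fi
    per=$(( (total + parts - 1) / parts ))  # lines per part, rounded up

    rm -f PART_??                           # drop chunks left by an earlier run
    split -l "$per" "$file" PART_           # writes PART_aa, PART_ab, ...

    set -- PART_??                          # the chunks, in order
    count=$#
    n=0
    for chunk in "$@"; do
        n=$(( n + 1 ))
        out="PART$n.xml"
        : > "$out"                          # start from an empty file
        # only the first chunk already carries the opening tags
        [ "$n" -eq 1 ] || echo "<cells><details>" >> "$out"
        cat "$chunk" >> "$out"
        # only the last chunk already carries the closing tags
        [ "$n" -eq "$count" ] || echo "</details></cells>" >> "$out"
        rm -f "$chunk"
    done
}

# Usage sketch: process the newest dated file in the current directory.
# split_cells "$(ls -t cells_*.xml | head -n 1)"
```

Passing the file name as an argument is what makes the script independent of the date in the name. The line count per part is computed from "wc -l" instead of being hard-coded, so the same script works whether the day's file is 600 MB or 850 MB.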
