Hello,

I have one big XML file (600 MB to 850 MB) named in the format "cells_yyyymmdd_hhmi.xml". A new file with a new date arrives every day, so I need a general way to read it and cut it up.

For example, the file for 7th January is cells_20140107_154016.xml.

The goal is to split it into smaller parts with a shell script and then process each part. It would be great if anyone could suggest how to check the file size and, if the file is too big, make 4 parts instead of 3.

What I have done so far:

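# Part 1: the first lines of the file, plus closing tags.
head -n 1125000 cells_20140107_154016.xml > PART1.xml
echo "</details></cells>" >> PART1.xml

# Part 2: opening tags, the middle lines, closing tags.
echo "<cells><details>" > PART2.xml
sed -n '1125001,2250000p' cells_20140107_154016.xml >> PART2.xml
echo "</details></cells>" >> PART2.xml

# Part 3: opening tags, then the remaining lines of the file.
echo "<cells><details>" > PART3.xml
sed -n '2250001,3480000p' cells_20140107_154016.xml >> PART3.xml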

The main task is to make this work for any file, not just this one.

Expected output:

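# Part 1: the first lines of the file, plus closing tags.
head -n 1125000 filename.xml > PART1.xml
echo "</details></cells>" >> PART1.xml

# Part 2: opening tags, the middle lines, closing tags.
echo "<cells><details>" > PART2.xml
sed -n '1125001,2250000p' filename.xml >> PART2.xml
echo "</details></cells>" >> PART2.xml

# Part 3: opening tags, then the remaining lines of the file.
echo "<cells><details>" > PART3.xml
sed -n '2250001,3480000p' filename.xml >> PART3.xml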

I hope I am clear.

Thanks in advance for your time and input.

Look up the "split" command. Its man page tells you exactly how to do what you want, and it will do it all at once.
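
To make that concrete, here is a rough sketch of how split could be combined with a file-size check. It is only one possible approach, not a tested production script. The function name split_cells, the 800 MiB cutoff, and the chunk prefix are assumptions made up for illustration; the PARTn.xml names and the wrapper tags are taken from the question. It also assumes that no record spans a line break, which the head/sed version above assumes as well.

```shell
#!/bin/sh
# Sketch only. Assumptions not stated in the thread: the 800 MiB cutoff,
# the chunk prefix, and that no XML record spans a line break.

split_cells() {
    file=$1
    limit=${2:-838860800}            # hypothetical cutoff in bytes (800 MiB)

    # Pick 4 parts for files above the cutoff, 3 otherwise.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -gt "$limit" ]; then parts=4; else parts=3; fi

    # Lines per part, rounded up so split never makes more than $parts chunks.
    total=$(( $(wc -l < "$file") ))
    per_part=$(( (total + parts - 1) / parts ))

    # split writes the chunks as <file>.part.aa, <file>.part.ab, ...
    split -l "$per_part" "$file" "$file.part."

    set -- "$file".part.??           # the chunks, in order
    count=$#
    n=0
    for chunk in "$@"; do
        n=$((n + 1))
        {
            # The first chunk already starts with the file's own opening tags.
            if [ "$n" -gt 1 ]; then echo "<cells><details>"; fi
            cat "$chunk"
            # The last chunk already ends with the file's own closing tags.
            if [ "$n" -lt "$count" ]; then echo "</details></cells>"; fi
        } > "PART$n.xml"
        rm "$chunk"
    done
}
```

Since the file name changes every day, the newest file could be picked up with something like split_cells "$(ls -t cells_*.xml | head -n 1)", assuming the files sit in the current directory and the names contain no spaces.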
