
Split txt file into smaller txt files

Hi,
I am new to shell scripting and I need your help. I have a very big txt file that contains articles, authors, etc.
Each article has this form:
.I .

What I need is a script that will split this file into smaller ones. The script should find the marker '.I' and put the text that follows '.I' into a new .txt file named after the article's serial number.
Please help me... :S

yupi
Newbie Poster
3 posts since Mar 2008
Reputation Points: 10
Solved Threads: 0
 

DON'T USE THIS YET: I found a problem. Once this line is removed, it should be solved =)

file="/path/to/file"
serial=""
# read the file line by line
while IFS= read -r line; do
 case $line in
  .*)
   # marker line: keep only its digits as the serial number
   serial=`printf '%s' "$line" | tr -dc '0-9'`
   ;;
  *)
   # article line: append it to the current article's file
   if [ -n "$serial" ]; then
    printf '%s\n' "$line" >> "$serial.txt"
   fi
   ;;
 esac
done < "$file"


I didn't test this, so be careful =D
The loop goes through every line of the file set on the first line of the script.
The case statement checks whether the line starts with a . (dot):
if it starts with a dot, it gets the serial number from that line;
otherwise it appends the line to a file with the serial number as its name.
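
To see the idea on something concrete, here is a minimal sketch that runs on made-up data. The sample layout (a ".I <number>" line followed by the article's lines), the file name articles.txt, and the temporary directory are all assumptions for illustration, not the poster's real file. Unlike the description above, it matches only lines that begin with ".I", so other dot-prefixed lines would be treated as article text.

```shell
#!/bin/bash
# Sketch only: the sample data and file names below are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up input: each article starts with ".I <serial number>"
cat > articles.txt <<'EOF'
.I 1
first article, line one
first article, line two
.I 2
second article
EOF

serial=""
while IFS= read -r line; do
  case $line in
    ".I"*)
      # Marker line: keep only the digits as the serial number
      serial=$(printf '%s' "$line" | tr -dc '0-9')
      ;;
    *)
      # Article line: append to the current article's file,
      # skipping anything that appears before the first marker
      if [ -n "$serial" ]; then
        printf '%s\n' "$line" >> "$serial.txt"
      fi
      ;;
  esac
done < articles.txt
```

With this sample, the script should leave 1.txt and 2.txt next to articles.txt in the temporary directory.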


I hope it works =D

Kruptein
Posting Whiz in Training
258 posts since Sep 2009
Reputation Points: 25
Solved Threads: 11
 
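#!/bin/bash
file="/home/darragh/Bureaublad/test.txt"
data=`cat $file`
# drop everything up to and including the first "<"
data=${data#*<}
# n=0: the next field is a serial number; n=1: the next field is article text
n=0
# stop when what follows the next ">" is just "."
until [ `echo ${data#*>}` = "." ]
do
 if [ $n = 0 ]
 then
  # the text before the next ">" is the serial number
  serial=${data%%>*}
  n=1
 else
  # the text before the next ">" is appended to <serial>.txt
  echo `echo ${data%%>*}` >> $serial.txt
  n=0
 fi
 # move on to the field after the next "<"
 data=${data#*<}
done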


This one works without any error.

The only requirement is that the first "<" is followed by a serial number.

(If you want to know how this works, please PM me or reply in this thread.)
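
For anyone curious, the script leans on two bash parameter expansions: ${data#*<} removes the shortest prefix that ends in "<", and ${data%%>*} removes the longest suffix that starts at ">". A small sketch of one serial/text pair follows; the sample string is hypothetical and only guesses at the real file layout.

```shell
#!/bin/bash
# Hypothetical sample; the real file layout may differ.
data='.I <12> .W <some article text> .I <13> .W <more text>'

# Drop everything up to and including the first "<"
data=${data#*<}
# Keep only what precedes the first ">": the serial number
serial=${data%%>*}
echo "serial: $serial"

# Same two steps again give the article text
data=${data#*<}
text=${data%%>*}
echo "text: $text"
```

Repeating the two steps walks through the fields in order, which is why the script alternates between reading a serial number and reading text.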

Kruptein
 

Thank you very much... that's what I needed!!!
Thanks :)

yupi
 

I just found out that there are some minor issues, which will not affect your data:
- The loop keeps going even after all files have been made (I'm fixing that one).
- In the terminal it prints an error along the lines of "unary operator expected" for the = test, but this does not harm the output.
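
The second message most likely comes from an unquoted expansion inside [ ... ]: when the expansion is empty, the test is left with nothing on the left of the =. A minimal sketch of the difference, assuming bash; the variable names are made up for the illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for an expansion that came out empty
rest=""

# Unquoted: the shell sees [ = "." ], which is a malformed test.
# The error message is hidden here; only the exit status is kept.
unquoted_status=0
[ $rest = "." ] 2>/dev/null || unquoted_status=$?

# Quoted: the shell sees [ "" = "." ], a valid comparison that is simply false.
quoted_status=0
[ "$rest" = "." ] || quoted_status=$?

echo "unquoted status: $unquoted_status"
echo "quoted status: $quoted_status"
```

A status above 1 means the test itself failed to run, while 1 just means the comparison was false, so quoting the expansion is the usual way to get rid of the message.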

Kruptein
 

Okay, this is the last change to the code (without errors or endless loops):

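#!/bin/bash
file="/home/darragh/Bureaublad/test.txt"
data=`cat "$file"`
# drop everything up to and including the first "<"
data=${data#*<}
# n=0: the next field is a serial number; n=1: the next field is article text
n=0
# x=1 means this is the last pass through the loop
x=0
while [ $x = 0 ]
do
 # keep going while the remaining data contains a space or an apostrophe
 if [[ "$data" =~ \ |\' ]]
 then
  x=0
 else
  # nothing like that left: if a serial is pending, make this the last pass
  if [ $n = 1 ]
  then
   x=1
  fi
 fi
 if [ $n = 0 ]
 then
  # the text before the next ">" is the serial number
  serial=${data%%>*}
  n=1
 else
  # the text before the next ">" is appended to <serial>.txt
  echo `echo ${data%%>*}` >> "$serial.txt"
  n=0
 fi
 # move on to the field after the next "<"
 data=${data#*<}
done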


I know it could be made shorter, given all those ifs, but I have to go now.
I hope I helped.
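
As a rough idea of what a shorter version might look like, here is a sketch that simply loops while a "<...>" pair is still left in the data, instead of tracking the n and x flags. It is a different stopping rule from the script above, and the sample data, the file name test.txt, and the temporary directory are assumptions for illustration. It also assumes every serial number is usable as a file name.

```shell
#!/bin/bash
# Sketch only: the sample data and paths are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '.I <12> .W <first article> .I <13> .W <second article>' > test.txt

data=$(cat test.txt)
# Loop while at least one "<...>" field remains
while [[ $data == *"<"*">"* ]]; do
  # First field of the pair: the serial number
  data=${data#*<}
  serial=${data%%>*}
  # Stop if there is no text field to go with this serial
  [[ $data == *"<"*">"* ]] || break
  # Second field of the pair: the article text
  data=${data#*<}
  printf '%s\n' "${data%%>*}" >> "$serial.txt"
done
```

With the sample line above, this should write "first article" to 12.txt and "second article" to 13.txt, then stop because no "<" is left.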

Kruptein
 
