Quick question about text processing

Mushy-pea

  #1
Jan 22nd, 2007
Hello everyone. I need to write a Bash script to remove blocks of text from a given file. The text to be removed is marked by appearing between a pair of delimiter characters, e.g.

    Here's some text and ~this bit gets removed~, where tilda is the delimiter.

Could someone tell me the kind of commands I need to research to do stuff with text like this? I can work out the rest. Any help appreciated.

Steven.
The one question you should not ask when teaching a new language structure is "Do you understand?". Do you understand?
jim mcnamara

  #2
Jan 26th, 2007
You need to use sed, and you have to learn about regular expressions.
Based on your data, a very specific (not generalized) solution is:
    # In Bash, `echo ... | read var` sets var in a subshell and the value is lost, so assign directly.
    $> var="Here's some text and ~this bit gets removed~, where tilda is the delimiter."
    $> echo "$var"
    Here's some text and ~this bit gets removed~, where tilda is the delimiter.
    $> echo "$var" | sed 's/~[A-Za-z ]*.~//1'
    Here's some text and , where tilda is the delimiter.
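
For comparison, here is a minimal Python sketch of a more general pattern. It assumes the delimiter is a single character and never appears inside a marked block. The name strip_marked is made up for this illustration and is not from the reply above. The pattern matches a delimiter, any run of non-delimiter characters, and the closing delimiter, so digits and punctuation inside the block are handled too.

```python
import re


# Hypothetical helper for illustration only.
def strip_marked(text, delim="~"):
    """Remove every block enclosed in a pair of single-character delimiters."""
    d = re.escape(delim)
    # delimiter, then any characters that are not the delimiter, then delimiter
    pattern = d + "[^" + d + "]*" + d
    return re.sub(pattern, "", text)


line = "Here's some text and ~this bit gets removed~, where tilda is the delimiter."
print(strip_marked(line))
```

Unlike the letters-and-spaces character class, this version also removes every marked block on the line, not only the first.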
ghostdog74

  #3
Jan 27th, 2007
Originally Posted by Mushy-pea:
Could someone tell me the kind of commands I need to research to do stuff with text like this? I can work out the rest. Any help appreciated.
Tools such as sed, awk, and Perl are used for text processing. They make heavy use of regular expressions, so you need to research how regular expressions work. Text processing can also be done without regular expressions. If you have Python on your machine, here's a simple way to do what you want.
    indices = []  # positions of each "~" in the string
    s = "Here's some text and ~this bit gets removed~, where tilda is the delimiter."
    for num, ch in enumerate(s):
        if ch == "~":
            indices.append(num)
    s = list(s)  # turn s into a list so we can modify it in place
    del s[indices[0]:indices[1] + 1]  # drop everything from the first "~" through the second
    print("".join(s))
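
The snippet above deletes only the text between the first pair of tildes. As a rough sketch of how the same regex-free idea might extend to several marked blocks, the text can be split on the delimiter and only the pieces outside the markers kept. The function name remove_between is hypothetical, and leaving an unmatched trailing delimiter in place is an assumption about the desired behaviour.

```python
# Hypothetical helper for illustration only.
def remove_between(text, delim="~"):
    """Drop every delimited block without using regular expressions."""
    pieces = text.split(delim)
    kept = pieces[0::2]  # even-numbered pieces lie outside the markers
    if len(pieces) % 2 == 0:
        # Odd number of delimiters: the last one has no partner,
        # so put it back together with the text that follows it.
        kept.append(delim + pieces[-1])
    return "".join(kept)


s = "Here's some text and ~this bit gets removed~, where tilda is the delimiter."
print(remove_between(s))
```

Splitting avoids index bookkeeping, and an unterminated marker is left untouched instead of swallowing the rest of the text.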
kuom

  #4
Feb 2nd, 2007
If your needs are more complicated, you can use fancier tools like Perl or Python; if they are fairly simple, sed will probably do just fine.

Take your example: you have a file "input.txt" containing the text you posted, and you want to strip the unwanted text from it. You can do this with a single sed command:

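    # -i.bak edits input.txt in place and keeps the original as input.txt.bak
    # (written as -ie, sed reads the "e" as a backup suffix, not as the -e option)
    $ sed -i.bak -e 's/~[A-Za-z ]*.~//g' input.txt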
This leaves you with the "clean" version of your input.txt, with the marked text removed. Look up "regular expressions" if you do a lot of pattern matching; they will help you a lot.
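
One thing to keep in mind is that sed works through the file one line at a time, so a marked block that starts on one line and ends on another is not matched by the command above. If that case matters, a whole-file approach could look roughly like the Python sketch below. The function name clean_file and the ".bak" backup suffix are assumptions made for this example, and the delimiter is assumed to be a single character.

```python
import re
import shutil


# Hypothetical helper for illustration only.
def clean_file(path, delim="~"):
    """Strip delimited blocks from the file at path, keeping a .bak copy."""
    shutil.copyfile(path, path + ".bak")  # keep the original next to the edited file
    with open(path) as f:
        text = f.read()  # read everything at once so blocks may span lines
    d = re.escape(delim)
    # The negated class [^~] also matches newlines, so multi-line blocks are removed
    cleaned = re.sub(d + "[^" + d + "]*" + d, "", text)
    with open(path, "w") as f:
        f.write(cleaned)
    return cleaned


# Example call (assumes input.txt exists in the current directory):
# clean_file("input.txt")
```

Reading the whole file into memory is fine for ordinary text files; for very large inputs a streaming approach would be a better fit.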

-Josh
www.qbangsolutions.com

©2003 - 2009 DaniWeb® LLC