Jan 22nd, 2007

Quick question about text processing

Hello everyone. I need to write a Bash script to remove blocks of text from a given file. The text to be removed will be marked by appearing between certain delimiter characters, for example:

  Here's some text and ~this bit gets removed~, where tilde is the delimiter.

Could someone tell me the kind of commands I need to research to do stuff with text like this? I can work out the rest. Any help appreciated.

Steven.
Mushy-pea
Jan 26th, 2007

Re: Quick question about text processing

You need to use sed, and you have to learn about regular expressions.
Based on your data, a very specific (not generalized) solution is:
  $> var="Here's some text and ~this bit gets removed~, where tilde is the delimiter."
  $> echo "$var"
  Here's some text and ~this bit gets removed~, where tilde is the delimiter.
  $> echo "$var" | sed 's/~[A-Za-z ]*.~//1'
  Here's some text and , where tilde is the delimiter.
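
The pattern above only matches letters and spaces between the tildes. A slightly more general sketch, assuming a delimited block never contains a tilde itself and never spans more than one line, uses `[^~]*` ("anything except the delimiter") and the `g` flag so every block on a line is removed. The function name `strip_tilde_blocks` and the sample text are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: remove every ~...~ block from each line of stdin.
# Assumes a block never contains "~" and never spans multiple lines.
strip_tilde_blocks() {
    sed 's/~[^~]*~//g'
}

# Sample text (invented for this demo) with two delimited blocks,
# one of which contains digits and punctuation.
text="Keep this ~drop 1, 2, 3!~ and this ~also dropped~ too."
result=$(printf '%s\n' "$text" | strip_tilde_blocks)
printf '%s\n' "$result"
```

Note that the spaces on either side of each removed block are left in place, so the output contains double spaces where a block used to be.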
jim mcnamara
Jan 27th, 2007

Re: Quick question about text processing

Quote originally posted by Mushy-pea ...
Could someone tell me the kind of commands I need to research to do stuff with text like this? I can work out the rest. Any help appreciated.
Tools such as sed, awk, and Perl are used for text processing. They make heavy use of regular expressions, so you need to research how regular expressions work. Text processing can also be done without regular expressions. If you have Python on your machine, here's a simple way to do what you want.
  indices = []  # list to hold the index of each "~"
  s = "Here's some text and ~this bit gets removed~, where tilde is the delimiter."
  for num, ch in enumerate(s):
      if ch == "~":
          indices.append(num)
  s = list(s)  # turn s into a list, so we can make changes to it
  del s[indices[0]:indices[1] + 1]  # drop the first "~" through the second "~"
  print(''.join(s))
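
Since the original question was about Bash, the same "find the two delimiters and cut out what is between them" idea can also be sketched in the shell itself, with parameter expansion instead of regular expressions. This is only a sketch: it removes just the first `~...~` block, and the variable names `before`, `rest`, and `after` are made up for illustration.

```shell
#!/bin/sh
# Remove the first ~...~ block using only shell parameter expansion (no regex).
s="Here's some text and ~this bit gets removed~, where tilde is the delimiter."

case $s in
    *"~"*"~"*)
        before=${s%%"~"*}     # everything before the first "~"
        rest=${s#*"~"}        # everything after the first "~"
        after=${rest#*"~"}    # everything after the second "~"
        result=$before$after
        ;;
    *)
        # Fewer than two delimiters: leave the text unchanged.
        result=$s
        ;;
esac

printf '%s\n' "$result"
```

The tilde is quoted inside each expansion so the shell treats it as a literal character rather than a candidate for tilde expansion.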
ghostdog74
Feb 2nd, 2007

Re: Quick question about text processing

If your needs are fairly simple, sed will probably do the job; for more complicated needs, you can use fancier tools like Perl or Python.

Take your example: you have a file "input.txt" containing the text you posted, and you want to strip the unwanted text from it. You can do this with a single sed command:

  $ sed -i -e 's/~[A-Za-z ]*.~//g' input.txt
You will be left with the "clean" version of your input.txt, with the text removed. Look up "regular expressions" if you do a lot of pattern matching; they will help you a lot.
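
One caveat: the `-i` option is not part of POSIX sed, and GNU and BSD sed handle its backup-suffix argument differently. A more portable sketch writes the cleaned text to a temporary file and only then overwrites the original. It assumes `mktemp` is available, swaps in the pattern `~[^~]*~` (anything between two tildes on one line), and uses a throwaway directory with invented sample content so it can be run safely.

```shell
#!/bin/sh
# Sketch: strip ~...~ blocks from a file without relying on sed -i.
# The working directory and sample content are made up for this demo.
workdir=$(mktemp -d) || exit 1
file=$workdir/input.txt
printf '%s\n' "first ~gone~ line" "second ~also gone~ line" > "$file"

tmp=$(mktemp) || exit 1
# Write the cleaned text to a temporary file first, and overwrite the
# original only if sed succeeded. Using cat (not mv) keeps the original
# file's permissions.
if sed 's/~[^~]*~//g' "$file" > "$tmp"; then
    cat "$tmp" > "$file"
fi
rm -f "$tmp"

cleaned=$(cat "$file")
printf '%s\n' "$cleaned"
rm -rf "$workdir"
```

For a real file, drop the demo setup and point `file` at your own input.txt; keeping a copy of the original until you have checked the result is a good idea.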

-Josh
www.qbangsolutions.com
kuom



© 2011 DaniWeb® LLC