Hello everyone. I need to write a Bash script to remove blocks of text from a given file. The text to be removed will be marked by appearing between certain delimiter characters, for example:

Here's some text and ~this bit gets removed~, where tilda is the delimiter.

Could someone tell me what kind of commands I need to research to do stuff with text like this? I can work out the rest. Any help appreciated.

Steven.

All 3 Replies

You need to use sed, and you have to learn about regular expressions.
Based on your data, a very specific (not generalized) solution is:

$> var="Here's some text and ~this bit gets removed~, where tilda is the delimiter."
$> echo "$var"
Here's some text and ~this bit gets removed~, where tilda is the delimiter.
$> echo "$var" | sed 's/~[A-Za-z ]*.~//1'
Here's some text and , where tilda is the delimiter.
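
The pattern above only allows letters and spaces between the tildes, so it is tied to this one sample. A slightly more general sketch follows, assuming each marked block starts and ends on the same line and that tildes are never nested. The function name strip_tilde_blocks is made up for illustration. It also shows why a greedy .* is risky when a line holds more than one block.

```shell
#!/usr/bin/env bash
# strip_tilde_blocks is a hypothetical helper name, not something from the thread.
# It reads lines on stdin and removes every ~...~ span found on each line.
strip_tilde_blocks() {
    # [^~]* matches a run of non-tilde characters, so each match
    # stops at the nearest closing tilde instead of the last one.
    sed 's/~[^~]*~//g'
}

line='keep ~drop one~ keep ~drop two~ keep'

# Greedy .* runs from the first tilde to the LAST tilde on the line,
# so the "keep" between the two blocks is deleted as well.
greedy=$(printf '%s\n' "$line" | sed 's/~.*~//g')

# The negated character class removes each block separately.
safe=$(printf '%s\n' "$line" | strip_tilde_blocks)

printf 'greedy: %s\n' "$greedy"
printf 'safe:   %s\n' "$safe"
```

Because the function reads stdin, it can sit in a pipeline or take a file through redirection, for instance strip_tilde_blocks < input.txt.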

Could someone tell me what kind of commands I need to research to do stuff with text like this? I can work out the rest. Any help appreciated.

Tools such as sed/awk/perl are used for text processing. They make heavy use of regular expressions, so you need to research how regular expressions work. Text processing can also be done without regular expressions. If you have Python on your machine, here's a simple way to do what you want.

indices = []  # positions of each "~" in the string
s = "Here's some text and ~this bit gets removed~, where tilda is the delimiter."
for num, ch in enumerate(s):
    if ch == "~":
        indices.append(num)
s = list(s)  # turn s into a list so we can make changes to it
del s[indices[0]:indices[1] + 1]  # drop everything from the first "~" through the second
print(''.join(s))
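
Since the original question asked for Bash, the same regex-free idea can be sketched with Bash parameter expansion alone. This is only an illustration under a few assumptions: the text fits in a variable, the delimiter is a non-empty string, and blocks are not nested. The name strip_between is hypothetical. Unlike the Python snippet, the loop removes every delimited block, not just the first.

```shell
#!/usr/bin/env bash
# strip_between is a hypothetical helper: strip_between DELIM TEXT
# Removes every DELIM...DELIM span from TEXT using only parameter expansion.
strip_between() {
    local delim=$1 text=$2 result='' before rest
    # An empty delimiter would never shrink the text, so refuse it.
    [[ -n $delim ]] || return 1
    # Loop while at least one opening and one closing delimiter remain.
    while [[ $text == *"$delim"*"$delim"* ]]; do
        before=${text%%"$delim"*}   # everything before the first delimiter
        rest=${text#*"$delim"}      # drop up to and including the first delimiter
        rest=${rest#*"$delim"}      # drop up to and including the second delimiter
        result+=$before
        text=$rest
    done
    # Whatever is left has no complete block and is kept as-is.
    printf '%s\n' "$result$text"
}

sample="Here's some text and ~this bit gets removed~, where tilda is the delimiter."
cleaned=$(strip_between '~' "$sample")
printf '%s\n' "$cleaned"
```

Quoting "$delim" inside the expansions makes Bash treat it as literal text, not as a glob pattern, so the same function should also cope with delimiters such as * or ?.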

If your needs are more complicated, you can use fancier tools like Perl or Python; if they are fairly simple, sed will probably do just fine.

Take your example: you have a file "input.txt" containing the text you posted, and you want to strip the unwanted text from it. You can do this with a single sed command:

$ sed -i -e 's/~[A-Za-z ]*.~//g' input.txt

And you will be left with the "clean" version of your input.txt, with the text removed. If you do a lot of pattern matching, look up "regular expression"; it will help you a lot.
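
In-place editing overwrites the file, so it can be worth previewing the result first. Here is a minimal sketch that writes the cleaned text to a separate file and leaves the original alone. The function name clean_copy and the output.txt file name are assumptions for illustration, and the pattern uses [^~]* so that it also handles digits and punctuation inside the markers.

```shell
#!/usr/bin/env bash
# clean_copy is a hypothetical helper: clean_copy SOURCE DEST
# Writes SOURCE to DEST with every ~...~ span removed; SOURCE is not modified.
clean_copy() {
    local src=$1 dest=$2
    sed 's/~[^~]*~//g' "$src" > "$dest"
}

# Demonstrate on a throwaway copy of the sample text from the question.
workdir=$(mktemp -d)
printf '%s\n' "Here's some text and ~this bit gets removed~, where tilda is the delimiter." \
    > "$workdir/input.txt"

clean_copy "$workdir/input.txt" "$workdir/output.txt"
cat "$workdir/output.txt"
```

Once the output looks right, the copy can replace the original with mv, or the in-place sed command can be run with more confidence.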

-Josh
www.qbangsolutions.com
