I have Wikimedia tags in a text file. I need to remove each tag, from its opening delimiter to its closing delimiter and including the text in between, even when tags are nested. I'm using Perl.

I'm having difficulty with nested tags. Here are two examples of tags that I could not remove.

Example 1:

{{ text
text
text {{ text
text}}
text }}

Example 2: instead of "{{" and "}}" as in example 1, we have the tags "[[" and "]]".

[[ text
text [[ text
text]]
text ]]

The nesting can go to any depth.

I hope someone can point me toward a solution.

Edited 4 Years Ago by algo_man: n/a

$s =~ s/{{.+}}//sg; # removes everything
$s =~ s/({{|}})//sg; # just removes the tags

Seems to be working fine for me unless you meant something different.

Edited 4 Years Ago by replic: n/a

Thank you very much.

This won't work with the following:

{{ text
text
text {{ text
text}}
text }}
Should be undeleted
Should be undeleted
Should be undeleted
Should be undeleted
{{ text
text
text {{ text
text}}
text }}
Should be undeleted
Should be undeleted
{{text}}
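
To see why a single substitution fails on input like this, it helps to compare greedy and non-greedy matching. The following is a minimal sketch on a shortened, made-up input string (not the exact data above): the greedy form deletes the lines that should stay, and the non-greedy form leaves the tail of the outer block behind.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: a nested block, a line to keep, then a second block.
my $input = "{{ a {{ b }} c }}\nkeep me\n{{ d }}\n";

# Greedy: .+ runs from the first opening tag to the LAST closing tag
# in the string, so the line between the two blocks is deleted as well.
(my $greedy = $input) =~ s/\{\{.+\}\}//sg;

# Non-greedy: .+? stops at the FIRST closing tag, so the outer block
# is only partly removed and its tail is left behind.
(my $lazy = $input) =~ s/\{\{.+?\}\}//sg;

print "greedy: [$greedy]\n";
print "lazy:   [$lazy]\n";
```

Here the greedy result is just a newline, while the non-greedy result still contains " c }}" from the outer block. Neither pattern counts nesting levels, which is what this task needs.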

Edited 4 Years Ago by algo_man: n/a

As I thought, I did not understand what you really wanted, my bad.
The only way I found to do this is:

$s =~ s/{{.+?}}//sg;
$s =~ s/.+}}//g;
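
The two substitutions above depend on the shape of this particular input. One more general option, offered as a sketch and not as a complete treatment of Wikimedia markup, is a recursive pattern (Perl 5.10 or later) that matches a balanced block at any nesting depth. The helper name strip_blocks and the sample string are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Matches one balanced double-brace block, including nested blocks.
# Needs Perl 5.10+ for named groups, recursion and possessive quantifiers.
my $block = qr/
    (?<block>
        \{\{                 # opening tag
        (?:
            [^{}]++          # text without braces, possessive
          | \{ (?!\{)        # a single left brace that opens no tag
          | \} (?!\})        # a single right brace that closes no tag
          | (?&block)        # a nested block, matched recursively
        )*
        \}\}                 # the matching closing tag
    )
/x;

# strip_blocks is a hypothetical helper name, not from the thread.
sub strip_blocks {
    my ($text) = @_;
    $text =~ s/$block//g;    # delete every balanced block
    return $text;
}

my $sample = "{{ a\n{{ b\nc}}\nd }}\nkeep 1\n{{e}}\nkeep 2\n";
print strip_blocks($sample);
```

On the sample string this leaves only the two "keep" lines plus the newlines that followed each removed block. An opening tag with no matching close does not match the pattern and is left in place. The same idea should carry over to "[[" and "]]" by swapping the delimiters and the character class.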

The script below could do what you wanted.

{{ text
text
text {{ text
text}}
text }}
Should be undeleted
Should be undeleted
Should be undeleted
Should be undeleted
{{ text
text
text {{ text
text}}
text }}
Should be undeleted
Should be undeleted
{{text}}

#!/usr/bin/perl
use warnings;
use strict;

while (<DATA>) {
    chomp;
    s/{{|}}//g;
    print $_, $/;
}

__DATA__
{{ text
text
text {{ text
text}}
text }}
Should be undeleted
Should be undeleted
Should be undeleted
Should be undeleted
{{ text
text
text {{ text
text}}
text }}
Should be undeleted
Should be undeleted
{{text}}

Your script seems to only remove the curly braces when the task was to remove the curly braces and the text between them. Threads that are this old without any reply from the OP should be left alone in most cases.

Thanks replic. I didn't realize that what algo_man wanted was to remove the curly braces and the text between them. As for the advice to leave the post alone, you can never tell whom a correct response might help, so I think one should answer where possible and let others benefit from it.
The script below should solve the problem for the sample input:

#!/usr/bin/perl
use warnings;
use strict;

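while (<DATA>) {
    chomp;
    # Flip-flop: true from a line containing {{ through the next line containing }}
    if    ( /{{/ .. /}}/ ) { next }
    # Skip leftover closing lines of an outer block, such as "text }}"
    elsif (/.+?}}/)        { next }
    else                   { print $_, $/; }
}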
__DATA__
{{ text
text
text {{ text
text}}
text }}
Should be undeleted
Should be undeleted
Should be undeleted
Should be undeleted
{{ text
text
text {{ text
text}}
text }}
Should be undeleted
Should be undeleted
{{text}}

OUTPUT

Should be undeleted
Should be undeleted
Should be undeleted
Should be undeleted
Should be undeleted
Should be undeleted
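
The line-based script above skips whole lines, so text that shares a line with a tag would be dropped too, and the flip-flop closes at the first line containing "}}" regardless of depth. A possible alternative, sketched here under simplifying assumptions, is to split the text on the tags and keep only what sits at nesting depth 0. The helper name strip_nested and the sample string are hypothetical. The sketch treats "{{" and "[[" as interchangeable openers and does not check that each closer matches its opener.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# strip_nested is a hypothetical helper name, not from the thread.
# It keeps only the text found at nesting depth 0.
sub strip_nested {
    my ($text) = @_;
    my $depth = 0;
    my $out   = '';

    # Split into tags and the text between them; the capture group
    # makes split return the tags as separate list elements.
    for my $piece (split /(\{\{|\}\}|\[\[|\]\])/, $text) {
        if ($piece eq '{{' or $piece eq '[[') {
            $depth++;                   # entering a tagged region
        }
        elsif ($piece eq '}}' or $piece eq ']]') {
            $depth-- if $depth > 0;     # stray closing tags are dropped
        }
        elsif ($depth == 0) {
            $out .= $piece;             # text outside every tag
        }
    }
    return $out;
}

my $sample = "intro {{ a\n{{ b }}\nc }} middle\n[[ d [[ e ]] f ]]\nend\n";
print strip_nested($sample);
```

For the sample string, the text "intro ", " middle" and "end" survives and everything inside either kind of tag is removed. Because the whole input is handled as one string, a file would need to be read in full first, for example by setting $/ to undef before reading.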

Edited 4 Years Ago by 2teez
