I know this is going to be really simple, and I'm going to look like a fool when someone answers it, but...
I'm trying to replace any occurrence of any of these characters - ?_ with a * in a String
I start with replaceAll("[ _-]", "*"); and all is well except for a ?
But then I tried
replaceAll("[ _-?]", "*");
replaceAll("[ _-\?]", "*");
replaceAll("[ _-\\?]", "*");
and I just get invalid regexes

Can some regex expert please put me out of my misery?
Thanks
J

That depends on how replaceAll parses the string you provide.
Would you mind posting the code you use in replace all?
Besides, what language is this anyway? Many languages ship with regex_replace() type functionality by default.

Hi
This being the Java forum, it's Java. replaceAll is a standard method in the String class.
The complete relevant code is:

   public String find(String template) {
      // template contains wild cards (*-_? or space) 
      ...
      // convert all other wild cards to consistent * chars
      template = template.replaceAll("[ _-]", "*"); // needs to convert ? as well
      ...
      (etc)

Oh excuse me, I was posting from the main page and in my haste I didn't even check the sub-forum, completely my fault.

replaceAll(" _-\\?", "*")

should be valid.

It's possible the character encoding is not what you expect it to be.
For example, if you are parsing an XML file or something.

  • please, do you mean what the string can start with? (then I can't use escape characters, e.g. the backslash "\")

  • just a code simulation, not a correct answer, from my side


import java.util.*;

public class MyStringArray {

    private List<String> strings;

    public MyStringArray(String[] strArr) {
        strings = new ArrayList<>();
        strings.addAll(Arrays.asList(strArr));
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (String s : strings) {
            sb.append(s).append(" ");
        }
        return new String(sb);
    }

    public static void main(String[] args) {
        String[] strArr = {
            "James _-?Cherrill _-?",
            "James _-'\'?Cherrill _-'\'?",
            "James _-\\?Cherrill _-\\?"};
        MyStringArray ms = new MyStringArray(strArr);
        //System.out.println(ms);

        String s = Arrays.toString(strArr);
        s = s.replace(" _-?", "*");
        System.out.println(s);

        String s1 = Arrays.toString(strArr);
        s1 = s1.replace(" _-'\'?", "*");
        System.out.println(s1);

        String s2 = Arrays.toString(strArr);
        s2 = s2.replace(" _-\\?", "*");
        System.out.println(s2);
    }
}

with output (see what happens with the backslash in James _-''?Cherrill _-''?)

[James*Cherrill*, James _-''?Cherrill _-''?, James _-\?Cherrill _-\?]
[James _-?Cherrill _-?, James*Cherrill*, James _-\?Cherrill _-\?]
[James _-?Cherrill _-?, James _-''?Cherrill _-''?, James*Cherrill*]

Edited 3 Years Ago by mKorbel

That regex lacks the [] to select any one of the enclosed chars, so it doesn't do the same thing at all.

The template String is just taken from an input field in a standard dialog, and only contains chars from the ASCII set.

Just ensure that if you need to match the literal -, it is always either at the start or at the end of the character class. For example, txt.replaceAll("[- _?]", "*") works and txt.replaceAll("[ _?-]", "*") also works, but txt.replaceAll("[ _-?]", "*") doesn't, because _-? is read as a character range, and that range is reversed and therefore invalid.
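
For anyone who wants to check the placement rule above, here is a minimal sketch. The class name HyphenPlacementDemo and the helpers toStars and isValidRegex are made up for illustration; only String.replaceAll, Pattern.compile and PatternSyntaxException are standard Java.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class HyphenPlacementDemo {

    // Hypothetical helper: replaces space, '_', '-' and '?' with '*'.
    // The hyphen comes first in the class, so it is taken literally.
    static String toStars(String template) {
        return template.replaceAll("[- _?]", "*");
    }

    // Hypothetical helper: reports whether Java accepts the regex at all.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(toStars("a b_c-d?e"));   // a*b*c*d*e
        System.out.println(isValidRegex("[ _?-]")); // true: hyphen last, literal
        System.out.println(isValidRegex("[ _-?]")); // false: '_' to '?' is a reversed range
    }
}
```

The last case is rejected because '_' has a higher code point than '?', so the range runs backwards.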

That's it! Thank you so much (and also for allowing me to win a small bet with myself that you would be the one to post the right answer first).
I was adding the various tries for ? at the end of the chars, which previously had the - as the last char. Simply adding the ? at the front instead fixed it.
I hadn't realised that the positioning of special chars affected their interpretation like that. If you have a handy link to some suitable reference I'd appreciate it, if not don't worry, I'll find it myself.
Thanks again
James

ps OK, don't worry - I have the logic behind that clear now, thanks again.

Edited 3 Years Ago by JamesCherrill

@s.o.s

You can also escape the literal -

System.out.println("This is _ a - test?.".replaceAll("[_\\-?]", "+"));

produces

This is + a + test+.

But, including it at the front or end looks cleaner. ;-)

Edited 3 Years Ago by masijade

If you have a handy link to some suitable reference I'd appreciate it, if not don't worry, I'll find it myself.

The reason is that the hyphen (or dash) character has a special meaning when used inside a character class (i.e. the stuff between the square brackets): it specifies a character range. You can get around it by either placing it as the first or the last character in the character class or escaping it with a backslash. I can't find a definitive reference but this SO link covers the material in a pretty good way.
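
To make the range behaviour concrete, here is a small sketch (the class and method names are invented for this example) comparing a hyphen that sits between two characters with one that is placed last or escaped. In "[ -?]" the hyphen spans everything from space (0x20) to '?' (0x3F) in ASCII, which includes the digits.

```java
class CharRangeDemo {

    // Hyphen between ' ' and '?': a range covering ASCII 0x20 through 0x3F.
    static String maskRange(String s) {
        return s.replaceAll("[ -?]", "*");
    }

    // Hyphen last: only the three literal characters ' ', '?' and '-'.
    static String maskLiterals(String s) {
        return s.replaceAll("[ ?-]", "*");
    }

    // Hyphen escaped: same three literals, whatever the position.
    static String maskEscaped(String s) {
        return s.replaceAll("[ \\-?]", "*");
    }

    public static void main(String[] args) {
        String input = "a1-b?";
        System.out.println(maskRange(input));    // a**b*  ('1' falls inside the range)
        System.out.println(maskLiterals(input)); // a1*b*
        System.out.println(maskEscaped(input));  // a1*b*
    }
}
```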

But, including it at the front or end looks cleaner

Agreed, especially given that Java doesn't have raw strings. ;)

Edited 3 Years Ago by ~s.o.s~

But, including it at the front or end looks cleaner. ;-)

The irony of this whole thing is that I was convinced it was the ? that was the problem, and totally forgot the - is also a special character.

Because I'm obsessed with readability/understandability I actually settled on
replaceAll("[ _\\-\\?]", "*");
so that I (or anyone else) will never fail to notice the special character(s) in future.
J

Please, for what reason (in most cases that I have seen) is a literal '+' added as the last char, e.g. "[ _\-?]+" instead of the "[ _\-?]" described here?

for example

String str =  "    This is _-?a _-?test _-?.";
System.out.println("This is _ a - test?.".replaceAll("[ _\\-?]+", "+"));

generated the same output (This+is+a+test+.) as without char '+'

That means one or more of the preceding character (class), i.e. there must be at least one of the preceding character (class), but there can also be more than one, and if there are more, then all of them will be matched together.
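
A quick sketch of the difference, using the test string from the earlier post (class and method names are made up for illustration). Without the + each wildcard character is replaced on its own; with it, a whole run of adjacent wildcards collapses into a single replacement, so the two only agree when no wildcards are next to each other.

```java
class QuantifierDemo {

    // No quantifier: every matching character is replaced individually.
    static String replaceEachChar(String s) {
        return s.replaceAll("[ _\\-?]", "+");
    }

    // With +: one or more adjacent matching characters become one replacement.
    static String replaceEachRun(String s) {
        return s.replaceAll("[ _\\-?]+", "+");
    }

    public static void main(String[] args) {
        String input = "This is _ a - test?.";
        System.out.println(replaceEachChar(input)); // This+is+++a+++test+.
        System.out.println(replaceEachRun(input));  // This+is+a+test+.
    }
}
```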
