Which terms are you trying to match here? Post a sample text along with the output you are expecting.
Also, the trick to creating complex regular expressions is to build the regular expression incrementally rather than writing it in a single go only to find it doesn't work as expected.
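One way to build a pattern incrementally is sketched below, using the sample log line that appears later in this thread. The class name, the helper methods and the way the pattern is cut into steps are assumptions made for illustration, not the only way to split it up. Each step adds one piece, and `lookingAt()` checks that the pattern so far still matches the start of the line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IncrementalRegexDemo {
    // Sample line taken from later in this thread.
    static final String LINE = "[CHAT WINDOW TEXT] [Sun Nov 29 11:34:13] Guardian of Water attacks "
            + "Kyton's Rebuke [BH] : *hit* : (20 + 108 = 128 : Threat Roll: 3 + 108 = 111)";

    // Hypothetical breakdown: every step is the previous step plus one more piece.
    static String[] steps() {
        String prefix = "^\\[[^\\]]*\\]\\s+\\[[^\\]]*\\]\\s+"; // the two bracketed prefixes
        String attacker = prefix + "(.+?)\\s+attacks\\s+";      // group 1: attacker
        String defender = attacker + "(.+?)\\s+:\\s+";          // group 2: defender
        String result = defender + "\\*([a-z]+)\\*";            // group 3: hit/miss word
        return new String[] { prefix, attacker, defender, result };
    }

    // lookingAt() anchors at the start but does not require the whole input to match,
    // which makes it handy for testing a half-finished pattern.
    static boolean matchesPrefix(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        String[] steps = steps();
        for (int i = 0; i < steps.length; i++) {
            System.out.println("Step " + (i + 1) + " matches: " + matchesPrefix(steps[i], LINE));
        }
        Matcher m = Pattern.compile(steps[steps.length - 1]).matcher(LINE);
        if (m.lookingAt()) {
            System.out.println(m.group(1) + " | " + m.group(2) + " | " + m.group(3));
        }
    }
}
```

If a step stops matching, the piece added in that step is the one to look at.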
~s.o.s~
Failure as a human
11,938 posts since Jun 2006
Reputation Points: 3,281
Solved Threads: 734
Never start your regular expression with a greedy quantifier unless you know what you are doing. The .* at the start of your expression gobbles up the entire line. The engine then realizes that there are other patterns/characters still to be matched, i.e. the \\]\\s+(.+), and starts backtracking, which is expensive.
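A small sketch of the effect, on a shortened version of the sample line from this thread. The class and method names are made up for the example. The greedy prefix backtracks from the end of the line and settles on the last `]` that lets the rest of the pattern match, whereas spelling out the two bracketed prefixes with a negated character class leaves nothing to backtrack over.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyPrefixDemo {
    // Shortened form of the sample line used elsewhere in this thread.
    static final String LINE =
            "[CHAT WINDOW TEXT] [Sun Nov 29 11:34:13] Guardian of Water attacks Kyton's Rebuke [BH] : *hit*";

    // Returns the first capturing group, or null when the pattern does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Greedy: .* swallows the line, then gives characters back until a ']' fits.
        // It stops at the LAST usable ']', the one in "[BH]".
        System.out.println(firstGroup("^.*\\]\\s+(.+)", LINE));
        // prints ": *hit*"

        // Explicit: each [^\]]* can never run past its own closing bracket.
        System.out.println(firstGroup("^\\[[^\\]]*\\]\\s+\\[[^\\]]*\\]\\s+(.+)", LINE));
        // prints "Guardian of Water attacks Kyton's Rebuke [BH] : *hit*"
    }
}
```

In the full regex further down, the literal "attacks" later in the pattern forces the greedy prefix to backtrack further, to the timestamp bracket, which is why it still captures the right names there.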
> has anyone been able to figure out why my original regex is not
> matching?
Your regex is exhausted just before the word "Threat", i.e. the whole pattern has already been consumed by the time the engine reaches the character 'T' of 'Threat'. The $ at the end then demands the end of the input at that point, which kills off the entire match, and hence the engine doesn't report a match.
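The effect of the trailing $ can be reproduced with a stand-in pattern. The pattern below is not your original regex; it is a hypothetical one written only to describe the text up to the " : " before "Threat". The class and method names are likewise made up for the example.

```java
import java.util.regex.Pattern;

class TrailingAnchorDemo {
    // Tail of the sample line from this thread.
    static final String INPUT = "*hit* : (20 + 108 = 128 : Threat Roll: 3 + 108 = 111)";

    // Hypothetical stand-in: describes everything up to the " : " before "Threat".
    static final String BASE =
            "\\*([a-z]+)\\*\\s+:\\s+\\((\\d+)\\s\\+\\s(\\d+)\\s=\\s(\\d+)\\s:\\s";

    static boolean found(String regex) {
        return Pattern.compile(regex).matcher(INPUT).find();
    }

    public static void main(String[] args) {
        // The pattern is used up at 'T', but $ insists that the input ends there.
        System.out.println(found(BASE + "$"));    // false
        // Letting .* absorb the rest satisfies the anchor.
        System.out.println(found(BASE + ".*$"));  // true
        // Without any anchor, find() is happy with a partial match.
        System.out.println(found(BASE));          // true
    }
}
```

Note that `matches()` behaves as if the pattern were anchored at both ends, so with `matches()` the trailing .* (or a capturing group for the remainder) is needed even when no $ is written.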
Also, your use of non-capturing parentheses (?:) confuses me; why use them? I've removed the non-capturing parentheses and added the .* at the end to get something like:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScrapRegexTests {

    public static void main(final String[] args) {
        String hitInput = "[CHAT WINDOW TEXT] [Sun Nov 29 11:34:13] Guardian of Water attacks Kyton's Rebuke [BH] : *hit* : (20 + 108 = 128 : Threat Roll: 3 + 108 = 111)";
        // Groups: 1 = attacker, 2 = defender, 3 = result word, 4 = number after the first '+',
        // 5 = whatever is left of the line (not parsed yet).
        String hitRegex = "^.*\\]\\s+(.+)\\s+attacks\\s+(.+)\\s+:\\s\\*([a-z][a-z]+)\\*\\s+:\\s.*\\+\\s(\\d+)\\s.+\\s+:\\s(.*)";
        Pattern pat = Pattern.compile(hitRegex);
        Matcher matcher = pat.matcher(hitInput);
        // matches() requires the pattern to cover the whole input.
        if (matcher.matches()) {
            System.out.println("Regex matched : #" + matcher.group(1) + "#");
            System.out.println("Regex matched : #" + matcher.group(2) + "#");
            System.out.println("Regex matched : #" + matcher.group(3) + "#");
            System.out.println("Regex matched : #" + matcher.group(4) + "#");
            System.out.println("Unprocessed String : #" + matcher.group(5) + "#");
        } else {
            System.out.println("No match found!");
        }
    }
}
Regex matched : #Guardian of Water#
Regex matched : #Kyton's Rebuke [BH]#
Regex matched : #hit#
Regex matched : #108#
Unprocessed String : #Threat Roll: 3 + 108 = 111)#
I'm sure you can carry on from here. I *think* there are better ways of writing the same thing but I'd defer my solution till you post your working solution.
HTH.
~s.o.s~
The substring approach won't work, given that the length of the entire string, along with the player names and the attack rating, is neither constant nor known in advance. Go with the regex approach IMO.
~s.o.s~
> Given text as posted above, what is the easiest way to parse the pieces
> inside parens?
You can split the given string on spaces, which would yield tokens like 60, Sor29/RDD10/Pal1], TJ's, Radiant Sorcerer. How you interpret these tokens is up to you.
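A minimal sketch of the splitting idea follows. It is applied to the parenthesised part of the sample line from earlier in this thread rather than to the text you posted, and the class and method names are assumptions made for the example.

```java
import java.util.Arrays;

class SplitTokensDemo {
    // Sample line from earlier in this thread, not the text from the question.
    static final String LINE = "[CHAT WINDOW TEXT] [Sun Nov 29 11:34:13] Guardian of Water attacks "
            + "Kyton's Rebuke [BH] : *hit* : (20 + 108 = 128 : Threat Roll: 3 + 108 = 111)";

    // Cuts out the text between the first '(' and the last ')' and splits it on whitespace.
    static String[] tokensInsideParens(String line) {
        int open = line.indexOf('(');
        int close = line.lastIndexOf(')');
        if (open < 0 || close < open) {
            return new String[0]; // no parenthesised part found
        }
        String inner = line.substring(open + 1, close).trim();
        return inner.isEmpty() ? new String[0] : inner.split("\\s+");
    }

    public static void main(String[] args) {
        String[] tokens = tokensInsideParens(LINE);
        System.out.println(Arrays.toString(tokens));
        // Picking out the numbers is then a matter of knowing their positions.
        int first = Integer.parseInt(tokens[0]);
        int second = Integer.parseInt(tokens[2]);
        System.out.println(first + " + " + second + " = " + (first + second));
    }
}
```

This only works as long as the tokens always appear in the same positions; if the layout varies, the regex approach is the safer one.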
~s.o.s~