I have written a regex that works well, except that it always puts the last word on a line of its own. Can anyone see where I have gone wrong? This is my first attempt at regex. The text is of variable length.

Basically, my regex is trying to:
1. Limit each chunk to at most 100 characters of any kind.
2. Make each element of the array hold as close to 100 characters as possible without cutting into a word.

    string pattern = @"(.{1,100}\b\s)";
    Regex rex = new Regex(pattern);
    string[] Str = rex.Split(Detail);

Thanks for any help

All 4 Replies

Your expression is OK, but you should be using Matches, not Split.
Try this:

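    string pattern = @"(.{1,100}\b\s)";
    Regex rex = new Regex(pattern);
    MatchCollection matches = rex.Matches(Detail);
    // Size the array from the match count so Str is always assigned,
    // even when nothing matches (it is then an empty array).
    string[] Str = new string[matches.Count];
    for (int i = 0; i < Str.Length; i++)
    {
        // Copy the text of each match into the array.
        Str[i] = matches[i].Value;
    }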

Your regex is looking for a group that ends in "\b\s", which is a word boundary followed by a whitespace character. Unless your string ends with whitespace, it won't match the last word. Replace the last section with an alternation that matches either \b\s or \z, which is the end of the string:
(.{1,100}(?:\b\s|\z))
This will work the same as your previous code but will include the final word in the last group (provided it still fits within the 100-character limit, of course).
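
As a rough sketch of how the revised pattern might be used with Matches: the class name LineWrapper, the method name Wrap, and the maxLength parameter are made up for illustration and are not from the original post.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class LineWrapper
{
    // Hypothetical helper: breaks text into chunks of up to maxLength
    // characters, each ending after a whitespace character or at the
    // end of the string, so no word is cut in half.
    public static string[] Wrap(string detail, int maxLength = 100)
    {
        // \b\s = word boundary followed by whitespace; \z = end of string.
        string pattern = @".{1," + maxLength + @"}(?:\b\s|\z)";

        // Matches returns only the text the pattern matched,
        // so there are no empty entries to filter out.
        return Regex.Matches(detail, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

With a limit of 10, Wrap("The quick brown fox jumps", 10) should give "The quick ", "brown fox " and "jumps". Note that each chunk keeps its trailing whitespace character, so a chunk can be one character longer than maxLength; call TrimEnd() on the values if that matters.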

Your expression makes no difference to the result given with Split.
Also, with or without \s the result is the same.
Using the Matches method is the way to go.

I agree that using Matches is a much better way to do it. Regex.Split seems to insert empty items between each match with the OP's regular expression or my revised one.
My alteration does make a difference, however.
If the string to be searched doesn't end with whitespace, then the last word won't be matched. Regex.Split handles this by putting the last "unmatched" word in its own item at the end of the array, whilst Regex.Matches simply cuts it off because it doesn't match the regular expression. By adding the optional end-of-string match, you ensure the last word in the string is included in your matches.
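
To see the difference for yourself, something like the following could be used. It is only a sketch: the class and member names are invented, the sample sentence is arbitrary, and the limit is 10 rather than 100 to keep the input short.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class SplitVersusMatches
{
    // Assumed sample input; note that it does not end with whitespace.
    public const string Sample = "The quick brown fox jumps";
    public const string Original = @"(.{1,10}\b\s)";
    public const string Revised = @"(.{1,10}(?:\b\s|\z))";

    // Regex.Split returns the text between matches and, because the
    // pattern has a capturing group, the captured text as well.
    public static string[] ViaSplit(string pattern)
    {
        return Regex.Split(Sample, pattern);
    }

    // Regex.Matches returns only the text the pattern matched.
    public static string[] ViaMatches(string pattern)
    {
        return Regex.Matches(Sample, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    // Prints both results for both patterns, with items separated by '|'.
    public static void Run()
    {
        foreach (string pattern in new[] { Original, Revised })
        {
            Console.WriteLine(pattern);
            Console.WriteLine("  Split:   [" + string.Join("|", ViaSplit(pattern)) + "]");
            Console.WriteLine("  Matches: [" + string.Join("|", ViaMatches(pattern)) + "]");
        }
    }
}
```

Calling SplitVersusMatches.Run() should show Split leaving "jumps" as a leftover item (alongside some empty strings) with the original pattern, while Matches only picks up "jumps" once the \z alternative is added.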
