I've had a really good look around for an IPv6 regex and it seems to be quite difficult. All I can find are ones that claim to work in PHP. I tried them anyway and got invalid groupings, unrecognized escape sequences, etc.

I cannot find one for C#.

I don't have the IPv6 address on its own to pass to TryParse; I need to match an IPv6 address inside a larger string.

Maybe someone has successfully made one or had better luck searching than me.

Thank you for reading.

I have this so far, which appears to work to an extent (probably for my needs) but it will also match 123::::abc and similar.

@"(?i)([0-9a-f]{1,4}|::)((:{0,2}[0-9a-f]{0,4}){0,7})"

Thanks guys, I appreciate your time. As far as I can tell, they are good for validating a string representation of an IPv6 address, but not for plucking addresses out of a string that contains them.

I'll keep trying to modify them, but I'm not having much luck at the moment.

but not for plucking them out of a string containing ipv6 addresses

Can you show sample input?

I think I found a way I can use it just to say whether or not a string is IPv6.

Of these, it correctly identifies the starred ones as IPv6. The other two (2001:db8:0:1), which I believed were valid, it does not.
192.0.0.1
127.0.0.1
2001:db8:0:1
2001:db8:0:1
*FE80::0202:B3FF:FE1E:8329
123.0.0.1
*::1
127.0.0.1
125.0.0.1
222.0.0.1

But at least it's not identifying ipv4 as ipv6 like it was after I messed about with it.

Oh it's using ...

string patternipv6 = @"^((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|"
            + @"((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-"
            + @"Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5"
            + @"]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|"
            + @"((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]"
            + @"?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){"
            + @"0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|("
            + @"([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2"
            + @"[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4"
            + @"}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1"
            + @"-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:"
            + @"[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-"
            + @"9]?\d)){3}))|:)))(%.+)?$";

From cereal's link.

(edit) Actually, IPAddress.TryParse also says they are not IPv6, so the regex looks good (I'll have to stop believing things on the internet).
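For anyone wanting to repeat that cross-check, here is a minimal sketch. The class and method names (Ipv6Check, IsIpv6) are made up for illustration; only IPAddress.TryParse and AddressFamily come from the framework. It assumes the candidate string holds nothing but the address.

```csharp
using System.Net;
using System.Net.Sockets;

// Hypothetical helper, not from the thread: a whole-string IPv6 check.
public static class Ipv6Check
{
    // True only when the entire string parses as an IPv6 address.
    // TryParse also accepts IPv4, so the address family is checked as well.
    public static bool IsIpv6(string text)
    {
        IPAddress address;
        return IPAddress.TryParse(text, out address)
            && address.AddressFamily == AddressFamily.InterNetworkV6;
    }
}
```

Calling Ipv6Check.IsIpv6 on each line of the sample list above is one way to compare the framework's verdict with the regex's.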

That is one monster pattern! Did you write that? (And if so, how long did it take?) So is this solved, or are you still looking for possible solutions to parse IPv6?

It's from the link cereal provided in post 3. It is marvellous, you're right about that.

I think it's probably solved now, but I won't mark it as such for another day or two, just in case there's something I forgot, which is often the case.

Well then, I'll keep an eye on it to see if I can offer help (if needed).

IPv6 addresses are generally represented in strings as a series of 5 16-bit hexadecimal values, separated by colons. The first hex number is separated from the rest by dual colons. There may also be a trailing subnet mask, such as /64. For example, my Linux system IPv6 address is (numbers changed to keep people off my network) ff80::215:77ff:ff81:fff8/64. Note that leading 0's are truncated, so instead of 0215 for the second term, it is 215. IPv4 addresses are represented in dot notation using decimal numbers for each term, not using colons. This will simplify parsing the strings considerably.

In my professional opinion, building up regex terms for this sort of purpose is not really good practice. Myself, I prefer to scan the string for "interesting" values, and then apply the appropriate parsing rules using some function that I have written and verified.

FWIW, I have been writing parsers and programming languages for almost 35 years. I used to teach graduate-level courses in the techniques needed to do so.
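The "scan for interesting values, then apply parsing rules" idea could look roughly like this in C#. It is only a sketch: Ipv6Extractor and FindIpv6 are invented names, the candidate pattern is deliberately loose, and the real decision is left to IPAddress.TryParse. It assumes addresses are set off from surrounding text by spaces or other non-hex, non-colon characters. Something like "addr:fe80::1" would be missed, and short tokens such as "d::" in "std::string" would be accepted, since that is syntactically a valid address.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the thread: loose scan followed by strict validation.
public static class Ipv6Extractor
{
    // Any run of hex digits, colons and dots. This over-matches on purpose
    // (it also picks up IPv4 addresses and stray letters a-f); each candidate
    // is validated afterwards.
    private static readonly Regex Candidate = new Regex(@"[0-9A-Fa-f:.]+");

    public static List<string> FindIpv6(string text)
    {
        var found = new List<string>();
        foreach (Match m in Candidate.Matches(text))
        {
            // Drop a trailing full stop picked up from the end of a sentence.
            string token = m.Value.TrimEnd('.');

            // Every textual IPv6 address contains at least one colon.
            if (token.IndexOf(':') < 0) continue;

            IPAddress address;
            if (IPAddress.TryParse(token, out address)
                && address.AddressFamily == AddressFamily.InterNetworkV6)
            {
                found.Add(token);
            }
        }
        return found;
    }
}
```

The design choice here is to keep the regex trivial and let the framework's parser handle the hard part, such as :: compression and embedded IPv4 tails.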

As I understand it, IPv6 is eight 16-bit parts, which can be shorthanded if one or more consecutive parts are 0.

1234:abcd:0000:0000:0000:0000:5678:efef
would become
1234:abcd::5678:efef
Where the double colons mean every part between them is 0.
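To see the shorthand undone, one option is to let the framework parse the address and then print all eight groups from its 16 bytes. This is a sketch with made-up names (Ipv6Expand, Expand); it relies only on IPAddress.TryParse and GetAddressBytes.

```csharp
using System.Net;
using System.Net.Sockets;

// Hypothetical helper, not from the thread: expands "::" back to full form.
public static class Ipv6Expand
{
    // Returns the full eight-group form, or null when the input is not IPv6.
    public static string Expand(string text)
    {
        IPAddress address;
        if (!IPAddress.TryParse(text, out address)
            || address.AddressFamily != AddressFamily.InterNetworkV6)
        {
            return null;
        }

        byte[] bytes = address.GetAddressBytes(); // 16 bytes, network order
        var groups = new string[8];
        for (int i = 0; i < 8; i++)
        {
            // Combine two bytes into one 16-bit group, padded to 4 hex digits.
            int value = (bytes[2 * i] << 8) | bytes[2 * i + 1];
            groups[i] = value.ToString("x4");
        }
        return string.Join(":", groups);
    }
}
```

With the example above, Ipv6Expand.Expand("1234:abcd::5678:efef") gives back 1234:abcd:0000:0000:0000:0000:5678:efef.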

Sorry Rubberman. Despite your impressive experience, the first hex number is NOT separated from the rest by dual colons. Suzy's last post is correct.

See RFC5952 https://tools.ietf.org/html/rfc5952#section-2.2
One (and only one) sequence of zero fields can be replaced by a ::
It's also recommended that :: is not used for just one zero field, and where there is a choice of sequences of zero fields, use :: for the longer one. You may use the :: as the start or end of an address.

Can't argue with that documentation (since that is the standard). I do laugh when I see this snippet:

This flexibility has caused many problems for operators, systems engineers, and customers.

That's in reference to omitting zeros. I can see why, too: they probably shouldn't have allowed that. Disallowing it would have made life a lot easier and the standard easier to interpret.

Wasn't there also a rule that leading zeros could be omitted? (I can't remember the exact requirement, and I didn't see it in the standard provided.)

Yes, leading or trailing zeros can be shortened

0:0:0:1:2:3:4:5 = ::1:2:3:4:5
1:2:3:4:0:0:0:0 = 1:2:3:4::
::1 = localhost

But isn't there also something like
0014:6789 = 14:6789

Yes.
In fact that's the case people usually mean when they talk about "leading zeros". Omitting leading zeros within a field is optional.
Suzie's last example was more a case of "shortening fields with only zeros"
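Both kinds of shortening can be checked by parsing two spellings and comparing the results. The sketch below uses invented names (Ipv6Compare, SameAddress) and only IPAddress.TryParse and IPAddress.Equals from the framework.

```csharp
using System.Net;

// Hypothetical helper, not from the thread: compares two textual spellings.
public static class Ipv6Compare
{
    // True when both strings parse and denote the same address.
    public static bool SameAddress(string a, string b)
    {
        IPAddress first, second;
        return IPAddress.TryParse(a, out first)
            && IPAddress.TryParse(b, out second)
            && first.Equals(second);
    }
}
```

For instance, SameAddress("0:0:0:1:2:3:4:5", "::1:2:3:4:5") covers the zero-field case, and SameAddress("2001:db8::0014:6789", "2001:db8::14:6789") covers leading zeros within a field. The 2001:db8:: prefix in the second pair is just filler to make a complete address.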

I'd love to meet the members of the IETF who came up with the standard for IPv6 and ask them what they were thinking when they came up with the rule about omitted zeros.

I think they were thinking about people who had to type an address in, rather than the people who had to write code to handle them. But then any language/API worthy of your time will already have a standard method/class or whatever to parse IP addresses (e.g. Java's InetAddress.getByName).
