Hello,
I need to extract a particular value from this HTML snippet. As I would rather not use any external libraries, the only way I can see to achieve this in core Java is with regular expressions. As I have never used regular expressions, it would be great if you could suggest how the integer value could be retrieved from the input below.

<tr><td>GLOBALID=123245</td></tr>

I need to extract the integer value assigned to GLOBALID.

All 5 Replies

I'm no regex expert, but standard Java does include classes for parsing XML (and therefore well-formed HTML) without any "external libraries". That's probably the safest way to ensure your parsing isn't going to fail on some obscure but legal example of real data.
http://docs.oracle.com/javase/tutorial/jaxp/index.html
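
A minimal sketch of that approach, assuming the input is well-formed XML with a single root element, as the snippet in the question is. Real-world HTML often is not, in which case the parser will reject it. The class and method names are made up for illustration; only the JAXP/DOM calls come from the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.OptionalInt;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class GlobalIdDomExample {

    private static final String PREFIX = "GLOBALID=";

    // Hypothetical helper: returns the GLOBALID value from the first <td> that carries it.
    // Assumes the markup is well-formed XML; throws IllegalArgumentException otherwise,
    // and NumberFormatException if the text after the prefix is not an integer.
    static OptionalInt extractGlobalId(String wellFormedMarkup) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(wellFormedMarkup)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Input could not be parsed as XML", e);
        }

        // Walk every <td> element and look for one whose text starts with the prefix.
        NodeList cells = doc.getElementsByTagName("td");
        for (int i = 0; i < cells.getLength(); i++) {
            String text = cells.item(i).getTextContent().trim();
            if (text.startsWith(PREFIX)) {
                return OptionalInt.of(Integer.parseInt(text.substring(PREFIX.length()).trim()));
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractGlobalId("<tr><td>GLOBALID=123245</td></tr>"));
    }
}
```

The parser takes care of whitespace and line breaks between tags, so only the cell text itself needs trimming.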

I'm not a Java programmer, but any Pythonista would say to use the BeautifulSoup library. From what I've read here and there, the Java equivalent is named JSoup. I think you could try that library.

JSoup looks like an excellent solution, except for OP's "i would not like to use any external libraries". Personally I find nothing wrong with external libraries, provided they are open source and I can bundle their classes into my own distribution jar.

Hm, I missed that part of the post.

On the other hand, you could just hack it...

If there's just one <tr><td>GLOBALID= prefix in the text, and the </td> suffix is on the same line, then you can simply use String's indexOf to find the prefix's position, then indexOf again to find the first suffix after that position. The value starts at the prefix's position plus the prefix's length and ends at the suffix's position, which gives you the two indexes that you need to substring the actual value.
Depending on the file you may first have to deal with distracting white space anywhere in that.
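
The indexOf approach described above might look roughly like this. A regex variant is included for comparison, since the original question asked about one. The class name, method names, and the choice to search for just "GLOBALID=" (rather than the full "<tr><td>GLOBALID=" prefix, which sidesteps white space between the tags) are assumptions made for this sketch, not something taken from the thread.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GlobalIdExtractor {

    private static final String PREFIX = "GLOBALID=";
    private static final String SUFFIX = "</td>";

    // Regex variant: allows optional white space around '=' and captures the digits.
    private static final Pattern GLOBAL_ID = Pattern.compile("GLOBALID\\s*=\\s*(\\d+)");

    // Two indexOf calls plus substring, as described in the reply above.
    static OptionalInt extractWithIndexOf(String html) {
        int prefixPos = html.indexOf(PREFIX);
        if (prefixPos < 0) {
            return OptionalInt.empty();
        }
        int start = prefixPos + PREFIX.length();   // value begins right after the prefix
        int end = html.indexOf(SUFFIX, start);     // first suffix after the value
        if (end < 0) {
            return OptionalInt.empty();
        }
        return parse(html.substring(start, end).trim());
    }

    // Same result via java.util.regex; find() scans for the first match anywhere in the text.
    static OptionalInt extractWithRegex(String html) {
        Matcher m = GLOBAL_ID.matcher(html);
        return m.find() ? parse(m.group(1)) : OptionalInt.empty();
    }

    // Returns empty instead of throwing when the text is not a valid int.
    private static OptionalInt parse(String digits) {
        try {
            return OptionalInt.of(Integer.parseInt(digits));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        String html = "<tr><td>GLOBALID=123245</td></tr>";
        System.out.println(extractWithIndexOf(html));
        System.out.println(extractWithRegex(html));
    }
}
```

Both methods only look at the first occurrence, so they share the "just one GLOBALID" assumption stated above.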
