How would I get a value in between the two quotes after value=?

So, value="hi my name is bob" />
would return: hi my name is bob
or value="Ouch! "that hurt" lol..." />
would return: Ouch! "that hurt" lol...

So basically I know that value=" TEXT_HERE " /> will always occur, and I want the string inside it. And yes, there is always a space before the /> at the end.
It is HTML code I am parsing; I have gotten everything except this field to parse correctly.

Thanks for the help
-Austin

Hmm... Are you sure the quotes appear exactly the way you are showing? I ask because that would be an error in HTML: a double quote can't appear unescaped inside a double-quoted attribute value like that.

No, I just want a way to grab the data from within that layout.

i.e.

<div id="mess130268" class="mChatBG2 mChatHover"><span style="float:left;"><a class="mChatScriptLink" href="javascript://" onclick="insert_text('&#64;&nbsp;[b]Run You Camper[/b], ', false);" title="Respond to user"><span style="color: #ffFF15"><strong>&#64;</strong></span></a>&nbsp;<a href="./memberlist.php?mode=viewprofile&amp;u=18216" style="color: #ffFF15;" class="username-coloured">Run You Camper</a> - Fri Oct 28, 2011 9:17 am</span><span style="float:right;"><input type="hidden" id="edit130268" value="me n him brought down a whole igc lobby ourselves... :D" /> <a href="javascript://" onclick="mChat.del('130268');"><img src="./mchat/del.gif" alt="Delete" title="Delete" class="mChatImage" /></a></span><br /><div class="mChatMessage">me n him brought down a whole igc lobby ourselves... <img src="./images/smilies/icon_e_biggrin.gif" alt=":D" title="Very Happy" /></div></div>
					
					<div id="mess130269" class="mChatBG1 mChatHover"><span style="float:left;"><a class="mChatScriptLink" href="javascript://" onclick="insert_text('&#64;&nbsp;[b]Run You Camper[/b], ', false);" title="Respond to user"><span style="color: #ffFF15"><strong>&#64;</strong></span></a>&nbsp;<a href="./memberlist.php?mode=viewprofile&amp;u=18216" style="color: #ffFF15;" class="username-coloured">Run You Camper</a> - Fri Oct 28, 2011 9:17 am</span><span style="float:right;"><input type="hidden" id="edit130269" value="then godfather dashboarded cuz of it and ended game 30 secs early. -.- lol" /> <a href="javascript://" onclick="mChat.del('130269');"><img src="./mchat/del.gif" alt="Delete" title="Delete" class="mChatImage" /></a></span><br /><div class="mChatMessage">then godfather dashboarded cuz of it and ended game 30 secs early. -.- lol</div></div>

would return ONLY

me n him brought down a whole igc lobby ourselves... :D

and

then godfather dashboarded cuz of it and ended game 30 secs early. -.- lol

Yes, what will the regex be..it still messes with my head a lot. :S

And I would use a side tool, but I have a kind of strange setup going on right now, and throwing more things into it could cause some problems.

Well, the simplest case regex would be something like value\=\"(.+)\" /\> You would retrieve your text from group 1. Of course, you may have to tune that for variability.

Edit: Oops, used the wrong direction slash. Corrected.

I'm thinking of /value=\"([^\"]*)\"/ but it doesn't seem to work when I tested it on Java. Though, the same regex works fine on Ruby...
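A note on the Ruby-versus-Java difference: in Ruby the surrounding / characters are regex literal delimiters, whereas java.util.regex has no delimiter syntax, so slashes left in the pattern string are matched as literal characters. A minimal sketch of that effect (the class name, helper method, and sample input are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimiterDemo {
    // Returns group 1 of the first match, or null when nothing matches.
    static String firstValue(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<input type=\"hidden\" value=\"hi my name is bob\" />";
        // Ruby-style delimiters kept: Java looks for a literal '/' right before "value".
        System.out.println(firstValue("/value=\"([^\"]*)\"/", html));
        // Delimiters dropped: the same character-class pattern can now match.
        System.out.println(firstValue("value=\"([^\"]*)\"", html));
    }
}
```

Note that [^\"]* stops at the first double quote it meets, so it only suits values that contain no inner quotes.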

Forgive me, but I am really bad at setting up regexes...

So if that is the regex I want to use, how would I post that?

I am trying to use that regex with the string "holder". holder is containing the HTML string.


here is my fail attempt so far.

Pattern pattern = Pattern.compile("value\=\"(.+)\" \\\>");
        Matcher matcher = pattern.matcher(holder);

        while (matcher.find()) {
            message[counter] = matcher.group();
            counter++;

        }

message[] is the string array i am storing the data in
which is erroring

And what is the error?

Edit: You have to escape all of the " and \ with additional \ in your string.
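To spell the escaping rule out: inside a Java string literal every " is written as \", and every backslash that is meant for the regex engine is written as \\. Sequences such as \= or \> are not valid Java escapes, which is what produces an "illegal escape" compile error. A small sketch under those rules (class and constant names are made up, and the \s example is only there to show a doubled backslash):

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Regex as the engine sees it:  value="(.+)" />
    // Only the quotes need a backslash in the Java literal.
    static final String PLAIN = "value=\"(.+)\" />";

    // Regex as the engine sees it:  value\s*=\s*"(.+)" />
    // A backslash meant for the regex (here \s) is doubled in the literal.
    static final String WITH_REGEX_ESCAPE = "value\\s*=\\s*\"(.+)\" />";

    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Printing the literal shows the regex after Java has processed the escapes.
        System.out.println(PLAIN);
        System.out.println(WITH_REGEX_ESCAPE);
        System.out.println(found(PLAIN, "<input value=\"hi my name is bob\" />"));
        System.out.println(found(WITH_REGEX_ESCAPE, "<input value = \"spaced\" />"));
    }
}
```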

It says illegal escape... which I assume is what you're talking about.

My head is literally exploding atm, could you write it out for me, been programming nonstop for like....6 hours now, and after this bit Ima call it a day.

Thanks :)

value\=\"(.+)\" \\\> would be... value=\"(.+)\" \/> The "=" and ">" don't need to be escaped.

But the problem with the regex above is that it will grab the last " it sees in the string.

i.e.
a string  --> <tag adb kad value="do rem" dne /> dioafn "dhgbvi" bebja
result from the regex will be
do rem" dne /> dioafn "dhgbvi

Just go through your string and add a backslash escape for each backslash and quote you are using in your expression. You'll have to get used to this at some point. Escaping regex pattern literals is a pain :)

Also note that I corrected my original pattern to use the forward slash in front of the ">".

I think you'll also want to use a reluctant quantifier to keep it from grabbing multiple elements, so instead of (.+) use (.+?)
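The difference between the greedy (.+) and the reluctant (.+?) can be sketched as follows. The class name, helper method, and sample strings are assumptions for illustration, not the poster's actual data:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Collects group 1 of every match of 'regex' in 'input'.
    static List<String> captures(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<input value=\"first\" /> <a title=\"x\"></a> <input value=\"second\" />";
        // Greedy: a single match that runs on to the last  " />  in the input.
        System.out.println(captures("value=\"(.+)\" />", html));
        // Reluctant: each match stops at the first  " />  after value=".
        System.out.println(captures("value=\"(.+?)\" />", html));
        // Inner quotes survive, because only  " />  can end the capture.
        System.out.println(captures("value=\"(.+?)\" />",
                "<input value=\"Ouch! \"that hurt\" lol...\" />"));
    }
}
```

Because the reluctant version relies on the trailing " /> to know where to stop, it also copes with the inner-quote case from the original question.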

value\=\"(.+)\" \\\> would be... value=\"(.+)\" \\> But the problem with the regex above is that it will grab the last " it sees in the string.

i.e.
a string  --> <tag adb kad value="do rem" dne/> dioafn "dhgbvi" bebja
result from the regex will be
do rem" dne/> dioafn "dhgbvi

I'm lost... I thought I would be adding more \ to it instead of removing them?

Okay, nothing is appearing at all.... for the message, the array is still holding all null values?

counter = 0;

        Pattern pattern = Pattern.compile("value=\"(.+)\" \\>");
        Matcher matcher = pattern.matcher(holder);

        while (matcher.find()) {
            message[counter] = matcher.group();
            counter++;
            System.out.println(message[counter]);
        }

And again I still don't follow you, this is my first time with regex basically.

The "=" and ">" don't need to be escaped.

True. I've gotten overly cautious about escaping all reserved characters in the expressions, but evidently it's okay with those as written even though = and > are used in some pattern constructs.

But the problem with the regex above is that it will grab the last " it sees in the string.

My last change above to use (.+?) should resolve that.

this? value=\"(.+?)\" /\>


or like value=/\/"(.+?)/\/" /\>

@aanders5: Change \\> to just /> and add the question mark in (.+?)

Ok, my bad, my over-escaping caution has led this astray. Just use Pattern.compile("value=\"(.+?)\" />");

counter = 0;

        Pattern pattern = Pattern.compile("value=\"(.+?)\" />");

        Matcher matcher = pattern.matcher(holder);

        while (matcher.find()) {
            message[counter] = matcher.group();
            counter++;
            System.out.println(message[counter]);
        }

this still returns all null values. :(

EDIT: the println(message[counter]); isn't even printing anything, meaning that the while loop isn't even being entered. :(

This is set up correctly, yes/no?

holder = the string containing all the HTML code.
message[counter] = where each message is stored at
counter = the location in the array to store the messages into.


Do I need to do something with the group stuff?

OK, try this...

Pattern pattern = Pattern.compile("value=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(holder);
    while (matcher.find()) {
      System.out.println(matcher.group());
    }

And please let me know what it looks like.

ok, running it right now

Ok, this enters the loop! But nothing is being stored?

Pattern pattern = Pattern.compile("value=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(holder);
        while (matcher.find()) {
            System.out.println("entered loop");
            message[counter] = matcher.group();
            counter++;
            System.out.println(message[counter]);
        }

this was returned when parsing 10 results.

entered loop
null
entered loop
null
entered loop
null
entered loop
null
entered loop
null
entered loop
null
entered loop
null
entered loop
null
entered loop
null
entered loop
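Judging from the loop as posted, the null lines have a simple cause: counter++ runs before the println, so message[counter] refers to the next slot, which has not been filled yet. The match itself was stored one index earlier. A minimal sketch of the loop with the print moved ahead of the increment, and with group(1) used so that only the text between the quotes is kept (the class name, method name, and sample input are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StoreDemo {
    // Fills 'message' with the captured values and returns how many were stored.
    static int collect(String holder, String[] message) {
        Matcher matcher = Pattern.compile("value=\"([^\"]*)\"").matcher(holder);
        int counter = 0;
        while (matcher.find() && counter < message.length) {
            message[counter] = matcher.group(1);   // text between the quotes only
            System.out.println(message[counter]);  // print before moving to the next slot
            counter++;
        }
        return counter;
    }

    public static void main(String[] args) {
        String holder = "<input value=\"first\" /> <input value=\"second\" />";
        String[] message = new String[10];
        int stored = collect(holder, message);
        System.out.println(stored + " stored");
    }
}
```

The bounds check on message.length is an addition that guards against more matches than array slots.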

This works perfectly fine for me

String holder = "value=\"Hey there... blah blah\" /><more /><asd value=\"another\" /><more><asdf value=\"third line\" />";
Pattern pattern = Pattern.compile("value=\"(.+?)\" />");
Matcher matcher = pattern.matcher(holder);

while (matcher.find()) {
    System.out.println(matcher.group(1));
}

Group 1 is the portion of the match string that you were wanting to capture.
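To make the group numbering concrete: group() with no argument (the same as group(0)) returns the whole matched text, including value=" and the closing " />, while group(1) returns only what the first pair of parentheses captured. A short sketch, with a made-up class name and sample input:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns { whole match, group 1 } for the first match, or null if none.
    static String[] wholeAndCaptured(String input) {
        Matcher m = Pattern.compile("value=\"(.+?)\" />").matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(1) };
    }

    public static void main(String[] args) {
        String[] parts = wholeAndCaptured("<asd value=\"another\" />");
        System.out.println("group():  " + parts[0]);  // the entire matched text
        System.out.println("group(1): " + parts[1]);  // only the captured value
    }
}
```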

testing again....1 moment
Your code didn't enter the loop :(

However, Taywin's did enter the loop, as long as I used group(1).

I'm not sure what to tell you then, because I took this fragment that you posted earlier and it works just fine with both my pattern and Taywin's.

String holder = "<div id=\"mess130268\" class=\"mChatBG2 mChatHover\"><span style=\"float:left;\"><a class=\"mChatScriptLink\" href=\"javascript://\" onclick=\"insert_text('@&nbsp;Run You Camper, ', false);\" title=\"Respond to user\"><span style=\"color: #ffFF15\"><strong>@</strong></span></a>&nbsp;<a href=\"./memberlist.php?mode=viewprofile&amp;u=18216\" style=\"color: #ffFF15;\" class=\"username-coloured\">Run You Camper</a> - Fri Oct 28, 2011 9:17 am</span><span style=\"float:right;\"><input type=\"hidden\" id=\"edit130268\" value=\"me n him brought down a whole igc lobby ourselves... :D\" /> <a href=\"javascript://\" onclick=\"mChat.del('130268');\"><img src=\"./mchat/del.gif\" alt=\"Delete\" title=\"Delete\" class=\"mChatImage\" /></a></span><br /><div class=\"mChatMessage\">me n him brought down a whole igc lobby ourselves... <img src=\"./images/smilies/icon_e_biggrin.gif\" alt=\":D\" title=\"Very Happy\" /></div></div> "+
"					"+
"					<div id=\"mess130269\" class=\"mChatBG1 mChatHover\"><span style=\"float:left;\"><a class=\"mChatScriptLink\" href=\"javascript://\" onclick=\"insert_text('@&nbsp;Run You Camper, ', false);\" title=\"Respond to user\"><span style=\"color: #ffFF15\"><strong>@</strong></span></a>&nbsp;<a href=\"./memberlist.php?mode=viewprofile&amp;u=18216\" style=\"color: #ffFF15;\" class=\"username-coloured\">Run You Camper</a> - Fri Oct 28, 2011 9:17 am</span><span style=\"float:right;\"><input type=\"hidden\" id=\"edit130269\" value=\"then godfather dashboarded cuz of it and ended game 30 secs early. -.- lol\" /> <a href=\"javascript://\" onclick=\"mChat.del('130269');\"><img src=\"./mchat/del.gif\" alt=\"Delete\" title=\"Delete\" class=\"mChatImage\" /></a></span><br /><div class=\"mChatMessage\">then godfather dashboarded cuz of it and ended game 30 secs early. -.- lol</div></div>";
Pattern pattern = Pattern.compile("value=\"(.+?)\" />");
//Pattern.compile("value=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(holder);

while (matcher.find()) {
    System.out.println(matcher.group(1));
}

I can't speak for any data other than that tested.

How about replacing value=\"(.+?)\" /> with value=\"(.+?)\"\\s*/> ?
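The \\s* variant accepts any amount of whitespace (including none, or a line break) between the closing quote and />, which may help if the live page source is spaced differently from the pasted fragment. Whether that is the actual cause here is not known from the thread. A sketch with hypothetical input covering the three spacing cases (class and method names are made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhitespaceDemo {
    // Collects group 1 of every match of 'regex' in 'input'.
    static List<String> captures(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical input: no space, one space, and several spaces before />.
        String html = "<i value=\"a\"/><i value=\"b\" /><i value=\"c\"   />";
        // Flexible pattern: \\s* in the Java literal is \s* for the regex engine.
        System.out.println(captures("value=\"(.+?)\"\\s*/>", html));
        // Fixed single-space pattern, for comparison.
        System.out.println(captures("value=\"(.+?)\" />", html));
    }
}
```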

I shall try that later tonight after excessive partying lmfao.
The data you had should have worked; mine was a constant flow as the pageSource updated, so I am not sure why it would not run for me.
What I am building is an off-site chat application that correlates to a real time chat on a forum, without having any administrative permissions on that forum. Gotta use webdriver to login with the user credentials to even access the chat lol

Once this works perfectly (and I am getting close, I think), I will "attempt" to convert it to Android; you may have noticed me starting some rather simple topics in the Mobile Development area too. :) The problem is that I have no clue how WebDriver will act with Android. I know they have tools for it, so I know there is a way, but I am nowhere near that knowledgeable in Java; I am just a novice lol.

ttyl, and thanks both of you.
-Austin
