Using a Regex, I've extracted links from an HTML source.

Now I want to save these links into a new array so that I can use them later. In other words, I want all the "m.Groups[]" values in a new array.

Here's my code:

String html = getSource();
r = new Regex("class=lnk href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))", RegexOptions.IgnoreCase | RegexOptions.Compiled);
for (m = r.Match(html); m.Success; m = m.NextMatch())
{
    MessageBox.Show("Link" + m.Groups[1].Value);
}

Thanks in advance.

String html = getSource();
List<Group> newGroups = new List<Group>();
r = new Regex("class=lnk href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))", RegexOptions.IgnoreCase | RegexOptions.Compiled);
for (m = r.Match(html); m.Success; m = m.NextMatch())
{
    MessageBox.Show("Link" + m.Groups[1].Value);
    // new modified code
    newGroups.Add(m.Groups[index]); // not 1 as you say
}

Read my comments in the code; I may have misunderstood you.

Thanks.

The "index" doesn't work. It says "index" doesn't exist.

I've modified the code; read the comments:

String html = getSource();
// I want a list of "Strings"
List<String> links = new List<String>();
r = new Regex("class=lnk href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))", RegexOptions.IgnoreCase | RegexOptions.Compiled);
for (m = r.Match(html); m.Success; m = m.NextMatch())
{
    //MessageBox.Show("Link" + m.Groups[1].Value);
    links.Add(m.Groups[1].Value);
}

// This message box only shows 1 link, while the message box above showed all the links (that matched the regex).

for (int i = 0; i < links.Count; i++)
{
    MessageBox.Show(links[i]);
}

I hope that makes it clearer.

Thanks again for your help.

Sure, it won't work because index isn't defined. What I meant is that every time you add the same Group instance, the one at index 1 of the Group array.

Hmm. But that's what MSDN gave me.

Anyway, I did this in VB and it's working properly. Can you look at that code, get ideas from it, and then give me the C# code, please? It would be really appreciated.

Here's the code:

Dim r As New Regex("class=lnk href\s*=\s*(?:""(?<1>[^""]*)""|(?<1>\S+))", RegexOptions.IgnoreCase Or RegexOptions.Compiled)
Dim m As Match
Dim matches As New List(Of String) ' this is the new array
' The code below works fine
For Each m In r.Matches(HTML CODE)
    matches.Add(m.Groups(1).Value)
Next

Thanks!

Can you please give me the link, so I know what this is about?

The regex scans for "<a href=link.com>Link</a>" and gets the link.

Here's the link to MSDN; I am using this code:

http://msdn.microsoft.com/en-us/library/t9e807fx(VS.71).aspx

The code they gave me works great for me. But I don't know how to save the matches in a new array. That's all I want to do: save the matches in a new array.

String html = richTextBox1.Text;
List<string> matches = new List<string>();
Regex r = new Regex("href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))", RegexOptions.IgnoreCase | RegexOptions.Compiled);
Match m;
for (m = r.Match(html); m.Success; m = m.NextMatch())
{
    // m.Value is the whole matched text (including "href=");
    // m.Groups[1].Value would be just the link itself.
    matches.Add(m.Value);
    MessageBox.Show(m.Value);
    //new modified code
    //newGroups.Add(m.Groups[index]); //not 1 as you say
}
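For anyone who wants only the link text in the list, here is a minimal sketch of the same idea written with Regex.Matches and a foreach, close to the VB version posted earlier. The class name LinkExtractor, the method name ExtractLinks, and the sample HTML are made up for illustration; the pattern is the one from the code above, and a console program stands in for the WinForms app.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original thread.
public static class LinkExtractor
{
    // Same pattern as above: group 1 holds the href value, either
    // quoted or as a run of non-whitespace characters.
    private static readonly Regex HrefPattern = new Regex(
        "href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Collects only the captured link (Groups[1]) from every match,
    // rather than the whole "href=..." text that m.Value returns.
    public static List<string> ExtractLinks(string html)
    {
        List<string> links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            links.Add(m.Groups[1].Value);
        }
        return links;
    }

    public static void Main()
    {
        // Made-up sample input with two quoted links.
        string html = "<a href=\"http://example.com/a\">A</a> " +
                      "<a href=\"http://example.com/b\">B</a>";

        foreach (string link in ExtractLinks(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

If an array is really needed instead of a List, calling ToArray() on the returned list gives a string[].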

Thanks a million :). That code worked!

Thanks again!

Is that what you needed? Or does it pop up unneeded strings?

No. I am not using the popup. I only used it to find out whether the links were captured or not.

And it pops up the NEEDED strings :D

BTW, I am working on a bot. It visits a website (using the WebBrowser control) and gets all the available links, then visits each one, and so on.
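Roughly, the "visit a page, collect its links, visit each one" loop could be sketched like this. This is only an illustration under assumptions: CrawlSketch, Crawl, and the getLinks callback are hypothetical names, the callback stands in for whatever loads a page and extracts its links, and the "seen" set and the maxPages limit are my additions so that the loop does not revisit pages or run forever.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not the actual bot from this thread.
public static class CrawlSketch
{
    // getLinks is an assumed callback: given a URL, it returns the
    // links found on that page (for example via the regex above).
    public static List<string> Crawl(
        string startUrl,
        Func<string, IEnumerable<string>> getLinks,
        int maxPages)
    {
        HashSet<string> seen = new HashSet<string>();   // URLs already queued
        Queue<string> pending = new Queue<string>();    // URLs still to visit
        List<string> visited = new List<string>();      // visit order

        seen.Add(startUrl);
        pending.Enqueue(startUrl);

        while (pending.Count > 0 && visited.Count < maxPages)
        {
            string url = pending.Dequeue();
            visited.Add(url);

            foreach (string link in getLinks(url))
            {
                // HashSet.Add returns false for a URL that was already seen,
                // so each page is queued at most once.
                if (seen.Add(link))
                {
                    pending.Enqueue(link);
                }
            }
        }

        return visited;
    }
}
```

Passing the link lookup in as a callback keeps the loop independent of how pages are loaded, so the same sketch works whether the links come from a WebBrowser control or from somewhere else.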

Thanks :)

You're welcome; I just wanted to make sure it meets your needs.
