How do you extract all email addresses from a string and put each extracted address into a simple array as its own element?

<?php 

// define and implement your php function here
function get_all_emails($text_field) {

// ---- Beginning of Green Section ---
// create an empty array
$emails = array(); 

// implement the function. this function should extract
// all emails from $text_field and put them in $emails
// ---- End of Green Section ---


return $emails;

}

// your main code

$example1 = "person-1@here.com, person_2@there.net; person.3@wayoverthere.com";

$response = get_all_emails($example1);

// use print_r to test the result
echo "<pre>";
print_r($response);
echo "</pre>";

?>

All 13 Replies

If you're separating all emails by a certain character (like ; for example), you can just explode the string:

$emails = explode(';',$text_field);

Just to add onto GliderPilot's post, it's important that you make sure the string is uniform in how it separates e-mail addresses. In your code person-1@here.com and person_2@there.net are separated by a comma followed by a space, while person_2@there.net and person.3@wayoverthere.com are separated by a semicolon and a space. It should be one or the other, and you can probably ditch the space. So if you decide to use the semicolon as the delimiter it should look like "person-1@here.com;person_2@there.net;person.3@wayoverthere.com"

Or you could use preg_split to work with commas, semi-colons and spaces.
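A minimal sketch of that preg_split idea, assuming the addresses are separated only by commas, semicolons, and whitespace. The helper name split_emails is made up for illustration and does no validation of the pieces it returns:

```php
<?php
// Hypothetical helper: split on any run of commas, semicolons, or whitespace.
function split_emails($text_field) {
    // PREG_SPLIT_NO_EMPTY drops the empty strings that leading or
    // trailing delimiters would otherwise leave in the result.
    return preg_split('/[\s,;]+/', $text_field, -1, PREG_SPLIT_NO_EMPTY);
}

$example1 = "person-1@here.com, person_2@there.net; person.3@wayoverthere.com";
print_r(split_emails($example1));
```

Because the character class is followed by +, a comma plus a space counts as a single delimiter, so the mixed separators in $example1 do not need to be normalized first.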

Description of Problem:

You and a teammate at your company are working on the same project, which is an email application written in PHP and has a MySQL database backbone. Your teammate has done most of the project, but she is running out of time for the next release and needs help. She asks you to help her by writing a simple PHP module (piece of code) that parses a string -- coming from a text input field filled in by the email users -- and extracts all the email addresses from that text.

To make your job easier, she tells you that you only need to write a function called get_all_emails(), which takes one input parameter called $text_field -- a string containing multiple emails -- and returns one value called $emails -- an array of emails. She also tells you that she has made sure the $text_field value can only contain alphanumeric characters, dashes, underscores, single quotation marks, double quotation marks, commas, semicolons, periods, @ signs, spaces, and less-than and greater-than signs. So any of the following values can be a valid $text_field that is fed to the function:

1. person-1@here.com, person_2@there.net; person.3@wayoverthere.com
2. "Jane Le-Doe" <john.ledoe@somewhere_interesting.com>
3. jane.doe@ hello ,;;;; me@home.no<>where
4. to b @ not 2 be
5. this_email@is_obviously_not_valid

The third example has obvious problems with email addresses but is still a valid input to your "cleaner" function. The function you are writing for her should take each of the above input values, extract all email addresses and put them one by one in an array called $emails then return that array.

For the examples above, the output of the function would look like this:
1. array("person-1@here.com", "person_2@there.net", "person.3@wayoverthere.com")
2. array("john.ledoe@somewhere_interesting.com")
3. array("me@home.no")
4. array()
5. array("this_email@is_obviously_not_valid")

Here is a strategy she suggests that you follow to clean up each input string:

  1. Look for @ signs in the string. The @ sign indicates a potential email address.

  2. For each @ sign, look to the left until you reach a character that cannot be part of the left side of an email address. Out of all the characters we mentioned only alphanumerical values, dashes, underscores, and periods are allowed in either the left side or the right side of an email. Space, less than, greater than, comma, semicolon, single/double quotation marks and a second @ sign cannot be part of the left (user) or right (domain name) of an email address.

  3. Grab anything that is valid to the left of the @ sign, then do the same for the right side of each @ sign until you either reach a character that is not part of an email address, another @ sign or the end of the string.

  4. For each @ sign that has a valid (non-empty) left side and a valid (non-empty) right side, insert the left@right part into the $emails array. Don't worry about whether the email is actually valid at this point. At this stage of the project your teammate doesn't validate the actual emails, as long as they follow what she just described: left@right, where both left and right are non-empty strings. Please note that a space is not a valid character in an email address, which is why jane.doe@ hello does not qualify as a valid email: the right side is empty (the space is invalid). jane.doe@@here.com is not valid either, because the right side of the first @ starts with an invalid character (a second @), which leaves the right part an empty string. However, jane doe@here com is valid in this project: even though there are spaces in the text, the left side of the @ is doe, which is a non-empty left-side string, and here is a non-empty right-side string that ends when we reach an invalid character.
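The four steps above can be sketched without regular expressions, using only strpos, strspn, and substr. This is one possible reading of the strategy, not the teammate's actual code; the function name get_all_emails_by_scanning and the $allowed string are assumptions made for illustration:

```php
<?php
// Hypothetical helper that follows the four steps literally.
function get_all_emails_by_scanning($text_field) {
    // Characters allowed on either side of the @ sign, per the description.
    $allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.';
    $emails = array();
    $offset = 0;

    // Step 1: visit every @ sign in the string.
    while (($at = strpos($text_field, '@', $offset)) !== false) {
        // Step 2: walk left from the @ while the characters are allowed.
        $start = $at;
        while ($start > 0 && strpos($allowed, $text_field[$start - 1]) !== false) {
            $start--;
        }

        // Step 3: strspn() counts the run of allowed characters to the right.
        $right_len = strspn($text_field, $allowed, $at + 1);

        // Step 4: keep left@right only when both sides are non-empty.
        if ($start < $at && $right_len > 0) {
            $emails[] = substr($text_field, $start, $at - $start + 1 + $right_len);
        }

        // Continue searching after this @ sign.
        $offset = $at + 1;
    }

    return $emails;
}

$example3 = 'jane.doe@ hello ,;;;; me@home.no<>where';
print_r(get_all_emails_by_scanning($example3));
```

Like the description, this sketch only checks that both sides are non-empty, so this_email@is_obviously_not_valid is still returned.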

Since your colleague has been programming PHP for a decade, she would probably use regular expressions to do this task, but she also knows that you probably don't have much experience with regular expressions at this point so she's happy with however you come up with a working solution. She tells you that you may want to take a look at some functions under php.net. Specifically she suggests taking a look at strpos, strlen, substr, explode, in_array, and some other string manipulation functions that may be useful for this project.

P.S. - The teammate, company, and emails are all made up; it's just to help give you an idea of what I am looking for.

This is a simple PHP email checker; use it to get your task completed:

function  checkEmail($email) 
    {
         if (!preg_match("/^([a-zA-Z0-9\._-])+([a-zA-Z0-9\._-] )*@([a-zA-Z0-9_-])+([a-zA-Z0-9\._-]+)+$/" , $email)) 
         {

              return false;
         }
         return true;
    }

So, do you have more code than what is posted above? Go through the above instructions step by step: look for @ signs in the string and determine what is to the left and right of each one.
Use strip_tags() to remove <> and str_replace() to remove semicolons and commas. Start with some of this and perhaps you will receive more help.

Use preg_split to tokenize the string based on invalid characters. Iterate through the tokens and check whether each one is a valid email address. If it isn't, remove it from the array.

<?php

$string = 'person-1@here.com, person_2@there.net; person.3@wayoverthere.com';
#$string = '"Jane Le-Doe" <john.ledoe@somewhere-interesting.com>';
#$string = 'jane.doe@ hello ,;;;; me@home.no<>where';
#$string = 'to b @ not 2 be';
#$string = 'this_email@is_obviously_not_valid';

$emails = get_all_emails($string);
echo '<pre>'; print_r($emails); echo '</pre>';


/**
 * Extract valid email addresses from string.
 *
 * @param string $string
 * @return array
 */
function get_all_emails($string)
{
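    // Split on runs of whitespace, commas, semicolons and angle brackets,
    // dropping the empty tokens that adjacent delimiters would produce.
    $emails = preg_split('/[\s,;<>]+/', $string, -1, PREG_SPLIT_NO_EMPTY);

    foreach($emails as $index => $email) 
    {
        // Discard any token that is not a syntactically valid address.
        if(! filter_var($email, FILTER_VALIDATE_EMAIL))
            unset($emails[$index]);
    }

    // Re-index so the result is a simple 0-based array.
    return array_values($emails);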
}

Commented: Nice use of filter_var - very underused IMO +14

I advise you to separate the addresses in
$example1 = "person-1@here.com, person_2@there.net; person.3@wayoverthere.com";
with the same character throughout, such as a semicolon ( ; ).
Then you can use PHP's explode():
$emails = explode(';',$example1);

Just for fun, I coded up something for you. Yes, preg_match is much more elegant, but it does not meet all the requirements for your project.

$str_1 = "person-1@here.com, person_2@there.net; person.3@wayoverthere.com ";
$str_2= '"Jane Le-Doe" <john.ledoe@somewhere_interesting.com> ';
$str_3= "jane.doe@ hello ,;;;; me@home.no<>where ";
$str_4 = "to b @ not 2 be ";
$str_5 = "this_email@is_obviously_not_valid";

$all = $str_1 . $str_2 . $str_3 . $str_4 . $str_5;

// there are much more elegant ways to do this using preg_match;
// that said.
function get_all_emails($str) {
    $return_array = array();
    // lets break our string into an array, in this case lets explode on a ' ' (space) in the string.
    // in testing out some code, you need to replace the commas, semis; and <> tags before you split the string, SO:
    $str = str_replace(',',' ',$str);
    $str = str_replace("<",' ', $str);
    $str = str_replace(">",' ', $str);
    $str = str_replace(";",' ', $str);
    // and any other characters you may want to filter out.  I am replacing them with a space because that is what we are going
    // to use to determine each item in the string.
    $parts = explode(' ', $str);
    // go through each part, and determine if there is an @ sign.
    foreach ($parts as $part) {
        // lets see what we are dealing with:
        //echo "part = " . $part . "<br />";
        // if it is blank throw it out, otherwise, continue
        if ($part != '') {
            echo "part = " . $part . "<br />";
            // lets strip html chars from the tag.-- see filter above, using strip_tags() actually condensed the me@home.no<>where
            // to me@home.nowhere, which is valid but not correct.  so, forget the strip_tags, those <> symbols needed
            // to be replaced on whatever you are splitting the string on, in this case I am using the space ' ' symbol.
            //$part = strip_tags($part);
            echo "part now = " . $part . "<br />";
            // adding in urtrivedi script from above, while in most cases it works, it does not work 100% of the time.
            /*****  Adding ultrivedi  function here for testing purposes  ****/
            // case in point 'this_email@is_obviously_not_valid'
            $ce = checkEmail($part);
            if ($ce) {
                echo "is valid email " . $part . "<br />";
            } else {
                echo "non valid email " . $part . "<br />";
            }
            /******  end checkEmail call continue on with 'part'  *******/
            // check this string to see if it has an @ symbol.
            if (strpos($part,'@') !== false) {
                echo "part contains @, possible email!<br />";
                // now lets explode this string and see if it contains 2 valid parts -- left, --right
                $part_array = explode('@',$part);
                // take the pieces on either side of the first @; an empty piece means it is not a usable address
                $left = $part_array[0];
                $right = $part_array[1];
                // validate that you have data in all parts
                echo "left = " . $left . " right = " . $right . "<br />";
                if ($left != '' && $right != '') {
                    // new, check for a well formatted right side as the 'this_email@is_obviously_not_valid' was passing up until now:
                    if (strpos($right,'.') !== false) {
                    // you still have the whole email in $part variable, so push to your return array
                        echo "VALIDATED, ADD TO ARRAY!<br />";
                    // one of the best functions ever, array_push
                        array_push($return_array,$part);
                    } else {
                        echo "not well formed " . $right . " : skip it<br />";
                    }
                } else {
                    echo "DID NOT VALIDATE " . $part . "<br />";
                }
            }
        }
    }

    return $return_array;
}
$x = get_all_emails($str_1);
echo "<br />";
echo "x count = " . count($x) . " valid addresses<br />";
echo "x= " . print_r($x, true);
echo "<br />";
$y = get_all_emails($str_2);
echo "<br />";
echo "y count = " . count($y) . " valid addresses<br />";
echo "y= " . print_r($y, true);
echo "<br />";
$z=get_all_emails($str_3);
echo "<br />";
echo "z count = " . count($z) . " valid addresses<br />";
echo "z= " . print_r($z, true);
echo "<br />";
$a = get_all_emails($str_4);
echo "<br />";
echo "a= " . print_r($a, true);
echo "a count = " . count($a) . " valid addresses<br />";
echo "<br />";
$b = get_all_emails($str_5);
echo "<br />";
echo "b count = " . count($b) . " valid addresses<br />";
echo "b= " . print_r($b, true);
echo "<br />";

$all = get_all_emails($all);
echo "<br />";
echo "all count = " . count($all) . " valid addresses<br />";
echo "all= <pre>" . print_r($all, true) . "</pre>";
echo "<br />";

function checkEmail($email) {
    if (!preg_match("/^([a-zA-Z0-9\._-])+([a-zA-Z0-9\._-] )*@([a-zA-Z0-9_-])+([a-zA-Z0-9\._-]+)+$/" , $email)) {
        return false;
    }
    return true;
}

I think I almost have it, but I am still missing a case: a malformed email like me@home (as opposed to me@home.com) still passes, whereas it should not be a valid email. A valid email in 2012 always has a domain AND a top-level domain (TLD), as in gmail.com. Could you please help me change my pattern to ensure at least one dot and at least two (non-numerical) characters after the dot?

<?php 

function get_all_emails ($text_field){
$emails = array();
preg_match_all("/[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-]+/i", $text_field, $output);
foreach($output[0] as $email) array_push ($emails, strtolower($email));
return $emails;
}

$example1 = 'me@home';

$response = get_all_emails($example1);

echo "<pre>";
print_r($response);
echo "</pre>";

?>
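One possible way to tighten the pattern above, sketched under the assumption that "valid" only means at least one dot in the domain, followed by a final label of two or more letters. The function name get_emails_with_tld is hypothetical, and the pattern is an illustration rather than a full address validator:

```php
<?php
// Hypothetical variant of get_all_emails() that requires a dot and a
// letters-only ending of at least two characters in the domain part.
function get_emails_with_tld($text_field) {
    $emails = array();

    // Local part, @, one or more dot-terminated labels, then 2+ letters.
    $pattern = '/[a-z0-9._-]+@(?:[a-z0-9_-]+\.)+[a-z]{2,}/i';

    preg_match_all($pattern, $text_field, $output);
    foreach ($output[0] as $email) {
        $emails[] = strtolower($email);
    }

    return $emails;
}

print_r(get_emails_with_tld('me@home'));
print_r(get_emails_with_tld('me@home.com'));
```

The pattern does not look at what follows a match, so text such as me@home.com1 would still yield me@home.com.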

I am reposting it. Have you seen my previous reply?

function  checkEmail($email) 
    {
         if (!preg_match("/^([a-zA-Z0-9\._-])+([a-zA-Z0-9\._-] )*@([a-zA-Z0-9_-])+([a-zA-Z0-9\._-]+)+$/" , $email)) 
         {

              return false;
         }
         return true;
    }

According to this form, when your code is pasted in here, the array does not show:

<?php
function get_all_emails ($text_field){
    $emails = array();
    preg_match("/^([a-zA-Z0-9\._-])+([a-zA-Z0-9\._-] )*@([a-zA-Z0-9_-])+([a-zA-Z0-9\._-]+)+$/" , $text_field, $output);
    foreach($output[0] as $email) array_push ($emails, strtolower($email));
    return $emails;
    }
?>

<html>
<head>
<title>
    Mid Term - Alek Hein
</title>
</head>
<body>

<form action="mid-term.php" method="post">
    <table>
        <tr>
            <td>
                Enter a string containing emails:
            </td>
            <td>
                <textarea name="check" style="height:25; overflow:hidden"></textarea>
                <input type="submit" name="submit" value="submit">
            </td>
        </tr>
    </table>
</form>
</body>
</html>

<?php 
if(isset($_POST['submit'])){
    $response = get_all_emails($_POST["check"]);

    echo "<pre>";
    print_r($response);
    echo "</pre>";

    }
?>

Please go to this website: http://students.cpcc.edu/~ahein001/mid_term/mid-term.php to understand what I mean by no array.
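A likely cause, offered as a guess from the code shown rather than from the live page: preg_match() stops at the first match and stores the whole match in $output[0] as a string, so the foreach has no array to loop over. The ^ and $ anchors also require the entire input to be one single address. The sketch below contrasts preg_match() with preg_match_all(), using the unanchored pattern posted earlier in the thread; the variable names $text, $one, and $all are made up for illustration:

```php
<?php
$text = 'me@home.com, you@work.org';
$pattern = '/[._a-z0-9-]+@[._a-z0-9-]+/i';

// preg_match() stops after the first match; $one[0] is a single string.
preg_match($pattern, $text, $one);
var_dump($one[0]);

// preg_match_all() collects every match; $all[0] is an array of strings,
// which is what a foreach over $output[0] expects.
preg_match_all($pattern, $text, $all);
print_r($all[0]);
```

Swapping preg_match for preg_match_all alone may not be enough while the pattern keeps its ^ and $ anchors, since an anchored pattern can match at most once per input.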

Sorry for my late reply, but I just came across this thread while trying to find a more efficient way than what I am currently doing. Here's what I have so far:

// Store valid emails
$emails = array();

// Use regex to find all email-like matches
$regexp = '/([a-z0-9_\.\-])+\@(([a-z0-9\-])+\.)+([a-z0-9]{2,4})+/i';
preg_match_all($regexp, $body, $matches);

if (isset($matches[0]))
{
    // For each valid match ...
    foreach ($matches[0] AS $email)
    {
        // Remove any invalid characters, etc.
        // Not really needed for this regex example
        // $email = filter_var($email, FILTER_SANITIZE_EMAIL);

        // Make sure it's a valid email
        if (filter_var($email, FILTER_VALIDATE_EMAIL))
        {            
            // Add it to the list of valid emails
            $emails[] = $email;
        }
    }
}