I need some help with the logic here. I am reading contents with ob_get_contents. Then I need to search for and fetch the contents and ids in a custom tag, e.g. <translate id="4" token_id="0" variant_id="1">contents............</translate>.
I wanted to do that with preg_match_all, but it gets complicated trying to fetch the ids. Does anyone have a better idea of how best to go about it?

If the order of the ids is always the same, you can use this pattern:

<translate id="(\d+)" token_id="(\d+)" variant_id="(\d+)">(.*?)</translate>
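
As a rough sketch of how that pattern could be applied (the names $html, $pattern, $sets and $tags are only illustrative, and it assumes double-quoted numeric attributes in exactly this order, separated by single spaces):

```php
<?php
// Sample input; in practice this would come from ob_get_contents().
$html = '<translate id="4" token_id="0" variant_id="1">contents</translate>'
      . ' some text '
      . '<translate id="9" token_id="8" variant_id="7">other contents</translate>';

// "~" is used as the delimiter so the "/" in </translate> needs no escaping.
// The "s" modifier lets .*? span line breaks inside the tag body.
$pattern = '~<translate id="(\d+)" token_id="(\d+)" variant_id="(\d+)">(.*?)</translate>~s';

// PREG_SET_ORDER groups the captures per tag instead of per capture group.
preg_match_all($pattern, $html, $sets, PREG_SET_ORDER);

$tags = [];
foreach ($sets as $m) {
    $tags[] = [
        'id'         => (int) $m[1],
        'token_id'   => (int) $m[2],
        'variant_id' => (int) $m[3],
        'contents'   => $m[4],
    ];
}

print_r($tags);
```

With PREG_SET_ORDER each element of $sets holds everything for one tag, which tends to be easier to loop over than the default layout.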

If it is XML output, I would parse it using PHP's XML parsing functions.
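
For instance, a minimal sketch with DOMDocument might look like this. It assumes the captured output is well-formed XML once wrapped in a single root element; the <root> wrapper and the variable names are my own, not something from the original question:

```php
<?php
// Sample input; assumed to be well-formed XML apart from lacking a single root.
$fragment = '<translate id="4" token_id="0" variant_id="1">contents</translate>'
          . '<translate token_id="22" variant_id="23" id="21">other contents</translate>';

$doc = new DOMDocument();
// <root> is a hypothetical wrapper: XML allows only one top-level element.
$ok = $doc->loadXML('<root>' . $fragment . '</root>');

$tags = [];
if ($ok) {
    foreach ($doc->getElementsByTagName('translate') as $node) {
        // getAttribute() looks attributes up by name, so their order is irrelevant.
        $tags[] = [
            'id'         => $node->getAttribute('id'),
            'token_id'   => $node->getAttribute('token_id'),
            'variant_id' => $node->getAttribute('variant_id'),
            'contents'   => $node->textContent,
        ];
    }
}

print_r($tags);
```

Note that loadXML() returns false and raises warnings on malformed input, so this only suits output you know to be valid XML.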

However, if it's not entirely XML (or not XML at all), then using preg_match_all should be fine. Here's some sample code to show you:

<?php

$b = <<<'ENDOFEXAMPLE'
<translate id="4" token_id="0" variant_id="1">contents............</translate>
blah blah blah
<translate id="9" token_id="8" variant_id="7">other contents...</translate>
ENDOFEXAMPLE;

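// Each capture group wraps the whole quantified class, ([^>]*), so multi-character values are captured in full.
preg_match_all("/<translate[^>]*\\ id=\\\"([^>]*)\\\"[^>]*\\ token_id=\\\"([^>]*)\\\"[^>]*\\ variant_id=\\\"([^>]*)\\\"[^>]*>([^<]*)<\\/translate>/imU",$b,$matches);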

print_r($matches);

?>

That outputs:

/*
Array
(
    [0] => Array
        (
            [0] => <translate id="4" token_id="0" variant_id="1">contents............</translate>
            [1] => <translate id="9" token_id="8" variant_id="7">other contents...</translate>
        )

    [1] => Array
        (
            [0] => 4
            [1] => 9
        )

    [2] => Array
        (
            [0] => 0
            [1] => 8
        )

    [3] => Array
        (
            [0] => 1
            [1] => 7
        )

    [4] => Array
        (
            [0] => contents............
            [1] => other contents...
        )

)
*/

You can see that the output is an array. Starting at index 1, the contents of that array are:

index = 1: an array of ids
index = 2: an array of token_ids
index = 3: an array of variant_ids
index = 4: the contents of the <translate> tag

The way I've written it, the order of the attributes of the <translate> is fixed. In other words, id="" must come before token_id="" which in turn must come before variant_id="". There can be other attributes inserted in between, but the overall order must be as I've described.

If you don't know that the order will be like that, you'd have to run several preg_match_all statements in succession, like so:

<?php

$b = <<<'ENDOFEXAMPLE'
<translate id="4" token_id="0" variant_id="1">contents............</translate>
blah blah blah
<translate token_id="22" variant_id="23" id="21">other contents...</translate>
ENDOFEXAMPLE;

preg_match_all("/<translate[^>]*\\ id=\\\"([^>]*)\\\"[^>]*>[^<]*<\\/translate>/imU",$b,$match_id);
preg_match_all("/<translate[^>]*\\ token_id=\\\"([^>]*)\\\"[^>]*>[^<]*<\\/translate>/imU",$b,$match_token_id);
preg_match_all("/<translate[^>]*\\ variant_id=\\\"([^>]*)\\\"[^>]*>[^<]*<\\/translate>/imU",$b,$match_variant_id);
preg_match_all("/<translate[^>]*>([^<]*)<\\/translate>/imU",$b,$match_contents);

print_r($match_id);
print_r($match_token_id);
print_r($match_variant_id);
print_r($match_contents);

?>

In a similar manner, the data can be read from the arrays starting at index = 1. Here's the output:

/*
Array
(
    [0] => Array
        (
            [0] => <translate id="4" token_id="0" variant_id="1">contents............</translate>
            [1] => <translate token_id="22" variant_id="23" id="21">other contents...</translate>
        )

    [1] => Array
        (
            [0] => 4
            [1] => 21
        )

)
Array
(
    [0] => Array
        (
            [0] => <translate id="4" token_id="0" variant_id="1">contents............</translate>
            [1] => <translate token_id="22" variant_id="23" id="21">other contents...</translate>
        )

    [1] => Array
        (
            [0] => 0
            [1] => 22
        )

)
Array
(
    [0] => Array
        (
            [0] => <translate id="4" token_id="0" variant_id="1">contents............</translate>
            [1] => <translate token_id="22" variant_id="23" id="21">other contents...</translate>
        )

    [1] => Array
        (
            [0] => 1
            [1] => 23
        )

)
Array
(
    [0] => Array
        (
            [0] => <translate id="4" token_id="0" variant_id="1">contents............</translate>
            [1] => <translate token_id="22" variant_id="23" id="21">other contents...</translate>
        )

    [1] => Array
        (
            [0] => contents............
            [1] => other contents...
        )

)
*/

Index 1 of the first array represents the id="" data.
Index 1 of the second array represents the token_id="" data
Index 1 of the third array represents the variant_id="" data
Index 1 of the last array represents the contents of the <translate> tag
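
One caveat with separate calls is that the arrays only line up by position if every tag carries all three attributes. As a hedged alternative, here is a sketch that matches each tag once and then picks the attributes out of it by name (the names $tags, $pairs, $attrs and $result are just for illustration, and it assumes double-quoted attribute values):

```php
<?php
$b = <<<'ENDOFEXAMPLE'
<translate id="4" token_id="0" variant_id="1">contents............</translate>
blah blah blah
<translate token_id="22" variant_id="23" id="21">other contents...</translate>
ENDOFEXAMPLE;

// First pass: capture each tag's raw attribute string and its contents.
preg_match_all('/<translate\b([^>]*)>([^<]*)<\/translate>/i', $b, $tags, PREG_SET_ORDER);

$result = [];
foreach ($tags as $tag) {
    // Second pass: split the attribute string into name="value" pairs.
    preg_match_all('/([\w-]+)\s*=\s*"([^"]*)"/', $tag[1], $pairs, PREG_SET_ORDER);

    $attrs = [];
    foreach ($pairs as $pair) {
        $attrs[strtolower($pair[1])] = $pair[2];
    }

    // A missing attribute becomes null instead of shifting later entries.
    $result[] = [
        'id'         => isset($attrs['id']) ? $attrs['id'] : null,
        'token_id'   => isset($attrs['token_id']) ? $attrs['token_id'] : null,
        'variant_id' => isset($attrs['variant_id']) ? $attrs['variant_id'] : null,
        'contents'   => $tag[2],
    ];
}

print_r($result);
```

Each entry of $result then holds everything belonging to one tag, whatever order the attributes were written in.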

Thanks a lot for your help, edwinhermann, very helpful. It worked out really well.
