I need a regex for parsing phpBB forum-style quote tags in the general case, with the order of items preserved: quoted, unquoted, quoted, quoted, quoted, and so on. The quotes are NOT NESTED. I'll explain with an example.

some unquoted text11
[quote="person1"]some quoted text11[/quote]
[quote="person2"]some quoted text22[/quote]
[quote="person3"]some quoted text33[/quote]
some unquoted text22
...
[quote="person4"]some quoted text44[/quote]
...

Resulting array should be:

Array   //PRESERVED ORDER
        (
            [0] => Array
                (
                    ['type'] => unquoted
                    ['name'] => ''
                    ['text'] => some unquoted text11
                )
            [1] => Array
                (
                    ['type'] => quoted
                    ['name'] => person1
                    ['text'] => some quoted text11
                )
            [2] => Array
                (
                    ['type'] => quoted
                    ['name'] => person2
                    ['text'] => some quoted text22
                )
            [3] => Array
                (
                    ['type'] => quoted
                    ['name'] => person3
                    ['text'] => some quoted text33
                )
            [4] => Array
                (
                    ['type'] => unquoted
                    ['name'] => ''
                    ['text'] => some unquoted text22
                )

                ...

            [5] => Array
                (
                    ['type'] => quoted
                    ['name'] => person4
                    ['text'] => some quoted text44
                )

                ...
        )

All 7 Replies

Here is the suggestion I gave to the last person who asked about parsing BBCode.

http://jbbcode.com/

It may work for you as well, and you could just extend the class to add more of the functionality you're looking for.

It doesn't seem to work when I create a [quote="person"] tag (there isn't one by default) and run it on the example text from my first post. Here is the snippet I made from the example in jbbcode.

require_once "../Parser.php";

$parser = new JBBCode\Parser();
$parser->addCodeDefinitionSet(new JBBCode\DefaultCodeDefinitionSet());



$text =    'some unquoted text11';
$text .=    '[quote="person1"]some quoted text11[/quote]';
$text .=    '[quote="person2"]some quoted text22[/quote]';
$text .=    '[quote="person3"]some quoted text33[/quote]';
$text .=    'some unquoted text22';
$text .=    '...';
$text .=    '[quote="person4"]some quoted text44[/quote]';
$text .=    '...';


$parser->addBBCode("quote", '<div class="quote" name="{option}">{param}</div>');
//var_dump( $parser->codeExists('quote') );

$parser->parse($text);

print_r( $parser->getAsHTML() );

This also won't work.

require_once "../Parser.php";

$parser = new JBBCode\Parser();
$parser->addCodeDefinitionSet(new JBBCode\DefaultCodeDefinitionSet());

$text1 = "The default codes include: [b]bold[/b], [i]italics[/i], [u]underlining[/u], ";
$text1 .= "[url=http://jbbcode.com]links[/url], [color=red]color![/color] and more.";

$text =    'some unquoted text11';
$text .=    '[quote="person1"]some quoted text11[/quote]';
$text .=    '[quote="person2"]some quoted text22[/quote]';
$text .=    '[quote="person3"]some quoted text33[/quote]';
$text .=    'some unquoted text22';
$text .=    '...';
$text .=    '[quote="person4"]some quoted text44[/quote]';
$text .=    '...';



$builder = new JBBCode\CodeDefinitionBuilder('quote', '<a href="{option}">{param}</a>');
$builder->setUseOption(true)->setOptionValidator(new \JBBCode\validators\UrlValidator());
$parser->addCodeDefinition($builder->build());

$parser->parse($text);

print $parser->getAsHTML();

I succeeded in making a custom quote tag with an option, but the only output methods are getAsText() and getAsHTML(), so there is no way to get the ordered array from my first post, even by running an XML parser over the output HTML.

Well, that takes care of the parsing. You will most likely need to extend the class and add another function that outputs the array.

If it is returning HTML, it is most likely putting the data into an array at some point. Check out the function getAsHTML(); you might be able to use that as a starting point.

I would take a more in-depth look at it, but I just got busy at work with something.

I understand, but if I have to break into the library code anyway, and since I need only the one quote tag, it's more appropriate to write one original regex function.

I wrote this extremely quickly because I was at work, really didn't feel like working, and was extremely bored, so it might not be thought out in the best way and there may be some things that need tweaking. At least it will give you something to build on.

function parseQuote($string){

    // marker used to split the text; assumed never to occur in a post
    $split_char = "^||^";
    $results = array();

    // add split char before each opening tag and after each closing tag
    $trans = array("[quote" => $split_char . "[quote", "[/quote]" => "[/quote]" . $split_char);
    $string = strtr($string, $trans);

    // create an array from string
    $chopped = explode($split_char, $string);

    foreach($chopped as $line){

        // skip empty or whitespace-only pieces (e.g. between adjacent quotes)
        $line = trim($line);
        if($line === ''){
            continue;
        }

        // one pattern captures both the name and the quoted text;
        // the s modifier lets the text span several lines
        if(preg_match('/^\[quote="(.*?)"\](.*)\[\/quote\]$/s', $line, $quote)){
            $results[] = array("type" => "quoted", "name" => $quote[1], "text" => $quote[2]);
        }else{
            $results[] = array("type" => "unquoted", "name" => "", "text" => $line);
        }
    }

    return $results;
}
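
If a single regex is preferred over the marker-and-explode approach above, preg_split() with PREG_SPLIT_DELIM_CAPTURE is one way to keep the pieces in their original order. This is only a rough sketch: splitQuotes() is a made-up name, and it assumes quotes are never nested and always use the [quote="name"] form.

```php
<?php
// Hypothetical helper, not part of phpBB or JBBCode.
// Assumes quotes are never nested and always look like [quote="name"]...[/quote].
function splitQuotes(string $text): array
{
    // Group 1 = name, group 2 = quoted text. The s modifier lets "." match newlines.
    $pattern = '/\[quote="([^"]*)"\](.*?)\[\/quote\]/s';

    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // text, name, quoted text, text, name, quoted text, ..., text
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        return []; // the regex engine reported an error
    }

    $items = [];
    $addUnquoted = function (string $chunk) use (&$items): void {
        $chunk = trim($chunk);
        if ($chunk !== '') { // skip whitespace between adjacent quotes
            $items[] = ['type' => 'unquoted', 'name' => '', 'text' => $chunk];
        }
    };

    // Text before the first quote, then (name, quoted text, following text) triples.
    $addUnquoted($parts[0]);
    for ($i = 1; $i < count($parts); $i += 3) {
        $items[] = ['type' => 'quoted', 'name' => $parts[$i], 'text' => $parts[$i + 1]];
        $addUnquoted($parts[$i + 2]);
    }

    return $items;
}

// Usage with a shortened version of the text from the first post
$text = "some unquoted text11\n"
      . "[quote=\"person1\"]some quoted text11[/quote]\n"
      . "[quote=\"person2\"]some quoted text22[/quote]\n"
      . "some unquoted text22";
print_r(splitQuotes($text));
```

An unclosed [quote="..."] tag does not match the pattern, so it simply stays inside the surrounding unquoted text.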

Merry Christmas?
