XML-style BBCode parser

itsjareds

A few days ago I answered a question on Yahoo! Answers where I helped (did all the work for) the asker. They wanted a way to parse text from a <textarea> and find HTML-like elements whose names were listed in a database (or an array, in my case).

It took a few hours to write the script, and I'm happy with how it turned out. Right now it only displays what it parses, but you can replace the display code with whatever processing you need for each match.

It is all packed in one PHP file.

If you need an example of what to put in the textarea, try typing this in:

<data id="blah">Test</data>
<img id="foo" src="http://www.daniweb.com/forums/myimages/statusicon/forum_new.gif"/>
<img src=http://www.daniweb.com/forums/myimages/statusicon/forum_new.gif info=no_quotes/>
<code id="bar">Test</code>

Hope you find this useful! I spent a lot of time on it, so I felt I owed it to myself to post it for others.

<?php
if (isset($_POST['bbcode'])) {
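	// Magic quotes were removed in PHP 5.4, so the raw value is used here;
	// stripslashes() would strip legitimate backslashes from the input
	$bbcode = $_POST['bbcode'];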

	// Array of tags to search for
	$searchTags = array("data",
			    "code",
			    "img");

	for ($a=0; $a<sizeof($searchTags); $a++) {
		// Searches for each tag
		$regexp = "<$searchTags[$a](?:\s(?:(?!\/>|>).)*)?(?:>(?:(?:(?!<\/$searchTags[$a]).)*)<\/$searchTags[$a]>|\/>)";

		// Returns a match for each attribute
		$regAtts = "(?:(?:(?!\s|=).)*)=\"?(?:(?:(?!\"|\s|\/>|>).)+)\"?";

		// Apply regular expression and begin search for tags
		preg_match_all("/$regexp/i", $bbcode, $matches);

		// The pattern has no capture groups, so $matches[0] holds every full match
		echo "<div style=\"border:1px dashed black; padding:10px;\">\n";
		echo "<h3>Processed tag: $searchTags[$a]</h3>\n";

		foreach ($matches[0] as $cur) {
			$escapedMatch = htmlspecialchars($cur);
			echo "<li>Match: $escapedMatch</li><br/>\n";
			// Demo only: $cur is echoed unescaped so the browser renders it.
			// Do not do this with untrusted input.
			echo "<span>HTML Parsed: <div style=\"border:1px dotted blue; padding: 5px;\">$cur</div></span><br/>\n";

			// Begin parsing attributes
			preg_match_all("/$regAtts/i", $cur, $atts);

			foreach ($atts[0] as $d => $att) {
				echo "Attribute $d: " . htmlspecialchars($att) . "<br/>\n";
			}
			echo "<br/><br/>\n";
		}

		echo "</div><br/>\n";
	}
}
else { // Show the form if nothing has been submitted yet
?>

<html>
<body style="text-align:center;">

<form action="<?php echo htmlspecialchars(basename($_SERVER['PHP_SELF'])); ?>" method="post">
	<textarea name="bbcode" rows="15" cols="100"></textarea><br/>
	<input type="submit" value="Submit BBCode"/>
</form>

</body>
</html>

<?php
}
?>
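
As a rough sketch of what "inserting different functions" could look like, the two regular expressions above can be wrapped in a function that returns the matches instead of echoing them. The function name findTags and the array keys are my own placeholders, not part of the snippet, and the sample input is taken from the example text above.

```php
<?php
// Hypothetical helper (not part of the original snippet): returns the parsed
// tags as an array instead of printing them.
function findTags($text, array $searchTags)
{
	$found = array();

	// Same attribute pattern as in the snippet above
	$regAtts = "(?:(?:(?!\s|=).)*)=\"?(?:(?:(?!\"|\s|\/>|>).)+)\"?";

	foreach ($searchTags as $tag) {
		// Same tag pattern as in the snippet above
		$regexp = "<$tag(?:\s(?:(?!\/>|>).)*)?(?:>(?:(?:(?!<\/$tag).)*)<\/$tag>|\/>)";
		preg_match_all("/$regexp/i", $text, $matches);

		// No capture groups, so $matches[0] holds every full match
		foreach ($matches[0] as $match) {
			preg_match_all("/$regAtts/i", $match, $atts);
			$found[] = array(
				'tag'        => $tag,
				'match'      => $match,
				'attributes' => $atts[0],
			);
		}
	}

	return $found;
}

$input = '<data id="blah">Test</data>' . "\n"
       . '<img id="foo" src="http://www.daniweb.com/forums/myimages/statusicon/forum_new.gif"/>';

foreach (findTags($input, array("data", "code", "img")) as $hit) {
	echo $hit['tag'], ": ", count($hit['attributes']), " attribute(s)\n";
}
```

From here you could loop over the returned array and do whatever you need with each tag, such as storing it or replacing it in the text.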
itsjareds 29 Junior Poster

If you want to be able to change the opening and closing tag characters (for example, to [ and ]), disregard the snippet above and use this one instead. It defines them in the variables $o and $c (short for open and close).

<?php
if (isset($_POST['bbcode'])) {
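	// Magic quotes were removed in PHP 5.4, so the raw value is used here;
	// stripslashes() would strip legitimate backslashes from the input
	$bbcode = $_POST['bbcode'];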
	$matchCount = 0;

	// <!-- BEGIN EDITABLE VARIABLES -->

	// Opening/closing tags for BBCode
	// $o stands for Open, $c stands for Close
	// No characters need to be escaped
	$o = "<";
	$c = ">";

	// Array of tags to search for
	// Add or remove depending on what you need
	$searchTags = array("data",
			    "code",
			    "img");

	// <!-- END EDITABLE VARIABLES -->

	// Escape any regex metacharacters in the open/close tags.
	// preg_quote() also escapes "#", the pattern delimiter used below.
	$o = preg_quote($o, '#');
	$c = preg_quote($c, '#');

	for ($a=0; $a<sizeof($searchTags); $a++) {
		// Searches for each tag
		$regexp = "$o$searchTags[$a](?:\s(?:(?!/$c|$c).)*)?(?:$c(?:(?:(?!$o/$searchTags[$a]).)*)$o/$searchTags[$a]$c|/$c)";

		// Returns a match for each attribute
		$regAtts = "(?:(?:(?!\s|=).)*)=\"?(?:(?:(?!\"|\s|/$c|$c).)+)\"?";

		// Apply regular expression and begin search for tags
		preg_match_all("#$regexp#i", $bbcode, $matches);

		// The pattern has no capture groups, so $matches[0] holds every full match
		foreach ($matches[0] as $cur) {
			$matchCount++;
			echo "<div style=\"border:1px dashed black; padding:10px;\">\n";
			echo "<h3>Processed tag: $searchTags[$a]</h3>\n";

			$escapedMatch = htmlspecialchars($cur);
			echo "<li>Match: $escapedMatch</li><br/>\n";
			// Demo only: $cur is echoed unescaped so the browser renders it.
			// Do not do this with untrusted input.
			echo "<span>HTML Parsed: <div style=\"border:1px dotted blue; padding: 5px;\">$cur</div></span><br/>\n";

			// Begin parsing attributes
			preg_match_all("#$regAtts#i", $cur, $atts);

			foreach ($atts[0] as $d => $att) {
				echo "Attribute $d: " . htmlspecialchars($att) . "<br/>\n";
			}
			echo "<br/><br/>\n";

			echo "</div><br/>\n";
		}
	}
	if ($matchCount == 0)
		echo "No matches were found.";
}
else { // Show the form if nothing has been submitted yet
?>

<html>
<body style="text-align:center;">

<form action="<?php echo htmlspecialchars(basename($_SERVER['PHP_SELF'])); ?>" method="post">
	<textarea name="bbcode" rows="15" cols="100"></textarea><br/>
	<input type="submit" value="Submit BBCode"/>
</form>

</body>
</html>

<?php
}
?>
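
To illustrate the configurable open/close tags, here is a minimal sketch that builds the same tag pattern for square brackets, as in ordinary BBCode. The function name buildTagPattern and the sample text are made up for this example, and it uses preg_quote() for the escaping step, which is my own choice here.

```php
<?php
// Hypothetical helper (not part of the original snippet): builds the tag
// pattern from the script above for any open/close characters.
function buildTagPattern($tag, $open, $close)
{
	// preg_quote() escapes regex metacharacters and the "#" pattern delimiter
	$o = preg_quote($open, '#');
	$c = preg_quote($close, '#');
	$t = preg_quote($tag, '#');

	return "#$o$t(?:\s(?:(?!/$c|$c).)*)?(?:$c(?:(?:(?!$o/$t).)*)$o/$t$c|/$c)#i";
}

// Made-up sample text using [ and ] instead of < and >
$text = '[code id="bar"]Test[/code] and [img src=pic.gif/]';

preg_match_all(buildTagPattern("code", "[", "]"), $text, $codeMatches);
print_r($codeMatches[0]);

preg_match_all(buildTagPattern("img", "[", "]"), $text, $imgMatches);
print_r($imgMatches[0]);
```

Because the brackets are escaped before they go into the pattern, the same function should work for other delimiters such as { and } without further changes.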