Hello!

I am fairly new to regular expressions (in PHP in particular, since there seem to be some special requirements/restrictions), and am especially stumped with my most recent attempt at parsing two particular strings.

The first one is a URL in which I want to retrieve the first variable:

http://somesite.come/page.html?var=1234&some=2

So, I want to retrieve "var"'s value explicitly (it will always be an integer). This is not from a URL to my site ("$_GET" isn't what I'm looking for); this is given as a string and nothing more.

The second one is a CSS (well, style) line, which goes something like this:

background: url("/images/dir/dir-20/0-0-0.gif") no-repeat scroll 0pt 0pt transparent; width: 0px; height: 0px;

In this case, I'd like to retrieve "0-0-0.gif" from the string.

Any method which you geniuses can conceive would be greatly appreciated!


For the first URL, you can retrieve the var parameter by passing the query string returned by parse_url to parse_str. You can also use a Regex if you want.

parse_str((string) parse_url($url, PHP_URL_QUERY), $param);
if(isset($param['var']) && ctype_digit($param['var'])) {
//$param['var'] holds the var parameter (as a string of digits)
}

//OR

preg_match("/var=([0-9]+)/i", $url, $matches);
//$matches[1][0] for the var parameter

EDIT: As a rule, dedicated string functions such as parse_url are usually faster than a Regex...Keep that in mind :)
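
For reference, the non-Regex route could be wrapped up roughly like this. It is only a minimal sketch using the URL from the question; the helper name getVarFromUrl is made up for illustration and is not part of anyone's code in this thread.

```php
<?php
// Hypothetical helper (not from the thread): returns the integer value of
// the "var" query parameter, or null if it is missing or not all digits.
function getVarFromUrl(string $url): ?int
{
    // parse_url() yields null when there is no query string
    // and false when the URL is seriously malformed.
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }

    // parse_str() fills $params with an associative array of the parameters.
    parse_str($query, $params);

    // Query values arrive as strings, so check for digits rather than is_int().
    if (isset($params['var']) && is_string($params['var']) && ctype_digit($params['var'])) {
        return (int) $params['var'];
    }
    return null;
}

var_dump(getVarFromUrl('http://somesite.come/page.html?var=1234&some=2')); // int(1234)
```

Returning null for a missing or non-numeric value lets the caller tell "no ID found" apart from a real ID.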

The URL from the CSS command can be found with a simple Regex:

preg_match('/url\("([^\/]+\/)*([a-z0-9-_\%\.]+)"\)/i', $css, $matches);
//$matches[2][0] contains the file name from the URL

If either Regex needs to be refined or you need clarification, just write back!
Cheers,
PhpMyCoder


Hi FlashCreations,

Thanks for the response!

The documentation by the PHP group on "preg_match()" is fairly vague - could you point me in the right direction on how to use it?

I have integrated your code like so:

// The CSS parse

$this->avatar = $post->find('div[class=shell] td');
foreach($this->avatar as $avatar) {
	preg_match('/url\("([^\/]+\/)*([a-z0-9-_\%\.]+)"\)/i', $avatar->style, $matches); // Thanks "FlashCreations"/"PhpMyCoder"!
	$this->avatar = $matches[2][0];
	echo $this->avatar . 'Hello world!';
}

and:

preg_match("/var=([0-9]+)/i", $this->topic->url, $matches); // Thanks "FlashCreations"/"PhpMyCoder"!
$this->topic->remote_id = $matches[1][0];
echo $this->topic->remote_id;

Obviously the "echo" statements here are for debugging purposes.

A few things about these excerpts: they are in separate classes, the main one being "board", which contains the URL parser. This class remotely retrieves a file and, for each row it finds, takes the URL of that row and makes a call to the topic to find the ID. The URLs look something like what I described in my post above.

The CSS code is retrieved from the pages that are called via the "board" class, and the code which you kindly provided me is located in the "thread" class. This parses the page for necessary information (continuously dividing the page into smaller groups).

When echoing $avatar->style, I receive a string virtually identical to what I posted above. However, when I echo $this->avatar, I receive a null or empty string. The same applies to the URL - I receive a non-existent (or undesired) response.

Odds are I'm doing something wrong, so your continued expertise would be gratefully welcomed.

Thanks!

EDIT: To DaniWeb: Please update your posting system so if my token has expired when I post, I don't have to hit the back button and re-type everything.

Alright, take two. The gist of it is that there are two problems. The simpler of the two to fix is the problem regarding the URL. To access the var parameter you should use $matches[1] instead of $matches[1][0], since preg_match() stores each captured group as a plain string.
The second problem, involving the style, can be fixed in the same way, but also requires a Regex fix. Here is your revised code:

$this->avatar = $post->find('div[class=shell] td');
foreach($this->avatar as $avatar) {
	preg_match('/url\("([a-z0-9-_\.\/]*\/)?([a-z0-9-_\%\.]+)"\)/i', $avatar->style, $matches); // Thanks "FlashCreations"/"PhpMyCoder"!
	$this->avatar = $matches[2];
	echo $this->avatar . 'Hello world!';
}
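
To make the indexing concrete: preg_match() returns 1 on a match, 0 on no match and false on error, and it fills $matches with the whole match at index 0 followed by one string per capture group. Here is a minimal standalone sketch using the two sample strings from the first post; the variable names $remoteId and $fileName are only illustrative.

```php
<?php
// Sample inputs copied from the question.
$url = 'http://somesite.come/page.html?var=1234&some=2';
$css = 'background: url("/images/dir/dir-20/0-0-0.gif") no-repeat scroll 0pt 0pt transparent; width: 0px; height: 0px;';

$remoteId = null;
if (preg_match('/var=([0-9]+)/i', $url, $matches) === 1) {
    // $matches[0] is the whole match ("var=1234"), $matches[1] the first group.
    $remoteId = (int) $matches[1];
}

$fileName = null;
if (preg_match('/url\("([a-z0-9-_\.\/]*\/)?([a-z0-9-_\%\.]+)"\)/i', $css, $matches) === 1) {
    // $matches[1] is the directory part, $matches[2] the file name.
    $fileName = $matches[2];
}

var_dump($remoteId, $fileName);
```

Checking the return value before touching $matches also shows right away whether the pattern matched at all, which helps when the result comes back empty.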

Hope this fixes everything! :)
-PhpMyCoder

Hey FlashCreations,

Sorry for the delayed response!

Getting there! I am now receiving only empty strings - which is definitely better than a null string. Unfortunately, neither the style string nor the URL is being parsed correctly.

If it makes it any easier, I'm more than happy to send you the full files (beware, the code is fairly messy!), assuming it would help in pinpointing the problem.

Thanks for your continued assistance in this... dilemma! :)

Yes,
If you could send me the full files, that would help in the debugging process. Also, would you mind sending over a few more samples of the CSS styles & URLs so I can ensure I've tailored the Regexes to work with everything you throw at them?
Cheers,
PhpMyCoder
