Hey guys,

I am writing a bot class to scrape some information off of websites.

Here are the requirements.

  • Specify URL
  • Check for valid URL
  • 'GET' contents of URL with cURL
  • Check MIME type & response status code
  • Check for special URL
    • Parse special data
  • Parse for standard data
  • Return data as array or JSON

Now I will explain each requirement:

Specify a URL: the class accepts a URL from a form so it knows which website to scrape.

Check for valid URL: check that the URL is fully qualified.
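
A minimal sketch of this check, assuming PHP 7+ and that "fully qualified" means an absolute http or https URL with a host. The function name is_fully_qualified_url is hypothetical; filter_var() and parse_url() are built-in.

```php
<?php
// Hypothetical helper: true only for absolute http(s) URLs with a host.
function is_fully_qualified_url(string $url): bool
{
    // filter_var() returns the URL on success and false on failure.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // FILTER_VALIDATE_URL also accepts other schemes such as ftp,
    // so restrict the result to http and https (an assumption).
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    $host   = parse_url($url, PHP_URL_HOST);

    return in_array($scheme, ['http', 'https'], true) && !empty($host);
}
```

With this, is_fully_qualified_url('http://youtube.com/watch?v=9ha98h') passes, while a scheme-less string like 'youtube.com/watch' is rejected.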

'GET' contents of URL with cURL: perform a cURL GET request on the specified URL and return the website's contents.

Check MIME type & response status code: only allow certain response codes (e.g. 200) and MIME types of text/html.
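
A rough sketch of the status/MIME check, assuming PHP 7.1+ and that the two values come from curl_getinfo() with CURLINFO_HTTP_CODE and CURLINFO_CONTENT_TYPE after curl_exec(). The function name is_acceptable_response and the two allow-lists are assumptions for illustration.

```php
<?php
// Hypothetical helper. $status and $contentType are the values that
// curl_getinfo($ch, CURLINFO_HTTP_CODE) and
// curl_getinfo($ch, CURLINFO_CONTENT_TYPE) give back after curl_exec().
function is_acceptable_response(int $status, ?string $contentType): bool
{
    $allowedStatuses = [200];          // assumption: only 200 is allowed
    $allowedMimes    = ['text/html'];  // assumption: only HTML is allowed

    if (!in_array($status, $allowedStatuses, true) || $contentType === null) {
        return false;
    }

    // The header often carries a charset, e.g. "text/html; charset=UTF-8",
    // so compare only the part before the first semicolon.
    $mime = strtolower(trim(explode(';', $contentType)[0]));

    return in_array($mime, $allowedMimes, true);
}
```

Stripping the charset suffix matters because a strict comparison against 'text/html' would otherwise reject most real HTML responses.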

Check for special URL: YouTube URLs use query vars to serve content, for example: http://youtube.com/watch?v=9ha98h
Parse special data: parse the YouTube video or other content unique to that site.
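
To illustrate the special-URL check, here is a small sketch that pulls the v query variable out of a YouTube watch URL with the built-in parse_url() and parse_str(). The function name find_youtube_id and the list of hosts are assumptions, not part of the actual class.

```php
<?php
// Hypothetical helper: returns the YouTube video id from a watch URL,
// or null when the URL is not a YouTube watch page.
function find_youtube_id(string $url): ?string
{
    $host = strtolower((string) parse_url($url, PHP_URL_HOST));

    // Assumption: only these two hosts count as "special".
    if (!in_array($host, ['youtube.com', 'www.youtube.com'], true)) {
        return null;
    }

    // parse_str() fills $query with the decoded query variables.
    parse_str((string) parse_url($url, PHP_URL_QUERY), $query);

    if (isset($query['v']) && is_string($query['v']) && $query['v'] !== '') {
        return $query['v'];
    }

    return null;
}
```

For the example URL above this would give '9ha98h'; any other host, or a watch URL without v, gives null.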

Parse standard data: retrieve document keywords, description, images, etc.

Return data as array or JSON: I may be using AJAX or a normal form, so I would like to be able to return either type.
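
Returning either type could look roughly like this, assuming the collected data is already a plain PHP array. The function name format_result and the $format parameter are hypothetical; json_encode() and json_last_error_msg() are built-in.

```php
<?php
// Hypothetical helper: $data is the array the bot collected.
// Returns the array unchanged, or its JSON encoding when asked for 'json'.
function format_result(array $data, string $format = 'array')
{
    if ($format === 'json') {
        // json_encode() returns false on failure (for example invalid UTF-8).
        $json = json_encode($data);

        if ($json === false) {
            throw new RuntimeException(
                'Could not encode result: ' . json_last_error_msg()
            );
        }

        return $json;
    }

    return $data;
}
```

An AJAX handler would echo format_result($data, 'json'), while a normal form handler could use the array directly.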

I have most of this functionality complete, but I'm running into errors when I try using static methods.

Problem:

I have a parent class with a parse method and subclasses for special URLs.

For example: class Bot{} and class Bot_Special_Youtube extends Bot{}

The child class also has a parse function. If the URL is special, we use the child class's parse method, which calls parent::parse();

I'm having trouble here.

Here is some code:

// Find out if the url
		// is a special case
		$this->_find_special();
		
		if($this->is_special() !== false)
		{
			$class = 'Bot_Special_'.ucfirst($this->_special);
			
			$class::parse();
				
		}else
		{
			Bot::parse();
		}

I'm getting the error "Using $this when not in object context". Here is the offending code:

/**
	* Parses the returned content from url
	* into something useable
	*
	* @return array
	*/
	public static function parse()
	{

		$images = $this->_document->find('img');
				
		foreach($images as $image)
		{
			$this->_return['images'][] = $image->src;
		}
		
		$keywords = $this->_document->find('head meta[name=keywords]');
		$description = $this->_document->find('head meta[name=description]');
		
		if(isset($keywords))
		{
			$this->_return['meta']['keywords'] = $keywords->content;
			echo $keywords->content;
		}
		
		if(isset($description))
		{
			$this->_return['meta']['description'] = $description->content;
			echo $description->content;
		}
	}

If you guys could help me, that would be amazing.

Thanks in advance.

Last Post by sacarias40

I'm definitely not a PHP wizard, but good old OOP rules apply: your function is declared as static, and you can't use the "this" reference in a static context.

Yeah, I have been doing a little research and found that much out. But do you know how I could overcome this? Do you know what I'm trying to get at with the code above?
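
One possible way around it, sketched under some assumptions: drop static from parse() so that $this exists, and use a static method only to pick which class to instantiate. The classes below are stripped-down stand-ins for the ones in the question. The constructor, the create() factory, and the plain-array $_document are all hypothetical, added so the sketch runs without the DOM library.

```php
<?php
// Stand-in for the parent class from the question.
class Bot
{
    protected $_document;
    protected $_return = array();

    // Assumption: the document is passed in as a plain array here.
    public function __construct(array $document)
    {
        $this->_document = $document;
    }

    // Not static, so $this is available.
    public function parse()
    {
        $this->_return['images'] = $this->_document['images'];
        return $this->_return;
    }

    // Hypothetical static factory: picks the class, returns an object of it.
    public static function create(array $document, $special = null)
    {
        $class = $special ? 'Bot_Special_' . ucfirst($special) : 'Bot';

        // Fall back to the generic bot when no special class exists.
        if (!class_exists($class)) {
            $class = 'Bot';
        }

        return new $class($document);
    }
}

class Bot_Special_Youtube extends Bot
{
    public function parse()
    {
        // parent::parse() inside an instance method keeps $this.
        parent::parse();
        $this->_return['video'] = $this->_document['video'];
        return $this->_return;
    }
}
```

Calling Bot::create($document, 'youtube')->parse() then runs the child's parse(), which in turn runs the parent's, and no branch on is_special() is needed at the call site because the object already knows which parse() it has.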
