I'm looking for a way to compare 2 lists of keywords and return the number of matches.

I'm currently working with PHP and MySQL, but I'm not sure how to go about this task. Any suggestions are appreciated.



(additional info)

I have not entered the data yet. So I'm open to just about anything. Here's what I'm planning.

(this part I have)
html form: user enters url
php: visits url and extracts the keywords and description meta tag values.

(I don't have this)
mysql: [table with 2 columns: column1 & column2]
column1 - a phrase or a single word
column2 - list of keywords related to column1

Essentially, the php will visit the site, extract the keywords and compare them to the database keywords (column2) counting the number of matches. The php will return the column1 value for the column2 value that returned the most matches.

For simply comparing 2 lists of words for matches, I'd leave the database out of it. You can store your results or the keywords in the database if you want, but for the actual comparison, check out PHP's array_intersect() function, which returns the values two arrays have in common (array_diff() returns the values they don't share).
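Here's a rough sketch of counting the matches between two keyword lists, in case it helps. It uses array_intersect() to collect the shared values and count() to total them. The function name count_keyword_matches() and the sample keywords are made up for illustration, and I'm assuming you want the comparison to ignore case and stray whitespace.

```php
<?php
// Hypothetical helper: counts the keywords that appear in both lists.
function count_keyword_matches(array $a, array $b): int
{
    // Assumption: "PHP " and "php" should count as the same keyword.
    $normalize = function (string $word): string {
        return strtolower(trim($word));
    };

    // Drop duplicates so a repeated keyword is only counted once.
    $a = array_unique(array_map($normalize, $a));
    $b = array_unique(array_map($normalize, $b));

    // array_intersect() keeps the values of $a that also exist in $b.
    return count(array_intersect($a, $b));
}

// Made-up sample data.
$pageKeywords   = ['PHP', 'mysql', 'forum', 'arrays'];
$storedKeywords = ['php', 'MySQL', 'database'];

echo count_keyword_matches($pageKeywords, $storedKeywords); // prints 2
```

If you need exact, case-sensitive matching instead, drop the normalizing step and call array_intersect() on the raw arrays.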

PS: For the part where you retrieve the headers and meta tags of the pages server-side, I know you said you had this working, but my class_http would make that job quite easy. It's a robust screen-scraping class and even supports making WebDAV requests. It is very easy to use.
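If you'd rather stay with what ships with PHP, the built-in get_meta_tags() function may cover the meta tag part. Below is a minimal sketch: it writes a made-up sample page to a temp file so it runs offline, but get_meta_tags() also accepts a URL when allow_url_fopen is enabled. The helper name keywords_from_meta() is hypothetical, and I'm assuming the keywords are comma-separated.

```php
<?php
// Hypothetical helper: turns the "keywords" meta value into a clean array.
function keywords_from_meta(array $meta): array
{
    if (!isset($meta['keywords'])) {
        return [];
    }
    // Assumption: keywords are comma-separated; lowercase and trim each one.
    $words = array_map('trim', explode(',', strtolower($meta['keywords'])));

    // Remove empty entries left by stray commas and reindex.
    return array_values(array_filter($words, 'strlen'));
}

// Made-up sample page, saved to a temp file so no network access is needed.
$html = '<html><head>'
      . '<meta name="keywords" content="PHP, MySQL, arrays">'
      . '<meta name="description" content="Comparing keyword lists">'
      . '</head><body></body></html>';
$file = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($file, $html);

// get_meta_tags() returns an array keyed by the lowercased meta name.
$meta = get_meta_tags($file);
unlink($file);

print_r(keywords_from_meta($meta));
```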


Thanks a lot for the response. I was wondering if anyone was going to respond.

Some things have changed since my last post. I now have 155+ categories (individual names), and each has its own list of keywords.

I'll want to compare one list of keywords with each of the category’s lists. I've already designed the database structure (normalized, I think), so I'm not so worried about that any longer.
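To make that concrete, here is roughly what comparing one list against every category's list could look like once the category keywords have been loaded into an array. This is only a sketch: the function name best_category(), the array layout (category name => keyword list), and the sample data are all assumptions, not my actual schema.

```php
<?php
// Hypothetical helper: returns the name of the category whose keyword list
// shares the most entries with $pageKeywords, or null if nothing matches.
function best_category(array $pageKeywords, array $categories): ?string
{
    // Assumption: matching ignores case and surrounding whitespace.
    $clean = function (array $words): array {
        return array_unique(array_map('strtolower', array_map('trim', $words)));
    };

    $page      = $clean($pageKeywords);
    $best      = null;
    $bestCount = 0;

    foreach ($categories as $name => $keywords) {
        // Number of keywords the page shares with this category.
        $count = count(array_intersect($page, $clean($keywords)));
        if ($count > $bestCount) {
            $best      = (string) $name;
            $bestCount = $count;
        }
    }

    return $best;
}

// Made-up sample data standing in for rows fetched from the database.
$categories = [
    'Web Development' => ['php', 'mysql', 'html'],
    'Cooking'         => ['recipes', 'baking'],
];

echo best_category(['PHP', 'MySQL', 'hosting'], $categories); // prints Web Development
```

With a tie, this keeps the first category that reached the top count; that choice is arbitrary and easy to change.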

As for the tag extractor, I have a very simple solution:

$cont = '';
while (!feof($fp)) {
	// Read one line (up to 4095 bytes), trim it, and append it to the page buffer.
	$buf = trim(fgets($fp, 4096));
	$cont .= $buf;
}


echo 'Title<br>';
echo $match[1];
echo '<br><br>';

echo 'URL<br>';