data-grabbing & mining - need script-help

metabo_man
Newbie Poster

  #1
Jul 22nd, 2006
This is probably one of the best places to ask such questions, so here goes.

First off, I have to explain something: I have to grab some data out of a phpBB forum in order to do some field research. I need the data from a forum that is run by a user community, so that I can analyze the discussions.

To give an example, let us take this forum here. How can I grab all the data out of this forum, get it local, and then afterwards put it into a local database of a phpBB forum? Is this possible?

Nothing harmful, nothing bad, nothing serious or dangerous. But the issue is that I have to get the data, so what now?

I need the data in an almost full and complete format. So I need all the data, like

username
forum
thread
topic
text of the posting, and so on.

How do I do that?

I need some kind of grabbing tool. Can I do it with that kind of tool? And how do I solve the issue of storing the data in the local MySQL database?
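
For the storing part, here is a minimal sketch of what a local table for those fields could look like. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the table name, column names and helper functions are all made up for illustration and are not phpBB's real schema.

```python
import sqlite3

# Hypothetical schema: one row per posting, with the fields listed above.
# This is NOT phpBB's own table layout, and SQLite stands in for MySQL.
SCHEMA = """
CREATE TABLE IF NOT EXISTS posts (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL,
    forum    TEXT NOT NULL,
    thread   TEXT NOT NULL,
    topic    TEXT NOT NULL,
    body     TEXT NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open (or create) the local database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_posts(conn, posts):
    """Insert a list of dicts, one per posting, using named placeholders."""
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO posts (username, forum, thread, topic, body) "
            "VALUES (:username, :forum, :thread, :topic, :body)",
            posts,
        )


def posts_by_user(conn, username):
    """Return (thread, body) rows for one user, in insertion order."""
    cur = conn.execute(
        "SELECT thread, body FROM posts WHERE username = ? ORDER BY id",
        (username,),
    )
    return cur.fetchall()
```

Placeholders are used instead of building the SQL string by hand, so quotes inside a posting cannot break the statement. With a MySQL driver the same idea should apply, but the placeholder syntax may differ.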

Well, you see, that is tricky work, and I am pretty sure that I will get help here. So for any and all help I am very thankful.

Many, many thanks in advance

metabo_man :)
- an ethno-researcher
Last edited by metabo_man; Jul 22nd, 2006 at 6:18 pm.
Puckdropper
Posting Pro in Training

  #2
Jul 23rd, 2006
Ok, first you get a pencil and a piece of paper... Oh you want it done automatically.

If it's your forum, you can simply download the database and do your data mining in that. If it's not your forum, you may find a friendly admin that will help you.
www.uncreativelabs.net

Old computers are getting to be a lost art. Here at Uncreative Labs, we still enjoy using the old computers. Sometimes we want to see how far a particular system can go, other times we use a stock system to remind ourselves of what we once had.
metabo_man
Newbie Poster

  #3
Jul 23rd, 2006
hi


thanks for the reply.



Originally Posted by Puckdropper
Ok, first you get a pencil and a piece of paper... Oh you want it done automatically.

Yes, agreed.

Web automation can be a difficult task; at least, it depends on the countermeasures implemented. Some sites ban access based on the User-Agent you send, some use HTTP_REFERER values to restrict bots, and some use session-based authorization schemes to keep bots from interacting with the site.

But for the automation task we can look at WWW::Mechanize, as it encapsulates many of the low-level web automation tools provided by Perl.
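
To make the three headers and mechanisms mentioned above concrete, here is a rough sketch using only Python's standard library rather than WWW::Mechanize. The forum address, the User-Agent string and the function names are placeholders I made up, and whether a given site accepts such requests depends entirely on that site.

```python
import http.cookiejar
import urllib.request

# Hypothetical values for illustration only.
BASE_URL = "http://forum.example.org/"
USER_AGENT = "research-fetcher/0.1"


def make_opener():
    """Build an opener that keeps session cookies between requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def make_request(url, referer=BASE_URL):
    """Prepare a request with explicit User-Agent and Referer headers."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": USER_AGENT, "Referer": referer},
    )


def fetch(opener, url, referer=BASE_URL):
    """Fetch one page and return its text (this performs network I/O)."""
    with opener.open(make_request(url, referer), timeout=30) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

The cookie jar is what carries a session from one request to the next; the two headers are simply sent along with every request.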

Originally Posted by Puckdropper
If it's your forum, you can simply download the database and do your data mining in that. If it's not your forum, you may find a friendly admin that will help you.

Yes, right, but I need to do it on the fly. I have no time to ask. And for this research I want data that is obtained without long and controversial debates...

metabo
iamthwee
Posting Expert

  #4
Jul 23rd, 2006
Originally Posted by metabo_man
to give an example - let us take this forum here. How can i grab all the data out of this forum
Hmm, I'm not sure how you would grab all the information from this forum, but I can see a way to do it for individual threads.

All I would do is use the View Source option in your web browser, then copy that source to a text file.

Then you can just write your own parser (in any language, for that matter, not just PHP) to extract the relevant info:

username
forum
thread
topic
text of the posting, and so on.
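
As a rough idea of what such a parser could look like, here is a small sketch in Python using the standard html.parser module. The markup it expects (a span with class "username" and a div with class "postbody") is purely an assumption for the example; you would have to look at the saved source and adjust the tag and class names to whatever the forum really uses.

```python
from html.parser import HTMLParser


class PostParser(HTMLParser):
    """Collect (username, text) pairs from a saved thread page.

    Assumes hypothetical markup: <span class="username"> holds the
    author and <div class="postbody"> holds the text of the posting.
    """

    def __init__(self):
        super().__init__()
        self.posts = []      # finished (username, text) tuples
        self._field = None   # "username" or "body" while reading one
        self._tag = None     # tag name that opened the current field
        self._depth = 0      # nesting depth of that tag
        self._buf = []       # text fragments of the current field
        self._user = None    # last username seen

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class") or ""
        if self._field is None:
            if tag == "span" and cls == "username":
                self._field, self._tag = "username", tag
                self._depth, self._buf = 1, []
            elif tag == "div" and cls == "postbody":
                self._field, self._tag = "body", tag
                self._depth, self._buf = 1, []
        elif tag == self._tag:
            self._depth += 1  # nested element of the same kind

    def handle_endtag(self, tag):
        if self._field is None or tag != self._tag:
            return
        self._depth -= 1
        if self._depth == 0:
            text = " ".join("".join(self._buf).split())  # tidy whitespace
            if self._field == "username":
                self._user = text
            else:
                self.posts.append((self._user, text))
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self._buf.append(data)


def parse_posts(html):
    """Parse one page of HTML and return its list of (username, text)."""
    parser = PostParser()
    parser.feed(html)
    parser.close()
    return parser.posts
```

Forum, thread and topic could be picked up the same way once you know which elements carry them in the page source.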
Last edited by iamthwee; Jul 23rd, 2006 at 6:26 am.
*Voted best profile in the world*
metabo_man
Newbie Poster

  #5
Jul 23rd, 2006
hello


Many thanks, that is a big help. I do not want to grab all the information, only that of certain individual threads.



Originally Posted by iamthwee
Hmm, I'm not sure how you would grab all the information from this forum, but I can see a way to do it for individual threads.

All I would do is use the View Source option in your web browser, then copy that source to a text file.

Then you can just write your own parser (in any language, for that matter, not just PHP) to extract the relevant info.

That sounds interesting. I will look into how to solve it, dive into the thing, and return here if I have more questions or ideas, or if I need more help.

best regards
metabo

©2003 - 2009 DaniWeb® LLC