data-grabbing & mining - need script-help
Join Date: Jul 2006
Posts: 6
Reputation:
Solved Threads: 0
This is probably one of the best places to ask such questions, so I will ask now.
First off, I have to explain something: I have to grab some data out of a phpBB forum in order to do some field research. I need the data from a forum that is run by a user community, so that I can analyze the discussions.
To give an example, let us take this forum here. How can I grab all the data out of this forum, get it onto my local machine, and afterwards put it into a local database of a phpBB forum? Is this possible?
Nothing harmful, nothing bad, nothing serious or dangerous. But the issue is:
I have to get the data, so what now?
I need the data in an almost full and complete format. So I need all the data, like
username
forum
thread
topic
text of the posting, and so on.
How do I do that?
I need some kind of grabbing tool. Can I do it with that kind of tool? And how do I solve the issue of storing the data in the local MySQL database?
Well, you see, this is tricky work, and I am pretty sure that I will get help here. So for any and all help I am very, very thankful.
Many, many thanks in advance
metabo_man :)
- an ethno-researcher
Last edited by metabo_man; Jul 22nd, 2006 at 6:18 pm.
Join Date: Jul 2004
Posts: 494
Reputation:
Solved Threads: 21
Ok, first you get a pencil and a piece of paper... Oh you want it done automatically.
If it's your forum, you can simply download the database and do your data mining in that. If it's not your forum, you may find a friendly admin that will help you.
www.uncreativelabs.net
Old computers are getting to be a lost art. Here at Uncreative Labs, we still enjoy using the old computers. Sometimes we want to see how far a particular system can go, other times we use a stock system to remind ourselves of what we once had.
Join Date: Jul 2006
Posts: 6
Reputation:
Solved Threads: 0
hi
thanks for the reply.
Originally Posted by Puckdropper
Ok, first you get a pencil and a piece of paper... Oh you want it done automatically.
Yes, agreed: web automation can be a difficult task; at least, it depends on the countermeasures implemented. Some sites ban access based on the User-Agent you send, some use HTTP_REFERER values to restrict bots, and some use session-based authorization schemes to keep bots from interacting with the site.
But for the automation task we can look at WWW::Mechanize, as it encapsulates many of the low-level web automation tools provided by Perl.
Originally Posted by Puckdropper
If it's your forum, you can simply download the database and do your data mining in that. If it's not your forum, you may find a friendly admin that will help you.
Yes, right, but I need to do it on the fly. I have no time to ask. And for my research I want data that is gained without long and controversial debates....
metabo
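The post above names three things a fetching script has to handle: the User-Agent header, the Referer header, and session state. As a rough illustration only, here is a minimal sketch in Python using just the standard library (a stand-in chosen for this example, not WWW::Mechanize itself). The user-agent string, the helper names, and any URLs are hypothetical and not taken from the thread.

```python
import http.cookiejar
import urllib.request

# Hypothetical identifier for illustration; not taken from the thread.
USER_AGENT = "research-fetcher/0.1 (contact: researcher@example.org)"


def make_opener(user_agent=USER_AGENT):
    """Build an opener that keeps session cookies across requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Headers listed here are sent with every request made through the opener.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def make_request(url, referer=None):
    """Create a request object, optionally carrying a Referer header."""
    request = urllib.request.Request(url)
    if referer:
        request.add_header("Referer", referer)
    return request


def fetch(opener, url, referer=None, timeout=30):
    """Fetch one page through the opener and return its body as text."""
    with opener.open(make_request(url, referer), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

In use, one would call `make_opener()` once and pass the same opener to every `fetch(...)` call, so that cookies set by the first page are sent back on later ones. Whether this is enough for a given site depends on the countermeasures that site uses.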
Originally Posted by metabo_man
to give an example - let us take this forum here. How can i grab all the data out of this forum
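Hmm, I'm not sure how you would grab all the information from this forum, but I can see a way to do it for individual threads.
All I would do is click the "view source" button in your internet browser, then copy that info to a text file.
Then you can just write your own parser (in any language, for that matter, not just PHP) to extract the relevant info.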
Last edited by iamthwee; Jul 23rd, 2006 at 6:26 am.
*Voted best profile in the world*
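To make the "write your own parser" suggestion a bit more concrete, here is a minimal sketch in Python that pulls author names and post bodies out of a saved page. It is only an illustration under loud assumptions: the class names `username` and `postbody` are hypothetical, and the real markup of any given forum has to be checked in its page source first.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PostExtractor(HTMLParser):
    """Collect {"username": ..., "text": ...} dicts from a saved thread page.

    Assumes (hypothetically) that each author name sits in an element with
    class "username" and each post body in an element with class "postbody".
    """

    FIELDS = {"username": "username", "postbody": "text"}

    def __init__(self):
        super().__init__()
        self.posts = []
        self._field = None    # field currently being captured, if any
        self._depth = 0       # nesting depth inside the captured element
        self._chunks = []     # text pieces of the captured element
        self._current = {}    # post being assembled

    def handle_starttag(self, tag, attrs):
        if self._field is not None:
            if tag in VOID_TAGS:
                self._chunks.append(" ")   # treat <br> and friends as spacing
            else:
                self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for css_class, field in self.FIELDS.items():
            if css_class in classes:
                self._field, self._depth, self._chunks = field, 1, []
                break

    def handle_endtag(self, tag):
        if self._field is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace into single spaces.
            self._current[self._field] = " ".join("".join(self._chunks).split())
            if self._field == "text":      # a post body completes one post
                self.posts.append(self._current)
                self._current = {}
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self._chunks.append(data)


def parse_thread(html_text):
    """Return the list of posts found in one page of HTML."""
    parser = PostExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.posts
```

Feeding it the text file saved from "view source" (read with `open(path, encoding=...)`) returns one dict per post. A hand-rolled parser like this is brittle by nature; it will need adjusting whenever the forum's template differs from the assumed class names.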
Join Date: Jul 2006
Posts: 6
Reputation:
Solved Threads: 0
hello
Originally Posted by iamthwee
Hmm, I'm not sure about how you would grab all the information from this forum, but I can see a way how to do it for individual threads.
All I would do is click the view source code button on your internet browser, then copy that info to a text file.
Then you can just write your own parser (in any language for that matter not just php) to extract the relevant info.
Many thanks, that is of big help. I do not want to grab all the information, only that of certain individual threads.
That sounds interesting. I will look into how to solve it. I will dive into the thing and return here in case I have more questions or ideas, or just think that I need more help.
best regards
metabo
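The storage half of the original question (username, forum, thread, topic, post text in a local database) was not answered in the thread. As a rough sketch only, here is one way it could look in Python. It uses the standard library's `sqlite3` as a stand-in for MySQL so that it runs without a server; the table layout, column names, and function names are assumptions made for illustration, and each post is assumed to be a dict with `username` and `text` keys.

```python
import sqlite3

# Hypothetical table layout mirroring the fields listed in the first post.
SCHEMA = """
CREATE TABLE IF NOT EXISTS posts (
    id       INTEGER PRIMARY KEY,
    forum    TEXT NOT NULL,
    thread   TEXT NOT NULL,
    topic    TEXT NOT NULL,
    username TEXT NOT NULL,
    body     TEXT NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open (or create) the local database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_posts(conn, forum, thread, topic, posts):
    """Insert the posts of one thread; returns the number of rows written."""
    rows = [(forum, thread, topic, p["username"], p["text"]) for p in posts]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO posts (forum, thread, topic, username, body) "
            "VALUES (?, ?, ?, ?, ?)",
            rows,
        )
    return len(rows)


def posts_by_user(conn, username):
    """Return (thread, body) pairs written by one user, in insertion order."""
    cursor = conn.execute(
        "SELECT thread, body FROM posts WHERE username = ? ORDER BY id",
        (username,),
    )
    return cursor.fetchall()
```

The same statements should carry over to MySQL with a MySQL driver and its own placeholder style, though the column types would need adapting. Parameterized queries are used throughout so that post text containing quotes cannot break the SQL.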
Views: 1674 | Replies: 4