Join Date: Oct 2007
Posts: 1
I'm looking for an open source web crawler that can be easily modified. Is anyone aware of a good one? I've just enrolled in a data mining course and would like to develop an algorithm that categorizes web pages. I need to be able to control which pages are indexed and how those pages are categorized, both based on the page content. Thanks.
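To make the requirement concrete: deciding per page whether to index it and which category to assign, both from the page content, comes down to two hooks inside the crawl loop. Below is a minimal sketch using only the Python standard library. It is not taken from any existing crawler; `should_index`, `categorize`, the five-word threshold, and the keyword rule are all hypothetical placeholders for whatever the course algorithm turns out to be.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects text and outgoing links from one HTML page.

    Simplification: script and style contents are not filtered out.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def fetch(url):
    """Default fetcher: download a page and decode it as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def should_index(text):
    """Hypothetical content filter: skip near-empty pages."""
    return len(text.split()) >= 5


def categorize(text):
    """Hypothetical keyword rule; a real classifier would go here."""
    words = text.lower().split()
    if "python" in words or "java" in words:
        return "programming"
    return "other"


def crawl(start_url, fetch=fetch, max_pages=20):
    """Breadth-first crawl returning {url: category} for indexed pages."""
    index, seen, queue = {}, {start_url}, deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # unreachable or malformed URL: skip it
        fetched += 1
        parser = PageParser()
        parser.feed(html)
        text = " ".join(parser.text_parts)
        # Both decisions are made from the page content alone.
        if should_index(text):
            index[url] = categorize(text)
        # Links are followed even when the page itself is not indexed.
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index
```

Calling `crawl("http://example.com/")` would return a dictionary mapping each indexed URL to its category. Passing a different `fetch` function makes it possible to test the two hooks against canned pages without touching the network. A crawler meant for real sites would also need to honor robots.txt and rate-limit its requests, which this sketch leaves out.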