DaniWeb IT Discussion Community
JEKYS
Newbie Poster

Optimizing tables to compare billions of rows. How?

  #1  
Dec 5th, 2007
Dear all,

I am currently working on a project that logs visitor traffic.
Let's say that every time a visitor views the website, one row is
inserted into a table.

The thing is, just like OneStat, Nedstat or similar services, this
project receives its input from a large number of websites. This may
result in, let's say, 1000 rows per minute at the start, but probably
a hell of a lot more later on.

My first question is: how many rows can be processed per minute, or
per second?


My second question is more difficult. I will also need to compare
the results on a guide page that compares all the data collected and
shows, for example, which site has had the most visitors over the
last day, week, year, or whatever.

My problem is that if I store all data in one table, this table
will very soon be very large. I don't know the maximum number of
rows that is allowed, but the size will surely influence the speed.
The more rows, the longer it will of course take to compare and show
results on the guide page. And eventually I will hit the maximum
number of rows anyhow.

What I am thinking of is to automatically generate a table for every
month, but I am not sure if this is wise. Will I still be able to
compare fast enough after a year? Let's say I have 12 tables and I
want to compare all rows, and there are one billion rows in each
table...

What I would actually like to know, I think, is how the database of
Nedstat is more or less structured. They probably have over a million
rows to store every minute... Can anyone tell me how they manage to
store and compare all this data?
Nige Ridd
Junior Poster in Training

Re: Optimizing tables to compare billions of rows. How?

  #2  
Dec 10th, 2007
One thing you need to think about: do you actually need to store a row per hit on the website? Could you not just have a small table that counts the hits per site?
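A minimal sketch of the counter-table idea, assuming SQLite through Python's standard sqlite3 module purely for illustration (the thread does not say which database is in use). The table name site_hits, its columns, and the per-day granularity are hypothetical choices, not anything from the original project.

```python
import sqlite3


def open_counter_db(path=":memory:"):
    """Open the database and create the (hypothetical) counter table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS site_hits ("
        " site_id INTEGER NOT NULL,"
        " day     TEXT    NOT NULL,"   # 'YYYY-MM-DD'
        " hits    INTEGER NOT NULL DEFAULT 0,"
        " PRIMARY KEY (site_id, day))"
    )
    return conn


def record_hit(conn, site_id, day):
    """Bump the counter for one site and day instead of inserting a row per hit."""
    with conn:  # one transaction: create the counter row if missing, then increment
        conn.execute(
            "INSERT OR IGNORE INTO site_hits (site_id, day, hits) VALUES (?, ?, 0)",
            (site_id, day),
        )
        conn.execute(
            "UPDATE site_hits SET hits = hits + 1 WHERE site_id = ? AND day = ?",
            (site_id, day),
        )


def top_sites(conn, first_day, last_day, limit=10):
    """Return (site_id, total_hits) for the busiest sites in a date range."""
    return conn.execute(
        "SELECT site_id, SUM(hits) AS total FROM site_hits"
        " WHERE day BETWEEN ? AND ?"
        " GROUP BY site_id ORDER BY total DESC, site_id LIMIT ?",
        (first_day, last_day, limit),
    ).fetchall()
```

With this layout the table grows by one row per site per day rather than one row per visit, so a "most visitors last week" comparison only has to read a handful of rows per site.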
If you do need each hit, you could hold each hit in a log-type table, then have a batch process that summarises this data overnight into a report table, which you then use to generate the reports. Another batch process can then copy any data that is no longer needed to produce your results into an archive, perhaps even just a flat file to save space. If you need very up-to-date stats, you could combine the summary table values with the current log data to give you truly up-to-the-minute figures.
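The log, summary and archive steps could look roughly like the sketch below. It again assumes SQLite via Python's sqlite3 module only as a stand-in database. The names hit_log, daily_summary, summarise_before and total_hits are hypothetical, and it assumes each day is summarised exactly once and that no hits arrive late with an old timestamp.

```python
import csv
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS hit_log (
    site_id INTEGER NOT NULL,
    hit_at  TEXT    NOT NULL          -- 'YYYY-MM-DD HH:MM:SS'
);
CREATE TABLE IF NOT EXISTS daily_summary (
    site_id INTEGER NOT NULL,
    day     TEXT    NOT NULL,         -- 'YYYY-MM-DD'
    hits    INTEGER NOT NULL,
    PRIMARY KEY (site_id, day)
);
"""


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_hit(conn, site_id, hit_at):
    """Store one raw hit in the log table."""
    with conn:
        conn.execute("INSERT INTO hit_log VALUES (?, ?)", (site_id, hit_at))


def summarise_before(conn, cutoff_day, archive_path=None):
    """Batch step: roll up log rows older than cutoff_day, then drop them.

    If archive_path is given, the raw rows are first appended to a flat CSV file.
    """
    cutoff = cutoff_day + " 00:00:00"
    if archive_path is not None:
        old_rows = conn.execute(
            "SELECT site_id, hit_at FROM hit_log WHERE hit_at < ? ORDER BY hit_at",
            (cutoff,),
        ).fetchall()
        with open(archive_path, "a", newline="") as f:
            csv.writer(f).writerows(old_rows)
    with conn:  # summarise and delete in one transaction
        conn.execute(
            "INSERT INTO daily_summary (site_id, day, hits)"
            " SELECT site_id, date(hit_at), COUNT(*) FROM hit_log"
            " WHERE hit_at < ? GROUP BY site_id, date(hit_at)",
            (cutoff,),
        )
        conn.execute("DELETE FROM hit_log WHERE hit_at < ?", (cutoff,))


def total_hits(conn, site_id):
    """Up-to-the-minute total: summarised days plus whatever is still in the log."""
    row = conn.execute(
        "SELECT (SELECT COALESCE(SUM(hits), 0) FROM daily_summary WHERE site_id = ?)"
        " + (SELECT COUNT(*) FROM hit_log WHERE site_id = ?)",
        (site_id, site_id),
    ).fetchone()
    return row[0]
```

Running summarise_before once a night keeps hit_log down to roughly one day of traffic, while the reports read the much smaller daily_summary table.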

Nige