Dear all,

I am currently working on a project that involves logging visitor traffic.
Every time a visitor visits one of the websites, one row is inserted into a table.
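
Just to give an idea of what I mean, the log table would look more or less like this (the table and column names are just made up for the example, assuming something like MySQL):

CREATE TABLE hits (
    hit_id     BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- one row per visit
    site_id    INT NOT NULL,                                 -- which member website was visited
    hit_time   DATETIME NOT NULL,                            -- when the visitor came
    visitor_ip VARCHAR(15)                                   -- optional extra detail per hit
);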

The thing is, just like OneStat, Nedstat or similar services, this project
receives its input from a large number of websites. This may result in, say,
1000 rows per minute at the start, but probably a hell of a lot more later on.

My first question is: how many rows can be processed per minute, or per
second?


My second question is more difficult. I also need to compare the results on a
guide page that compares all the collected data and shows, for example, which
site has had the most visitors over the last day, week, year... whatever.
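
So the Guide Page would basically be running a query like this one (just a sketch, assuming the log table from above):

SELECT site_id, COUNT(*) AS visits
FROM hits
WHERE hit_time >= '2004-01-01'   -- start of the period: last day, week, year...
GROUP BY site_id
ORDER BY visits DESC;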

My problem is that if I store all the data in one table, that table will very
soon become very large. At 1000 rows per minute that is already about 1.4
million rows per day, or over 500 million rows per year. I don't know the
maximum number of rows that is allowed, but the size will certainly influence
speed: the more rows, the longer it will take to compare and show results on
the Guide Page. And eventually I will hit the maximum number of rows anyway.

What I am thinking of is automatically generating a new table for every
month, but I am not sure whether this is wise. Will I still be able to
compare the data fast enough after a year? Say I have 12 tables and I want to
compare all the rows, with perhaps a billion rows in each table...
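
If I understand it right, comparing across the monthly tables would then mean something like this (again just a sketch with made-up table names), and I wonder how fast that would be with a billion rows in each table:

SELECT site_id, COUNT(*) AS visits
FROM (
    SELECT site_id FROM hits_2004_01
    UNION ALL
    SELECT site_id FROM hits_2004_02
    -- ... one SELECT per monthly table
) AS all_hits
GROUP BY site_id
ORDER BY visits DESC;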

What I would actually like to know, I think, is how the database of NedStat
is more or less structured. They probably have to store over a million rows
every minute... Can anyone tell me how they manage to store and compare all
this data?

One thing you need to think about is whether you actually need to store a row per hit of the website. Could you not just have a small table which counts the hits per site?
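Something along these lines (names are made up, and I've counted per site per day so you can still do day/week/year comparisons):

CREATE TABLE site_hit_counts (
    site_id  INT NOT NULL,
    hit_date DATE NOT NULL,
    hits     INT NOT NULL DEFAULT 0,
    PRIMARY KEY (site_id, hit_date)
);

-- on each visit, instead of inserting a new row:
UPDATE site_hit_counts
SET hits = hits + 1
WHERE site_id = 42 AND hit_date = CURRENT_DATE;
-- (you would need to insert the day's row first if it isn't there yet,
--  or use your database's insert-or-update syntax)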
If you do need each hit, you could hold each hit in a log-type table, then have a batch process which summarises this data overnight into a report table, which you then use to generate the reports.

You can then use another batch process which copies any data no longer needed to produce your results into an archive - perhaps even just a flat file, to save space. If you needed very up-to-date stats, you could combine the summary table values with the current log data to give you truly up-to-the-minute figures.
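
The overnight summarise step would basically boil down to something like this (just a sketch, the table names are invented):

-- roll yesterday's raw log rows up into the report table
INSERT INTO daily_summary (site_id, hit_date, hits)
SELECT site_id, DATE(hit_time), COUNT(*)
FROM hit_log
WHERE hit_time >= CURRENT_DATE - INTERVAL 1 DAY
  AND hit_time <  CURRENT_DATE
GROUP BY site_id, DATE(hit_time);

-- then archive the summarised rows (e.g. dump them to a flat file)
-- and remove them from the log so it stays small
DELETE FROM hit_log
WHERE hit_time >= CURRENT_DATE - INTERVAL 1 DAY
  AND hit_time <  CURRENT_DATE;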

Nige
