This week I'm working on the data warehousing capabilities of my application. Now most people here who have built more than a couple of applications can tell you how useful it is to include a datestamp on records or requests as they come in. The advantages are numerous.

So much so that I'm now fighting the temptation to include a datestamp on every table. Won't that bloat my database?

I suppose what I'm looking for is guidance on making a per table decision to include or exclude the datestamp. Because if space were free I'd put one on every darn table! But space isn't free and my app is likely to generate a factor of five to ten times the amount of metadata per data and so datestamps would put me in danger of doubling that again, or trebling it. That's fine up to a point but will impact load times.

How big is each record?

If it's like say a customer record with maybe a dozen fields and taking up several hundred bytes of space, adding a single 32-bit timestamp isn't going to impact you.

https://en.wikipedia.org/wiki/Entity%E2%80%93relationship_model
Draw one of these for your database schema then consider which of the perimeter tables would benefit from a timestamp.

commented: Thanks you are right again, Salem! I answered below. +2

@Salem

My application will be an analytics system and data warehousing system (because of the former) for network information. Most network information worth saving is about one kilobyte per record, or some fraction thereof. So usually it's safe to run with the assumption of about 1k/record.

An example record might be:

ip route add 10.0.3.0/24 via 10.0.3.1

To support data analysis, I'm obviously going to parse that into about ten times the volume that my customer ships records at. For example, if someone wants to look up a list of routes, part (but only part) of the above record will be stored in a separate format. And if you want to know how many times a host in that particular Class A changed, I'd have part of the above stored in a separate format. And so and and so forth.

But you're right, even a smaller string will still need about 10 bytes of space allocated to it, a single 32-bit volume is inconsequential, both in the amount of disk space required for storage as well as the amount of delay introduced in drive access time, even on older plate/spindle systems. Heck 32 bits is only half of a CPU register these days, even accounting for a security bit and a virtualization bit

Thanks! You've made it a really easy decision. I might as well go ahead and just datestamp everything. No real reason not to!

Including datestamps can be very useful for tracking changes and managing data, but you are right to consider the potential impact on database size and performance. A good approach is to assess the specific use case for each table. If the data will benefit from tracking creation, then a datestamp is worthwhile.

commented: Thanks! +2
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.