Hi all,

First post on this forum so a BIG hello from me to all you IT Gurus out there!

Need some advice solving a little problem I have with my database!

Current setup:

I have a person table that is made up of 39 columns. I also allow for person records to be deleted but I do this by having another table I call LogicallyDeletedrecords. This table is made up of the PersonId, Reason for deletion/suppression and a date time stamp. To access Live records I created a view based on my Person table which contains a WHERE clause to exclude records that exist in the LogicallyDeletedrecords. Similarly, I have another view DeadPersonData which contains Person records that have been removed. Hope it all makes sense so far! Now on to my worries!

The problem:

My Person table contains 9+ million records. The LogicallyDeletedrecords table has 500k+ but I anticipate further growth over the coming weeks/months. My worry is that my LivePersonData view will be too slow to access as my LogicallyDeletedrecords table grows. What’s more, as part of my Load routine, I have to make sure that Person data loaded on to the system is excluded if that same person exists as a deleted member. Both of these actions could slow down my system as the deleted table grows.

My thoughts:

I’ve been thinking of physically deleting dead Person records from my person table (possibly creating an archive table to hold them). But then if I delete them how do I cross check the details when new Person details get loaded?! As I said, my current LogicallyDeletedrecords table holds the PersonId, ReasonDeleted and CreationStamp. The only way is to add further columns which I use to match Person Details?

Any design suggestions would be welcome!

Recommended Answers

All 6 Replies

The normal way to do this is to have a logicallyDeleted flag on the main table.

You would normally have a trigger fire when this flag is set to true to copy the row to your "logicallyDeleted" table for you.
Deleting a row can then be done by simply setting logicallyDeleted to true.

Not having to join to the "logicallyDeletedRecords" table every time you do a select will make your queries an order of magnitude more efficient.

The normal way to do this is to have a logicallyDeleted flag on the main table.

You would normally have a trigger fire when this flag is set to true to copy the row to your "logicallyDeleted" table for you.
Deleting a row can then be done by simply setting logicallyDeleted to true.

Not having to join to the "logicallyDeletedRecords" table every time you do a select will make your queries an order of magnitude more efficient.

I’m not sure I explained myself properly so I will try again. Your suggestion doesn’t seem to solve my dilemma!

Here let me explain in more detail:

Let’s say my current Person table contains 1000 rows. I decide to suppress 100 of them. The way I do this at the moment is populate a LogicallyDeletedRecord table with the PersonId. I then create 2 views. One view for LivePerson details (i.e. all PersonIds where they don’t exist in my LogicallyDeletedRecord table) and a DeadPersonData view (i.e. all PersonIds that do exist in my LogicallyDeletedrecords table).

This works fine if both the Person and the LogicallyDeletedrecords table contain relatively small number of rows. But imagine if my person table has 20 million records and my LogicallyDeletedRecords table contains 2 million. To access my 2 views would slow down the system. What’s more, when I come to load new Person data, I have to ensure that any new Person data doesn’t already exist in my Person table! This is done by matching Postcode, Address Line 1, Forename and Surname. I can only this on the real Person table as the LogicallyDeletedRecords table only contains PersonIds which is a unique number within my system.

Does this make sense?

From your description, blater's solution will do what you want. If you want to 'delete' a record, why move it to another table - just delete it. However for some reason you want to retain the data, but not display it, so just add another field for 'Display' and set it to either 0 or 1. Then use a WHERE clause to filter what you want.

You keep mentioning deleting and archiving in the same sentence. What exactly do you want to do?

From your description, blater's solution will do what you want. If you want to 'delete' a record, why move it to another table - just delete it. However for some reason you want to retain the data, but not display it, so just add another field for 'Display' and set it to either 0 or 1. Then use a WHERE clause to filter what you want.

You keep mentioning deleting and archiving in the same sentence. What exactly do you want to do?

My Person table holds live and dead records. When somebody asks for their details to be removed from the database, I don't want to remove them for good. There is a good reason for this by the way. If that same person details come in a new file, there is no need to load them again as they have been removed before. This is why I need to audit which records get deleted.

Hope this makes more sense.

If that is your only reason, then again, the suggested solution of setting a 'Display' field to True or False should suffice for your needs. If not can you explain why not? It seems to be quite a waste of space and maintenance to retain all those 'deleted' records, when all you need is an extra field of 1 character.

I agree, I will be implementing this very soon. Thanks to you both for your input

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.