There are some things that I am in a dilemna with about general database/table design issues. Each has its pros and cons. What do you think are the best practices? Here are some problems.

1. You have 2 tables (or sets of tables) that are exactly/almost exactly the same (in fields and conceptually they are also similar). Do you combine them into 1 table (or 1 set of tables) with an additional field to differentiate/classify them as to w/c type they are (more like the Object-Oriented approach)? Or keep them separate?
To what degree to do you have common tables? Because you can do it completely like OO and have a hierarchy that starts from Object and have all tables descendants from Object and so on.

some general examples:
* Inventory Receipts/Inventory Issuances/Inventory
Purchases/Inventory Adjustments/Sales/anything related to the ins and outs of inventory items.
- Do you have common table(s) for all or do you completely separate each transaction into its own set of tables?

* Job Order/Purchase Order
- Some fields may be different like the PO number and Job Order Number w/c might have different sequences. And the detail may contain different contents where one is about labor/work and the other is about inventory.

What do you think are the best practices? And the pros and cons of each in terms of:
a. ease of usage (coding/reporting)
b. performance
c. flexibility (can it easily adapt/grow to future changes/enhancement)
d. others

2. When combining similar tables, do you completely separate non-common fields into their own descendant tables? or just put the fields in and allow NULL values? If you decide to put nullable fields, how many fields do you think is a good number to decide when separation into different descendant tables is appropriate?


These are common problems most database designers encounter. I hope some of you can shed light into these.

Recommended Answers

All 8 Replies

The best design for any database is normalization, u need to design your table in such a way that, you should not have any redundancy, that is repeating values coz it will consume the user time and space of your hard disk. you need to design your table in such a way that you should not face any consistent or integration problems.
consider you are designing a table for a company, so you design a table for employees and departments, so you have to design two tables employee and department table and link a relationship between those two tables with the foriegn keys so that its easy for you to reduce redundancy in database. so its better to hava a separate tables for each entity and link a relationship between them.
I hope i answered your query, if you have any doubts or suggestions, please do reply me

So, do you mean that you are against fields like "type". Like example, for sales, you have a "type" field that identifies if it is take-out or dine-in or for delivery. And if it is for delivery, you would require an additional delivery address. Are you saying you should create table(s) for sales just to separate the "for delivery"? Or do you create a descendant table just for the address (with a foreign key to the sales table, of course)? Or do you just put in the Delivery Address in the Sales Table as a nullable field.
In this example, I think the best solution would be the last one. Although it is denormalized, it is simpler, easier, faster and requires less storage space.

But, how do you decide when you should put them as nullable fields or not? How many fields before it is time to put them in separate tables?

Another good example would be an inventory system. All types of transactions will end up to an IN and an OUT to your inventory. So, all sales, purchases, adjustment, issuance--although they may have slightly different fields in the header (or in the detail) they have a lot in common in the detail, like:
- the item that was involved
- the qty
- location/warehouse of item
- cost/value of the inventory

Do you create an ancestor table for all the detail tables? Also the header tables?
And do we create separate tables for descendants?

Pros and Cons of each?

What I do know is that if you don't create an ancestor table, reporting will be more difficult. It will involve a lot of UNIONs instead of just joins. Like, i want to get the total qty on hand, I would have to get the sum from different tables using UNION.

Normally I'd recommend completely normalizing your tables. It would be overhead (as you stated) when you start out and may seem inefficient. However, note that maybe some day your need to extend your database, or someone else needs to do your job, then a normalized table is the easiest to read and adjust. You can always trace back what's what. This also helps you if haven't looked at it in a while and need to do some work on it.

Normally I'd recommend completely normalizing your tables. It would be overhead (as you stated) when you start out and may seem inefficient. However, note that maybe some day your need to extend your database, or someone else needs to do your job, then a normalized table is the easiest to read and adjust. You can always trace back what's what. This also helps you if haven't looked at it in a while and need to do some work on it.

i create software for councils and for general testing we have a database with over 1100 tables. i believe that you should always normalise as it will reduce any headaches you have later, but if you dont have the time or you simply havent been paid enough to do the job :D then just slap one up trigger happy style. i have seen performance increase 3x when we did an update on another companies software (this included normalising multiple databases and tables into one)

There are cases when there are different ways to normalize. This is related to my question 1 in the first message.

For example, we have a simple inventory database tracking the quantities that go in and go out of our stock. (Let's make this a simple example. Assume the following basic fields only. And let's assume there is only 1 item). There will be only 3 types of transactions: Purchase, Sale, Adjustment. For purchases, we want to know who the supplier is. For sales, we want to know who the customer is. That's it. So, w/c of the following is the best approach:

(Let's leave out the other master data like Supplier/Customer. I think u can get the point)

SOLUTION 1: (each transaction in their own table)

Purchase (table)
PurchaseID,Date,Quantity,SupplierID
1,6/18/2007,10,A
2,6/19/2007,5,B
3,6/20/2007,1,C

Sale (table)
SaleID,Date,Quantity,CustomerID
1,6/18/2007,2,D
2,6/19/2007,1,E
3,6/20/2007,3,F

Adjustment (table)
AdjustmentID,Date,Quantity
1,6/20/2007,-1 (1 unit was lost)


SOLUTION 2: (common fields in 1 table. OO approach)

StockTransaction (table)
StockTransactionID,Type,Date,Quantity
1,Purchase,6/18/2007,10
2,Purchase,6/19/2007,5
3,Purchase,6/20/2007,1
5,Sale,6/18/2007,-2
6,Sale,6/19/2007,-1
7,Sale,6/20/2007,-3
8,Adjustment,6/20/2007,-1

Purchase (table)
PurchaseID,StockTransactionID,SupplierID
1,1,A
2,2,B
3,3,C

Sale (table)
SaleID,StockTransactionID,CustomerID
1,5,D
2,6,E
3,7,F

Adjustment (table)
AdjustmentID,StockTransactionID
1,8
(NOTE: This table could be eliminated)

END

So Solution 1 or 2? My thoughts on both solutions:

1. Both are normalized. It's just a matter of different object orientated hierarchy design.
2. Solution 2 makes it easier to get the total quantity on hand. Solution 1 would require unions to get the total quantity on hand or other reports involving quantities. Solution 2 also makes it a lot easier to combine a summary of all transactions.
3. Solution 2 can still be denormalized if necessary for performance reasons.
4. Solution 2 can have a problem where later, business rules may change and what you thought was a common field is no longer a common field. And you might have to resort to Nullable fields (w/c may cause denormalization because they are not dependent on the primary key--they are dependent on the Type). And it might not be easy to transfer the common field to it's descendants to make it uncommon. Solution 1 suffers less from this.

i dont see what fields would change that would affect this structure. you would never change primary key or foreign keys throughout a db after it has been designed (unless it has been designed badly) anyway.

decision 2 is far more efficient

The best design for any database is normalization

In this case yes, normalisation is correct. Your statement though is not entirely true; especially when datawarehousing and complex searches come into play.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.