I'm designing a database for a company that retails computer products, phones, faxes and so on.
As you can imagine, there are a number of different types of products, eg. hard drives, CPUs, routers, optical drives and so on.
One of the requirements is that for each category there must be a list of features, eg. Hard drives would have an int value stored under a column named capacity, a real value for seek time and so on. Unfortunately, it is not possible to store these values as strings since the user should be able to do a category specific search such as "find me all hard drives with a capacity greater/less than x".
Firstly, I have already created a products table which stores details for all products such as brand, cost, selling price. This is in turn linked to a category table such as hard drives.
I have come up with two possible solutions:
1) Create tables for each category that has unique features. So I would need to create a table for hard drives with columns such as capacity, seek time, number of platters, interface and so on.
The problem with this approach is maintainability: a DBA is required in order to add/edit/delete category specific features. The dev time for such an approach will also be high.
2) Create a table named CategoryFeatures which stores the following data:
The first two columns are self-explanatory. The third column stores the name of the feature, ie. Capacity, the 4th stores a default value for this particular category feature, ValueType stores the type of value stored for the particular feature (such as Int or Real) and rank is an ordinal which allows ordering and reordering of category features.
Create a table named ProductCategoryFeatures which stores the following information:
This particular table simply stores the values as sql_variants. For example, the Seagate ST3400014a would have a capacity of 40 so the Product ID would correspond to this product, the CategoryFeatureID would correspond to the Capacity CategoryFeature for the Category of Hard Drives.
The obvious problem here is loose typing. Perhaps the best way to illustrate is by example. App calls sp_getProduct with ID corresponding to ST340014a. The following two datasets are returned:
Product <ID, CatID, Name,...>
ProductCategoryFeatures <CategoryFeatureID, FeatureName, FeatureValue, ValueType, Rank>
The problem here is data type fidelity. Conversion is required but not suitable to be done at the database level. Where should the conversion be done? Can it be done at the DAL level? Perhaps in the business logic layer.
Fortunately, sql server stores metadata with each sql_variant datatype. Consider the following search query submitted by client app: "select all hard drives where capacity is > x). Now since all of the underlying values of the sql_variants compared in this query are of type int there should not be a problem (quick test should verify).
I'm not sure how indices are ordered for columns of type sql_variant, however; order by clauses seem to return sql_variants grouped by type. Perhaps additional overhead will be manageable; I don't forsee any situations in which I will compare sql_variant values with different underlying basetypes.
I'm definitely inclined to opt for the second option... I don't want to be repeatedly contacted to update this category or that category. It really is something the client should be able to do themselves.
In any case, if you have any advice or suggestions I would really appreciate it.