Are you allowing the logical expressions XOR and NOT to be included in your model?
No, since the expressions translate to SQL statements (the WHERE clause to be more specific), I'm sticking to the usual AND and OR operators for the moment.
Will there ever be more than 3 variables (A,B,C)?
Yes. In the current model there is no limit to the number of AND rules, and therefore no limit to the number of variables.
I'm not clear on your use of A=A' (A prime). Can you explain a little more?
I use this notation to illustrate in a generic form the two operands of an equivalence test. In practice, I could have said Field = Value, since...
A, B, C... are field names, drawn from the roughly one thousand fields we are mining. For example, if I have a table X in my database with fields X1, X2 and X3, variable A would represent one of those fields (X.X1), B another field (X.X3), and C could be Y.Y65 in a different table. Right now I'm mining about 1000 fields spread across 100 tables.
A', B', C'... are values we are testing against the selected fields. For example, if X1's type is numerical, the expression A = A' could translate to X.X1 = 10. This is how it would be stored in the AndRule table:
A      | Operator | A'
-------------------------------
X.X1   | =        | 10
X.X3   | >=       | 10 Feb 1971
Y.Y65  | =        | 'apple'
You'll notice that I can test against numerical values, dates and strings. Of course, to do this I must track each and every field in the database and know its data type.
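To make the role of the data type concrete, here is a minimal Python sketch of how one AndRule row might be turned into a WHERE fragment. It is only an illustration under assumptions: the FIELD_TYPES catalog, the function names, the list of allowed operators and the "10 Feb 1971" date format are all hypothetical, and string values are assumed to be stored without their surrounding quotes. Real code would probably pass the values as bound parameters rather than building literals.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical catalog: one entry per tracked field, giving its data type.
FIELD_TYPES = {"X.X1": "number", "X.X3": "date", "Y.Y65": "string"}

# Comparison operators this sketch accepts (an assumption, not a full list).
ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">="}


def render_value(field, raw):
    """Format the stored value A' as a SQL literal based on the field's type."""
    kind = FIELD_TYPES[field]  # KeyError if the field is not tracked
    if kind == "number":
        number = Decimal(raw)  # raises if raw is not numeric
        if not number.is_finite():
            raise ValueError(f"not a finite number: {raw!r}")
        return str(number)
    if kind == "date":
        # Assumes values are stored like "10 Feb 1971" (English month names).
        parsed = datetime.strptime(raw, "%d %b %Y")
        return "'" + parsed.strftime("%Y-%m-%d") + "'"
    # Strings: double any embedded single quote, then wrap in quotes.
    return "'" + raw.replace("'", "''") + "'"


def render_rule(field, operator, raw):
    """Turn one AndRule row (A, Operator, A') into a WHERE fragment."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return f"{field} {operator} {render_value(field, raw)}"


rows = [("X.X1", "=", "10"), ("X.X3", ">=", "10 Feb 1971"), ("Y.Y65", "=", "apple")]
where_clause = " AND ".join(render_rule(*row) for row in rows)
```

With the three rows above, where_clause joins the fragments with AND, which matches the idea of a group of AND rules. How a date literal should be written depends on the database, so the quoted ISO form used here is just one possibility.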
So, there are 3 things to track (logical operators, variables, equivalency operators). In my mind this indicates 3 tables.
I would add a fourth dimension to track: the precedence set by the use of parentheses. These two expressions are not logically identical:
(A AND B) OR C
A AND (B OR C)
To me, that is the most difficult part and the main reason why I'm looking for a shortcut.
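One possible way to keep track of the parentheses, offered only as a sketch and not as the model I actually use, is to store the expression as a tree: each leaf stands for one AndRule row and each group carries a single AND or OR. The Leaf and Group classes and the two helper functions below are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    name: str  # stands for one AndRule row, e.g. "A" for X.X1 = 10


@dataclass(frozen=True)
class Group:
    operator: str  # "AND" or "OR"
    children: Tuple["Node", ...]


Node = Union[Leaf, Group]


def to_sql(node):
    """Render the tree, wrapping every group in parentheses."""
    if isinstance(node, Leaf):
        return node.name
    joiner = f" {node.operator} "
    return "(" + joiner.join(to_sql(child) for child in node.children) + ")"


def evaluate(node, values):
    """Evaluate the tree against a dict of truth values, one per leaf name."""
    if isinstance(node, Leaf):
        return values[node.name]
    results = [evaluate(child, values) for child in node.children]
    return all(results) if node.operator == "AND" else any(results)


A, B, C = Leaf("A"), Leaf("B"), Leaf("C")
first = Group("OR", (Group("AND", (A, B)), C))    # (A AND B) OR C
second = Group("AND", (A, Group("OR", (B, C))))   # A AND (B OR C)

# With A and B false and C true, the two expressions disagree.
values = {"A": False, "B": False, "C": True}
print(to_sql(first), evaluate(first, values))
print(to_sql(second), evaluate(second, values))
```

Because every group is rendered inside its own parentheses, the nesting of the tree is what decides precedence, and no separate precedence table is needed. In a database, the same shape could presumably be stored as group rows that point to a parent group, but that mapping is an assumption on top of this sketch.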
Thanks for the help and the book reference, I'll look it up.