Hi, folks. Need some help to develop a method/function. No problem if there is no code but I would like to start discussing the best way of doing this even in high level language. First of all would like to provide you with some information. The set that I use here is very small, but the data is on a table in mysql and there will be over a million rows. Its important because if I make a complex loop code, maybe there will be a lot of selects in there. The problem:

I have a set of numbers, all in a table called res_sup (code,value,date,type):

ROW1 = N1 | 100 | 06/16/2011 | R
ROW2 = N1 | 150 | 06/15/2011 | R
ROW3 = N1 | 170 | 06/14/2011 | R
ROW4 = N1 | 200 | 06/13/2011 | R
and so on...

The result that I need: save in a table called res_sup_per (code,date1,date2,value1,value2,perc,type) the following results:

ROW1 = N1 | 06/16/2011 | 06/15/2011 | 100 | 150 | 50% | R
ROW2 = N1 | 06/16/2011 | 06/14/2011 | 100 | 170 | 70% | R
ROW3 = N1 | 06/16/2011 | 06/13/2011 | 100 | 200 | 100% | R
ROW4 = N1 | 06/15/2011 | 06/14/2011 | 150 | 170 | 13% | R
ROW5 = N1 | 06/15/2011 | 06/13/2011 | 150 | 200 | 33% | R
ROW6 = N1 | 06/14/2011 | 06/13/2011 | 170 | 200 | 18% | R

feel free to coment and give your ideas. tks

It looks like you are comparing each row to every other row. For 1,000,000 records this will result in 499,999,500,000 rows (that's almost 500 billion) and comparisons. That's a lot of time required there.

Also, at a minimum size of 11 bytes per record (you don't need to store the percent, it's a calculated field) you'll be using around 6TB for the database.

yeah! thats why I post it here. To find a way. Any idea ? I could for example, calculate de % in run time, but I think it would go for the same problem. Imagine I have 900 codes. each user passing one or any codes to find the percentage ?

It is almost impossible to reduce physical space that you need to store those result unless you change how your program works.

Your problem can be written within a single SQL query

INSERT INTO res_sup_per 
SELECT res1.code,res1.`date`,res2.`date` ,
FROM res_sup AS res1 INNER JOIN res_sup AS res2 
WHERE res2.value > res1.value;

There is a way to improve your program performance by simply add new result as you add new code.

this worked just fine. but as said before, it will be impossible to store all data... i wonder if there is a another way of doing that... I will have thousands of clients requesting for a specific percentage. for instance, i wanna know which dates have 40% of difference between the values. but I thought that calculating them at the end of the day it would be faster. now i face the storage problem... any ideas ?

I'm getting outside my area of expertise here, so this could be completely rubish...

Would handling this data (code, date, value) as dimensions of a cube solve this need?

it could work but the problem is the client side... how it would work in a simple client/server instalation ?

You could query the cube and get the results as you would from the table you were planning to create.
The fact that you are working with a cube doesn't mean that you can't show the data on the client. It's just that you'll have to work "harder" on the server side.

and which free tool would you recomend ?

hi invisal, I finnaly could try your suggestion, but i'm getting an error. its saying that im having duplicated values for primary key. can you imagine what could be wrong ? thank you in advanced

desc table res_sup:

field type null key default

symbol varchar(50) NO PRI
value decimal(30,5) YES
date date NO PRI
type char(1) YES

desc table res_sup_perc

field type null key default
symbol varchar(50) NO PRI
date1 date NO PRI 0000-00-00
date2 date NO PRI 0000-00-00
value1 decimal(30,5) YES
value2 decimal(30,5) YES
per decimal(30,5) YES
type char(1) YES

query im using:

SELECT res1.symbol,res1.`date`,res2.`date`,
WHERE res2.value > res1.value;

error im getting:

Error Code: 1062. Duplicate entry '$PTAX-2002-10-23-2002-10-22' for key 'PRIMARY'