I have a table called "mytable". The columns are

Time_Stamp (datetime) PK
Time_stamp_ms (int) PK
data1 (int)
data2 (int)
data3 (int)
data4 (int) 
data5 (int)
data6 (int)
cycle (int)
name (varstring)

I want to order by Time_Stamp and Time_stamp_ms and then each time cycle reaches 1, I want to get the Time_Stamp and Time_Stamp_ms from the previous row. Cycle is 1,2,3,4......n Means it will always increment by 1.

This table will problably have millions and millions of rows.

Also no PHP.

There is a sample of my table:

Time_Stamp              Time_Stamp_ms   d1      d2      d3      d4      d5      d6      cycle   name    

2014-04-24 09:09:37         765         5555    4444    3333    2222    1111    123     1       name
2014-04-24 09:09:37         845         5555    4444    3333    2222    1111    123     2       name
2014-04-24 09:09:37         925         5555    4444    3333    2222    1111    123     3       name
2014-04-24 09:09:38         5           5555    4444    3333    2222    1111    123     4       name
2014-04-24 09:09:38         85          5555    4444    3333    2222    1111    123     5       name
2014-04-24 09:09:38         165         5555    4444    3333    2222    1111    123     6       name
2014-04-24 09:09:38         245         5555    4444    3333    2222    1111    123     7       name
2014-04-24 09:09:38         325         5555    4444    3333    2222    1111    123     8       name
2014-04-24 09:09:38         405         5555    4444    3333    2222    1111    123     9       name
2014-04-24 09:09:38         485         5555    4444    3333    2222    1111    123     10      name
2014-04-24 09:09:38         565         5555    4444    3333    2222    1111    123     11      name
2014-04-24 09:09:38         645         5555    4444    3333    2222    1111    123     12      name
2014-04-24 09:09:38         725         5555    4444    3333    2222    1111    123     13      name
2014-04-24 09:09:38         805         5555    4444    3333    2222    1111    123     1       name
2014-04-24 09:09:38         885         5555    4444    3333    2222    1111    123     2       name
2014-04-24 09:09:38         965         5555    4444    3333    2222    1111    123     3       name
2014-04-24 09:09:39         45          5555    4444    3333    2222    1111    123     4       name
2014-04-24 09:09:39         125         5555    4444    3333    2222    1111    123     5       name
2014-04-24 09:09:39         205         5555    4444    3333    2222    1111    123     6       name
2014-04-24 09:09:39         285         5555    4444    3333    2222    1111    123     1       name
2014-04-24 09:09:39         365         5555    4444    3333    2222    1111    123     2       name
2014-04-24 09:09:39         445         5555    4444    3333    2222    1111    123     3       name
2014-04-24 09:09:39         525         5555    4444    3333    2222    1111    123     4       name
2014-04-24 09:09:39         605         5555    4444    3333    2222    1111    123     5       name
2014-04-24 09:09:39         685         5555    4444    3333    2222    1111    123     6       name
2014-04-24 09:09:39         765         5555    4444    3333    2222    1111    123     1       name
2014-04-24 09:09:39         845         5555    4444    3333    2222    1111    123     2       name
2014-04-24 09:09:39         925         5555    4444    3333    2222    1111    123     3       name

Should return me:

    Time_Stamp              Time_Stamp_ms   d1      d2      d3      d4      d5      d6      cycle   name    


2014-04-24 09:09:38         725         5555    4444    3333    2222    1111    123     13      name
2014-04-24 09:09:39         205         5555    4444    3333    2222    1111    123     6       name
2014-04-24 09:09:39         685         5555    4444    3333    2222    1111    123     6       name

How do I do this?

Recommended Answers

All 45 Replies

Member Avatar for diafol

You seem to be returning the last row of each cycle set is that right?

You seem to be returning the last row of each cycle set is that right?

Each cycle ends with the row BEFORE the 1. 1 starts a cycle set.

Member Avatar for diafol

I think you'll have to go through the whole set, testing as you go along, using SQL variables to hold data from the previous row. I'll have to think about that. There are other ways I'm sure, but I would imagine that whichever method is chosen, it will be quite slow, especially on large tables without indexed columns.

Member Avatar for diafol
(SELECT * FROM times WHERE id IN (SELECT onerows
FROM 
(SELECT (@rownum:=@rownum+1) -1 AS onerows, a.* 
FROM times AS a, (SELECT @rownum:=0) r) AS t
WHERE cycle = 1)) UNION (SELECT * FROM times ORDER BY Time_Stamp DESC, Time_Stamp_ms DESC, cycle DESC LIMIT 1)

That seems to work for me, for a table called 'times'

SOrry, may not work - I added an autoinc 'id' PK field. I'll try again

commented: Three thumbs up. +12

Thank you. Seems like a very complicated query indeed....

Member Avatar for diafol

OK - this to get the ROW NUMBERS required, but now to retrieve those without having an 'id' PK is the issue. The following is a bit odd as MySQL does not allow an 'offset' without a 'limit'. The only advice (official MySQL) I could see was to use the BIGINT value - really ugly!

(SELECT onerows FROM 
        (SELECT (@rownum:=@rownum+1) -1 AS onerows, a.* FROM times AS a, (SELECT @rownum:=0) r) AS t
        WHERE cycle = 1) UNION (SELECT COUNT(*) AS onerows FROM times) LIMIT 1, 18446744073709551615

So that will give you the row numbers...13,19,25,28

Just a question of retrieving those row records - single value easy enough with a LIMIT clause, but this is giving me a nosebleed. Heh heh. I'm sure it's something really simple - but "wood and trees" at the mo.

//EDIT

Maybe use a cursor loop ?

Member Avatar for diafol

OK - got it to work!!

SET @i=0;
SELECT parent.* FROM (SELECT *, @i:=@i+1 AS i FROM times) AS parent INNER JOIN 
((SELECT onerows FROM 
        (SELECT (@rownum:=@rownum+1) -1 AS onerows, a.* FROM times AS a, (SELECT @rownum:=0) r) AS t
        WHERE cycle = 1) UNION (SELECT COUNT(*) AS onecount FROM times) LIMIT 1, 18446744073709551615)
AS gubbins ON gubbins.onerows = parent.id

Change the parent.* to specific fields you need as the calculated 'i' field is also displayed.

You know, it'd be sooo much easier if you'd just have an autoincrement PK instead of the datetime/ms as this solution basically replicates it anyway - but has to do it every time you run the query, which is pretty pointless.

I would have to do two queries there and Im not sure I can (I can do two but Im not sure if that variable would be persistant)

BTW, Im kinda of confused as you change field and table names.

Im going to try to apply it.

Yeah, tried and it has confused me. Could you please change the names? Thank you

Member Avatar for diafol

The only tablename you need to change is 'times'.

Why the constraint of one query? It's not homework is it? :)

Member Avatar for diafol

Scratch the last one, I realised the 'id' field crept in again. I gotta delete that. :(

//EDIT

Here it is:

SET @i=0;
SELECT parent.* FROM (SELECT *, @i:=@i+1 AS i FROM times) AS parent INNER JOIN 

((SELECT onerows FROM 
        (SELECT (@rownum:=@rownum+1) -1 AS onerows, a.* FROM times AS a, (SELECT @rownum:=0) r) AS t
        WHERE cycle = 1) UNION (SELECT COUNT(*) AS onecount FROM times) LIMIT 1, 18446744073709551615)
AS gubbins

ON gubbins.onerows = parent.i

Ok, if you really must run one query, perhaps calling a stored procedure may be better, however from what I remember, no subqueries allowed - so would be a majorly different to this. My ability with SP is severely limited :(

The "Times" table has a column named Time_Stamp and Time_Stamp_ms

Thats why Im not sure about "onerows" :S

Offtopic but changing a table to INNODB that had 2842501 records took 18 HOURS 38 minutes and 31.44 seconds. Biggest query Ive EVER done.

Member Avatar for diafol

'onerows' is just a name for the intermediate rownumbers equal to the row before the row where cycle=1, which ultimately show up in the 'i' column.

Out of interest - do you need all the 'lead-up' data? Could you not just store the last record of each set? Should decrease your record numbers by about 10-fold possibly?

'onerows' is just a name for the intermediate rownumbers equal to the row before the row where cycle=1, which ultimately show up in the 'i' column. Out of interest - do you need all the 'lead-up' data? Could you not just store the last record of each set? Should decrease your record numbers by about 10-fold possibly?

So "onerows" is a variable?

And yes, I need all the "lead-up" data.

Im gonna test this out now. Im still not completely sure if I can do it with that @i variable. Is it possible to do it without a variable?

Um:

Error Code: 126. Incorrect key file for table '/tmp/#sql_141c_1.MYI'; try to repair it

What does this even mean?

Member Avatar for diafol

So "onerows" is a variable?

Well it's a calculated field based on a variable.

s it possible to do it without a variable?

Possibly, but I can't think how. Is this a problem for you? Have you been told that you're not allowed to use them?

Error Code: 126. Incorrect key file for table '/tmp/#sql_141c_1.MYI'; try to repair it

Did you search? Seems an issue after big MyISAM->InnoDB

Possibly, but I can't think how. Is this a problem for you? Have you been told that you're not allowed to use them?

The program Im using to make a MySQL query wont allow them. I have some code below (from exterior help) that might make it easier...

Did you search? Seems an issue after big MyISAM->InnoDB

OK, so I should backup, destroy and remake the table but this time with InnoDB (I plan to use InnoDB)?

Here is the new code:

I had to add a additional field (decimal) to the table called "Time_Stamp_Full" . It takes the Time_Stamp field (datetime) converts it to UNIX_TIME and then the Time_Stamp_ms (int) get added to it.

So the query is this:

SELECT z.*
FROM
(
    SELECT a.time_stamp_full, MAX(b.time_stamp_full) AS latest_prev_record
    FROM mytable a
    INNER JOIN mytable b
    ON a.time_stamp_full > b.time_stamp_full
    WHERE a.cycle = 1
    GROUP BY a.time_stamp_full
) Sub1
INNER JOIN mytable z
ON z.time_stamp_full = Sub1.latest_prev_record

Opinions? The problem is that the query seems to also take forever and last time I ran it, it gave me a lost connection to mysql server during query...

Member Avatar for diafol

Joins create temp tables, so I'm assuming that you need to supply more memory to their creation.

The program Im using to make a MySQL query wont allow them.

My queries run OK on phpMyAdmin and SQLyog, so perhaps you should think about getting a better GUI. Or are you talking about something else?

Joins create temp tables, so I'm assuming that you need to supply more memory to their creation.

My queries run OK on phpMyAdmin and SQLyog, so perhaps you should think about getting a better GUI. Or are you talking about something else?

How can I supply more memory?

I have the following:

'innodb_additional_mem_pool_size', '1048576'
'innodb_buffer_pool_size', '268435456'
'innodb_log_buffer_size', '1048576'
'innodb_log_file_size', '5242880'

Its not a GUI based program. Something else.

Another option would be this:

    Time_Stamp              Time_Stamp_ms   d1      d2      d3      d4      d5      d6      cycle   name    cycle_id

    2014-04-24 09:09:37         765         5555    4444    3333    2222    1111    123     1       name        1
    2014-04-24 09:09:37         845         5555    4444    3333    2222    1111    123     2       name        1
    2014-04-24 09:09:37         925         5555    4444    3333    2222    1111    123     3       name        1
    2014-04-24 09:09:38         5           5555    4444    3333    2222    1111    123     4       name        1
    2014-04-24 09:09:38         85          5555    4444    3333    2222    1111    123     5       name        1
    2014-04-24 09:09:38         165         5555    4444    3333    2222    1111    123     6       name        1
    2014-04-24 09:09:38         245         5555    4444    3333    2222    1111    123     7       name        1
    2014-04-24 09:09:38         325         5555    4444    3333    2222    1111    123     8       name        1
    2014-04-24 09:09:38         405         5555    4444    3333    2222    1111    123     9       name        1
    2014-04-24 09:09:38         485         5555    4444    3333    2222    1111    123     10      name        1
    2014-04-24 09:09:38         565         5555    4444    3333    2222    1111    123     11      name        1
    2014-04-24 09:09:38         645         5555    4444    3333    2222    1111    123     12      name        1
    2014-04-24 09:09:38         725         5555    4444    3333    2222    1111    123     13      name        1
    2014-04-24 09:09:38         805         5555    4444    3333    2222    1111    123     1       name        2
    2014-04-24 09:09:38         885         5555    4444    3333    2222    1111    123     2       name        2
    2014-04-24 09:09:38         965         5555    4444    3333    2222    1111    123     3       name        2
    2014-04-24 09:09:39         45          5555    4444    3333    2222    1111    123     4       name        2
    2014-04-24 09:09:39         125         5555    4444    3333    2222    1111    123     5       name        2
    2014-04-24 09:09:39         205         5555    4444    3333    2222    1111    123     6       name        2
    2014-04-24 09:09:39         285         5555    4444    3333    2222    1111    123     1       name        3
    2014-04-24 09:09:39         365         5555    4444    3333    2222    1111    123     2       name        3
    2014-04-24 09:09:39         445         5555    4444    3333    2222    1111    123     3       name        3
    2014-04-24 09:09:39         525         5555    4444    3333    2222    1111    123     4       name        3
    2014-04-24 09:09:39         605         5555    4444    3333    2222    1111    123     5       name        3
    2014-04-24 09:09:39         685         5555    4444    3333    2222    1111    123     6       name        3
    2014-04-24 09:09:39         765         5555    4444    3333    2222    1111    123     1       name        4
    2014-04-24 09:09:39         845         5555    4444    3333    2222    1111    123     2       name        4
    2014-04-24 09:09:39         925         5555    4444    3333    2222    1111    123     3       name        4

Would this make things easier? If so, how can I fill it up right now?

update table set cycle_id= ???

So it can fill up the table this way

Member Avatar for diafol

I suggested an 'id' PK column at the start and for me, that would be the easiest option. WHat I fail to understand is why you're giving the set the same cycle_id - when adding to the db, you'd have to run a query to check the max value of cycle_id before inserting. I suggest just an autoincrement field long as it is sequential - no missing values, this should work...

SELECT times.* FROM times INNER JOIN 
((SELECT * FROM (SELECT id - 1 AS tid FROM times WHERE cycle = 1 AND id > 1) AS dt2) 
UNION 
(SELECT id AS tid FROM times ORDER BY id DESC LIMIT 1)) AS dt3
ON dt3.tid = times.id

Just add a PK int field called id to the table. I've called the datatable 'times' - the rest of the terms are just aliases and names for derived tables.

Its not a GUI based program. Something else.

Nothing like being secretive.

Having said that, if you do use the cycle_id as you suggest, then GROUP BY functions will allow a simple return. Have fun with the update sql :)

Memory: some info here - http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/
But that's just a Google search - you can do that yourself.

I suggested an 'id' PK column at the start and for me, that would be the easiest option. WHat I fail to understand is why you're giving the set the same cycle_id - when adding to the db, you'd have to run a query to check the max value of cycle_id before inserting. I suggest just an autoincrement field long as it is sequential - no missing values, this should work...

I might not understand you correctly.....

In the case Im purposing I have two columns: "cycle" and "cycle_id". "cycle" is going to increment each time I take a sample (every 20 ms) of a complete cycle. "cycle_id" just makes sure those samples are of the same cycle.

Just add a PK int field called id to the table. I've called the datatable 'times' - the rest of the terms are just aliases and names for derived tables

Right now, just to clarify, I have 3 PKs: "Time_Stamp" which is a DATETIME, "Time_Stamp_ms" which is a INT, and "Time_Stamp_full" which is a DECIMAL.

Memory: some info here - http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/
But that's just a Google search - you can do that yourself.

Im doing this off a simple NAS such as a Synology.

I dont think maybe Im describing the problem correctly because I think Im overcomplicating the problem.

Is there a way to find a 1 in the cycle column and just get the row before it? Yes, I know they are not ordered, Im just asking if there is a way to do this out of curiosity...

Member Avatar for diafol

I'm confused as to why you're using 3 PK cols - one of which is just a combination of the other two.

Is there a way to find a 1 in the cycle column and just get the row before it? Yes, I know they are not ordered, Im just asking if there is a way to do this out of curiosity...

That's what I've been showing you! In all the examples, I've done this...

WHERE cycle = 1

I'm simply suggesting that you use an autoincrement 'id' field - not cycle_id as your PK. If this field has no missing values in its sequence, i.e. nothing like 3,4,5,8,9,10 then my latest sql should serve you well.

The way it works is that

SELECT times.* FROM times INNER JOIN 
((SELECT * FROM (SELECT id - 1 AS tid FROM times WHERE cycle = 1 AND id > 1) AS dt2) 
UNION 
(SELECT id AS tid FROM times ORDER BY id DESC LIMIT 1)) AS dt3
ON dt3.tid = times.id

You select the 'id' field - 1 (the previous id) to where cycle value = 1.
But you forget the first one, because that's the first row and you can't have / don't need row=0 - therefore the id >1 in the WHERE clause.
The selecting of previous values does not include the last value in the table, so for that we need to extract the very last value and then join these via UNION, to create a single derived table (dt3).

A simple Inner Join then extracts records from the main table using the id (tid) values from dt3. Hope that makes sense.

Your other option of using cycle_id also has merit, and may be better.

That should be able to produce...

SELECT times.* FROM times 
    INNER JOIN (SELECT *, MAX(cycle) AS maxcycle FROM times 
    GROUP BY cycle_id) AS grouped 
    ON times.cycle_id = grouped.cycle_id 
        AND times.cycle = grouped.maxcycle

Which also works for me. So for that one, if you can update your table to have cycle_id as you mentioned, then the above should be fine.

Member Avatar for diafol

In order to set up your cycle_id values, you can do this...

SET @cycle_id:=0;
UPDATE times SET cycle_id = @cycle_id := IF(cycle = 1,@cycle_id+1,@cycle_id)

I know you don't like variables, but that's just to update your table. Do it in phpmyadmin or something similar, if your programme won't allow them.

Well, how do I currently fill up "cycle_id" with 1,1,1,1,1,1,1,1,1;2,2,2,2,2,2,2,2,2,2,2,2;3,3,3,3;4,4,4,4,4

In my current table data?

Also by a autoincremental, do you mean that each time I enter a new row, I increment a AI column? I guess I COULD use a BIGINT to do this....I dont ever thing Ill reach 9 223 372 036 854 775 807 rows. How do I fill it up for my current data?

BTW, I just want to explore all options right now and see which is easiest (also returns a answer quickly so we can implement them all)

The selecting of previous values does not include the last value in the table

Not sure what you ment by that.

Thank you for all your help. I would need to know how to currently fill up cycle_id with 1,1,1,1,1,1,1,1,1;2,2,2,2,2,2,2,2,2,2,2,2;3,3,3,3;4,4,4,4,4 to test it out.

Member Avatar for diafol

Well, how do I currently fill up "cycle_id" with 1,1,1,1,1,1,1,1,1;2,2,2,2,2,2,2,2,2,2,2,2;3,3,3,3;4,4,4,4,4

I just posted the SQL you need to do that!

SET @cycle_id:=0;
UPDATE times SET cycle_id = @cycle_id := IF(cycle = 1,@cycle_id+1,@cycle_id)

OK, let's get this straight.

Options:

1) use the multiple SQL queries using variables
2) Add an autoincrement id column
3) Add a cycle_id column

Let's not confuse the three different approaches.

The 1) option doesn't seem to fly as you don't like variables.
The 2) option is the one I suggested using the UNION
The 3) option is the one YOU suggest with the cycle_id for each set of data

I think I've covered all three sufficiently so that any of them should work.

The selecting of previous values does not include the last value in the table
Not sure what you ment by that.

Simply that in the dataset sample you provided, row 28 should be returned too, but because we're returning 'previous' values to rows where cycle = 1, the last value in the table can't be retrieved by this, hence the requirement to have the last row chosen separately.

WRT option3 - you may need an autoincrement id field anyway, as there's no real way to order your data before running the update as timestamps could overlap and therfore sets are all over the place.
Once you've updated, then order can be set via cycle_id. But as you can appreciate, this may be a problem for you.

Member Avatar for diafol

Here's a screenshot of the table, which is now ordered by PK Full timestamp - messed up

d1c8a906375092ea0b1687d35ec2b446

if your data is anything like this, I don't think you can do anything with it. Perhaps start again?

Didnt see that SQL sorry...

Running it now. Its gonna take a while to update 5 million records.

Option 1 is not possible.
Option 2 is possible but using a BIGINT to list all the records in a incremental mannner...Im not a fan. Also, wouldnt this pose a small problem if I delete a record?
Option 3 is IMO the one I most like. I think I perfer to go down that route as at the end of the day, I think it is VERY likely that the final production table will have a format exactly like this.

So lets go with option 3. Its fill up that column as we speak so....Lets see how long it takes to update all 3 million plus of those records.

Done. Took 37 minutes :P

Trying now:

1.SELECT times.* FROM times 
2.    INNER JOIN (SELECT *, MAX(cycle) AS maxcycle FROM times 
3.    GROUP BY cycle_id) AS grouped 
4.    ON times.cycle_id = grouped.cycle_id 
5.        AND times.cycle = grouped.maxcycle
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.