Dear friends,
I have a set of files, each created at a certain time. I would like to group them so that a new group starts whenever the difference between two consecutive times is more than 1 minute.
For example:

A 10:01:25
B 10:01:29
C 10:01:52
D 10:01:58
E 10:02:12
F 10:02:22
G 10:11:01
H 10:11:18
I 10:30:00

The above should result in
Group1
A 10:01:25
B 10:01:29
C 10:01:52
D 10:01:58
E 10:02:12
F 10:02:22

Group2
G 10:11:01
H 10:11:18

Group3
I 10:30:00

This is because the differences between A and B, B and C, ..., E and F are each less than 1 minute. F and G are about 9 minutes apart, so a new group has to start there.
That is, each item should be compared with the next one in alphabetical order (the values are already provided in alphabetical order).
What would be the best approach?
Thanks

Last Post by Gribouillis

You need to tag each time with a group number and then group on that number, for example:

import datetime as dt
import itertools as itt
import operator as op

def todatetime(string):
    return dt.datetime.strptime(string, '%H:%M:%S')

def enhanced_sequence(seq):
    """yields pairs (group number, time string)

    Arguments:
        seq: sequence of ordered time string 'HH:MM:SS'
    """
    seq = iter(seq)
    prev = next(seq)
    dtprev = todatetime(prev)
    group = 0
    yield (group, prev)
    for s in seq:
        dts = todatetime(s)
        if (dts - dtprev).total_seconds()/60.0 >= 1:
            group += 1
        yield (group, s)
        prev, dtprev = s, dts

def time_groups(seq):
    """Yield list of grouped time strings

    Arguments:
        seq: sequence of ordered time string 'HH:MM:SS'
    """
    for key, g in itt.groupby(enhanced_sequence(seq), key=op.itemgetter(0)):
        yield [x[1] for x in g]
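
The functions above work on bare time strings, while the data in the question also carries a name for each file. As a rough sketch only (the name `group_by_gap`, its `max_gap` parameter and the `records` list are made up for illustration, and it assumes all times fall on the same day and are already sorted), the same idea can be applied to (name, time) pairs:

```python
import datetime as dt

def group_by_gap(records, max_gap=dt.timedelta(minutes=1)):
    """Split (label, 'HH:MM:SS') pairs into lists, starting a new list
    whenever the gap between consecutive times is max_gap or more.

    Hypothetical helper: assumes records are sorted and on the same day.
    """
    groups = []
    prev_time = None
    for label, stamp in records:
        current = dt.datetime.strptime(stamp, '%H:%M:%S')
        # Start a new group on the first record or after a long gap.
        if prev_time is None or current - prev_time >= max_gap:
            groups.append([])
        groups[-1].append((label, stamp))
        prev_time = current
    return groups

# Sample data copied from the question.
records = [
    ('A', '10:01:25'), ('B', '10:01:29'), ('C', '10:01:52'),
    ('D', '10:01:58'), ('E', '10:02:12'), ('F', '10:02:22'),
    ('G', '10:11:01'), ('H', '10:11:18'), ('I', '10:30:00'),
]

for number, group in enumerate(group_by_gap(records), start=1):
    print('Group%d' % number)
    for label, stamp in group:
        print(label, stamp)
    print()
```

With the sample data this puts A to F in the first group, G and H in the second, and I alone in the third. The comparison uses `>=` to match the code above; change it to `>` if a gap of exactly one minute should stay in the same group.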

Edit: bugfix

Edited by Gribouillis
