Hi, I was hoping someone could help out a Python novice like me.

I want to find where my data has three or more 1s in a row. A sample of my input data is below. The first number in each column is a position, and the rest of the numbers are either not significant (0) or significant (1).

78	79	80	81	82	83
0	0	0	0	0	0
1	0	0	1	1	0
0	0	1	1	1	0
0	1	1	1	1	1

This is my code so far:

def sigfind(in_file,num_repeats):
    numbers = []
    for line in in_file:
        numbers = line.split()
        i = 0
        while i < len(numbers)-1:
            if [numbers[i]] == [numbers[i+1]] == num_repeats:
                i += num_repeats
            i += 1

But I am having trouble finding the consecutive numbers as well as outputting them in the right format.

I want the output to look something like the example below. For every row that has three or more 1s in a row, it should give the row number (not counting the first row, which holds the positions), the position where the consecutive 1s started, and the position where they ended.


row  start_pos  end_pos
3    80         82
4    79         83

I hope this makes sense. Does anyone have any ideas?

Last Post by griswolf

You want to start testing at the third element and check the prior two. What you have never finds a match: numbers holds strings, and the chained comparison ends by comparing a one-element list with num_repeats, which is always false. (Also, do not use "i", "l", or "O" as single-letter variable names, as they can look like numbers.)

## convert to integer
numbers = [int(x) for x in numbers]
for ctr in range(2, len(numbers)):
    # a run of three 1s ends at index ctr
    if numbers[ctr] == numbers[ctr-1] == numbers[ctr-2] == 1:
        # report the indexes, not the values (which are all 1)
        print("three 1s in a row at indexes %d, %d, %d" % (ctr-2, ctr-1, ctr))
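
Extending the same idea to whole rows, a minimal sketch could track where the current run of 1s began and report every run that reaches the threshold. The function name find_runs, its parameters, and the assumption that the first line holds the position labels are illustrative guesses based on the sample data, not something from the original post.

```python
def find_runs(lines, num_repeats=3):
    """Return (row, start_pos, end_pos) for each run of at least num_repeats 1s.

    Assumes the first line holds the position labels and every later
    line holds whitespace-separated 0/1 flags (layout inferred from
    the sample data in the question).
    """
    lines = iter(lines)
    positions = next(lines).split()
    results = []
    for row_number, line in enumerate(lines, start=1):
        numbers = [int(x) for x in line.split()]
        run_start = None  # index where the current run of 1s began
        # A trailing 0 sentinel closes a run that reaches the end of the row.
        for ctr, value in enumerate(numbers + [0]):
            if value == 1:
                if run_start is None:
                    run_start = ctr
            else:
                if run_start is not None and ctr - run_start >= num_repeats:
                    results.append(
                        (row_number, positions[run_start], positions[ctr - 1])
                    )
                run_start = None
    return results


sample = [
    "78\t79\t80\t81\t82\t83",
    "0\t0\t0\t0\t0\t0",
    "1\t0\t0\t1\t1\t0",
    "0\t0\t1\t1\t1\t0",
    "0\t1\t1\t1\t1\t1",
]
for row, start, end in find_runs(sample):
    print("%d\t%s\t%s" % (row, start, end))
```

An open file object can be passed in place of the sample list, since the function only iterates over lines.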

Edited by woooee: n/a


Consider using a pattern. What you want is to match pattern = "1\t1\t1" and then translate the index in the row into the column number.

Note that because of the tabs (and because every value is a single character),

column 1 is index 0
column 2 is index 2
column 3 is index 4
... column N is index 2*(N-1)

so your code does something like

pattern = "1\t1\t1"
for row in in_file:
    try:
        i = row.index(pattern)  # index of the first match only
        column = i // 2 + 1     # inverts index = 2*(N-1)
    except ValueError:
        pass # don't mind if there's no match

Figuring out how to find the column "name" from the column number is as easy as split()...
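
Putting those pieces together, a rough sketch of the pattern approach might look like the following. It assumes every flag is a single character separated by exactly one tab, it reports only the first run in each row, and the names first_run, header, and row are hypothetical rather than taken from the posts above.

```python
def first_run(header, row, num_repeats=3):
    """Return (start_pos, end_pos) of the first run of 1s, or None.

    Assumes single-character 0/1 flags separated by exactly one tab, so
    the field at string index i is column i // 2 (counting from 0).
    """
    names = header.split()  # column "names", e.g. ['78', '79', ...]
    pattern = "\t".join(["1"] * num_repeats)
    try:
        i = row.index(pattern)
    except ValueError:
        return None  # no run of the required length in this row
    flags = row.split()
    start = i // 2
    end = start
    # The pattern only proves num_repeats 1s; walk on to the end of the run.
    while end + 1 < len(flags) and flags[end + 1] == "1":
        end += 1
    return names[start], names[end]


header = "78\t79\t80\t81\t82\t83"
for row in ["1\t0\t0\t1\t1\t0", "0\t0\t1\t1\t1\t0", "0\t1\t1\t1\t1\t1"]:
    print(first_run(header, row))
```

If a row can contain more than one qualifying run, scanning the split values directly is probably simpler than repeating the string search.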
