I am trying to count how many times a value from one file (one column) falls within an interval from another file (two columns).

I am completely stuck on how to map it.

I tried something like this:

for line in file1:
    if line[0]=line2[0] and line2[1]<line[1]<line2[2]:
        print line

I'm not sure if this is correct.

file 1:
elem1 39887
elem1 72111

file 2:
elem1 1 57898
elem1 57899 69887
elem2 69888 82111

In file1, elem1 is an element in my project. The value 39887 is the start coordinate.

In file2, elem1 is still an element in my project, but the values are start and end coordinates. File2 is only a reference file.

For every line in file2, I want to see if the "elem#"=="elem#" in file 1. If the elem# in file1 is equal to elem# in file2, then I want to continue in this loop and see if the corresponding value in file1 is between the start and end positions in file2.

For instance, in the first line of file1, elem1==elem1 in the first line of file2. Since they are equal, is 39887 between 1 and 57898? Yes it is, therefore count it. I need to do this for every line in file2.

In the end, I want to see how many elements are within each group of coordinates from file2.


Thanks pyTony. I will definitely give this a shot. I'm going crazy trying to figure this out.

I want to see how many elements are within each group of coordinates from file2

Does this mean you want to count them or print/copy them?

You only have to store the first file in a container, and check each record from the second file against it. The following uses a dictionary and the test data submitted. To keep track of the number of records found, you can either change the dictionary to point to a list that also contains a counter, or if you think it is easier to understand, use a second dictionary using the same key pointing to a counter. Either way, post your code for more assistance.

```python
file_1 = ["elem1 39887", "elem2 72111"]
file_1_dict = {}
for rec in file_1:
    rec_split = rec.split()
    key = rec_split[0]
    if key not in file_1_dict:  ## allow for possible duplicate entries
        ## convert to int: strings compare character by character, left to right
        file_1_dict[key] = int(rec_split[1])

file_2 = ["elem1 1 57898", "elem1 57899 69887", "elem2 69888 82111"]
for rec in file_2:
    rec_split = rec.split()
    key = rec_split[0]
    if key in file_1_dict:
        low = int(rec_split[1])
        high = int(rec_split[2])
        print("testing key", key, low, high, end=" ")
        if low < file_1_dict[key] < high:
            print("Found")
        else:
            print("Not found")  ## the posted code is cut off after "else:"; this branch is a guess
```

Edited by woooee: n/a

Sorry for the late reply, they all went to my spam folder. I will post what I have, which is very similar to what you have. Yes, I want to track the number of records found in each region, if and only if it belongs to that element. Some could have the same region but different elements.
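For what it's worth, here is a minimal sketch of the per-region count described above. It is only one way to do it, not the code anyone in this thread posted. The name `count_hits` is made up for illustration. It assumes the interval bounds are inclusive (the regions in file 2 are contiguous, so strict `<` would skip the boundary values) and it keeps every value per element, since file 1 in the question has two `elem1` lines.

```python
from collections import defaultdict

# Sample data copied from the original question.
file_1 = ["elem1 39887", "elem1 72111"]
file_2 = ["elem1 1 57898", "elem1 57899 69887", "elem2 69888 82111"]


def count_hits(value_lines, interval_lines):
    """Return {(elem, low, high): hits} with one entry per interval line.

    Hypothetical helper: assumes whitespace-separated columns and
    inclusive interval bounds.
    """
    # Store every position per element so repeated elements are not lost.
    values = defaultdict(list)
    for line in value_lines:
        elem, pos = line.split()
        values[elem].append(int(pos))

    counts = {}
    for line in interval_lines:
        elem, low, high = line.split()
        low, high = int(low), int(high)
        # Only positions belonging to the same element are tested.
        counts[(elem, low, high)] = sum(
            1 for pos in values.get(elem, []) if low <= pos <= high
        )
    return counts


counts = count_hits(file_1, file_2)
for (elem, low, high), hits in counts.items():
    print(elem, low, high, hits)
```

With the sample data, only the first region (elem1, 1 to 57898) gets a hit, because 72111 is outside both elem1 regions and file 1 has no elem2 line. For real files, the two lists could be replaced by open file objects, assuming there are no blank or malformed lines.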
