Edited 4 Years Ago by Despairy: fix
I'm trying to parse a file which contains some random text.
How can I match the "garbage" characters (anything but a digit or letter) that separate the data?
For example: 25.5.5 will produce . (the 2nd dot, because 25.5 is a rational number).
----3.82 will produce --- (because -3.82 is a rational number).
But it won't fit the cases I mentioned above.
pattern = (?P<data>[+\-]?(?:(?:\d+\.\d+)|\w+))|(?P<garbage>.)
for m in re.finditer(pattern, garbled_text):
    print m.group('data'), m.group(garbage)  # m.groupdict() will also work
Your code did not quite run and produce the expected result. This is what did, after some debugging:
import re

garbled_text = '----3.82'
pattern = r'(?P<data>[+\-]?(?:(?:\d+\.\d+)|\w+))|(?P<garbage>.)'
g = ''
for m in re.finditer(pattern, garbled_text):
    if m.group('data'):
        print m.group('data'), repr(g)  # m.groupdict() will also work
        g = ''
    else:
        g += m.group('garbage')
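For anyone reading this on Python 3, the same accumulate-the-garbage idea can be sketched as a small function. This is only a minimal sketch: the helper name `data_with_leading_garbage` is made up for illustration, and the pattern is copied unchanged from the post above.

```python
import re

# Same pattern as in the post above: an optionally signed decimal or a run of
# word characters is "data"; any other single character is "garbage".
PATTERN = re.compile(r'(?P<data>[+\-]?(?:(?:\d+\.\d+)|\w+))|(?P<garbage>.)')


def data_with_leading_garbage(text):
    """Return (data, garbage_seen_before_it) pairs (hypothetical helper)."""
    pairs = []
    pending = ''  # garbage characters collected since the last data token
    for m in PATTERN.finditer(text):
        if m.group('data'):
            pairs.append((m.group('data'), pending))
            pending = ''
        else:
            pending += m.group('garbage')
    return pairs


print(data_with_leading_garbage('----3.82'))  # [('-3.82', '---')]
print(data_with_leading_garbage('25.5.5'))    # [('25.5', ''), ('5', '.')]
```

Note that garbage after the last data token is dropped by this loop, just as in the snippet above, so it would need to be returned separately if it matters.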
Hmmmm, I still have a problem.
I tried that pattern but it didn't catch a lot of stuff.
What I need is:
AAAA15.2.2.2.2.2.AAAjraw AJR53 ++--15.041%58#*&%# &.# &.*.#
The output of the junk collector should be:
. . . ++- % #*&%# &.# &.*.#
This pattern almost solves everything except the dots between the 15.2.2.2.2., I think.
Could really use a bit more help, thanks :)
The code is fine, albeit I missed the quotes on garbage here while posting.
import re

pat = '(?P<text>[+\-]?(?:(?:\d+\.\d+)|\w+))|(?P<garbage>.)'
for m in re.finditer(pat, "+25.5.5 ---3.82sscs+220.127.116.11"):
    print m.group('text'), m.group('garbage')  # m.groupdict()
Which is the output I intended (segregate data from garbage). The OP can do with these values as he pleases.
@Despairy: The pattern is doing fine, except it is not matching the first digit since I used \w (which also matches [0-9]) in my regex. Substitute it with [a-zA-Z]. I also forgot to make the decimal part optional and to turn on the "dot matches newline" flag (re.S). Here is my sample code, custom fitted for your need:
pat = '(?P<text>[+\-]?(?:(?:\d+(?:\.\d+)?)|[A-Za-z]+))|(?P<garbage>.)'  # \w -> [A-Za-z]
data = []
garbage = []
for m in re.finditer(pat, sample, re.S):
    print m.group('text'), m.group('garbage')  # m.groupdict(); debug output to see segregation
    if m.group('text') is not None:
        # Do /your/ processing here
        data.append(m.group('text'))
    if m.group('garbage') is not None:
        garbage.append(m.group('garbage'))
print "Data %s = %s" % (data, ''.join(data))
print "Junk %s = %s" % (garbage, ''.join(garbage))
AAAA None
15.2 None
None .
2.2 None
None .
2.2 None
None .
AAAjraw None
None  
AJR None
53 None
None  
None +
None +
None -
-15.041 None
None %
58 None
None #
None *
...
...
Data ['AAAA', '15.2', '2.2', '2.2', 'AAAjraw', 'AJR', '53', '-15.041', '58'] = AAAA15.22.22.2AAAjrawAJR53-15.04158
Junk ['.', '.', '.', ' ', ' ', '+', '+', '-', '%', '#', '*', '&', '%', '#', ' ', '&', '.', '#', '\n', '&', '.', '*', '.', '#'] = ... ++-%#*&%# &.# &.*.#
Edited 4 Years Ago by nbaztec: Output updated
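To see what the \w -> [A-Za-z] change and the optional decimal part actually do, here is a small Python 3 sketch that runs both patterns from this thread over a fragment of the sample text. The names `OLD`, `NEW` and `split_data_and_junk` are made up for illustration; only the two regexes come from the posts above.

```python
import re

# First pattern from the thread: \w+ also swallows digits glued to letters.
OLD = re.compile(r'(?P<text>[+\-]?(?:(?:\d+\.\d+)|\w+))|(?P<garbage>.)', re.S)
# Revised pattern: letters only, and the decimal part of a number is optional.
NEW = re.compile(
    r'(?P<text>[+\-]?(?:(?:\d+(?:\.\d+)?)|[A-Za-z]+))|(?P<garbage>.)', re.S)


def split_data_and_junk(pattern, text):
    """Return (list_of_data_tokens, junk_string); hypothetical helper."""
    data, junk = [], []
    for m in pattern.finditer(text):
        if m.group('text') is not None:
            data.append(m.group('text'))
        else:
            junk.append(m.group('garbage'))
    return data, ''.join(junk)


fragment = 'AJR53 ++--15.041%58'  # a piece of the sample text posted earlier
print(split_data_and_junk(OLD, fragment))
# (['AJR53', '-15.041', '58'], ' ++-%')
print(split_data_and_junk(NEW, fragment))
# (['AJR', '53', '-15.041', '58'], ' ++-%')
```

The junk is the same for both patterns on this fragment; the difference is that the revised pattern keeps 53 as its own number instead of gluing it to AJR. One thing to be aware of: a sign directly in front of letters (for example +abc) is still treated as part of the data by either pattern, which may or may not be what you want.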