swinchen 0 Newbie Poster

I am working on a project that requires me to clean data from a series of RS-232 instruments. These instruments can have different version numbers and output formats, both of which can change the expected output. On top of that, there are flags that determine whether certain fields are included in the output. Here is what I have now. It _is_ working, but I am wondering if there is a cleaner, more efficient, "better" way to do this. I am quite new to Python and don't know all the ins and outs of the language yet. I know NFA-based regex engines are notorious for being CPU hogs...


So the way this works: data is a raw data string, and config is a ConfigParser object that contains information such as the version number of the instrument, the output format, and any relevant output flags. The validation_map stores fragments of regular expressions, keyed on (version, output_format). Each fragment comes with an optional output flag and the value that flag must have for the fragment to be included. A simple for loop then concatenates the applicable fragments into one much more complex regex that matches the data. I like this much better than multiple if/elif/else statements... but there may be a better way that I don't know.

import re

def validate(data, config):
    '''
    Return the (start, end) span of every match of the expected record
    format in data, or [] if the version/format pair is unknown.

    ttt.tttt,cc.ccccc, pppp.ppp, sss.ssss, vvvv.vvv, mm-dd-yyyy, hh:mm:ss (2.6b, Format=2)
    '''
    version = config.get('instrument', 'Version')
    output_format = int(config.get('instrument', 'Format'))
    # Keyed on (version, format). Each entry is (match pattern, config file flag,
    # pattern the flag value must match for the fragment to be included).
    validation_map = {('2.6b', 2) : [(r"-?(?:0|[1-9][0-9]{,2})\.[0-9]{4}", '', r""),
                                     (r",-?(?:0|[1-9][0-9]?)\.[0-9]{5}", '', r""),
                                     (r", -?(?:0|[1-9][0-9]{,3})\.[0-9]{3}",'Pressure', r"^[yY]$"),
                                     (r", -?(?:0|[1-9][0-9]{,2})\.[0-9]{4}",'OutputSal', r"^[yY]$"),
                                     (r", -?(?:0|[1-9][0-9]{,3})\.[0-9]{3}",'OutputSV', r"^[yY]$"),
                                     (r", (?:0[1-9]|1[0-2])-(?:[0-2][0-9]|3[0-1])-[0-9]{4}", '', r""),
                                     (r", (?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]",'StoreTime', r"^[yY]$")]}

    # dict.has_key() is gone in Python 3; "in" works in both Python 2 and 3.
    if (version, output_format) not in validation_map:
        return []

    pattern = r""
    for fragment, flag, condition in validation_map[(version, output_format)]:
        # Skip optional fields whose flag is not enabled in the config.
        if flag and not re.match(condition, config.get('instrument', flag)):
            continue
        pattern += fragment

    # Search once, after the complete pattern has been assembled.
    return [match.span() for match in re.finditer(pattern, data)]
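
The fragment-joining step can also be looked at in isolation. What follows is only a minimal sketch, not the function above: FIELDS (a shortened field table), build_pattern, the plain flags dict standing in for the ConfigParser object, and the sample line are all hypothetical names and made-up data used for illustration. It collects the applicable fragments in a list, joins them once, and compiles the result so the same pattern object could be reused across many lines.

```python
import re

# Hypothetical, shortened field table:
# (regex fragment, flag name, pattern the flag value must match).
FIELDS = [
    (r"-?(?:0|[1-9][0-9]{,2})\.[0-9]{4}", "", r""),
    (r",-?(?:0|[1-9][0-9]?)\.[0-9]{5}", "", r""),
    (r", -?(?:0|[1-9][0-9]{,3})\.[0-9]{3}", "Pressure", r"^[yY]$"),
    (r", (?:0[1-9]|1[0-2])-(?:[0-2][0-9]|3[0-1])-[0-9]{4}", "", r""),
]

def build_pattern(fields, flags):
    """Join the fragments whose flag condition is met and compile once.

    flags is a plain dict (an assumption made to keep the sketch
    self-contained); a missing flag is treated as disabled.
    """
    parts = [fragment for fragment, flag, condition in fields
             if not flag or re.match(condition, flags.get(flag, ""))]
    return re.compile("".join(parts))

# Made-up sample line, not real instrument output.
sample = "23.1234,4.56789, 10.123, 01-15-2010"

with_pressure = build_pattern(FIELDS, {"Pressure": "Y"})
without_pressure = build_pattern(FIELDS, {"Pressure": "N"})

# Spans of every match, in the same form validate() returns.
spans_with = [m.span() for m in with_pressure.finditer(sample)]
spans_without = [m.span() for m in without_pressure.finditer(sample)]
```

With the Pressure flag enabled the whole sample line matches. With it disabled the pressure fragment is left out, so a line that still contains a pressure field no longer matches.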

Thanks for looking at this!

Sam