Extract Blocks of Info from Log File
New learner here. This would be my first try at writing something in Python that I can actually put to use.

I have a log file whose content looks like this:

[screenshot of the log file, with the wanted blocks marked by red boxes]

Basically, I only want to keep the "slots" with the indicator "/* mysql */" (the blocks in the red boxes, e.g. lines 13~15 and lines 20~22) and discard everything else.

Please tell me how best to do this and I will try to hack something out
(or if you have something ready, that'd be great).

Last Post by Namibnat

Pretty sure this will work. If you have any questions just ask.

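# print(...) with a single argument runs on both Python 2 and Python 3

def main():
    # "with" closes the file even if an error occurs
    with open("logfile.txt", "r") as infile:
        pastLine = infile.readline()
        cont = 0  # 1 while we are inside a mysql block
        for line in infile:
            if cont == 1:
                if '-------' in line:  # a separator line ends the block
                    cont = 0
                    print('')
                # print every line of the block, including the closing separator
                print(line.rstrip('\n'))
            elif '/* mysql */' in line:
                print(pastLine.rstrip('\n'))  # the line just before the indicator
                print(line.rstrip('\n'))
                cont = 1
            pastLine = line

if __name__ == "__main__":
    main()
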

#! /usr/bin/python

# parse_logfile.py

import re

class GetMySQLLog:
    """Search a logfile for mysql entries blocks"""
    def __init__(self):
        self.logfile = open('/path/to/test/file/logfile', 'r')  # Correct this path
        self.logLi = self.logfile.readlines()
        self.logfile.close()
        self.counter = 0
        self.plus = ""
        self.storeli = {}
        self.isMysql = []

    def getSql(self):
        """Set each block to a dictionary,
        and then create a list
        which has strings
        of the blocks containing the mysql sections"""

        for a in self.logLi:
            if re.search('^-', a):
                # A line starting with '-' is a separator: start a new block
                self.counter = self.counter + 1
                self.storeli[self.counter] = []
            elif self.counter in self.storeli:
                # Any other line belongs to the current block
                self.storeli[self.counter].append(a)
        for k, v in sorted(self.storeli.items()):  # sorted keeps file order
            for joiner in v:
                self.plus = self.plus + joiner
            if re.search('.+mysql.+', self.plus):
                self.isMysql.append(self.plus)  # keep only the mysql blocks
            self.plus = "" # To reset the string
        for printOut in self.isMysql:
            div = "-"*30 + "\n"
            print(div + printOut + div)

if __name__ == "__main__":
    test = GetMySQLLog()
    test.getSql()

Just another way. I am sure that it could be cleaned up a bit, but it does work (I have been playing around with it.) By putting it in a class you can do various things with it - print it out (as I have done here), write it to a file, or create a database entry with it (which I assume you would want to do?)

Basically, each section of the log is collected in a dictionary (self.storeli), and the sections that contain mysql are then stored as strings in a list, self.isMysql.

I used regular expressions to find the right blocks.

In the Python command line, you can 'from parse_logfile import GetMySQLLog' and then create an instance of the class, say 'sql_inst = GetMySQLLog()'. Then you can get the output with 'sql_inst.getSql()'

Or you could run it from the command line (depending on your OS) with 'python ./parse_logfile.py' so long as you are in the right directory.


Thanks so much, Namibnat.
It works great!

What I want to do next with the code is to
1) parameterize the File Path (and maybe rename, zip)
2) Write extraction to file.
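
One possible way to approach both points is sketched below. It is not the class from the answer, just a minimal Python 3 rewrite of the same idea, assuming (as that class does) that a line starting with '-' separates blocks. The names extract_blocks and extract_to_file and their parameters are hypothetical, chosen only for this illustration. Unlike the class, lines that appear before the first separator are treated as a block too.

```python
def extract_blocks(lines, marker="/* mysql */"):
    """Return the blocks (as strings) that contain `marker`.

    Assumption: a line starting with '-' separates one block from the next.
    """
    blocks, current = [], []
    for line in lines:
        if line.startswith("-"):
            if current:
                blocks.append("".join(current))
            current = []
        else:
            current.append(line)
    if current:  # last block when the file does not end with a separator
        blocks.append("".join(current))
    return [block for block in blocks if marker in block]


def extract_to_file(log_path, out_path, marker="/* mysql */"):
    """Read log_path and write the blocks containing `marker` to out_path.

    Returns the number of blocks written.
    """
    with open(log_path) as infile:
        kept = extract_blocks(infile, marker)
    divider = "-" * 30 + "\n"
    with open(out_path, "w") as outfile:
        for block in kept:
            outfile.write(divider + block)
    return len(kept)
```

Because both paths are plain parameters, a script could pass in sys.argv[1] and sys.argv[2], and renaming or zipping the result could be added as a separate step afterwards. Note that any log line that itself starts with '-' would be taken for a separator.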

Edited by rmatelot: n/a


I modified the mysql test in getSql (the line with re.search('.+mysql.+', self.plus)) to

if re.search('/[*] mysql [*]/', self.plus):
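
To illustrate why this change matters, here is a small sketch in Python 3. The original pattern '.+mysql.+' accepts any line that has the word mysql with at least one character on each side, while '/[*] mysql [*]/' puts each asterisk in a character class so that it is matched literally. re.escape is another way to build a literal pattern from the indicator. The two sample strings are made up for the example, not taken from the real log.

```python
import re

marker = "/* mysql */"

loose = re.compile(".+mysql.+")          # pattern from the original class
strict = re.compile("/[*] mysql [*]/")   # modified pattern from this post
escaped = re.compile(re.escape(marker))  # same idea, built with re.escape

tagged = "SELECT 1 /* mysql */ FROM dual\n"   # hypothetical log line
untagged = "connecting to mysql server\n"     # hypothetical log line

# The loose pattern accepts both lines; the other two accept only the tagged one.
loose_hits = (bool(loose.search(tagged)), bool(loose.search(untagged)))
strict_hits = (bool(strict.search(tagged)), bool(strict.search(untagged)))
escaped_hits = (bool(escaped.search(tagged)), bool(escaped.search(untagged)))
```

Since the indicator is a fixed string, a plain substring test such as marker in block would work here as well, without any regular expression.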

Namibnat, can you tell me why you coded the "for joiner in v:" loop (the two lines that build self.plus) instead of just this:

self.plus = "".join(v)



No, I can't. It must have come from trying things and just not cleaning it all up enough afterwards. Perhaps I wanted to check for something in there? I don't know. Certainly, looking at it now, self.plus = "".join(v) would be better.
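
For anyone comparing the two forms, a minimal check that they build the same string is sketched below. The list block_lines is a made-up stand-in for one value v from self.storeli.

```python
# Hypothetical block: the list of lines stored for one section of the log
block_lines = ["first line\n", "second /* mysql */ line\n", "third line\n"]

# Loop version, as in the class: build the string one piece at a time
plus = ""
for joiner in block_lines:
    plus = plus + joiner

# join version: a single call that concatenates every item of the list
joined = "".join(block_lines)

assert plus == joined
```

The join version also removes the need to reset self.plus to an empty string after each block, because the string is rebuilt from scratch on every call.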
