noobie asking for assistance with file parsing...

optomystique (Newbie Poster), #1, Jun 7th, 2005
greetings,

I'm brand new to Perl and trying to write a script that reads a log file from our WebLogic server and writes certain entries into a database table. The log file is a log4j WebLogic log. Here is a sample:

####<Jun 7, 2005 2:46:38 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <ExecuteThread: '12' for queue: 'weblogic.kernel.Default'> <setup> <> <000000> <Calculating Rates for all Routes (Time = 1299 ms)>
####<Jun 7, 2005 2:46:38 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <ExecuteThread: '12' for queue: 'weblogic.kernel.Default'> <setup> <> <000000> <Time to get routes: 4105>
####<Jun 7, 2005 2:46:40 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <Thread-17> <anonymous> <> <000000> <HUDCache: Retrieve new value (Time = 12636 ms)>
####<Jun 7, 2005 2:47:09 PM EDT> <Warning> <EJB> <ga003sds> <tms1> <ExecuteThread: '13' for queue: 'weblogic.kernel.Default'> <anonymous> <> <BEA-010096> <The Message-Driven EJB: EcommerceOrderManager is unable to connect to the JMS destination: integration.eCommerce.orderCreation.request. Connection failed after 1,088 attempts. The MDB will attempt to reconnect every 10 seconds. This log message will repeat every 600 seconds until the condition clears.>
####<Jun 7, 2005 2:47:09 PM EDT> <Warning> <EJB> <ga003sds> <tms1> <ExecuteThread: '13' for queue: 'weblogic.kernel.Default'> <anonymous> <> <BEA-010061> <The Message-Driven EJB: EcommerceOrderManager is unable to connect to the JMS destination: integration.eCommerce.orderCreation.request. The Error was:
[EJB:011010]The JMS destination with the JNDI name: integration.eCommerce.orderCreation.request could not be found. Please ensure that the JNDI name in the weblogic-ejb-jar.xml is correct, and the JMS destination has been deployed.>
####<Jun 7, 2005 2:47:10 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <Thread-17> <anonymous> <> <000000> <HUDCache: Retrieve new value (Time = 11907 ms)>
####<Jun 7, 2005 2:47:40 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <Thread-17> <anonymous> <> <000000> <HUDCache: Retrieve new value (Time = 11983 ms)>
####<Jun 7, 2005 2:48:10 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <Thread-17> <anonymous> <> <000000> <HUDCache: Retrieve new value (Time = 11961 ms)>
####<Jun 7, 2005 2:48:39 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <Thread-17> <anonymous> <> <000000> <HUDCache: Retrieve new value (Time = 11850 ms)>
####<Jun 7, 2005 2:49:10 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <Thread-17> <anonymous> <> <000000> <HUDCache: Retrieve new value (Time = 11949 ms)>
####<Jun 7, 2005 2:49:40 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <Thread-17> <anonymous> <> <000000> <HUDCache: Retrieve new value (Time = 12519 ms)>
####<Jun 7, 2005 4:35:14 PM EDT> <Error> <Enterprise> <ga003sds> <tms1> <Thread-12> <<anonymous>> <> <000000> <EUC948732945 - Tue Jun 07 16:35:14 EDT 2005 - unknown - java.net.SocketException - Broken pipe
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at weblogic.servlet.internal.ChunkUtils.writeChunkTransfer(ChunkUtils.java:247)
at weblogic.servlet.internal.ChunkUtils.writeChunks(ChunkUtils.java:223)
at weblogic.servlet.internal.ChunkOutput.flush(ChunkOutput.java:298)
at weblogic.servlet.internal.ChunkOutput.checkForFlush(ChunkOutput.java:373)
at weblogic.servlet.internal.ChunkOutput.print(ChunkOutput.java:258)
at weblogic.servlet.internal.ChunkOutputWrapper.print(ChunkOutputWrapper.java:126)
at weblogic.servlet.jsp.JspWriterImpl.print(JspWriterImpl.java:282)
at jsp_servlet._routing._nextflightout.__template._writeText(__template.java:76)
at jsp_servlet._routing._nextflightout.__template._jspService(__template.java:279)
at weblogic.servlet.jsp.JspBase.service(JspBase.java:33)
at weblogic.servlet.internal.ServletStubImpl$ServletInvocationAction.run(ServletStubImpl.java:996)
at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:419)
at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:315)
at weblogic.servlet.internal.RequestDispatcherImpl.include
>
The entries in the log have a standard format: ####<date><log_level><type><server_hostname><servername><category><service><unknown><message_id><message_text>

The problem is that the message_text can consist of multiple lines of stack trace when the log_level = "Error". These happen to be the main records I'm interested in for now, but there are also some single-line warnings we want to capture as well.

I had started to write the following (please forgive, it's my first time using perl and the code isn't much :o ), but got hung up on the multi-line stuff.

[PHP]#!/usr/bin/perl -w
# get_errors.plx
# get errors from log file

use strict;

while (<>) {
my ($date, $log_level, $type, $serverhostname, $servername,
# $category, $service, $unknown, $message_id, $message_text) = split(/> </, $_);
$category, $service, $unknown, $message_id, $message_text) =
`m/^####<([ ,:A-Za-z0-9]+?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)>$/`;
print join "|", $date,$log_level, $type, $serverhostname, $servername,
$category, $service, $unknown, $message_id, $message_text;
print "\n";
}[/PHP]

I'm assuming while (<>) only reads one line at a time? The 'm' that I tried to put in the expression errors out, so I'm not even sure my expression to capture the lines is correct...

How can I properly read this file to capture a log entry at a time, whether it's single or multiple lines? any assistance is greatly, greatly appreciated!! :mrgreen:

Comatose (Taboo Programmer), #2, Jun 8th, 2005
I have a nice page for you to look at regarding "regular expressions", or "regex". Regex is nice, but can be a bit confusing. "Line noise", as it's been called, can do magic with a correctly set up expression or expressions. I see that you tried "m", which is the match operator. In Perl, matching is the default, so you could have done the same with just // instead of m//, but for readability it's always better to use the m. Here is a great page to learn and understand regex:

http://www.troubleshooters.com/codec...rl/perlreg.htm

I hope this helps some. If you are still having trouble with it, I'll be glad to take a look at your code and offer what help I can. Also, I don't see you opening a file to read the input from. The while (<>) loop reads from the files named on the command line, or from STDIN if none are given. To open a specific file yourself:
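    # the comma after the filehandle is required; "<" opens for reading
    open(FH, "<", "/home/mydir/somefile.txt") or die "Cannot open file: $!\n";
    while (<FH>) {
        # $_ holds the current line
    }
    close(FH);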

The above is the best way, IMO, to go about this, since it helps readability: you open the file using the filehandle FH, and then the while loop reads from FH (<FH>) line by line until it reaches the end of the file.
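
For comparison, here is a minimal sketch of the same read loop written against a lexical filehandle instead of the bareword FH. This is a swapped-in style, not the code from the post. The helper name count_entry_starts and the in-memory sample are made up for illustration; the only thing taken from the thread is that each log entry starts with "####<".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the lines that start a new log entry ("####<").
# It accepts any filehandle, so the caller decides where the input comes from.
sub count_entry_starts {
    my ($fh) = @_;
    my $count = 0;
    while ( my $line = <$fh> ) {
        $count++ if $line =~ /^####</;
    }
    return $count;
}

# Demo on an in-memory "file" so the sketch runs without a real log.
my $sample = "####<Jun 7, 2005 2:46:38 PM EDT> <Info> <one>\n"
           . "continuation line\n"
           . "####<Jun 7, 2005 2:46:40 PM EDT> <Info> <two>\n";
open( my $fh, '<', \$sample ) or die "Cannot open in-memory sample: $!\n";
print count_entry_starts($fh), " entries\n";    # prints "2 entries"
close($fh);
```

Passing \*STDIN instead of $fh would cover an invocation that redirects a log file into the script.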

Something else to consider is using the split function to pull out each piece of information that sits between angle brackets in your file, like <stuff here in your file>. For example, when I parse an HTML page with Perl I do something like this:
    @tags = split(/</, $_);
    foreach $tag (@tags) {
        if (lc($tag) eq "b>") {
            print "Found Bold Tag\n";
        }
    }
You could use similar code to read your file, and then check which "tag" you are on. It doesn't look like what you want to do is going to be an easy task, but let me know if I can help any further.
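
Applied to the log format from the first post, the split idea might look like the sketch below. It is only a rough illustration: split_entry is a made-up helper name, it handles a complete single-line entry only, and it assumes no field contains the text "> <" itself.

```perl
use strict;
use warnings;

# Hypothetical helper: split one complete single-line entry into its fields.
# Assumes the "####<...> <...>" layout shown in the log sample.
sub split_entry {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ s/^####<//;    # not the start of an entry
    return unless $line =~ s/>$//;        # entry is not closed on this line
    return split /> </, $line, -1;        # -1 keeps trailing empty fields
}

my $line = q{####<Jun 7, 2005 2:46:40 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <Thread-17> <anonymous> <> <000000> <HUDCache: Retrieve new value (Time = 12636 ms)>};
my @fields = split_entry($line);
print join( "|", @fields ), "\n";
```

The empty <> field comes back as an empty string, so the ten positions stay aligned with the field list in the first post. A multi-line entry returns an empty list here and needs to be assembled first.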

optomystique (Newbie Poster), #3, Jun 8th, 2005
hi comatose,

Thanks for the link. I've got some experience with regular expressions, though the syntax in Perl is a little different from what I'm used to. Here is the basic regular expression I came up with to match one line:

[PHP]^[#]{4}<.*?> <.*?> <.*?> <.*?> <.*?> <.*?> [<]{1,2}.*?[>]{1,2} <.*?> [<]{1,2}.*?[>]{1,2} [<]{0,1}.*?[>]{0,1}$[/PHP]

but, the problem i'm having is that i don't just want the one line if the last field has a java stack trace. that is, can my regular expression continue to pick up information from lines that follow as in:

[PHP]^[#]{4}<.*?> <.*?> <.*?> <.*?> <.*?> <.*?> [<]{1,2}.*?[>]{1,2} <.*?> [<]{1,2}.*?[>]{1,2} [<]{0,1}.*?[>]$[/PHP]

? Here, i specified that the end of the line must be '>'. I'm wondering if it's going to fail that condition if it's not all on one line.

beyond that, i would like to get at least a few lines of the stack trace (or at least 255 characters) and remove any newlines from that section. ultimately, i want to read a log file and write out a file where each line is the pipe-delimited fields (and have the last field from the log file translated to a single line if it's multiple).

as for not specifying a file, i'm actually running the perl script with a file directed in as standard input for now: get_errors.plx < testfile.log

I'm going to be running this Perl script on multiple log files, so I was going to have another script with a while loop that calls it in this manner on each file. I will probably change it to take the file name as a parameter. It would increase readability, as you said.

I'm still playing around with it, but would welcome any assistance and highly appreciate what you've offered so far.

Comatose (Taboo Programmer), #4, Jun 8th, 2005
http://www1.cs.columbia.edu/~lennox/perlre.html is a great site to learn about using multi-line regexes with Perl. There are a few methods there that can be used. The older, deprecated method is to set $*, but the newer methods, as of Perl 5, use the /m and/or /s modifiers.
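
A small sketch of what the two modifiers change, using a made-up two-line string rather than the real log:

```perl
use strict;
use warnings;

my $text = "first <a\nb> last\n";

# Without /s, "." does not match a newline, so a span across lines fails.
my $plain  = $text =~ /<(.*)>/  ? "match" : "no match";
# With /s, "." also matches "\n", so the capture can cross the line break.
my $dotall = $text =~ /<(.*)>/s ? "match" : "no match";
# With /m, ^ matches at the start of every line, not only at the string start.
my @line_starts = $text =~ /^(\w+)/mg;

print "plain: $plain\n";                 # prints "plain: no match"
print "dotall: $dotall\n";               # prints "dotall: match"
print "line starts: @line_starts\n";     # prints "line starts: first b"
```

For the stack-trace case, /s is the one that lets the last <...> field swallow several lines, provided the whole entry is already in one string.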

Let me know what you come up with, so that other people with similar problems can find the solution here.

optomystique (Newbie Poster), #5, Jun 8th, 2005
Thanks again for your assistance. I'll certainly share whatever my findings are.

kordaff (Newbie Poster), #6, Jun 10th, 2005
Here's a bit of code that prints the multi-line error message from that log sample:

    #!/usr/bin/perl -w
    use strict;

    my $log     = "log";
    my $partial = "";
    my $state   = 1;
    if ( $ARGV[0] ) { $log = $ARGV[0] }

    open( FILE, "<", $log ) or die "Failed to open $log: $!\n";
    while ( <FILE> )
    {
        if ( $state )   # watching two states, this one is waiting on new log line
        { check_for_complete_line($_) }
        else            # 2nd state is waiting on another piece of previous partial line
        { $state = 1; $partial .= $_; check_for_complete_line($partial); }
    }
    close FILE;

    sub check_for_complete_line
    {   # enters with $state=1 for new line, $state=0 for partial line previous
        my $line = shift;
        # /s lets . match newlines, so a multi-line message fits in the last field
        if ( my @f = $line =~ /^####<(.*)> <(.*)> <(.*)> <(.*)> <(.*)> <(.*)> <(.*)> <(.*)> <(.*)> <(.*)>$/ms )
        {
            if ( $f[1] eq 'Error' )
            {   # a list-context match returns all ten fields, so no $1..$9 juggling
                print join( "|", @f ), "\n";
            }
            else {}     # handle info/warning
            $partial = "";
        }
        else    # incomplete log line, save the part we have, indicate state change
        { $state = 0; $partial = $line }
    }
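
The request in post #3 to squeeze the stack trace onto one line, capped at 255 characters, could be sketched roughly as below. This is a different technique from the state machine above: it starts a new entry at every line beginning with "####<" and treats everything else as a continuation. The helper names flatten_message and read_entries are made up, and the sample is abbreviated from the log in post #1.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse line breaks to single spaces and cap the length.
sub flatten_message {
    my ( $text, $max ) = @_;
    $text =~ s/\s*\n\s*/ /g;          # join continuation lines
    $text =~ s/^\s+|\s+$//g;          # trim both ends
    return substr( $text, 0, $max );
}

# Hypothetical helper: read entries from a filehandle; a new entry
# starts at each line that begins with "####<".
sub read_entries {
    my ($fh) = @_;
    my ( @entries, $current );
    while ( my $line = <$fh> ) {
        if ( $line =~ /^####</ ) {
            push @entries, $current if defined $current;
            $current = $line;
        }
        elsif ( defined $current ) {
            $current .= $line;        # continuation of the previous entry
        }
    }
    push @entries, $current if defined $current;
    return @entries;
}

# Abbreviated sample so the sketch runs without a real log file.
my $sample = <<'END';
####<Jun 7, 2005 2:47:10 PM EDT> <Info> <Enterprise> <ga003sds> <tms1> <Thread-17> <anonymous> <> <000000> <HUDCache: Retrieve new value (Time = 11907 ms)>
####<Jun 7, 2005 4:35:14 PM EDT> <Error> <Enterprise> <ga003sds> <tms1> <Thread-12> <<anonymous>> <> <000000> <java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
>
END

open( my $fh, '<', \$sample ) or die "Cannot open in-memory sample: $!\n";
for my $entry ( read_entries($fh) ) {
    my @f = $entry =~ /^####<(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*?)> <(.*)>\s*$/s
        or next;                      # skip anything that does not fit the format
    next unless $f[1] eq 'Error';
    $f[9] = flatten_message( $f[9], 255 );
    print join( "|", @f ), "\n";
}
close($fh);
```

Grouping on the "####<" prefix avoids re-running the full regex on a growing partial string for every line of a long stack trace, at the cost of assuming that no continuation line ever starts with "####<".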

Comatose (Taboo Programmer), #7, Jun 10th, 2005
Nice Work There.... Even Commented Some of the code

kordaff (Newbie Poster), #8, Jun 10th, 2005
Thx =) Hmm my first comment in the subroutine is wrong though
It always enters the subroutine in state=1. The comment should have been

# always enters in $state=1 ie: whether it was a fresh line or it had a partial, before it gets to the subroutine, a partial is joined with the current line and it's considered a fresh line.

Kordaff

optomystique (Newbie Poster), #9, Jun 13th, 2005
awesome, kordaff!! that helps a lot and is much appreciated! :mrgreen:

kordaff (Newbie Poster), #10, Jun 13th, 2005
Anything I can do to help =)
©2003 - 2009 DaniWeb® LLC